
STATISTICAL PROCESS
MONITORING AND
OPTIMIZATION
STATISTICS: Textbooks and Monographs
A Series Edited by

D. B. Owen, Founding Editor, 1972-1991

W. R. Schucany, Coordinating Editor
Department of Statistics
Southern Methodist University
Dallas, Texas

W. J. Kennedy, Associate Editor for Statistical Computing
Iowa State University

A. M. Kshirsagar, Associate Editor for Multivariate Analysis and Experimental Design
University of Michigan

E. G. Schilling, Associate Editor for Statistical Quality Control
Rochester Institute of Technology
1. The Generalized Jackknife Statistic, H. L. Gray and W. R. Schucany
2. Multivariate Analysis, Anant M. Kshirsagar
3. Statistics and Society, Walter T. Federer
4. Multivariate Analysis: A Selected and Abstracted Bibliography, 1957-1972, Kocherlakota Subrahmaniam and Kathleen Subrahmaniam
5. Design of Experiments: A Realistic Approach, Virgil L. Anderson and Robert A. McLean
6. Statistical and Mathematical Aspects of Pollution Problems, John W. Pratt
7. Introduction to Probability and Statistics (in two parts), Part I: Probability; Part II: Statistics, Narayan C. Giri
8. Statistical Theory of the Analysis of Experimental Designs, J. Ogawa
9. Statistical Techniques in Simulation (in two parts), Jack P. C. Kleijnen
10. Data Quality Control and Editing, Joseph I. Naus
11. Cost of Living Index Numbers: Practice, Precision, and Theory, Kali S. Banerjee
12. Weighing Designs: For Chemistry, Medicine, Economics, Operations Research, Statistics, Kali S. Banerjee
13. The Search for Oil: Some Statistical Methods and Techniques, edited by D. B. Owen
14. Sample Size Choice: Charts for Experiments with Linear Models, Robert E. Odeh and Martin Fox
15. Statistical Methods for Engineers and Scientists, Robert M. Bethea, Benjamin S. Duran, and Thomas L. Boullion
16. Statistical Quality Control Methods, Irving W. Burr
17. On the History of Statistics and Probability, edited by D. B. Owen
18. Econometrics, Peter Schmidt
19. Sufficient Statistics: Selected Contributions, Vasant S. Huzurbazar (edited by Anant M. Kshirsagar)
20. Handbook of Statistical Distributions, Jagdish K. Patel, C. H. Kapadia, and D. B. Owen
21. Case Studies in Sample Design, A. C. Rosander
22. Pocket Book of Statistical Tables, compiled by R. E. Odeh, D. B. Owen, Z. W. Birnbaum, and L. Fisher
23. The Information in Contingency Tables, D. V. Gokhale and Solomon Kullback
24. Statistical Analysis of Reliability and Life-Testing Models: Theory and Methods, Lee J. Bain
25. Elementary Statistical Quality Control, Irving W. Burr
26. An Introduction to Probability and Statistics Using BASIC, Richard A. Groeneveld
27. Basic Applied Statistics, B. L. Raktoe and J. J. Hubert
28. A Primer in Probability, Kathleen Subrahmaniam
29. Random Processes: A First Look, R. Syski
30. Regression Methods: A Tool for Data Analysis, Rudolf J. Freund and Paul D. Minton
31. Randomization Tests, Eugene S. Edgington
32. Tables for Normal Tolerance Limits, Sampling Plans and Screening, Robert E. Odeh and D. B. Owen
33. Statistical Computing, William J. Kennedy, Jr., and James E. Gentle
34. Regression Analysis and Its Application: A Data-Oriented Approach, Richard F. Gunst and Robert L. Mason
35. Scientific Strategies to Save Your Life, I. D. J. Bross
36. Statistics in the Pharmaceutical Industry, edited by C. Ralph Buncher and Jia-Yeong Tsay
37. Sampling from a Finite Population, J. Hajek
38. Statistical Modeling Techniques, S. S. Shapiro and A. J. Gross
39. Statistical Theory and Inference in Research, T. A. Bancroft and C.-P. Han
40. Handbook of the Normal Distribution, Jagdish K. Patel and Campbell B. Read
41. Recent Advances in Regression Methods, Hrishikesh D. Vinod and Aman Ullah
42. Acceptance Sampling in Quality Control, Edward G. Schilling
43. The Randomized Clinical Trial and Therapeutic Decisions, edited by Niels Tygstrup, John M. Lachin, and Erik Juhl
44. Regression Analysis of Survival Data in Cancer Chemotherapy, Walter H. Carter, Jr., Galen L. Wampler, and Donald M. Stablein
45. A Course in Linear Models, Anant M. Kshirsagar
46. Clinical Trials: Issues and Approaches, edited by Stanley H. Shapiro and Thomas H. Louis
47. Statistical Analysis of DNA Sequence Data, edited by B. S. Weir
48. Nonlinear Regression Modeling: A Unified Practical Approach, David A. Ratkowsky
49. Attribute Sampling Plans, Tables of Tests and Confidence Limits for Proportions, Robert E. Odeh and D. B. Owen
50. Experimental Design, Statistical Models, and Genetic Statistics, edited by Klaus Hinkelmann
51. Statistical Methods for Cancer Studies, edited by Richard G. Cornell
52. Practical Statistical Sampling for Auditors, Arthur J. Wilburn
53. Statistical Methods for Cancer Studies, edited by Edward J. Wegman and James G. Smith
54. Self-Organizing Methods in Modeling: GMDH Type Algorithms, edited by Stanley J. Farlow
55. Applied Factorial and Fractional Designs, Robert A. McLean and Virgil L. Anderson
56. Design of Experiments: Ranking and Selection, edited by Thomas J. Santner and Ajit C. Tamhane
57. Statistical Methods for Engineers and Scientists: Second Edition, Revised and Expanded, Robert M. Bethea, Benjamin S. Duran, and Thomas L. Boullion
58. Ensemble Modeling: Inference from Small-Scale Properties to Large-Scale Systems, Alan E. Gelfand and Crayton C. Walker
59. Computer Modeling for Business and Industry, Bruce L. Bowerman and Richard T. O'Connell
60. Bayesian Analysis of Linear Models, Lyle D. Broemeling
61. Methodological Issues for Health Care Surveys, Brenda Cox and Steven Cohen
62. Applied Regression Analysis and Experimental Design, Richard J. Brook and Gregory C. Arnold
63. Statpal: A Statistical Package for Microcomputers - PC-DOS Version for the IBM PC and Compatibles, Bruce J. Chalmer and David G. Whitmore
64. Statpal: A Statistical Package for Microcomputers - Apple Version for the II, II+, and IIe, David G. Whitmore and Bruce J. Chalmer
65. Nonparametric Statistical Inference: Second Edition, Revised and Expanded, Jean Dickinson Gibbons
66. Design and Analysis of Experiments, Roger G. Petersen
67. Statistical Methods for Pharmaceutical Research Planning, Sten W. Bergman and John C. Gittins
68. Goodness-of-Fit Techniques, edited by Ralph B. D'Agostino and Michael A. Stephens
69. Statistical Methods in Discrimination Litigation, edited by D. H. Kaye and Mikel Aickin
70. Truncated and Censored Samples from Normal Populations, Helmut Schneider
71. Robust Inference, M. L. Tiku, W. Y. Tan, and N. Balakrishnan
72. Statistical Image Processing and Graphics, edited by Edward J. Wegman and Douglas J. DePriest
73. Assignment Methods in Combinatorial Data Analysis, Lawrence J. Hubert
74. Econometrics and Structural Change, Lyle D. Broemeling and Hiroki Tsurumi
75. Multivariate Interpretation of Clinical Laboratory Data, Adelin Albert and Eugene K. Harris
76. Statistical Tools for Simulation Practitioners, Jack P. C. Kleijnen
77. Randomization Tests: Second Edition, Eugene S. Edgington
78. A Folio of Distributions: A Collection of Theoretical Quantile-Quantile Plots, Edward B. Fowlkes
79. Applied Categorical Data Analysis, Daniel H. Freeman, Jr.
80. Seemingly Unrelated Regression Equations Models: Estimation and Inference, Virendra K. Srivastava and David E. A. Giles
81. Response Surfaces: Designs and Analyses, Andre I. Khuri and John A. Cornell
82. Nonlinear Parameter Estimation: An Integrated System in BASIC, John C. Nash and Mary Walker-Smith
83. Cancer Modeling, edited by James R. Thompson and Barry W. Brown
84. Mixture Models: Inference and Applications to Clustering, Geoffrey J. McLachlan and Kaye E. Basford
85. Randomized Response: Theory and Techniques, Arijit Chaudhuri and Rahul Mukerjee
86. Biopharmaceutical Statistics for Drug Development, edited by Karl E. Peace
87. Parts per Million Values for Estimating Quality Levels, Robert E. Odeh and D. B. Owen
88. Lognormal Distributions: Theory and Applications, edited by Edwin L. Crow and Kunio Shimizu
89. Properties of Estimators for the Gamma Distribution, K. O. Bowman and L. R. Shenton
90. Spline Smoothing and Nonparametric Regression, Randall L. Eubank
91. Linear Least Squares Computations, R. W. Farebrother
92. Exploring Statistics, Damaraju Raghavarao
93. Applied Time Series Analysis for Business and Economic Forecasting, Sufi M. Nazem
94. Bayesian Analysis of Time Series and Dynamic Models, edited by James C. Spall
95. The Inverse Gaussian Distribution: Theory, Methodology, and Applications, Raj S. Chhikara and J. Leroy Folks
96. Parameter Estimation in Reliability and Life Span Models, A. Clifford Cohen and Betty Jones Whitten
97. Pooled Cross-Sectional and Time Series Data Analysis, Terry E. Dielman
98. Random Processes: A First Look, Second Edition, Revised and Expanded, R. Syski
99. Generalized Poisson Distributions: Properties and Applications, P. C. Consul
100. Nonlinear Lp-Norm Estimation, Rene Gonin and Arthur H. Money
101. Model Discrimination for Nonlinear Regression Models, Dale S. Borowiak
102. Applied Regression Analysis in Econometrics, Howard E. Doran
103. Continued Fractions in Statistical Applications, K. O. Bowman and L. R. Shenton
104. Statistical Methodology in the Pharmaceutical Sciences, Donald A. Berry
105. Experimental Design in Biotechnology, Perry D. Haaland
106. Statistical Issues in Drug Research and Development, edited by Karl E. Peace
107. Handbook of Nonlinear Regression Models, David A. Ratkowsky
108. Robust Regression: Analysis and Applications, edited by Kenneth D. Lawrence and Jeffrey L. Arthur
109. Statistical Design and Analysis of Industrial Experiments, edited by Subir Ghosh
110. U-Statistics: Theory and Practice, A. J. Lee
111. A Primer in Probability: Second Edition, Revised and Expanded, Kathleen Subrahmaniam
112. Data Quality Control: Theory and Pragmatics, edited by Gunar E. Liepins and V. R. R. Uppuluri
113. Engineering Quality by Design: Interpreting the Taguchi Approach, Thomas B. Barker
114. Survivorship Analysis for Clinical Studies, Eugene K. Harris and Adelin Albert
115. Statistical Analysis of Reliability and Life-Testing Models: Second Edition, Lee J. Bain and Max Engelhardt
116. Stochastic Models of Carcinogenesis, Wai-Yuan Tan
117. Statistics and Society: Data Collection and Interpretation, Second Edition, Revised and Expanded, Walter T. Federer
118. Handbook of Sequential Analysis, B. K. Ghosh and P. K. Sen
119. Truncated and Censored Samples: Theory and Applications, A. Clifford Cohen
120. Survey Sampling Principles, E. K. Foreman
121. Applied Engineering Statistics, Robert M. Bethea and R. Russell Rhinehart
122. Sample Size Choice: Charts for Experiments with Linear Models: Second Edition, Robert E. Odeh and Martin Fox
123. Handbook of the Logistic Distribution, edited by N. Balakrishnan
124. Fundamentals of Biostatistical Inference, Chap T. Le
125. Correspondence Analysis Handbook, J.-P. Benzecri
126. Quadratic Forms in Random Variables: Theory and Applications, A. M. Mathai and Serge B. Provost
127. Confidence Intervals on Variance Components, Richard K. Burdick and Franklin A. Graybill
128. Biopharmaceutical Sequential Statistical Applications, edited by Karl E. Peace
129. Item Response Theory: Parameter Estimation Techniques, Frank B. Baker
130. Survey Sampling: Theory and Methods, Arijit Chaudhuri and Horst Stenger
131. Nonparametric Statistical Inference: Third Edition, Revised and Expanded, Jean Dickinson Gibbons and Subhabrata Chakraborti
132. Bivariate Discrete Distributions, Subrahmaniam Kocherlakota and Kathleen Kocherlakota
133. Design and Analysis of Bioavailability and Bioequivalence Studies, Shein-Chung Chow and Jen-pei Liu
134. Multiple Comparisons, Selection, and Applications in Biometry, edited by Fred M. Hoppe
135. Cross-Over Experiments: Design, Analysis, and Application, David A. Ratkowsky, Marc A. Evans, and J. Richard Alldredge
136. Introduction to Probability and Statistics: Second Edition, Revised and Expanded, Narayan C. Giri
137. Applied Analysis of Variance in Behavioral Science, edited by Lynne K. Edwards
138. Drug Safety Assessment in Clinical Trials, edited by Gene S. Gilbert
139. Design of Experiments: A No-Name Approach, Thomas J. Lorenzen and Virgil L. Anderson
140. Statistics in the Pharmaceutical Industry: Second Edition, Revised and Expanded, edited by C. Ralph Buncher and Jia-Yeong Tsay
141. Advanced Linear Models: Theory and Applications, Song-Gui Wang and Shein-Chung Chow
142. Multistage Selection and Ranking Procedures: Second-Order Asymptotics, Nitis Mukhopadhyay and Tumulesh K. S. Solanky
143. Statistical Design and Analysis in Pharmaceutical Science: Validation, Process Controls, and Stability, Shein-Chung Chow and Jen-pei Liu
144. Statistical Methods for Engineers and Scientists: Third Edition, Revised and Expanded, Robert M. Bethea, Benjamin S. Duran, and Thomas L. Boullion
145. Growth Curves, Anant M. Kshirsagar and William Boyce Smith
146. Statistical Bases of Reference Values in Laboratory Medicine, Eugene K. Harris and James C. Boyd
147. Randomization Tests: Third Edition, Revised and Expanded, Eugene S. Edgington
148. Practical Sampling Techniques: Second Edition, Revised and Expanded, Ranjan K. Som
149. Multivariate Statistical Analysis, Narayan C. Giri
150. Handbook of the Normal Distribution: Second Edition, Revised and Expanded, Jagdish K. Patel and Campbell B. Read
151. Bayesian Biostatistics, edited by Donald A. Berry and Dalene K. Stangl
152. Response Surfaces: Designs and Analyses, Second Edition, Revised and Expanded, Andre I. Khuri and John A. Cornell
153. Statistics of Quality, edited by Subir Ghosh, William R. Schucany, and William B. Smith
154. Linear and Nonlinear Models for the Analysis of Repeated Measurements, Edward F. Vonesh and Vernon M. Chinchilli
155. Handbook of Applied Economic Statistics, Aman Ullah and David E. A. Giles
156. Improving Efficiency by Shrinkage: The James-Stein and Ridge Regression Estimators, Marvin H. J. Gruber
157. Nonparametric Regression and Spline Smoothing: Second Edition, Randall L. Eubank
158. Asymptotics, Nonparametrics, and Time Series, edited by Subir Ghosh
159. Multivariate Analysis, Design of Experiments, and Survey Sampling, edited by Subir Ghosh
160. Statistical Process Monitoring and Optimization, edited by Sung H. Park and G. Geoffrey Vining
161. Statistics for the 21st Century: Methodologies for Applications of the Future, edited by C. R. Rao and Gabor J. Szekely

Additional Volumes in Preparation


STATISTICAL PROCESS
MONITORING AND
OPTIMIZATION

edited by

Sung H. Park
Seoul National University
Seoul, Korea

G. Geoffrey Vining
Virginia Polytechnic Institute
and State University
Blacksburg, Virginia

MARCEL DEKKER, INC.    NEW YORK / BASEL
Library of Congress Cataloging-in-Publication Data

Statistical process monitoring and optimization / edited by Sung H. Park and
G. Geoffrey Vining
   p. cm. - (Statistics, textbooks and monographs; v. 160)
   Includes bibliographical references.
   ISBN: 0-8247-6007-7 (alk. paper)
   1. Process control--Statistical methods. 2. Quality control--Statistical methods. 3. Optimal designs (Statistics) I. Park, Sung H. II. Vining, G. Geoffrey. III. Series.
   TS156.8.S7537 1999
   658.5'62--dc21
                                                             CIP

This book is printed on acid-free paper.

Headquarters
Marcel Dekker, Inc.
270 Madison Avenue, New York, NY 10016
tel: 212-696-9000; fax: 212-685-4540

Eastern Hemisphere Distribution
Marcel Dekker AG
Hutgasse 4, Postfach 812, CH-4001 Basel, Switzerland
tel: 41-61-261-8482; fax: 41-61-261-8896

World Wide Web
http://www.dekker.com

The publisher offers discounts on this book when ordered in bulk quantities. For
more information, write to Special Sales/Professional Marketing at the headquarters
address above.

Copyright (c) 2000 by Marcel Dekker, Inc. All Rights Reserved.

Neither this book nor any part may be reproduced or transmitted in any form or by
any means, electronic or mechanical, including photocopying, microfilming, and
recording, or by any information storage and retrieval system, without permission in
writing from the publisher.

Current printing (last digit):
10 9 8 7 6 5 4 3 2 1

PRINTED IN THE UNITED STATES OF AMERICA


Preface
This book has been written primarily for engineers and researchers who want to use some advanced statistical methods for process monitoring and optimization in order to improve quality and productivity in industry, and also for statisticians who want to learn more about recent topics in this general area. The book covers recent advanced topics in statistical reasoning in quality management, control charts, multivariate process monitoring, process capability indices, design of experiments (DOE) and analysis for process control, and empirical model building for process optimization. It will also be of interest to managers, quality improvement specialists, graduate students, and other professionals with an interest in statistical process control (SPC) and its related areas.

In August 1995, the International Conference on Statistical Methods and Statistical Computing for Quality and Productivity Improvement (ICSQP'95) was held in Seoul, Korea, and many of the authors of this book participated. A year after the conference, the editors agreed to edit this book and invited some key conference participants and some other major contributors in the field who did not attend the conference. Authors from 15 nations have joined in this project, making this truly a multinational book. The authors are all well-known scholars in the SPC and DOE areas. The book provides useful information for those who are eager to learn about recent developments in statistical process monitoring and optimization. It also provides an opportunity for joint discussion all over the world in the general areas of SPC and DOE.

We would like to thank Elizabeth Curione, production editor of Marcel Dekker, Inc., for her kind help and guidance, and Maria Allegra, acquisitions editor and manager at Marcel Dekker, Inc., for making the publication of this book possible. We very much appreciate the valuable contributions and cooperation of the authors, which made the book a reality. We sincerely hope that it is a useful source of valuable information for statistical process monitoring and optimization.

We want to dedicate this book to God for giving us the necessary energy, health, and inspiration to write our chapters and to edit this book successfully.

Sung H. Park
G. Geoffrey Vining
Contents

Preface iii
Contributors ix

PART 1 STATISTICAL REASONING IN TOTAL QUALITY MANAGEMENT

1 On-Line Quality Control System Designs
  Genichi Taguchi 1

2 Statistical Monitoring and Optimization in Total Quality Management
  Kai Kristensen 19

3 Quality Improvement Methods and Statistical Reasoning
  G. K. Kanji 35

4 Leadership Profiles and the Implementation of Total Quality Management for Business Excellence
  Jens J. Dahlgaard, Su Mi Park Dahlgaard, and Anders Norgaard 45

5 A Methodological Approach for the Integration of SPC and EPC in Discrete Manufacturing Processes
  Enrique Del Castillo, Rainer Gob, and Elart von Collani 77

6 Reliability Analysis of Customer Claims
  Pasquale Erto 107

PART 2 CONTROL CHARTS AND PROCESS MONITORING

7 Some Recent Developments in Control Charts for Monitoring a Proportion
  Marion R. Reynolds, Jr., and Zachary G. Stoumbos 117

8 Process Monitoring with Autocorrelated Data
  Douglas C. Montgomery and Christina M. Mastrangelo 139

9 An Introduction to the New Multivariate Diagnosis Theory with Two Kinds of Quality and Its Applications
  Gongxu Zhang 161

10 Applications of Markov Chains in Quality-Related Matters
   Min-Te Chao 175

11 Joint Monitoring of Process Mean and Variance Based on the Exponentially Weighted Moving Averages
   Fah Fatt Gan 189

PART 3 MULTIVARIATE PROCESS MONITORING AND CAPABILITY INDICES

12 Multivariate Quality Control Procedures
   A. J. Hayter 209

13 Autocorrelation in Multivariate Processes
   Robert L. Mason and John C. Young 223

14 Capability Indices for Multiresponse Processes
   Alan Veevers 241

15 Pattern Recognition and Its Applications in Industry
   R. Gnanadesikan and J. R. Kettenring 257

16 Assessing Process Capability with Indices
   Fred A. Spiring 269

PART 4 EXPERIMENTAL DESIGN AND ANALYSIS FOR PROCESS CONTROL

17 Experimental Strategies for Estimating Mean and Variance Function
   G. Geoffrey Vining, Diane A. Schaub, and Carl Modigh 291

18 Recent Developments in Supersaturated Designs
   Dennis K. J. Lin 305

19 Statistical Methods for Product Development: Prototype Experiments
   David M. Steinberg and Soren Bisgaard 321

20 Optimal Approximate Designs for B-Spline Regression with Multiple Knots
   Norbert Gaffke and Berthold Heiligers 339

21 On Dispersion Effects and Their Identification
   Bo Bergman and Anders Hynen 359

PART 5 EMPIRICAL MODEL BUILDING AND PROCESS OPTIMIZATION

22 A Graphical Method for Model Fitting in Parameter Design with Dynamic Characteristics
   Sung H. Park and Je H. Choi 373

23 Joint Modeling of the Mean and Dispersion for the Analysis of Quality Improvement Experiments
   Youngjo Lee and John A. Nelder 387

24 Modeling and Analyzing the Generalized Interaction
   Chihiro Hirotsu 395

25 Optimization Methods in Multiresponse Surface Methodology
   Andre I. Khuri and Elsie S. Valeroso 411

26 Stochastic Modeling for Quality Improvement in Processes
   M. F. Ramalhoto 435

27 Recent Developments in Response Surface Methodology and Its Applications in Industry
   Angela R. Neff and Raymond H. Myers 457

Index 483
Contributors

Bo Bergman, Ph.D. Linkoping University, Linkoping, and Department of TQM, School of Technology Management, Chalmers University of Technology, Gothenburg, Sweden

Soren Bisgaard, Ph.D. Institute for Technology Management, University of St. Gallen, St. Gallen, Switzerland

Min-Te Chao, Ph.D. Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, Republic of China

Je H. Choi, Ph.D. Statistical Analysis Group, Samsung Display Devices Co., Ltd., Suwon, Korea

Jens J. Dahlgaard, Dr. Merc. Department of Information Science, The Aarhus School of Business, Aarhus, Denmark

Su Mi Park Dahlgaard, M.Sc., Lic. Oecon. Department of Information Science, The Aarhus School of Business, Aarhus, Denmark

Enrique Del Castillo, Ph.D. Department of Industrial and Manufacturing Engineering, The Pennsylvania State University, University Park, Pennsylvania

Pasquale Erto, M.D. Department of Aeronautical Design, University of Naples Federico II, Naples, Italy

Norbert Gaffke, Ph.D. Department of Mathematics, Universitat Magdeburg, Magdeburg, Germany

Fah Fatt Gan, Ph.D. Department of Statistics and Applied Probability, National University of Singapore, Singapore, Republic of Singapore

R. Gnanadesikan, Ph.D. Department of Statistics, Rutgers University, New Brunswick, New Jersey

Rainer Gob, Dr. Institute of Applied Mathematics and Statistics, University of Wuerzburg, Wuerzburg, Germany

A. J. Hayter, Ph.D. Department of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia

Berthold Heiligers, Ph.D. Department of Mathematics, Universitat Magdeburg, Magdeburg, Germany

Chihiro Hirotsu, Ph.D. Department of Mathematical Engineering and Information Physics, University of Tokyo, Tokyo, Japan

Anders Hynen, Ph.D. Department of Systems Engineering, ABB Corporate Research, Vasteras, Sweden

G. K. Kanji, B.Sc., M.Sc., Ph.D. Department of Statistics, Sheffield Business School, Sheffield Hallam University, Sheffield, England

J. R. Kettenring, Ph.D. Mathematical Sciences Research Center, Telcordia Technologies, Morristown, New Jersey

Andre I. Khuri, Ph.D. Department of Statistics, University of Florida, Gainesville, Florida

Kai Kristensen, Dr. Merc. Department of Information Science, The Aarhus School of Business, Aarhus, Denmark

Youngjo Lee, Ph.D. Department of Statistics, Seoul National University, Seoul, Korea

Dennis K. J. Lin, Ph.D. Department of Management and Information Systems, The Pennsylvania State University, University Park, Pennsylvania

Robert L. Mason, Ph.D. Statistical Analysis Section, Southwest Research Institute, San Antonio, Texas

Christina M. Mastrangelo, Ph.D. Department of Systems Engineering, University of Virginia, Charlottesville, Virginia

Carl Modigh Arkwright Enterprises Ltd., Paris, France

Douglas C. Montgomery, Ph.D. Department of Industrial Engineering, Arizona State University, Tempe, Arizona

Raymond H. Myers, Ph.D. Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, Virginia

Angela R. Neff, Ph.D. Department of Corporate Research and Development, General Electric, Schenectady, New York

John A. Nelder, D.Sc., F.R.S. Department of Mathematics, Imperial College, London, England

Anders Norgaard, M.Sc. Department of Information Science, The Aarhus School of Business, Aarhus, Denmark, and Bulon Management, Viby, Denmark

Sung H. Park, Ph.D. Department of Statistics, Seoul National University, Seoul, Korea

M. F. Ramalhoto, Ph.D. Department of Mathematics, Technical University of Lisbon, "Instituto Superior Tecnico," Lisbon, Portugal

Marion R. Reynolds, Jr., Ph.D. Departments of Statistics and Forestry, Virginia Polytechnic Institute and State University, Blacksburg, Virginia

Diane A. Schaub, Ph.D. Department of Industrial and Systems Engineering, University of Florida, Gainesville, Florida

Fred A. Spiring, Ph.D. Department of Statistics, The University of Manitoba, and Department of Quality, Pollard Banknote Limited, Winnipeg, Manitoba, Canada

David M. Steinberg, Ph.D. Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv, Israel

Zachary G. Stoumbos, Ph.D. Department of Management Science and Information Systems, and Rutgers Center for Operations Research (RUTCOR), Rutgers University, Newark, New Jersey

Genichi Taguchi, D.Sc. Ohken Associate, Tokyo, Japan

Elsie S. Valeroso, Ph.D. Department of Mathematics and Statistics, Montana State University, Bozeman, Montana

Alan Veevers, B.Sc., Ph.D. Department of Mathematical and Information Sciences, Commonwealth Scientific and Industrial Research Organization, Clayton, Victoria, Australia

G. Geoffrey Vining, Ph.D. Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, Virginia

Elart von Collani, Dr. rer. nat., Dr. rer. nat. habil. School of Economics, University of Wuerzburg, Wuerzburg, Germany

John C. Young, Ph.D. Department of Mathematics and Computer Science, McNeese State University, Lake Charles, Louisiana

Gongxu Zhang Research Institute of Management Science, Beijing University of Science and Technology, Beijing, People's Republic of China
On-Line Quality Control System Designs

Genichi Taguchi
Ohken Associate, Tokyo, Japan

1. INTRODUCTION

It is the responsibility of a production department to produce a product that meets a designed quality level at the lowest cost. However, it is important not merely to have the product quality meet specifications but also to endeavor to bring quality as close as possible to the ideal value.

2. JAPANESE PRODUCTS AND AMERICAN PRODUCTS

Many Japanese read an article on April 17, 1979, on the front page of Asahi Shinbun, one of the most widely circulated newspapers in Japan, regarding a comparison of the quality of color television sets produced by the Sony factory in Japan with that of TVs produced by the Sony factory in San Diego, California. The comparison was made on the basis of the color distribution, which is related to the color balance. Although both factories used the same design, the TVs from the San Diego factory had a bad reputation, and Americans preferred the products from Japan. Based on this fact, Mr. Yamada, the vice president of Sony United States at that time, described the difference in the article.

[Figure 1: two quality distributions plotted over the tolerance range m - 5 to m + 5.]

Figure 1 Distribution of color quality in television sets of Sony U.S. and Sony Japan.

The difference in the quality characteristic distributions is shown in Figure 1. It is seen from the figure that the color quality of the Japanese-made

TVs shown by the solid curve has approximately a normal distribution with the target value at the center; its standard deviation is about one-sixth of the tolerance, or 10 in certain units.

In quality control, the index of the tolerance divided by six standard deviations is called the process capability index, denoted by Cp:

   Cp = tolerance/(6 x standard deviation)   (1)

The process capability of the Japanese-made TVs is therefore 1, and the average quality level coincides with the target value.

The quality distribution of the sets produced in San Diego, shown by the dash-dot curve, on the other hand, has less out-of-specification product than that of the Japanese-made sets and is quite similar to the uniform distribution for those products that are within the tolerance. Since the standard deviation of the uniform distribution is given by 1/sqrt(12) of the tolerance, its process capability index is given by

   Cp = tolerance/[(tolerance/sqrt(12)) x 6] = sqrt(12)/6 = 0.577   (2)

which shows that its process capability index is worse than that of the Japanese-made product.

A product with out-of-tolerance quality is a bad product. It is an unpassed product, so it should not be shipped out. From the opposite point of view, a product within tolerance should be considered good and should be shipped. In a school examination, a score above 60 with 100 as the full mark is considered to be a passing grade. A product quality that coincides with its target value should have a full mark. Quality gradually becomes worse when it deviates from the target value, and fails when it exceeds the specification limits, or +/-5 in this example.

In a school examination, a score of 59 or below is failing; 60 or above is passing. The scores are normally classified into the following grades:

   60-69   D
   70-79   C
   80-89   B
   90-100  A

I put grades A, B, and C in Figure 1. It is seen that the Japanese-made TVs have more A's and fewer B's and C's.

To reduce the Japan-United States difference, Mr. Yamada dictated a narrower tolerance for the San Diego factory, specifying B as the lowest allowable quality limit. This is wrong, since specifying a more severe tolerance because of inferior process capability is similar to raising the passing score from 60 to 70 because of the incapability of students. In schools, teachers do not raise the limit for such students. Instead, teachers lower the passing limit.
As stated above, loss is caused when the quality characteristic (denoted by y) deviates from the target value (denoted by m), regardless of how small the deviation is. Let the loss be denoted by L(y). L(y) is the minimum when y coincides with the target value m, and we may put the loss to be 0:

   L(m) = 0   (3)

When y = m, L(y) is zero or minimum, and its differential coefficient is, accordingly, zero:

   L'(m) = 0   (4)

Using the Taylor expansion, the loss function L(y) is expanded as

   L(y) = L(m) + L'(m)(y - m) + [L''(m)/2!](y - m)^2 + ...   (5)
The constant and linear terms (the differential terms) become zero from Eqs. (3) and (4). If the third-order and following terms can be omitted, the loss function is then

   L(y) = k(y - m)^2   (6)

Let the allowance, that is, the permitted deviation of y from the middle value m, be denoted by D. The more y deviates from m, the middle value, the more loss is caused. A product whose deviation is less than its allowance D should pass inspection; otherwise the company will lose more. When the deviation exceeds the allowance, the product should not be passed. Therefore, when the deviation is equal to the allowance, its loss is equal to the loss due to the disposal of the failed product.

Let A (yen) signify the loss caused by disposing of a failed product. Putting A and the allowance D in Eq. (6), k is obtained as

   k = (loss of disposing of a failed product)/(allowance)^2 = A/D^2   (7)

Assume that the cost of repairing a failed color TV set is 600 yen. k is then calculated as

   k = 600/5^2 = 24.0 (yen)   (8)

The loss function is therefore

   L = 24.0(y - m)^2   (9)
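To make the arithmetic concrete, here is a minimal Python sketch of the quadratic loss of Eqs. (6)-(9); the function names are our own, and the numbers are those of the TV example (A = 600 yen, allowance 5):

def loss_coefficient(A, delta):
    # k = (loss of disposing of a failed product) / (allowance)^2, Eq. (7)
    return A / delta**2

def quadratic_loss(y, m, k):
    # L(y) = k * (y - m)^2, the quadratic loss of Eq. (6)
    return k * (y - m)**2

k = loss_coefficient(A=600, delta=5)    # 24.0 yen, Eq. (8)
print(quadratic_loss(y=3, m=0, k=k))    # deviation of 3 units -> 216.0 yen
print(quadratic_loss(y=5, m=0, k=k))    # at the allowance -> 600.0 yen = A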

This equation applies to the case when a single product is manufactured. An electrical manufacturing company in India (BHEL) said to me, "Our company manufactures only one product, a certain type of nuclear power station. There is no second machine of the same type produced. Since the variation of a single product is zero, standard deviation in statistics is not applicable in our case."

Variation is measured by the deviation from a target value or an ideal value. Therefore, it can be obtained from Eq. (6) even when only one product is produced. When there is more than one product, the average of Eq. (6) is calculated. Variance (s^2), the average of the square of the differences between y and the target value, is used for this purpose. s^2 is correctly called the average error squared, but we will call it variance for simplicity:

   s^2 = average of (y - m)^2   (10)

The loss function is given by

   L = k s^2   (11)

The quality level difference between Sony U.S. and Sony Japan is calculated from Eq. (11) as shown in Table 1.

Table 1 shows that although the fraction defective of the Japanese Sony factory is larger, its loss is one-third that of the U.S. Sony factory. In other words, the Japanese quality level is three times higher. If Vice President Yamada had specified a narrower tolerance such as 10 x 2/3, the quality level would be improved (assuming a uniform distribution within the tolerance limits):

   L = 24.0 x (1/12) x (10 x 2/3)^2 = 88.9 (yen)   (12)

This shows that there is a 111.1 (= 200.0 - 88.9) yen improvement, but that the Sony U.S. quality level is still 22.2 (= 88.9 - 66.7) yen worse than that of Sony Japan.

If such an improvement were attained by repairing or adjusting failed products whose quality level exceeds m +/- 10/3 but lies within m +/- 5, holding 33.3% of the total production as seen from Figure 1, at a cost of 600 yen per unit, then the cost of repair per unit would be

   600 x 0.333 = 200 (yen)   (13)

An 111.1 yen quality improvement at a cost of 200 yen is not profitable. The correct solution to this problem is to apply both on-line and off-line quality control techniques.
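As an aside, the comparison in Table 1 below is easy to reproduce; the following sketch (variable names ours) evaluates k x s^2 under the normal and uniform models stated in the text:

import math

k = 24.0            # yen per squared unit, Eq. (8)
tolerance = 10.0    # full width of m +/- 5

# Sony Japan: approximately normal with sigma = tolerance/6
sigma_jp = tolerance / 6
loss_jp = k * sigma_jp**2                                    # ~66.7 yen
frac_jp = 2 * (1 - 0.5 * (1 + math.erf(3 / math.sqrt(2))))   # P(|Z| > 3)

# Sony U.S.: roughly uniform within tolerance, sigma = tolerance/sqrt(12)
sigma_us = tolerance / math.sqrt(12)
loss_us = k * sigma_us**2                                    # 24 * 100/12 = 200.0 yen

print(round(loss_jp, 1), round(100 * frac_jp, 2))   # 66.7 yen, 0.27 %
print(round(loss_us, 1))                            # 200.0 yen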
Table 1 Quality Comparison Between Sony Japan and Sony U.S.

                          Standard                    Loss L    Fraction
  Country      Average    deviation     Variance      (yen)     defective (%)
  Japan           m         10/6        (10/6)^2       66.7        0.27
  United States   m       10/sqrt(12)   100/12        200.0        0.00

go-nogogauges,whichdetermineonlypass or fail; there was a lack of


consciousness of the importance of the quality distribution within tolerance.
It was proposed that Shewhart’s control charts be used to control qualityby
the distribution of quality characteristics on production lines as a substitute
for a method using specification and inspection. Inspectors tend to consider
production quality as perfect if the fraction defective is zero. In Japan, none
of the companies that product JIS (Japanese Industrial Standards) products
are satisfied producing products whose quality level marginally passes the
JISspecifications.Instead,thecompaniesalwaysattempt to reducethe
quality distribution within the tolerance range. Nippon Denso Company,
for example, demands that its production lines and vendors improve their
process capability indexes above 1.33.
To determine the process capability index, data y1, y2, ..., yn are collected once or a few times a day for 3 months. The standard deviation is obtained from the following equation, where m is the target value:

   s = [(1/n) Sum_i (y_i - m)^2]^(1/2)   (14)

The process capability index Cp is calculated as

   Cp = tolerance/(6s)   (15)

The loss function L(y) is then determined as

   L = k s^2   (16)
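A hedged sketch of this procedure in Python (the data below are invented for illustration; only the formulas of Eqs. (14)-(16) come from the text):

import math

def process_stats(y, m, tolerance, k):
    # sigma about the target m, Eq. (14); Cp, Eq. (15); loss, Eq. (16)
    sigma = math.sqrt(sum((yi - m)**2 for yi in y) / len(y))
    return sigma, tolerance / (6 * sigma), k * sigma**2

# hypothetical daily measurements around a target m = 0, tolerance 10
y = [0.5, -1.2, 0.8, 2.0, -0.3, 1.1, -1.8, 0.2]
sigma, cp, loss = process_stats(y, m=0.0, tolerance=10.0, k=24.0)
print(round(sigma, 3), round(cp, 2), round(loss, 1))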

3. WHAT IS ON-LINE QUALITY CONTROL?

Manufacturers contribute to society and grow through a series of activities including product planning, product development, design, production, and sales. Within these steps, routine quality control activity on production lines is called on-line quality control. It includes the following three activities:

1. Diagnosis and adjustment of processes. This is called process control. A manufacturing process is diagnosed at constant intervals. When the process is judged to be normal, production is continued; otherwise the cause of abnormality is investigated, the abnormal condition is corrected, and production is restarted. Preventive activities such as adjusting a manufacturing process when it appears to become abnormal are also included in this case.
2. Prediction and modification. In order to control a variable quality characteristic in a production line, measurements are made at constant intervals. From the measurement results, the average quality of the products to be produced is predicted. If the predicted value deviates from the target value, corrective action is taken by moving the level of a variable (called a signal factor in on-line quality control) to bring the deviation back to the target value. This is called feedback control.
3. Measurement and disposition. This is also called inspection. Every product from a production line is measured, and its disposition, such as scrapping or repair, is decided on when the result shows the product to be out of specification.

Case 3 is different from cases 1 and 2 in that a manufacturing process is the major object of treatment for cases 1 and 2, while products are the sole object of disposition in case 3.

The above cases are explained by an example of controlling the sensors or measuring systems used in robots or in automatic control. Measurement and disposition, case 3, concerns products, classifying them into pass and fail categories and disposing of them. In a measuring system, it is important to inspect the measuring equipment and to determine whether the system should be passed or failed. This is different from the calibration of equipment. Calibration is meant to correct the deviation of parameters of a piece of measuring equipment after a certain period of use, which corresponds to the concept implied in case 2, prediction and modification.

When measuring equipment falls out of calibration, either gradually or suddenly, it is replaced or repaired, which corresponds to the concept in case 1, diagnosis and adjustment. It is difficult in many cases to decide if the equipment should be repaired or scrapped. Generally, the decision to repair or replace is made when the error of the measuring equipment exceeds the allowance of the product quality characteristic.

When measuring equipment cannot be adjusted by calibration and has to be repaired or scrapped (called adjustment in on-line quality control), and when there is a judging procedure (called diagnosis in on-line quality control) for these actions, it is more important to design a diagnosis and adjustment system than to design a calibrating system.

Radical countermeasures such as determining the cause of the variation, followed by taking action to prevent a relapse (which are described in control chart methods and called off-line quality control countermeasures), are not discussed in this chapter. I am confident that a thorough on-line quality control system design is the way to keep production lines from falling out of control. It is the objective of this chapter to briefly describe on-line quality control methods and give their theoretical background.

4. EQUATION AND AN EXAMPLE FOR DIAGNOSIS AND ADJUSTMENT

In I Motor Company in the 1970s, there were 28 steps in the truck engine cylinder block production line. Quality control activity is necessary to ensure normal production at each step. One of the steps, called boring by reamers, is explained as an example; it is also described in detail in Ref. 1.

Approximately 10 holes are bored at a time in each cylinder block by reamers. A cylinder block is scrapped as defective if there is any hole bore that is misaligned by more than 10 micrometers, causing an 8000 yen loss, which is denoted by A. The diagnosis cost to know whether holes are being bored straight, designated by B, is 400 yen, and the diagnosing interval, denoted by n, is 30 units. In the past half-year, 18,000 units were produced, and there were seven quality control problems.

The average problem occurrence interval, denoted by u, is then

   u = 18,000/7 = 2570 (units)   (17)

When there is a problem such as a crooked hole, production is stopped, reamers are replaced, the first hole bored after the replacement is checked, and if it is normal, production is continued. The total cost, including the cost of stopping the production line, tool replacement, and labor, is called the adjustment cost; it is denoted by C and is equal to 20,000 yen in this example.

In such a process adjustment for on-line quality control, the parameters characterizing the three system elements (the process, the diagnosing method, and the adjusting method) include A, B, C, u, and l (the time lag caused by diagnosis). The quality control cost when the diagnosis interval is n is given by the theory described in Sections 5 and 6 as follows:

   L = B/n + [(n + 1)/2](A/u) + C/u + lA/u   (18)

Putting n = 30, A = 8000 yen, B = 400 yen, C = 20,000 yen, u = 2570, and l = 1 unit in the above equation, the quality control cost per unit product of this example would be
   L = 400/30 + [(30 + 1)/2](8000/2570) + 20,000/2570 + (1 x 8000)/2570
     = 13.3 + 48.2 + 7.8 + 3.1
     = 72.4 (yen)   (19)

With annual production of 36,000 units, the total cost would be 72.4 x 36,000 = 2,610,000 yen. Improvement in quality control is needed to reduce the quality control cost given by Eq. (19). For this purpose, there are two kinds of methods: pertinent engineering techniques and managerial techniques. The former countermeasures include simplification of the diagnosis method or reduction of the adjustment cost, which must be specifically researched case by case. For this, see Chapters 4-8 of Ref. 1.
There are methods to reduce the quality control cost while keeping the current process, current diagnosis, and adjustment methods unchanged. These managerial techniques are soft techniques applicable to all kinds of production processes. Two of these techniques are introduced in this chapter. One is the determination of the diagnosis interval, and the other is the introduction of preventive maintenance such as periodic replacement.

The optimum diagnosis interval is given by

   n = [2(u + l)B/(A - C/u)]^(1/2)   (20)

In the example of the boring process,

   n = [2(2570 + 1) x 400/(8000 - 20,000/2570)]^(1/2) = 16 (units)   (21)

The quality control cost from Eq. (19) when the diagnosis interval is 16 is

   L = 400/16 + [(16 + 1)/2](8000/2570) + 20,000/2570 + (1 x 8000)/2570
     = 25.0 + 26.5 + 7.8 + 3.1 = 62.4 (yen)   (22)

There is a savings of 72.4 - 62.4 = 10.0 yen per unit product, or 360,000 yen per year. The value of L does not change significantly even when n varies by 20%. When n = 20, for example,
   L = 400/20 + [(20 + 1)/2](8000/2570) + 20,000/2570 + (1 x 8000)/2570
     = 20.0 + 32.7 + 7.8 + 3.1 = 63.6 (yen)   (23)

The difference from Eq. (22) is only 1.2 yen. It is permissible to allow about 20% error in the values of the system parameters A, B, C, u, and l, or to adjust n within a range of 20% after the optimum diagnosis interval is determined.
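The figures above are easy to reproduce; the following sketch (helper names are ours) tabulates the cost of Eq. (18) and the optimum interval of Eq. (20) for the boring example:

import math

def qc_cost(n, A, B, C, ubar, ell):
    # quality control cost per unit product, Eq. (18)
    return B / n + (n + 1) / 2 * A / ubar + C / ubar + ell * A / ubar

def optimum_interval(A, B, C, ubar, ell):
    # optimum diagnosis interval, Eq. (20)
    return math.sqrt(2 * (ubar + ell) * B / (A - C / ubar))

params = dict(A=8000, B=400, C=20_000, ubar=2570, ell=1)
print(qc_cost(30, **params))         # ~72.4 yen, Eq. (19)
print(optimum_interval(**params))    # ~16 units, Eq. (21)
print(qc_cost(16, **params))         # ~62.4 yen, Eq. (22)
print(qc_cost(20, **params))         # ~63.6 yen, Eq. (23)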
Next, the introduction of a preventive maintenance system is explained. In preventive maintenance activities there are periodic checks and periodic replacement. In periodic replacement, a component part (which could be the cause of the trouble) is replaced with a new one at a certain interval. For example, a tool with an average life of 3000 units of product is replaced after producing 2000 units, without checking.

Periodic checking is done to inspect products at a certain interval and replace tools if product quality is within specification at the time inspected but there is the possibility that it might become out-of-specification before the next inspection. In this chapter, periodic replacement is described.

In the case of reamer boring, a majority of the problems are caused by tools. The average problem-causing interval is u = 2570 units, and periodic replacement is made at an interval of u' = 1500, which is much shorter than the average life. Therefore, the probability of the process causing trouble becomes very small. Assume that the replacement cost, denoted by C', is approximately the same as the adjustment cost C, or 18,000 yen. Assume also that the probability of the process causing trouble is 0.02. This probability includes the instance of a reamer being bent by pinholes existing in the cylinder block, or some other cause. Then the true average problem-causing interval will be improved from the current 2570 units to

   u = 1500/0.02 = 75,000   (24)

The optimum diagnosis interval n would be

   n = [2 x (75,000 + 1) x 400/(8000 - 20,000/75,000)]^(1/2) = 87, rounded to 100 (units)   (25)

The quality control cost is then
   L = (preventive maintenance cost) + (diagnosis and adjustment cost)
     = 18,000/1500 + [400/100 + [(100 + 1)/2](8000/75,000) + 20,000/75,000 + (1 x 8000)/75,000]
     = 12.0 + (4.0 + 5.4 + 0.3 + 0.1)
     = 12.0 + 9.8 = 21.8 (yen)   (26)

This is an improvement of 63.6 - 21.8 = 41.8 yen per unit compared to the case without preventive maintenance, which is equivalent to 1,500,000 yen per annum. If there were similar improvements in each of the other 27 cylinder block production steps, it would be an improvement of 42 million yen per annum.

Such a quality control improvement is equivalent to the savings that might be obtained from extending the average interval between problems 6.3 times without increasing any cost. In other words, this preventive maintenance method has a merit parallel to that of an engineering technology so fantastic that it could extend the problem-causing interval 6.3 times without increasing any cost. For details, see Chapters 4-6 of Ref. 1.
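The preventive maintenance comparison can be scripted the same way; this sketch (structure and names ours) repeats the two helpers from the previous sketch and reproduces the 21.8 yen figure of Eq. (26):

import math

def qc_cost(n, A, B, C, ubar, ell):          # Eq. (18), as before
    return B / n + (n + 1) / 2 * A / ubar + C / ubar + ell * A / ubar

def optimum_interval(A, B, C, ubar, ell):    # Eq. (20), as before
    return math.sqrt(2 * (ubar + ell) * B / (A - C / ubar))

C_repl, u_repl, p_trouble = 18_000, 1500, 0.02   # periodic replacement data

pm_cost = C_repl / u_repl                # 12.0 yen per unit, preventive part
ubar_new = u_repl / p_trouble            # 75,000 units, Eq. (24)

params = dict(A=8000, B=400, C=20_000, ubar=ubar_new, ell=1)
print(optimum_interval(**params))        # ~87, rounded to 100 units, Eq. (25)
print(pm_cost + qc_cost(100, **params))  # ~21.8 yen, Eq. (26)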
Equations (18) and (20) may be applied approximately, with satisfactory results, regardless of the distribution of the production quantity before the problem and despite variations in the fraction defective during the problem period. These statements are proved in Sections 5 and 6.

5. PROOF OF EQUATIONS FOR NONSPECIFIC DISTRIBUTION

Parameters A, B, C, u, l, and n in the previous section are used similarly in this section. Let P_i (i = 1, 2, ...) be the probability of causing trouble for the first time at the ith unit after production was started. The probability of causing trouble for the first time at the kth diagnosis is

   P_{(k-1)n+1} + P_{(k-1)n+2} + ... + P_{kn}   (27)

When a problem is caused at the kth diagnosis, the number of defectives varies from the maximum of n units to 1; its average number of defective units is given by

   [Sum_{i=1}^{n} (n - i + 1) P_{(k-1)n+i}] / [Sum_{i=1}^{n} P_{(k-1)n+i}]   (28)

Assuming that

   P_{(k-1)n+1} = P_{(k-1)n+2} = ... = P_{kn}, approximately,   (29)

the average number of defective units will be (n + 1)/2.

In the loss function L, the average number of defectives in the second term is therefore

   (n + 1)/2   (30)

Since the first, third, and fourth terms of Eq. (18) are self-explanatory, the loss function is given by

   L = B/n + [(n + 1)/2](A/u) + C/u + lA/u   (31)

Next, the equation for the optimum diagnosis interval is derived. The average problem-causing interval is u. Since the diagnosis is made at n-unit intervals, it is more correct to consider the losses from actual recovery actions, or the time lag they cause, as occurring once every u + n/2 units. Therefore, u + n/2 is substituted for u in Eq. (31):

   L = B/n + [(n + 1)/2] A/(u + n/2) + C/(u + n/2) + lA/(u + n/2)   (32)

It is easily understood from the previous example that u is much larger than n/2. Also, since there is already an n in the second term of the equation, the u + n/2 in its denominator may be approximated by u. If the approximation is made, the equation for L is then approximated as

   L = B/n + [(n + 1)/2](A/u) + [C/(u + n/2)] + [lA/(u + n/2)]   (33)
Differentiating the above equation with respect to n and then setting the derivative to zero gives

   -B/n^2 + [1/(2(u + l))](A - C/u - lA/u) = 0   (34)

Solving this equation, n is obtained as

   n = [2(u + l)B/(A - C/u - lA/u)]^(1/2)   (35)

Since

   A >> C/u and A >> lA/u

the following approximation is made:

   A - C/u - lA/u = A - C/u, approximately   (36)

Putting (36) into (35), n is given by Eq. (20).
6. PROOF OF EQUATIONS FOR A LARGE FRACTION OF DEFECTIVES DURING TROUBLE PERIOD

In this section, it is proved that Eqs. (18) and (20) can be used approximately even if the fraction of defectives during the trouble period is not 100%.

When a process is under normal conditions, it may be deemed that there are no defectives. Assume that the fraction defective under abnormal conditions is p and that the loss when a defective unit is not disposed of but is sent on to the following steps is D yen. After the process causes trouble, the probability of detecting the trouble at a diagnosis is p and the probability of failing to detect it is 1 - p. Accordingly, the average number of defectives at the time the trouble is detected is (n + 1)p/2. The probability of detecting the problem at the second diagnosis, after missing the detection at the first diagnosis, is (1 - p)p; in that case the average number of defectives the inspection fails to detect is (n + 1)/2, and the number detected is np units. Thus we obtain Table 2.

From Table 2, the average loss by defectives when a process is in trouble is

   [(n + 1)p/2]A + np[(1 - p)^2 p + 2(1 - p)^3 p + ... + (i - 2)(1 - p)^(i-1) p + ...]D   (37)
D is normally much larger than A. The amount of loss in Eq. (37) is minimum when p = 1 and becomes larger when p is close to zero. Putting p = 0 in Eq. (37) gives nD, showing that the equations for L and n should be changed from Eqs. (18) and (20) to
Table 2 Diagnosis and Probability of Problem Detection
   L = B/n + n(D/u) + C/u + lD/u   (38)

and

   n = [2(u + l)B/(2D - C/u)]^(1/2)   (39)

where (n + 1) = n is approximated.
When the fraction defective during the trouble period is not 100%, it is normal to trace back and find defectives when a trouble is found. In this case, there are no undetected defectives, so D = A. Equation (37) is therefore

   [(n + 1)p/2]A + n(1 - p)^2 A   (40)

Putting n + 1 = n, Eq. (40) becomes

   nA[p/2 + (1 - p)^2]   (41)

Therefore, the loss after tracing back to find defectives is nA at maximum and nA/2 at minimum. If the equations for L and n were determined as

   L = B/n + n(A/u) + C/u + lA/u   (42)

and

   n = [2(u + l)B/(2A - C/u)]^(1/2)   (43)

it would become overdiagnosis, which is too costly. Although the fraction defective can have any value, it is good enough to consider p to be about 0.5. In that case, 1.5A is used instead of A. As described before, L and n are not significantly affected by an error in A of up to 50%, so Eqs. (18) and (20) can be satisfactorily applied.
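The detection mechanism of this section (each diagnosis catches the trouble only with probability p) is easy to study by simulation. The following Monte Carlo sketch is our own illustration, not from the text; it counts the units produced from trouble onset to detection, whose expectation is (n + 1)/2 + n(1 - p)/p:

import random

def mean_units_to_detection(n, p, trials=100_000):
    # Units produced from trouble onset until a diagnosis detects it,
    # when each diagnosis detects the trouble only with probability p
    # (every such unit is defective during the trouble period).
    total = 0
    for _ in range(trials):
        count = random.randint(1, n)   # units until the next diagnosis
        while random.random() > p:     # the diagnosis misses the trouble
            count += n                 # one more full interval elapses
        total += count
    return total / trials

# (n + 1)/2 + n(1 - p)/p = 8.5 + 16 = 24.5 for n = 16, p = 0.5
print(mean_units_to_detection(16, 0.5))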

7. PREDICTION AND MODIFICATION

In the control of a variable quality characteristic, a signal factor is used for correcting the deviation of the characteristic from a target value. For example, the pressure of a press is a signal factor to control the thickness of steel sheets, and the flow of fuel is a signal factor to control temperature. For such control, the following three steps must be taken:

1. Determine the optimum measuring interval.
2. Forecast the average quality of the products to be produced before the next measurement.
3. Determine the optimum modifying quantity against the deviation of the forecasted value from the target value.

After the above parameters are determined,

4. Modify the quality characteristic by varying the level of the signal factor.

To determine the optimum modifying quantity, an analysis of variance method called cyclic analysis and the following loss equation (for the loss caused by variation) are useful:

   L = k s^2   (44)

where

   k = (loss caused by out-of-specification product)/(allowance)^2   (45)

   s^2 = average of the squared error from the target value   (46)

For Eq. (44), see Ref. 1, Chapters 1 and 2. The simplest prediction method is to consider the measured value itself as the average quality of all products to be produced before the next measurement. There are many methods for this purpose. However, it is important to determine the error variance of such a prediction.

The optimum modifying quantity in step 3 is determined by forecasting the average in step 2, which is denoted by y*, and calculating the following quantity:

   optimum modifying quantity = -b(y* - y0)   (47)

where y0 is the target value and b is the modification coefficient of Eq. (48).

Recently, more and more production systems using automatic machinery and robots that handle the four steps listed above have been developed. For such systems, the center of quality control is the calibration of the sensors (measuring devices) employed by the automatic machinery or robots and the diagnosis of hunting phenomena. Steps 1-4 are therefore required.
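As a minimal sketch of the feedback loop of steps 1-4: everything below is assumed for illustration (the toy drift model, the measurement noise, and the coefficient b = 1, i.e., full correction); only the correction rule -b(y* - y0) of Eq. (47) is from the text.

import random

class SheetProcess:
    # toy process: the mean level drifts upward between measurements
    def __init__(self, level=0.0):
        self.level = level
    def run_interval(self):
        self.level += 3.0                        # assumed drift per interval
        return self.level + random.gauss(0, 1)   # noisy measurement at interval end
    def shift(self, amount):
        self.level += amount                     # move the signal factor

p, y0, beta = SheetProcess(), 0.0, 1.0
for _ in range(5):
    y_star = p.run_interval()         # step 2: measured value as the forecast
    p.shift(-beta * (y_star - y0))    # steps 3-4: correction of Eq. (47)
    print(round(p.level, 2))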
A simple example is illustrated in the following. The specification of the thickness of a metal sheet is m +/- 5 micrometers. The loss caused by defects is 300 yen per meter. The daily production is 20,000 m, and the production line operates 5 days a week, or 40 hr a week. Currently, measurement is made once every 2 hr, costing 2000 yen for the measurement and adjustment (correction or calibration). There is a tendency for the average and the variation of thickness to increase during the course of production. The average thickness increases 3 micrometers every 2 hr, and the error variance increases 8 square micrometers in 2 hr.

Since the production is 5000 m in 2 hr, the average variance s^2 of the products during 2 hr, assuming that adjustment is correctly made at the time of measurement, is

   s^2 = 3^2/3 + 8/2 = 3.0 + 4.0 = 7.0   (49)

The daily loss in the loss function L, including the correcting cost, is

   L = (300/5^2) x 7.0 x 20,000 + 4 x 2000 = 1,688,000 (yen)   (50)

Letting the optimum measuring and adjusting interval be n (in meters), and scaling the 2 hr (5000 m) drift and variance growth to the interval n,

   L = (300/5^2) x 20,000 x [(1/3)(3n/5000)^2 + (1/2)(8n/5000)] + (20,000/n) x 2000
     = 0.0288n^2 + 192n + 40,000,000/n   (51)

The optimum n that minimizes Eq. (51) is about 430. Then the loss due to prediction and correction is

   L = 0.0288 x 430^2 + 192 x 430 + 40,000,000/430
     = 181,000 (yen)   (52)

There is an improvement of 1.507 million yen per day. There is additional improvement due to the reduction of the prediction error. For this, see Chapter 9 of Ref. 1.
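A quick numerical minimization of Eq. (51) (our own brute-force search over integer intervals) confirms the figures above:

def daily_loss(n):
    # daily loss of Eq. (51) as a function of the measuring interval n (meters)
    return 0.0288 * n**2 + 192 * n + 40_000_000 / n

n_opt = min(range(1, 5001), key=daily_loss)
print(n_opt, round(daily_loss(n_opt)))        # ~430 m and ~181,000 yen, Eq. (52)
print(round(1_688_000 - daily_loss(n_opt)))   # ~1,507,000 yen per day saved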

REFERENCES

1. Taguchi, G. Taguchi Methods: On-Line Production. ASI, 1993.
2. Taguchi, G. Taguchi on Robust Technology Development. ASME Press, 1993.
Statistical Monitoring and Optimization in Total Quality Management

Kai Kristensen
The Aarhus School of Business, Aarhus, Denmark

1. MEASUREMENT WITHIN TOTAL QUALITY MANAGEMENT

Modern measurement of quality should, of course, be closely related to the definition of quality. The ultimate judge of quality is the customer, which means that a system of quality measurement should focus on the entire process that leads to customer satisfaction in the company, from the supplier to the end user.

Total quality management (TQM) argues that a basic factor in the creation of customer satisfaction is leadership, and it is generally accepted that a basic aspect of leadership is the ability to deal with the future. This has been demonstrated very nicely by, among others, Mr. Jan Leschly, president of Smith Kline, who in a recent speech in Denmark compared his actual way of leading with the ideal as he saw it. His points are demonstrated in Figure 1. It appears that Mr. Leschly argues that today he spends approximately 60% of his time on "firefighting," 25% on control, and 15% on the future. In his own view a much more appropriate way of leading would be to turn the figure upside down, so to speak, and spend 60% of his time on the future, 25% on control, and only 15% on firefighting.

The situation described by Mr. Leschly holds true of many leaders in the Western world. There is a clear tendency for leaders in general to be much more focused on short-term profits than on the process that creates profit. This again may lead to firefighting and to the possible disturbance of processes that may be in statistical control. The result of this may very well

[Figure 1: bar chart comparing the actual and the ideal division of a leader's time among firefighting, control, and the future.]

Figure 1 Leadership today and tomorrow. (Courtesy of Jan Leschly, Smith Kline.)

be an increase in the variability of the company's performance and hence an increase in quality costs. In this way "the short-term leader" who demonstrates leadership by fighting fires all over the company may very well be achieving quite the opposite of what he wants to achieve.

To be more specific, "short-term leadership" may be synonymous with low quality leadership, and in the future it will be necessary to adopt a different leadership style in order to survive, a leadership style that in its nature is long-term and that focuses on the processes that lead to the results rather than the results themselves. This does not, of course, mean that the results are uninteresting per se, but rather that when the results are there you can do nothing about them. They are the results of actions taken a long time ago.

All this is much easier said than done. In the modern business environment leaders may not be able to do anything but act on a short-term basis because they do not have the necessary information to do otherwise. To act on a long-term basis requires that you have an information system that provides early warning and that makes it possible for you to make the

necessary adjustments to the processes and gives you time to make them before they turn into unwanted business results. This is what modern measurement of total quality is all about.

This idea is in very good accordance with the official thoughts in Europe. In a recent working document from the European Commission, DG III, the following is said about quality and quality management (European Commission, 1995):

The use of the new methodologies of total quality management is for the leaders of the European companies a leading means to help them in the current economic scenario, which involves not only dealing with changes, but especially anticipating them.

Thus, to the European Commission, quality is primarily a question of changes and early warning.

To create an interrelated system of quality measurement it has been decided to define the measurement system according to Table 1, where measurements are classified according to two criteria: the interested party (the stakeholder) and whether we are talking about processes or results. Other types of measurement systems are given in Kaplan and Norton (1996).

As Table 1 illustrates, we distinguish between measurements related to the process and measurements related to the results. The reason for this is obvious in the light of what has been said above and in the light of the definition of TQM. Furthermore, we distinguish between three "interested parties": the company itself, the customer, and the society. The first two should obviously be part of a measurement system according to the definition of TQM, and the third has been included because there is no doubt that
Table 1 Measurement of Quality: The Extended Concept

              The company               The customer              The society
-------------------------------------------------------------------------------------
The process   Employee satisfaction     Control points and        Control points and
              (ESI); checkpoints        checkpoints concerning    checkpoints concerning,
              concerning the internal   the internal definition   e.g., environment,
              structure                 of product and service    life cycles, etc.
                                        quality
The result    Business results;         Customer satisfaction     "Ethical accounts";
              financial ratios          (CSI); checkpoints        environmental accounts
                                        describing customer
                                        satisfaction

the focus on companies in relation to their effect on society will be increased in the future, and it is expected that very soon we are going to see a lot of new legislation within this area.

Traditional measurements have focused on the lower left-hand corner of this table, i.e., the business results, and we have built up extremely detailed reporting systems that can provide information about all possible ways of breaking down the business results. However, as mentioned above, this type of information is pointing backwards in time, and at this stage it is too late to do anything about the results. What we need is something that can tell us about what is going to happen with business results in the future. This type of information we find in the rest of the table, and we especially believe, and also have documentation to illustrate, that the top set of entries in the table are related in a closed loop that may be called the improvement circle. This loop is demonstrated in Figure 2.
The improvement is particularly due to an increase in customer loyalty stemming from an increase in customer satisfaction. The relationship between customer satisfaction and customer loyalty has been documented empirically several times. One example is Rank Xerox, Denmark, who in their application for the Danish Quality Award reported that when they analyzed customer satisfaction on a five-point scale where 1 is very dissatisfied and 5 is very satisfied, they observed that on average 93% of those customers who were very satisfied (a score of 5) came back as customers, while only 60% of those who gave a 4 came back.

Figure 2 The improvement circle.

Another example is a large Danish real estate company that in a customer satisfaction survey asked approximately 2500 customers to evaluate the company on 20 different parameters. From this evaluation an average of customer satisfaction (customer satisfaction index) was calculated. The entire evaluation took place on a five-point scale with 5 as the best score, which means that the customer satisfaction index will have values in the interval from 1 to 5. In addition to the questions on parameters, a series of questions concerning loyalty was asked, and from this a loyalty index was computed and related to the customer satisfaction index. This analysis revealed some very interesting results, which are summarized in Figure 3, in which the customer satisfaction index is related to the probability of using the real estate agent once again (probability of being loyal). It appears that there is a very close relationship between customer satisfaction and customer loyalty. The relationship is beautifully described by a logistic model.

Furthermore, it appears from Figure 3 that in this case the loyalty is around 35% when the customer satisfaction index is 3, i.e., neither good nor bad. When the customer satisfaction increases to 4, a dramatic increase in loyalty is observed. In this case the loyalty is more than 90%. Thus the area between 3 and 4 is very important, and it appears that even very small changes in customer satisfaction in this area may lead to large changes in the probability of loyalty.
The observed relationship between business results and customer loy-
alty on the one hand and customer satisfaction on the other is very impor-

Figure 3 Probability of loyalty as a function of customer satisfaction.

tant information for modern management. This information provides an early warning about future business results and thus provides management with an instrument to correct failures before they affect business results.

The next logical step will be to take the analysis one step further back and find internal indicators of quality that are closely related to customer satisfaction. In this case the warning system will be even better. These indicators, which in Table 1 are named control points and checkpoints, will, of course, be company-specific even if some generic measures are defined.
Moving even further back, we come to employee satisfaction and other measures of the process in the company. We expect these to be closely related to the internally defined quality. This is actually one of the basic assumptions of TQM. The more satisfied and more motivated your employees, the higher the quality in the company [see Kristensen (1996)]. An indicator of this has been established in the world's largest service company, the International Service System (ISS), where employee satisfaction and customer satisfaction have been measured on a regular basis for some years now [see Kristensen and Dahlgaard (1997)]. In order to verify the hypothesis of the improvement circle in Figure 2, employee satisfaction and customer satisfaction were measured for 19 different districts in the cleaning division of the company in 1993. The results were measured on a traditional five-point scale, and the employee satisfaction and customer satisfaction indices were both computed as weighted averages of the individual parameters. The results are shown in Figure 4.
These interesting figures show a clear linear relationship between employee satisfaction and customer satisfaction. The higher the employee satisfaction, the higher the customer satisfaction. The equation of the relationship is as follows:

CSI = 0.75 + 0.89 ESI,   R² = 0.85

The coefficients of the equation are highly significant. Thus the standard deviation of the constant term is 0.33, and that of the slope is 0.09. Furthermore, we cannot reject the hypothesis that the slope is equal to 1.
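The last statement is easy to verify from the reported estimates alone: a t statistic for H0: slope = 1 is (0.89 − 1)/0.09 ≈ −1.2, which is far from significant. A minimal Python sketch, assuming the usual t test with 19 − 2 = 17 degrees of freedom:

# Minimal sketch: t test of H0: slope = 1 from the reported estimates.
from scipy import stats

slope, se, n_districts = 0.89, 0.09, 19
t = (slope - 1) / se                           # about -1.22
p = 2 * stats.t.sf(abs(t), df=n_districts - 2)
print(round(t, 2), round(p, 2))                # H0 cannot be rejected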
It appears from this that a unit change in employee satisfaction gives more or less the same change in customer satisfaction. We cannot, from these figures alone, claim that this is a causal relationship, but we believe that combined with other information this is strong evidence for the existence of an improvement circle like the one described in Figure 2. To us, therefore, the creation of a measurement system along the lines given in Table 1 is necessary. Only in this way will management be able to lead

Figure 4 Relationship between ESI and CSI, 19 districts.

the company upstream and thus prevent the disasters that inevitably follow
the firefighting of short-term management.
An example of an actual TQM measurement system is given in Figure
5 for a Danish medical company. It will be seen that the system follows the
methodology given in the Process section of Table 1.

2. MEASURING AND MONITORING EMPLOYEE AND CUSTOMER SATISFACTION

Since optimization and monitoring of the internal quality are dealt with elsewhere in this book, we are going to concentrate on the optimization and monitoring of customers, whether they are internal (employees) or external. First a theoretical, microeconomic model of satisfaction and loyalty is constructed, and then we establish a "control chart" for the managerial control of satisfaction.

2.1. A Model of Satisfaction and Loyalty

Since exactly the same model applies to both customer satisfaction and employee satisfaction, we can without loss of generality base the entire discussion on a model for customer satisfaction.

Figure 5 TQM measurement system for a Danish medical company. Result/company: tablets per hour, number of new products. Result/customer: customer satisfaction. Process/customer: complaints, returned goods, on-time delivery, credit notes. Process/company: turnover, employee satisfaction, absence.

In Kristensen et al. (1992) a model linking customer satisfaction to company profit was established. In this model, customer satisfaction was defined as

CSI = Σ w_i c_i   (1)

where n is the number of quality parameters, w_i is the importance of a given parameter, and c_i is the evaluation. It was assumed that the profit of the company could be described as

Π = φ(Σ w_i c_i) − Σ k_i c_i²   (2)

where φ is an increasing function linking customer satisfaction to company earnings and the second term on the right-hand side is a quadratic cost function with k_i as a cost parameter.

By maximizing (2) with respect to the individual satisfactions (the c_i's) it can be shown that for identical cost parameters, i.e.,

k_i = k,   i = 1, ..., n   (3)

the optimum allocation of resources will occur when

c_i / w_i = c_j / w_j   for all i and j   (4)

i.e., when the degree of fulfilment of customer expectations is identical for all areas. This is based on the fact that the first-order condition for maximization of Eq. (2) is equal to

c_i / w_i = φ' / (2k)   (5)

From this it will be seen that if the right-hand side of Eq. (5) is equal to 1 then a very simple rule for optimum customer satisfaction will emerge:

c_i = w_i   (6)

This result, even if it is based on rather strong assumptions, has become very popular among business people, and a graphical representation known as the quality map, where each parameter is plotted with w on the x axis and c on the y axis, has become a more-or-less standard tool for monitoring customer and employee satisfaction. This is the reason we later on elaborate a little on the quality map.
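Result (6) is easy to illustrate numerically. The sketch below assumes, purely for illustration, a linear earnings function φ(CSI) = 2k·CSI, so that the right-hand side of Eq. (5) equals 1; the importances are invented. The maximizer then returns evaluations equal to the importances.

# Minimal sketch: maximizing Eq. (2) with phi(CSI) = 2k*CSI and k_i = k.
import numpy as np
from scipy.optimize import minimize

w = np.array([6.68, 5.85, 5.99, 5.32])    # illustrative importances
k = 1.0

def neg_profit(c):
    csi = w @ c                           # Eq. (1)
    return -(2 * k * csi - k * np.sum(c ** 2))

res = minimize(neg_profit, x0=np.ones_like(w))
print(np.round(res.x, 2))                 # optimum: c_i = w_i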
But even if this is the case, we intend to take model (2) a step further in order to incorporate customer loyalty. The reason for this is that customer loyalty has gained a lot of interest among quality management researchers recently, because it seems so obvious that loyalty and quality are related, but we still need a sensible model for relating customer loyalty to profit [see Kristensen and Martensen (1996)].

We start by assuming that profit can be described as follows:

Π = likelihood of buying × quantity bought − costs   (7)

where quantity bought is measured in sales prices.

The likelihood of buying is, of course, the loyalty function. We assume that this function can be described as follows:

L = L(w_1 d_1, ..., w_n d_n)   (8)

where

d_i = c_i − c*_i   (9)

where c*_i is the satisfaction of parameter i for the main competitor. Thus the elements of the loyalty function are related to the competitive position of a given parameter combined with the importance of the parameter. We assume that the quantity bought given loyalty is a function of the customer satisfaction index. This means that we will model the income or revenue of the company as

Income = L(w_1 d_1, ..., w_n d_n) × φ(Σ w_i c_i)   (10)

This tells us that you may be very satisfied and still not buy very much, because competition is very tough and hence loyalty is low. On the other hand, when competition is very low, you may be dissatisfied and still buy from the company even though you try to limit your buying as much as possible.

Combining (10) with the original model in (2), we come to the following model for the company profit:

Π = L(w_1 d_1, ..., w_n d_n) φ(Σ w_i c_i) − Σ k_i c_i²   (11)

Hence the optimum allocation of resources will be found by maximizing this function with respect to c_i, which is the only parameter that the company can affect in the short run. Long-run optimization will, of course, be different, but this is not part of the situation we consider here.

The first-order condition for the optimization of Eq. (11) is

∂Π/∂c_i = Lφ'w_i + φL'_i w_i − 2k_i c_i   (12)

where L'_i denotes the derivative of L with respect to its ith argument. By equating this to zero we get the following characterization result:

c_i / w_i = (Lφ' + φL'_i) / (2k_i)   (13)


To make practical use of this result we assume that

k_i = k,   i = 1, ..., n   (14)

which means that we may write the characterization result as

c_i / w_i = α + βL'_i   (15)

To put it differently, we have shown that if company resources have been allocated optimally, then the degree to which you live up to customer expectations should be a linear function of the contribution to loyalty. This seems to be a very logical conclusion that will improve the interpretation of the results of customer satisfaction studies.
Practical use of results (4), (6), and (15) will be easy, because in their present form you only need market information to use them. Once you collect information about c_i, c*_i, w_i, and the customers' buying intentions, the models can be estimated. In the case of a loyalty model you will most likely use a logit specification for L, and then L'_i will be easy to calculate.
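To illustrate the last remark: with a logit specification the derivative is available in closed form, since for L(z) = 1/(1 + e^(−(a+bz))) we have dL/dz = bL(1 − L). The sketch below uses invented coefficients and data.

# Minimal sketch: logit loyalty function and its derivatives (invented values).
import numpy as np

a, b = 0.5, 2.5                           # invented logit coefficients
w  = np.array([0.4, 0.3, 0.3])            # importances
c  = np.array([4.2, 3.8, 4.0])            # own evaluations
cs = np.array([3.9, 4.0, 3.7])            # competitor evaluations c*_i

z = w @ (c - cs)                          # overall competitive position
L = 1 / (1 + np.exp(-(a + b * z)))
dL_dc = b * L * (1 - L) * w               # derivative with respect to each c_i
print(round(float(L), 3), np.round(dL_dc, 4))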

2.2. Statistical Monitoring of the Satisfaction Process

Let

x = (c, w)'   (16)

where c is an n × 1 vector of evaluations and w is an n × 1 vector of importances. Assume that x is multivariate normal with covariance matrix

Σ = [ Σ_c   Σ_cw ]
    [ Σ_wc  Σ_w  ]   (17)

and expectation

μ = (μ_c, μ_w)'   (18)

According to the theoretical development we want to test the hypothesis

H0: μ_c = μ_w   (19)


Assume a sample of N units, and let the estimates of (17) and (18) be

S = [ S_c   S_cw ]
    [ S_wc  S_w  ]   (20)

and

x̄ = (c̄, w̄)'   (21)

Let I be the identity matrix of order n. Then our hypothesis may be written

H0: (I, −I)μ = 0   (22)

From this it is seen that the T² statistic is equal to

T² = N(c̄ − w̄)'[(I, −I)S(I, −I)']⁻¹(c̄ − w̄)   (23)

If the hypothesis is true, then

F = [(N − n) / ((N − 1)n)] T²   (24)

has an F distribution with n and N − n degrees of freedom. The hypothesis is rejected if the computed F statistic exceeds the critical value F_{α; n, N−n}.
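A minimal sketch of this test, with simulated data standing in for a real survey; the (I, −I) transformation reduces the problem to a one-sample Hotelling T² test on the differences d = c − w.

# Minimal sketch: Hotelling T^2 test of H0: mu_c = mu_w, Eqs. (23)-(24).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
N, n = 60, 4                               # N respondents, n parameters
c = rng.normal(5.0, 1.0, size=(N, n))      # simulated evaluations
w = rng.normal(5.2, 1.0, size=(N, n))      # simulated importances

d = c - w                                  # applying (I, -I) to x = (c, w)'
dbar = d.mean(axis=0)
S_d = np.cov(d, rowvar=False)              # sample version of Eq. (25)

T2 = N * dbar @ np.linalg.inv(S_d) @ dbar  # Eq. (23)
F = (N - n) / ((N - 1) * n) * T2           # Eq. (24)
print(round(T2, 2), round(F, 2), round(stats.f.sf(F, n, N - n), 3))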
Let

S_d = S_c + S_w − S_cw − S_wc   (25)

Then simultaneous confidence intervals for the differences between μ_c and μ_w may be written as follows for any vector l' = (l_1, l_2, ..., l_n):

l'(c̄ − w̄) − [((N − 1)n / (N − n)) F_{α; n, N−n}]^{1/2} (l'S_d l / N)^{1/2}
   ≤ l'(μ_c − μ_w)
   ≤ l'(c̄ − w̄) + [((N − 1)n / (N − n)) F_{α; n, N−n}]^{1/2} (l'S_d l / N)^{1/2}   (26)
Now assume that the hypothesis is true, and let

l' = (0, ..., 0, 1, 0, ..., 0)   (27)

Then we may write

|c̄_i − w̄_i| ≤ [((N − 1)n / (N − n)) F_{α; n, N−n} s_{d,ii} / N]^{1/2}   (28)

or

w̄_i − [((N − 1)n / (N − n)) F_{α; n, N−n} s_{d,ii} / N]^{1/2} ≤ c̄_i ≤ w̄_i + [((N − 1)n / (N − n)) F_{α; n, N−n} s_{d,ii} / N]^{1/2}   (29)

where s_{d,ii} is the ith diagonal element of S_d.

To simplify, let us assume that all differences have the same theoretical variance. Then we may substitute the average s̄_d² for s_{d,ii}, which means that the interval for monitoring satisfaction will be constant. In that case we may set up the "control" chart shown in Figure 6 for monitoring satisfaction, where the limits are given by

w̄_i ± [((N − 1)n / (N − n)) F_{α; n, N−n} s̄_d² / N]^{1/2}   (30)

Figure 6 Quality map.

If a parameter falls between the dotted lines, we cannot reject the hypothesis that we have an optimal allocation of resources. If, on the other hand, a parameter falls outside the limits, the process needs adjustment.

We should remember that the limits are simultaneous. If we want individual control limits, which, of course, will be much narrower, we may substitute t_{α, N−1} for

[((N − 1)n / (N − n)) F_{α; n, N−n}]^{1/2}

2.3. An Example

An actual data set from a Danish company is presented in Table 2. Seven parameters were measured on a seven-point rating scale.

Now we are ready to set up the control chart for customer satisfaction. We use formula (30) to get the limits:

±0.18 × (7.74 × 2.18)^{1/2} = ±0.74

From the control chart (Figure 7) we can see that most of the parameters are in control but one parameter needs attention. The importance of the environmental parameter is significantly greater than the evaluation of
Table 2 Data Set (Customer Satisfaction for a Printer)

Parameter                  Importance w_i   Satisfaction c_i   Sample size   Variance of difference
Operation                  6.68             6.06               64            1.66
User friendliness          5.85             5.67                             1.82
Print quality              5.99             5.48                             1.80
Service                    5.32             5.38                             2.56
Speed                      3.91             4.94                             2.62
Price                      4.64             5.02                             2.69
Environmentally friendly   5.17             4.18                             2.16

Average                                                                      2.18


Figure 7 Control chart for customer satisfaction.

company performance. Hence the quality of this parameter must be improved.
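The conclusion can also be read directly off Table 2: with the limit of ±0.74 around the diagonal c = w, the environmental parameter is the only one whose evaluation falls significantly short of its importance (4.18 − 5.17 = −0.99). A minimal sketch:

# Minimal sketch: flag parameters where importance significantly exceeds
# evaluation, using the simultaneous limit of 0.74 from Eq. (30).
params = {
    "Operation": (6.68, 6.06), "User friendliness": (5.85, 5.67),
    "Print quality": (5.99, 5.48), "Service": (5.32, 5.38),
    "Speed": (3.91, 4.94), "Price": (4.64, 5.02),
    "Environmentally friendly": (5.17, 4.18),
}
limit = 0.74
for name, (w_i, c_i) in params.items():
    if c_i - w_i < -limit:                 # quality must be improved here
        print(name, round(c_i - w_i, 2))   # only the environmental parameter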

3. CONCLUSION

The use of the concept of total quality management expands the need for measurement in the company. The measurement of quality will no longer be limited to the production process. Now we need to monitor "processes" such as customer satisfaction and employee satisfaction. In this chapter I have given a managerial model for the control of these processes, and we have considered a practical "control" chart that will help management choose the right parameters for improvement.

REFERENCES

European Commission, DG III. A European Quality Promotion Policy. Bruxelles, Feb. 17, 1995.
Kaplan RS, Norton DP. The Balanced Scorecard. Harvard Business School Press, Boston, MA, 1996.
Kristensen K. Relating employee performance to customer satisfaction. World Class Europe: Striving for Excellence. EFQM, Edinburgh, 1996, pp. 53-57.
Kristensen K, Dahlgaard JJ. ISS International Service System A/S, Denmark. The European Way to Excellence. Case Study Series. Directorate General III, European Commission, Bruxelles, 1997.
Kristensen K, Martensen A. Linking customer satisfaction to loyalty and performance. ESOMAR Pub. Ser. 204: 159-169, 1996.
Kristensen K, Dahlgaard JJ, Kanji GK. On measurement of customer satisfaction. Total Quality Management 3(2): 123-128, 1992.
3
Quality Improvement Methods and
Statistical Reasoning*
G.K. Kanji
Sheffield Hallam University, Sheffield, England

1. PRINCIPLES OF TOTAL QUALITY MANAGEMENT

Total quality management (TQM) is about continuous performance improvement of individuals, groups, and organizations. What differentiates total quality management from other management processes is the emphasis on continuous improvement. Total quality is not a quick fix; it is about changing the way things are done, forever.

Seen in this way, total quality management is about continuous performance improvement. To improve performance, people need to know what to do and how to do it, have the right tools to do it, be able to measure performance, and receive feedback on current levels of achievement.
Total quality management (Kanji and Asher, 1993) provides this by adhering to a set of general governing principles. They are:

1. Delight the customer
2. Management by fact
3. People-based management
4. Continuous improvement

*For an extended version of this paper, see Kanji GK. Total Quality Management 5: 105, 1994.


Each of these principles can be used to drive the improvement process. To achieve this, each principle is translated into practice by using two core concepts, which show how to make the principle happen. These concepts are:
Customer satisfaction
Internal customers are real
All work is a process
Measurement
Teamwork
People make quality
Continuous improvement cycle
Prevention
Further details of the four principles with the core concepts follow.
The pyramid principles of TQM are shown in Figure 1.

1.1. Delight the Customer

The first principle focuses on the external customers and asks "what would delight them?" This implies understanding needs, both of product and service, tangible and intangible, and agreeing with requirements and meeting them. Delighting the customer means being best at what matters most to customers, and this changes over time. Being in touch with these changes and delighting the customer now and in the future form an integral part of total quality management.

The core concepts of total quality that relate to the principle of delighting the customer are "customer satisfaction" and "internal customers are real."

1.2. Management by Fact

Knowing the current performance levels of our products or services in our customers' hands and of all our employees is the first stage in being able to improve. If we know where we are starting from, we can measure our improvement.

Having the facts necessary to manage the business at all levels is the second principle of total quality. Giving that information to people so that decisions are based upon fact rather than "gut feel" is essential for continuous improvement.

Figure 1 The pyramid principles of TQM. (From Kanji and Asher, 1993.)

The core concepts that relate to management by fact are "all work is a process" and "measurement."

1.3. People-Based Management

Knowing what to do and how to do it and getting feedback on performance form one part of encouraging people to take responsibility for the quality of their own work. Involvement and commitment to customer satisfaction are ways to generate this. The third principle of total quality management recognizes that systems, standards, and technology in themselves do not mean quality. The role of people is vital.

The core concepts that relate to people-based management are "teamwork" and "people make quality."

1.4. Continuous Improvement

Total quality cannot be a quick fix or a short-term goal that will be reached when a target has been met. Total quality is not a program or a project. It is a management process that recognizes that however much we may improve, our competitors will continue to improve and our customers will expect more from us. The link between customer and supplier with process improvement can be seen in Kanji (1990).

Here, continuous improvement, incremental change rather than major breakthroughs, must be the aim of all who wish to move toward total quality.

The core concepts that relate to the company's continuous improvement are "the continuous improvement cycle" and "prevention."

Each concept is now discussed, together with an example of how that concept was used by a company to bring about improvement.

2. CORE CONCEPTS OF TQM

2.1. Internal Customers Are Real
The definition of quality [see Kanji (1990)], "satisfying agreed customer requirements," relates equally to internal and external customers. Many writers refer to the customer-supplier chain and the need to get the internal relationships working in order to satisfy the external customer.

Whether you are supplying information, products, or a service, the people you supply internally depend on their internal suppliers for quality work. Their requirements are as real as those of external customers; they may be speed, accuracy, or measurement.

Internal customers constitute one of the "big ideas" of total quality management. Making the most of this idea can be very time-consuming, and many structured approaches take a long time and can be complicated. However, one successful approach is to take the "cost of quality" and obtain information about the organization's performance and analyze it. Dahlgaard et al. (1993) used statistical methods to discuss the relationship between the total quality cost and the number of employees in an organization.

2.2. All Work Is a Process

The previous section looked at internal customers and how to use the idea that they are real as a focus for improvement.

Another possible focus is that of business processes. By "process" we mean any relationship such as billing customers or issuing credit notes, anything that has an input, steps to follow, and an output. A process is a combination of methods, materials, manpower, machinery, etc., which taken together produce a product or service.

All processes contain inherent variability, and one approach to quality improvement is progressively to reduce variation, first by removing variation due to special causes and second by driving down common cause variation, thus bringing the process into control and then improving its capability.

Various statistical methods, e.g., histograms, Pareto analysis, control charts, and scatter diagrams, are widely used by quality managers and others for process improvement.
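As a simple illustration of the control chart idea, the sketch below (simulated data, conventional 3-sigma limits around the mean) flags observations signalling special-cause variation; it is an illustration only and not taken from the chapter.

# Minimal sketch: individuals chart with 3-sigma limits, simulated data.
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(10.0, 1.0, 50)              # a stable process...
x[30] += 6.0                               # ...with one special cause injected

center, sigma = x.mean(), x.std(ddof=1)    # rough estimates for the sketch
ucl, lcl = center + 3 * sigma, center - 3 * sigma

for i, v in enumerate(x):
    if v > ucl or v < lcl:
        print(f"point {i} out of control: {v:.2f}")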

2.3. Measurement

The third core concept of total quality management is measurement. Having a measure of how we are doing is the first stage in being able to improve. Measures can focus internally, i.e., on internal customer satisfaction (Kristensen et al., 1993), or externally, i.e., on meeting external customer requirements.

Examples of internal quality measurements are

Production
Breach of promise
Reject level
Accidents
Process in control
Yield/scrap (and plus value)

Kristensen et al. (1993), when discussing a measurement of customer satisfaction, used the usual guidelines for questionnaire design and surveys and statistical analysis to obtain the customer satisfaction index.

2.4. Prevention

The core concept of prevention is central to total quality management and is one way to move toward continuous improvement.

Prevention means not letting problems happen. The continual process of driving possible failure out of the system can, over time, breed a culture of continuous improvement.

There are two distinct ways to approach this. The first is to concentrate on the design of the product itself (whether a hard product or a

service); the second is to work on the production process. However, the most important aspect of prevention is quality by design using statistical reasoning.

There are several frequently used tools, and failure mode and effect analysis (FMEA) is one of the better known ones. It is associated with both design (design FMEA) and process (process FMEA).

Other frequently used methods are failure prevention analysis, which was pioneered by Kepner-Tregoe, and foolproofing (or poka-yoke). The advantage of all of these methods is that they provide a structure or thought process for carrying the work through.

2.5. Customer Satisfaction

Many companies, when they begin quality improvement processes, become very introspective and concentrate on their own internal problems almost at the expense of their external customers.

Other companies, particularly in the service sector, have deliberately gone out to their customers, first to survey what is important to the customer and then to measure their own performance against customer targets (Kristensen et al., 1993). The idea of asking one's customers to set customer satisfaction goals is a clear sign of an outward-looking company.

One example is Federal Express, who surveyed their customer base to identify the top 10 causes of aggravation. The points were weighted according to customer views of how important they were. A complete check was made of all occurrences, and a weekly satisfaction index was compiled. This allowed the company to keep a weekly monitor of customer satisfaction as measured by the customer. An understanding of survey and statistical methods is therefore needed for the measurement of customer satisfaction.

2.6. Teamwork

Teamwork can provide an opportunity for people to work together in their pursuit of total quality in ways in which they have not worked together before.

People who work on their own or in small, discrete work groups often have a picture of their organization and the work that it does that is very compartmentalized. They are often unaware of the work that is done even by people who work very close to them. Under these circumstances they are usually unaware of the consequences of poor quality in the work they themselves do.

By bringing people together in teams with a common goal, quality improvement becomes easier to communicate over departmental or func-

tional walls. In this way the slow breaking down of barriers acts as a platform for change.

We defined culture as "the way we do things here," and cultural change as "changing the way we do things here." This change implies significant personal change in the way people react and in their attitudes. A benchmarking approach can also help to change the way they do things.

Teamwork can be improved by benchmarking, a method that is similar to the statistical understanding of outliers.

2.7. People Make Quality

Deming has stated that the majority of quality-related problems within an organization are not within the control of the individual employee. As many as 80% of these problems are caused by the way the company is organized and managed.

Examples where the system gets in the way of people trying to do a good job are easy to find, and in all cases simply telling employees to do better will not solve the problem.

It is important that the organization develop its quality management system, and it should customize the system to suit its own requirements. Each element will likely encompass several programs. As a matter of fact, this is where the role of statistics is most evident.

2.8. The Continuous Improvement Cycle

The continuous cycle of establishing customer requirements, meeting those requirements, measuring success, and continuing to improve can be used both externally and internally to fuel the engine of continuous improvement.

By continually checking with customer requirements, a company can keep finding areas in which improvements can be made. This continual supply of opportunity can be used to keep quality improvement plans up-to-date and to reinforce the idea that the total quality journey is never-ending.

In order to practice a continuous improvement cycle it is necessary to obtain continuous information about customer requirements, i.e., to do market research. However, we know that market research requires a deep statistical understanding for the proper analysis of the market situation.

3. STATISTICAL UNDERSTANDING

The role of statistical concepts in the development of total quality management is nothing new. For example, Florence Nightingale, the 19th century statistician and famous nurse, was known as the mother of continuous health care quality improvement. In 1854 she demonstrated that a statistical approach by graphical methods could be persuasive in reducing the cost of poor quality care by 90% within a short period of time. Later, in 1930, Walter Shewhart, another prominent statistician, also suggested that the same kind of result could be achieved by using statistical quality control methods.

The fundamental aspect of statistical understanding is the variation that exists in every process, and the decisions are made on that basis. If the variation in a process is not known, then the required output of that process will be difficult to manage.

It is also very important to understand that every process has an inherent capability and that the process will be doing well if it operates within that capability. However, sometimes one can observe that resources are being wasted in solving a problem, simply because no one realizes that the process is working at its maximum capability.

In order to understand variability and the control of variation, it is necessary to understand basic statistical concepts. These concepts are simple to understand and learn and provide powerful management tools for higher productivity and excellent service.

In this complex business world, managers normally operate in an uncertain environment, and therefore their major emphasis is on the immediate problems. In their everyday life they deal with problems where the application of statistics occurs in pursuit of organizational objectives.

However, as we know, the business world is changing, and managers along with other workers are adopting this change and also learning how to manage it. For many people, the best way of adopting this change is to focus on statistical understanding because it permeates all aspects of total quality management.

We have already learned that "all work is a process," and therefore identification and reduction of the variation of processes provides opportunity for improvement. Here, the improvement process, which recognizes that variation is everywhere, gets help from the statistical world for this quality journey.

In general, managers can take many actions to reduce variation to improve quality. Snee (1990) pointed out that managers can reduce variation by maintaining the constant purpose of their employees to pursue a common quality goal.

4. CONCLUSIONS

In recent years, particularly in Japan and the United States, there has been a strong movement for greater emphasis on total quality management, in which statistical understanding has been seen to be a major contributor to management development.

It is clear that statistical understanding plays a major role in product and service quality, care of customers through statistical process control, customer surveys, process capability, cost of quality, etc. The value of statistical design of experiments, which distinguishes between special cause and common cause variation, is also well established in the area of quality improvement.

If we also accept that "all work is a process," that all processes are variable, and that there is a relationship between management action and quality, then statistical understanding is an essential aspect of the quality improvement process.

Further, in the areas of leadership, quality culture, teamwork, etc., development can be seen in various ways by the use of statistical understanding.

In conclusion, I believe that total quality management and statistical understanding go hand in hand. People embarking on the quality journey must therefore venture onto the road of total statistical understanding and follow the lead of total quality statisticians.

REFERENCES

Dahlgaard JJ, Kristensen K, Kanji GK. Quality cost and total quality management. Total Quality Management 3(3): 211-222, 1993.
Kanji GK. Total quality management: The second industrial revolution. Total Quality Management 1(1): 3-12, 1990.
Kanji GK, Asher M. Total Quality Management: A Systemic Approach. Carfax Publishing Company, Oxfordshire, U.K., 1993.
Kristensen K, Kanji GK, Dahlgaard JJ. On measurement of customer satisfaction. Total Quality Management 3(2): 123-128, 1993.
Snee RD. Statistical thinking and its contribution to total quality. Am Stat 44(2): 116-121, 1990.
4
Leadership Profiles and the
Implementation of Total Quality
Management for Business Excellence

Jens J. Dahlgaard, Su Mi Park Dahlgaard, and Anders Nørgaard
The Aarhus School of Business, Aarhus, Denmark

1. INTRODUCTION

Total quality management (TQM) is defined by Kanji and Asher (1993) as

A company culture which is characterized by everybody's participation in continuous improvements of customer satisfaction.

To build the TQM culture it is important that every staff member, top managers, middle managers, and other employees alike, understand and apply the five basic principles of TQM. These can be visualized in terms of the TQM pyramid (Dahlgaard and Kristensen, 1992, 1994) presented in Figure 1.

As can be seen from Figure 1, the foundation of the TQM pyramid is leadership. All staff members need leaders who can explain the importance of TQM principles and who can show how those principles can be continuously practiced so that the organization gradually achieves business excellence.

Each staff member and each group must continuously focus on the customer (external as well as internal customers). They must continuously try to understand the customers' needs, expectations, and experiences so that they can delight the customer. To be able to delight the customer, continuous improvement is necessary. World class companies are continuously trying to improve existing products or develop new ones. They are


Figure 1 The TQM pyramid: the five principles of TQM.

continuously trying to work smarter, not harder, by improving their processes, and they understand that the most important asset to improve is their people. To support everybody in continuous improvements, measurements are of vital importance. To improve products we need feedback from customers (measurements of customer satisfaction and other customer facts). To improve processes we need feedback from the various processes (process measurements of defects, wastage, quality costs, etc.). To improve people we need feedback from employees (measurements of employee satisfaction and other facts related to improvement of people). Statistical methods can be used in many of these measurements. The application of statistical methods is often the best way to ensure high reliability of the measurements, and for complex measurements such as measurements of people's mind-sets it may be the only way to generate reliable facts (see Section 2).

Continuously applying the five principles of TQM will gradually result in business excellence. But what is business excellence? Business excellence has many definitions. One example is (Raisbeck, 1997)

The overall way of working that results in balanced stakeholder (customers, employees, society, shareholders) satisfaction so increasing the probability of long term success as a Business.

In 1992 the European Foundation for Quality Management (EFQM) launched the European Quality Award and a model to be used for assessment of the applicants for the award. The model, which is seen in Figure 2, has gradually been accepted as an efficient self-assessment tool that companies can use to improve the strategic planning process in order to achieve business excellence. Since 1996 the model has been called the European model for TQM and business excellence.

It is not the aim of this chapter to explain the detailed logic behind the model in Figure 2; the model closely resembles the Malcolm Baldrige Quality Award model that was launched in 1988. The model signals very clearly to its user that if you want good business results you have to understand their relationships to other results (people satisfaction, customer satisfaction, impact on society) and, of course, to the enablers. The model gives a good overview of how (enablers) you may get desired results (= what). How to use the model in a strategic planning process, monitored by Shewhart's and Deming's PDCA cycle, is explained in Section 3.

Comparing this model with the TQM pyramid of Figure 1, we recognize that both models have leadership as an important element. There are good reasons for that. Good leadership and strong management commitment have long been recognized as the most essential preconditions for any organization aspiring to be world class. As a result, much effort has been devoted to the pursuit of a "business excellence" approach to leading and managing an organization in order to achieve world class performance.

Combining the principles of the TQM pyramid with the principles (values) behind the European model for TQM and business excellence, we

Figure 2 The European model for TQM and business excellence. Enablers (50%): Leadership (100 points), People Management (90 points), Policy and Strategy (80 points), Resources (90 points), Processes (140 points). Results (50%): People Satisfaction (90 points), Customer Satisfaction (200 points), Impact on Society (60 points), Business Results (150 points).



propose in this chapter that the fundamental principles of business excellence be taken to be the basic principles of total quality management supplemented by the principles of the learning organization and the creative organization.* The results are the following six principles for business excellence:

1. A focus on customers and their needs
2. Continuous improvement
3. The empowerment and participation of all staff members
4. A focus on facts
5. A commitment to creativity
6. A focus on continuous learning
A lot has been written about leadership and management's responsibilities for the implementation of these principles and the related concepts, but there has not been much concern about identifying the different leadership profiles in today's business world and their relations to the above principles and the success criteria for business excellence. If a manager's leadership profile does not correlate positively with the six principles listed above, then the manager may be a barrier to the implementation of TQM. In this case you obviously have only three options:

1. Fire the manager.
2. Forget TQM.
3. Educate the manager.

It is our belief that education of the manager is a feasible solution in most cases. For that purpose we have developed an integrated approach for management development that is based on quality function deployment (QFD; see Section 2). By applying the QFD technique to this area it is possible to gain information about the effect of different leadership profiles on the success criteria for business excellence. Without a profound understanding of this relationship, we cannot achieve business excellence.
The aims of this chapter are

1. To show an example of how statistical methods can be used to control and develop the softer parts of total quality management, the leadership styles (Section 2).
*Success criteria taken from the EQA business excellence model have been supplemented with success criteria from the creative and learning organizations because, although creativity and learning are implicitly included in total quality management, theory on total quality management has to a certain degree neglected these two important disciplines. The aspect that unites all of the chosen success criteria is that they all demand a strong commitment from the senior management of an organization.

2. To provide an overview of the role and application of statistical methods in monitoring the implementation of total quality management to achieve business excellence (Section 3).

2. THE EUROPEAN EMPIRICAL STUDY ON LEADERSHIP STYLES

To achieve our first aim, an empirical study was carried out that involved more than 200 leaders and managers of European companies and some 1200 of their employees. The format of the study was as follows.
1. Four hundred chief executive officers from France, Germany, Holland, Belgium, the United Kingdom, and Denmark were randomly selected from various European databases. The selection criteria were that they had to be from private companies (100% state-owned companies were excluded) with more than 50 employees.
2. The selected leaders were asked to complete an 86-point questionnaire* composed of two sections:
   a. 49 questions asking leaders to rate the importance of a number of aspects of modern business management†
   b. 37 questions asking leaders to rate the importance of a number of statements or success criteria on business excellence
3. By analyzing the material supplied by the leaders in response to the first 49 questions, it was possible to plot the "leadership profile" of each individual respondent. These leadership profiles are expressed in eight different leadership "styles".
4. The success criteria, which form the focus of the second section (37 questions), indicate the key leadership qualities required to achieve business excellence. The higher the leaders scored on these questions, the more they could be said to possess these qualities.

*The complete Leadership Profile questionnaire in fact consisted of 106 questions. The additional 20 questions covered cultural issues that do not form part of this chapter. The questions were developed by Geert Hofstede in 1994.
†The aspects of management were identified by a Danish focus group participating in a pilot version of this survey in 1995, developed by Anders Nørgaard and Heme Zahll Larsen. The focus group consisted of nine directors representing various areas of business, who were asked to identify the key attributes of a good business leader. The attributes so identified were classified on the basis of an affinity analysis, and as a result 49 variables were established. These variables could then be used to plot any individual leadership profile.

5. For each leader, 10 employees were also selected to participate in the survey. These employees were asked to rate the importance of the 49 management aspects, in order to give a picture of what the employees considered desirable for ideal leaders.

2.1. Description of the Leadership Model

The leadership model that was developed as the basis for this analysis is designed to shed light on the relationship between the business leadership styles of today's leaders and the requirements to achieve business excellence. By plotting the leadership profile of any individual leader, the model provides a tool to assess the extent to which he or she is working toward the successful achievement of business excellence.

Success Criteria

As described in Section 1, the success criteria for business excellence used in this research comprise three main elements: total quality management, creativity, and learning. However, since the interaction between an organization's leadership and its employees has a major impact on whether these criteria are achieved or not, this interaction becomes, in effect, a fourth success criterion.

As Figure 3 shows, the achievement of these success factors is affected by the leadership profiles of those in charge of the organization. Although

Figure 3 The leadership model.

not included within the scope of this chapter, it is reasonable to assume that these leadership profiles are in turn influenced by a number of "basic variables" such as the leader's age, education, and experience and the size of the company or the sector in which it operates.

The First Success Criterion: Total Quality Management. Total quality management is regarded as the main criterion for business excellence. Focusing on achieving continuous improvements in an effort to enhance the company's strengths and eliminate its weaknesses, TQM covers all areas of the business, including its policies and strategies, its management of people, and its work processes. The core values of the total quality-oriented organization are a focus on the customer, the empowerment of its people, a focus on fact-based management, and a commitment to continuous improvement.

Since the European Quality Award (EQA) model is the most authoritative and most widely used method of assessing TQM in Europe, core aspects of this model have been used to determine the performance of the surveyed leaders with regard to the first success criterion. The higher the score the leaders achieved in this part of the questionnaire, the more positively they can be said to be working with total quality management.

The Second Success Criterion: Creativity. To achieve business excellence, organizations must also focus strongly on developing creativity. Urban* (1995, p. 56) has stated, "If all companies are high-quality and low-cost, creativity will be the differentiating factor."

Creativity is an important criterion for business excellence because it is a vital stimulus for improvement and innovation. It is a prerequisite for business excellence that an organization and its leaders be both committed to, and capable of, putting in place an organizational structure that fosters a creative environment. At the same time, they must be able to control and make use of that creativity. Since creative ideas do not just surface spontaneously, it is essential to implement a creative planning process. The creative organization aims to establish an effective basis for innovation and continuous improvement by adopting a systematic approach to the various aspects of creativity, such as the evaluation of ideas and procedures for communication.

For the purposes of this study, European leaders' performance with regard to this success criterion, the extent to which they are proactively working to generate and retain creativity, was assessed according to the theory of managing ideas set out by Simon Majaro.

*Glen L. Urban, Dean of the Sloan School of Management, Massachusetts Institute of Technology.

The Third Success Criterion: Learning. To quote Peter Senge (Senge, 1990, p. 4): "The organizations that excel will be those that discover how to tap their people's commitment and capacity to learn at all levels in an organisation."

The successful organization of the future will be a learning organization, one that has the ability to take on new ideas and adapt faster than its competitors. The model of the learning organization used for this study follows the five learning disciplines set out by Senge. These disciplines have therefore served as the basis for evaluating the European leaders' performance with respect to this third success criterion.

The Fourth Success Criterion: Leader-Employee Interaction. The three success criteria above all depend critically on the interaction between the leaders and their employees. For successful work with total quality management, learning, and creativity, it is important for leaders to get their subordinates "on board" and to harness their energies in the pursuit of these success criteria. A comparison of the views of the employees (through the profile they provided of their "ideal leader") with the actual performance of the leaders themselves was therefore used as a measurement of this interaction.

2.2. Leadership Styles

As described earlier, the answers the leaders provided to the questionnaire formed the basis of an assessment of them in terms of eight different leadership "styles." The eight leadership styles were identified by a factor analysis: the 49 questions regarding leadership capabilities were reduced to eight latent factors, as sketched below. It is essential to bear in mind that a leader is not defined simply as belonging to one or another of these styles but in terms of an overall profile that contains varying degrees of all eight. In other words, it is the relative predominance of some styles over others that determines the overall leadership profile of any given individual. The eight leadership styles are described in the following paragraphs.

The Captain

Key attributes: Commands respect and trust; leads from the front; is professionally competent, communicative, reliable, and fair.

The Captain is in many ways a "natural" leader. He commands the respect and trust of his employees and leads from the front. He has a confidence based on his own professional competence, and when a decision is made it is always carried out. He has an open relationship with his employ-

ees. He treats them all equally, is usually prepared to listen to their opinions, and usually ensures that information they require is communicated to them.

The Creative Leader

Key attributes: Is innovative, visionary, courageous, inspiring; has a strong sense of ego.

The Creative leader is full of ideas and is an active problem solver and a tireless seeker after continuous improvement. He has a clear image of the direction the company should pursue in the future. He is courageous and is willing to initiate new projects despite the risk of failure. He is a source of inspiration to his employees. He has a tendency to act on inspiration rather than on rational analysis and is driven by a strong sense of ego.

The Involved Leader

Key attributes: Shows empathy, practices a "hands-on" approach, does not delegate, focuses on procedures.

The Involved leader possesses good people skills, is well attuned to the mood of his staff, and takes time to listen to their problems and ideas. His close involvement with his employees gives him a good overview of the tasks they are working on. This level of involvement, however, makes it hard for him to delegate tasks rather than participate personally. He is focused on procedures and routines in teamwork and is consequently less well suited to take an overall leadership role.

The Task Leader

Key attributes: Is analytical, "bottom line"-driven, result-oriented, impersonal, persevering, intolerant of mistakes.

The Task leader believes success is measured by bottom-line financial results. Day-to-day business in the organization is carried out on the basis of impersonal, rational analysis. The Task leader is result-oriented and tends to be extremely persevering and determined once a course of action has been decided. The reliance on a rational attitude toward work and procedures means that this leader has difficulty accepting mistakes made by employees, with employee morale and performance consequently tending to suffer when they fail to meet the leader's expectations. The Task leader lacks personal skills when it comes to dealing with the problems or opinions of employees.

The Strategic Leader

Key attributes: Focuses on strategic goals, takes a holistic view of the organization, is a good planner, avoids day-to-day details, is process-oriented, trustworthy.

The Strategic leader has an overall view of the organization, focusing on longer term goals rather than day-to-day issues. This leader is process-oriented, believing that consistent work processes are essential for positive results. He is very efficient, setting clear objectives for what needs to be achieved. His comprehensive overview of the organization and his personal efficiency make him a highly trustworthy leader of his employees.

The Impulsive Leader

Key attributes: Obsessed with new ideas; unfocused, curious, energetic, participative.

The Impulsive leader's most salient characteristic is an obsession with new ideas combined with an unfocused energy. He is constantly "on fire" and lets nothing get in the way of his enthusiasm. As a result, he tends to take an interest in a wide range of issues and opportunities without necessarily having the capability to pursue the possibilities this process generates. In his fanaticism to push through his latest ideas, he tends to appear autocratic and domineering to his employees.

The Specialist Leader

Key attributes: Is expert, solitary; lacks inspirational ability; is resistant to change, calm.

The Specialist leader is an expert in his field who prefers to work alone. His leadership is expressed through the quality of his expertise rather than through any "people" skills. He is not good at teamwork, lacking the ability to inspire others and having a tendency to be pedantic and uncompromising. He appears calm, assured, and in control.

The Team Builder Leader

Key attributes: Is tolerant; gives feedback; acts as a coach; motivates, inspires, is supportive.

The Team Builder leader perceives himself primarily as a coach aiming to maximize the advantages of teamwork. He gives constructive feedback concerning his employees' work and behavior. He is also very tolerant and understands the need to support and inspire employees in critical situations.

2.3. The Relationship Between Success Criteria and Leadership Styles

The three success criteria and the eight leadership styles are estimated in this study to determine the precise demands that European leaders face when they seek business excellence. By estimating the relationships among the three success criteria and the eight leadership styles it is possible to isolate the leadership styles with the greatest impact on the success criteria.

With the data of 202 European leaders we have been able to empirically prove that the Team Builder, the Captain, the Strategic, the Creative, and the Impulsive leadership styles all have a positive impact on one, two, or all three success criteria. The leadership styles are ranked according to their degree of influence on the success criteria. The more success criteria the leadership styles influence, the more important they are to achieving business excellence; i.e., the Team Builder is the most important one (impacts on three success criteria; see Fig. 4), whereas the Impulsive leader is the least important (impacts on one success criterion, Quality). The remaining leadership styles, the Involved, the Task, and the Specialist leaders, have no influence on achieving business excellence.
However, it is not enough to have knowledge of the correlation between the success criteria and the leadership styles. European leaders

Figure 4 Thecorrelationbetween success criteriaandleadership styles.The


numbers indicate the strength of the relationships.
must also take into consideration the Ideal Leadership profile outlined by
the employees. By using quality function deployment (see Section 2.4) it is
possible for managers to work with the demands of the employees.

2.4. Model for Measuring Excellent Leadership
An Excellent Leadership model should integrate the demands that the suc-
cessful leader must consider when trying to achieve business excellence. The
model should clarify what the leader should do to improve his performance
as a leader in relation to the success criteria for achieving business excel-
lence.
A product improvement technique called quality function deployment (QFD) is used as a tool for measurement of Excellent Leadership. The essence of this technique consists of combining a set of subjective variables, normally set out by the demands of customers, with a set of objective variables provided by the manufacturers' product developers. As a result of this interactive process a number of focus areas for developing high quality products become apparent, enabling manufacturers to adapt their products more precisely to customer demands.
Treating the leaders as "products" and the employees as "customers," QFD is used as a technique for determining Excellent Leadership. This is the reason for making the parallel between leaders and products. In QFD, the voice of the customer is used to develop the product. A leader has many "customers," such as employees and stakeholders. In this project, the employees are selected as our link to the customer part in QFD. This means that the voice of the employees will serve as an important guideline for leaders today in developing the right leadership qualities.
The information required for the QFD construction consists of

Employee demands of an ideal leader. The employees' Ideal Leader profile represents the customers' demands of the "product" in QFD.
The relationships between the success criteria for achieving business excellence.
The relationships between the success criteria and the different leadership styles.
The individual leader's score on the success criteria and leadership styles.
Information about the "best in class" leaders within the areas of performance (quality, learning, and creativity).
The QFD technique provides the possibility to work with the following aspects:

Figure 5 The Excellent Leadership model.

1. Assessment of the leader's performance on the success criteria.
2. Benchmarking: a comparison with the "best in class" leaders, those that have the highest score on the success criteria.
3. Estimation of an Excellent Leadership profile (ELP). The ELP is used to evaluate whether or not any leader matches the requirements for achieving business excellence.
The integrated QFD model is described below and is referred to hereafter as the Excellent Leadership model (Fig. 5). The description provides an explanation of the model but does not explain its full potential. Only the relevant parts of the model's matrix are explained in order to clarify how QFD can be used in this specific managerial perspective.
The QFD technique consists of a number of different matrices (collections of large numbers of quantifiable data), which makes the technique systematic and rational. Using each matrix as a foundation for analyzing the empirical data on European leaders makes it possible to work with the data in an easy and understandable way. Each of the matrices in Figure 5 is discussed in the following subsections.

Attributes-Leadership Styles
The attributes matrix (far left in Fig. 5) includes the different attributes of leadership. Eight leadership styles have been identified in relation to this study. As explained earlier, the eight leadership styles were created on the basis of ratings of the importance of 49 aspects of modern business management. However, in keeping a general view, it is evident that the focus is on the eight latent leadership factors. Furthermore, the development of future leadership is based on different leadership styles, so it is not essential to have a high degree of detail until a later stage.

Weights-A Rating of the Eight Leadership Styles


The 1150 employees who participated in the survey also evaluated the importance of the 49 aspects of modern business management under consideration to their concept of an ideal leader. This employees' Ideal Leader profile provides a rating or a weight of importance for each of the eight leadership styles. With this information the leader can identify possible areas of improvement in meeting employee demands for an ideal leader.

Correlation Matrix

The correlation matrix is the heart of the Excellent Leadership model. In this part of the model the correlation between the individual leader's profile and the employees' Ideal Leader profile is estimated. Correlating the three success criteria with the eight leadership styles yields a picture illustrating the effects that each of the individual leadership styles has on the success criteria for achieving business excellence.

Substitution
The roof of the QFD house (Fig. 5) consists of a correlation matrix that illustrates the correlation between the three success criteria. This part of the model is relevant in determining potential substitution opportunities between the criteria. Only three criteria are included in this project, which gives only limited information on substitution. Using the 37 elements of the success criteria might make it possible to come up with a more differentiated view of substitution between the elements.

Assessment
The leader's performance is measured on the basis of the three success criteria. This assessment is carried out by means of a self-evaluation, during which the leader answers 37 questions. The answers to these questions indicate the leader's and/or organization's level of activity on the success criteria (quality, learning, and creativity) for achieving business excellence, illustrated by an individual score. This assessment provides the leaders with a score of their current performance and identifies critical areas in which further allocation of resources is required for the development of business excellence. It is important to have knowledge of one's current level if one is to set relevant objectives for the future. The three success criteria should not be evaluated individually; a global approach is required, as they are strongly correlated.

Benchmarking
The right-hand side of the model illustrates the profiles for "best in class" within the three success criteria. These profiles can be used as a benchmark against "best in industry," which can generate new ideas for improvement. These profiles serve as a foundation for the Excellent Leadership profile, which takes into account the three success criteria and employees' demands of an ideal leader.

Areas of Improvement
The bottom matrix in Figure 5 illustrates the "result" of the process. Multiplying the weights of the employees with the relationships between the leadership styles and the three success criteria creates this end product. Taking the view of the employees, the areas of improvement for the leader can be identified. In other words, the leader is provided with concrete ideas of ways in which the respective areas of improvement are weighted according to employee demands.
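To make the mechanics of this bottom matrix concrete, here is a minimal numerical sketch in Python: a hypothetical employee weight vector is multiplied by an equally hypothetical matrix of style-criterion relationship strengths, and the styles are then ranked by the resulting weighted priority. All numbers are illustrative assumptions, not values from the study.

    import numpy as np

    # Hypothetical employee weights (Ideal Leader profile, 0-100 scale)
    # and hypothetical relationship strengths between each leadership
    # style and the three success criteria (quality, learning, creativity).
    styles = ["Captain", "Creative", "Involved", "Strategic",
              "Task", "Impulsive", "Specialist", "Team Builder"]
    weights = np.array([60, 70, 55, 58, 50, 45, 35, 65])
    rel = np.array([[0.3, 0.2, 0.1],    # Captain
                    [0.2, 0.3, 0.4],    # Creative
                    [0.0, 0.0, 0.0],    # Involved: no influence
                    [0.4, 0.3, 0.2],    # Strategic
                    [0.0, 0.0, 0.0],    # Task: no influence
                    [0.2, 0.0, 0.0],    # Impulsive: quality only
                    [0.0, 0.0, 0.0],    # Specialist: no influence
                    [0.4, 0.4, 0.3]])   # Team Builder
    # Bottom matrix: each style's weight times its impact on the criteria,
    # summed over the criteria, gives a weighted improvement priority.
    priority = (weights[:, None] * rel).sum(axis=1)
    for style, p in sorted(zip(styles, priority), key=lambda sp: -sp[1]):
        print(f"{style:12s} {p:5.1f}")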

Excellent Leadership Profile


The Excellent Leadership profile (also known as the Success Profile) serves as a benchmark for the leaders. It is the ideal leadership profile if the leader wants to succeed in managing quality, learning, and creativity. In this project the overall objective was to create one profile of an excellent leader working actively with the management disciplines included in the success criteria. From this perspective this matrix, at the far right in Figure 5, is considered the most important one in our use of QFD.
The QFD technique has served as the basis for our research and resulted in the identification of the Excellent Leadership profile. The five crucial drivers (leadership styles) for achieving excellent leadership were identified by a factor analysis. By correlating leadership styles with success criteria for business excellence it was possible to identify the styles most positively correlated to business excellence. Expanding the theoretical foundation, as seen in this chapter, to treat the empirical data on European leaders with QFD and thereby take into consideration "employees' ideality" has resulted in a more accurate picture of the true drivers in the achievement of business excellence.
The Excellent Leadership profile shown in the rightmost matrix in the QFD model can be benchmarked against any segment or group of leaders, e.g., leaders from different countries or sectors, of different ages, and so on. Two segments have been selected for further analysis:

1. European leaders' leadership profile versus the Excellent Leadership profile.
2. Country-by-country comparison of European leaders' leadership profiles.

2.5. The Excellent Leadership Profile
In order to evaluate whether or not a leader is equipped to lead an organization to business excellence, a benchmark Excellent Leadership profile (ELP) must be developed. This illustrates the leadership profile that is best oriented toward the achievement of all three of the main business excellence success criteria.

The leadership profile benchmark is based on three groups of leaders: the 20 leaders who scored highest on creativity, quality, and learning, respectively. It is then used to develop the Excellent Leadership profile.

A Note on the Leadership Profile Graphs Used in this Study

1. The eight leadership styles that make up the leadership profiles are measured on a scale of 0 to 100 (vertical axis of Fig. 6).
2. Scores above or below 50 points represent deviations from the average of each leadership style.
3. The closer a leader gets to 100, the more strongly his or her leadership profile is characterized by the elements identified in the description of that particular leadership style.
4. Conversely, the further a score falls below 50, the less applicable those elements are as a description of the leader's profile.
As Figure 6 illustrates, two leadership styles have the predominant influence within the Excellent Leadership profile: the Strategic and the Task.
Figure 6 The Excellent Leadership profile. Dotted lines represent the band of deviation from the Excellent Leadership profile. (The eight leadership styles are plotted on the horizontal axis; scores on the vertical axis range from 20 to 70.)

The Strategic is clearly the most important leadership style when it comes to identifying the characteristics required of a leader seeking
business excellence. The competencies of the Strategic leader are therefore ones that any leader hoping to achieve business excellence must continuously develop. This means that an overall view of the business is essential, with practical details of daily work not being allowed to prevent a focus on strategic goals or get in the way of setting clear organizational objectives.
The strong presence of the Task leadership style within the ELP underlines the fact that a highly developed analytical capability and an extremely result-oriented approach are both necessary for the achievement of business excellence. The Captain, the Creative, and the Team Builder styles also play an important part in achieving business excellence. The ELP confidence interval is above 50, and these styles are therefore important to the Excellent Leadership profile.
Compared to the results in Figure 4 it may seem surprising that the
Task leadership style has such a strong weight in the Excellent Leadership
profile. The explanation for that is that our benchmarks consisted of the 20
leaders who had the highest scores on quality, creativity, and learning. A
characteristic of those leaders was that they also showed a relatively high
score on the questions that correlated positively with the Task leadership
style.
The remaining three styles (the Involved, the Impulsive, and the Specialist) are not regarded as important in the context of the Excellent Leadership profile. As can be seen from Figure 6, they are all broadly "neutral," reaching a score around average. This does not mean that they can be safely disregarded, however, since a score below average (i.e., below 50) would certainly represent a deviation from the ELP. In other words, while leaders need to strive actively to achieve the Strategic and Task leadership competencies and also the softer leadership attributes of the Creative, the Team Builder, and the Captain, they should not ignore the other leadership styles or seek to eliminate them from their profile altogether.

2.6. European Leaders Versus the Excellent Leadership Profile
In Figure 7 two profiles are illustrated: the Excellent Leadership profile interval (dotted lines) and the European leaders' profile (bold line), the latter being the average profile of all 202 European leaders participating in the study. The graphs show that there are large deviations between the European leaders' profile and the ELP on two leadership styles: the Captain and the Strategic.

Figure 7 European leaders versus the Excellent Leadership profile. (Bold line: European leaders; dotted lines: the Excellent Leadership profile interval. Scores for the eight leadership styles are plotted as in Figure 6.)

The Strategic:
1. A score of almost 60 indicates that European leaders do place importance on the skills of the Strategic leader and put them into practice, by taking a long-term view of the company and its direction, setting clear objectives, and being focused on maintaining consistent work processes.
2. They need to develop these competencies even further, however, if they wish to match the ELP.
3. The significant deviation between the leaders' actual performance and the requirements of the ELP is of considerable importance, given that the Strategic leadership style is the most crucial element of the ELP.
The Captain:
1. The European leaders' low score on the Captain style category indicates that they are not "natural" leaders. At best, they learn leadership skills as they grow into their assignment.
2. The below 50 score indicates that these leaders are not strongly characterized by the competencies of this particular leadership style: providing leadership from the front, encouraging open communication, and commanding the respect and trust of employees.

3. Although the Captain is not as crucial to the overall ELP as, for example, the Strategic leadership style, the deviation here is still an important one in terms of providing the balance of leadership styles that is needed to achieve business excellence.

2.7. European Leaders Versus Employees' Ideality
The employees’ Ideal Leadership profile embodies the preferences expressed
by the 1150 employees who participated in the survey. Direct subordinates
to chief executives and managing directors were asked to use their answers
to the first 49 questions of the survey to describe their “ideal” leader-
someone for whom they would be willing to make an extra effort in their
work. Comparing the leaders’ profile with the employees’ Ideal Leadership
profile shows whether the employees are in harmony with the leader for achieving business excellence and where they are in conflict.
Figure 8 highlights four main areas of leadership where European employees' expectations differ significantly from the actual performances of the leaders: the Captain, the Creative leader, the Involved leader, and the Specialist leader. (A difference of 10 points or more is significant.) The two styles positively correlated to achieving business excellence are included in the analysis.

Figure 8 European leaders versus employees' ideality.



The Captain
Figure 8 shows a difference of approximately 18 points between employees’
expectations and actual performance in the Captain category.
The European leaders’ low score in the Captain style category indi-
cates that they are not "natural" leaders. At best, they learn leadership skills
as they grow into their assignment.
The below 50 score indicates that the leaders are not strongly char-
acterized by the competencies of this particular leadership style-providing
leadership from the front, encouraging open communication, and com-
manding the respect and trust of employees.
Employees place a much greater value on the leadership characteristics
of the Captain than their leaders do.
The employees’ score of 60 indicates that they react positively to a
strong “natural” leader who can guide them and to whom they can look
with respect, and that they appreciate the continuous flow of information
provided by the Captain.

The Creative
Figure 8 indicates a difference of approximately 22 points between actual
leadership performance and employee expectations in the creative style cate-
gory.
The Creative style is the leadership style showing the most significant
difference, with employees rating the Creative attributes very highly, at a
score of 70, while their leaders score below 50.
The high score (70) indicates that, in contrast with the Strategic and Task styles, European employees place a high value on leaders who are characterized by the Creative leadership competencies.
The employees show a strong preference for a creative, inspiring, and
courageous leader, scoring higher on this leadership style than on the other
seven. This translates into a strong demand among Europeanemployees for
a leader of vision and innovation whois prepared to deal with the increasing
complexity of the business environment and who sees creativity and con-
tinuousimprovementasthekeysto success. European employees seek a
leader who acts as a source of inspiration, motivating the workforce and
taking courageous business decisions. These expectations, however, are sig-
nificantly abovetherequirementstheirleadersneedtomeet in order to
achieve business excellence.

Comments on the Specialist


There was a difference of approximately 15 points between the leaders' profile and the employees' ideality profile.

The employees' low score on the Specialist leadership style (below 35) can be seen as the mirror image of the high value they place on the Captain and Creative styles. The solitary nature of the Specialist leader, his lack of "people" skills and ability to inspire, are the direct antithesis of the Captain's and the Creative leader's attributes. The Specialist style of leadership is clearly not appreciated or regarded by employees as being of great value.
European leaders, whose Specialist score was significantly above the employee rating for that style, place a greater value on this leadership style than their employees do.

2.8. Conclusions
In seeking to achieve business excellence, European leaders may encounter resistance among their employees. Of crucial significance in this regard is the fact that European employees place a markedly lower value on the Team Builder and Strategic competencies than is required for business excellence. By contrast, their "ideal" leader is heavily characterized as being creative, inspiring, and an active problem solver.
The clear findings from this research study were that the five crucial drivers of business excellence are the Team Builder, the Captain, the Strategic, the Creative, and the Impulsive leadership styles (Fig. 4). Leaders trying to achieve business excellence must therefore view the high-level attainment of these sets of leadership competencies as their paramount objective.
It is important to remember, however, that this must not be done at the cost of neglecting other leadership competencies. As the Excellent Leadership profile demonstrates, the other leadership styles may be of less importance to achieving business excellence than the five leadership styles mentioned above, but this does not mean that they should be neglected altogether. The overall balance of the ELP requires the other leadership styles to be maintained at levels within the ELP interval. Maintaining a certain focus on these competencies is therefore still an important aspect of excellent leadership.

3. MONITORING THE IMPLEMENTATION OF THE SUCCESS CRITERIA FOR BUSINESS EXCELLENCE

Section 2 showed how it is possible to measure and hence to understand the


softer parts of TQM (the intangibles). Remember that Dr. Deming talked
about "the most important numbers being unknown and unknowable," i.e.,

measures of the qualitative world. Section 2 shows an example of how it is possible to measure the mind-set of people by using statistical methods. This section gives an overview on how to monitor and improve tangibles (things, processes, etc.) as well as intangibles. Business excellence can be achieved only if continuous improvements are focused on both areas. Such a focus is an important element of the leadership part of Figures 1 and 2.

3.1. The Plan-Do-Check-Action Cycle for Business Excellence
The problem with leadership is that most managers are confused about how to practice leadership. They need one or more simple models from which they can learn what their main leadership tasks are and how to integrate those tasks in the strategic planning process, a process that each year generates the yearly business plan and also a longer term plan for the company (3-5 year plan). The European model for business excellence may help managers to solve that problem. Both the yearly business plan and the long-term strategic plan can be designed by using the nine criteria of the model; i.e., the plan should comprise the result criteria of the model (what you want to achieve) and the enabler criteria as well (how you decide to work, i.e., how you plan to use intangibles). Figure 9 gives an overview of this Plan-Do-Check-Action (PDCA) approach.


It is seen from Figure 9 that Action consists of a yearly self-assessment of what you have achieved and how you achieved the results. Such a yearly self-assessment is invaluable as input to the next year's strategic planning process.
During the year the plan is implemented with the help of people in the
company's processes, and the results on people satisfaction, customer satis-
faction, impact on society, and business results come in. This implementa-
tion may be visualized as a deployment of the plan to the Do and Check
levels as shown in Figure 10.
Figures 9 and 10 give the guidelines or the overall framework for finding a way to business excellence. The guidelines are monitored by the PDCA cycle, in which Study and Learn (Check) constitute the crucial precondition for continuous improvement of the strategic planning process. The framework has been linked to the European model for TQM and business excellence.
As was pointed out in Section 2, the European model for business excellence is not explicit enough on creativity and learning. For that reason, and also because companies outside Europe may wish to apply other models (e.g., the Malcolm Baldrige model), a more general model is proposed. We call the model the PDCA-leadership cycle for business excellence.

Figure 9 PDCA and strategic planning: the elements of Plan in relation to the yearly strategic planning process (items 1-10). Plan, How? 1. Leadership; 2. People Management; 3. Policy and Strategy; 4. Resources; 5. Processes. Plan, What? 6. People Satisfaction; 7. Customer Satisfaction; 8. Impact on Society; 9. Business Results. Action: 10. Self-Assessment.

This model, which contains the key leadership elements for business excellence, is shown in Figure 11.
It is seen from Figure 11 that the Plan component contains the vision, mission, and goals of the company together with the business plan, which contains goals for both tangibles and intangibles. In the Do phase the plan has to be deployed through policy deployment. Two other elements are crucial for an effective implementation of the business plan: (1) the leadership style of all managers and (2) education and training. The Check phase of the PDCA-leadership cycle comprises two elements: (1) gaps between goals and results have to be identified, and (2) the gaps have to be studied for learning purposes. Once we understand why the gaps came up we are ready for Action. This phase should result in new ideas for improvement of people, processes, and products and new ideas for motivation of the people.

Figure 10 PDCA and implementation: deployment of the plan to the Do and Check levels. Plan: 1. Leadership; 2. People Management; 3. Policy and Strategy; 4. Resources. Do: 5. Processes. Check: 6. People Satisfaction; 7. Customer Satisfaction; 8. Impact on Society; 9. Business Results. Action: 10. Self-Assessment.

With this raw material the company has strong input for the next PDCA-
leadership cycle for business excellence.
Let us look more specifically at education and training in the Do phase.

3.2. Education and Training for Business Excellence


The overall purpose of education and training is to build quality into people so that it is possible to practice real empowerment for business excellence. This can be achieved only if education and training are part of an overall leadership process where improvements in both tangibles and intangibles support each other as natural elements of the strategic planning process. Tangible world-class results are evidence of business excellence, but the precondition for the tangible results is the intangible results such as recognition, achievement, and self-realization.
Figure 11 The PDCA-leadership cycle for business excellence.

The intangible results are a precondition for building values into the processes, i.e., value building of intangible processes, which again will improve the tangible results. Figure 12 shows how this process is guided by the principles of the TQM pyramid supplemented by education and training.
If we look at Education and Training (Fig. 12), we see that it forms the foundation of a temple and that its aim, quality of people, is the roof of the temple. The pillars of the temple are the main elements of Education and Training: (1) learning, (2) creativity, and (3) team building. Training in team building is a necessary element to support and complement creativity and learning. The importance of team building was also clearly demonstrated in Section 2 of this chapter (see Figs. 4 and 7).
The main elements of the three pillars are shown in Figures 13-15. It is seen that the elements of each pillar are subdivided into a logic part and a nonlogic part. The logic part of each pillar contains the tools to be used for improvement of tangibles (things, processes, etc.), and the nonlogic part contains the models, principles, and disciplines that are needed to improve intangibles such as the mind-set of people (mental models, etc.). Learning and applying the tools from the logic part of the three pillars may also gradually have an indirect positive effect on intangibles.
Most of the methods presented in this volume are related to the logic part of the three pillars. To build quality into people and to achieve business excellence, logic is not enough. Education and training should also comprise the nonlogic part of the pillars, which is a precondition for effective utilization of the well-known logical tools for continuous improvement. It is a common learning point of world-class companies that managers are the most important teachers and coaches of their employees. That is the main reason why education and training are integrated in the PDCA-leadership cycle for business excellence.

4. CONCLUSION

In this chapter the role of statistical methods in monitoring the implementation of TQM and business excellence has been discussed. It has been argued that in order to achieve business excellence it is necessary to continuously improve tangibles (things, processes) as well as intangibles (e.g., the mind-set of people). Improving the mind-set of people is the same as building quality into people. Improvement of tangibles requires education and training on the well-known statistical tools such as statistical process control. Improvement of intangibles requires education and training on nonlogical models, principles, and disciplines. Both types of education and training are needed to achieve business excellence.
Figure 12 The continuous improvement process for business excellence.



Figure 13 The logic and nonlogic parts of Learning in Education and Training.

Figure 14 The logic and nonlogic parts of Creativity in Education and Training.


Figure 15 The logic and nonlogic parts of Team Building in Education and Training.

Section 2 showed how it is possible to use statistical methods to understand and improve the soft part of TQM implementation: leadership styles. Without understanding the effects of leadership styles it is impossible to practice effective leadership. As leadership is both the foundation of the TQM pyramid in Figure 1 and the first enabler criterion of the European model for business excellence (see Fig. 2), it is obvious that the first step on the journey to business excellence should be to try to assess and improve the different leadership styles of the company's managers.
Section 3 showed how leadership can be practiced and monitored by a simple PDCA-leadership cycle. It was shown that in this cycle the implementation of the company's business plan is accomplished by people working in the different processes that are running day by day. These people need education and training in the well-known statistical tools for improvement of tangibles (things, processes, etc.) as well as education and training in models, principles, and disciplines for improvement of intangibles (the mind-set). It was argued that the company's business plan should contain improvement goals for tangibles as well as intangibles. Only in this way can business excellence be achieved.

RECOMMENDED READINGS

Dahlgaard JJ, Kristensen K. Vejen til Kvalitet (in Danish). Centrum, Aarhus, Denmark, 1992.
Dahlgaard JJ, Kristensen K, Kanji GK. The Quality Journey--A Journey Without an End. Carfax, Abingdon, UK, 1994, and Productivity Press, India, 1996.
Dahlgaard JJ, Kristensen K, Kanji GK. Total Quality Leadership and Employee Involvement (The TQM Pyramid). Proceedings of ICQ'96, Yokohama, JUSE, 1996.
Dahlgaard JJ, Kristensen K. Milliken Denmark. A case study included in the European Way to Excellence Case Study Project, EU DGIII and EFQM, 1997.
Dahlgaard JJ, Kristensen K, Kanji GK. The Fundamentals of Total Quality Management. Chapman & Hall, London, UK, 1997.
Park Dahlgaard SM, Kondo Y. Quality Motivation (in Danish). Centrum, Aarhus, Denmark, 1994.
Fritz R. Corporate Tides: Redesigning the Organization. Butterworth-Heinemann, Oxford, UK, 1994.
Imai M. Kaizen--The Key to Japan's Competitive Success. Kaizen Institute, Tokyo, Japan, 1986.
Kanji GK, Asher M. Total Quality Management Process--A Systematic Approach. Carfax, Abingdon, UK, 1993.
Kanji GK. Quality and statistical concepts. In: Kanji GK, ed. Proceedings of the First World Congress, Sheffield, UK. Chapman & Hall, Cornwall, UK, 1995.
Majaro S. Managing Ideas for Profit: The Creative Gap. McGraw-Hill, London, UK, 1992.
Morgan G. Imaginization: The Art of Creative Management. Sage Publications, London, UK, 1993.
Senge PM. The Fifth Discipline--The Art & Practice of the Learning Organization. Doubleday Currency, New York, 1990.
Urban GL. A second industrial revolution. Across the Board 32(2), 1995.
5
A Methodological Approach to the
Integration of SPC and EPC in Discrete
Manufacturing Processes
Enrique Del Castillo
Pennsylvania State University, University Park, Pennsylvania

Rainer Gob and Elart von Collani


University of Wuerzburg, Wuerzburg, Germany

1. INTRODUCTION

The control of industrial manufacturing processes has long been considered from two different points of view. Statistical process control (SPC), which traces back to the work of Walter Shewhart in the 1920s, was originally developed for discrete manufacturing industries (industries concerned with the production of discrete items). On the other hand, continuous process industries, chemical industries for instance, used various forms of adjustment strategies administered by automatic controllers. This type of process control became known as engineering process control (EPC) or automatic process control (APC). Separately, both approaches have received enormous interest in the academic literature.
Interest in SPC and EPC integration originated in the 1950s in the chemical industries. Part of this interest is due to the inertial elements in this type of production process (e.g., raw materials with drifting properties) that result in autocorrelated quality characteristics of the end product. Traditional SPC methods assume instead i.i.d. quality characteristics, and problems of a high number of false alarms and the difficulty in detecting process shifts occur under (positive) autocorrelation at low lags. If the autocorrelation structure can be modeled and a compensatory variable can be
found to modify the quality characteristic, then an EPC scheme is put into place to compensate for such drifting behavior. However, abrupt, large shifts in the quality characteristic indicate major failures or errors in the process that cannot generally be compensated for by the EPC controller. For this reason, many authors have suggested that an SPC chart be added at the output of an EPC-controlled production process to detect large shifts. There is no clear methodology, however, that models such integration efforts in a formal and general way.
In contrast, interest in SPC-EPC integration in discrete part manufacturing is more recent. In this type of production process, elements that induce autocorrelation are not common. However, drifting behavior of a process that "ages" is common. A typical example of this is a metal machining process in which the performance of the cutting tool deteriorates (in many cases, almost linearly) with time. Many years ago, when market competition was not so intense, specifications were wide enough for a production process to drift without producing a large proportion of nonconforming product. With increasing competition, quality specifications have become more rigorous, and drifting behavior, rather than being tolerated, is actively compensated for by simple EPC schemes.
Academic interest in the area of SPC-EPC integration has occurred as a natural reaction to the requirements of industrial practices. However, most of the approaches suggested during the discussion on this problem argued from the point of view of practical necessities alone. Proponents of either side admit that many control problems in modern manufacturing processes cannot be solved by either SPC or EPC alone. As a consequence, methods from each field are recommended as auxiliary tools in a scheme originally developed either for SPC or for EPC applications alone. None of these approaches have been really successful from a methodological point of view. The models used were originally designed for either proper SPC or EPC applications but not for an integration of the two. The practical necessity of an integrating approach to industrial control problems is obvious, but a rigorous mathematical model to reflect this need is still missing. As a reaction to this methodological gap, the present chapter establishes a simple model that integrates the positions of SPC and EPC.

2. MODELS PROPOSED IN THE LITERATURE FOR SPC-EPC INTEGRATION

Although diverse authors have discussed the different aims and strategies of SPC and EPC (e.g., Barnard, 1963; MacGregor, 1988, 1990; Box and Kramer, 1992; Montgomery et al., 1994), few specific models have been proposed for the integration of these fields. Among these models we find algorithmic statistical process control (ASPC) and run-to-run control procedures.

2.1. ASPC
Vander Wiel et al. (1992) (see also Tucker et al., 1993) model the observed quality characteristic ξ_t of a batch polymerization process at time t as

ξ_t = μ 1_{[t ≥ t_0]} + x_t + N_t    (1)

where the first term on the right represents a shift of magnitude μ that occurs at time t_0, x_t is the compensatory variable, and the noise term N_t is a stationary ARMA(1,1) stochastic process. In what the authors refer to as algorithmic statistical process control (ASPC), process shifts are monitored by a CUSUM chart, whereas the ARMA noise is actively compensated for by an EPC scheme. Using a similar approach, Montgomery et al. (1994) presented some simulation results. Clearly, ASPC is focused on continuous production processes.
A basic weakness of the ASPC approach is that there is no explicit stochastic model for the time t_0 of shift occurrence.

2.2. Run-to-Run Process Control


Sachs et al. (1995) (see also Ingolfsson and Sachs, 1993) assume instead a simple linear regression model with no dynamic effects for controlling certain semiconductor manufacturing processes. The model is

ξ_t = α + β x_t + ε_t    (2)

By using a control chart on the residuals of model (2), called "generalized SPC" by the authors, their method applies two different types of EPC schemes: an Exponentially Weighted Moving Average (EWMA)-based controller if the observed deviation from target is "small" (called "gradual control" by the authors) and a Bayesian controller that determines the moment and magnitude of larger deviations in case they occur. Other authors (Butler and Stefani, 1994; Del Castillo and Hurwitz, 1997) extended model (2) to the case where deterministic trends and ARMA(1,1) noise exist.
A basic weakness of the run-to-run models is that the classical rationale of SPC applications is a shift, stochastic in time of occurrence and/or in magnitude. Again, this is not reflected by the run-to-run models.
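To illustrate the run-to-run idea, the following is a minimal Python sketch of an EWMA-based scheme in the spirit of the "gradual control" mode for a process of the form of model (2). The process gain β is assumed known, and all numerical values (λ, target, noise level, disturbance) are assumptions chosen for the example, not values from the cited papers.

    import numpy as np

    rng = np.random.default_rng(1)

    # Assumed process, as in model (2): y = alpha + beta * x + noise.
    alpha_true, beta = 2.0, 1.5      # beta is assumed known to the controller
    T, lam, sigma = 10.0, 0.3, 0.5   # target, EWMA weight, noise std
    a_hat, x = 0.0, 0.0              # intercept estimate and current recipe
    for t in range(50):
        if t == 25:                  # an assumed step disturbance
            alpha_true += 3.0
        y = alpha_true + beta * x + rng.normal(0.0, sigma)
        # EWMA update of the intercept estimate from the latest run:
        a_hat = lam * (y - beta * x) + (1 - lam) * a_hat
        # Recipe for the next run, aiming the predicted output at target:
        x = (T - a_hat) / beta
    print(f"final recipe x = {x:.2f}, last output y = {y:.2f}")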

3. BASIC FEATURES OF MODELS IN SPC AND EPC


3.1. Process Changes in SPC and EPC

Any approach to process control needs a model of process changes, i.e., a model for the changes in the process parameters (output mean, output variance, output proportion nonconforming) that occur throughout production time. On this topic, the traditional approaches to SPC and EPC differ significantly, corresponding to their origins in different types of industries.
The overwhelming majority of SPC models identify process changes as abrupt shifts of the process parameters due to assignable causes, disturbances, or shocks that affect the manufacturing facilities. These shift models generally share four basic assumptions:
1. The magnitude of shifts is large relative to the process noise variance.
2. Shifts are rare events; the period up to the occurrence of the first shift or between two successive shifts is large.
3. Shifts can result from a variety of assignable causes. Detection of a specific assignable cause requires expert engineering knowledge of the production process, and it is time-consuming and expensive.
4. Control actions to remove assignable causes of variation are time-consuming and expensive, requiring skilled staff, machinery, and material. These actions result in rearranging the process parameters to the "in-control" or target values, e.g., recentering the process mean. That is, these actions are corrective in nature.
Under these assumptions, a constant automatic adjustment strategy (i.e., an EPC) obviously is not the appropriate remedy for process variation.
Engineering process control models that originate in continuous process industries consider process changes in the form of continuous drifts. In contrast to the shift models of SPC, the basic assumptions of the EPC approach are that
1. The drift is slow. Measured over a short time interval, the drift effect is small relative to the noise variance.
2. The drift permanently continues throughout the production time.
3. The drift is an inherent property of the production process. Expert engineering knowledge of the production process provides knowledge of the structure of the drift process. To a certain degree, the drift effect can be estimated and predicted.
4. The control actions taken to counteract the drift effect are minor in terms of time and expense. They follow a repetitive procedure
or algorithm that can be left to automatic controllers. These control actions have no effect on the process parameters but rather compensate for observed deviations of the quality characteristic. That is, the causes of the drifting behavior are not "corrected" but only compensated for.
Under these assumptions, constant automatic adjustments are a reasonable control strategy.

3.2. Open-Loop and Closed-Loop Behavior of a Process


The mathematical models used by SPC and EPC reflect different ideas about process changes and process control. We shall explain the differences and similarities for the simple situation of a process that, at successive discrete time points 0, 1, 2, ..., produces output with a single quality characteristic ξ_0, ξ_1, ξ_2, .... An essential aspect is the distinction between the open-loop behavior (behavior without control actions) and the closed-loop behavior (behavior in the presence of control actions) of such a process.
In SPC, detection of an assignable cause and subsequent corrective action occurs only rarely. If it occurs, it amounts to a complete renewal of the manufacturing process. Hence it is useful to split the entire production run into the periods (renewal cycles) between two successive corrective actions (renewals) and to consider each renewal cycle along a separate time axis 0, 1, 2, ... with corresponding output quality characteristics ξ_0, ξ_1, ξ_2, .... The effect of control actions is not reflected in the output model.
In standard EPC, control actions are taken regularly at each time point. Without these permanent compensatory actions the process would exhibit a completely different behavior. A model of the process behavior without control is indispensable for the design and evaluation of control rules. Thus we have the open-loop output quality characteristics ξ^o_0, ξ^o_1, ξ^o_2, ... of the process without control (left alone) and the closed-loop output quality characteristics ξ_0, ξ_1, ξ_2, ... of the process subject to control actions.

3.3. Process Changes in SPC Models

Statistical process control is designed for manufacturing processes that exhibit discrete parameter shifts that occur at random time points. Thus in SPC models the most general form of the output process (ξ^o_t)_{N_0} is the sum of a marked point process and a white noise component. This approach is expressed by the model

ξ^o_t = μ_t + ε_t    (3)
In this formula (μ_t)_{N_0} is a marked point process,

μ_t = μ* + Σ_{i=1}^{N_t} δ_i    (4)

with a target μ*, with marks δ_1, δ_2, ... representing the sizes of shifts 1, 2, ..., and a counting process (N_t)_{N_0} that gives the number of shifts in the time interval [0; t). (ε_t)_{N_0} in Eq. (3) is a white noise process independent of (μ_t)_{N_0}. The white noise property is expressed by

E[ε_t] = 0,  V[ε_t] = σ²,  E[ε_t ε_{t+k}] = 0  for all t ∈ N_0, k ∈ N    (5)

A simple and popular instance of a marked point process is one with shifts occurring according to a Poisson process (N_t)_{N_0}. Most investigations on control charts use further simplifications. For instance, deterministic absolute values |δ_i| = Δ (Δ > 0) of the shifts are frequently assumed. Many approaches assume a single shift of a given absolute value Δ that occurs after a random (often assumed to be exponentially distributed) time ν. In this case we have

μ_t = μ* + γ Δ 1_{(ν,∞)}(t)    (6)

where the random variable γ is the sign of the deviation from target, with

P[γ = +1] = p = 1 − P[γ = −1]

In the case of one-sided shifts, we have p = 0 or p = 1; in the case of two-sided shifts, it is usually assumed that p = 0.5.
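A minimal Python simulation of this single-shift model, with assumed values for Δ, p, σ, and the mean time to shift, is sketched below.

    import numpy as np

    rng = np.random.default_rng(2)

    mu_star, delta, sigma = 0.0, 2.0, 1.0  # target, shift size, noise std
    p = 0.5                                # P(positive sign): two-sided case
    nu = rng.exponential(scale=100.0)      # random time until the shift
    gamma = 1 if rng.random() < p else -1  # random sign of the shift

    t = np.arange(300)
    mu = mu_star + gamma * delta * (t > nu)        # marked point process (6)
    xi = mu + rng.normal(0.0, sigma, size=t.size)  # output, Eq. (3)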

3.4. Process Changes in EPC Models
Engineering process control is designed for manufacturing processes that
exhibit continuous parameter drifts. Some typical instances of open-loop
output sequences in EPC models are as follows.

ARMA Models
An important family of models used to characterize drifting behavior occurring due to autocorrelated data is the family of ARMA(p, q) models (Box and Jenkins, 1976):

ξ^o_t − h_1 ξ^o_{t−1} − ... − h_p ξ^o_{t−p} = ε_t + ψ_1 ε_{t−1} + ... + ψ_q ε_{t−q}    (7)

where (ε_t)_{N_0} is a white noise sequence [see Eq. (5)]. By introducing the backshift operator B^k ξ_t = ξ_{t−k}, Eq. (7) can be written as

h(B) ξ^o_t = ψ(B) ε_t

or as

ξ^o_t = (ψ(B)/h(B)) ε_t

where h(B) and ψ(B) are stable polynomials in B. Sometimes, nonstationary ARIMA(p, d, q) models of the form

h(B)(1 − B)^d ξ^o_t = ψ(B) ε_t

have been used instead to model drifting behavior in continuous production processes.
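For instance, an ARMA(1,1) disturbance can be generated directly from the recursion in (7) with p = q = 1; the parameter values below (h_1 = 0.8, ψ_1 = −0.3) are arbitrary illustrative choices.

    import numpy as np

    rng = np.random.default_rng(3)

    h1, psi1, sigma, n = 0.8, -0.3, 1.0, 500  # illustrative values
    eps = rng.normal(0.0, sigma, size=n)
    xi = np.zeros(n)
    for t in range(1, n):
        # ARMA(1,1): output depends on the previous output, the current
        # shock, and the previous shock.
        xi[t] = h1 * xi[t - 1] + eps[t] + psi1 * eps[t - 1]
    # Positive autocorrelation at lag 1 -- the situation in which i.i.d.-
    # based control charts produce excessive false alarms:
    print(np.corrcoef(xi[:-1], xi[1:])[0, 1])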

Deterministic Drift
If the drifting behavior is caused by aging of a tool (see, e.g., Quesenberry, 1988), a simple regression model of the form

ξ^o_t = T + dt + ε_t    (8)

is sufficient to model most discrete manufacturing processes. Here, T is a target value and dt is a deterministic time trend.

Unit Root Trend


Alternatively, a "unit root" process can be used to model linear drifts by using

(1 − B) ξ^o_t = d + h(B) ε_t    (9)

For example, if h(B) = 1, then (9) is a random walk with drift d that has behavior similar to that given by (8) but with variance that increases linearly with time.
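The linearly increasing variance that distinguishes the unit root model from the deterministic trend (8) is easy to check by simulation; d and σ below are arbitrary assumed values.

    import numpy as np

    rng = np.random.default_rng(4)

    d, sigma, n, reps = 0.1, 1.0, 200, 2000     # assumed values
    eps = rng.normal(0.0, sigma, size=(reps, n))
    # Random walk with drift: xi_t = xi_{t-1} + d + eps_t, xi_0 = 0.
    xi = np.cumsum(d + eps, axis=1)
    # Mean grows like d*t as under (8), but V[xi_t] = sigma^2 * t:
    print(xi[:, -1].mean(), xi[:, -1].var())    # approx. d*n and sigma^2 * n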

3.5. Common Structure of SPC and EPC Models


Analyzing the SPC models of Section 3.3 and the EPC models of Section 3.4, we can point out a common structure that is helpful in developing an approach for an integrating model.
We shall decompose the output into two components. One of these components is a function of the white noise variables ε_s alone and represents completely uncontrollable random variation. The other component, θ_t, represents the effective state of the process, which is subject to an inherent drift or to a shift due to an assignable cause. Formally, this means to consider the output equations (3), (7), (8), and (9) as special cases of the model

ξ^o_t = θ_t + G_t((ε_s)_{s≤t})    (10)

In many cases θ_t is a deterministic function of t that coincides with the output mean, i.e., E[ξ^o_t] = θ_t. The argument (ε_s)_{s≤t} is required to allow for possible cumulative effects of the white noise variables on the process output; see (13) or (14) below.
Let us rewrite the models (3), (7), (8), and (9) in terms of (10).
As to the shift model (3), let

θ_t = μ_t,    G_t((ε_s)_{s≤t}) = ε_t    (11)

For the deterministic drift model (8), let

θ_t = T + dt,    G_t((ε_s)_{s≤t}) = ε_t    (12)

For the random walk with drift model [see (9) with h(B) = 1], let

θ_t = T + dt,    G_t((ε_s)_{s≤t}) = Σ_{s=1}^t ε_s    (13)

Finally, for the ARMA(p, q) model (7), we can identify

θ_t = 0,    G_t((ε_s)_{s≤t}) = Σ_{r=0}^t c_r ε_{t−r}    (14)

where the coefficients c_r are given by the expansion

ψ(B)/h(B) = Σ_{r=0}^∞ c_r B^r    (15)
4. MODELING THE INTEGRATION OF SPC AND EPC

Simultaneous application of SPC and EPC procedures to the same manufacturing process makes sense only if the process exhibits both kinds of changes considered in Section 3: discrete and abrupt variation by shift, which represents the position of SPC, and continuous variation by drift, which represents the position of EPC. Consequently, an integrative model for SPC and EPC should contain components corresponding to the two types of process variation models given in Section 3: a marked point process component to justify the use of SPC (see Section 3.3) and a drift component to justify the use of EPC (see Section 3.4).
In view of the common structure of SPC and EPC models formulated by (10), a unifying model for SPC and EPC can be expressed by the following model for the uncontrolled (open-loop) process output ξ^o_t:

ξ^o_t = F_t((μ_s^{(1)})_{s≤t}, ..., (μ_s^{(K)})_{s≤t}, (θ_s^{(1)})_{s≤t}, ..., (θ_s^{(M)})_{s≤t}, (ε_s)_{s≤t})    (16)

where (μ_t^{(1)})_{N_0}, ..., (μ_t^{(K)})_{N_0} are K different marked point processes representing the effect of shifts to be treated by SPC [see Eq. (4)], (θ_t^{(1)})_{N_0}, ..., (θ_t^{(M)})_{N_0} are M different drift processes that represent the effect of continuous drifts to be treated by EPC [see Eqs. (12), (13), and (15)], and (ε_t)_{N_0} is a white noise sequence [see Eq. (5)]. For some applications it is necessary to choose all past values (μ_s^{(i)})_{s≤t}, (θ_s^{(j)})_{s≤t}, (ε_s)_{s≤t} as the arguments of the functions F_t to allow for possible cumulative effects of μ_0^{(i)}, ..., μ_t^{(i)}, θ_0^{(j)}, ..., θ_t^{(j)}, ε_0, ..., ε_t on ξ^o_t (see Section 4.2, random walk drift model).
Equation (16) gives a generic framework for a process model that integrates the positions of SPC and EPC. Let us now consider three important examples with one drift component, i.e., with M = 1.

4.1. Additive Disturbance

In many cases an abrupt shift can be modeled as a translation of the output value ξ^o_t. To express this situation in the terms of model (16), we choose

ξ^o_t = μ_t + θ_t + G_t((ε_s)_{s≤t})

where (μ_t)_{N_0} is a shift process of the type introduced by Eq. (4), (θ_t)_{N_0} is a process that represents the effect of continuous drifts [see Eqs. (12), (13), and (15)], and (ε_t)_{N_0} is a white noise sequence [see Eq. (5)]. In many cases we simply have G_t((ε_s)_{s≤t}) = ε_t [see Eqs. (11) and (12)]. For examples of functions G_t((ε_s)_{s≤t}) that express a cumulative effect of the white noise variables, see Eq. (14).

4.2. Shift in Drift Parameters

Usually the models for drift processes (θ_t)_{N_0} that are used in EPC depend on parameters. These parameters can be subject to shifts during production. Engineering controllers, however, are designed for fixed and known parameter values and cannot handle such sudden parameter shifts. Even adaptive EPC schemes have the fundamental assumption that the changes in the parameters are slow compared to the rate at which observations are taken (Astrom and Wittenmark, 1989). Thus supplementary SPC schemes are required to detect these abrupt changes (Basseville and Nikiforov, 1993). Let us consider two simple models that will be investigated in some detail in Section 5.

Shift in Trend Parameter-Deterministic Drift Model

In the original parametric model, see (12), let the drift component (θ_t)_{N_0} be described by a deterministic trend,

θ_t = T + dt

i.e., by the recursion

θ_t = θ_{t−1} + d,  θ_0 = T    (17)

with a parameter d and a target value T. However, the drift parameter d may be subject to abrupt shifts, as may occur, e.g., when a cutting tool starts to fail. Thus the parameter value at time t should be considered as a random variable μ_t, where (μ_t)_{N_0} is a marked point process as given by (4) with target μ* = d. Replacing d by μ_t in (17) we obtain the output equation

ξ^o_t = T + Σ_{i=1}^t μ_i + ε_t    (18)

In the scheme of Eq. (16) we have K = 1 and M = 1.
Shift in Trend Parameter-Random Walk with Drift Model

In the original parametric model the drift component (θ_t)_{N_0} is the same as in the deterministic drift model [compare Eqs. (12) and (13)]. Again, the parameter d may be subject to abrupt shifts. Thus the value of the parameter d at time t should be considered as a random variable μ_t, where (μ_t)_{N_0} is a marked point process as given by (4) with target μ* = d. To calculate the effect on the output we have to insert μ_t for d in the difference equation (9) with h(B) = 1. We obtain the output equation

ξ^o_t = T + Σ_{i=1}^t μ_i + Σ_{i=1}^t ε_i    (19)

Equation (19) constitutes a special case of Eq. (16) with K = 1 and M = 1.
4.3. Additive Disturbance and Shift in Drift Parameters


As a generalization we can consider a combination of the models of Sections 4.1 and 4.2: an additive disturbance component (μ_t^{(1)})_{N_0} and a shift component (μ_t^{(2)})_{N_0} in the drift parameter. Let us sketch this approach for the deterministic trend and the random walk with drift models.

Additive Disturbance and Shift in Trend Parameter-Deterministic Trend Model

Consider the deterministic trend model under the assumption that there are possible shifts of the drift parameter d represented by a marked point process (μ_t^{(2)})_{N_0} with target μ*_2 = d and that there is an additive disturbance represented by a marked point process (μ_t^{(1)})_{N_0} with target μ*_1 = 0. Then we obtain the output equation

ξ^o_t = T + μ_t^{(1)} + Σ_{i=1}^t μ_i^{(2)} + ε_t    (20)

Equation (20) constitutes a special case of Eq. (16) with K = 2 and M = 1.
Additive Disturbance and Shift in Trend Parameter-Random Walk with Drift Model

Consider the random walk with drift model under the assumption that there are possible shifts of the drift parameter d represented by a marked point process (μ_t^{(2)})_{N_0} with target μ*_2 = d and there is an additive disturbance represented by a marked point process (μ_t^{(1)})_{N_0} with target μ*_1 = 0. Then we obtain the output equation

ξ^o_t = T + μ_t^{(1)} + Σ_{i=1}^t μ_i^{(2)} + Σ_{i=1}^t ε_i    (21)

In the scheme of Eq. (16) we have K = 2 and M = 1.
5. ENGINEERING PROCESS CONTROLLERS

If a compensatory variable x_t can be determined in a production system, then a control rule of the form

x_t = c_t(ξ_0, ξ_1, ..., ξ_t)    (22)

can be devised. Usually, a controller such as Eq. (22) is found by optimizing some performance (or cost) index J. A common index is

J_1 = Σ_{t=1}^N E[(ξ_t − T)²]    (23)

where T denotes the process target and N is the total number of observations the process is going to be run. Minimization of J_1 results in a minimum mean square error (MMSE) controller (Box and Jenkins, 1976), which is also called a minimum variance controller by Astrom (1970) if ξ_t denotes deviation from target, in which case T = 0 in (23). From the principle of optimality of dynamic programming, it can be shown that the minimizing criterion (23) is equivalent to minimizing each E[(ξ_t − T)²] separately (Soderstrom, 1994, p. 313).
Other cost indices have been proposed for quality control applications. The following cost index was proposed by Box and Jenkins (1963) for their "machine tool" problem:

J_2 = Σ_{t=1}^N E[(ξ_t − T)² + c δ(x_t − x_{t−1})]    (24)

where δ(u) = 0 if u = 0 and δ(u) = 1 if u ≠ 0. This is a function with quadratic off-target cost and a fixed adjustment cost, c, independent of the magnitude of the adjustment x_t − x_{t−1}. With this cost structure, the authors showed that it is optimal to wait until the process is sufficiently far from target in order to perform an adjustment, a policy that resembles an SPC control chart. However, the width of the "adjustment limits" is a function of the relative adjustment cost c/σ and is not based on statistical considerations (Box and Jenkins, 1963; Crowder, 1992).
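A minimal Python sketch of such a deadband policy: the process is left alone as long as the observed deviation from target stays within assumed adjustment limits ±L, and only when the limits are exceeded is a compensating adjustment made, each one incurring the fixed cost c. The wandering-disturbance model and all numbers are assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(5)

    T, L, sigma, n = 0.0, 2.0, 0.5, 200  # target, assumed limit, noise std
    disturbance, x, n_adj = 0.0, 0.0, 0
    for t in range(n):
        disturbance += rng.normal(0.0, sigma)  # slowly wandering disturbance
        y = T + disturbance + x                # observed output
        if abs(y - T) > L:                     # outside the deadband:
            x -= y - T                         # one compensating adjustment
            n_adj += 1                         # (each adjustment costs c)
    print(f"{n_adj} adjustments in {n} periods")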
Fixed adjustment costs may be common in certain production processes. However, if x_t represents a setting of some machine (i.e., a setpoint for an automatic controller included with the equipment), then the adjustment cost c is practically zero and J_2 reduces to an MMSE controller.
We now investigate two simple EPC controllers for the drift processes of Eqs. (8) (deterministic trend) and (9) (random walk with drift) in the general framework of Section 4.3. The simpler situations of Sections 4.1 and 4.2 are obtained as special cases of the general scheme. Control rules will be designed according to the J_1 criterion (MMSE). We will assume that the effect of the sequence (x_t)_{N_0} of compensatory variables on the output process is expressed by

ξ_t = ξ^o_t + x_{t−1}    (25)

which implies that the full effect of the compensatory variable is felt imme-
diately on the quality characteristic. Furthermore,we assume as before that
the noise variables ( E , ) ~ , form a white noise sequence. These two assump-
tions guarantee that the closed-loop variablesto, tl,. . . are all independent.
This makes it easier to see how the MMSE criterion (23) is equivalent to
requiringthateachsquaredeviationbe minimizedseparatelywithout
recourse to dynamic programming techniques.

5.1. Control of Deterministic Trend


We consider the deterministic trend model of Section 4.3 with a possible shift in the trend parameter d and an additive disturbance. By (20) and (25), the equation of the output of the controlled process is

ξ_t = T + μ_t^{(1)} + Σ_{i=1}^t μ_i^{(2)} + ε_t + x_{t−1}    (26)

It is clear that the control rule has to be designed for the case where the shift components μ_t^{(i)} are on their targets μ*_i, i.e., for the case

ξ_t = T + dt + ε_t + x_{t−1}    (27)

By (22), x_{t−1} is independent of ε_t; hence

E[(ξ_t − T)²] = E[(dt + x_{t−1})²] + σ² ≥ σ²

Obviously, equality is obtained for

x_{t−1} = −dt    (28)

and at the "current" time t we implement the control action

x_t = −d(t + 1)

Hence the MMSE controller as defined by (28) corresponds to a pure "feedforward" controller (i.e., the observation is not "fed back" into the control equation, but rather the anticipated disturbance is used). Controller (28) is equivalent to rule d1 in Quesenberry (1988) if the sample size k of that paper equals 1.

Under the effect of the shift components (μ_t^{(i)})_{N_0}, the effect of control rule (28) on the output can, by (26), obviously be expressed as

ξ_t = T + μ_t^{(1)} + Σ_{i=1}^t (μ_i^{(2)} − d) + ε_t    (29)
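A short simulation (all values assumed) illustrates both the strength and the blindness of the feedforward rule (28): with the trend parameter on target the closed-loop output stays at T + ε_t, but after a shift in the trend parameter the output ramps away from target, exactly as Eq. (29) predicts.

    import numpy as np

    rng = np.random.default_rng(6)

    T, d, sigma, n = 10.0, 0.2, 0.5, 100  # assumed target, slope, noise
    nu2, delta2 = 60, 0.1                 # assumed shift in the trend slope
    cum_drift, x_prev = 0.0, 0.0          # running sum of mu_i^(2); x_0 = 0
    for t in range(1, n + 1):
        cum_drift += d + (delta2 if t > nu2 else 0.0)
        # Controlled output, Eq. (26), with no additive disturbance:
        xi = T + cum_drift + rng.normal(0.0, sigma) + x_prev
        # Pure feedforward rule (28): compensate the nominal trend only.
        x_prev = -d * (t + 1)
        if t % 20 == 0:
            print(t, round(xi - T, 2))  # near 0 before nu2, ramps afterwards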

5.2. Control of Random Walk with Drift


An alternative model for linear drift is the random walk with drift stochastic process. As in the second subsection of Section 4.3, we admit possible additive shifts, represented by a process $(\mu_t^{(1)})_N$, and possible shifts in the drift parameter, represented by a process $(\mu_t^{(2)})_N$. By (21) and (25), the equation of the output of the controlled process is

As in Section 5.1, the control rule has to be designed for the case in which the shift components $\mu_t^{(i)}$ are on their targets $\mu_i^*$, i.e., for the case

By (22), the remaining terms on the right-hand side are independent of $\varepsilon_t$; hence

Obviously, equality is obtained for

which defines the MMSE control rule.


It is interesting to contrast control rules (28) and (31). Equation (31) is a "feedback" rule, since the observed value of the quality characteristic ($\xi_t$) is sent back to the controller to determine the next value of the input variable ($x_t$). For $x_0 = 0$ we obviously obtain from (31) that

$x_t = -dt - \sum_{i=1}^{t} (\xi_i - T)$    (32)

The second term on the right-hand side of (32) justifies the name "discrete integral controller" used for this type of control rule.
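A sketch of the feedback rule under the same hedged output assumption (the random-walk-with-drift disturbance $w_t = w_{t-1} + d + \varepsilon_t$ enters the output additively together with the last control action): differencing (32) gives the recursion $x_t = x_{t-1} - d - (\xi_t - T)$, and the closed-loop output again reduces to $T + \varepsilon_t$.

```python
import numpy as np

rng = np.random.default_rng(1)
T, d, sigma, n_steps = 1800.0, 0.5, 60.0, 200   # illustrative values only

w = T            # random-walk-with-drift disturbance, started at the target
x = -d           # x_0 chosen to compensate the first drift step (assumption)
xi = []
for _ in range(n_steps):
    w += d + rng.normal(0.0, sigma)   # w_t = w_{t-1} + d + eps_t
    out = w + x                       # assumed output model: xi_t = w_t + x_{t-1}
    xi.append(out)
    x += -d - (out - T)               # recursive form of the integral rule (31)

print(np.mean(xi), np.std(xi))        # approximately T and sigma
```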
Finally, let us evaluate the effect of control rule (31) on the output quality characteristic under the effect of the shift components $(\mu_t^{(i)})_N$. From (30) and (32) we obtain

Inserting (33) into (32), we obtain
6. DISCUSSION OF SPC IN THE PRESENCE OF EPC

Consider the output of a manufacturing process under the simple drift controllers of the previous section. The output without the presence of parameter shifts is a special case of (29) or (33) with constants $\mu_t^{(1)} = 0$ and $\mu_t^{(2)} = d$. In both cases we obtain

$\xi_t = T + \varepsilon_t$ for all $t \in N$

In this case, the process output $(\xi_t)_N$ is a sequence of i.i.d. random variables with mean $E[\xi_t] = T$, on target, and the minimum possible variance $V[\xi_t] = \sigma^2$.
For a successful SPC-EPC integration, it is necessary to analyze the
output of processes under EPC control with shift components. For a sub-
stantial discussion we need simple instances of shift components.

6.1. Effect of Simple Shifts on EPC-Controlled Processes


For many applications it is appropriate to assume simple structured shifts of the type given by Eq. (6). In the models of Sections 5.1 and 5.2, let us consider the output processes under this type of shift. We assume that

where $\mu_1^* = 0$ and $\mu_2^* = d$ are the target values, $\Delta_i > 0$ is the absolute shift size, $\nu_i$ is the random time until occurrence of the shift, and $\gamma_i$ is the random sign of the shift.

Under these assumptions the output equation (29) of the deterministic trend model becomes

where

$1_B(t) = 1$ if $t \in B$ and $1_B(t) = 0$ otherwise

is the indicator function of a set $B \subset \mathbb{R}$.


Applying the same assumptions to the output (33) of the random walk with drift model, we obtain for $t \ge 2$

For the control variable of the random walk with drift model we obtain, by
inserting (35) into (34),

The equations for the simpler models with only one possible shift (either additive or in the drift parameter) are obtained from (36) and (38) either by letting $\nu_2 = +\infty$ (only additive shifts) or by letting $\nu_1 = +\infty$ (only shifts in the drift parameter).

In the following two sections we discuss (36) and (38) in two practically relevant situations.

6.2. Shifts Occurring During Production Time

In the deterministic trend case, the controller defined by (28) has no feedback from the output and is thus not able to compensate for random shifts. As is obvious from (36), an additive shift takes the process mean away from its target $T$ to $T + \gamma_1\Delta_1$, but the output at least remains stable in its mean. A shift in the drift parameter is even more harmful. After such a shift, the output mean has a trend component $(t - \lfloor\nu_2\rfloor)\gamma_2\Delta_2$. It is obvious that in the

presence of possible shifts such a process should be monitored by a supplementary SPC scheme.

The feedback controller of (31) or (32) for the random walk with drift is able to react both to additive shifts and to shifts in the drift parameter. As is obvious from (38), owing to the delay of one time period in the controller's action, an additive shift leads to only a single outlier of the output $\xi_t$ at $t = \lfloor\nu_1\rfloor + 1$ but remains without effect at further time points. A shift in the drift parameter can be more harmful. After such a shift, the output mean is constantly off target, at $T + \gamma_2\Delta_2$. However, this shift in the mean has serious consequences only if $|\gamma_2\Delta_2|$ is large or if the cost of being off target is large. In such cases it is reasonable to monitor the process by supplementary SPC procedures.
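The single-outlier behavior is easy to verify by simulation. Reusing the hedged setup of the sketch in Section 5.2, the fragment below injects an additive shift of size $4\sigma$ at time $\nu_1 = 100$; the feedback rule produces one large deviation and is back on target immediately afterward, whereas a shift in the drift parameter would leave a permanent offset of $\gamma_2\Delta_2$.

```python
import numpy as np

rng = np.random.default_rng(2)
T, d, sigma = 1800.0, 0.5, 60.0
nu1, shift = 100, 4 * 60.0            # additive shift of size 4*sigma at t = nu1

w, x, dev = T, -d, []
for t in range(1, 201):
    w += d + rng.normal(0.0, sigma)
    out = w + (shift if t >= nu1 else 0.0) + x
    dev.append(out - T)
    x += -d - (out - T)               # integral rule absorbs the shift in one step

print(dev[nu1 - 1])                   # one outlier of roughly `shift` plus noise
print(dev[nu1:nu1 + 3])               # back near zero at subsequent time points
```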

6.3. Effect of a Biased Drift Parameter Estimate


In the approach of the model presented in Section 4.2, a biased estimate of the drift parameter $d$ can be interpreted as a shift in the drift parameter that coincides with the setup of the process. This is quite useful, as it allows us to study the effect of mistakenly using a biased trend parameter estimate in an EPC scheme. Let $\hat{d}$ be the biased estimate that is used instead of $d$ in the control equations (28) and (31) or (32). Then we can describe the situation by (36) and (38) by letting

and

The effect of this type of parameter shift in the trend and random walk models is exactly the same as in Section 6.2.

6.4. Effect of Constraints in the Compensatory Variable


An important aspect in practice, usually not addressed in the literature on
SPC-EPC integration, is that the compensatory variable must usually be
constrained to lie within a certain region of operation, i.e.,

for all instants $t$. In particular, integral controllers such as Eq. (31) can compensate for shifts of any size provided that the controllable factor is unconstrained.

It is useful to consider what would happen if the EPC schemes given by Eqs. (28) and (31) were applied to a constrained input process. Since the drift is linear, the control variable $x_t$ moves in the opposite direction to the drift to keep $\xi_t$ on target. However, at some point the controller hits a boundary (either $A$ or $B$) and remains there afterward. In the control engineering literature this is referred to as "saturation" of the EPC scheme.
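In code, saturation is simply a clamp applied to the unconstrained action. A small illustrative fragment (the bound names $A$ and $B$ follow the text; the numerical values are arbitrary):

```python
def saturate(x: float, lower: float, upper: float) -> float:
    """Clamp the unconstrained control action to the operating region."""
    return max(lower, min(x, upper))

# With a positive drift d, the trend rule x_t = -d*(t+1) decreases linearly
# and eventually sticks at the lower bound: the controller is saturated.
d, A, B = 0.5, -10.0, 10.0
actions = [saturate(-d * (t + 1), A, B) for t in range(1, 41)]
print(actions[:5], actions[-5:])   # early values track -d*(t+1); late values equal A
```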

Effect of Constraints Under the Deterministic Trend Model


Let us discuss the case of a constrained control variable in the trend model of Section 5.1. For simplicity's sake we discuss only the case of $d > 0$ with a lower bound $A < 0$. The case of $d < 0$ with a corresponding upper bound $B > 0$ is completely analogous.

The relationship between the control variable $x_t$ of Eq. (28) and the constrained control variable $\tilde{x}_t$ is

$\tilde{x}_t = -d(t + 1)$ if $t \le -A/d - 1$;  $\tilde{x}_t = A$ if $t > -A/d - 1$

Hence the output $\xi_t$ of the process under constrained control is

Under the simple shift components of type (35), the output $\xi_t$ satisfies the right-hand side of (36) for $t \le -A/d$. For $t > -A/d$ we obtain

Obviously, the arguments in favor of supplementary application of SPC schemes in the trend model that were put forward in Section 6.2 also hold in the case of constrained controllers.

Effect of Constraints Under the Random Walk Model


In the random walk model we also restrict attention to the case $d > 0$ with a lower bound $A < 0$.

Unlike the situation for the deterministic trend model, the time $\kappa$ until hitting or falling below the lower bound $A$ is stochastic; it is defined by

$\kappa = \min\{t \in \mathbb{N} : x_t \le A\}$    (43)

From (43) the distribution of $\kappa$ can be found by first determining the conditional distribution given $\nu_i$ and $\gamma_i$ and then integrating with respect to the corresponding densities. We shall not investigate this problem here.

The relationship between the control variable $x_t$ of (39) and the constrained control variable $\tilde{x}_t$ is

$\tilde{x}_t = x_t$ if $t < \kappa$;  $\tilde{x}_t = A$ if $t \ge \kappa$    (44)

Hence the output $\xi_t$ of the process under constrained control is

Under the simple shift components of type (35), the output $\xi_t$ satisfies the right-hand side of (38) for $2 \le t < \kappa + 1$. For $t \ge 2$, $t \ge \kappa + 1$ we obtain

In the unconstrained case, supplementary application of SPC in the random walk model makes sense only in special cases (see Section 6.2). From Eq. (46), it is evident that in the constrained case supplementary application of SPC is much more interesting and perhaps indispensable.

6.5. Effect of Using a Wrong Model


We now study what would happen if a wrong drift model is used.

Under a deterministic trend, the relation between the closed-loop output $\xi_t$ and the control variable $x_{t-1}$ is given by (26). If the integral controller defined by (31) is used in this model, the explicit expression for $\xi_t$ is

Under the simple shifts of type (35) we obtain

Whether there are parameter shifts or not, the output exhibits twice as great a variance as in the case of using the correct model. This case occurs with Quesenberry's (1988) d2 and d3 rules. If $\mu_t^{(1)} = 0$ and $\mu_t^{(2)} = d$ for all $t$, then $\xi_t = T + \varepsilon_t - \varepsilon_{t-1}$, which is an MA(1) process, an always-stationary time series model (Box and Jenkins, 1976).

If parameter shifts occur, we have the following result. Except for the single outlier for $\nu_1 < t \le \nu_1 + 1$, $\xi_t$ is permanently off target for $t > \nu_2$, with absolute deviation $\Delta_2$. This, and the uncertainty about the correctness of the assumptions of the model, make it advisable to use SPC methods in addition to the simple EPC schemes.
to the simple EPC schemes.
If, on the contrary, the deterministic trend controller (28) is used in a random walk with drift process, the closed-loop equation is, by (30),

In the long run, if we let $t$ grow without bound and use the inverse of the difference operator, namely,

then the closed-loop equation is


If $\mu_t^{(1)} = 0$ and $\mu_t^{(2)} = d$ for all $t$, then the previous equation reduces to

which is a nonstationary AR(1) process (Box and Jenkins, 1976) with $\mathrm{var}(\xi_t) \to \infty$ as $t \to \infty$.

For bounded values of $t$ and under the simple shifts of type (35), we obtain

In this case, whether there are parameter shifts or not, the output exhibits variance that increases linearly with time, compared with the case of using the correct model. Thus it is evident that using an EPC controller designed for a random walk with drift model is "safer" than using an EPC controller designed for a deterministic trend process in case we selected (by mistake) the wrong drift model.

Taking the shifts into account, we have the following result. There is a shift in the mean for $\nu_1 < t$ and a shift that results in a trend for $\nu_2 < t$. Again, given the uncertainty about the correctness of the model assumptions, it is obviously advisable to use additional SPC methods.

7. SHEWHART CHARTS FOR DETECTION OF SHIFTS IN THE DETERMINISTIC TREND MODEL

In this section we investigate the design of a simple two-sided Shewhart chart with fixed sampling interval for detection of shifts (i.e., abrupt changes) in the trend parameter of the model in Section 5.1. The chart is defined by the triple $(n, c, h)$ of sample size $n$, control limit width multiple $c$, and sampling distance $h$ (i.e., the number of discrete periods between samples), where $n \in \mathbb{N}$, $c \in (0, +\infty)$, $h \in \mathbb{N}$, $h \ge n$. The control procedure is as follows.

(S1) At time points $h, 2h, 3h, \ldots, kh, \ldots$, output samples $(\xi_{kh}, \ldots, \xi_{kh+n-1})$ of size $n$ are taken from the process.

(S2) The absolute value $|\bar{\xi}_k - T|$ of the difference of the arithmetic mean
$\bar{\xi}_k = \frac{1}{n}\sum_{j=0}^{n-1} \xi_{kh+j}$

of the sample variables from the process target $T$ is compared with the control limit $c\sigma/\sqrt{n}$.
(S3) If $|\bar{\xi}_k - T| \le c\sigma/\sqrt{n}$, the manufacturing process continues without intervention. If $|\bar{\xi}_k - T| > c\sigma/\sqrt{n}$, the manufacturing process is stopped (giving an out-of-control signal or alarm) and inspected for the presence of an additive shift or a shift in the trend parameter. If no shift is detected, the manufacturing process continues without further intervention. If a shift is detected, the manufacturing process is renewed, i.e., the conditions of the start of the process at time point 0 are restored, e.g., by a repair or by a complete overhaul of the production facilities. After the renewal, the process is restarted at time point 0 of the next renewal cycle.
From the point of view of the optimality principles of mathematical statistics, there may be better tests for the detection of a shift in the trend model than the test defined by rules (S1), (S2), (S3). Nevertheless, the two following arguments support an investigation of Shewhart charts under our shift model:

1. The simple structure of Shewhart charts simplifies the design of optimum charts in a statistical or economic scheme of optimality.
2. Shewhart charts are widely used in industrial practice. Most often, the charts applied are not designed under a precise statistical and economic model but from a heuristic point of view (sample sizes $n = 3, 5, 7$; $3\sigma$ limits as control limits). It is interesting to investigate the behavior of such charts under the trend model of Section 5.1.
Here we investigate Shewhart charts from a statistical point of view. This decision is not supported by principal arguments; it merely reflects an option for simplicity. An economic design is based on variables such as the number of false alarms, the length of a renewal cycle, and the profit incurred from items produced during a cycle. It is obvious that for a model that admits both an additive shift and a shift in the drift parameter, the formulas for the distributions and expected values of such variables are rather involved. Thus an investigation into the economic design would lead to mathematical details that far exceed the scope of the present chapter, which is primarily interested in the structure of a fundamental model of SPC-EPC integration and a simple application thereof.

7.1. The Average Run Length


An essential quantity in the statistical design of a Shewhart chart is the average run length (ARL), i.e., the expected number of samples until occurrence of an alarm, or, equivalently, the average time to signal (ATS), i.e., the expected time until occurrence of an alarm. Run length and time to signal are usually calculated under the simplifying assumption that the process is either stable without a parameter shift or stable at a given parameter shift. In this approach the problem of the time until occurrence of a shift is ignored. Interest concentrates on the question, "How long does it take to obtain an out-of-control signal if the process has entered certain invariant conditions at an arbitrarily fixed time point 0?"

To define the ARL in terms of the model of Section 5.1, we consider, as in Section 6.1, fixed absolute shift sizes $\Delta_i \ge 0$ with given signs $z_i \in \{-1, 1\}$. $\Delta_i = 0$ is admitted to express the case that no shift of type $i$ has occurred. Ignoring the times until occurrence of shifts and assuming that the conditions of the process remain fixed from an arbitrarily chosen time point 0 on, we obtain, in analogy to (36), the output equation

$\xi_t = T + z_1\Delta_1 + z_2\Delta_2 t + \varepsilon_t$,  $t \in \mathbb{N}$    (50)

Under the control rules (S1), (S2), (S3), the run length, i.e., the number $q$ of samples until occurrence of an alarm, is defined by

$q = \min\{k \in \mathbb{N} : |\bar{\xi}_k - T| > c\sigma/\sqrt{n}\}$

We assume production speed 1; i.e., one item is produced in one time unit. Then the total time until occurrence of an alarm (time to signal) is

$qh + n - 1$

Define $\delta_i = \Delta_i/\sigma$. We use this standardization to avoid the nuisance parameter $\sigma$.

The ARL can now be defined as the expected value of $q$, considered as a function $A(z_1, \delta_1, z_2, \delta_2)$ of the shift amounts $\delta_i$ and the signs $z_i$ of the shifts:

$A(z_1, \delta_1, z_2, \delta_2) = E[q]$

The corresponding expected total time until occurrence of an alarm (ATS) is

$h\,A(z_1, \delta_1, z_2, \delta_2) + n - 1$

For explicit calculation of the ARL, the noise components $(\varepsilon_t)_N$ are assumed to be i.i.d., each $\varepsilon_t$ with normal distribution $N(0; \sigma^2)$. Hence, under the output equation (50), the test statistics $(\bar{\xi}_k)_N$ are independent, where $\bar{\xi}_k$ is normally distributed with parameters

$E[\bar{\xi}_k] = T + \sigma\left[z_1\delta_1 + z_2\delta_2\left(kh + \frac{n-1}{2}\right)\right]$,  $V[\bar{\xi}_k] = \frac{\sigma^2}{n}$

Hence the alarm probabilities are given by

In the case of $\Delta_2 = 0$ (no shift in the drift parameter), the alarm probabilities are constant in the number $k$ of the sample. Hence we have the classical case: The distribution of the run length $q$ is geometric with parameter

Thus, in particular,

In the case of $\Delta_2 > 0$ (shift in the drift parameter), the alarm probabilities vary with the number $k$ of the sample. The distribution of the run length $q$ is determined by the probabilities

In this case we obtain no simple expression for the ARL. We have

From a computational point of view, it is better to write Eq. (57) in recursive form:

with $p(1) = P_1(z_1, \delta_1, z_2, \delta_2)$, and thus
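The recursion is straightforward to program. The Python sketch below assumes, consistently with Eq. (50) and the sample-mean parameters given above, that the $k$th standardized sample mean has mean $\sqrt{n}\,[z_1\delta_1 + z_2\delta_2(kh + (n-1)/2)]$ and unit variance; with $c = 3$ and $h = 10$ it reproduces the entries of Table 1 in the next section, e.g., $A(0, 0, 1, 0.01) = 10.86$ for $n = 4$ and $A(1, 1, 0, 0) = 43.89$ for $n = 1$.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf   # standard normal CDF

def arl(z1, d1, z2, d2, n, c=3.0, h=10, kmax=100_000):
    """ARL of the Shewhart chart (S1)-(S3) under an additive shift of
    standardized size d1 (sign z1) and a drift shift d2 (sign z2)."""
    expected, survival = 0.0, 1.0
    for k in range(1, kmax + 1):
        # Standardized mean offset of the kth sample mean (assumption above).
        mu = sqrt(n) * (z1 * d1 + z2 * d2 * (k * h + (n - 1) / 2.0))
        alarm = 1.0 - PHI(c - mu) + PHI(-c - mu)   # two-sided alarm probability
        expected += k * survival * alarm           # k * P(run length = k)
        survival *= 1.0 - alarm
        if survival < 1e-12:
            break
    return expected

print(round(arl(0, 0, 1, 0.01, n=4), 2))   # 10.86, as in Table 1
print(round(arl(1, 1, 0, 0, n=1), 2))      # 43.89, as in Table 1
```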

7.2. Example
Chemical mechanical planarization (CMP) is an important process in the manufacturing of semiconductors. A key quality characteristic in a CMP process is the removal rate of silicon oxide from the surface of each wafer. Since the polishing pads wear out with use, a negative trend is experienced in this response, in addition to random shocks or shifts. The removal rate has a target of 1800 and is controlled via a deterministic trend EPC scheme. The errors are normally distributed with mean zero and $\sigma = 60$, and an estimate $\hat{d}$ of the drift is used for control purposes. It is desired not to let the process run for more than an average of 10 samples if a bias in the drift estimate of magnitude $0.01\sigma = 0.6$ exists. In the absence of shifts in the mean or trend, an ARL of 370 is desired. In addition, positive shifts of size $\Delta_1 = 1\sigma$ should be detected, on average, after a maximum of 12 samples if the aforementioned biased trend estimate is (incorrectly) used by the EPC.

Table 1 shows numerical computations for this problem using Eqs. (57)-(60) and varying $n$ from 1 to 10. Clearly, the desired ARL of 370 is obtained with $c = 3$; thus the table shows results for this value of $c$. From the table, $A(0, 0, -1, 0.01) = 10.86$ for $n = 4$, and $A(1, 1, -1, 0.01) = 13.51$ for $n = 5$. Therefore, the chart design with the smallest sample size that meets the design specifications calls for using $n = 5$ and $c = 3$. The $h$ design parameter ($h \ge n$) should be decided based

Table 1 Average Run Lengths for the Example Problem, c = 3, h = 10

       A(1, 1, 1, 0.01)       A(-1, 1, 1, 0.01)      A(0, 0, 1, 0.01)       A(1, 1, 0, 0)
n      = A(-1, 1, -1, 0.01)   = A(1, 1, -1, 0.01)    = A(0, 0, -1, 0.01)    = A(-1, 1, 0, 0)
1      9.55                   26.62                  18.42                  43.89
2      5.68                   21.37                  14.21                  17.73
3      4.00                   18.15                  12.16                  9.76
4      3.05                   15.64                  10.86                  6.30
5      2.46                   13.51                  9.93                   4.49
6      2.06                   11.67                  9.21                   3.43
7      1.78                   10.06                  8.63                   2.76
8      1.57                   8.66                   8.14                   2.31
9      1.43                   7.46                   7.72                   2.00
10     1.31                   6.43                   7.36                   1.77

upon economic considerations not discussed in this chapter and was therefore set to 10.

Interestingly, the third and fourth columns of Table 1 indicate that for $n < 9$, a negative drift "masks" positive shifts and vice versa, making it harder to detect a shift [i.e., this occurs when $\mathrm{sign}(z_1) \neq \mathrm{sign}(z_2)$]. Also, we have the relationship

Figure 1 shows a realization of the controlled sample means $(\bar{\xi}_k)$ for this process, with no shifts in the mean occurring in the simulated time. The designed chart limits are shown superimposed. In the absence of abrupt shifts in the process, the SPC chart will detect the biased $\hat{d}$ estimate after an average of 9.93 samples (cf. Table 1, fourth column), although in the figure it was not detected until sample 11. In practice, production will be stopped at the alarm time and corrective action will be taken (e.g., replacing the polishing pad), which will recenter the process.

ACKNOWLEDGMENTS

Dr. Castillo was funded by NSF grants INT 9513444 and DMI 9623669.
Drs. Gob and von Collani were funded by DAAD grant 315/PPP/fo-ab.


Figure 1 A realization of sample means $\bar{\xi}_k$ for the case in which a deterministic trend EPC uses a biased trend estimate. The computed Shewhart limits for detecting such bias are also shown.

REFERENCES

Astrom KJ. 1970. Introduction to Stochastic Control Theory. Academic Press, San Diego, CA.
Astrom KJ, Wittenmark B. 1989. Adaptive Control. Addison-Wesley, Reading, MA.
Barnard GA. 1959. Control charts and stochastic processes. J Roy Stat Soc Ser B 21(2): 239-271.
Basseville M, Nikiforov IV. 1993. Detection of Abrupt Changes: Theory and Application. Prentice-Hall, Englewood Cliffs, NJ.
Box GEP, Jenkins GM. 1963. Further contributions to adaptive quality control: Simultaneous estimation of dynamics: Non-zero costs. Bull Int Stat Inst 34: 943-974.
Box GEP, Jenkins GM. 1976. Time Series Analysis: Forecasting and Control. Rev. ed. Holden-Day, Oakland, CA.
Box GEP, Kramer T. 1992. Statistical process monitoring and feedback adjustment: A discussion. Technometrics 34(3): 251-267.
Butler SW, Stefani JA. 1994. Supervisory run-to-run control of a polysilicon gate etch using in situ ellipsometry. IEEE Trans Semicond Manuf 7(2): 193-201.
Crowder SV. 1992. An SPC model for short production runs: Minimizing expected cost. Technometrics 34: 64-73.
Del Castillo E, Hurwitz A. 1997. Run to run process control: A review and some extensions. J Qual Technol 29(2): 184-196.
Del Castillo E. 1996. Some aspects of process control in semiconductor manufacturing. Proceedings of the 4th Würzburg-Umeå Conference in Statistics, pp. 37-52.
Ingolfsson A, Sachs E. 1993. Stability and sensitivity of an EWMA controller. J Qual Technol 25(4): 271-287.
MacGregor JF. 1988. On-line statistical process control. Chem Eng Prog, October, pp. 21-31.
MacGregor JF. 1990. A different view of the funnel experiment. J Qual Technol 22: 255-259.
Montgomery DC, Keats JB, Runger GC, Messina WS. 1994. Integrating statistical process control and engineering process control. J Qual Technol 26(2): 79-87.
Quesenberry CP. 1988. An SPC approach to compensating a tool-wear process. J Qual Technol 20(4): 220-229.
Sachs E, Hu A, Ingolfsson A. 1995. Run by run process control: Combining SPC and feedback control. IEEE Trans Semicond Manuf 8(1): 26-43.
Soderstrom T. 1994. Discrete Time Stochastic Systems: Estimation and Control. Prentice-Hall, Englewood Cliffs, NJ.
Tucker WT, Faltin FW, Vander Wiel SA. 1993. ASPC: An elaboration. Technometrics 35(4): 363-375.
Vander Wiel SA, Tucker WT, Faltin FW, Doganaksoy N. 1992. Algorithmic statistical process control: Concepts and an application. Technometrics 34(3): 286-297.
Reliability Analysis of Customer Claims
Pasquale Erto
University of Naples Federico II, Naples, Italy

1. INTRODUCTION

Reliability theory is substantially the "science of failures," in the same way in which medicine is the "science of diseases," directed toward curing or preventing them. However, because each failure virtually implies the existence of customer dissatisfaction and complaints, reliability is in some ways also the science of complaints and, during the warranty period, the science of claims. Besides, to fully exploit it in the context of quality management, we must always remember that its operative meaning is "the probability that a system possesses and keeps its quality throughout time."

In general, reliability is a characteristic of systems that possess and keep during their life the working qualities for which they were designed and realized. In this sense reliability is a time-oriented quality characteristic [1] that can also be referred to technical, productive, commercial, and service activities that perform their tasks timely and effectively. Technically, reliability is quantified as the probability of no failures (i.e., of performing the required function) under given environmental and operational conditions and for a stated period of time.
In order to be concrete, let us develop this point of view specifically for the car industry, which constitutes a well-known, crucial, and effective application field.
Today, a new car model must meet a specified reliability level from its initial launch on the market, on pain of obscuring the company image, which will be restored only with difficulty by subsequent improvements. Generally, the reliability targets can be achieved by means of good design, many preliminary life tests on components or subsystems, and a quality


control policy. Nevertheless, in the case of mass production, it is essential to verify constantly that these targets are really fulfilled in service. In fact, the in-service reliability level may turn out to be different from the expected one, mainly owing to faults in the production process and/or to unforeseen stresses induced by the real operating environment. Then the manufacturer has a pressing need to collect and analyze field data to detect the causes of a possible discrepancy between the in-service and expected reliability, to be able to immediately adopt the necessary corrective actions.
Obviously, since cars are products with a wide range of operating environments and users, one should have a great many manufactured units under monitoring, over their entire life, to be confident in the measure of their reliability. But the impracticability of such a policy is quite evident. Thus the approach generally undertaken consists in monitoring only the units belonging to homogeneous samples of limited size (e.g., a taxi fleet) and/or controlling the repair operations of manufactured units during the warranty period. Other sources of information, such as the number of spare parts sold, are sometimes used, but they are less informative and are not considered here. The monitoring of a vehicle fleet allows one to collect information about the entire life of the product, taking into account both early and wear-out failures. Moreover, in many cases, these fleets are subjected to more intensive use than normal, and this makes it possible to obtain measures of reliability in a relatively short time. Nevertheless, these measures are generally valid only for the operating environment and use of the particular sample chosen, and it is often difficult to extend them to other situations. Besides, they cannot take into account the impact of subsequent improvements.

The use of warranty data makes timely information available at low cost for reliability evaluations. Obviously, these data are truncated (i.e., limited to the first period of life), so they take into account mainly the impact of early failures, but their use has the advantage of quickly allowing us both to choose corrective actions and to check the effectiveness of those just adopted.

However, to better understand all the information nested in the warranty data, we must consider that usually these data report only the component and failure codes and the mileage interval in which the failure occurred. For instance, with specific reference to the automobile world, no information is directly provided about the number of cars that cover the various mileage intervals without failures, and hence no direct knowledge is available about the population to which the observed failures must be referred. In this situation, only approximate estimation procedures are usually used, that is, procedures that are generally based on some a priori (and subjective) evaluation of car distribution versus mileage intervals.

Instead, using the reliability analysis approach, we can rigorously estimate both the failure and car distributions versus mileage. The method has already been successfully used in real-life cases that are partially reported in an illustrative example included in this chapter.

2. ANALYSES FROM WARRANTY DATA OF CARS

In a modern way of thinking, quality means "customer satisfaction," and it is feasible to realize quality, in this sense, only with the total quality management approach to the management of the whole company. In such a context, the reliability engineers' involvement conforms to this management policy in the car industry too, aiming to involve everyone's commitment to obtain total quality.

However, in order to plan, realize, or control a certain quality level of the cars produced, the availability of an efficacious practical measure of "in-service" quality is first needed. In fact, one of the fundamental rules for the management of total quality consists of turning away from making decisions based exclusively on personal opinions or impressions. Instead, one needs to refer to data that are really representative of the quality as perceived by the customers, such as the warranty data. These data, however, contain only the following information:

1. Vehicle type code
2. Assembly date
3. Component and defect code
4. Mileage to failure
In formal statistical language, the warranty data are failure observations from a sample that is both truncated (at the end of the warranty period) and has items suspended at the number of kilometers effectively covered by the respective customers. Thus, to carry out a reliability analysis, both the number of failures and the number of suspensions for each mileage interval are required. Obviously, the warranty data give no mileage information about those vehicles that are sold and reach the end of each mileage interval without any claim being made. Thus, no direct information is available about the population to which the reported number of failed items must be referred. Therefore, the usual procedures used by the automotive industry [2-4] require the a priori estimation (often arbitrary) of the vehicle distribution versus mileage in order to partition the total number of vehicles under warranty into mileage intervals. Note that this distribution may also be very different from case to case, since it may concern vehicles under

special maintenance contracts including the warranty period, or only a fraction of the vehicles under warranty (e.g., all vehicles from a particular production unit), as well as all vehicles produced in a given period, etc. In Ref. 5 it is stressed that the number of claims at a specific age depends on the mileage accumulated, so supplementary information on the mileage accumulated for the population of cars in service is needed.

This chapter shows how one can estimate simultaneously both the mileage and the failure distribution functions without needing any a priori estimation. In the next section, a special case that occurred in a real-life situation, in which the proposed method of analysis was successfully used, is discussed.

3. A REAL-LIFE RELIABILITY ANALYSIS
3.1. The Available Data Set
Failure data normally refer to about 40 different components (or parts) of some car model. The kilometers to failure are typically grouped into equal-width lifetimes, each of 10,000 km, and all vehicles under consideration are sold during the same year in which repairs are made.

In our case from real life [8], 498 cars were sold in the year, and the total number of warranty claims referred to the manufacturer was 70. Furthermore, irrespective of the parts involved, this number of claims was distributed over the lifetimes as shown in Table 1.

The characteristic that makes this case peculiar is that no age (from selling date) distribution and no distribution of covered kilometers are given for the fleet under consideration. Hence, it is not possible to allocate the unfailed units in each lifetime. To overcome this difficulty we can use the reliability analysis approach, introducing an estimation procedure that involves at the same time both the failure and kilometer distributions.

Table 1 Number of Warranty Claims in Each Lifetime

Lifetimes (km/1000)    0-10    10-20    20-30    30-40    > 40
Number of claims       55      11       3        1        0

3.2. Method of Analysis


Let $T_f$ be the random variable (r.v.) representing the "kilometers to failure" and $G(t) = \Pr(T_f \le t)$ its unknown distribution function. Moreover, let $F(t)$ be the probability that a car sold during the year of observation does not exceed the kilometers $t$ until the end of this year. The experimental context under analysis is equivalent to a sampling (truncated at the end of the warranty period) in which some of the items under life testing have their test randomly suspended before failure. Thus, an r.v. representing the "kilometers to suspension," say $T_s$, is defined, with the unknown distribution function $F(t) = \Pr(T_s \le t)$. Hence, it follows that the probability that an item fails before $t$ km is

$\Pr\{(T_f < t) \cap (T_s > T_f)\} = \int_0^t [1 - F(s)]\,dG(s)$

and the probability that an item is suspended before $t$ km is

$\Pr\{(T_s < t) \cap (T_f > T_s)\} = \int_0^t [1 - G(s)]\,dF(s)$

$T_f$ and $T_s$ being independent random variables.

Assuming that $G(t)$ and $F(t)$ are exponential functions, that is, $G(t) = 1 - \exp(-at)$ and $F(t) = 1 - \exp(-bt)$, the probability that an item fails before $t$ becomes

$\frac{a}{a+b}\,\{1 - \exp[-(a+b)t]\}$

and, similarly, the probability that an item is suspended before $t$ becomes

$\frac{b}{a+b}\,\{1 - \exp[-(a+b)t]\}$
Some comments on the assumption of the exponential model for $G(t)$ are required. Warranty data are essentially data on early failures; hence, from a theoretical viewpoint, kilometers to failure should have a decreasing failure rate. Then, as an example, a Weibull model with shape parameter less than 1 should be more sound. However, experimental results have shown that in situations similar to the present one (see Refs. 3 and 4), the shape parameter of the Weibull distribution is very close to 1. Hence, it appears that there is no practical advantage to using a more complex model than the exponential one for $G(t)$.

To explain the choice made for $F(t)$, some information given in Ref. 2 about the kilometer distribution versus age (from selling date) for a fleet of European cars can be used. In Ref. 2, information is available on the two 5% tails of the distributions at 3, 6, 9, and 12 months of car age. For each of these distributions a Weibull model that has the same two 5% tails can be assumed. In Figure 1 the kilometer distributions (at 3, 6, 9, and 12 months of age) are reported using a Weibull probability plot. Then the "compound" kilometer distribution, which corresponds to car ages (from selling dates) uniformly distributed over the range of 12 months, is drawn. As can be seen from the estimates reported in Table 2 (calculated with the maximum likelihood method), this distribution turns out to be very close to the exponential, its shape parameter being close to one. Even if this result cannot be considered decisive proof, the exponential model appears to be at the very least the preferential candidate for $F(t)$ in the present situation.

3.3. Estimation Procedure


In order to estimate the two unknown parameters, $a$ and $b$, we use a very powerful statistical estimation method, the maximum likelihood method. The logic of this method is very simple, and even those who are not statisticians can take advantage of it. It is founded on the idea that the probability law we are looking for is most likely the one, of the hypothesized family, that shows the maximum joint probability density of the collected data

Figure 1 Weibull kilometer distributions at 3, 6, 9, and 12 months and the corresponding compound distribution (heavy line).

Table 2 Parameter Estimates of the Kilometer Distribution Drawn in Figure 1

                           3 mo     6 mo     9 mo     12 mo    Compound
Weibull shape parameter    1.13     1.17     1.13     1.14     0.97
Weibull scale parameter    5.85     11.04    17.56    23.25    13.56

sample (called the likelihood function, $L$). So those values of the unknown parameters that maximize the $L$ function are the maximum likelihood estimates.

The definition of the probabilities of both failure and suspension allows us to construct the $L$ function for a sample arising from the experimental situation under study [8]. In fact, letting

$N$ = total number of cars under observation
$n_i$ = number of failures in the $i$th lifetime $(T_i, T_{i+1})$, $T_1 = 0$
$m$ = number of lifetimes
$n = \sum_{i=1}^{m} n_i$ = total number of failures

the likelihood function, $L$, is found to be proportional to

The values of $a$ and $b$ that maximize this function (given a sample with known $N$ and $n_i$) are the needed maximum likelihood estimates of the unknown parameters $a$ and $b$.

Reparameterizing for convenience the likelihood function in terms of $a$ and $c = a + b$ and equating to zero the partial derivatives of $\ln(L)$ yields the equations

(6a)

and

To solve this last nonlinear equation in $c$, an iterative method is required (e.g., the Newton-Raphson method). An initial tentative value, $c^*$, can easily be found by taking into account that the probability that an item fails in the $i$th interval has the nonparametric estimate $n_i/N$. Thus, as an example, for the first lifetime $(T_1 = 0, T_2)$ (see Eq. (3)),

$\Pr[(0 < T_f \le T_2) \cap (T_s > T_f)] = \frac{a}{c}\,[1 - \exp(-cT_2)] = \frac{n_1}{N}$

and since, from the first likelihood equation, $a/c = n/N$, an initial tentative value follows:

$c^* = \frac{-\ln(1 - n_1/n)}{T_2}$

Then the estimates $a$ and $b = c - a$ can be obtained without any further difficulty.
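The whole procedure can be sketched numerically as follows (Python with NumPy and SciPy). The no-claim probability is taken here as $1 - a/c$, the limiting value of the suspension probability, which is consistent with the first likelihood equation $a/c = n/N$; with that substitution $c$ can be profiled out of the likelihood and found by a one-dimensional search started near $c^*$. Applied to the data of Table 1, the sketch reproduces the estimates reported in Section 3.4.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Warranty data of Table 1: 498 cars sold, claims per 10,000 km lifetime.
N = 498
T = np.array([0.0, 10e3, 20e3, 30e3, 40e3])   # lifetime endpoints (km)
counts = np.array([55, 11, 3, 1])             # claims in (T_i, T_{i+1}]
n = counts.sum()                              # 70 failures in total

def neg_profile_log_lik(c):
    """Minus the c-dependent part of ln(L), with a/c fixed at n/N."""
    q = np.exp(-c * T[:-1]) - np.exp(-c * T[1:])   # interval probabilities
    return -(counts * np.log(q)).sum()

# Initial tentative value c* = -ln(1 - n1/n) / T2, used to bracket the search.
c_star = -np.log(1.0 - counts[0] / n) / T[1]
res = minimize_scalar(neg_profile_log_lik, bounds=(c_star / 10, c_star * 10),
                      method="bounded")
c_hat = res.x
a_hat = (n / N) * c_hat          # first likelihood equation: a/c = n/N
b_hat = c_hat - a_hat
print(a_hat, b_hat)              # approx. 2.11e-5 and 12.93e-5 per km (Sec. 3.4)
print(n * (np.exp(-c_hat * T[:-1]) - np.exp(-c_hat * T[1:])))  # approx. 54, 12, 3, 1
```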

3.4. Practical Example and Comments
The maximum likelihood method was applied to the sample of warranty claims from real life given in Table 1. The following estimates of the unknown parameters $a$ and $b$ were found:

$a = 2.11 \times 10^{-5}$ km$^{-1}$ and $b = 12.93 \times 10^{-5}$ km$^{-1}$

The model chosen appears to fit the experimental data with an extremely high degree of accuracy. In Table 3 the observed and the estimated numbers of failures for each lifetime are reported. The estimated average number of kilometers for the fleet under test is $1/b = 7734$ km, which is a very plausible value for a fleet of cars whose ages are distributed over 12 months.

Table 3 Comparison of Collected and Estimated Numbers of Claims

Lifetimes (km/1000) 0-10 10-20 20-30 30-40 > 40


No. of claims, observed 55 11 3 1 0
No. of claims, estimated 54 12 3 1 0

4. SOME CONCLUDING REMARKS

Sometimes, the evaluation of the ability of an item to satisfy customers can be carried out effectively only via reliability analysis. In fact, some stated or implicit customer needs must be satisfied over time, because they are time-oriented characteristics. So reliability methods, conceived to assess the probability of performing required functions for a stated period of time, can be the most suitable ones.

The analysis of customer claims presented in this work is only one example of the possible applications of the reliability approach. Another example, not restricted to the warranty period, is given in Ref. 9.

Moreover, it must be pointed out that reliability analysis can be used not only to constantly verify that the quality targets are really fulfilled in service, but also to help improve any time-oriented quality characteristic of a product by estimating its current level in the field. Therefore, collecting field failure data over the entire life of a product and performing a reliability analysis can be an effective policy for achieving continuous improvement of products [6].

In this context, reliability analysis allows one not only to control the failure or degradation of product performance, but also to reduce the variation in performance over time among copies of the same item [7]. In order to do that, many practical methods for estimating failure distributions are available in the reliability literature, primarily those that integrate the competence of both engineers and statisticians [10, 11], since, as has already been said, only by involving everyone's commitment can total quality be achieved.

REFERENCES

1. Brunelle RD, Kapur KC. Customer-centered reliability methodology. Proceedings of the Annual Reliability and Maintainability Symposium, Philadelphia, 1997, pp. 286-292.
2. Toth-Fay R. Evaluation of field reliability on the basis of the information supplied by the warranty. Proceedings of the EOQC II European Seminar on Life Testing and Reliability, Torino, Italy, 1971, pp. 71-78.
3. Turpin MP. Application of computer methods to reliability prediction and assessment in a commercial company. Reliab Eng 3: 295-314, 1982.
4. Vikman S, Johansson B. Some experiences with a programmed Weibull routine for the evaluation of field test results. Proceedings of the EOQC II European Seminar on Life Testing and Reliability, Torino, 1971, p. 70 (ext vers pp. 1-20).
5. Kalbfleisch JD, Lawless JF, Robinson JA. Methods for the analysis and prediction of warranty claims. Technometrics 33(3): 273-285, 1991.
6. Jones JA, Hayes JA. Use of a field failure database for improvement of product reliability. Reliab Eng Syst Safety 55: 131-134, 1997.
7. Wayne KY, Kapur KC. Customer driven reliability: Integration of QFD and robust design. Proceedings of the Annual Reliability and Maintainability Symposium, Philadelphia, 1997, pp. 339-345.
8. Erto P, Guida M. Some maximum likelihood reliability estimates from warranty data of cars in users' operation. Proceedings of the European Reliability Conference, Copenhagen, 1986, pp. 55-58.
9. Erto P. Reliability assessments by repair shops via maintenance data. J Appl Stat 16(3): 303-313, 1989.
10. Erto P, Guida M. Estimation of Weibull reliability from few life tests. J Qual Reliab Eng Int 1(3): 161-164, 1985.
11. Erto P, Giorgio M. Modified practical Bayes estimators. IEEE Trans Reliab 45(1): 132-137, 1996.
Some Recent Developments in Control
Charts for Monitoring a Proportion
Marion R. Reynolds, Jr.
Virginia Polytechnic Institute and State University, Blacksburg, Virginia

Zachary G. Stoumbos
Rutgers University, Newark, New Jersey

1. INTRODUCTION

Control charts are used to monitor a production process to detect changes that may occur in the process. In many applications, information about the process may be in the form of a classification of items from the process into one of two categories, such as defective or nondefective, or nonconforming and conforming. The process characteristic of interest is the proportion $p$ of items that fall in the first category. For convenience in describing the problem being considered, the labels "defective" and "nondefective" will be used in this paper for the two categories. It is usually assumed that the items from the process are independent, with probability $p$ of being defective. This would then imply that the total number of defective items in a sample of $n$ items, say $T$, has a binomial distribution. In most quality control applications the primary objective in using a control chart would be to detect an increase in $p$, because an increase in $p$ corresponds to a decrease in quality. However, a decrease in $p$ would be of interest if it is important to document an improvement in process quality. Woodall (1997) gives a general review of control charts that can be applied to the problem of monitoring $p$.

The traditional approach to applying a control chart to monitor $p$ is to take samples of size $n$ at regular intervals and plot the values of the sample

proportion defective, $T/n$, on a Shewhart p-chart. The p-chart usually has control limits set at $\pm 3$ standard deviations from the in-control value $p_0$ (three-sigma limits). Although the p-chart is relatively easy to set up and interpret, it has a number of disadvantages. These disadvantages are particularly critical when $p_0$ is close to zero. Monitoring a process with $p_0$ close to 0 is becoming more and more common with the increasing emphasis on high quality production. Thus, it is important to be aware of the disadvantages of the p-chart and to consider better alternatives.

The distribution of $T$ is discrete, and when $p_0$ is close to zero the distribution of $T$ is also highly skewed (unless $n$ is very large). This results in the p-chart with $3\sigma$ limits having properties very different from what would be expected from a normal distribution with $3\sigma$ limits. For example, for many values of $n$ and $p_0$ that might occur in applications, the calculated lower control limit is negative, so there is, in effect, no lower control limit. This means that the chart will not be able to detect decreases in $p$. In addition, when $p = p_0$, the discreteness and skewness of $T$ can result in a probability above the upper control limit that is far from the value 0.00135 expected from the normal distribution. If the probability is far above 0.00135, then the false alarm rate will be much higher than expected, whereas if the probability is far below 0.00135, then the false alarm rate will be much lower than expected. A lower than expected false alarm rate is undesirable because it means that process changes will be detected more slowly than necessary.
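These effects can be checked directly from the binomial distribution. A small Python fragment, using the values $n = 200$ and $p_0 = 0.005$ that reappear in the example of Section 5:

```python
from math import comb

def binom_tail(n, p, k):
    """P(T >= k) for T ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, p0 = 200, 0.005
ucl = p0 + 3 * (p0 * (1 - p0) / n) ** 0.5   # three-sigma upper limit: 0.01996
print(ucl)
print(binom_tail(n, p0, 4))                 # 0.01868, far above 0.00135
print(binom_tail(n, p0, 5))                 # 0.00355
```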
The p-chart is a Shewhart control chart that plots, at each sampling point, the proportion defective for that sample alone. Information from past samples is not used, and this results in a chart that is not very efficient for detecting small changes in $p$. In particular, if $p_0$ is close to 0, then the p-chart requires a very large value of $n$ to detect a small increase in $p$ within a reasonable length of time. The traditional approach to improving the efficiency of a Shewhart chart for detecting small process changes is to use runs rules. For the p-chart, the use of runs rules might also enable decreases in $p$ to be detected when there is no lower control limit. The disadvantages of using large numbers of runs rules are that the chart is more difficult to interpret and the evaluation of the statistical properties of the chart is much more complicated. Most evaluations of the statistical properties of runs rules are based on the symmetrical normal distribution, with regions within the control limits specified in terms of the standard deviation of the statistic being plotted. Applying these runs rules to the p-chart results in the same problem as with the control limits; the discreteness and skewness of the distribution of $T$ can result in runs rules with properties much different than expected.

An additional disadvantage to using a p-chart with runs rules is that using runs rules is not the most efficient way to detect small changes in $p$. A better approach to obtaining a control chart that will detect small process changes is to use a chart, such as a CUSUM chart, that directly and efficiently uses the past sample data at each sample point. Although the discreteness of the distribution of $T$ will be an issue with a CUSUM chart, the fact that the CUSUM chart is based on a sum will make the discreteness much less of a problem than in the case of the p-chart. In the past, a hindrance to the application of CUSUM charts to monitor $p$ is that it is difficult for the practitioner to determine the CUSUM chart parameters that will give specified properties. Some tables or figures have been published [see, for example, Gan (1993)], but these tables and figures do not include all values of $n$ and $p_0$ that would be of interest in applications.

Another approach to obtaining more efficient control charts is to use a control chart that varies the sampling rate as a function of the process data. Although a large number of papers have been published in recent years on variable sampling rate control charts [see, e.g., Reynolds (1996a) and Stoumbos and Reynolds (1997b)], only a few papers have been published on the specific problem of monitoring $p$ [see, e.g., Rendtel (1990) and Vaughan (1993)]. The application of variable sampling rate control charts to monitoring $p$ has been hindered by the difficulty of determining the chart parameters that will give specified properties.
The objective of this chapter is to consider three highly efficient control charts for monitoring $p$ that can be used in three different situations. The first control chart is a CUSUM chart, called the Bernoulli CUSUM chart, that can be used in situations in which all items from the process are inspected. The use of 100% inspection is becoming more common as automatic inspection systems are implemented. Also, in the highly competitive global markets of today there is an increasing emphasis on maintaining a very low proportion of product that is defective or that does not meet specifications. The sampling rates that are necessary to discriminate between very low values of $p$ will frequently correspond to 100% inspection. CUSUM charts for this problem have been considered before [see, e.g., Bourke (1991)]. A disadvantage of these CUSUM charts has been that designing a CUSUM chart for a particular application has been difficult unless the values of $n$ and $p_0$ in the application happen to correspond to values in published tables. A contribution of the current chapter is to show how to design a CUSUM chart for the case of 100% inspection using relatively simple and highly accurate approximations.

The second control chart to be considered is a CUSUM chart that can be applied in situations in which samples of $n$ items are taken from the process at regular intervals. CUSUM charts for this problem have been

studied before [see, e.g., Gan (1993)]. As in the case of 100% inspection, a disadvantage of the CUSUM chart in this situation has been that designing one for a particular application has been difficult unless the values of $n$ and $p_0$ in the application happen to correspond to published results. A contribution of the current chapter is to show how to design a CUSUM chart for the binomial distribution using relatively simple and highly accurate approximations.

The third control chart to be considered here is a chart that can be applied when it is not feasible to use 100% inspection but it is feasible to vary the sample size used at each sampling point depending on the data obtained at that sampling point. The sample size is varied by applying a sequential probability ratio test (SPRT) at each sampling point. This SPRT chart for monitoring $p$ is a variable sampling rate control chart, and it is much more efficient than charts that take a fixed-size sample. Methods based on relatively simple and highly accurate approximations are given for designing the SPRT chart.

The remainder of this chapter is organized as follows. Sections 2-5 pertain to the Bernoulli CUSUM chart, Sections 6-8 pertain to the binomial CUSUM chart, and Sections 9-12 pertain to the SPRT chart. For each chart, a description is given, the evaluation of statistical properties is discussed, a design method is explained, and a design example is given. Some general conclusions are given in Section 13.

2. THE BERNOULLI CUSUM CHART WHEN USING 100% INSPECTION

When all items from the process are inspected, the results of the inspection of the $i$th item can be represented as a Bernoulli observation $X_i$, which is 1 if the $i$th item is defective and 0 otherwise. Then $p$ corresponds to $P(X_i = 1)$. The control chart to be considered for this problem is a CUSUM chart based directly on the individual observations $X_1, X_2, \ldots$ without any grouping into segments or samples. This Bernoulli CUSUM chart is defined here for the problem of detecting an increase in $p$. The problem of detecting a decrease in $p$, as well as additional details about the Bernoulli CUSUM chart, are given in Reynolds and Stoumbos (1999).

For detecting an increase in $p$, the Bernoulli CUSUM control statistic is
$B_i = \max(0, B_{i-1}) + X_i - \gamma$,  $i = 1, 2, \ldots$    (1)

where $\gamma > 0$ is the reference value. After the inspection of item $i$, this CUSUM statistic adds the increment $X_i - \gamma$ to the previous value as long as the previous value is nonnegative, but resets the cumulative sum to 0 if the previous value drops below 0. The starting value $B_0$ is frequently taken to be 0 but can be taken to be a positive value if a head start is desired [see Lucas and Crosier (1982) for a discussion of using a head start in a CUSUM chart]. This chart will signal that there has been an increase in $p$ if $B_i \ge h_B$, where $h_B$ is the control limit. The reference value $\gamma$ can be chosen by using the representation of a CUSUM chart as a sequence of SPRTs. To determine the value of $\gamma$ it is necessary to specify a value $p_1 > p_0$ that represents an out-of-control value of $p$ that should be detected quickly. For a given in-control value $p_0$ and a given out-of-control value $p_1$, define the constants $r_1$ and $r_2$ as

$r_1 = \ln\left(\frac{1 - p_0}{1 - p_1}\right)$,  $r_2 = \ln\left(\frac{p_1(1 - p_0)}{p_0(1 - p_1)}\right)$    (2)

Then, from the basic definition of an SPRT (see Section 9), it can be shown that the appropriate choice for $\gamma$ is

$\gamma = \frac{r_1}{r_2}$    (3)

It will usually be convenient if $\gamma = 1/m$, where $m$ is an integer. For example, if $p_0 = 0.005$ and $p_1$ is chosen to be $p_1 = 0.010$, then this will give $r_1 = 0.00504$, $r_2 = 0.6982$, and $r_1/r_2 = 0.00722 = 1/138.6$. In this case, if $p_1$ is adjusted slightly from 0.010 to 0.009947, then $r_1/r_2$ will decrease slightly to 1/139. This means that the possible values of $B_i$ will be integer multiples of 1/139, and this will be convenient for plotting the chart. In general, if $p_0$ and $p_1$ are small, then a slight change in $p_1$ will be sufficient to make $\gamma = 1/m$, where $m$ is an integer. In most cases the precise specification of $p_1$ will not be critical, so this slight change in $p_1$ will be of no practical consequence.
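A minimal Python sketch of these computations: it derives $r_1$, $r_2$, and $\gamma$ from $p_0$ and $p_1$ via (2) and (3) and runs the Bernoulli CUSUM of (1) over a stream of 0/1 inspection results; for $p_0 = 0.005$ and $p_1 = 0.010$ it reproduces the values quoted above.

```python
from math import log

def reference_value(p0, p1):
    """r1, r2, and gamma = r1/r2 from Eqs. (2) and (3)."""
    r1 = log((1 - p0) / (1 - p1))
    r2 = log(p1 * (1 - p0) / (p0 * (1 - p1)))
    return r1, r2, r1 / r2

def bernoulli_cusum(xs, gamma, h_b, b0=0.0):
    """Index of the first signal (B_i >= h_B), or None if none occurs."""
    b = b0
    for i, x in enumerate(xs, start=1):
        b = max(0.0, b) + x - gamma   # the update of Eq. (1)
        if b >= h_b:
            return i
    return None

r1, r2, gamma = reference_value(0.005, 0.010)
print(round(r1, 5), round(r2, 4), round(gamma, 5))   # 0.00504 0.6982 0.00722
```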

3. PROPERTIES OF THE BERNOULLI CUSUM CHART

The performance of a control chart is usually evaluated by looking at the average run length (ARL), which is the expected number of samples required to signal. In the current context of 100% inspection, there may be no natural division of observations into samples or segments, and thus, to avoid confusion, we will use the average number of observations to signal (ANOS) instead of the ARL to measure the performance of control charts.

Assuming that the production rate is constant, the ANOS can easily be converted to time units, and for purposes of exposition we will frequently refer to the ANOS as a measure of detection time. When the process is in control ($p = p_0$), it is desirable to have a large ANOS so that the rate of false alarms is low. On the other hand, when there has been a significant change in $p$, it is desirable to have a small ANOS so that this change in $p$ is detected quickly. In the previous section $p_1$ was defined as a value of $p$ that should be detected quickly, and thus the ANOS should be small at $p = p_1$. However, in practice, it is usually desirable to consider a range of values of $p$ around $p_1$ and to have a chart with good performance for all of these values of $p$.

The ANOS of the Bernoulli CUSUM can be evaluated by formulating the CUSUM as a Markov chain [see Reynolds and Stoumbos (1999)]. This approach gives the exact ANOS when $r_1/r_2$ is a rational number, but the disadvantage is that a computer program is usually required. The approach to be given here is from Stoumbos and Reynolds (1996) and Reynolds and Stoumbos (1999) and is based on using approximations developed by Wald (1947) and diffusion theory corrections to these approximations obtained by Reynolds and Stoumbos (1999) by extending the work of Siegmund (1985). The approximation for the ANOS that is obtained using this approach will be called the corrected diffusion (CD) approximation. The CD approximation will form the basis of a highly accurate and relatively simple design method that requires only a pocket calculator to design the Bernoulli CUSUM for practical applications.
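When $\gamma = 1/m$, the exact ANOS mentioned above can be obtained from the Markov chain formulation, since the CUSUM statistic then lives on the lattice of integer multiples of $1/m$. The sketch below (Python with NumPy) computes the expected absorption time from $B_0 = 0$ under the signaling rule $B_i \ge h_B$; it is a standard absorption-time computation under the stated assumptions, not necessarily the exact algorithm of Reynolds and Stoumbos (1999). With $p_0 = 0.005$, $m = 139$, and $h_B = 6.187$ (the design derived in Section 5), it gives a value close to the exact in-control ANOS of 56,541 quoted there.

```python
import numpy as np

def anos_markov(p, m, h_b):
    """Exact ANOS of the Bernoulli CUSUM with gamma = 1/m and limit h_B,
    as the expected absorption time of the embedded Markov chain."""
    n_states = int(np.ceil(h_b * m))        # transient states 0, 1/m, ..., below h_B
    Q = np.zeros((n_states, n_states))
    for k in range(n_states):
        Q[k, max(k - 1, 0)] += 1 - p        # X = 0: B -> B - 1/m (reset below 0)
        if k + m - 1 < n_states:            # X = 1: B -> B + (m-1)/m
            Q[k, k + m - 1] += p            # otherwise the chart signals (absorbed)
    steps = np.linalg.solve(np.eye(n_states) - Q, np.ones(n_states))
    return steps[0]

print(anos_markov(0.005, 139, 6.187))       # close to the exact value 56,541
```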

4. A METHOD FOR DESIGNING THE BERNOULLI CUSUM CHART

To design a Bernoulli CUSUM chart for a particular application it will be necessary to specify $p_0$, the in-control value of $p$, and $p_1$, the value of $p$ that the chart is designed to detect. The values of $p_0$ and $p_1$ will then determine the reference value $\gamma$ through Eq. (3). As discussed above, when $p_0$ and $p_1$ are small it will usually be convenient to adjust $p_1$ slightly so that $\gamma = 1/m$, where $m$ is an integer. The design method is presented here for the case in which $0 < p_0 < 0.5$. The case in which $p_0 \ge 0.5$ is discussed in Reynolds and Stoumbos (1999).

In designing the chart it is also necessary to determine the value for the control limit $h_B$. The value of $h_B$ will determine the false alarm rate and the speed with which the chart detects increases in $p$. A reasonable approach to determining $h_B$ is to specify a desired value of the ANOS when $p = p_0$ and then choose $h_B$ to achieve approximately this value of the ANOS. It will usually not be possible to achieve exactly a desired value of the ANOS because the Bernoulli distribution is discrete. Once $h_B$ is chosen to achieve approximately the specified ANOS at $p = p_0$, it will be desirable to look at the ANOS at $p = p_1$ and at other values of $p$ to determine whether detection of shifts in $p$ will be fast enough. In practice, it may be necessary to adjust $h_B$ to achieve a reasonable balance between the desire to have a low false alarm rate (achieved by choosing a large $h_B$) and fast detection of shifts in $p$ (achieved by choosing a small $h_B$).

The CD approximation to the ANOS of the Bernoulli CUSUM chart uses an adjusted value of $h_B$, which will be denoted by $h_B^*$, in a relatively simple formula. This adjusted value of $h_B$ is

$h_B^* = h_B + \varepsilon(p_0)\sqrt{p_0(1 - p_0)}$    (4)

where $\varepsilon(p)$ can be approximated by

$\varepsilon(p) \approx 0.410 - 0.0842\ln(p) - 0.0391[\ln(p)]^3$  if $0.01 \le p < 0.5$

$\varepsilon(p) \approx 0.410 - 0.0842\ln(p) - 0.0391[\ln(p)]^3 - 0.00376[\ln(p)]^4 - 0.000008[\ln(p)]^7$  if $0 < p < 0.01$    (5)

When $p = p_0$, the CD approximation to the ANOS is

$\mathrm{ANOS}(p_0) = \frac{\exp(r_2 h_B^*) - r_2 h_B^* - 1}{r_2(\gamma - p_0)}$    (6)

For given values of $r_1$ and $r_2$ and a desired value for the in-control ANOS, Eq. (6) can be used to find the required value of $h_B^*$, and then (4) and (5) can be used to find the required value of $h_B$. Finding $h_B^*$ using (6) can be accomplished by simple trial and error.
In most applications it will be desirable to determine how fast a shift
from po to p l will be detected. The C D approximation to the ANOS when
P =PI 1s

Note that / I ; uses po even though the ANOS is being approximated at p l .


Approximations to the ANOS for other values of p and a discussion of the
accuracy of the CD approximation are given in Reynolds and Stoumbos
(1999).
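The design calculation of this section is easy to automate. The following Python sketch (the function names and the bisection search are our own choices, not part of the original method) solves (6) for h_B* and then converts to h_B through (4) and (5):

```python
import math

def epsilon(p):
    # CD correction term of Eq. (5); valid for 0 < p < 0.5.
    L = math.log(p)
    e = 0.410 - 0.0842 * L - 0.0391 * L**3
    if p < 0.01:
        e += -0.00376 * L**4 - 0.000008 * L**7
    return e

def design_bernoulli_cusum(p0, p1, target_anos):
    # Reference value gamma = r1/r2, as in Eq. (3).
    r1 = math.log((1 - p0) / (1 - p1))
    r2 = math.log(p1 * (1 - p0) / (p0 * (1 - p1)))
    gamma = r1 / r2

    # ANOS(p0) of Eq. (6); increasing in h*, so bisection can replace
    # the pocket-calculator trial and error.
    def anos0(h_star):
        x = r2 * h_star
        return (math.exp(x) - x - 1) / (r2 * (gamma - p0))

    lo, hi = 0.0, 100.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if anos0(mid) < target_anos:
            lo = mid
        else:
            hi = mid
    h_star = 0.5 * (lo + hi)

    # Convert h* to the control limit h_B via Eq. (4).
    h_B = h_star - epsilon(p0) * math.sqrt(p0 * (1 - p0))
    return gamma, h_star, h_B

# Section 5 example: p0 = 0.005, p1 adjusted to 0.009947, ANOS = 56,410
print(design_bernoulli_cusum(0.005, 0.009947, 56410))
# -> gamma = 1/139, h* = 6.515, h_B = 6.187 (approximately)
```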

5. AN EXAMPLE OF DESIGNING A BERNOULLI CUSUM CHART

Consider a production process for which it has been possible to maintain the
proportion defective at a low level, p₀ = 0.005, except for occasional periods
in which the value of p has increased above this level. All items from this
production process are automatically inspected, and a Shewhart p-chart is
currently being used to monitor this process. Items are grouped into segments
of n = 200 items for purposes of applying the p-chart. If 3σ limits are
used with the p-chart, then the upper control limit is 0.01996, and this is
equivalent to signaling if T_j ≥ 4, where T_j is the number of defectives in the
jth segment. When p = p₀ = 0.005, this results in P(T_j ≥ 4) = 0.01868, and
it was decided that this probability of a false alarm was too high. Thus, the
upper control limit of the p-chart was adjusted so that a signal is given if
T_j ≥ 5, and this gives a probability of 0.00355 for a false alarm. There is no
lower control limit because giving a signal for T_j = 0, the lowest possible
value of T_j, would result in P(T_j = 0) = 0.3670 when p = p₀, and thus the
false alarm rate would be unacceptably high. When p = p₀, the expected
number of segments until a signal is 1/0.00355 = 282.05. Each segment
consists of 200 items, so this corresponds to an in-control ANOS of
56,410 items.
To design a Bernoulli CUSUM chart for this problem, suppose that
process engineers decide that it would be desirable to quickly detect any
special cause that increases p from 0.005 to 0.010 and that the in-control
ANOS should be roughly 56,410 (the value corresponding to the p-chart in
current use). From a previous discussion of the case of p₀ = 0.005 and
p₁ = 0.010, it was shown that adjusting p₁ slightly from 0.010 to 0.009947
would give r₁/r₂ = 1/139, and thus m = 139. Using trial and error to solve
(6) to give ANOS(p₀) ≈ 56,410 results in a value of h_B* of 6.515 [this value of
h_B* will give an in-control ANOS of 56,408 according to the approximation
of Eq. (6)]. Then, using (4) and (5) to convert to h_B gives ε(p₀) = 4.646,
ε(p₀)√(p₀(1 − p₀)) = 0.328, and h_B = 6.187. As a point of interest, the exact in-control
ANOS using h_B = 6.187 can be calculated to be 56,541 by using
the methods given in Reynolds and Stoumbos (1999). Thus, in this case the
CD approximation gives results that are extremely close to the exact value
and certainly good enough for practical applications.
After h_B has been determined, Eq. (7) can be used to determine how
fast a shift from p₀ to p₁ will be detected. Using h_B* = 6.515 in (7) gives
ANOS(p₁) ≈ 1848. Interestingly, the exact ANOS can be calculated to be
1856, so the CD approximation is also very good at p = p₁. At p = p₁, the
ANOS of the p-chart is 3936 items. Thus, the p-chart would require on
average more than twice as long as the Bernoulli CUSUM chart to detect
a shift from p₀ to p₁.

6. THE BINOMIAL CUSUM CHART

In many applications 100% inspection of the process output will not be
feasible, and thus samples from the output will have to be used for monitoring.
In this section, the problem of monitoring p when the data from the
process consist of samples of fixed size n that are taken at fixed sampling
intervals of length d is investigated. If T_k is used to represent the total
number of defectives observed in the kth sample, then the statistics
T₁, T₂, ... are independent binomial random variables. The control chart
to be considered here is a CUSUM chart based on these statistics.
The binomial CUSUM chart uses the control statistic

Y_k = max(0, Y_{k−1}) + (T_k − nγ),  k = 1, 2, ...    (8)

and signals at sample k if Y_k ≥ h_Y, where Y₀ is the starting value and γ is
given by (3). The reference value of this CUSUM chart is nγ = n r₁/r₂, and
this reference value is appropriate for detecting a shift to p₁.
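A minimal sketch of the chart logic in Eq. (8) (the function name and return convention are ours):

```python
def binomial_cusum(counts, n, gamma, h_Y, y0=0.0):
    # Y_k = max(0, Y_{k-1}) + (T_k - n*gamma); signal when Y_k >= h_Y.
    y = y0
    for k, t_k in enumerate(counts, start=1):
        y = max(0.0, y) + (t_k - n * gamma)
        if y >= h_Y:
            return k          # 1-based index of the signaling sample
    return None               # no signal in the data supplied
```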
In the current situation in which samples are taken from the process,
the performance of a control chart can be measured by the average time to
signal (ATS). As in the case of using the ANOS in previous sections, when
p = p₀ the ATS should be large, and when p shifts from p₀ the ATS should
be small. In evaluating the ATS of the binomial CUSUM chart, it will be
assumed for simplicity that the time required to take and plot a sample of n
observations is negligible relative to the time d between samples. In this case,
the ATS can be expressed as the product of d and the average number of
samples to signal (ANSS). Gan (1993) discusses Markov chain methods for
evaluating the ANSS of the binomial CUSUM chart. Here we use CD
approximations to design the binomial CUSUM chart.
When the ATS is used as a measure of the time required to detect a
shift in p, the ATS is usually computed assuming that the shift in p occurs
when process monitoring starts. However, in many cases the process may
run for a while at the in-control value p₀ and then shift away from p₀ at
some random time in the future. In this case the detection time of interest is
the time from the shift to the signal by the control chart. For control charts
such as the CUSUM chart, the computation of this expected time is complicated
by the fact that the CUSUM statistic may not be at its starting
value when the shift in p occurs. If it is assumed that the CUSUM statistic
has reached its stationary or steady-state distribution by the time the shift
occurs, then the expected time from the shift to the signal is called the
steady-state ATS (SSATS). When performing comprehensive comparisons
of different control charts, it is appropriate to consider the SSATS as a
measure of detection time. However, for the limited comparisons to be
given in the design examples in this paper, the ATS will be used.

7. A METHOD FOR DESIGNING THE BINOMIAL CUSUM CHART

A method for designing the binomial CUSUM can be developed by using
CD approximations to the ANSS and the ATS. This method is presented
here for the special case in which p₀ < 0.5 and 1/nγ is a positive integer.
Extensions of this method to more general cases are currently under development.
The CD approximation to the ANSS of the binomial CUSUM uses an
adjusted value of h_Y, which will be denoted as h_Y*, in a relatively simple
formula: the adjusted value of h_Y* is given by Eq. (9), the CD approximation
to the ANSS when p = p₀ by Eq. (10), and the CD approximation to
the ANSS when p = p₁ by Eq. (11). An approximation to the ATS is
obtained by multiplying the ANSS by the sampling interval d.
To design a binomial CUSUM chart, the values of p₀ and p₁ can be
specified, and then these values will determine the reference value γ through
Eq. (3). To use the CD approximations given above it will be necessary to
choose values of n and p₁ such that 1/nγ is a positive integer. In this case, the
possible values of the binomial CUSUM statistic Y_k will be integer multiples
of nγ. Thus, in considering values for h_Y, it is sufficient to look at values that
are integer multiples of nγ. These values of h_Y will correspond to certain
values of h_Y*, using (9). Using the approximation (10), the value of h_Y* can be
selected that will give approximately the desired value for the in-control
ANSS. Then the h_Y to be used can be obtained from h_Y* by using (9).

8. AN EXAMPLE OF DESIGNING A BINOMIAL CUSUM CHART

Consider a situation similar to the example in Section 5 in which a Shewhart
p-chart is being used to monitor a production process for which the in-control
value of p is p₀ = 0.005. Instead of using 100% inspection for this
process, suppose that it is necessary to take samples from the process output.
The value of p₀ is relatively small, and thus it is necessary to take
relatively large samples for the p-chart to be able to detect small increases
in p above p₀. Suppose that samples of size n = 200 items are used (the same
as the size of the segments in the example in Section 5). To keep the total
sampling effort to a reasonable level, the samples are taken from the process
every d = 4 hr. As in the previous example, the upper control limit of the p-chart
was adjusted so that a signal is given if T_k ≥ 5, which gives
P(T_k ≥ 5) = 0.00355 when p = p₀. This corresponds to an in-control
ANSS of 1/0.00355 = 282.05 and an in-control ATS of
4(282.05) = 1128.2 hr.
Consider now the design of a binomial CUSUM chart assuming that
samples will be taken every d = 4 hr as described above. Suppose that process
engineers decide that it is important to detect a shift in p from
p₀ = 0.005 to p₁ = 0.010 and that it would be reasonable to have an in-control
ANSS of approximately 282 (the same as the value for the p-chart).
As in the example in Section 5, adjusting p₁ slightly from 0.010 to
0.009947 will give γ = r₁/r₂ = 1/139. If n is taken to be 139, then the reference
value of the binomial CUSUM chart becomes nγ = 139/139 = 1.
Many practitioners might prefer to have n = 140, rather than 139, and
this can be achieved by an additional slight adjustment in p₁. If p₁ is adjusted
to 0.009820, then this will give γ = r₁/r₂ = 1/140. Then, taking n = 140
gives a reference value of nγ = 140/140 = 1.
The reference value for the binomial CUSUM chart is 1, so it follows
that it is sufficient to look at values for h_Y that are integer multiples of 1. If
several values of h_Y are tried, it is found that using h_Y = 5 in Eq. (9) gives
h_Y* = 5.33, and using this h_Y* in Eq. (10) gives ANSS(p₀) ≈ 228.7. As a point
of interest, the exact ANSS for this value of h_Y is 228.6. Using h_Y = 6 in Eq.
(9) gives h_Y* = 6.33, and using this h_Y* in (10) gives ANSS(p₀) ≈ 471.8 (the
exact value is 471.3). Neither of these ANSS values is extremely close to the
desired value of 282, but suppose that it is decided that 228.7, corresponding
to h_Y = 5, is close enough. Using h_Y = 5 will give an in-control ATS of
approximately 4 × 228.7 = 914.8 hr. Using h_Y = 5 and h_Y* = 5.33 in (11)
gives ANSS(p₁) ≈ 11.6 (the exact value is 11.9). This corresponds to an
ATS at p = p₁ = 0.0098 of approximately 4 × 11.6 = 46.4 hr. At p = p₁,
the ATS of the p-chart is 80.3 hr. Thus, the binomial CUSUM chart will
detect a shift to p₁ faster than the p-chart will. Note that the p-chart is
sampling at a higher rate than the CUSUM chart (200 every 4 hr versus
140 every 4 hr), but the CUSUM chart has a slightly higher false alarm rate.
When the p-chart is being used to detect small increases in p above a
small value of p₀, it is necessary to use a large sample size to detect this
increase in a reasonable amount of time. This may require that the sampling
interval d be relatively long in order to keep the sampling cost to a reasonable
level. However, for the binomial CUSUM chart it is not necessary to
have n large; it is actually better to take smaller samples at shorter intervals.
Thus, as an alternative to taking a sample of n = 140 every d = 4 hr, consider
the option of taking a sample of n = 70 every d = 2 hr. If the binomial
CUSUM chart uses n = 70 and p₁ = 0.009820, then the reference value will
be nγ = 70/140 = 0.5, and the possible values for Y_k will be integer multiples
of 0.5. Thus, it is sufficient to look at values for h_Y that are integer multiples
of 0.5. If the p-chart has an in-control ATS of 1128 hr and it is desirable to
have approximately the same value for the binomial CUSUM with d = 2,
then the in-control ANSS should be 1128/2 = 564. Using h_Y = 5.5 in (9)
gives h_Y* = 5.83, and using this in (10) gives ANSS(p₀) ≈ 558.5 (the exact
value is 557.9). This corresponds to an in-control ATS of approximately
2 × 558.5 = 1117.0 hr. Using (11) gives ANSS(p₁) ≈ 24.6 (the exact value is
25.1). This corresponds to an ATS at p = p₁ = 0.0098 of approximately
2 × 24.6 = 49.2 hr. Compared to the p-chart, this binomial CUSUM chart
has almost the same false alarm rate and a lower sampling rate, yet it will
detect a shift to p₁ much faster.
As another alternative to taking a sample of n = 140 every d = 4 hr,
consider the option of taking a sample of n = 35 every hour. If the binomial
CUSUM chart uses n = 35 and p₁ = 0.009820, then the reference value will
be nγ = 35/140 = 0.25, and the possible values for Y_k will be integer multiples
of 0.25. Thus, it is sufficient to look at values for h_Y that are integer
multiples of 0.25. If the p-chart has an in-control ATS of 1128 hr and it is
desirable to have approximately the same value for the binomial CUSUM
with d = 1, then the in-control ANSS should be 1128. Using h_Y = 5.75 in (9)
gives h_Y* = 6.08, and using this h_Y* in (10) gives ANSS(p₀) ≈ 1228.2 (the
exact value is 1226.6). Because d = 1, this corresponds to an in-control
ATS of 1228.2 hr. Using (11) gives ANSS(p₁) ≈ 50.7 (the exact value is
51.3). This corresponds to an ATS at p = p₁ = 0.0098 of 50.7 hr.
The three binomial CUSUM charts that have been considered here
have the same sampling rate of 35 observations per hour. However, their
false alarm rates are not exactly the same, so it is difficult to do precise
comparisons of the charts. But based on the results given for these charts, it
seems clear that taking small samples at frequent intervals would give fast
detection of process shifts. If n is reduced to the smallest possible value, 1,
then the binomial CUSUM chart reduces to the Bernoulli CUSUM chart
discussed previously. Using n = 1 might be the best way to apply a CUSUM
chart from a statistical point of view, but taking samples of size n = 1 might
be inconvenient in some applications.

9. THE SPRT CHART

CUSUM charts, such as the binomial CUSUM chart described previously,
can be thought of as sequences of SPRTs carried out over successive sampling
points. The SPRT chart to be considered in this section is based on
using SPRTs in a different way. In particular, the SPRT chart is based on
applying a sequential test (an SPRT) to the individual items inspected at
each sampling point. A description of the SPRT chart for the case of monitoring
a general parameter is given by Stoumbos and Reynolds (1996) and,
for the case of monitoring the mean of a normal distribution, in Stoumbos
and Reynolds (1997b). More details about the current problem of applying
the SPRT chart to monitor p are given in Reynolds and Stoumbos (1998).
In the context of hypothesis testing, the SPRT is a general sequential test
that can be applied to test a simple null hypothesis against a simple alternative
hypothesis. For the case of a test involving the proportion defective p,
the SPRT can be used to test the null hypothesis H₀: p = p₀ against the
alternative hypothesis H₁: p = p₁. In the context of monitoring p, p₀
would be the in-control value of p, and p₁ would be a value that should
be detected quickly, as defined in previous sections.
Suppose that a sampling interval of length d is used for sampling from
the process. At each sampling point items from the process are inspected one
by one and an SPRT is applied, with the sample size used at each sampling
point being determined by the SPRT. If items can be inspected quickly
enough, then the inspection can be done on consecutive items as they
come from the process. For example, if an item is produced every 10 sec
and the inspection and recording of the result take no more than 10 sec, then
inspection can be done as the items are produced. On the other hand, if the
inspection rate is slower than the production rate, then inspection could be
done after production on items that have been accumulated. Alternatively,
inspection could be done on items as they come from production, with some
items skipped. For example, if an item is produced every 10 sec but inspection
requires between 30 and 40 sec, then every third or fourth item could be
inspected during inspection periods.
If the SPRT applied at sampling point k accepts H₀: p = p₀, then the
decision is that the process is in control. The process is then allowed to
continue to the next sampling point, k + 1, at which time another SPRT
is applied. But if the SPRT applied at sampling point k rejects H₀, then this
is taken as a signal that there has been a change in p. Action should then be
taken to find and eliminate the cause of this change in p. Thus, the SPRT
chart involves applying an SPRT at each sampling point and giving a signal
whenever one of these SPRTs rejects H₀.
To define the SPRT that is applied at sampling point k, let the
Bernoulli random variable X_ki be defined by X_ki = 1 if the ith item at sampling
point k is defective and by X_ki = 0 otherwise. The statistic used by the
SPRT is defined in terms of a log likelihood ratio using the density
f(x; p) = pˣ(1 − p)^(1−x) of X_ki. After the jth item is inspected at sampling
point k, this log likelihood ratio statistic is

S_kj = r₂T_kj − r₁j    (12)

where the constants r₁ and r₂ are defined by (2), and

T_kj = X_k1 + X_k2 + ... + X_kj    (13)

is the total number of defective items in the first j items inspected at sampling
point k.
The SPRT chart requires the specification of two constants a and b,
b < a, and uses the following rules for sampling and making decisions.

1. At sampling point k, if b < S_kj < a, then continue sampling.
2. At sampling point k, if S_kj ≥ a, then stop sampling and signal that
p has changed.
3. At sampling point k, if S_kj ≤ b, then stop sampling at sampling
point k and wait until sampling point k + 1 to begin applying
another SPRT.
The inequality

b < r₂T_kj − r₁j < a    (14)

determines when the SPRT continues sampling and is usually called the
critical inequality of the SPRT. In some applications it may be more convenient
to carry out the SPRT by dividing S_kj by r₂ to obtain an equivalent
critical inequality. If p₁ > p₀, then this equivalent critical inequality is

g < T_kj − γj < h    (15)

where g = b/r₂, h = a/r₂, and γ is given by (3). Thus, after inspecting the jth
item at sampling point k, the SPRT is carried out by determining T_kj, subtracting
γj, and comparing the result to g and h. If (15) holds, then inspection
is continued at this point; if T_kj − γj ≥ h, then sampling is stopped and
a signal is given; and if T_kj − γj ≤ g, then sampling is stopped until the time
for sample k + 1 is reached.
As in the cases of the Bernoulli CUSUM and the binomial CUSUM, it
will usually be convenient to have γ = 1/m, where m is a positive integer, so
that the SPRT statistic T_kj − γj in (15) will take on values that are integer
multiples of γ. It will usually be possible to make γ = 1/m by a slight
adjustment of p₁. When γ = 1/m, m a positive integer, the acceptance
limit g in (15) can be chosen to be an integer multiple of 1/m, and this
will ensure that the SPRT statistic T_kj − γj will exactly hit g when the test
accepts H₀. In the development of the SPRT and the SPRT chart that
follows it is assumed that γ = 1/m and that g is an integer multiple of γ.
If T_kj − γj is an integer multiple of γ, then it follows that the rejection limit h
can also be taken to be an integer multiple of γ, although T_kj − γj may still
overshoot h when the test rejects H₀.
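A minimal sketch of a single SPRT carried out through the equivalent critical inequality (15) (the function name and return labels are ours):

```python
def sprt_at_sampling_point(items, gamma, g, h):
    # items: iterable of 0/1 inspection results at one sampling point.
    t = 0                                 # T_kj, defectives so far
    for j, x in enumerate(items, start=1):
        t += x
        s = t - gamma * j                 # the statistic of Eq. (15)
        if s >= h:
            return "signal", j            # reject H0: p has increased
        if s <= g:
            return "accept", j            # accept H0: wait for next point
    return "continue", len(items)         # ran out of items mid-test
```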

10. THE PROPERTIES OF THE SPRT CHART

When evaluating any hypothesis test, a critical property of the test is determined
by either the probability that the test accepts the null hypothesis or
the probability that the test rejects the null hypothesis, expressed as functions
of the value of the parameter under consideration. Following the
convention in sequential analysis, we work with the operating characteristic
(OC) function, which is the probability of accepting H₀ as a function of p.
For most hypothesis tests the sample size is fixed before the data are taken,
but for a sequential test the sample size, say N, depends on the data and is
thus a random variable. Therefore, for a sequential test the distribution of N
must be considered. Usually, E(N), called the average sample number (ASN),
is used to characterize the distribution of N.
Each SPRT either accepts or rejects H₀, and thus the number of
SPRTs until a signal has a geometric distribution with parameter
1 − OC(p). Because each SPRT corresponds to a sample from the process,
the expected number of SPRTs until a signal is the ANSS. For the SPRT
chart, the ANSS for a given p, say ANSS(p), is thus the mean of the geometric
distribution, which is

ANSS(p) = 1 / [1 − OC(p)]    (16)

When there is a fixed time interval d between samples and the time
required to take a sample is negligible, then the ATS is the product of d and
the ANSS. Thus, the ATS at p, say ATS(p), is

ATS(p) = d / [1 − OC(p)]    (17)

When p = p₀, then 1 − OC(p₀) = α, where α is the probability of a type I
error for the test. The ATS is then

ATS(p₀) = d/α    (18)

When p = p₁, then OC(p₁) = β, where β is the probability of a type II error
for the test. The ATS is then

ATS(p₁) = d/(1 − β)    (19)
Exact expressions for the OC and ASN functions of the SPRT for p
can be obtained by modeling the SPRT as a Markov chain [see Reynolds
and Stoumbos (1998)]. These expressions, however, are relatively complicated,
and thus it would be convenient to have simpler expressions that
could be used in practical applications. The remainder of this section is
concerned with presenting some simple approximations to the OC and
ASN functions. These approximations to the OC and ASN functions are
presented here for the case in which 0 < p₀ < 0.5. The case in which p₀ ≥ 0.5
is discussed in Reynolds and Stoumbos (1998).
When the SPRT is used for hypothesis testing, it is usually desirable to
choose the constants g and h such that the test has specified probabilities for
type I and type II errors. The CD approximations to the OC and ASN
functions use an adjusted value of h, which will be denoted by h*, in a
relatively simple formula. The adjusted value of h* is

h* = h + ε(p₀)√(p₀(1 − p₀))    (20)

where ε(p) is given by (5). It is shown in Reynolds and Stoumbos (1998) that choices for g and h* based
on the CD approximations are

h* ≈ (1/r₂) ln[(1 − β)/α]    (21)

and

g ≈ (1/r₂) ln[β/(1 − α)]    (22)

If nominal values are specified for α and β, then g and h* can be determined
by using Eqs. (21) and (22), and then the value of h can be obtained from h*
by using Eq. (20).
The CD approximation to the ASN at p₀ and p₁ can be expressed
simply in terms of α and β [see Reynolds and Stoumbos (1998)]. For
p = p₀, this expression is

ASN(p₀) ≈ {(1 − α) ln[β/(1 − α)] + α ln[(1 − β)/α]} / [r₂(p₀ − γ)]    (23)

and for p = p₁ the expression is

ASN(p₁) ≈ {β ln[β/(1 − α)] + (1 − β) ln[(1 − β)/α]} / [r₂(p₁ − γ)]    (24)

Thus, for given α and β, evaluating the ASN at p₀ and p₁ is relatively easy.

11. A METHOD FOR DESIGNING THE SPRT CHART

To design the SPRT chart for practical applications it is necessary to determine
the constants g and h used in each SPRT. In many applications it is
desirable to specify the in-control average sampling rate and the false alarm
rate and design the chart to achieve these specifications. Specifying the
sampling interval d and ASN(p₀) will determine the in-control average sampling
rate, and specifying ATS(p₀) will determine the false alarm rate. Once
these quantities are specified, the design proceeds as follows.
The value of α is determined by using Eq. (18) and the specified values
of d and ATS(p₀). Then, using (23), the value of β can be determined from
the specified value of ASN(p₀) and the value of α just determined.
Expression (23) cannot be solved explicitly for β in terms of α and
ASN(p₀), so the solution for β will have to be determined numerically.
Once α and β are determined, Eqs. (21), (22), and (20) can be used to
determine g and h.
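The design procedure of this section can be sketched as follows; the bisection over β and the function names are our own choices, and epsilon() repeats the correction term (5) used for the Bernoulli CUSUM:

```python
import math

def epsilon(p):
    # Correction term of Eq. (5), as in the Section 4 sketch.
    L = math.log(p)
    e = 0.410 - 0.0842 * L - 0.0391 * L**3
    if p < 0.01:
        e += -0.00376 * L**4 - 0.000008 * L**7
    return e

def design_sprt_chart(p0, p1, d, asn0_target, ats0_target, m):
    r1 = math.log((1 - p0) / (1 - p1))
    r2 = math.log(p1 * (1 - p0) / (p0 * (1 - p1)))
    gamma = r1 / r2
    alpha = d / ats0_target                       # Eq. (18)

    def asn0(beta):                               # Eq. (23)
        num = (1 - alpha) * math.log(beta / (1 - alpha)) \
              + alpha * math.log((1 - beta) / alpha)
        return num / (r2 * (p0 - gamma))

    # ASN(p0) decreases as beta increases; bisect on (0, 1 - alpha).
    lo, hi = 1e-12, 1 - alpha - 1e-12
    for _ in range(200):
        beta = 0.5 * (lo + hi)
        if asn0(beta) > asn0_target:
            lo = beta
        else:
            hi = beta
    h_star = math.log((1 - beta) / alpha) / r2    # Eq. (21)
    g = math.log(beta / (1 - alpha)) / r2         # Eq. (22)
    h = h_star - epsilon(p0) * math.sqrt(p0 * (1 - p0))   # Eq. (20)
    # Round g and h to the nearest multiple of 1/m.
    return alpha, beta, round(g * m) / m, round(h * m) / m

# Section 12 example: p0 = 0.005, p1 = 0.009947, d = 4 hr,
# ASN(p0) = 200, ATS(p0) = 1128 hr, m = 139:
print(design_sprt_chart(0.005, 0.009947, 4, 200, 1128, 139))
# -> alpha = 0.00355, beta = 0.723, g = -64/139, h close to 5.96
```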

12. AN EXAMPLE OF DESIGNING AN SPRT CHART

To illustrate the design and application of the SPRT chart, consider an
example similar to the examples of Sections 5 and 8 in which the objective
is to monitor a production process with p₀ = 0.005. Suppose that the current
procedure for monitoring this process is to take samples of n = 200 every
d = 4 hr and use a p-chart that signals if five or more defectives are found in
a sample. Suppose that items are produced at a rapid rate and an item can
be inspected in a relatively short time. In this case, process engineers are
willing to use a sequential inspection plan in which items are inspected one
by one and the sample size at each sampling point depends on the data at
that point. In this example the time required to obtain a sample is short
relative to the time between samples, so neglecting this time in computations
of quantities such as the ATS seems to be reasonable.
As in the example in Section 5, suppose that p₁ is specified to be 0.010
and then adjusted slightly to 0.009947, so that γ = 1/139. For the first phase
of the example, suppose that it is decided that the SPRT chart should be
designed to have the same sampling interval, the same in-control average
sampling rate, and the same false alarm rate as the p-chart. Then d can be
taken to be 4, the target for ASN(p₀) can be taken to be 200, and the target
for ATS(p₀) can be taken to be 1128 hr.
First consider the problem of finding g and h in critical inequality (15)
of the SPRT. Using the specifications decided upon for the chart, Eq. (18)
implies that α should be 0.003545. Then, solving (23) numerically for β
gives β = 0.7231. Then, using Eqs. (21) and (20) gives h* =
ln(0.2769/0.003545)/0.6928 = 6.2906 and h = 5.9606, and using Eq. (22)
gives g = ln(0.7231/0.996455)/0.6928 = −0.4628. Rounding g and h to the
nearest multiple of 1/139 gives g = −64/139 = −0.4604 and h =
828/139 = 5.9568. Thus, the SPRT chart can be applied in this case by
using the critical inequality

−0.4604 < T_kj − (j/139) < 5.9568    (25)

The in-control ASN of this chart should be approximately 200 (the exact
value is 198.97), and the in-control ATS should be approximately 1128 hr
(the exact value is 1128.48 hr). Using Eq. (19), this chart's ATS at
p = p₁ = 0.009947 should be approximately d/(1 − β) = 4/(1 − 0.7231) =
14.45 hr (the exact value is 14.51 hr). Thus, compared to the value of
78.73 hr for the p-chart, the SPRT chart will provide a dramatic reduction
in the time required to detect the shift from p₀ to p₁.
The value chosen for p₁ is really just a convenient design device for the
SPRT chart, so this value of p would usually not be the only value that
should be detected quickly. Thus, when designing an SPRT chart in practice,
it is desirable to use the CD approximation (or the exact methods) given
in Reynolds and Stoumbos (1998) to find the ATS for a range of values of p
around p₁. For the evaluation to be given here, exact ATS values for the
SPRT chart were computed and are given in column 3 of Table 1. ATS
values for the p-chart are given in column 2 of Table 1 to serve as a basis of
comparison. Comparing columns 2 and 3 shows that, except for large shifts
in p, the SPRT chart is much more efficient than the p-chart. When considering
the binomial CUSUM in Section 8, it was argued that it is better to
take small samples at more frequent intervals than to take large samples at
long intervals. To determine whether this is also true for the SPRT chart, an
SPRT chart was designed to have an approximate in-control ASN of 50 and
a sampling interval of d = 1 hr. This would give the same sampling rate of
50 observations per hour as in columns 2 and 3. The ATS values of this
second SPRT chart are given in column 4 of Table 1. Comparing columns 3
and 4 shows that using a sampling interval of d = 1 with ASN = 50 is better
than using a sampling interval of d = 4 with ASN = 200, especially for
detecting large shifts.
In some applications, the motivation for using a variable sampling rate
control chart is to reduce the sampling cost required to produce a given
detection ability [see Baxley (1995), Reynolds (1996b), and Reynolds and
Stoumbos (1998)]. Because the SPRT chart is so much more efficient than
the p-chart, it follows that the SPRT chart could achieve the detection
ability of the p-chart with a much smaller average sampling rate. To illustrate
this point, the design method given in Section 11 was used to design
some SPRT charts with lower average sampling rates. Columns 5 and 6 of
Table 1 contain ATS values of two SPRT charts that have an in-control
average sampling rate of approximately half the value for the p-chart
(approximately 25 observations per hour). The SPRT chart in column 5
uses d = 2.0 and has ASN(p₀) ≈ 50, and the SPRT chart in column 6 uses
d = 1.0 and has ASN(p₀) ≈ 25. Although these two SPRT charts are sampling
at half the rate of the p-chart, they are still faster at detecting shifts in
p. Columns 7 and 8 of Table 1 contain ATS values of two SPRT charts that
have an in-control average sampling rate of approximately one-fourth the
value for the p-chart (approximately 12.5 observations per hour). The SPRT
chart in column 7 uses d = 4.0 and has ASN(p₀) ≈ 50, and the SPRT chart
in column 8 uses d = 2.0 and has ASN(p₀) ≈ 25. Comparing columns 5 and
6 to column 2 shows that the SPRT charts with half the sampling rate of the
p-chart offer faster detection than the p-chart. Columns 7 and 8 show that
an SPRT chart with about one-fourth the sampling rate of the p-chart will
offer roughly the same detection capability as the p-chart.

13. CONCLUSIONS

It has been shown that the Bernoulli CUSUM chart, the binomial CUSUM
chart, and the SPRT chart are highly efficient control charts that can be
applied in different sampling situations. Each of these charts is much more
efficient than the traditional Shewhart p-chart. The design methods based on
the highly accurate CD approximations provide a relatively simple way for
practitioners to design these charts for practical applications. Although the
design possibilities for these charts are limited slightly by the discreteness of
the distribution of the inspection data, this discreteness is much less of a
problem than for the p-chart.
The SPRT chart is a variable sampling rate control chart that is much
more efficient than standard fixed sampling rate charts such as the p-chart.
The increased efficiency of the SPRT chart can be used to reduce the time
required to detect process changes or to reduce the sampling cost required to
achieve a given detection capability.

REFERENCES

Baxley RV Jr. (1995). An application of variable sampling interval control charts. J Qual Technol 27: 275-282.
Bourke PD. (1991). Detecting a shift in fraction nonconforming using run-length control charts with 100% inspection. J Qual Technol 23: 225-238.
Gan FF. (1993). An optimal design of CUSUM control charts for binomial counts. J Appl Stat 20: 445-460.
Lucas JM, Crosier RB. (1982). Fast initial response for CUSUM quality control schemes: Give your CUSUM a head start. Technometrics 24: 199-205.
Rendtel U. (1990). CUSUM schemes with variable sampling intervals and sample sizes. Stat Papers 31: 103-118.
Reynolds MR Jr. (1996a). Variable sampling interval control charts with sampling at fixed times. IIE Trans 28: 497-510.
Reynolds MR Jr. (1996b). Shewhart and EWMA variable sampling interval control charts with sampling at fixed times. J Qual Technol 28: 199-212.
Reynolds MR Jr, Stoumbos ZG. (1998). The SPRT chart for monitoring a proportion. IIE Trans 30: 545-561.
Reynolds MR Jr, Stoumbos ZG. (1999). A CUSUM chart for monitoring a proportion when inspecting continuously. J Qual Technol 31: 87-108.
Siegmund D. (1985). Sequential Analysis. Springer-Verlag, New York.
Stoumbos ZG, Reynolds MR Jr. (1996). Control charts applying a general sequential test at each sampling point. Sequent Anal 15: 159-183.
Stoumbos ZG, Reynolds MR Jr. (1997a). Corrected diffusion theory approximations in evaluating properties of SPRT charts for monitoring a process mean. Nonlinear Analysis 30: 3987-3996.
Stoumbos ZG, Reynolds MR Jr. (1997b). Control charts applying a sequential test at fixed sampling intervals. J Qual Technol 29: 21-40.
Vaughan TS. (1993). Variable sampling interval np process control chart. Commun Stat: Theory Methods 22: 147-167.
Wald A. (1947). Sequential Analysis. Dover, New York.
Woodall WH. (1997). Control charts based on attribute data: Bibliography and review. J Qual Technol 29: 172-183.
Process Monitoring with Autocorrelated
Data
Douglas C. Montgomery
Arizona State University, Tempe, Arizona

Christina M. Mastrangelo
University of Virginia, Charlottesville, Virginia

1. INTRODUCTION

The standard assumptions when control charts are used to monitor a process
are that the data generated by the process when it is in control are
normally and independently distributed with mean μ and standard deviation
σ. Both μ and σ are considered fixed and unknown. An out-of-control
condition is created by an assignable cause that produces a change or
shift in μ or σ (or both) to some different value. Therefore, we could say
that when the process is in control the quality characteristic at time t, x_t, is
represented by the model

x_t = μ + ε_t,  t = 1, 2, ...    (1)

where ε_t is normally and independently distributed with mean zero and
standard deviation σ. This is often called the Shewhart model of the process.
When these assumptions are satisfied, one may apply either Shewhart,
CUSUM, or EWMA control charts and draw reliable conclusions about the
state of statistical control of the process. Furthermore, the statistical properties
of the control chart, such as the false alarm rate with 3σ control limits,
or the average run length, can be easily determined and used to provide
guidance for chart interpretation. Even in situations where the normality

assumption is violated to a slight or moderate degree, these control charts
will still work reasonably well.
The most important of these assumptions is that the observations are
independent (or uncorrelated), because conventional control charts do not
perform well if the quality characteristics exhibit even low levels of correlation
over time. Specifically, these control charts will give misleading results
in the form of too many false alarms if the data are autocorrelated. This
point has been made by numerous authors, including Berthouex et al.
(1978), Alwan and Roberts (1988), Montgomery and Friedman (1989),
Alwan (1992), Harris and Ross (1991), Montgomery and Mastrangelo
(1991), Yaschin (1993), and Wardell et al. (1994).
Unfortunately, the assumption of uncorrelated or independent observations
is not even approximately satisfied in some manufacturing processes.
Examples include chemical processes in which consecutive
measurements on process or product characteristics are often highly correlated
and automated test and measurement procedures in which every quality
characteristic is measured on every unit in time order of production. The
increasing use of on-line data acquisition systems is shrinking the interval
between process observations. As a result, the volume of process data collected
per unit time is increasing dramatically [see the discussion in Hahn
(1989)]. All manufacturing processes are driven by inertial elements, and
when the interval between samples becomes small relative to these forces,
the observations on the process will be correlated over time.
It is easy to give an analytical demonstration of this phenomenon.
Figure 1 shows a simple system consisting of a tank of volume V, with an
input and output material stream having flow rate f. Let w_t be the concentration
of a certain material in the input stream at time t and x_t be the
corresponding concentration in the output stream at time t. Assuming
homogeneity within the tank, the relationship between x_t and w_t is

T (dx_t/dt) = w_t − x_t

Figure 1 A tank with volume V and input and output material streams.
where T = V/f is often called the time constant of the system.
If the input stream experiences a step change of w₀ at time t = 0 (say),
then the output concentration at time t is

x_t = w₀(1 − e^(−t/T))

Now, in practice, we do not observe x_t continuously but only at small,
equally spaced intervals of time, Δt. In this case,

x_t = a w_t + (1 − a)x_{t−1}

where a = 1 − e^(−Δt/T).
The properties of the output stream concentration x_t depend on those
of the input stream concentration w_t and the sampling interval. Figure 2
illustrates the effect of the mean of w_t on x_t. If we assume that the w_t are
uncorrelated random variables, then the correlation between successive
values of x_t (or the autocorrelation between x_t and x_{t−1}) is given by

ρ = 1 − a = e^(−Δt/T)

Note that if Δt is much greater than T, then ρ ≈ 0. That is, if the interval
between samples Δt in the output stream is long, much longer than the time
constant T, then the observations on output concentration will be uncorrelated.
However, if Δt ≤ T, this will not be the case. For example, if

Δt/T = 1, ρ = 0.37
Δt/T = 0.5, ρ = 0.61
Δt/T = 0.25, ρ = 0.78
Δt/T = 0.10, ρ = 0.90

Figure 2 The effect of the mean of w_t on x_t.

Clearly, if we sample at least once per time constant, there will be significant
autocorrelation present in the observations. For instance, sampling four
times per time constant (Δt/T = 0.25) results in autocorrelation between
x_t and x_{t−1} of ρ = 0.78. Autocorrelation between successive observations as
small as 0.25 can cause a substantial increase in the false alarm rate of a
control chart, so clearly this is an important issue to consider in control
chart implementation.
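A two-line check of these values, assuming the relationship ρ = e^(−Δt/T) derived above:

```python
import math

for ratio in (1.0, 0.5, 0.25, 0.10):
    print(f"dt/T = {ratio:<4}  rho = {math.exp(-ratio):.2f}")
# prints rho = 0.37, 0.61, 0.78, 0.90
```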
Figure 3 illustrates the foregoing discussion. This is a control chart for
individual measurements applied to concentration measurements from a
chemical process taken every hour. The data are shown in Table 1. Note
that many points are outside the control limits (horizontal lines) on this
chart. Because of the nature of the production process and the visual
appearance of the concentration measurements in Figure 3, which appear
to "drift" or "wander" slowly over time, we would probably suspect that
concentration is autocorrelated.
Figure 4 is a scatter plot of concentration at time t (x_t) versus
concentration measured one period earlier (x_{t−1}). Note that the points
on this graph tend to cluster along a straight line with a positive slope.
That is, a relatively low observation of concentration at time t − 1 tends
to be followed by another low value at time t, while a relatively large
observation at time t − 1 tends to be followed by another large value at
time t. This type of behavior is indicative of positive autocorrelation in
the observations.
It is also possible to measure the level of autocorrelation analytically.
The autocorrelation over a series of time-oriented observations is measured
by the autocorrelation function

ρ_k = Cov(x_t, x_{t−k}) / V(x_t),  k = 0, 1, ...

where Cov(x_t, x_{t−k}) is the covariance of observations that are k time periods
apart, and we have assumed that the observations (called a time series) have
constant variance given by V(x_t). We usually estimate the values of ρ_k with
the sample autocorrelation function:

r_k = [Σ_{t=1}^{n−k} (x_t − x̄)(x_{t+k} − x̄)] / [Σ_{t=1}^{n} (x_t − x̄)²],  k = 0, 1, ..., K
Figure 3 Control chart for individuals.

Table 1 Concentration Data

Time, t    x       Time, t    x       Time, t    x       Time, t    x
 1   70.204        26   69.270        51   70.263        76   71.371
 2   69.982        27   69.738        52   71.257        77   71.387
 3   70.558        28   69.794        53   73.019        78   71.819
 4   68.993        29   79.400        54   71.871        79   71.162
 5   70.064        30   70.935        55   72.793        80   70.647
 6   70.291        31   72.224        56   73.090        81   70.566
 7   71.401        32   71.930        57   74.323        82   70.311
 8   70.048        33   70.534        58   74.539        83   69.762
 9   69.028        34   69.836        59   74.444        84   69.552
10   69.892        35   68.808        60   74.247        85   70.884
11   70.152        36   70.559        61   72.979        86   71.593
12   71.006        37   69.288        62   71.824        87   70.242
13   70.196        38   68.740        63   74.612        88   70.863
14   70.477        39   68.322        64   74.368        89   69.895
15   69.510        40   68.713        65   75.109        90   70.244
16   67.744        41   68.973        66   76.569        91   69.716
17   67.607        42   69.580        67   75.959        92   68.914
18   68.168        43   68.808        68   76.005        93   69.216
19   69.979        44   69.931        69   73.206        94   68.431
20   68.227        45   69.763        70   72.692        95   67.516
21   68.497        46   69.541        71   72.251        96   67.542
22   67.113        47   69.889        72   70.386        97   69.136
23   67.993        48   71.243        73   70.519        98   69.905
24   68.113        49   69.701        74   71.005        99   70.515
25   69.142        50   71.135        75   71.542       100   70.234
Figure 4 Scatter plot of concentration at time t (x_t) versus concentration measured one period earlier (x_{t−1}).

As a general rule, we need to compute values of r_k for a number of lags,
say k = 1, 2, ..., K, where K ≤ n/4. Many software programs for statistical
data analysis can perform these calculations.
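A minimal sketch of this computation (the function name is ours):

```python
import numpy as np

def sample_acf(x, k_max):
    # r_k = sum_{t=1}^{n-k} (x_t - xbar)(x_{t+k} - xbar) / sum (x_t - xbar)^2
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    denom = np.sum(dev ** 2)
    return [float(np.sum(dev[:-k] * dev[k:]) / denom)
            for k in range(1, k_max + 1)]

# For the n = 100 concentration readings of Table 1, sample_acf(x, 25)
# gives r_1 close to 0.88, in agreement with Figure 5.
```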
The sample autocorrelation function for the concentration data is
shown in Figure 5. The dashed line on the graph is the upper two-standard-deviation
limit on the autocorrelation parameter ρ_k at lag k. The lower limit
(not shown here) would be symmetrical. These limits are useful in detecting
nonzero autocorrelations; in effect, if a sample autocorrelation exceeds its
two-standard-deviation limit, the corresponding autocorrelation parameter
ρ_k is likely nonzero. Note that there is a strong positive correlation at lag 1;
that is, concentration observations that are one period apart are positively
correlated with r₁ = 0.88. This level of autocorrelation is sufficiently high to
distort greatly the performance of a Shewhart control chart. In particular,
because we know that positive correlation greatly increases the frequency of
false alarms, we should be very suspicious about the out-of-control signals
on the control chart in Figure 3.
Several approaches have been proposed for monitoring processes with
autocorrelated data. Just as in traditional applications of SPC techniques to
uncorrelated data, our objective is to detect assignable causes so that if the
causes are removed, process variability can be reduced. The first approach is to sample
from the process less frequently so that the autocorrelation is diminished.
For example, note from Figure 5 that if we only took every 20th observation
on concentration, there would be very little autocorrelation in the resulting
data. However, since the original observations were taken every hour, the
new sampling frequency would be one observation every 20 hr. Obviously,
the drawback of this approach is that many hours may elapse between the
occurrence of an assignable cause and its detection.
The second general approach may be thought of as a model-based
approach. One way that this approach is implemented involves building
an appropriate model for the process and control charting the residuals.
The basis of this approach is that any disturbances from assignable causes
that affect the original observations will be transferred to the residuals.
Model-based approaches are presented in the following subsection. The
model-free approach does not use a specific model for the process; this
approach is discussed in Section 3.

2. MODEL-BASED APPROACHES
2.1. ARIMA Models
An approach to process monitoring with autocorrelated data that has been
applied widely in the chemical and process industries is to directly model the
correlative structure with an appropriate time series model, use that model
to remove the autocorrelation from the data, and apply control charts to the
residuals. For example, suppose we could model the quality characteristic x_t
as

x_t = ξ + φx_{t−1} + ε_t    (3)

where ξ and φ (−1 < φ < 1) are unknown constants and ε_t is normally and
independently distributed with mean zero and standard deviation σ. Note
how intuitive this model is for the concentration data from examining
Figure 4. Equation (3) is called a first-order autoregressive model; the observations
x_t from such a model have mean ξ/(1 − φ) and standard deviation
σ/(1 − φ²)^(1/2), and the observations that are k periods apart (x_t and x_{t−k})
have correlation coefficient φᵏ. That is, the autocorrelation function should
decay exponentially just as the autocorrelation function of the concentration
data did in Figure 5. Suppose that φ̂ is an estimate of φ obtained from
analysis of sample data from the process and x̂_t is the fitted value of x_t.
Then the residuals

e_t = x_t − x̂_t

are approximately normally and independently distributed with mean zero
and constant variance. Conventional control charts could now be applied to
the sequence of residuals. Points out of control or unusual patterns on such
charts would indicate that the parameter φ had changed, implying that the
original variable x_t was out of control. For details of identifying and fitting
time series models such as this one, see Montgomery et al. (1990) and Box et
al. (1994).
The parameters in the autoregressive model, Eq. (3), may be estimated
by the method of least squares, that is, by choosing the values of ξ and φ
that minimize the sum of squared errors ε_t. Many statistical software
packages have routines for fitting these time series models. The fitted
value of this model for the concentration data is

x̂_t = 8.38 + 0.88x_{t−1}
We may think of this as an alternative to the Shewhart model for this
process.
Figure 6 is an individuals control chart of the residuals from the fitted
first-order autoregressive model. Note that now no points are outside the
control limits. In contrast to the control chart on the individual measurements
in Figure 3, we would conclude that this process is in a reasonable
state of statistical control.
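A minimal sketch of fitting Eq. (3) by least squares and forming the residuals for an individuals chart (the function name and the 3σ limit convention are ours):

```python
import numpy as np

def ar1_residual_chart(x):
    # Least-squares fit of Eq. (3): regress x_t on x_{t-1}.
    x = np.asarray(x, dtype=float)
    phi, xi = np.polyfit(x[:-1], x[1:], 1)   # slope, intercept
    resid = x[1:] - (xi + phi * x[:-1])
    sigma = resid.std(ddof=2)                # two estimated parameters
    return resid, (-3 * sigma, 3 * sigma), (xi, phi)

# For the concentration data the fitted model is close to
# x_t = 8.38 + 0.88 x_{t-1}, and an individuals chart of the residuals
# should look like Figure 6.
```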

Other Time Series Models

The first-order autoregressive model used in the concentration example [Eq.
(3)] is not the only possible model for time-oriented data that exhibits a
correlative structure. An obvious extension to Eq. (3) is

x_t = ξ + φ₁x_{t−1} + φ₂x_{t−2} + ε_t    (4)

Figure 6 Control chart for individuals applied to the residuals from the AR(1) model.

which is a second-order autoregressive model. In general, in autoregressive-type
models, the variable x_t is directly dependent on previous observations
x_{t−1}, x_{t−2}, and so forth. Another possibility is to model the dependence
through the random component ε_t. A simple way to do this is

x_t = μ + ε_t − θε_{t−1}    (5)

This is called a first-order moving average model. In this model, the
correlation between x_t and x_{t−1} is ρ₁ = −θ/(1 + θ²) and is zero at all other
lags. Thus, the correlative structure in x_t extends backward for only one
time period.
Sometimes combinations of autoregressive and moving average terms
are useful. A first-order mixed model is

x_t = ξ + φx_{t−1} + ε_t − θε_{t−1}    (6)

This model often occurs in the chemical and process industries. The reason
is that if the underlying process variable x_t is first-order autoregressive and a
random error component is added to x_t, the result is the mixed model in Eq.
(6). In the chemical and process industries, first-order autoregressive process
behavior is fairly common. Furthermore, the quality characteristic is often
measured in a laboratory (or by an on-line instrument) that has measurement
error, which we can usually think of as random or uncorrelated. The
reported or observed measurement then consists of an autoregressive component
plus random variation, so the mixed model in Eq. (6) is required as
the process model.
The first-order integrated moving average model

x_t = x_{t−1} + ε_t − θε_{t−1}    (7)

is also useful in some applications. Whereas the previous models are used to describe
stationary behavior (that is, x_t wanders around a "fixed" mean), the
model in Eq. (7) describes nonstationary behavior (the variable x_t "drifts"
as if there were no fixed value of the process mean). This model often arises
in chemical and process plants when x_t is an "uncontrolled" process output,
that is, when no control actions are taken to keep the variable close to a target
value.
The models we have been discussing in Eqs. (3)-(7) are members of a
class of time series models called autoregressive integrated moving average
(ARIMA) models. Montgomery et al. (1990) and Box et al. (1994) discuss
these models in detail. While these models appear very different from the
Shewhart model [Eq. (1)], they are actually relatively similar and include the
Shewhart model as a special case. Note that if we let φ = 0 in Eq. (3), the
Shewhart model results. Similarly, if we let θ = 0 in Eq. (5), the Shewhart
model results.

Average Run Length Performance for Residuals Control Charts

Several authors have pointed out that residuals control charts are not sensitive
to small process shifts [e.g., see Wardell et al. (1994)]. The average run
length for the residuals chart from an AR(1) model is

ARL_RES = (1 − P₁ + P) / P    (8)

where P₁ is the probability that the run has length 1, that is, the probability
that the first residual exceeds ±3:

P₁ = Pr(run length = 1) = 1 − Φ(3 − δ) + Φ(−3 − δ)    (9)

Φ(·) is the cumulative distribution function of the standard normal distribution.
The probability that any subsequent observation will generate an
alarm is the probability that e_t exceeds ±3:

P = 1 − Φ(3 − δ(1 − φ)) + Φ(−3 − δ(1 − φ))    (10)

See Willemain and Runger (1996) for the complete derivation.
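Equations (8)-(10) are easy to evaluate; the following sketch (the function name is ours) reproduces the entries of Table 2:

```python
from statistics import NormalDist

Phi = NormalDist().cdf

def arl_res(phi, delta):
    # ARL of the AR(1) residuals chart, Eqs. (8)-(10); delta is the shift
    # in the process mean in units of the residual standard deviation.
    P1 = 1 - Phi(3 - delta) + Phi(-3 - delta)                # first residual
    P = 1 - Phi(3 - delta * (1 - phi)) + Phi(-3 - delta * (1 - phi))
    return (1 - P1 + P) / P

print(round(arl_res(0.5, 2.0), 2))   # 37.93, matching Table 2
```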


Table 2 ARLs for Residuals Chart

Correlation          Shift δ/σ
φ              0        0.5        1        2        4
0.00         370.38   152.22    43.89     6.30     1.19
0.25         370.38   212.32    80.37    13.59     1.32
0.50         370.38   280.33   152.69    37.93     2.00
0.90         370.38   364.51   345.87   260.48    32.74
0.99         370.38   368.95   362.76   312.00    59.30

Note: ARLs measured in observations.

Table 2 shows ARL_RES for representative values of the autocorrelation
coefficient φ and shift δ. Note the poor performance of the residuals
chart when the correlation is high (φ = 0.90 or φ = 0.99). This problem
arises because the AR(1) model responds to the change in the mean level
and partially incorporates the shift in the mean into its forecasts, as can be
seen from the factor (1 − φ) in Eq. (10).

Using an Exponentially Weighted Moving Average (EWMA) with Autocorrelated Data

The time series modeling approach illustrated in the concentration example
can be time-consuming and difficult to apply in practice. Typically, we apply
control charts to several process variables, and developing an explicit time
series model for each variable of interest is potentially time-consuming.
Some authors have developed automatic time series model building to partially
alleviate this difficulty [see Yourstone and Montgomery (1989) and the
references therein]. However, unless the time series model is of intrinsic
value in explaining process dynamics (as it sometimes is), this approach
will frequently require more effort than may be justified in practice.
Montgomery and Mastrangelo (1991) suggested an approximate procedure
based on the EWMA. They use the fact that the EWMA can be used
in certain situations where the data are autocorrelated. Suppose that the
process can be modeled by the integrated moving average model of Eq. (7).
It can be easily shown that the EWMA with λ = 1 − θ is the optimal one-step-ahead
forecast for this process. That is, if x̂_{t+1}(t) is the forecast for the
observation in period t + 1 made at the end of period t, then

x̂_{t+1}(t) = z_t

where z_t = λx_t + (1 − λ)z_{t−1} is the EWMA. The sequence of one-step-ahead
prediction errors,

e_t = x_t − x̂_t(t − 1)

is independently and identically distributed with mean zero. Therefore, control
charts could be applied to these one-step-ahead prediction errors. The
parameter λ (or equivalently, θ) would be found by minimizing the sum of
squares of the errors e_t.
Now suppose that the process is not modeled exactly by Eq. (7). In
general, if the observations from the process are positively autocorrelated
and the process mean does not drift too quickly, the EWMA with an appropriate
value for λ will provide an excellent one-step-ahead predictor. The
forecasting and time series analysis field has used this result for many years;
for examples, see Montgomery et al. (1990). Consequently, we would expect
many processes that obey first-order dynamics (that is, follow a slow
"drift") to be well represented by the EWMA.
Consequently, under the conditions just described, we may use the
EWMA as the basis of a statistical process monitoring procedure that is
an approximation of the exact time series model approach. The procedure
would consist of plotting one-step-ahead EWMA prediction errors (or
model residuals) on a control chart. This chart could be accompanied by
a run chart of the original observations on which the EWMA forecast is
superimposed. Our experience indicates that both charts are usually necessary,
as operational personnel feel that the control chart of residuals sometimes
does not provide a direct frame of reference to the process. The run
chart of original observations allows process dynamics to be visualized.
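A minimal sketch of the EWMA prediction-error computation, with λ chosen by a grid search over the sum of squared errors (the grid and function names are arbitrary choices of ours):

```python
import numpy as np

def ewma_prediction_errors(x, lam):
    # One-step-ahead errors e_t = x_t - z_{t-1}, with the EWMA
    # z_t = lam * x_t + (1 - lam) * z_{t-1} started at x_1.
    x = np.asarray(x, dtype=float)
    z, errors = x[0], []
    for xt in x[1:]:
        errors.append(xt - z)
        z = lam * xt + (1 - lam) * z
    return np.array(errors)

def best_lambda(x, grid=np.arange(0.05, 1.00, 0.01)):
    # Choose lambda by minimizing the sum of squared prediction errors.
    sse = [float(np.sum(ewma_prediction_errors(x, l) ** 2)) for l in grid]
    return float(grid[int(np.argmin(sse))])

# For the concentration data, best_lambda(x) is about 0.85, the value
# used for the chart in Figure 7.
```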
Figure 7 presents a control chart for individuals applied to the EWMA
prediction errors for the concentration data. For this chart, λ = 0.85. This is
the value of λ that minimizes the sum of squares of the EWMA prediction
errors. This chart is slightly different from the control chart of the exact
autoregressive model residuals shown in Figure 6, but not significantly so.
Both indicate a process that is reasonably stable, with a period around t =
62 where an assignable cause may be present.

Figure 7 EWMA prediction errors with λ = 0.85 and Shewhart limits.
Montgomery and Mastrangelo (1991) point out that it is possible to
combine information about the state of statistical control and process
dynamics on a single control chart. If the EWMA is a suitable one-step-ahead
predictor, then one could use z_t as the centerline on a control chart
for period t + 1 with upper and lower control limits at

UCL_{t+1} = z_t + 3σ    (11)

and

LCL_{t+1} = z_t − 3σ    (12)

and the observation x_{t+1} would be compared to these limits to test for
statistical control. We can think of this as a moving centerline EWMA control
chart. As mentioned above, in many cases this would be preferable from
an interpretation standpoint to a control chart of residuals and a separate
chart of the EWMA, as it combines information about process dynamics
and statistical control in one chart.
Figure 8 is the moving centerline EWMA control chart for the data,
with λ = 0.85. It conveys the same information about statistical control as

Figure 8 Moving centerline EWMA control chart applied to the concentration data.
the residual or EWMA prediction error control chart in Figure 7, but operating
personnel often feel more comfortable with this display.
ARL Performance. Because the EWMA-based procedures presented
above are very similar to the residuals control chart, they will have some of
the same problems in detecting process shifts. Also, Tseng and Adams
(1994) note that because the EWMA is not an optimal forecasting scheme
for most processes [except the IMA(1,1) model], it will not completely
account for the autocorrelation, and this can affect the statistical performance
of control charts based on EWMA residuals or prediction errors.
Montgomery and Mastrangelo (1991) suggest the use of supplementary
procedures called tracking signals combined with the control charts for
residuals. There is evidence that these supplementary procedures considerably
enhance the performance of residuals control charts. Furthermore,
Mastrangelo and Montgomery (1995) show that if an appropriately
designed tracking signal scheme is combined with the EWMA-based procedure
we have described, good in-control performance and adequate shift
detection can be achieved.
Estimating and Monitoring σ. The standard deviation of the one-step-ahead
errors or model residuals, σ, can be estimated in several ways.
If λ is chosen as suggested above over a record of n observations, then
dividing the sum of the squared prediction errors for the optimal λ by n
will produce an estimate of σ². This is the method used in many time series
analysis computer programs.
Another approach is to compute the estimate of σ as is typically done
in forecasting systems. The mean absolute deviation (MAD) could be used
in this regard. The MAD is computed by applying an EWMA to the absolute
value of the prediction error,

Δ_t = α|e_t| + (1 − α)Δ_{t−1},  0 < α ≤ 1

Since the MAD of a normal distribution is related to the standard deviation
by σ ≅ 1.25Δ [see Montgomery et al. (1990)], we could estimate the standard
deviation of the prediction errors at time t by

σ̂_t ≅ 1.25Δ_t

Another approach is to directly calculate a smoothed variance,

σ̂_t² = αe_t² + (1 − α)σ̂_{t−1}²,  0 < α ≤ 1
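A minimal sketch of both estimates (the smoothing constant α = 0.1 is an arbitrary illustrative choice, not a value given in the text):

```python
def mad_sigma(errors, alpha=0.1):
    # Smoothed MAD of the prediction errors; sigma_t is then about
    # 1.25 * Delta_t for normal errors.  alpha = 0.1 is illustrative only.
    delta = abs(errors[0])
    for e in errors[1:]:
        delta = alpha * abs(e) + (1 - alpha) * delta
    return 1.25 * delta

def smoothed_sigma(errors, alpha=0.1):
    # Exponentially smoothed variance of the prediction errors.
    v = errors[0] ** 2
    for e in errors[1:]:
        v = alpha * e ** 2 + (1 - alpha) * v
    return v ** 0.5
```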


ProcessMonitoringwithAutocorrelatedData 153

MacGregor and Harris (1993) discuss the use of exponentially weighted moving variance estimates in monitoring the variability of a process. They show how to find control limits for these quantities for both correlated and uncorrelated data.
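
As a sketch of these estimators (the smoothing constant alpha and the startup values are illustrative choices, not prescribed by the text):

    import numpy as np

    def sigma_estimates(e, alpha=0.1):
        """Track sigma-hat from one-step-ahead errors e via the smoothed
        MAD, Eqs. (13)-(14), and via a directly smoothed variance."""
        delta = float(np.mean(np.abs(e[:10])))   # startup MAD
        s2 = float(np.var(e[:10]))               # startup variance
        for et in e:
            delta = alpha * abs(et) + (1 - alpha) * delta
            s2 = alpha * et ** 2 + (1 - alpha) * s2
        return 1.25 * delta, np.sqrt(s2)         # MAD-based and variance-based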

The Weighted Batch Means Control Chart


While control charting the residuals from a time series model is one way to
cope with autocorrelation, there is another way to exploit these models. This
is the weighted batch means chart, introduced by Runger and Willemain
(1995).
Bischak et al. (1993) derived a way to eliminate autocorrelation among the averages of successive data values in discrete-event simulation. Their findings have value for statistical process control, since a way to cancel autocorrelation in subgroups maps the problem of autocorrelated data into the familiar problem of using independent subgroups to monitor process means.
Starting with a stationary ARIMA or ARMA model, Bischak et al. (1993) derived the weights needed to eliminate autocorrelation between batch means as a function of the batch size and the model parameters. Designating the batch size by b and forming the jth batch from consecutive data values $X_{(j-1)b+i}$, the jth weighted batch mean is

$$Y_j = \sum_{i=1}^{b} w_i X_{(j-1)b+i} \qquad (15)$$

The batch size b can be selected to tune performance against a specified shift δ.
The weights $w_i$ must sum to unity for $Y_j$ to be an unbiased estimate of the process mean μ. For AR(p) processes, the optimal weights are identical in the middle of the batch but differ in sign and magnitude for the first and last values in the batch. For the AR(1) model, the weights are

$$w_1 = \frac{-\phi}{(1-\phi)(b-1)}, \qquad w_i = \frac{1}{b-1} \ (2 \le i \le b-1), \qquad w_b = \frac{1}{(1-\phi)(b-1)} \qquad (16)$$

For example, with b = 64 and φ = 0.99, the middle weights are all 0.016, and the first and last weights are −1.57 and 1.59, respectively.
Given normal data and any batch size b > 1, the optimal weights produce batch means that are i.i.d. normal with mean

$$E(Y_j) = \mu \qquad (17)$$

and variance

$$\operatorname{Var}(Y_j) = \frac{1}{(1-\phi)^2(b-1)} \qquad (18)$$

Given (17) and (18), the standardized value of a shift from μ to μ + δ is

$$z = \frac{\delta}{\sqrt{\operatorname{Var}(Y_j)}} = \delta(1-\phi)\sqrt{b-1} \qquad (19)$$

To adjust the on-target ARL to equal $\mathrm{ARL}_{0,\mathrm{obs}}$, one computes the control limit by solving for $A_0$ in

$$\mathrm{ARL}_{0,\mathrm{obs}} = \frac{b}{2\Phi(-A_0)} \qquad (20)$$

where b in the numerator accounts for the fact that each batch is b observations long. Then the average run length for the weighted batch means (WBM) chart (measured in individual observations) can be computed as

$$\mathrm{ARL}_{\mathrm{WBM}} = \frac{b}{1 - \Phi(A_0 - z) + \Phi(-A_0 - z)} \qquad (21)$$
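To make (16) and the numerical example above concrete, a short sketch that forms the AR(1) weights:

    import numpy as np

    def wbm_weights(b, phi):
        """Optimal AR(1) weighted-batch-means weights from Eq. (16)."""
        w = np.full(b, 1.0 / (b - 1))            # middle weights
        w[0] = -phi / ((1 - phi) * (b - 1))      # first weight
        w[-1] = 1.0 / ((1 - phi) * (b - 1))      # last weight
        return w

    w = wbm_weights(64, 0.99)
    print(round(w[0], 2), round(w[1], 3), round(w[-1], 2))  # -1.57 0.016 1.59
    print(w.sum())                                          # ~1.0, so Y_j is unbiased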
Table 3 compares ARL_WBM against ARL_RES for a range of values of batch size b, shift δ, and autocorrelation φ. The proper choice of batch size b results in superior performance for the WBM chart. In general, the WBM chart is more sensitive than the residuals chart for shifts δ ≤ 3 and autocorrelations 0 ≤ φ ≤ 0.99.

The WBM chart achieves its superiority by, in effect, using larger subgroups of residuals. It is well known that for independent data, larger subgroups provide greater sensitivity to small shifts. Runger and Willemain (1995) show that a form of this conclusion applies to autocorrelated data as well.
Table 3 ARLs of Residuals and Weighted Batch Means Charts

                            Shift δ/σ
φ       Chart      0        0.5      1        2        4
0       RES        370.38   155.22   43.89    6.30     1.19
        b = 2      370.38   170.14   53.42    9.21     2.25
        b = 4      370.38   86.03    19.33    4.88     4.00
        b = 8      370.38   48.47    12.57    8.01     8.00
        b = 16     370.38   34.33    16.53    16.00    16.00
        b = 32     370.38   37.32    32.00    32.00    32.00
        b = 64     370.38   64.30    64.00    64.00    64.00
        b = 128    370.38   128.00   128.00   128.00   128.00
        b = 256    370.38   256.00   256.00   256.00   256.00

0.25    RES        370.38   212.32   80.37    13.59    1.32
        b = 2      370.38   226.37   94.01    20.02    3.41
        b = 4      370.38   135.90   37.84    7.70     4.02
        b = 8      370.38   82.97    21.21    8.40     8.00
        b = 16     370.38   56.18    19.72    16.00    16.00
        b = 32     370.38   49.57    32.22    32.00    32.00
        b = 64     370.38   67.61    64.00    64.00    64.00
        b = 128    370.38   128.07   128.00   128.00   128.00
        b = 256    370.38   256.04   256.00   256.00   256.00

0.5     RES        370.38   280.33   152.69   37.93    2.00
        b = 2      370.38   290.61   170.14   53.42    9.21
        b = 4      370.38   215.06   86.03    19.33    4.88
        b = 8      370.38   152.45   48.47    12.57    8.01
        b = 16     370.38   108.52   34.33    16.53    16.00
        b = 32     370.38   85.47    37.32    32.00    32.00
        b = 64     370.38   87.30    64.30    64.00    64.00
        b = 128    370.38   132.01   128.00   128.00   128.00
        b = 256    370.38   256.04   256.00   256.00   256.00

0.9     RES        370.38   364.51   345.87   260.48   32.74
        b = 2      370.38   366.45   355.10   315.45   214.22
        b = 4      370.38   360.48   333.49   254.77   123.84
        b = 8      370.38   351.66   304.84   195.75   74.03
        b = 16     370.38   339.49   270.84   147.13   50.25
        b = 32     370.38   324.61   237.03   115.86   45.99
        b = 64     370.38   310.51   213.34   108.42   66.32
        b = 128    370.38   306.20   216.17   141.32   128.02
        b = 256    370.38   331.71   281.96   256.62   256.00

0.99    RES        370.38   368.95   362.76   312.00   59.30
        b = 2      370.38   370.34   370.22   369.75   367.86
        b = 4      370.38   370.28   369.97   368.76   363.98
        b = 8      370.38   370.18   369.60   367.26   358.19
        b = 16     370.38   370.04   369.04   365.08   350.02
        b = 32     370.38   369.86   368.30   362.20   339.72
        b = 64     370.38   369.66   367.50   359.16   329.50
        b = 128    370.38   369.56   367.13   357.81   325.79
        b = 256    370.38   369.88   368.40   362.73   343.39

Note: ARLs measured in observations.
Source: Runger and Willemain (1995).

3. A MODEL-FREE APPROACH: THE BATCH MEANS CONTROL CHART

Runger and Willemain (1996) proposed an unweighted batch means (UBM) control chart as an alternative to the weighted batch means (WBM) chart for monitoring autocorrelated process data.
The UBM chart differs from the WBM chart by giving equal weights to every point in the batch. Let the jth unweighted batch mean be

$$Y_j = \frac{1}{b}\sum_{i=1}^{b} X_{(j-1)b+i} \qquad (22)$$

This expression differs from (15) only in that

$$w_i = \frac{1}{b}, \qquad i = 1, \ldots, b \qquad (23)$$

The important implication of (23) is that although one has to determine an appropriate batch size b, one does not need to construct an ARMA model of the data. This model-free approach is quite standard in simulation output analysis, which also focuses on inference for long time series with high autocorrelation.

A model-free process-monitoring procedure was the objective of the many schemes considered by Runger and Willemain (1996). That work showed that the batch means can be plotted and approximately analyzed on a standard individuals control chart. Distinct from residuals plots, UBM charts retain the basic simplicity of averaging observations to form a point in a control chart. With UBM, the control chart averaging is used to dilute the autocorrelation of the data.
Procedures for determining an appropriate batch size were developed
by Law and Carson (1979) and Fishman (1978a, 1978b). These procedures
are empirical and do not depend on identifying and estimating a time series
model. Of course, a time series model can guide the process of selecting the
batch size and also provide analytical insights.
Runger and Willemain (1996) provided a detailed analysis of batch sizes for AR(1) models. They recommend that the batch size be selected so as to reduce the lag 1 autocorrelation of the batch means to approximately 0.10. They suggest using Fishman's (1978a) procedure, which starts with b = 1 and doubles b until the lag 1 autocorrelation of the batch means is sufficiently small. This parallels the logic of the Shewhart chart in that larger batches are more effective for detecting smaller shifts, while smaller batches respond more quickly to larger shifts.
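
A sketch of this doubling procedure (the 0.10 threshold follows the recommendation above; the guard on a minimum number of batches is an added practical safeguard):

    import numpy as np

    def lag1_autocorr(y):
        y = y - y.mean()
        return float(np.dot(y[:-1], y[1:]) / np.dot(y, y))

    def fishman_batch_size(x, threshold=0.10, min_batches=20):
        """Start with b = 1 and double b until the lag 1 autocorrelation
        of the batch means is at most the threshold."""
        x = np.asarray(x, dtype=float)
        b = 1
        while True:
            m = len(x) // b
            if m < min_batches:
                return b                   # too few batches to keep doubling
            means = x[: m * b].reshape(m, b).mean(axis=1)
            if lag1_autocorr(means) <= threshold:
                return b
            b *= 2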
Though a time series model is not necessary to construct a UBM chart, Table 4 shows the batch size requirements for the AR(1) model for various values of φ (Kang and Schmeiser, 1987). The lower values of σ_UBM imply greater sensitivity.

Table 4 Minimum Batch Size Required for UBM Chart for AR(1) Data

φ       b      σ(UBM)/σ_a   σ(WBM)/σ_a
0.00    1      1.0000       n/a
0.10    2      0.7454       1.1111
0.20    3      0.6701       0.8839
0.30    4      0.6533       0.8248
0.40    6      0.6243       0.7454
0.50    8      0.6457       0.7559
0.60    12     0.6630       0.7538
0.70    17     0.7405       0.8333
0.80    27     0.8797       0.9806
0.90    58     1.2013       1.3245
0.95    118    1.6827       1.8490
0.99    596    3.7396       4.0996

Note: Batch size chosen to make the lag 1 autocorrelation of the batch means 0.10.

Table 5 Performance of the Unweighted Batch Means Control Chart

                               Shift δ/σ
φ       b      Method            0         1         2
0.9     60     Approximation     370       199       98
               Monte Carlo       371 ± 2   206 ± 3   100 ± 1
0.95    120    Approximation     370       296       201
               Monte Carlo       371 ± 6   303 ± 4   206 ± 2

Note: ARLs measured in batches. Monte Carlo results based on 5 sets of 5,000 alarms. Uncertainties are 95% confidence intervals.

Table 6 Comparison of Shewhart Chart ARLs for AR(1) Data

                               Shift δ/σ
φ       Method   b      0.00     0.5      1        2        4
0       RES      1      10000    2823     520      34       2
0.25    RES      1      10000    4360     1183     116      3
        WBM      4      10000    2066     320      23       4
        UBM      4      10000    1279     149      11       4
        WBM      23     10000    233      34       23       23
        UBM      23     10000    210      32       23       23
0.50    RES      1      10000    6521     2818     506      17
        WBM      8      10000    2230     378      33       8
        UBM      8      10000    1607     225      20       8
        WBM      43     10000    397      66       43       43
        UBM      43     10000    367      63       43       43
0.90    RES      1      10000    9801     9234     7279     1828
        WBM      58     10000    6119     2548     548      96
        UBM      58     10000    5619     2133     423      81
        WBM      472    10000    2547     823      476      472
        UBM      472    10000    2504     809      476      472
0.99    RES      1      10000    9995     9974     6977     4508
        WBM      596    10000    9691     8868     6605     3238
        UBM      596    10000    9631     8670     6178     2847
        WBM      2750   10000    9440     8129     6605     3238
        UBM      2750   10000    9420     8074     5434     3225

Note: ARLs measured in observations.
Runger and Willemain (1996) use the following approximation to estimate the performance of the UBM chart:

$$\mathrm{ARL} = \frac{b}{1 - \Phi(A_0 - \delta/\sigma_{\mathrm{UBM}}) + \Phi(-A_0 - \delta/\sigma_{\mathrm{UBM}})} \qquad (24)$$

This approximation, which assumes that the batch means are i.i.d. normal with mean μ and standard deviation σ_UBM as given in Table 4, was confirmed by Monte Carlo analysis (Table 5).
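
Evaluating (24) requires only the standard normal cumulative distribution function; a minimal sketch:

    from scipy.stats import norm

    def ubm_arl(b, A0, delta, sigma_ubm):
        """ARL (in observations) from the approximation in Eq. (24);
        sigma_ubm can be taken from Table 4 for AR(1) data."""
        z = delta / sigma_ubm
        return b / (1 - norm.cdf(A0 - z) + norm.cdf(-A0 - z))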
Since estimating ARLs with (24) is simpler than extensive Monte Carlo analysis, the approximation is used in Table 6. Table 6 compares this ARL with the ARLs of the other two charts for selected values of the autocorrelation parameter φ. The batch sizes b were chosen by using Table 3 to provide a WBM chart sensitive to a shift δ = 1. The comparison was made with the in-control ARL₀ = 10,000. Table 6 shows that both batch means charts outperform the residuals chart in almost all cases shown, with the UBM chart performing best of all.

REFERENCES

Alwan LC. (1992). Effects of autocorrelation on control chart performance. Commun Stat: Theory Methods 21: 1025-1049.
Alwan LC, Roberts HV. (1988). Time series modeling for statistical process control. J Bus Econ Stat 6(1): 87-95.
Berthouex PM, Hunter WG, Pallesen L. (1978). Monitoring sewage treatment plants: Some quality control perspectives. J Qual Technol 10: 139-148.
Bischak DP, Kelton WD, Pollock SM. (1993). Weighted batch means for confidence intervals in steady-state simulations. Manage Sci 39: 1002-1019.
Box GEP, Jenkins GM, Reinsel GC. (1994). Time Series Analysis: Forecasting and Control. 3rd ed. Prentice-Hall, Englewood Cliffs, NJ.
Fishman GS. (1978a). Grouping observations in digital simulation. Manage Sci 24: 510-521.
Fishman GS. (1978b). Principles of Discrete Event Simulation. Wiley, New York.
Hahn GJ. (1989). Statistics-aided manufacturing: A look into the future. Am Stat 43: 74-79.
Harris TJ, Ross WH. (1991). Statistical process control procedures for correlated observations. Can J Chem Eng 69: 48-57.
Kang K, Schmeiser B. (1987). Properties of batch means from stationary ARMA time series. Oper Res Lett 6: 19-24.
Law A, Carson JS. (1979). A sequential procedure for determining the length of a steady-state simulation. Oper Res 27: 1011-1025.
MacGregor JF, Harris TJ. (1993). The exponentially weighted moving variance. J Qual Technol 25: 106-118.
Mastrangelo CM, Montgomery DC. (1995). Characterization of a moving centerline exponentially weighted moving average. Qual Reliab Eng Int 11(2): 79-89.
Montgomery DC, Friedman DJ. (1989). Statistical process control in a computer-integrated manufacturing environment. In: Keats JB, Hubele NF, eds. Statistical Process Control in Automated Manufacturing. Marcel Dekker, New York.
Montgomery DC, Johnson LA, Gardiner JS. (1990). Forecasting and Time Series Analysis. 2nd ed. McGraw-Hill, New York.
Montgomery DC, Mastrangelo CM. (1991). Some statistical process control methods for autocorrelated data. J Qual Technol 23(3): 179-193.
Runger GC, Willemain TR. (1995). Model-based and model-free control of autocorrelated processes. J Qual Technol 27: 283-292.
Runger GC, Willemain TR. (1996). Batch-means control charts for autocorrelated data. IIE Trans 28: 483-487.
Tseng, Adams BM. (1994). Robustness of forecast-based monitoring schemes. Technical Report, Department of Management Science and Statistics, University of Alabama.
Wardell DG, Moskowitz H, Plante RD. (1994). Run-length distributions of special-cause control charts for correlated processes (with discussion). Technometrics 36(1): 3-27.
Willemain TR, Runger GC. (1996). Designing control charts using an empirical reference distribution. J Qual Technol 28(1): 31.
Yashchin E. (1993). Performance of CUSUM control schemes for serially correlated observations. Technometrics 35: 37-52.
Yourstone S, Montgomery DC. (1989). Development of a real-time statistical process control algorithm. Qual Reliab Eng Int 5: 309-317.
An Introduction to the New Multivariate
Diagnosis Theory with Two Kinds of
Quality and Its Applications
Gongxu Zhang
Beijing University of Science and Technology, Beijing, People’s
Republic of China

1. MULTIOPERATION AND MULTI-INDEX SYSTEMS

In factories, multioperation and multi-index systems are very common. A multioperation system is a system in which the product is processed by a production line consisting of two or more operations. A multi-index system is one in which at least one operation has two or more indices, such as a technical index and/or a quality index. For example, a printed circuit production line consists of 17 operations, with at least two indices and at most 27 per operation. Again, for analgin (a kind of drug), a production line consists of six operations, with at least two and at most six indices. Such examples exist everywhere.

2. PROBLEMS ENCOUNTERED IN IMPLEMENTING QUALITY CONTROL AND DIAGNOSIS IN A MULTIOPERATION, MULTI-INDEX SYSTEM

In a multioperation, multi-index system, if we want to implement quality control and diagnosis, there are three major problems:

1. In a multioperation production line, the processing of the preceding operation will in general influence the current operation. Since the preceding influence is synthesized with the processing of the current operation, how do we differentiate one from the other? If we cannot differentiate them, we cannot distinguish their quality responsibility, and then we cannot implement scientific quality control. Evidently, in a multioperation production line, we need to diagnose the preceding influence.

2. In a multi-index production line, there is the problem of correlations among indices. For example, in the operation of etching a printed circuit, the quality index of etching is correlated with the technical indices NaOH, Cl⁻, and Cu²⁺. When the etching index is abnormal, we need to diagnose which technical index or indices induced this abnormality.

3. In a multioperation, multi-index production line, there are both the preceding influence and the correlations among indices, making the problem more complex.

3. HOW TO DIAGNOSE THE PRECEDING INFLUENCE IN A MULTIOPERATION PRODUCTION LINE

In a multioperation production line, we need to use the diagnosis theory with two kinds of quality proposed by Zhang (1982a) to diagnose the preceding influence. The basis of this theory is the concept of two kinds of quality.

3.1. Two Kinds of Quality

According to the different ranges involved in different definitions of quality, there are two kinds of product quality:

1. Total quality is the product quality contributed by the current operation and all the preceding operations. It is simply product quality in the usual sense and is felt directly by the customer.
2. Partial quality is the quality specifically resulting from the current operation and does not include the influence of the preceding operations. Obviously, it reflects the work quality of the current operation.

These two kinds of quality exist in any operation. Total quality consists of two parts: the partial quality and the preceding influence on it; hence, partial quality is only part of total quality.

3.2. Importance of the Concept of Two Kinds of Quality

The concept of two kinds of quality is very important, as can be seen from the following facts:

1. The two kinds of quality exist at each operation.
2. The concept of two kinds of quality is very general and exists in all processes of production, service, and management, as well as many other processes.

3.3. Fundamental Thinking of the Diagnosis Theory with Two Kinds of Quality

A diagnosis is always obtained through a comparison of a measured value with a standard value. For example, in order to diagnose the preceding influence, we can take the partial quality (which has no relationship with the preceding influence) of the current operation as the standard value and the corresponding total quality (which consists of both the partial quality and the preceding influence) as the measured value. Comparing these two kinds of quality, we can diagnose the preceding influence at the current operation. The greater the difference between these two kinds of quality, the more serious the preceding influence.

Here, the key problem is how to measure these two kinds of quality. If we use control charts to measure them, we can use the Shewhart control chart to measure the total quality and the cause-selecting Shewhart control chart proposed by Zhang (1980) to measure the partial quality. We refer to this as diagnosis with two kinds of control charts. If we use the process capability index to measure the two kinds of quality, we can use the total process capability index (which is just the process capability index in the usual sense), denoted by C_pt, to measure the total quality and the partial process capability index, denoted by C_pp, which is a new kind of process capability index proposed by Zhang (1982a), to measure the partial quality. We refer to this as diagnosis with two kinds of process capability indices. The former is a real-time diagnosis, and the latter is a diagnosis over time. See Zhang (1989, 1990).

3.4. Steps in Diagnosis with Two Kinds of Control Charts

The steps in diagnosis with two kinds of control charts are as follows.

Step 1. Construct the diagnosis system between adjacent operations with technical relations, as shown in Figure 1. In Figure 1, the connection between operations 1 and 2 is the total quality of operation 1, and there exist two kinds of quality at operation 2, i.e., total quality and partial quality. Suppose the total qualities of operations 1 and 2 are measured with two Shewhart charts and the partial quality is measured with the cause-selecting Shewhart chart; then the diagnosis system can also be referred to as a three-chart diagnosis system.

Figure 1 Diagnosis system between adjacent operations: Shewhart charts for the total quality of operations 1 and 2, and a cause-selecting chart for the partial quality of operation 2.
Step 2. Diagnose the diagnosis system according to the typical case diagnosis table, Table 1. Since each control chart has two states, i.e., the normal state and the abnormal state, the three-chart diagnosis system has eight typical diagnosis cases (see Table 1). Comparing the three charts of the diagnosis system with the three-chart cases of Table 1, we can diagnose the diagnosis system.

From Table 1 we can see that if we do not have the diagnosis theory with two kinds of quality and use only Shewhart charts at each operation, we may get a false alarm or a missed alarm for cases II, III, VI, and VII. This has been verified by experience in factories. It is not a fault of the Shewhart chart itself; in fact, it is due to our misunderstanding of the Shewhart chart. The Shewhart chart can be used to reflect total quality only; thus it includes the preceding influence. Using the Shewhart chart as if it reflected the partial quality only and had no relation to the preceding influence is wrong; see Zhang (1992b, p. 173).

3.5. Characteristics of Diagnosis with Two Kinds of Control Charts

In Table 1, the diagnosis of each typical case is derived only from ordinary logical deduction; we did not use probability and statistics. Thus there are no two kinds of errors. Table 1 also considers the connection between preceding and succeeding operations.

The Shewhart chart used in Table 1 can be replaced by some other all-control chart, for example, the CUSUM chart or the T² chart. But, at the same time, the cause-selecting chart should be replaced by the corresponding cause-selecting CUSUM chart, the cause-selecting T² chart, etc.
Table 1 Typical Case Diagnosis Table

(+ denotes an abnormal chart; − denotes a normal chart. The three signs for each case refer to the Shewhart chart for operation 1, the Shewhart chart for operation 2, and the cause-selecting chart for operation 2, respectively.)

I (+, +, +): The partial quality is abnormal. The preceding influence is also abnormal.
II (+, +, −): The partial quality is normal. The preceding influence is abnormal.
III (+, −, +): The partial quality is abnormal. The preceding influence is also abnormal, but the one offsets the effect of the other.
IV (+, −, −): The preceding influence is abnormal, but the partial quality offsets its effect and makes the total quality of operation 2 normal.
V (−, +, +): The partial quality is abnormal. The preceding influence is normal.
VI (−, +, −): The partial quality and the preceding influence are both normal, but their total effect makes the total quality of operation 2 abnormal.
VII (−, −, +): The partial quality is abnormal, but the preceding influence offsets it to make the total quality of operation 2 normal.
VIII (−, −, −): The partial quality, the preceding influence, and the total quality are all normal.
4. HOW TO DIAGNOSE THE CORRELATION AMONG INDICES FOR A MULTI-INDEX OPERATION

For a multi-index operation, we need to use the multivariate diagnosis theory with two kinds of quality proposed by Zhang (1996b, 1997). The fundamental thinking of this theory is similar to that of the diagnosis theory with two kinds of control charts. But since this is the multivariate case, we use the multivariate T² control chart and the cause-selecting T² control chart instead of the corresponding Shewhart chart and cause-selecting Shewhart chart in the three-chart diagnosis system. Since the statistics of the T² control chart include the covariance matrix of the variables (assuming that the number of variables is p),

$$S = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1p} \\ s_{21} & s_{22} & \cdots & s_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ s_{p1} & s_{p2} & \cdots & s_{pp} \end{bmatrix}$$

where $s_{ij}$, i ≠ j, is the covariance between variables i and j, the T² control chart can completely account for the correlation among variables.
The multivariate T² control chart was proposed by Hotelling in 1947 and has been widely used in Western countries for multivariate cases. Its merits are that (1) it considers the correlations among variables and (2) it gives us exactly the probability of the first kind of error, α. But its greatest drawback is that it cannot diagnose which variable induced the abnormality when the process is abnormal. On the other hand, the best merit of the diagnosis theory with two kinds of quality is that it can be used to diagnose the cause of abnormality in the process. Hence Zhang proposed a new multivariate diagnosis theory with two kinds of quality to combine the above-stated theories so that we can concentrate their merits and at the same time avoid their drawbacks.
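
As a sketch of the statistic involved, the T² value for a p-variate observation against an in-control mean vector and covariance matrix (here estimated from a simulated reference sample purely for illustration) can be computed as follows:

    import numpy as np

    def hotelling_t2(x, xbar, S):
        """T^2 = (x - xbar)' S^{-1} (x - xbar) for one observation vector."""
        d = x - xbar
        return float(d @ np.linalg.solve(S, d))

    ref = np.random.default_rng(0).normal(size=(100, 3))  # reference sample
    xbar, S = ref.mean(axis=0), np.cov(ref, rowvar=False)
    t2 = hotelling_t2(ref[0], xbar, S)                    # compare to a UCL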

5. HOW TO SIMULTANEOUSLY DIAGNOSE THE PRECEDING INFLUENCE AND THE CORRELATION AMONG INDICES IN A MULTIOPERATION, MULTI-INDEX SYSTEM

From the preceding discussions it is evident that we need to use the diagnosis theory with two kinds of quality in order to diagnose the preceding influence, and we also need to use the multivariate diagnosis theory with two kinds of quality in order to diagnose the correlated indices. In such a complex system, it is not enough to depend on the technology only; we must consider statistical process control and diagnosis (SPCD) too. Besides, the diagnosis theories of Western countries always diagnose all variables simultaneously. Suppose the number of variables is p and the probability of the first kind of error in diagnosing a variable is α; then the probability of no first kind of error in diagnosing p variables is

$$P_0 = (1 - \alpha)^p \approx 1 - p\alpha$$

Thus, the probability of the first kind of error in diagnosing p variables is

$$P_1 = 1 - P_0 \approx p\alpha$$

i.e., it is proportional to the number of variables. (With p = 27 variables and α = 0.0027, for instance, P₁ ≈ 0.07.) In the case of a great number of variables, the value of P₁ may become intolerable. To solve this problem, Zhang and his Ph.D. candidate Dr. Huiyin Zheng (Zheng, 1995) proposed the multivariate stepwise diagnosis theory in 1994.

5.1. Fundamentals of the Multivariate Stepwise Diagnosis Theory

If we have tested that the population of all variables concerned with the problem is abnormal, we want to identify the abnormal variable. Instead of diagnosing each variable contained in this population, we need only diagnose the most probable assignable variable each time, for by so doing we can decrease the number of diagnosis steps needed. The steps of the multivariate stepwise diagnosis theory are as follows (a sketch in code is given after the list):

Step 1. Test the abnormality of the population of all variables. If it is normal, the diagnosis stops; otherwise proceed to step 2.
Step 2. Select the most probable assignable variable and test whether it is abnormal or not.
Step 3. Test the remaining population of variables. If it is normal, then the diagnosis stops; otherwise return to step 2.

Repeat steps 1-3 until we can ascertain each variable to be normal or abnormal. In practice, it generally takes only one to three steps to complete the multivariate diagnosis process.
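
The loop structure is the essential point; in the following sketch the three test routines are hypothetical callables standing in for the T²-based tests that DTTQ2000 implements:

    def stepwise_diagnosis(variables, population_is_normal,
                           most_probable_variable, variable_is_abnormal):
        """Steps 1-3: single out the most probable assignable variable
        repeatedly until the remaining population tests normal."""
        abnormal, remaining = [], list(variables)
        while remaining and not population_is_normal(remaining):  # steps 1 and 3
            v = most_probable_variable(remaining)                 # step 2
            if variable_is_abnormal(v):
                abnormal.append(v)
            remaining.remove(v)
        return abnormal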

5.2. Compiling the Windows Software DTTQ2000

We have compiled the Windows software DTTQ2000 (DTTQ = diagnosis theory with two kinds of quality), which combines the diagnosis theory with two kinds of quality, the multivariate diagnosis theory with two kinds of quality, and the multivariate stepwise diagnosis theory. So far we have diagnosed the multioperation, multi-index production lines of eleven factories more than 40 times using DTTQ2000. All results of these diagnoses have been in accordance with practical production.

5.3. Necessity of Application of the Multivariate Diagnosis Theory with Two Kinds of Quality and Its DTTQ2000 Software

Today's society has developed into an era of high quality and high reliability. The percent defective of some electronic products is as low as the parts per million or even parts per billion level, so production technology at the worksite must be combined with statistical process control and diagnosis (SPCD) to guarantee product quality. In fact, the requirements of SPCD with respect to product quality are more severe than those of technology. For example, the control limits of control charts are, in general, situated within the specification limits. In addition, we consider significant variations in product quality and nonrandom arrangements of points plotted between control limits on the control chart to be abnormal and take action to eliminate such abnormalities. Technology, on the other hand, does not pay attention to such facts.

At the worksite, technicians in general take one of the following actions whenever there is a need for multivariate control: (1) keep all parameters of the current operation within the specification limits or (2) adopt the Shewhart control chart to control each parameter of the current operation. In fact, these two actions are virtually the same; both oversimplify the multivariate problem and resolve it into several univariate problems. Unless all the variables are independent, we must consider the correlations among variables. For example, in the printed circuit production line there are altogether 27 indices at the Desmear/PTH operation of the factory. If we supervise this process with 27 x̄-R_s control charts, each supervising one of the 27 indices individually, then we can supervise 27 averages and 27 standard deviations, i.e.,

$$\mu_i, \sigma_i, \qquad i = 1, 2, \ldots, 27$$

But we cannot supervise the correlations among variables, i.e., the covariances among indices, with such univariate control charts. There are altogether 351 [= 27(27 − 1)/2] covariance parameters or coefficients of correlation to be supervised. Only by using the multivariate diagnosis theory with two kinds of quality can we supervise all 405 (= 27 + 27 + 351) process parameters and implement the SPCD. Using the DTTQ2000 software we have diagnosed eleven factories in China, and all the diagnostic results have been in fairly good agreement with the actual production results.

Using the DTTQ2000 software with a microcomputer, it takes only about 1 min to perform one diagnosis; thus, it saves much time on the spot. Not only is the diagnosis correct, but it also avoids the subjectivity of the working personnel.
In a factory, it always takes a long time to train an experienced engineer in quality control and diagnosis. If we use the DTTQ2000 software, the training time is much reduced.

5.4. How to Establish SPCD in a Multioperation, Multi-index System

In a multioperation, multi-index system, in order to establish the SPCD we must consider three principles:

Principle 1. A multioperation production line must consider the preceding influence.

1. If there is no preceding influence, the partial quality will be equal to the total quality at the current operation, and we can use the Shewhart control chart (which is only one kind of all-control chart) to control it.
2. If there is a preceding influence, there exist two kinds of quality, total quality and partial quality, at the current operation. Total quality can be controlled by the all-control chart, and partial quality can be controlled with the cause-selecting control chart.
3. Except for the first operation or the above-stated case 1, we can construct a three-chart diagnosis system as shown in Figure 1. Then we can diagnose this diagnosis system according to the typical case diagnosis table, Table 1.

Principle 2. A multi-index production line must consider the correlation among indices.

1. If the indices are not related, we can use a univariate all-control chart to control each index individually.
2. If the indices are related, we need to use a multivariate all-control chart to control the whole index system.

Principle 3. In a multioperation, multi-index system, we need to consider both the preceding influence and the correlation among indices, which makes the problem of implementing the SPCD more complex. The multivariate diagnosis system with two kinds of quality is a method for solving this complex problem, and its implementations show that the theory is in good accordance with actual practice.
6. APPLICATIONS OF THE MULTIVARIATE DIAGNOSIS THEORY WITH TWO KINDS OF QUALITY

Here we show some practical examples of the multivariate diagnosis theory with two kinds of quality.

Example 1

Operations 4 and 5 of a production line for the drug analgin have five indices, three of which belong to the preceding operation; the other two belong to the succeeding operation. Their data are as follows (see the group 51 data in Table 2):

Preceding operation: x₁ = 8.80, x₂ = 97.71, x₃ = 89.11
Succeeding operation: x₄ = 95.67, x₅ = 4.37

Using the DTTQ2000 software, we know that the T² value is 18.693, greater than the upper control limit (UCL) of 13.555 of the T² control chart (Fig. 2), which means that the process is abnormal. Then, by diagnosing with DTTQ2000, we know that index x₅ is abnormal.

Example 2

Using the DTTQ2000 Windows software to diagnose the same Desmear/PTH operations of three printed circuit factories, A, B, and C, we obtained Figure 3. We can then compare and evaluate these three factories.

Table 2 Data for Operations 4 and 5 of Analgin Production Line

Group No.   x₁      x₂      x₃      x₄      x₅     T²       Diagnosis
27          11.70   96.08   84.84   93.88   1.35   11.988   Normal
28          9.70    95.85   86.55   93.51   2.18   11.311   Normal
29          9.70    95.85   86.55   95.24   1.32   3.765    Normal
30          7.66    98.61   91.06   95.34   1.39   4.050    Normal
...
47          9.00    98.42   89.57   95.89   1.17   2.942    Normal
48          8.00    97.24   89.34   95.67   2.98   5.544    Normal
49          8.00    97.24   89.34   95.14   1.77   1.148    Normal
50          8.80    97.71   89.11   95.90   1.25   1.369    Normal
51          8.80    97.71   89.11   95.67   4.37   19.214   Abnormal
Figure 2 T² control chart (UCL = 13.555).

From Figure 3 we see that the Desmear/PTH operation of factory A (Fig. 3a) is under statistical control; but the Desmear/PTH operation of factory B (Fig. 3b) has a record of an average of 1.0 point per month plotted outside the UCL of the T² control chart, and the same operation in factory C (Fig. 3c) has an average of 1.3 points per month plotted outside the UCL of the T² control chart. Hence, factories A, B, and C are in descending order according to the work quality of the Desmear/PTH operation. Thus, the multivariate diagnosis theory with two kinds of quality can be used to give us an objective evaluation of the quality of each factory. This method can also be used to point out their direction of quality improvement.

7. CONCLUSION

1. According to what has been stated above, we can see that the multivariate diagnosis theory with two kinds of quality and its DTTQ2000 Windows software have prospects of being applied in the field of multioperation, multi-index systems. Its greatest merit is that it considers the multivariate characteristics of the multioperation, multi-index system and can control all objects that should be controlled by the system.
2. The implementation of this theory at eleven factories in China shows that production practices are in fair agreement with the theory.
Figure 3 T² control charts for the Desmear/PTH operations of factories A (a), B (b), and C (c).

REFERENCES

Chen ZQ, Zhang G. (1996a). Fuzzy control charts. China Qual, October 1996.
Zhang G. (1980). A new type of quality control chart allowing the presence of assignable cause—the cause-selecting control charts. Acta Electron Sin 2: 1-10.
Zhang G. (1982a). Control chart design and a diagnosis theory with two kinds of quality. Second Annual Meeting of CQCA, February 1982, Guilin, P. R. China.
Zhang G. (1982b). Multiple cause-selecting control charts. Acta Electron Sin 3: 31-36, May 1982.
Zhang G. (1983). A universal method of cause selecting—the standard transformation method. Acta Electron Sin 5: 1983.
Zhang G. (1984a). Cause-Selecting Control Chart: Theory and Practice. The Publishing House of Post and Telecommunication, Beijing, 1984.
Zhang G. (1984b). A new type of control charts—cause-selecting control charts and a theory of diagnosis with control charts. Proceedings of World Quality Congress '84, pp 175-185.
Zhang G. (1985). Cumulative control charts and cumulative cause-selecting control charts. J China Inst Commun 6: 31-38.
Zhang G. (1989). A diagnosis theory with two kinds of quality. Proceedings of the 43rd American Quality Congress Transactions, pp 594-599. Reprinted in TQM, UK, No. 2, 1990.
Zhang G. (1992a). Cause-Selecting Control Chart and Diagnosis. The Aarhus School of Business, Aarhus, Denmark, 1992.
Zhang G. (1992b). Textbook of Quality Management. The Publisher of High Education, 1992.
Zhang G, Dahlgaard JJ, Kristensen K. (1996b). Diagnosing quality: theory and practice. Research Report, MAPP, Denmark, 1996.
Zhang G. (1997). An introduction to the multivariate diagnosis theory with two kinds of quality. China Qual, February 1997, pp 36-39.
Zheng H. (1995). Multivariate theory of quality control and diagnosis. Doctoral dissertation, Beijing University of Aeronautics and Astronautics, 1995.
10
Applications of Markov Chains in
Quality-Related Matters
Min-Te Chao
Academia Sinica, Taipei, Taiwan, Republic of China

1. INTRODUCTION

To evaluate the performance of a control chart, one of the key elements is the average run length (ARL), which is difficult to calculate. However, if the underlying observations can be embedded into a finite Markov chain, then an exact ARL can be found if the observations are discrete, and approximations are available if they are continuous. In this chapter I provide a systematic review of many quality-related topics in situations in which a finite Markov chain can be employed.

The most fundamental statistical system consists of a set of independent random variables. Although this structure contains all the essential features for statistical analysis, and in fact most ideas for statistical analysis may have their origin traced back to this simple case, it nevertheless lacks the versatility to describe the more complex systems that are often encountered in real-life applications. In this chapter I describe the next simplest case, the Markov chain model, under which various quality-related problems can be vividly described and analyzed.

The most striking advantage of a Markov chain is its versatility. It can be used to describe, e.g., intricate deterioration processes and complex maintenance procedures (Neuman and Bonhomme, 1975) with relative ease. Many complicated quality-related processes, when properly arranged, can be embedded into a Markov chain of reasonable size. Since the theory of Markov chains, particularly finite and ergodic Markov chains, is well established, for various applications the essential problem is how to find a reasonably sized Markov chain to describe the underlying quality-related process. Once this is done, the rest of the analysis is standard.
In this chapter I consider various well-known procedures and in each case indicate why such a Markov chain can be constructed. Efforts are placed on exploration rather than on original research. We first list some basic facts about Markov chains (Section 2), and in Section 3 examples are given where some exact results can be obtained. The exact ARL formula and its sampling distribution are given in Section 4. I then introduce in Section 5 the Brook and Evans (1972) approximation technique and show how it can be applied to various CUSUMs.

We have concentrated our efforts mostly on control charts. The Markov chain method, however, can also be applied to other quality systems. A list of these procedures is presented in Section 6.

2. BASIC FACTS ABOUT MARKOV CHAINS

In this section I briefly describe the necessary background on finite Markov chains that will be needed for the rest of this chapter.

Given a sequence of random variables X₁, X₂, ..., the simplest nontrivial structure we can impose is to assume that the X's are independent and identically distributed (i.i.d.). This assumption is often used to describe a sequence of observations of certain quality characteristics for essentially all kinds of control charts. If the X's are correlated, then the probability structure of these X's can be very complicated. One of the simplest nontrivial dependent cases is that of the X's following a Markov process.

Roughly speaking, a sequence {X_n, n ≥ 0} of random variables is Markovian if it has the ability to forget the past when the present status is given. When we say "present," "past," or "future," we implicitly assume that there is a time element. We shall in this respect consider the subscript n of X_n as a time index. Mathematically, the Markov property can be expressed by

$$P[X_{n+1} \in A \mid X_n = x, (X_0, X_1, \ldots, X_{n-1}) \in B] = P[X_{n+1} \in A \mid X_n = x] \qquad (1)$$

for all Borel sets A ⊂ R, B ⊂ Rⁿ. If, in addition to Eq. (1), the X's take values only in a finite set S, which without loss of generality we may assume to be S = {1, 2, ..., s}, then we say that the X's follow a finite Markov chain. For a finite Markov chain, the information contained in Eq. (1) can be summarized into, say,

$$p_{ij}(n) = P[X_{n+1} = j \mid X_n = i] \qquad (2)$$

If, in addition, $p_{ij}(n)$ of (2) is independent of n, then the Markov chain is said to have stationary transition probabilities. In this case, let

$$P = (p_{ij}), \qquad p_{ij} = P[X_{n+1} = j \mid X_n = i] \qquad (3)$$

and let Π = (π₁, π₂, ..., π_s), π_i = P[X₀ = i]. It can be shown that for a Markov chain with stationary transition probabilities the knowledge of Π and the matrix P is sufficient to determine the joint probability distribution of (X₀, X₁, X₂, ...). We call Π the initial distribution and P the (stationary) transition probability matrix.

In what follows, we shall always assume that the Markov chains under consideration are finite with a certain initial distribution and a stationary transition matrix.
A good technical reason to use a matrix P is that we can employ matrix algebra to simplify various calculations. For example, the kth-order transition probability

$$p_{ij}^{(k)} = P[X_{n+k} = j \mid X_n = i] \qquad (4)$$

is simply the (i, j)th element of P^k, the kth power of the transition matrix P, i.e.,

$$p_{ij}^{(k)} = (P^k)_{ij}$$

The entries of P are probabilities, so the row sums of P are unity and the entries themselves are all nonnegative. It may happen that some of the entries of P are 0. But if we look at the sequence P, P², P³, ..., it may happen that at some k > 0 all entries of P^k are strictly positive. If this is the case, it means that if one starts from any state i, in k steps it is possible to reach state j, and this holds true for all 1 ≤ i, j ≤ s. If $p_{ij}^{(k)} > 0$ for some k > 0 and for all 1 ≤ i, j ≤ s, then we say that the Markov chain is irreducible.

Let $f_j^{(n)}$ be the probability that, for a Markov chain starting from state j, the first time it goes back to the jth state is at time n, i.e.,

$$f_j^{(n)} = P[X_1 \ne j, X_2 \ne j, \ldots, X_{n-1} \ne j, X_n = j \mid X_0 = j]$$

Let

$$\mu_j = \sum_{n=1}^{\infty} n f_j^{(n)} \qquad (5)$$

The quantity μ_j is the average time at which a Markov chain starting from state j returns to state j, and it is called the mean recurrence time for state j. If μ_j < ∞, then state j is said to be ergodic. If μ_j < ∞ for all j ∈ S, then we say that the Markov chain is ergodic.

If a Markov chain is irreducible and ergodic, then the limits

$$u_j = \lim_{n \to \infty} p_{ij}^{(n)} \qquad (6)$$

exist and are independent of the initial state i. Furthermore, u_j > 0, $\sum_{j=1}^{s} u_j = 1$, and

$$u_j = \sum_{i=1}^{s} u_i p_{ij} \qquad (7)$$

The vector u = (u₁, u₂, ..., u_s) is called the absolute stationary probability. If Π = u, then it can be shown that

$$P[X_n = j] = P[X_0 = j] \qquad (8)$$

for all j ∈ S and for all n ≥ 0; i.e., the Markov chain is stationary (instead of just having a stationary transition probability).

An interesting feature of Eq. (6) is that its rate of convergence is geometric. Let U be an s × s matrix consisting of identical rows, where each row is u. Then by (7), PU = UP = U, so by induction we have

$$P^n - U = (P - U)^n \qquad (9)$$

The fact that (P − U)ⁿ → 0 exponentially fast follows from the Perron-Frobenius theorem, and since it is a little bit too technical we shall not pursue it further. This basically explains why, for a well-behaved Markov chain, the series in (5) usually converges: it is essentially a geometric series. Also, the long-term behavior of an ergodic Markov chain is independent of its initial distribution.
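
For example, the limit in (6) and the stationary vector u of (7) can be computed numerically by raising P to a high power, which converges geometrically by (9); a sketch with an illustrative 3-state chain:

    import numpy as np

    P = np.array([[0.9, 0.1, 0.0],
                  [0.1, 0.8, 0.1],
                  [0.0, 0.2, 0.8]])       # an irreducible, ergodic chain

    Pk = np.linalg.matrix_power(P, 200)   # rows of P^k converge to u
    u = Pk[0]
    print(u, u @ P)                       # u satisfies u = uP, Eq. (7)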
Let A be a subset of S, and let T = inf{n ≥ 1 : X_n ∈ A}. Then T is the first entrance time to the set A. For a control chart modeled by a Markov chain, the set A may consist of the region where an alarm should be triggered when X_n ∈ A occurs for the first time. Thus T is the time when the first out-of-control signal is obtained, and E(T) is closely related to the concept of average run length (ARL). When the control charts become more involved, the exact or approximate calculation of ARLs becomes involved or impossible with elementary methods. However, most (not all) control charts can be properly modeled by a Markov chain, and essentially all methods developed to calculate ARLs are more or less based on the possibility that one can embed the control scheme into a Markov chain.

3. DISCRETE CASE: EXACT RESULTS

In this section we discuss cases for which an exact finite Markov chain can be found to describe the underlying control chart. I first describe a general scenario in which such a representation can be arranged and explain why it can be done.

Assume that the basic observations are X₁, X₂, ..., which are i.i.d. and take values in a finite set A of size k. The key point is that the X's are discrete and the set A is finite. This may be the case when either the X's themselves are discrete or the X's can be discretized.

Most control charts are of "finite memory"; i.e., at time n the decision of whether to flag an out-of-control signal depends on X_{n−r+1}, ..., X_n only. In other words, we may trace back to consult the recent behavior of the observations to decide whether the chart is out of control, but we do it for at most r steps back, r < ∞. The case for which we have to trace back to the infinite past is excluded.

Let Y_n = (X_{n−r+1}, ..., X_n). The random vector Y_n can take as many as s = kʳ < ∞ possible values. It is easy to see that the Y's follow a Markov chain with an s × s transition matrix. Since at time n, Y_n is used to decide whether the process is out of control, we see that, conceptually at least, for the scenario described above there exists a finite Markov chain by which the behavior of the control chart can be completely determined.

However, s = kʳ can be a very large number, so the s × s matrix can be too large to have practical value. Fortunately, this matrix is necessarily sparse (i.e., most entries are 0), and if we take a good look at the rules of the control chart, then chances are we may find some means to drastically reduce the number of states of the Markov chain. Hence, to implement our general observation, we need case-by-case technical work for various control charts.

Example 1. The Standard X̄ Chart

If subgroups of size n are used against ±3σ limits, define, for each $\bar{X}_n$, the coded value

$$X_n = \begin{cases} 1 & \text{if } \bar{X}_n \text{ falls below the lower 3-sigma limit} \\ 2 & \text{if } \bar{X}_n \text{ falls between the limits} \\ 3 & \text{if } \bar{X}_n \text{ falls above the upper 3-sigma limit} \end{cases}$$

Note that the X's are the coded values of the X̄'s. As long as our only concern is whether the process is under control, the behavior of the X̄ chart can be completely determined by the coded X's. The coded X's are still i.i.d., and this is a special kind of Markov chain. Its transition matrix, when the process is under control, consists of three identical rows:

$$P = \begin{bmatrix} p_1 & p_2 & p_3 \\ p_1 & p_2 & p_3 \\ p_1 & p_2 & p_3 \end{bmatrix}$$

where p₁ = p₃ = Φ(−3) and p₂ = Φ(3) − Φ(−3), where Φ denotes the cumulative distribution function of the standard normal distribution.

We can do similar things for the standard R chart.

Example 2. Shewhart Control Chart with Supplementary Runs Rules

We often include additional run rules on a standard control chart to increase its sensitivity in detecting a mean shift. For example,

Rule T45. If four of the last five observations are between −3 and −1 or between 1 and 3, then a signal is suggested.

The well-implemented Western Electric Company (1965) rules also fall into this category.

If we want to implement rule T45 (in addition to the standard ±3σ rules), we first need to divide the real line into five disjoint intervals

$$I_i = (d_{i-1}, d_i]$$

with −d₀ = d₅ = ∞ and d_i = −5 + 2i, 1 ≤ i ≤ 4. Hence an s = 5⁵ = 3125-state Markov chain is sufficient to describe this situation. But a 3125 × 3125 matrix is too large even for today's computers, so well-devised tricks are needed to drastically reduce the value of s. It turns out that it is possible to use a 30-state Markov chain, which is of moderate size.

Rule T45 can be replaced by other run rules or some combination of them. The idea is that in many cases we may drastically cut the size s, and it is possible to find a constructive method to implement such a simplification. Hence this type of problem (evaluating the exact ARLs and run length distributions for Shewhart control charts with various supplementary runs rules) is mathematically tractable. For technical details, the full method is documented in Champ and Woodall (1987), and programs are available in Champ and Woodall (1990).

Example 3. Discrete CUSUM

Assume that Y_n, the i.i.d. observations for a quality control scheme, are integer-valued and that a one-sided CUSUM (Van Dobben de Bruyn, 1968) is under consideration. Define S₀ = 0 and

$$S_n = \max(0,\ Y_n + S_{n-1}), \qquad n = 1, 2, \ldots$$

Then the one-sided CUSUM signals an out-of-control message at stage n when S_n ≥ t. This is a situation in which, at first sight, the decision to signal may depend on all data points up to time n, so it does not appear to fall into the scenario described earlier for a finite Markov chain representation.

We may look at the construction somewhat differently. Since the process stops when S_n ≥ t, the important values for S_n are obviously 0, 1, ..., t − 1, t. When S_n ≥ t, the process stops; hence we may use a (t + 1)-state Markov chain to describe S_n, with the last state behaving like an "absorbing state." We write the transition probability matrix as

$$P = \begin{bmatrix} R & (I - R)\mathbf{1} \\ \mathbf{0} & 1 \end{bmatrix} \qquad (10)$$

where R is t × t, 0 is a 1 × t vector of 0's, and 1 is a t × 1 vector of 1's. A typical entry of R is

$$r_{ij} = P[S_n = j \mid S_{n-1} = i], \qquad 0 \le i, j \le t - 1$$

Expression (10) is typical for control charts represented by a finite Markov chain. Here the ARL is the average time for the process S_n to enter the absorbing state t. In symbols,

$$\mathrm{ARL} = E(N), \qquad N = \inf\{n \ge 1 : S_n \ge t\}$$

Example 4. Two-Sided CUSUM

The CUSUM in Example 3 is one-sided, since it detects upward shifts of the process mean only. For a two-sided (discrete) CUSUM, suppose that integer-valued random variables Y_n and Z_n are observed. Define S_H(0) = S_L(0) = 0, where

$$S_H(n) = \max(0,\ Y_n + S_H(n-1))$$
$$S_L(n) = \min(0,\ Z_n + S_L(n-1))$$

and

$$N = \inf\{n \ge 1 : S_H(n) \ge t_1 \ \text{or}\ S_L(n) \le -t_2\}$$

Normally, we would have Y_n = X_n − k₁ and Z_n = X_n + k₂ for some known integers k₁, k₂. The X's are the basic sequence of the quality characteristic measured for control. The bivariate process (S_H(n), S_L(n)) takes values in {0, 1, ..., t₁} × {0, −1, ..., −t₂}, and it is possible to write a finite Markov chain with s = (t₁ + 1)(t₂ + 1) states (see Lucas and Crosier, 1982). For a two-sided CUSUM, the number of states of the underlying Markov chain can be reduced to about t₁t₂/2 by careful arrangement of states (Woodall, 1984). However, it is not known whether we can always reduce the Markov chain of the two-sided CUSUM to a linear function of t₁ + t₂.

4. GENERAL RESULTS

I have demonstrated with examples that it is often possible to represent a control chart by a Markov chain with a transition probability matrix of the form (10); i.e., states 0, 1, ..., t − 1 are transient states, and one state, state t, is absorbing. Let N_i be the number of stages, starting from state i ∈ {0, 1, ..., t − 1}, to reach the absorbing state for the first time. Then it follows from standard Markov chain theory that the lth factorial moment of N_i, i.e.,

$$\mu_i^{(l)} = E\big(N_i(N_i - 1) \cdots (N_i - l + 1)\big) \qquad (11)$$

can be found via the matrix equation

$$\left(\mu_0^{(l)}, \mu_1^{(l)}, \ldots, \mu_{t-1}^{(l)}\right)' = l!\, R^{l-1}(I - R)^{-l}\,\mathbf{1} \qquad (12)$$

where 1 is a t × 1 vector of 1's. Furthermore, the run length distributions of N₀, N₁, ..., N_{t−1} are given by

$$(P[N_0 = r], P[N_1 = r], \ldots, P[N_{t-1} = r])' = R^{r-1}(I - R)\mathbf{1} \qquad (13)$$

for r = 1, 2, ... .

What we have described can be roughly summarized as follows. If we can find a finite Markov chain describing the behavior of a control scheme in the form of Eq. (10), then all problems concerning the ARLs of the control chart are solved. The only technical concern is that the size of the transition matrix should be manageable.
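
For instance, given the transient-state matrix R for whatever chart is at hand, Eqs. (12) and (13) reduce to a few lines of linear algebra; a sketch:

    import numpy as np

    def arl_and_run_length(R, rmax=50):
        """Mean run lengths from Eq. (12) with l = 1, and the run length
        probabilities P[N_i = r], r = 1..rmax, from Eq. (13)."""
        t = R.shape[0]
        I = np.eye(t)
        arl = np.linalg.solve(I - R, np.ones(t))     # (E N_0, ..., E N_{t-1})
        dist = np.array([np.linalg.matrix_power(R, r - 1) @ (I - R) @ np.ones(t)
                         for r in range(1, rmax + 1)])
        return arl, dist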

5. APPROXIMATIONS: THE CONTINUOUS CASE

When the underlying quality characteristic is continuous, a situation may arise for which we cannot embed the control scheme into a finite Markov chain.

Example 5. One-Sided CUSUM with Continuous Observations

Let us consider a setup identical to that of Example 3 but with the Y's replaced by i.i.d. N(k, 1) random variables, where k is a known positive constant. The run length is N, defined by

$$N = \inf\{n \ge 1 : S_n \ge t\}, \qquad t > 0$$

We proceed to find the distribution of N. It is easy to see that N takes values 1, 2, ... only, and it is sufficient to find

$$P[N > r] = P[S_1 < t, S_2 < t, \ldots, S_r < t]$$

Since t > 0, it is easy to find P[N > 0] = 1 and P[N > 1] = P[Y₁ < t] = Φ(t − k), where Φ is the cumulative distribution function of the standard normal distribution.

The case of P[N > 2] is more complicated. By definition,

$$P[N > 2] = P[S_1 < t, S_2 < t] = \int_0^t P[S_2 < t \mid S_1 = x]\, dF_1(x) \qquad (14)$$

where F₁ is the cumulative distribution of S₁, i.e.,

$$F_1(x) = P[S_1 \le x] = \Phi(x - k), \qquad 0 \le x < t \qquad (15)$$

The complication in (15) results from the fact that

$$P[S_1 = 0] = P[Y_1 \le 0] = \Phi(-k) > 0$$

hence the random variable S₁ is neither continuous nor discrete: it has a jump at S₁ = 0 and is continuous on (0, ∞). Substituting (15) into (14) and omitting the algebra, we have

$$P[N > 2] = \Phi(t - k)\Phi(-k) + \int_0^t \Phi(t - x - k)\,\phi(x - k)\, dx$$

where φ is the probability density function of the standard normal.

But since the last integral has no simple closed form, this is about as far as we can go analytically. (We can find P[N > 3] in more complicated forms, but the situation quickly runs out of our control when we try to find P[N > r] for r = 3, 4, ... .) This basically shows that there is no easy way to calculate the exact ARL for the one-sided CUSUM chart if the observations are i.i.d. normal. It also demonstrates that approximate methods are necessary to find the ARL of the standard one-sided CUSUM chart.
The basic idea of how to find approximate ARLs is due to Brook and Evans (1972). Since we can find the exact ARL of a CUSUM when the Y's are discrete, it is natural, when the observations are continuous, to discretize the Y's first. The exact ARL for the discrete version of the CUSUM then serves as an approximation of the exact ARL for the continuous case. Specifically, for the situation described in Example 5, let X_n be Y_n rounded to the nearest point of the grid {k + jw : j = 0, ±1, ±2, ...}, so that

$$P[X_n = k + jw] = P\big[(j - \tfrac{1}{2})w < Y_n - k \le (j + \tfrac{1}{2})w\big] = \Phi\big((j + \tfrac{1}{2})w\big) - \Phi\big((j - \tfrac{1}{2})w\big) \approx w\,\phi(jw)$$

if w, the cell width of our "roundoff" procedure, is small. Since |X_n − Y_n| ≤ w for all n, we would intuitively expect X_n ≈ Y_n, and ARLs based on the X's, which we may find exactly via the Markov chain method, can be used to find a reasonable approximation of the ARLs for the original CUSUM based on continuous distributions.

How small should w be in order to induce a reasonable approximation? Very little is known mathematically, although we believe it is workable. However, it is reported (Brook and Evans, 1972) that it is possible to obtain agreement to within 5% of the limiting value when t = 5 and to within 1% when t = 10.
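
A sketch of the resulting computation for the CUSUM of Example 5 (state i represents the discretized CUSUM value i·w; the alarm threshold and the number of transient states m are design inputs, and the code follows the standard Brook-Evans construction rather than any particular published program):

    import numpy as np
    from scipy.stats import norm

    def cusum_arl_brook_evans(k, thresh, m=10):
        """Approximate ARL of S_n = max(0, S_{n-1} + Y_n), Y_n ~ N(k, 1),
        signalling when S_n >= thresh, via an m-state Markov chain."""
        w = 2 * thresh / (2 * m - 1)       # cell width; state i holds value i*w
        R = np.zeros((m, m))
        for i in range(m):
            s = i * w
            R[i, 0] = norm.cdf(w / 2 - s - k)             # fall back to S = 0
            for j in range(1, m):
                R[i, j] = (norm.cdf(j * w - s + w / 2 - k)
                           - norm.cdf(j * w - s - w / 2 - k))
        arl = np.linalg.solve(np.eye(m) - R, np.ones(m))  # Eq. (12), l = 1
        return arl[0]                                     # zero-start ARL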
The basic idea of Brook and Evans can be applied to various CUSUMs. Since the basic concept is the same, we shall only list these cases. Successful attempts have been reported for the two-sided CUSUM (Woodall, 1984) and the multivariate CUSUM (Woodall and Ncube, 1985). In these cases, however, the sizes of the transition probability matrices increase exponentially with the dimension of the problem, and so far no efficient way to drastically reduce the matrix size is known. The Brook and Evans technique also applies to weighted CUSUMs (Yashchin, 1989), CUSUMs with variable sampling intervals (Reynolds et al., 1990), and exponentially weighted moving average schemes (Saccucci and Lucas, 1990). In all these examples, the control scheme can be described in the form

$$S_{i,n} = g_i(X_n, S_{i,n-1}), \qquad i = 1, 2, \ldots, m$$

where the g_i are fixed functions and the X's are i.i.d., continuous or discrete. For example, for the two-sided CUSUM we have m = 2 and

$$S_{1,n} = \max(0,\ X_n - k_1 + S_{1,n-1}), \qquad S_{2,n} = \min(0,\ X_n + k_2 + S_{2,n-1})$$

If the S_i's are discretized to t different values, then the control scheme can be approximately described by an s-state Markov chain, s = tᵐ.

Example 6. Another Two-Sided CUSUM

For the standard two-sided CUSUM, a careful arrangement can reduce the need for t² states, where we assume, for simplicity, that t₁ = t₂ = t. If the situation is extremely lucky, it can be reduced to 2t − 1 states; but in general, (t² + t)/2 is about the best we can do (Woodall, 1984). Hence even for the two-sided CUSUM, the Brook and Evans technique has its limitations. Another way to look at the problem is to consider a slightly different two-sided CUSUM. The version below was suggested by Crosier (1986).

Let S₀ = 0 and define C_n, S_n recursively by

$$C_n = |S_{n-1} + X_n|$$
$$S_n = \begin{cases} 0 & \text{if } C_n \le k \\ (S_{n-1} + X_n)(1 - k/C_n) & \text{if } C_n > k \end{cases}$$

This is clearly Markovian. Since there is essentially one equation describing the control scheme, there is no difficulty in using a t-state Markov chain.

6. OTHER APPLICATIONS

So far we have limited our discussion to control charts. However, the
Markov chain technique is so versatile that it can be applied to many
quality-related topics.
A main area of application concerns various continuous sampling
plans with attribute-type observations. All these plans are based on a
sequence of i.i.d. discrete observations, and the decision related to these
plans is normally based on at most a finite number of observations counted
backward. This fits into our general scenario of Markov chains, and the
only technical problem left is to find a Markov chain of reasonable size.
Most continuous sampling plans (three versions of CSP-1 and CSP-k,
k = 2, 3, 4, 5) can be embedded into a proper Markov chain (see Blackwell,
1977). The ANSI/ASQC Z1.4 plan falls into this category (Grinde et al.,
1987; Brugger, 1989). Other examples include the two-stage chain sampling
plan (Stephens and Dodge, 1976), the skip-lot procedure (Brugger, 1975),
process control based on within-group ranking (Bakir and Reynolds, 1979),
the startup demonstration test (Hahn and Gage, 1983), and precontrol sampling
plans (Salvia, 1987).
A more important application of a Markov chain is to study the
behavior of the quality scheme, be it discrete or continuous, when the basic
observations are correlated. Very little is known in this respect when the
observations are continuous. But if they are discrete, we may model the
dependence by assuming that the basic observations also follow a finite
Markov chain. In the expression shared by many quality systems,

S_n = g(X_n, S_{n-1})

we see that S_n follows a Markov chain if X_n follows a Markov chain. Hence
the general idea described in Section 4 still applies. However, studies in this
respect, although workable, are rare in the literature. The only related work
seems to be Chao (1989).
The Markov chain method also finds application in various linearly
connected reliability systems. A general treatment can be found in Chao and
Fu (1991). Readers are referred to the review article by Chao et al. (1995).

7. CONCLUSION

In this chapter I have demonstrated, with examples and general scenario
descriptions, that it is often possible to define a Markov chain such that the
underlying quality control scheme can be completely described by this
Markov chain.

To evaluate the system performance of a control chart, or other qual-
ity-related schemes, perhaps the most difficult quantity to calculate is the
ARL and its associated sampling distributions. The Markov chain techni-
que provides a general means for accomplishing this task.

ACKNOWLEDGEMENT

Research was partially supported by grant NSC-86-2115-M015 from the
National Science Council of ROC.

REFERENCES

Bakir ST, Reynolds MR Jr. (1979). A nonparametric procedure for process control
based on within-group ranking. Technometrics 21:175-183.
Blackwell MTR. (1977). The effect of short production runs on CSP-1.
Technometrics 19:259-263.
Brook D, Evans DA. (1972). An approach to the probability distribution of
CUSUM run length. Biometrika 59:539-549.
Brugger RM. (1975). A simplification of skip-lot procedure formulation. J Qual
Technol 7:165-167.
Brugger RM. (1989). A simplified Markov chain analysis of ANSI/ASQC Z1.4 used
without limit numbers. J Qual Technol 21:97-102.
Champ CW, Woodall WH. (1987). Exact results for Shewhart control charts with
supplementary runs rules. Technometrics 29:393-399.
Champ CW, Woodall WH. (1990). A program to evaluate the run length distribution
of a Shewhart control chart with supplementary runs rules. J Qual Technol 22:
68-73.
Chao MT. (1989). The finite time behavior of CSP when defects are dependent.
Proceedings of the National Science Council, ROC, Part A 13:18-22.
Chao MT, Fu JC. (1991). The reliability of large series systems under Markov
structure. Adv Appl Prob 23:894-908.
Chao MT, Fu JC, Koutras MV. (1995). Survey of reliability studies of consecutive-k-
out-of-n:F and related systems. IEEE Trans Reliab 44:120-127.
Crosier RB. (1986). A new two-sided cumulative sum quality control scheme.
Technometrics 28:187-194.
Grinde R, McDowell ED, Randhawa SU. (1987). ANSI/ASQC Z1.4 performance
without limit numbers. J Qual Technol 19:204-215.
Hahn GJ, Gage JB. (1983). Evaluation of a start-up demonstration test. J Qual
Technol 15:103-106.
Lucas JM, Crosier RB. (1982). Fast initial response for CUSUM control schemes.
Technometrics 24:199-205.
Neuman CP, Bonhomme NM. (1975). Evaluation of maintenance policies using
Markov chains and fault tree analysis. IEEE Trans Reliab 24:37-45.
Reynolds MR Jr, Amin RW, Arnold JC. (1990). CUSUM charts with variable
sampling intervals. Technometrics 32:371-384.
Saccucci MS, Lucas JM. (1990). Average run lengths for exponentially weighted
moving average control schemes using the Markov chain approach. J Qual
Technol 22:154-162.
Salvia AA. (1987). Performance of pre-control sampling plans. J Qual Technol 19:
85-89.
Stephens KS, Dodge HF. (1976). Two-stage chain sampling inspection plans with
different sample sizes in the two stages. J Qual Technol 8:209-224.
Van Dobben de Bruyn CS. (1968). Cumulative Sum Tests: Theory and Practice.
Statistical Monographs and Courses No. 24, Griffin.
Western Electric Company. (1965). Statistical Quality Control Handbook. Western
Electric Co., Indianapolis.
Woodall WH. (1984). On the Markov chain approach to the two-sided CUSUM
procedure run length distribution. Technometrics 26:41-46.
Woodall WH, Ncube MM. (1985). Multivariate CUSUM quality-control proce-
dures. Technometrics 27:285-292.
Yashchin E. (1989). Weighted cumulative sum technique. Technometrics 31:321-338.
11
Joint Monitoring of Process Mean and
Variance Based on the Exponentially
Weighted Moving Averages
Fah Fatt Gan
National University of Singapore, Singapore, Republic of Singapore

1. INTRODUCTION

The Shewhart chart based on the sample mean X̄ was first developed to
monitor a process mean. The chart was then modified to plot the sample
range R to monitor a process variance. Each chart was developed assuming
that the other process characteristic is in control. The more advanced chart-
ing procedures such as the cumulative sum (CUSUM) and exponentially
weighted moving average (EWMA) charts were later developed based on the
same basic assumption. This has led to the design and evaluation of perfor-
mance of the mean and variance charts separately. This kind of analysis
might mislead quality control engineers into making inferences concerning
the mean or the variance chart without making reference to the other.
Experience with real manufacturing processes has shown that the process
variance tends to increase with the process mean. A decrease in the variance
when the mean is in control is highly desirable, but if a decrease in the
variance is accompanied by a decrease in the mean, then it is highly undesir-
able. Gan (1995) gave an example of a process with a decrease in the var-
iance coupled with a change in the mean and showed that this process state
is difficult to detect. The mean chart becomes insensitive to the change in the
mean because the variance of the sample mean has become smaller. Any
detection of a decrease in the variance with the mean appearing to be in
control could lead to the false conclusion that the process has improved. In


short, the problem of monitoring the mean and variance is a bivariate one,
and both the mean and variance charts need to be looked at jointly in order
to make meaningful inferences.
The use of combined schemes involving simultaneous mean and var-
iance charts based on the EWMAs of the sample mean and variance is discussed
in Section 2. The average run length (ARL) performance of the various
schemes is assessed in Section 3. A simple design procedure for a combined
EWMA scheme with an elliptical "acceptance" region is given in Section 4.
A real data set from the semiconductor industry is used to illustrate the
design and implementation in Section 5.

2. JOINT MONITORING OF PROCESS MEAN AND VARIANCE

Consider the simulated data set given in Gan (1995). The data set comprises
80 samples, each of sample size n = 5. The first 40 samples were generated
from the normal distribution N(μ₀, σ₀²), where μ₀ = 1 and σ₀² = 1, and the rest
were from N(μ₀ + 0.4σ₀/√n, (0.9σ₀)²). Thus, the process was simulated to be
in control for the first 40 samples, and between the 40th and 41st samples the
mean shifted upward to μ₀ + 0.4σ₀/√n and the variance decreased to
(0.9σ₀)². An EWMA chart for monitoring the mean is obtained by plotting Q₀
= μ₀ and Q_t = (1 - λ_M)Q_{t-1} + λ_M X̄_t against the sample number t, where X̄_t
is the sample mean at sample number t. A signal is issued if Q_t > h_M or
Q_t < -h_M. Similarly, an EWMA chart for monitoring the variance is
obtained by plotting q₀ = E[log(S_t²)] = -0.270 (when σ² = σ₀²) and q_t =
(1 - λ_V)q_{t-1} + λ_V log(S_t²), where S_t² is the sample variance at sample number
t. A signal is issued if q_t > h_V or q_t < -h_V. More details on the EWMA
charts can be found in Crowder (1987, 1989), Crowder and Hamilton (1992),
Lucas and Saccucci (1990), and Chang (1993). The EWMA mean and
variance charts based on the parameters given in Gan (1995, Table 2, p.
448, scheme EE) are constructed for the data and displayed in Figure 1.
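As a hedged illustration (this code is not from the chapter, and the chart limits h_M and h_V still have to be chosen to give a target ARL), the two recursions can be computed as follows:

    import numpy as np

    def ewma_charts(samples, mu0, lam_m, lam_v, q0=-0.270):
        # samples: array of shape (number of samples, n); q0 = E[log(S_t^2)]
        # when the process variance is at its in-control value (n = 5 here).
        Q, q = [mu0], [q0]
        for x in samples:
            Q.append((1 - lam_m) * Q[-1] + lam_m * np.mean(x))
            q.append((1 - lam_v) * q[-1] + lam_v * np.log(np.var(x, ddof=1)))
        return np.array(Q[1:]), np.array(q[1:])

    rng = np.random.default_rng(0)
    data = rng.normal(1.0, 1.0, size=(40, 5))   # in control: mu0 = 1, sigma0 = 1
    Q, q = ewma_charts(data, mu0=1.0, lam_m=0.14, lam_v=0.16)
    # A signal is issued whenever Q_t or q_t crosses its chart limits.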
A quality control engineer has to constantly combine the information
in the two charts (which might not be easily done in practice) to make
meaningful inferences. To ensure that the charts are interpreted correctly,
the two charts could be combined into one, and this can be done by plotting
the EWMA of log(S²) against the EWMA of X̄ as shown in Figure 2. The
chart limits of the two charts form the four sides of a rectangular "accep-
tance" region. Any point that falls within the region is considered an in-
control point (for example, points A and B), and any point that falls outside
the region is considered an out-of-control point (for example, point C). The
thick bar on the plot is not an out-of-control region but represents the most


Figure 1 EWMA charts based on X̄ and log(S²) for a simulated data set where the
first 40 samples were generated from the normal distribution N(μ₀, σ₀²), where μ₀ = 1
and σ₀² = 1, and the rest were from N(μ₀ + 0.4σ₀/√n, (0.9σ₀)²).

desirable state, where the mean is on target and the variance has decreased
substantially.
The advantage of this charting procedure is immediate: Any inference
made can be based on both the EWMAs jointly. The interpretation of an
out-of-control signal is easier because the position of the point gives an
indication of both the magnitude and direction of the process shift.
However, the order of the points is lost if they are plotted on the same
Figure 2 A combined EWMA scheme with a rectangular acceptance region.

plot. To get around this problem, each point can be plotted on a new plot in
a sequence as shown later in Figure 13. The disadvantage is that it is not as
compact as the traditional procedure illustrated in Figure 1.
The traditional way of plotting the mean and variance charts sepa-
rately [see, for example, Gan (1995)] amounts to plotting the EWMA of
log(S²) against the EWMA of X̄ and using a rectangular acceptance region
for making decisions. The main problem with a rectangular acceptance
region is that both points A and B (see Fig. 2) are considered in control,
although it is fairly obvious that point B represents a far more undesirable
state than that of point A. An acceptance region that is more reasonable
would be an elliptical region as shown in Figure 3. Takahashi (1989) inves-
tigated an elliptical type of acceptance region for a combined Shewhart
scheme based on X̄ and S or R. An economic statistical design for the X̄
and R charts was given by Saniga (1989). A point is considered out of
control if it is outside the elliptical acceptance region. For example, point
B is an out-of-control point, but point A is an in-control point, for the
elliptical region given in Figure 3. This chart is called a bull's-eye chart,
as any hit on the bull's-eye will provide evidence of the process being on
target.
For the same smoothing constants λ_M and λ_V, in order for an EWMA
scheme with an elliptical region to have the same ARL as the EWMA
scheme with a rectangular region, the chart limits of the mean and variance
charts have to be slightly larger, as shown in Figure 4. The idea of an
elliptical region comes from the Hotelling statistic to be discussed later.
Point A is an in-control point for the rectangular region, but it is an out-of-
control point for the elliptical region. Similarly, point B is an out-of-control
point for the rectangular region but an in-control point for the elliptical
region. Thus, an elliptical region would be expected to be more sensitive in
detecting large changes in both the mean and variance and less sensitive in

Figure 3 A combined EWMA scheme with an elliptical acceptance region.
Figure 4 A combined EWMA scheme with both rectangular and elliptical acceptance regions.

detecting a large shift in one process characteristic when there is little or no
change in the other characteristic.
A Shewhart bull's-eye chart and an EWMA bull's-eye chart are dis-
played in Figure 5. The Shewhart bull's-eye chart displays 10,000 random
points (X̄, log(S²)) when the process is in control. The EWMA bull's-eye
chart displays the EWMA of the points (X̄, log(S²)). Both charts show
that the points are roughly distributed within elliptical regions; hence an
elliptical region is a natural and more appropriate decision region for a
Shewhart or EWMA bull's-eye chart.
An equivalent decision procedure for the EWMA bull's-eye chart is to
check the distance of the point (Q_t, q_t) from the center (μ₀, E[log(S²)]) and
declare the point out of control if

((Q_t - μ₀)/h_M)² + ((q_t - E[log(S²)])/h_VU)² > 1


Figure 5 A Shewhart bull's-eye chart and an EWMA bull's-eye chart based on 10,000
random points (X̄, log(S²)) from an in-control normal distribution.

for a point (Q_t, q_t) located above the horizontal line q_t = E[log(S²)], or if

((Q_t - μ₀)/h_M)² + ((q_t - E[log(S²)])/h_VL)² > 1

for a point (Q_t, q_t) located below the horizontal line, where h_M is the chart
limit of the mean chart and h_VU and h_VL are the upper and lower chart limits
of the variance chart. For a point above the horizontal line,

T² = ((Q_t - μ₀)/h_M)² + ((q_t - E[log(S²)])/h_VU)²

which is a Hotelling type of statistic. This statistic is similar to the one
proposed by Lowry et al. (1992). Thus, another way of implementing the
bull's-eye chart is to plot the Hotelling-type statistic T² against the sample
number t as shown in Figure 6, which I shall refer to as a multivariate
EWMA T² chart.
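A small sketch of this decision rule (illustrative only; it assumes the elliptical form reconstructed above, with separate vertical semi-axes h_VU and h_VL to reflect the skewness of log(S²)):

    def bulls_eye_t2(Q_t, q_t, mu0, q0, h_m, h_vu, h_vl):
        # Hotelling-type statistic for the bull's-eye chart; a value greater
        # than 1 places the point outside the elliptical acceptance region.
        h_v = h_vu if q_t >= q0 else h_vl    # semi-axis above/below q = q0
        return ((Q_t - mu0) / h_m) ** 2 + ((q_t - q0) / h_v) ** 2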
The main problem with this charting procedure is that when a signal is
issued, the chart does not indicate which process characteristic gives rise to
the signal. The omnibus EWMA chart proposed by Domangue and Patch

Figure 6 A multivariate EWMA T² chart for a simulated data set where the first
40 samples were generated from the normal distribution N(μ₀, σ₀²), where μ₀ = 1 and
σ₀² = 1, and the rest were from N(μ₀ + 0.4σ₀/√n, (0.9σ₀)²).

(1991) has a similar problem of interpretation. The T² statistic indicates only
the magnitude and not the direction of the shift. In process monitoring, the
direction of a shift is at least as important as the magnitude of the shift. An
improvement to the T² chart is to include a bull's-eye chart of the points with
the most recent point plotted as a solid black dot. Although all the points are
shown in this bull's-eye chart, in order to avoid overcrowding of points only
the most recent 40 points, for example, are plotted each time. The bull's-eye
chart will provide the information on both the magnitude and direction of
any process shift. The interpretation of the T² statistic is simple and easy to
understand with the bull's-eye chart. Mason et al. (1995) proposed a certain
decomposition of the Hotelling statistic for interpretation of the state of a
process. Their method is mathematically more complicated and hence harder
for a quality control engineer to understand and appreciate.

3. COMPARISON OF SCHEMES BASED ON ARL

For a comparison of schemes based on the ARL, the in-control mean and
variance are assumed to be μ₀ = 0 and σ₀² = 1, respectively. Each sample
comprises n = 5 normally distributed observations. The means and variances
investigated are given by μ = μ₀ + Δσ₀/√n and σ = δσ₀, where Δ = 0.0,
0.2, 0.4, 1.0, and 3.0 and δ = 0.50, 0.75, 0.95, 1.00, 1.05, 1.25, and 3.00.
Combined schemes with rectangular and elliptical acceptance regions are
compared in this section. All the schemes have an approximate ARL of
250. The ARL values of the schemes EEr and SSr (combined EWMA and
Shewhart schemes with rectangular acceptance regions) were computed
exactly using the integral equation approach given in Gan (1995). The
rest were simulated. Alternatively, the ARL of the EWMA bull's-eye
chart EEe could be computed by using the Markov chain approach of
Brook and Evans (1972) or the integral equation approach. Waldmann's
method (Waldmann, 1986a, 1986b) could be used here for approximating
the run length distribution of a bull's-eye chart. Let the starting values of the
EWMA mean and variance charts be u and v, respectively; then the ARL
function L_e(u, v) of the combined scheme with an elliptical acceptance
region B can be derived as

L_e(u, v) = 1 + (1/(λ_M λ_V)) ∫∫_B L_e(q, r) f_X̄((q - (1 - λ_M)u)/λ_M) f_log(S²)((r - (1 - λ_V)v)/λ_V) dq dr

where f_X̄ and f_log(S²) are the probability density functions of X̄ and log(S²),
respectively.
The schemes CC (combined CUSUM scheme with a rectangular
acceptance region) and EEr are the same as those given in Gan (1995).
The combined scheme CC consists of a two-sided CUSUM mean chart
and a two-sided CUSUM variance chart. This scheme is obtained by plot-
ting S₀ = T₀ = 0.0, S_t = max[0, S_{t-1} + X̄_t - k_M], and T_t = min[0,
T_{t-1} + X̄_t + k_M] against the sample number t for the mean chart and by
plotting S₀ = T₀ = 0.0, S_t = max[0, S_{t-1} + log(S_t²) - k_VU], and T_t =
min[0, T_{t-1} + log(S_t²) + k_VL] against t for the variance chart. The chart
parameters of the various schemes are given in Table 1. More details on the
CUSUM charts can be found in Gan (1991) and Chang (1993). The ARL
comparisons are summarized in Table 2.
The ARL values of the combined schemes CC, EEe, and SSe were simu-
lated such that an ARL that is less than 10 has a standard error of 0.01; an
ARL that is at least 10 but less than 50 has a standard error of 0.1; an ARL
that is at least 50 but less than 100 has a standard error of about 0.2; and an
ARL that is at least 100 has a standard error of about 1.0.
EEr versus CC. The performances of these two schemes are similar
except that when there is a small shift in the mean and a small decrease in
the variance, the EEr scheme is much more sensitive. When there is a large
increase in the variance, the EEr scheme is marginally less sensitive.
EEr versus EEe. The performances of these two schemes are similar.
The EEe scheme is generally more sensitive than the EEr scheme in detect-
ing increases in the variance and less sensitive in detecting decreases in the
variance for the various means investigated.

Table 1 Control Chart Parameters of Combined Schemes

Scheme    Acceptance region    Control chart parameters

Table 2 Average Run Lengths of Combined Schemes with Respect to the
Process Mean (μ₀ + Δσ₀/√n) and Standard Deviation (δσ₀)

  Δ     δ      CC     EEr    EEe    SSr    SSe
 0.00  0.50    5.9    5.8    6.4   68.9  153.0
 0.00  0.75   24.8   21.9   24.7  322.4  612.9
 0.00  0.95  284.4  236.7  254.8  364.2  426.0
 0.00  1.00  253.6  252.3  252.7  252.2  252.7
 0.00  1.05  138.3  137.1  129.3  161.8  145.4
 0.00  1.25   19.2   18.9   18.1   31.1   24.8
 0.00  3.00    2.5    2.6    2.5    1.2    1.2
 0.20  0.50    5.9    5.8    6.3   68.9  150.1
 0.20  0.75   24.7   21.8   22.8  319.8  592.5
 0.20  0.95  167.8  135.3  136.6  328.8  377.6
 0.20  1.00  145.8  129.7  127.4  228.0  227.9
 0.20  1.05   96.2   88.7   82.3  148.2  133.3
 0.20  1.25   18.4   18.0   17.0   30.0   23.9
 0.20  3.00    2.5    2.6    2.5    1.2    1.2
 0.40  0.50    5.9    5.8    6.1   68.9  142.0
 0.40  0.75   23.2   20.5   18.5  309.5  530.6
 0.40  0.95   62.3   51.8   52.5  248.4  276.6
 0.40  1.00   56.1   48.8   49.0  173.6  170.8
 0.40  1.05   46.9   41.7   39.8  116.9  105.7
 0.40  1.25   16.2   15.8   14.5   27.1   21.5
 0.40  3.00    2.5    2.6    2.5    1.2    1.2
 1.00  0.50    5.8    5.7    5.0   68.8   97.6
 1.00  0.75   10.0    9.7    8.6  175.7  217.9
 1.00  0.95   10.5   10.2   10.6   64.9   71.7
 1.00  1.00   10.4   10.2   10.5   49.6   50.6
 1.00  1.05   10.3   10.1   10.2   38.3   36.3
 1.00  1.25    8.9    8.8    8.0   15.2   12.2
 1.00  3.00    2.4    2.5    2.4    1.2    1.2
 3.00  0.50    2.5    2.6    2.5    2.3    2.3
 3.00  0.75    2.6    2.6    2.1    2.2    2.6
 3.00  0.95    2.6    2.6    2.8    2.2    2.4
 3.00  1.00    2.6    2.6    2.8    2.2    2.3
 3.00  1.05    2.8    2.6    2.8    2.1    2.3
 3.00  1.25    2.7    2.7    2.7    2.1    2.0
 3.00  3.00    2.1    2.2    2.0    1.1    1.1

SSr versus SSe. The difference in performance is more substantial.
The SSe scheme is more sensitive than the SSr scheme in detecting
increases in the variance but substantially less sensitive in detecting
decreases in the variance, especially for a small change or no change in
the mean. For larger changes in the mean, the difference is smaller.
EWMA Schemes versus Shewhart Schemes. The EWMA schemes are
substantially more sensitive than the Shewhart schemes except for the case
when there is a big change in the variance.
In order to have a better understanding of the performance of these
schemes, 10,000 random samples were simulated for four different sets of
process characteristics: Δ = 0.0 and δ = 0.75, Δ = 0.4 and δ = 1.00, Δ =
0.4 and δ = 0.75, and Δ = 0.4 and δ = 1.05. These are plotted as Figures 7a,
7b, 7c, and 7d, respectively, for the combined Shewhart schemes. The
EWMAs of the points (X̄, log(S²)) are plotted as Figures 7e, 7f, 7g, and
7h, respectively. For Δ = 0.0 and 0.4 and δ = 0.75, the SSr scheme is more
sensitive than the SSe scheme, and this is indicated by Figures 7a and 7c,
which show that there are more points outside the rectangular acceptance
region than there are outside the elliptical region.
For Δ = 0.4 and δ = 1.05, SSe is slightly more sensitive, as indicated
by Figure 7d, which shows that there are more points outside the elliptical
region than outside the rectangular region. For Δ = 0.0, δ = 0.75 and
Δ = 0.4, δ = 0.75, for example, Figures 7a and 7c show that there are
very few points outside the acceptance regions. In sharp contrast, there
are a substantial number of points outside the acceptance region in
Figures 7e and 7g. This explains the substantial difference in the ARLs
of the EWMA and Shewhart schemes. Plots 7a and 7e correspond to the
case when a process improvement has taken place, and this is reflected much
more clearly in an EWMA scheme than in a Shewhart scheme. This means
that the EWMA scheme would be a more effective tool for quality improve-
ment. These plots also suggest that if sufficient points are collected for a
process and the points are plotted on a bull's-eye chart, then the plot will
provide valuable information regarding the overall state of the process
characteristics. The central location and spread of the points could also
be used to estimate graphically the process characteristics.

4. DESIGN OF A EWMA BULL'S-EYE CHART AND MULTIVARIATE EWMA T² CHART

A simple design procedure is provided here for the design of an EWMA
bull's-eye chart. Table 3 contains the chart parameters of EWMA bull's-
eye charts with an in-control ARL of 300 based on a sample size n = 5.

[Figure 7 panels: (7a) Δ = 0.0, δ = 0.75; (7b) Δ = 0.4, δ = 1.00; (7c) Δ = 0.4, δ = 0.75; (7d) Δ = 0.4, δ = 1.05 for the combined Shewhart schemes plotted against X̄, and (7e)-(7h) the same settings for the combined EWMA schemes plotted against the EWMA of X̄.]
Figure 7 Combined Shewhart schemes and combined EWMA schemes based on
10,000 random points from out-of-control normal distributions.

Table 3 Control Chart Parameters of Combined EWMA Schemes with Elliptical
Acceptance Region, In-Control Average Run Length of 300, and Sample Size 5

[Table 3 body: for each smoothing-constant pair (λ_M, λ_V), three rows of chart parameters defining the combined scheme; the numerical entries are not legible in this reproduction.]

Similar tables covering other in-control ARLs and sample sizes are available
from the author. These are obtained by using simulation such that the
simulated in-control ARL has an error of 1.0. The starting value of the
mean chart is given by the in-control mean μ₀, and the starting value of
the variance chart is given by q₀ = E[log(S²)].
Suppose a combined scheme with λ_M = 0.14 and λ_V = 0.16 is desired. Then
the chart parameters of the combined scheme can be obtained from Table 3
easily as follows:
Mean chart:

Variance chart:
The elliptical acceptance region can then be constructed using

((Q_t - μ₀)/h_M)² + ((q_t - E[log(S²)])/h_VU)² = 1

for the elliptical curve above the horizontal line q_t = E[log(S²)] and using

((Q_t - μ₀)/h_M)² + ((q_t - E[log(S²)])/h_VL)² = 1

for the elliptical curve below the horizontal line.
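For drawing the chart, the boundary of this region can be traced as in the sketch below (the numerical limits used here are placeholders, not values taken from Table 3):

    import numpy as np

    def ellipse_boundary(mu0, q0, h_m, h_vu, h_vl, npts=200):
        # Points on the boundary of the acceptance region, using the upper
        # semi-axis h_vu above the line q = q0 and h_vl below it.
        theta = np.linspace(0.0, 2.0 * np.pi, npts)
        h_v = np.where(np.sin(theta) >= 0.0, h_vu, h_vl)
        return mu0 + h_m * np.cos(theta), q0 + h_v * np.sin(theta)

    x, y = ellipse_boundary(mu0=0.0, q0=-0.270, h_m=0.2, h_vu=0.4, h_vl=0.5)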

5. A REAL EXAMPLE

Quality control engineers would like to monitor the mean ball shear strength
of a connection on a microchip. From past process data, the in-control
mean is estimated to be around 72 g, and the standard deviation is estimated
to be around 10 g. A sample of size 5 is taken at regular intervals, and the
ball shear strength of each chip is measured. The chart limits of the schemes
discussed here are chosen such that a combined mean and variance scheme
has an in-control ARL of about 300. The smoothing constants of the
EWMA charts are chosen to be λ_M = 0.14 and λ_V = 0.16.
The individual Shewhart charts of X̄ and log(S²) are displayed in
Figure 8. The individual EWMA charts of X̄ and log(S²) are displayed in
Figure 9. Both variance charts suggest evidence of a decrease in the process
variance. The two mean charts show that the process mean is rather
unstable even though the variance has somewhat stabilized at later samples.
This is an example where the process mean is unstable while the process


Figure 8 Shewhart charts based on X̄ and log(S²) for the ball shear strength data.

Figure 9 EWMA charts based on X̄ and log(S²) for the ball shear strength data.

variance is somewhat stable. This could be due to the production of chips
with different mean ball shear strengths for different batches but with the
variance within a batch being more stable from batch to batch. This points
to the need to search for ways to ensure a more stabilized mean.
A multivariate Shewhart T² chart and an EWMA T² chart for the ball
shear strength data are displayed in Figures 10 and 11, respectively. Both
charts show bigger bursts of activity after the 25th sample. However, the
reasons for these bursts of activity are not clear from the charts. A bull's-eye
chart would help a quality control engineer to have a better understanding
of the process characteristic when an out-of-control signal is issued.


Figure 10 A multivariate Shewhart T² chart for the ball shear strength data.

Figure 11 A multivariate EWMA T² chart for the ball shear strength data.

The Shewhart and EWMA bull's-eye charts are displayed in Figures
12 and 13, respectively. These types of charts should ideally be constructed
using computer programs. The charts continuously provide valuable infor-
mation regarding the process characteristics in a manner that is easily under-
stood by quality control engineers. The EWMA bull's-eye chart shows that
the out-of-control points for samples 6-8 are probably due to decreases in
the mean and variance. Figure 13 also shows that the out-of-control points
at samples 26-28 are probably due to an increase in the mean alone. Similar
conclusions can be drawn from the Shewhart bull's-eye chart. If the process
is in control, then the points on a Shewhart bull's-eye chart will be randomly
scattered. If a sequence of plotted points are all in a particular quadrant,
then the quality control engineer should be on the alert and take samples
more frequently than usual (see Stoumbos and Reynolds, 1996, 1997).
Alternatively, supplementary run rules could be applied to a Shewhart
bull's-eye chart.

6. CONCLUSIONS

Three ways of charting X̄ and log(S²) for the purpose of joint monitoring of
both mean and variance were discussed with respect to ease of implementa-
tion and ease of interpretation. The traditional way of plotting the mean and
variance charts separately amounts to plotting log(S²) against X̄ based on a
rectangular "acceptance" region. Using the justification of a Hotelling-type
statistic, it was shown that an elliptical acceptance region is more natural
and appropriate. This led to the EWMA bull's-eye chart and the multi-
variate EWMA chart based on a Hotelling-type T² statistic. An EWMA
bull's-eye chart provides valuable information on both the magnitude and
direction of a shift in the process characteristics. The multivariate EWMA

chart provides only the magnitude and not the direction of a shift. It is
recommended that an EWMA bull's-eye chart be plotted beside a multivari-
ate T² chart to help quality control engineers gain a better understanding of
the process characteristics. Average run length comparisons show that the
performances of schemes CC and EEr are similar except that when there is a
small shift in the mean and a small decrease in the variance, the EEr scheme
is much more sensitive. When there is a large increase in the variance, the
EEr scheme is marginally less sensitive. The performances of the EEr and
EEe schemes are also found to be similar. The EEe scheme is generally more
sensitive than the EEr scheme in detecting increases in the variance and less
sensitive in detecting decreases in the variance. The difference between SSr
and SSe is more substantial. The SSe scheme is more sensitive than the SSr
scheme in detecting increases in the variance but substantially less sensitive
in detecting decreases in the variance, especially when there is little or no
change in the mean. The EWMA schemes were found to be substantially
more sensitive than the Shewhart schemes except for the case when there is a
big change in the variance. Finally, a simple design procedure for an
EWMA bull's-eye chart was provided.

REFERENCES

Brook D, Evans DA. (1972). An approach to the probability distribution of
CUSUM run length. Biometrika 59:539-549.
Chang TC. (1993). CUSUM and EWMA charts for monitoring a process variance.
Unpublished MSc thesis, Department of Mathematics, National University
of Singapore, Singapore.
Crowder SV. (1987). A simple method for studying run-length distributions of expo-
nentially weighted moving average charts. Technometrics 29:401-407.
Crowder SV. (1989). Design of exponentially weighted moving average schemes.
J Qual Technol 21:155-162.
Crowder SV, Hamilton MD. (1992). An EWMA for monitoring a process standard
deviation. J Qual Technol 24:12-21.
Domangue R, Patch SC. (1991). Some omnibus exponentially weighted moving
average statistical process monitoring schemes. Technometrics 33:299-314.
Gan FF. (1991). An optimal design of CUSUM quality control charts. J Qual
Technol 23:279-286.
Gan FF. (1995). Joint monitoring of process mean and variance using exponentially
weighted moving average control charts. Technometrics 37:446-453.
Lowry CA, Woodall WH, Champ CW, Rigdon SE. (1992). A multivariate exponen-
tially weighted moving average control chart. Technometrics 34:46-53.
Lucas JM, Saccucci MS. (1990). Exponentially weighted moving average control
schemes: Properties and enhancements (with discussion). Technometrics 32:
1-29.
Mason RL, Tracy ND, Young JC. (1995). Decomposition of T² for multivariate
control chart interpretation. J Qual Technol 27:99-108.
Saniga EM. (1989). Economic statistical control chart designs with an application to
X̄ and R charts. Technometrics 31:313-320.
Stoumbos ZG, Reynolds MR Jr. (1996). Variable sampling rate control charts with
sampling at fixed intervals. Proceedings of the (International) Industrial
Engineering Research Conference, Minneapolis, MN, May 18-20, 1996
(invited paper).
Stoumbos ZG, Reynolds MR Jr. (1997). Variable sampling rate control charts for
one- and two-sided process changes with sampling at fixed times. Proceedings
of the Second World Congress of Nonlinear Analysts, Athens, Greece, July
10-17, 1996 (invited paper).
Takahashi T. (1989). Simultaneous control charts based on (X̄, S) and (X̄, R) for
multiple decision on process change. Rep Stat Appl Res, Union Jpn Sci Eng
36:1-20.
Waldmann K-H. (1986a). Bounds to the distribution of the run length in general
quality-control schemes. Stat Hefte 27:37-56.
Waldmann K-H. (1986b). Bounds for the distribution of the run length of geometric
moving average charts. Appl Stat 35:151-158.
12
Multivariate Quality Control Procedures
A. J. Hayter
Georgia Institute of Technology, Atlanta, Georgia

1. INTRODUCTION

In many quality control settings the product under examination may have
two or more related quality characteristics, and the objective of the super-
vision is to investigate whether all of these characteristics are simultaneously
behaving appropriately. In particular, a standard multivariate quality con-
trol problem is to consider whether an observed vector of measurements
x = (x_1, ..., x_k)′ from a particular sample exhibits any evidence of a location
shift from a set of "satisfactory" or "standard" mean values μ^0 =
(μ_1^0, ..., μ_k^0)′. The individual measurements will usually be correlated due to
the nature of the problem, so that their covariance matrix Σ will not be
diagonal. In practice, the mean vector μ^0 and covariance matrix Σ may be
estimated from an initial large pool of observations

x^1, ..., x^p

and the problem is then to monitor further observations x in order to
identify any location shifts in any of the mean values.
If the assumption is made that the data are normally distributed, then
the distribution of an observation x is N_k(μ, Σ), and the problem is to assess
the evidence that μ ≠ μ^0. In the univariate setting (k = 1) this problem can
be handled with a Shewhart control chart with control limits set to guaran-
tee a specified error rate α. One might consider handling the multivariate
problem by constructing individual α-level control charts for each of the k
variables under consideration. However, it has long been realized that such
an approach is unsatisfactory since it ignores the correlation between the
variables and allows the overall error rate to be much larger than α. On the

other hand, if individual error rates of α/k are used, then the Bonferroni
inequality ensures that the overall error rate is less than the nominal level α.
However, this procedure is not sensitive enough since the actual overall
error rate tends to be much smaller than α because of the correlation
between the variables.
A multivariate quality control procedure that can be successfully
implemented in manufacturing processes should meet the goals of
1. Controlling the error rate of false alarms
2. Providing a straightforward identification of the aberrant vari-
ables
3. Indicating the amount of deviation of the aberrant variables from
their required values
In addition, for certain problems it is desirable that the multivariate quality
control procedure
4. Be valid without requiring any distributional assumptions.
An overview of the multivariate quality control problem can be found
in Alt (1985). In this chapter some more recent work on the problem is
discussed. Specifically, Section 2 considers the situation where the normality
assumption is made, and the Hayter and Tsui (1994) paper is discussed
together with work by Kuriki (1997). Section 3 considers the work on non-
parametric multivariate quality control procedures by Liu (1995) and Bush
(1996).

2. PROCEDURES BASED ON A NORMALITY ASSUMPTION

It is clear that a basic property of a good procedure for this multivariate
problem is that an overall error rate of the specified level α should be
maintained exactly, so that the probability of incorrectly deciding that the
process is out of control (when it is, in fact, still in control) should be equal
to the specified value α. Hotelling (1947) provided the first solution to this
problem by suggesting the use of the statistic

T² = (x - μ^0)′ Σ̂⁻¹ (x - μ^0)

where Σ̂ is an estimate of the population covariance matrix Σ. However,
another problem is that of deciding what conclusions can be drawn once the
experimenter has evidence via the T² statistic that the process is no longer in
control. Specifically, how is it determined which location parameters have
moved away from their control values μ_i^0?

2.1. Confidence Intervals Procedure
Hayter and Tsui (1994) proposed a procedure that provides a solution to
this identification problem and to the related problem of estimating the
magnitudes of any differences in the location parameters from their stan-
dard values μ_i^0. The procedure operates by calculating a set of simultaneous
confidence intervals for the variable means μ_i with an exact simultaneous
coverage probability of 1 - α. The process is deemed to be out of control
whenever any of these confidence intervals does not contain its respective
control value μ_i^0, and the identification of the errant variable or variables is
immediate. Furthermore, this procedure continually provides confidence
intervals for the "current" mean values μ_i regardless of whether the process
is in control or not or whether a particular variable is in control or not.
Let X ~ N_k(0, R), where R is a general correlation matrix with diag-
onal elements equal to 1 and off-diagonal elements given by ρ_ij, say, and
define the critical point C_R,α by

P(|X_i| <= C_R,α; 1 <= i <= k) = 1 - α

In the more general case when X ~ N_k(μ, Σ) for any general covariance
matrix Σ, let the diagonal elements of Σ be given by σ_i², 1 <= i <= k, and
the off-diagonal elements by σ_ij. Then if R is the correlation matrix gener-
ated from Σ, so that ρ_ij = σ_ij/σ_iσ_j, it follows that

P(|X_i - μ_i|/σ_i <= C_R,α; 1 <= i <= k) = 1 - α

However, this equation can be inverted to produce the following exact 1 - α
confidence level simultaneous confidence intervals for the μ_i, 1 <= i <= k:

P(μ_i ∈ [X_i - σ_i C_R,α, X_i + σ_i C_R,α]; 1 <= i <= k) = 1 - α

Notice that the correlation structure among the random variables X affects
the simultaneous confidence intervals through the critical point C_R,α.
The multivariate quality control procedure operates as follows. For a
known covariance structure Σ and a chosen error rate α, the experimenter
first evaluates the critical point C_R,α. Then, following any observation
x = (x_1, ..., x_k)′, the experimenter constructs the confidence intervals

μ_i ∈ [x_i - σ_i C_R,α, x_i + σ_i C_R,α]

for each of the k variables. The process is considered to be in control as long
as each of these confidence intervals contains the respective standard value
μ_i^0. However, when an observation x is obtained for which one or more of
the confidence intervals does not contain its respective standard value μ_i^0,
then the process is stated to be out of control, and the variable or variables

whose confidence intervals do not contain μ_i^0 are identified as those respon-
sible for the aberrant behavior.
This simple procedure clearly meets the goals set in the introduction
for a good solution to the multivariate quality control problem. An overall
error rate of α is achieved, since when μ = μ^0 there is an overall probability
of 1 - α that each of the confidence intervals contains the respective value
μ_i^0. Also, the identification of the errant variables is immediate and simple,
and furthermore, the confidence intervals allow the experimenter to assess
the new mean values of the out-of-control variables. This is particularly
useful when the experimenter can judge the process to be still "good
enough" and hence allow it to continue.

2.2. Example
Consider first the basic multivariate quality control problem with k = 2 so
that there are just two variables under consideration. In this case, the
required critical point C_R,α depends only on the error size α and the one
correlation term ρ_12 = ρ, say. In Tables B.1-B.4 of Bechhofer and Dunnett
(1988), values of the critical point are given for α = 0.20, 0.10, 0.05, and 0.01
and for ρ = 0(0.1)0.9 (the required values for C_R,α correspond to the entries
for p = 2 and ν = ∞). More complete tables are given by Odeh (1982), who
tabulates the required critical points for additional values of α and ρ (the
values C_R,α at k = 2 correspond to the entries at N = 2). Interpolation
within these tables can be used to provide critical values for other cases
not given. An alternative method is to use a computer program to evaluate
the bivariate normal cumulative distribution function.
As an example of the implementation of the procedure with k = 2,
consider the problem outlined in Alt (1985) of a lumber manufacturing plant
that obtains measurements on both the stiffness and the bending strength of
a particular grade of lumber. Samples of size 10 are averaged to produce an
observation x = (x_1, x_2)′, and standard values for these averaged observa-
tions are taken to be μ^0 = (265, 470)′ with a covariance matrix of

Σ = [ 10.0   6.6 ]
    [  6.6  12.1 ]

In this case the correlation is ρ = 0.6, so that with an error rate of α = 0.05,
the tables referenced above give the critical point as C_R,α = 2.199.
Following an observation x = (x_1, x_2)′, the simultaneous confidence
intervals for the current mean values μ = (μ_1, μ_2)′ are given by

μ_1 ∈ [x_1 - 2.199√10, x_1 + 2.199√10] = [x_1 - 6.95, x_1 + 6.95]
μ_2 ∈ [x_2 - 2.199√12.1, x_2 + 2.199√12.1] = [x_2 - 7.65, x_2 + 7.65]

These confidence intervals have a joint confidence level of 0.95. The process
is considered to be in control as long as both of these confidence intervals
contain their respective control values μ^0 = (265, 470)′, that is, as long as
258.05 <= x_1 <= 271.95 and 462.35 <= x_2 <= 477.65. However, following an
observation x = (255, 465)′, say, the process would be declared to be out
of control, and the first variable, stiffness, would be identified as the culprit.
Furthermore, the confidence interval for the mean stiffness level would be
μ_1 ∈ (248.05, 261.95), so that the experimenter would have an immediate
quantification of the amount of change in the mean stiffness level. An
additional example with k = 4 variables is given in Hayter and Tsui (1994).
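A sketch of the whole procedure for k = 2 follows (not from the original chapter; the critical point is approximated here by Monte Carlo rather than read from the tables, and with ρ = 0.6 and α = 0.05 the estimate should be close to the tabulated 2.199):

    import numpy as np

    def critical_point(rho, alpha, reps=200_000, seed=1):
        # Approximate C_{R,alpha}: the 1 - alpha quantile of max_i |X_i|
        # for X ~ N_2(0, R) with correlation rho.
        rng = np.random.default_rng(seed)
        R = np.array([[1.0, rho], [rho, 1.0]])
        z = rng.multivariate_normal(np.zeros(2), R, size=reps)
        return np.quantile(np.abs(z).max(axis=1), 1.0 - alpha)

    c = critical_point(rho=0.6, alpha=0.05)
    sigma = np.sqrt(np.array([10.0, 12.1]))   # standard deviations from Sigma
    mu0 = np.array([265.0, 470.0])            # standard (in-control) values
    x = np.array([255.0, 465.0])              # observed averaged measurements
    lo, hi = x - c * sigma, x + c * sigma     # simultaneous intervals
    errant = (mu0 < lo) | (mu0 > hi)          # flags the aberrant variables
    print(c, lo, hi, errant)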

2.3. Independence Assumption
A general assumption of the multivariate quality control procedures is that
observations obtained from the process under consideration can be taken to
be independent of each other. Specifically, if a control chart based on
Hotelling's T² statistic is employed, then it is assumed that the two statistics

T_1² = (x^1 - μ^0)′ Σ̂⁻¹ (x^1 - μ^0)

and

T_2² = (x^2 - μ^0)′ Σ̂⁻¹ (x^2 - μ^0)

obtained from two observations x^1 and x^2 of the process are independent of
each other. Individually, these two statistics each have a scaled F-distribu-
tion, but any lack of independence between them may seriously affect the
interpretation of the control chart.
Kuriki (1997) shows how the effect of a dependence between the vari-
ables can be investigated. In general, the joint cumulative distribution func-
tion of the statistics T_1² and T_2² is

P(T_1² <= z_1, T_2² <= z_2) = P(y_1′ S⁻¹ y_1 <= z_1, y_2′ S⁻¹ y_2 <= z_2)

where S = Σ̂ has a Wishart distribution and (y_1′, y_2′)′ has a 2k-dimensional
normal distribution. The random variables y_1 and y_2 may not be indepen-
dent of each other due perhaps to a correlation between subsequent obser-
vations taken from the process or through μ^0, which may be an average of
observations in an initial pool. This general bivariate F-distribution can be
used to assess the effects of a lack of independence between observations

from a process if Hotelling's control chart is employed, and Kuriki
(1997) shows how it can be easily evaluated as a two-dimensional integral
expression.

3. NONPARAMETRIC PROCEDURES

The flow diagram in Figure 1 illustrates how distribution-free multivariate
quality control procedures can be developed. The left side of the diagram
corresponds to a traditional procedure. An initial pool of "in-control" data
observations is often used to determine the control values μ^0 = x̄, and the
assumption that the data have a multivariate normal distribution is
required. The dotted lines correspond to distribution-free procedures that
can be employed.
The middle procedure is based on the consideration of a nonpara-
metric test of the hypothesis

H_0: μ = μ^0

[Flow diagram: an initial pool of observations leads either to traditional normal-theory testing techniques or, via the dotted paths, to distribution-free testing procedures.]

Figure 1 Flow diagram of multivariate quality control testing procedures.



with μ^0 = x̄. This procedure could be implemented even if there is no initial
pool of data observations and μ^0 is simply some specified target value.
However, in general it seems more sensible to make full use of the initial
pool of observations and to develop a procedure indicated on the far right of
the flow diagram in which the current data are compared with the initial
pool of observations. In this case, the question of interest is whether it is
plausible that the two data sets, the initial pool of observations and the
current data observations, are actually observations from a common distri-
bution. A discussion of such procedures that are developed in Liu (1995)
and Bush (1996) is provided in this section.

3.1. Nonparametric Multivariate Control Charts
Liu (1995) provides some nonparametric multivariate quality control pro-
cedures that follow the right-hand dotted line of Figure 1 in that they
compare current observations with an initial pool of "in-control" observa-
tions. The main idea is to reduce the current multivariate observation to a
univariate index that can be plotted on a control chart. Three types of
control charts are suggested that are truly nonparametric in nature and
can be used to detect simultaneously any location change or variability
change in the process. Liu's procedures are motivated by the "depth" of
current measurements within the initial pool of observations and are con-
ceptually equivalent to the procedures described in Bush (1996) employing
functional algorithms to calculate the scores that are described in detail in
the following sections.

3.2. Overview of Nonparametric Procedures

Assume that the initial pool consists of the observations

x^1, ..., x^p

where each x^i is a k-dimensional vector that is an observation from an
unknown distribution with mean μ^0 = (μ_1^0, ..., μ_k^0) and covariance matrix
Σ. Note that the observations x^i may in fact be defined to be averages of
several measurements. The purpose of the quality control procedure is to
determine whether or not a new observation can be considered to be an
observation from this same distribution. The nonparametric procedure tests
the following hypotheses:

H_0: The new observation and the initial pool can be considered
to be p + 1 observations from the same unknown distribution.

H_A: The new observation cannot be considered to be from the
same distribution as the initial pool.

If the null hypothesis is rejected, then the process is declared to be out of
control.
As in other quality control procedures, the initial pool of supposedly
identically distributed observations is employed to define the standards
against which the new observations are measured. Traditionally, the initial
pool is used to calculate control limits, but the nonparametric methods
described below require a different and more direct use of the initial pool.
Consider the two-dimensional case. Suppose a plot of x_1 versus x_2 reveals an
elliptical shape. A new observation, x^0, is taken, and the point (x_1^0, x_2^0) is
added to the graph. There is no need for concern if x^0 plots well within the
borders of the ellipse. However, a point outside the ellipse or on the fringes
may signal that the process is out of control. Thus the nonparametric qual-
ity control procedure operates by considering the location of the new obser-
vation with respect to the initial pool. A useful procedure will indicate
whether a new observation is near the center of the initial pool, on the
fringes, or outside.

3.3. Variable Transformation
It is convenient to define testing procedures in terms of a set of transformed
observations. If the initial pool and the new observation are combined to
form a set of p + 1 observations, then let the sample average vector be x̄ =
(x̄_1, ..., x̄_k) and the sample covariance matrix be S_x. The quality control
methods require calculating a distance measure between various points,
and a sensible way to do this is with the Mahalanobis distance, where the
distance from x^i to x^j is defined to be

D_ij = (x^i - x^j)′ S_x⁻¹ (x^i - x^j)

It can be shown that the Mahalanobis distance is equivalent to the squared
Euclidean distance between "standardized" observations where

y^i = (x^i - x̄)A

and AA′ = S_x⁻¹. Thus,

D_ij = (y^i - y^j)′(y^i - y^j)

The matrix A is easily calculated from the eigenvalues and eigenvectors of
S_x⁻¹, but in practice the matrix A need never be calculated, since the testing
procedures can be implemented in terms of the original observations x^i. In

other words, while it is convenient to define quality control procedures in
terms of the transformed observations y^i, the actual implementation may be
performed in terms of the original observations x^i.
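One convenient (though not unique) choice of A is a Cholesky factor of S_x⁻¹, as in the following illustrative sketch:

    import numpy as np

    def standardize(pool):
        # pool: array of shape (p + 1, k) holding the combined observations.
        xbar = pool.mean(axis=0)
        S_x = np.cov(pool, rowvar=False)            # sample covariance matrix
        A = np.linalg.cholesky(np.linalg.inv(S_x))  # so that A A' = S_x^{-1}
        return (pool - xbar) @ A

    # Squared Euclidean distances between rows of standardize(pool) then
    # equal the Mahalanobis distances D_ij of the original observations.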

3.4. Calculation of a p-Value

The nonparametric procedure produces a set of scores S_0, S_1, ..., S_p asso-
ciated with each observation in the initial pool (S_i, 1 <= i <= p) and the new
observation (S_0). The score S_i reflects the "position" of observation y^i with
respect to all p + 1 observations. In general, the lower the score, the closer
an observation is to the "center" of the set of observations. Let R_i, 0
<= i <= p, be the rank of S_i among S_0, ..., S_p, where average ranks can be
used if there are ties among the S_i in the usual manner.
The value of R_0 corresponding to the new observation is of particular
interest. If the new observation and all p observations in the initial pool are
observations from the same distribution (so that the process is still in con-
trol), then R_0 is equally likely to take any value from 1 to p + 1 (supposing
that there are no ties in the scores S_i). Moreover, large values of R_0 indicate
that the new observation is on the fringe of the initial pool of data points, an
event that has an increased probability if the process has moved out of
control, and so a p-value for the null hypothesis that the process is in control
can sensibly be calculated as

p-value = (p + 2 - R_0)/(p + 1)

This p-value reflects the proportion of the p + 1 observations that have
scores S_i no smaller than S_0.
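A minimal sketch of this computation (average ranks are used for ties, as described above):

    from scipy.stats import rankdata

    def nonparametric_p_value(scores):
        # scores[0] is S_0 for the new observation; scores[1:] are the pool's.
        ranks = rankdata(scores)             # R_i among S_0, ..., S_p
        p = len(scores) - 1
        return (p + 2 - ranks[0]) / (p + 1)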

3.5. Decision Rules
The decision rules under which a process is declared to be out of control can
be chosen by the engineers implementing the procedure. Notice that the p-
value is limited by the number of observations in the pool. For example, if
there are p + 1 = 100 observations and R_0 = 100, then the p-value for the
procedure is 0.01, and the process can be declared to be out of control if the
specified probability of a false alarm, α, is greater than or equal to 0.01.
Traditionally, the specified error rate for a quality control procedure is often
taken to be smaller than α = 0.01, which implies that for this nonparametric
procedure a larger initial pool would be needed.
In addition to the consideration of individual p-values, "runs rules"
may also be employed. In univariate control charts, several successive points
on the same side of the centerline are often allowed to trigger a stopping rule

suggesting that there has been a change in the mean of the distribution.
Similar runs rules may be adopted for these nonparametric procedures.
For example, suppose that the p-values for a series of successive observa-
tions are each less than 0.20 but that none of the individual p-values is less
than the specified α level. One might declare the process to be out of control
on the basis that these new observations are all near the fringes of the initial
pool of observations.
Runs rules can be designed to locate changes in either the mean or the
variance of the distribution. Any appearance that a set of new observations
is not "well mixed" within the initial pool suggests that the distribution
may have changed. Changes in the mean imply changes in the location of
the distribution and may be identified by a locational shift in the new
observations. Changes in the covariance structure Σ should be indicated
by changes in the shape of the distribution. Specifically, increases in the
variance of a variable should be characterized by frequent observations
outside or on the fringes of the distribution.
In conclusion, the consideration of the individual p-values of new
observations together with an awareness of the location of the new observa-
tions relative to the initial pool of observations should allow an effective
determination of out-of-control signals.

3.6. Calculation of the Scores

There are two basic types of algorithms that can be used to construct the
scores S_0, S_1, ..., S_p. These are functional algorithms and linkage algorithms.

Functional Algorithms
With functional algorithms the scores are calculated from a series of com-
parisons of the observations y^i with each other. Specifically, the score S_i is a
function of y^i = (y_1^i, ..., y_k^i) and every other point in the pool and can be
written as

S_i = f(y^i; y^0, ..., y^p)

The function is defined so that observations that are far from the center of
the set of observations receive high scores while observations near the center
receive low scores. Three possible choices for the function are described
below; a short computational sketch of all three follows the list.
1. The easiest method to consider is

       S_i = Σ_{r=1}^{k} | Σ_{j=0}^{p} I(y^j_r ≥ y^i_r) - Σ_{j=0}^{p} I(y^j_r < y^i_r) |
   where I(·) is the indicator function. In this case the score function
   can be thought of as simply being calculated from a count of how
   many points are on either side of a particular observation and as
   being similar to a multivariate sign test. The score S_i will be close
   to zero for points in the center of the distribution, because at the
   center there are roughly an equal number of observations in every
   direction. At the perimeter other observations tend to be to one
   side, and thus the score will be large. For these scores the magni-
   tude of the difference between two points y^i and y^j is ignored, and
   only the direction of the difference is important. Note that there is
   a large potential for ties in the scores to occur with this method.
2. A second procedure is similar to the first except that the actual
   distances between points are used to calculate the scores. The
   score S_i is calculated as the sum of the Euclidean distances from
   y^i to every other point y^j, 0 ≤ j ≤ p, so that

       S_i = Σ_{j=0}^{p} [(y^i - y^j)'(y^i - y^j)]^{1/2}

   Thus S_i is the sum of the p distances from y^i to all points in the
   combined pool. It is clear that the scores of the observations at the
   center of the group will tend to be lower than the scores for
   perimeter observations.
3. The scores obtained from the third method are calculated by
   comparing an observation y^i with a statistic based on the com-
   bined pool. This statistic,

       M = (M_1, ..., M_k)

   is chosen to be a "middle value" of the combined pool of observa-
   tions, such as the mean vector or the median vector. Typically the
   scores can be calculated as the distances of the observations from
   this middle value, so that

       S_i = (y^i - M)'Σ^{-1}(y^i - M)

   Again, note that observations near the center of the pool will have
   small scores while observations on the perimeter will have larger
   scores. Note also that this method requires far fewer calculations
220 Hayter

   than method (2), although this difference should not be important
   with present computing facilities. (All three score functions are
   sketched in code below.)
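A minimal Python sketch of the three score functions above follows; the helper names are illustrative, the first function is one plausible reading of the sign-test score, and the third uses the median vector for M with an unweighted squared distance.

    import numpy as np

    def sign_scores(pool):
        # Method 1: for each point, compare coordinatewise with every other
        # point and accumulate the imbalance between points above and below.
        diffs = np.sign(pool[None, :, :] - pool[:, None, :])   # (i, j, coord)
        return np.abs(diffs.sum(axis=1)).sum(axis=1)

    def distance_sum_scores(pool):
        # Method 2: sum of Euclidean distances to every other point.
        d = np.linalg.norm(pool[None, :, :] - pool[:, None, :], axis=2)
        return d.sum(axis=1)

    def center_distance_scores(pool):
        # Method 3: squared distance from a middle value M (here the median);
        # a covariance-weighted distance could be used instead.
        dev = pool - np.median(pool, axis=0)
        return (dev ** 2).sum(axis=1)

    pool = np.random.default_rng(2).normal(size=(100, 4))   # p + 1 = 100 points, k = 4
    for f in (sign_scores, distance_sum_scores, center_distance_scores):
        print(f.__name__, np.round(f(pool)[:3], 2))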

Linkage Algorithms
Linkage algorithms resemble a linking clustering algorithm in that the p + 1
observations are linked together one point at a time. The cluster begins at
the center of the distribution and branches to all of the observations in the
combined pool. Points are added to the cluster in succession until all p + 1
points are part of the cluster. The criterion for choosing the next point to be
added to the cluster is that it should be the "closest" observation to the
cluster. The distance to the cluster can be measured in several different ways,
which are discussed below. The score S_i is defined to be equal to j when y^i is
the jth point added to the cluster (note that in this case R_i = S_i). The first
point to be added to the cluster can generally be taken to be the point closest
to the center of the pool. Observations closest to the center will tend to be added first, and those
on the perimeter will be added last. Also, observations in heavily concen-
trated areas will tend to be added to the cluster before observations in
sparsely concentrated areas, since in dense regions observations are closer
together, and therefore observations will tend to be linked in succession
once the first observation in that region has been added to the cluster.
When these linkage algorithms are applied it can be useful to construct
a "center value" M, which is considered to be the first point in the cluster
(although it may be removed from the cluster later). Three possible ways to
decide the order in which observations are added to the cluster are described
below.
1. If observation y^i is not already in the cluster, then it is added to the
   cluster if it is the closest (among all observations not already in the
   cluster) observation to any observation already in the cluster. In
   other words, for each observation y^i not already in the cluster, the
   minimum distance

       D_ij = (y^i - y^j)'(y^i - y^j)

   is calculated over all points y^j already in the cluster. The point y^i
   with the smallest minimum distance is then added to the cluster.
2. Method 1 can be generalized by calculating the sum of the a
   smallest distances from an observation not in the cluster to obser-
   vations already in the cluster, for a fixed value of a. While method
   (1) has a = 1, it may be sensible to take a = 2, say, whereby the
   sum of the two smallest distances from an observation not in the
   cluster to observations already in the cluster are used to determine
   which observation should be added to the cluster next.

3. An additional extension would be to calculate the sum of all of the
   distances from an observation not in the cluster to each of the
   observations already in the cluster. This method is different from
   method (2), since in this case the value of a changes and is equal to
   the number of observations currently in the cluster. (Method (1) is
   sketched in code below.)
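A minimal Python sketch of method (1) follows, seeding the cluster at the coordinatewise median as the center value M; the names and the seeding choice are illustrative assumptions.

    import numpy as np

    def linkage_scores(pool):
        # Method (1): grow a cluster from a center value M (here the median);
        # at each step add the outside point with the smallest minimum squared
        # distance to the cluster.  S_i is the step at which point i joins.
        n = len(pool)
        dist = ((pool - np.median(pool, axis=0)) ** 2).sum(axis=1)  # to the seed M
        scores = np.empty(n, dtype=int)
        in_cluster = np.zeros(n, dtype=bool)
        for step in range(1, n + 1):
            i = int(np.argmin(np.where(in_cluster, np.inf, dist)))
            scores[i], in_cluster[i] = step, True
            # each outsider's minimum distance may shrink as the cluster grows
            dist = np.minimum(dist, ((pool - pool[i]) ** 2).sum(axis=1))
        return scores

    pool = np.random.default_rng(3).normal(size=(50, 3))
    print(linkage_scores(pool)[:10])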

4. SUMMARY

A single product can be described by several correlated variables that are to
be monitored by quality control procedures. The correlation structure
between the variables should be taken into account when designing a quality
control scheme for the product. A good multivariate quality control proce-
dure is one that, at a specified error rate α, triggers the out-of-control alarm
only with probability α when the process is still in control and triggers the
alarm as quickly as possible when the process is out of control. In addition,
it should provide a simple and easily implementable mechanism for deciding
which of the variables are responsible when the process is determined to be
out of control. Finally, it should allow easy quantification of the amount by
which the out-of-control variables have changed in mean value. Recent
advances in this area provide more tools for the practitioner to meet these
goals.

13
Autocorrelation in Multivariate
Processes
Robert L. Mason
Southwest Research Institute, San Antonio, Texas

John C. Young
McNeese State University, Lake Charles, Louisiana

1. INTRODUCTION

A basic assumption in most multivariate control procedures is that the
observation vectors are uncorrelated over time. When this assumption is
true, the graph of any process variable against time should show only ran-
dom fluctuations. When the assumption is false, the patterns in such time
plots are systematic and often indicate the existence of linear or quadratic
trends. In these latter situations, incorrect signals can occur in the corre-
sponding multivariate control chart, and the effectiveness of the overall
control procedure may be weakened [e.g., see Alt et al. (1977) or
Montgomery and Mastrangelo (1991)].
Numerous industrial processes produce data that change over time.
This may occur because of such factors as the continuous wear on equip-
ment, the degenerative effects of environmental and chemical contamina-
tion, and the depletion of the catalyst in a chemical process. Autocorrelated
observations resulting because a process continuously decays over time may
be detectable if one samples the process on a regular time interval. However,
process decay that occurs in stages may appear to be insignificant and
undetectable across short time intervals but highly significant and detectable
when the process is monitored over extended time intervals. Mason et al.
(1996) present an excellent example of a situation where the autocorrelation
behaves as a step function.
If autocorrelation goes undetected or ignored, it can create serious
problems in multivariate control procedures. This often occurs when the
effects of the autocorrelated variable are confounded with the time effects.
An adjustment would be needed in such situations in order to obtain a true
reading on process performance at a given point in time. Control procedures
for autocorrelated data in a univariate setting make adjustments by model-
ing the time dependency and examining the residuals of the resultant auto-
regressive models. Under proper assumptions, these residuals, or adjusted
values (effect of the time dependency removed), can be shown to be inde-
pendent and normally distributed and are thus used as the charting statistic
in the control procedure [see, e.g., Montgomery (1991)].
The problem with autocorrelated data from a multivariate process is
more complicated. We have to be concerned not only with how these vari-
ables relate to the other process variables but also with how some of the
process variables relate to time changes. Our procedure for analyzing such
autocorrelated data centers on the use of Hotelling's T^2 as the control
statistic. Many of the desirable properties of this statistic for independent
observations are shown to apply to this situation.

2. DETECTION OF AUTOCORRELATION IN MULTIVARIATE
PROCESSES

Why do certain types of processes have a tendency to generate observa-
tions with a time dependency? Autocorrelation may be due to a cause-and-
effect relationship between a process variable and time. If this occurs, the
observation on the process variable is proportional to the value of the
variable at some prior time. In contrast, if the time relationship is only
an empirical correlation and not a cause-and-effect one, the current value
of the variable, although associated with the past value, is not determined
by it. If this is the case, the association is usually due to an unobservable
"lurking" variable.
Consider two process variables that are highly negatively correlated,
so that one variable increases as the other decreases. Suppose one of the
variables, the "lurking" one, cannot be observed but is known to increase
with time. Without knowledge of the relationship between the two vari-
ables, one would conclude that the second variable has a time dependency
in its observations, as its values would tend to decrease as time increases.
For example, if one considers the cyclical nature over time of the variable
depicted in Figure 1, one might suspect that some form of time effect is
Figure 1 Process variable with cycle.

present. However, the noted trend is due to a "lurking" variable that has a
seasonal component. Since the effects of such "lurking" variables, when
they are known to exist, can be accounted for by making adjustments to
the associated observable variable, the detection of these situations can be
a great aid in the development of a proper control procedure for the
process.
Detecting autocorrelation in univariate processes is accomplished by
plotting the process variable against time. Depending on the nature of the
autocorrelation, the plotted points will either move up or down or oscillate
back and forth over time. Subsequent data analysis can be used to verify the
time trend, determine lag times, and fit appropriate autoregressive models.
The simple and straightforward method of graphing individual components
against time can be inefficient when there are a large number of variables,
and the interpretations can become confounded when these components are
correlated. Despite these disadvantages, we have found that graphing each
individual variable over time is still useful in multivariate processes. In
addition to studying autocorrelation, it can lead to the discovery of other
influential variables.
To augment the above graphical method and reduce the number of
individual graphs that need to be produced, we additionally suggest that a
time-sequence variable be added to the data set. If any of the other variables
correlates with the time-sequence variable, it is highly probable that it cor-
relates with itself over time. Using this approach, one can locate potential
variables that are autocorrelated. Detailed analysis, including the graphing
of the individual variable over time, will either confirm or deny the assertion
for individual variables. Other techniques, such as that given in Tracy et al.
(1993), also should be explored.
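A minimal sketch of this screening device, assuming pandas is available; the column names, the simulated drift, and the 0.5 screening cutoff are illustrative.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(4)
    n = 200
    data = pd.DataFrame({
        "temp": 500 + 0.05 * np.arange(n) + rng.normal(0, 1, n),  # drifts with time
        "feed": rng.normal(100, 5, n),                            # stable
    })
    data["t"] = np.arange(n)                 # appended time-sequence variable

    r = data.corr()["t"].drop("t")
    print(r[r.abs() > 0.5])                  # candidates for detailed lag analysis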

3. VARIOUS FORMS OF AUTOCORRELATION

We examine two different forms of autocorrelation: uniform decay and stage
decay. It is important to recognize each type, as both play an important role in
the development and implementation of a multivariate control procedure for
autocorrelated data. Uniform, or continuous, decay occurs when the observed
value of the process variable is dependent on some immediate past value. For
example, heat transfer coefficient data behave in this fashion. During the life
cycle of a production unit, the transfer of heat is inhibited owing to equipment
contamination or for other reasons that cannot be observed or measured. A
new life cycle is created when the unit is shut down and cleaned. During the
cycle, the process is constantly monitored to ensure maximum efficiency.
Figure 2 contains the graph of a heat transfer coefficient over a number of
life cycles of a production unit. The uniform decay of the unit is evident from
the declining trend in the plotted curve prior to each new life cycle.
Stage decay occurs when the time change in a process variable is incon-
sistent on a daily basis but occurs in a stepwise fashion over extended periods
of time. This form of autocorrelation is present in processes where change with
time occurs very slowly. The time relationship results when the process per-
formance in one stage is dependent on the process performance in the previous
stage or stages. The graph of a stage decay process variable is presented in
Figure 3. Notice that there is a distinctive shift in the process variable some-
where near the middle of the curve but that the fluctuations are around similar
levels below the shift and at higher but similar levels above the shift.

4. A CONTROL PROCEDURE FOR A UNIFORM DECAY PROCESS

Our approach for obtaining a multivariate control procedure for uniform
decay processes is to use Hotelling's T^2 statistic and its associated orthogo-

Figure 2 Life cycles over time.

Figure 3 Step change of process variable.

nal decomposition. Mason and Young (1999) show that correct modeling of
existing functional relationships between process variables increases the
sensitivity of the T^2 value in signal detection. An overview of pertinent
points of their work and how it relates to a multivariate control procedure
for autocorrelated processes with uniform decay is discussed below.
Mathematical details and data examples are provided in the original paper.
One example of an orthogonal decomposition of the T^2 value asso-
ciated with a p-dimensional data vector, X' = (x_1, ..., x_p), is given as

    T^2 = (X - X̄)'S^{-1}(X - X̄)
        = T_1^2 + T_{2.1}^2 + ... + T_{p.12...(p-1)}^2

where X̄ and S are the usual estimates of the population mean vector and
covariance matrix obtained by using an in-control historical data set. In this
procedure [see Mason et al. (1995) for a complete description], the first
component of a particular decomposition, termed the unconditional term,
is used to determine whether the observation on the jth variable of a signal-
ing data vector is within the operational range of the process. The general
form of the jth unconditional T^2 is given by

    T_j^2 = (x_j - x̄_j)^2 / s_j^2

where x_j is the jth component of X, and x̄_j and s_j^2 are the corresponding
mean and variance estimates as determined using the in-control data set.
The remaining components, termed conditional terms of the decomposition,
are used in detecting deviations in relationships among the variables that
produced the signal. The general form of a conditional T^2 term is given by

    T_{j.12...(j-1)}^2 = (x_j - x̄_{j.12...(j-1)})^2 / s_{j.12...(j-1)}^2

This is the square of the jth variable adjusted by the estimates of the mean
and variance of the conditional distribution of x_j given x_1, x_2, ..., x_{j-1}.
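These terms can be computed directly from the historical mean vector and covariance matrix by the usual conditional-normal formulas, as in the following sketch; the function name and the simulated history are illustrative, and the terms for one ordering sum to the overall T^2.

    import numpy as np

    def t2_terms(x, hist):
        # Unconditional and conditional T^2 terms for one ordering of x,
        # using the in-control historical mean and covariance.  The terms
        # sum to the overall T^2 = (x - xbar)' S^{-1} (x - xbar).
        xbar, s = hist.mean(axis=0), np.cov(hist, rowvar=False)
        terms = []
        for j in range(len(x)):
            if j == 0:
                mu, var = xbar[0], s[0, 0]                # unconditional term
            else:
                a = slice(0, j)
                w = np.linalg.solve(s[a, a], s[a, j])     # regression coefficients
                mu = xbar[j] + w @ (x[a] - xbar[a])       # conditional mean
                var = s[j, j] - s[a, j] @ w               # conditional variance
            terms.append((x[j] - mu) ** 2 / var)
        return np.array(terms)

    rng = np.random.default_rng(5)
    hist = rng.multivariate_normal([0, 0, 0],
            [[1.0, 0.6, 0.3], [0.6, 1.0, 0.4], [0.3, 0.4, 1.0]], size=500)
    x = np.array([0.5, 2.5, -1.0])
    print(t2_terms(x, hist), t2_terms(x, hist).sum())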
The ordering of the components in the data vector determines the
representation of each term of the decomposition. As pointed out by
Mason et al. (1995), there are p! different arrangements of the p components
of a data vector, and these lead to p! decompositions, each consisting of p
terms. Mason and Young (1997) show that the unique terms of all such
decompositions will contain all possible regressions of an individual variable
on all possible subgroups of the remaining p - 1 variables. For example, the
first component, x_1, of a three-dimensional data vector would be regressed
against all possible subgroups of the other two variables. These regressions
and the corresponding conditional T^2 terms are presented in Table 1. Using
the tabulated results, a control procedure based on the T^2 statistic can be
developed for a set of process variables that exhibit uniform time decay in
the observations and, at the same time, are correlated with other process
variables. As an example, consider a bivariate vector (x, y) where the vari-
able y exhibits a first-order autoregressive relationship [i.e., AR(1)]. Note
that the observations are actually of the form (x_t, y_t, y_{t-1}), where t repre-
sents the time sequence of the data. The AR(1) relationship for y can be
represented in model form as

    y_t = b_0 + b_1 y_{t-1} + error                    (3)

where b_0 and b_1 are unknown constants. If y were being monitored while its
relationship with x was ignored, a signal would be produced when the
observed value of y was not where it should be as predicted by the estimate
of the model in Eq. (3). However, if one chooses to examine the value of y

Table 1 List of Possible Regressions for x_1 When p = 3

Regression of x_1 on     Conditional T^2 term
x_2                      T_{1.2}^2
x_3                      T_{1.3}^2
x_2 and x_3              T_{1.23}^2

adjusted for the effect of x and the time dependency, a model of the form

    y_t = b_0 + b_1 x_t + b_2 y_{t-1} + error          (4)

would be more appropriate.


The modeling of time relationships existing among the process vari-
ables requires adding additional lag variables to the historical data. For
example, a historical data set for a bivariate process is a data matrix con-
sisting of observations on the vector (x_t, y_t), where t = 1, ..., n. Assuming
autocorrelation exists among the observations on y and is of the AR(1) form
given in (4), the data set will have to be reconstructed to have the form
(x_t, y_t, y_{t-1}), t = 2, ..., n, in order to estimate the model. The ordering of the
vector components is arbitrary but is important to the notation scheme for
the T^2 terms. Interpretation of a signal for this situation is achieved by
examining appropriate terms from all possible decompositions of the signal-
ing T^2 value. Details are provided in Table 2.
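In practice the reconstruction is a simple data-handling step; a minimal pandas sketch with illustrative column names:

    import pandas as pd

    hist = pd.DataFrame({"x": [1.0, 1.2, 0.9, 1.1], "y": [5.0, 5.4, 5.1, 5.3]})
    hist["y_lag1"] = hist["y"].shift(1)   # y_{t-1}
    hist = hist.dropna()                  # rows t = 2, ..., n
    print(hist)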
Higher order autoregressive relationships can be examined by adding
other lag variables to the historical data set. For example, suppose the
variable y has an AR(2) time dependency so that

    y_t = b_0 + b_1 y_{t-1} + b_2 y_{t-2} + error

The reconstructed data vector would be of the form (x_t, y_t, y_{t-1}, y_{t-2}). The
use of such time-dependent models requires process knowledge and an
extensive investigation of the historical data.

Table 2 Interpretation of Useful T^2 Components in AR(1) Model

T^2 component   Interpretation
T_1^2           Checks if x component of data vector is in operational range of x.
T_2^2           Checks if y component of data vector is in operational range of y.
T_{2.3}^2       Determines if current value of y is in agreement with the value
                predicted using the previous y value, or examines the value of y with
                the effect of y_{t-1} removed.
T_{1.2}^2       Checks if x and y are countercorrelated. Effect of time is not
                removed.
T_{2.1}^2       Checks if y and x are countercorrelated. Not symmetrical with
                T_{1.2}^2. Effect of time is not removed.
T_{2.13}^2      Determines if present value of y is in agreement with the value
                predicted using x and the previous value of y.

5. EXAMPLE OF A UNIFORM DECAY PROCESS

Consider a chemical process where observations are taken on a reactor used
to convert ethylene (C2H4) to ethylene dichloride (EDC), the basic building
block for much of the vinyl products industry. Feedstock for the reactor is
hydrochloric acid gas (HCl) along with ethylene and oxygen (O2).
Conversion of the feedstock to EDC takes place in a reactor under high
temperature, and the process is referred to as oxyhydrochlorination (OHC).
There are many different types of OHC reactors available to convert ethy-
lene and HCl to EDC. One type, a fixed-life or fixed-bed reactor, must have
critical components replaced at the end of each run cycle, as the components
are slowly depleted during operation. Performance of the reactor follows the
depletion of the critical components; i.e., the best performance of the reactor
is at the beginning of the run cycle, and it gradually becomes less efficient
during the remainder of the cycle. This inherent uniform decay in the per-
formance of the reactor produces a time dependency in many of the result-
ing process and quality variables.
Consider a steady-state process where the reactor efficiency is at 98%.
The efficiency variable will contain very little variation (due to the steady-
state conditions), and its operational range will be small. Any significant
deviation from this range should be detected by the process control proce-
dure. However, over the life cycle of a uniformly decaying reactor, the unit
efficiency might have a very large operational range. For instance, it might
range from 98% at the beginning of a cycle to 85% at the end of the cycle
and would thus contain more variation than a steady-state variable. If we
failed to consider the decay in the process, any efficiency value between 85%
and 98% would be acceptable, even 85% at the beginning of a cycle.
A deviation beyond the operational range (established using in-control
historical data) for a process variable can be detected by using its uncondi-
tional T^2 term. In addition, incorrect movement of the variable within its
range (occurring because of improper relationships with other process vari-
ables) can be detected by using the conditional T^2 terms. However, this
approach does not account for the effects of movement due to time depen-
dencies. Adjusting for a time effect will provide additional monitoring of the
movement of an individual variable within its operational range when the
effect of its previous value(s) has been removed. Including time-lag variables
in the computation of the T^2 statistic adds corresponding terms to the T^2
decompositions that can be used to monitor movement of the variables
through time. This enhances the signal detection performance of the overall
T^2 statistic.
Although the above reactor process is controlled by many variables,
we will use only four of them in this example in order to demonstrate the

Figure 4 Reactor temperature versus time.

proposed control chart procedure. These include three process variables,
labeled TEMP, L3, and L1, and a measure of feed rate, labeled RP1. All,
with the exception of feed rate, show some type of time dependency.
Temperature measurements are available from many different loca-
tions on a reactor, and together these play an important role in the perfor-
mance and control of the reactor. To demonstrate the time decay in all of
the measured temperatures, we present in Figure 4 a graph of their average
over a good production run. The plot indicates that the average temperature
of the reactor gradually increases over the life cycle of the unit.
Figures 5 and 6 contain graphs of the other two process variables, L3
and L1, over time. The decay effect for L3 in Figure 5 has the appearance of
an AR(1) relationship, while that for L1 in Figure 6 has the appearance of
some type of quadratic (perhaps a second-order quadratic) or an exponen-
tial autoregressive relationship.
Feed flow (RP1) to a reactor consists of three components: the flows of
O2, HCl gas, and C2H4. However, since these components must be fed in at
a constant ratio, one graph is sufficient to illustrate the feed. During a run

Figure 5 L3 versus time.

Figure 6 L1 versus time.

cycle, the feed to the reactor is somewhat consistent and does not system-
atically vary with time. This is illustrated in Figure 7.
The correlation matrix for the four variables RP1, L1, L3, and TEMP,
including the first-order lag variables for L1, L3, and temperature (LL1,
LL3, and LTEMP), is presented in Table 3. Note the very strong lag corre-
lation for the three process variables. For example, L1 has a correlation of
0.93 with its lag value, indicating that over 80% of the variation in this
variable can be explained by the relationship with its lag value. This strong
correlation implies that an AR(1) model is a good approximation to the true
time dependency. Also, note the strong relationship between L1 and the lag
of the temperature. The correlation of 0.80 implies that over 64% of the
variation in the present value of L1 can be explained by the temperature of
the unit during the last sampling period.
To see the effect of these time-lag variables on a T^2 control proce-
dure, we will compare the T^2 values obtained with and without the lag
variables. For comparison purposes, we denote the T^2 based on the
chosen four variables RP1, L1, L3, and TEMP by T_4^2 and the T^2
based on all seven variables, including the three lag variables LL1,
LL3, and LTEMP, by T_7^2. Assume that each observation vector is repre-

Figure 7 RP1 versus time.

Table 3 Correlation Matrix for Reactor Data

sented as (RP1, L1, L3, TEMP, LL1, LL3, LTEMP). Since the statistic
T_4^2 is based on the first four components of this vector, it is contained in
the overall T_7^2. Also, all of the terms in the possible decomposi-
tions of T_4^2 are contained in the various decompositions of T_7^2. Since T_7^2
contains information on the time-lag variables, it will be more sensitive to
any change in the process.
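The mechanics of the comparison can be sketched as follows; the historical data here are simulated stand-ins for the reactor history, so only the computation, not the numbers, carries over.

    import numpy as np

    def hotelling_t2(x, hist):
        # T^2 of observation x against the historical mean and covariance.
        d = x - hist.mean(axis=0)
        return d @ np.linalg.solve(np.cov(hist, rowvar=False), d)

    rng = np.random.default_rng(6)
    hist7 = rng.normal(size=(300, 7))   # stand-in history: (RP1, L1, L3, TEMP, LL1, LL3, LTEMP)
    x7 = rng.normal(size=7)             # new observation with its three lag values

    t2_7 = hotelling_t2(x7, hist7)             # all seven variables
    t2_4 = hotelling_t2(x7[:4], hist7[:, :4])  # current-value variables only
    print(t2_7, t2_4)   # the seven-variable T^2 is never smaller than the four-variable one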
The inclusion of lag variables in the historical data will produce new
conditional terms in the decomposition of the T_7^2 statistic. For example, the
unconditional term T_{L3}^2, which is contained in both T_4^2 and T_7^2, is used to
determine if L3 is in its operational range. However, including the lag vari-
able LL3 adds the new conditional term, T_{L3.LL3}^2, to T_7^2 and allows one to
monitor the location of L3 based on its previous value. For lag values of one
sampling period, this term contains the AR(1) model

    L3 = b_0 + b_1 LL3 + error
To compare the performance of T_7^2 to T_4^2, consider a sequence of 14
observations (in time order) on the above four reactor variables and the
corresponding three time-lag variables. The data are presented in Table 4.
Our interest lies in the process variables L1 and TEMP. The values of L1 are
relatively high for the first two observations, drop dramatically for the next
two observations, and then gradually increase in value to near the end of the
data set. In contrast, the TEMP values start relatively low, gradually rise
until the middle observations, and then stabilize near the end.
Table 5 contains the T_7^2 and T_4^2 values for the 14 sample points. The α
level for both statistics is 0.0001. Note that a signal is detected by T_7^2
at observations 4 and 6, but no signal is detected by T_4^2 at any of the
observations.
Interpretation of T^2 signals for autocorrelated data is no different than
for data without time dependencies. When a signal is detected, the T^2 sta-
tistic is decomposed to determine the variable or set of variables that caused
the signal. When T_7^2 for observation 4 is decomposed, using the procedure

Table 4 Reactor Data

Obs. No.   RP1       L1     L3      TEMP   LL1    LL3     LTEMP
1          188,300   0.98   44.13   510    1.40   50.47   498
2          189,600   0.81   33.92   521    0.98   44.13   510
3          198,500   0.46   28.96   524    0.81   33.92   521
4          194,700   0.42   29.61   521    0.46   28.96   524
5          206,800   0.58   29.31   530    0.42   29.61   521
6          198,600   0.63   28.28   529    0.58   29.31   530
7          205,800   0.79   29.08   534    0.63   28.28   529
8          194,600   0.84   30.12   526    0.79   29.08   534
9          148,000   0.99   39.77   506    0.84   30.12   526
10         186,000   1.19   34.13   528    0.99   39.77   506
11         200,200   1.33   32.61   532    1.19   34.13   528
12         189,500   1.43   35.52   526    1.33   32.61   532
13         186,500   1.10   34.42   524    1.43   35.52   526
14         180,100   0.88   37.88   509    1.10   34.42   524

Table 5 T^2 Values for Reactor Data

Observation No.   T_7^2 (critical value = 39.19)   T_4^2 (critical value = 28.73)
[most table entries are illegible in this reproduction; T_7^2 exceeds its
critical value at observations 4 and 6, while T_4^2 remains below its
critical value throughout]

described in Mason et al. (1997), several large conditional T^2 components
are produced, and each includes some subset of the variables L1, lag L1,
TEMP, and lag TEMP. For example, T_{L1.LTEMP}^2 has a value of 18.40. Such a
large conditional T^2 term implies that something is wrong with the relation-
ship between L1 and temperature. The predicted value of L1 using LTEMP
as a predictor is not within the range of the error of the model as determined
from the in-control historical data set. On closer examination, the data in
Table 4 for observation 4 suggest that the value of L1 is too small for the
temperature value. With the removal of these two components from the
signaling observation vector, the subvector containing the remaining five
variables produces no signal. The T^2 value for the subvector is 15.31,
which is insignificant compared to the critical value of 32.21 (α = 0.0001).
Given the dependency of L1 on time, as illustrated in Figure 6, it may be
surprising that we did not find a problem with the relationship between L1 and
its lag value. However, in examining the values in Table 4, it is clear that the
trend in L1 from observation 3 to observation 4 is not unusual, as there is a
downward trend in L1 from observation 1 to observation 4. However, at
observation 4, the downward movement in L1 is not in agreement with the
upward movement in the temperature, particularly when one considers the
positive correlation between these two variables noted in Table 3 for the his-
torical data set. Thus, a process problem is created, and the T^2 statistic signals.
Analysis of the signaling observation 6 produces similar results. The
conditional terms involving subsets of L1, lag L1, TEMP, and lag TEMP
are generally large in value. For example, T_{L1.LTEMP}^2 has a value of 17.72,
T_{TEMP.L1}^2 has a value of 21.98, and T_{L1.TEMP,LTEMP}^2 has a value of 21.53. All
these values indicate that there is a problem in the relationship between L1
and TEMP relative to that seen in the historical data.
Note that T_4^2, which did not include the effects of the time dependen-
cies between the process variables, failed to detect the above two data
problems. However, this is not due to a failure of the T^2 statistic, as its
performance is based solely on the provided process information. Clearly,
T_7^2 is more sensitive than T_4^2, since it has included information on the auto-
correlation that is present in three of the four variables. Thus, one would
expect its performance in signal detection to be superior.

6. A CONTROL PROCEDURE FOR STAGE DECAY PROCESSES

Process decay that occurs in stages was illustrated in Figure 3. As a general
rule, this type of decay occurs over many months or years, and the time
dependency is between different stages in the process. For example, process

performance in the second stage might be dependent on performance in the
first stage, and performance in the third stage might be dependent on per-
formance in the previous stages. A process-monitoring procedure at any
given stage must adjust the process for its performance in the previous
stages. Thus, control procedures are initiated to detect when significant
deviation occurs from the expected adjusted performance as determined
by the historical database. An overview of how this is done is briefly dis-
cussed in this section, and more extensive details and examples can be found
in Mason et al. (1996). Consider a situation where a three-stage life has been
determined for a production facility consisting of n units. Observations are
homogeneous within each stage but heterogeneous between stages. An in-
control historical data set, composed of observations on p variables for each
unit during each stage of operation, is available. This is represented symbo-
lically in Table 6, where each X_ij is a p-dimensional vector that represents an
observation on the p process variables; i.e.,

    X_ij' = (x_ij1, x_ij2, ..., x_ijp)

where i = 1, ..., n, and j = 1, 2, 3.
The proposed solution for the T^2 control procedure for use with such
stage-decay process data is to use a 3p-dimensional observation vector given
by X_k' = (X_k1, X_k2, X_k3), k = 1, 2, ..., n. The vector X_k represents all the
observations taken on the p variables from a given processing unit across
the three stages of its life. For a given production unit, the observations
across the three stages are time-related and thus dependent. However,
within a given stage, observations are independent between production
units. Since X_k has three components corresponding to the three life cycles
of the unit, it will be possible to adjust the p process variables in the T^2
statistic for the corresponding stage dependencies.
Suppose X_k can be described by a multivariate normal distribution
with a mean vector represented as µ' = (µ_1', µ_2', µ_3'), where the µ_i, i =
1, 2, 3, are the p-dimensional mean vectors of the process variables at the
ith stage. The covariance structure for X_k is given as

    Σ = [ Σ_11  Σ_12  Σ_13 ]
        [ Σ_21  Σ_22  Σ_23 ]
        [ Σ_31  Σ_32  Σ_33 ]
Table 6 Three-Stage Life History

Unit   Stage 1   Stage 2   Stage 3
1      X_11      X_12      X_13
2      X_21      X_22      X_23
...
n      X_n1      X_n2      X_n3

where Σ_ii represents the covariance structure of the observations for the ith
stage, i = 1, 2, 3, and Σ_ij, i ≠ j, denotes the covariance structure of the
observations between stages. Using a historical data set, standard estimates
(X̄, S) of the unknown population parameters (µ, Σ) can be obtained, and a
control procedure based on an overall T^2 can be developed. Note that the
estimates are partitioned in the same fashion as the parameters.
As an example of the proposed control procedure, suppose a new
observation, X, is taken on a given unit in its third stage. The overall T^2
for this observation is given by

    T^2 = (X - X̄)'S^{-1}(X - X̄)
and will be used as the charting statistic. Interpretation of a signaling vector
is keyed to the partitioned parts of X (i.e., the subvectors representing
observations on the unit at the various stages). Significant components of
the T^2 decomposition and how they pertain to the observation vector X
taken in stage 3, assuming satisfactory performance in stages 1 and 2, are
presented in Table 7.
When a signaling T^2 component is identified, it can be decomposed to
locate the signaling variable or group of variables. Suppose a problem is
located in the conditional T_{3.2}^2 term. This implies, from Table 7, that the
observation vector taken at stage 3, adjusted for the process performance at
stage 2, is out of control. With this result, however, we will not know if the
process performance is better or worse than that indicated by the historical
situation unless we further examine the source of the problem in terms of the
Table 7 Interpretation of Components in Stage Decay, p = 3

Component    Interpretation of component
T_3^2        Checks if the p components of the observation vector X_3 are within
             tolerance.
T_{3.1}^2    Checks process performance in stage 3, i.e., X_3, adjusting for perfor-
             mance in stage 1 as given by X_1.
T_{3.2}^2    Checks process performance in stage 3, adjusting for performance in
             stage 2.
T_{3.12}^2   Checks process performance in stage 3, adjusting for performance in
             stages 1 and 2.

individual variables. To do this, we will need to perform a second decom-
position, but this one will involve decomposing the signaling conditional T^2
component.
For p = 3, one possible decomposition of T_{3.2}^2 is given by

    T_{3.2}^2 = (T_1^2)_{3.2} + (T_{2.1}^2)_{3.2} + (T_{3.12}^2)_{3.2}
Interpretation of these doubly decomposed terms is the same as for any T^2
with variable components. For example, (T_1^2)_{3.2} represents an unconditional
T^2 term and can be used to check the tolerance of the first component of the
observation vector.
In general, incoming observations on a new unit are monitored in a
sequential fashion. When a unit is in stage 1, only the observation X_1 is
available, and monitoring is based on use of the statistic

    T^2 = (X_1 - X̄_1)'S_11^{-1}(X_1 - X̄_1)

If a signal is observed, the T^2 is decomposed and the signaling variables are
determined. For signaling observations in the remaining stages, the proce-
dure is the same as that outlined above for an observation in stage 3.
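A term such as T_{3.2}^2 can be computed from the partitioned historical estimates by the usual conditional-normal formulas; the following sketch uses simulated stand-in data, with names and dimensions chosen for illustration.

    import numpy as np

    def stage3_conditional_t2(x3, x2, xbar, s, p):
        # T^2_{3.2}: the stage-3 subvector adjusted for stage-2 performance,
        # via the conditional mean and covariance from the partitioned
        # historical estimates (xbar, s) of the stacked 3p-vector (X1, X2, X3).
        i2, i3 = slice(p, 2 * p), slice(2 * p, 3 * p)
        w = np.linalg.solve(s[i2, i2], s[i2, i3])     # p x p regression block
        mu = xbar[i3] + w.T @ (x2 - xbar[i2])         # conditional mean
        cov = s[i3, i3] - s[i3, i2] @ w               # conditional covariance
        d = x3 - mu
        return d @ np.linalg.solve(cov, d)

    p = 2
    rng = np.random.default_rng(7)
    hist = rng.normal(size=(400, 3 * p))              # stand-in historical data
    xbar, s = hist.mean(axis=0), np.cov(hist, rowvar=False)
    x = rng.normal(size=3 * p)                        # new unit, stages 1-3 stacked
    print(stage3_conditional_t2(x[2 * p:], x[p:2 * p], xbar, s, p))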

7. SUMMARY

The charting of autocorrelated multivariate data in a control procedure
presents a number of serious challenges. A user must not only examine
the relationships existing between the process variables to determine if
any are unusual but must also adjust the control procedure for the effects
of the time dependencies existing among these variables. This chapter pre-
sents one possible solution to problems associated with constructing multi-
variate control procedures for processes experiencing either uniform decay
or stage decay.
The proposed procedure is based on exploiting certain properties of
Hotelling's T^2 statistic. The first useful property is the inherent dependency
of this statistic on the relationships that exist between and among the pro-
cess variables. If time dependencies exist, they can be identified by including
time variables in the observation vector and then examining their relation-
ships with the process variables. A second important property of T^2 is that
its signaling values can be decomposed into components that lead to clearer
interpretation of signals. The resulting decomposition terms can be used to
monitor relationships with the other variables and to determine if they are in
agreement with those found in the historical data set. This property is
particularly helpful in examining stage-decay processes, as the decay occurs
sequentially and thus lends itself to analysis by repeated decompositions of
the T^2 statistic obtained at each stage.

REFERENCES

Alt FB, Deutch SJ, Walker JW. (1977). Control charts for multivariate, correlated
    observations. ASQC Technical Conference Transactions. Milwaukee, WI:
    American Society for Quality Control, pp 360-369.
Mason RL, Young JC. (1999). Improving the sensitivity of the T^2 statistic in
    multivariate process control. J Qual Technol 31. In press.
Mason RL, Tracy ND, Young JC. (1995). Decomposition of T^2 for multivariate
    control chart interpretation. J Qual Technol 27:99-108.
Mason RL, Tracy ND, Young JC. (1996). Monitoring a multivariate step process.
    J Qual Technol 28:39-50.
Mason RL, Tracy ND, Young JC. (1997). A practical approach for interpreting
    multivariate T^2 control chart signals. J Qual Technol 29:396-406.
Montgomery DC. (1991). Introduction to Statistical Quality Control. New York:
    Wiley.
Montgomery DC, Mastrangelo CM. (1991). Some statistical process control meth-
    ods for autocorrelated data. J Qual Technol 23:179-193.
Tracy ND, Mason RL, Young JC. (1993). Use of the covariance matrix to explore
    autocorrelation in process data. In: Proceedings of the ASA Section on
    Quality and Productivity. Boston, MA: American Statistical Association, pp
    133-135.
Capability Indices for Multiresponse
Processes
Alan Veevers
Commonwealth Scientific and Industrial Research Organization,
Clayton, Victoria, Australia

1. INTRODUCTION

Production processes can be characterized by the simple fact that something
is produced as a result of a number of deliberate actions. The product may
be an item such as a glass bottle, a brake drum, a tennis ball, or a block of
cheese. Alternatively, it may be a polymer produced in a batch chemical
process or a shipment of a mineral ore blended from stockpiles that are
being continuously replenished. Whatever the case, there will usually be
several measurable quality characteristics of the product for which specifi-
cations exist. These are often a pair of limits between which the appropriate
measurement is required to lie. Sometimes a specification is a one-sided limit,
such as an upper limit on the amount of an impurity in the product of a
chemical reaction.
The extent to which a process could or does produce product within
specifications for all its measured quality characteristics is an indication of
the capability of the process. Capability can be measured both with and
without reference to targeting, and it is important to distinguish between
these two situations. The principal reasons why product may be produced
out-of-specification, i.e., nonconforming, are either poor targeting of the
process mean or excessive variation or a combination of both. In process
development or improvement campaigns, the two situations relate to the
following questions.


1. Are the ranges of variation in my product characteristics small
   enough to fit within the specification ranges?
2. How shall I choose the aim-point for my process mean in order to
   minimize the proportion of nonconforming product?
Capability potential is concerned with the first question. It is a comparison of
a measure of process dispersion with the amount of dispersion allowed by
the specifications. Capability performance addresses the second question and
is concerned with what actually happens during a period of stable produc-
tion. These concepts have been formalized for a single response by the
introduction of capability indices; see, for example, Kane [1], of which C_p
(for potential) and C_pk (for performance) are the most commonly used.
These, and other, indices are discussed in the book by Kotz and Johnson
[2], which, together with the references therein and other chapters of the
present volume, provide a good summary of single-response capability
indices. For multiresponse processes, the question arises as to whether or
not suitable and useful multivariate capability indices exist. If so, they will
need to provide answers to the above two questions. Several indices have
been proposed for multiresponse processes, and some of them are discussed
later. However, it is first necessary to deal with some important issues of
clarification.

2. CAPABILITY STUDIES, PROCESS MONITORING, AND
CONTROL

Since capability indices were brought to the attention of mathematical and
statistical researchers in the 1980s, there has been some self-perpetuating
confusion in the literature. A number of authors, for example Chan et al. [3]
and Spiring [4], argue that C_p is a poor capability measure because it fails to
take account of the target. What seems to be forgotten is that the p in C_p
stands for potential, and there was never any intention that it should take
account of the target. C_p is meant as an aid to answering question 1 posed in
Section 1, and concerns variation but not location. On the other hand, C_pk
was devised to help answer question 2 and refers to the actual performance
of the process when targeting has taken place. There is no need to compare
C_p with C_pk (or with any other performance measures), because they mea-
sure different things. The fact that C_p and C_pk are both routinely reported
during the performance phase in automotive and other manufacturing pro-
cesses might cloud the issue but should not lead to them being regarded as
alternative measures of the same thing. For example, if a stable process is
reporting C_p = 2.1 and C_pk = 0.9, then the most likely explanation is that

the process mean is not optimally targeted. The information provided by the
C_p value tells us that the process is potentially capable without further need
to reduce variation. Process performance will be improved, monitored by
C_pk, by suitably adjusting the target for the process mean.
Similar considerations apply to multiresponse capability indices.
Specifically, there is a clear justification for developing analogs of C_p for
the multivariate case that, of course, take no account of targeting. Such an
index will measure the potential of the process to meet specifications
(addressing question 1) but will not, by intent, measure actual performance.
Different measures must be devised for the latter purpose.
Another source of confusion arises when process capability and pro-
cess control issues are not separated. An illustration of the point being made
here is based on the following example. During the 1997 Australian Open
Tennis tournament, some of the top players complained about the quality of
the balls being used. International regulations specify that they shall weigh
not less than 56.7 g and not more than 58.5 g and must be between 6.35 cm
and 6.67 cm in diameter. The tennis ball production process must be set to
achieve both these specifications simultaneously. This defines a rectangular
specification region for the bivariate quality measure consisting of the
weight and diameter of a tennis ball. A small sample of measurements on
ordinary club tennis balls was obtained that showed a correlation of 0.7
between weight and diameter. This information was used to contrive the
situation shown in Figure 1 to illustrate the difference between capability
and control considerations. Suppose that a period of stable production
produced data approximately following a bivariate normal distribution
with a correlation coefficient of 0.7. A 99% probability ellipse for such a
distribution is shown in Figure 1. Now suppose that the next two measured
balls are represented by the + signs in the figure. Two conclusions can be
drawn: first, that the process has gone out of statistical control, and second,
that the two new balls are perfectly capable of being used in a tournament.
In fact, the two new balls are arguably better, in the sense of being nearer to
the center of the specification region, than any of the balls produced in the
earlier stable phase.
From the process control point of view, the out-of-control signals
must be acted upon and steps taken to bring the process back into stable
production. Multivariate process control techniques, such as that intro-
duced by Sparks et al. [5] or those discussed in a previous chapter of this
book, are available for this purpose. Based on multivariate normal theory,
ellipsoidal control regions form the natural boundaries for in-control obser-
vations. Points falling outside the control region are usually interpreted as
meaning that something has gone wrong with the process. From the process
capability point of view, it is whether or not production will consistently

Figure 1 A 99% probability ellipse representing the bivariate distribution of the
tennis ball quality characteristics lies comfortably inside the specification rectangle.

meet specifications that is of primary importance. In this case, the fact that
the region bounding the swarm of data points may be ellipsoidal is of minor
importance. The main concern is whether or not it fits into the specification
region. Capability indices are not tools for process control and should not
be thought of as measures by which out-of-control situations are detected.
They are simply measures of the extent to which a process could (potential)
or does (performance) meet specifications. Issues of control and capability
need to be kept separate; otherwise unnecessary confusion can occur. For
example, although correlation is of critical importance in control methodol-
ogy, it is largely irrelevant for many capability considerations.

3. MULTIVARIATE CAPABILITY INDICES

As pointed out by Kotz and Johnson [2], most multivariate capability
indices proposed so far are really univariate indices derived from the vector
of quality characteristics. An exception is the three-component vector index
introduced by Hubele et al. [6]. While a complete review of the subject to
date is not intended, some of the significant developments are mentioned
here. The indices fall broadly into two groups: those that use a hyperrectan-
gular specification region and those that use an ellipsoidal specification
region. Within those groups there are indices that measure capability poten-
tial and some that measure capability performance.
Let X_q = (X_1, X_2, ..., X_q)' represent the vector of q quality character-
istics, and suppose that an adequate model for X_q under stable process
conditions is multivariate normal with mean vector µ and variance-covar-
iance matrix Σ. Taking the widely accepted value of 0.27% to be the largest
acceptable proportion of nonconforming items produced, a process ellipsoid

    (X - µ)'Σ^{-1}(X - µ) = c^2

where c^2 is the 0.9973 quantile of the χ^2 distribution on q degrees of free-
dom, can be defined. More generally, c^2 can be chosen to correspond to any
desired quantile.
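For reference, c^2 is available directly from the χ^2 quantile function; a one-line Python check (q = 2 chosen for illustration):

    from scipy.stats import chi2
    print(chi2.ppf(0.9973, df=2))   # c^2 for q = 2: about 11.83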
Referring to the ellipsoid as the process region, the two questions of
interest can be rephrased as follows.
1. With freedom of targeting, would it be possible for the process
   region to fit into the specification region?
2. During stable production with the mean of the process distribu-
   tion targeted at the point T, what proportion of nonconforming
   product can be expected?
Attempts at direct extension of C_p set out to compare a measure of process
variation with a measure of the variation allowed by the specifications. A
difficulty immediately arises because the specification region is almost
always a hyperrectangle. Even if it is not, it is unlikely to be ellipsoidal
and even more unlikely to be ellipsoidal with the same matrix Σ^{-1} as the
process region. Nonetheless, capability indices have been proposed based on
ellipsoidal specification regions. Davis et al. [7] assume Σ = σ^2 I and define a
spread ratio, U/σ, for the special case of circular and spherical specification
regions. Here, U is the radius of the circle or sphere, and the target is the
center point. Thus, they are addressing questions 1 and 2 together. The
focus of their article is on nonconforming parts, and they present a table
giving the number of nonconforming parts per billion corresponding to any
spread ratio between 3.44 and 6.85. Chan et al. [8] define an ellipsoidal
specification region with the same matrix as the process region and offer
an extension of C_pm to address question 2. Taam et al. [9] also extend C_pm
using an index that is the ratio of the volume of a modified specification
region to the volume of a scaled process region. These last two articles
(apparently an earlier version of the second one) are discussed by Kotz
and Johnson [2] together with the suggestions of Pearn et al. [10], who
introduce two indices based on the ratio of a generalized process length to
a generalized length allowed by the specifications.

Tang and Barnett [11] introduce three indices for multiresponse pro-
cesses. The first involves projecting the process ellipsoid onto its component
axes and taking the minimum of the one-dimensional C_p values, each scaled
by a projection factor and a deviation-from-target factor. They note that this
index does not involve the correlations between elements of X_q. The second
index is similar to the first but uses the Bonferroni inequality to determine a
process hyperrectangle such that each side is a 100(1 - α/q)% centered
probability interval for the marginal distribution. A usual choice would
be to take α = 0.0027. The third index is based on a process region obtained
using Sidak's probability inequality but is otherwise of a similar form to the
first two. Tang and Barnett [11] show that the third index is the least con-
servative and is favored over the other two.
Chen [12] defines a general specification region, or tolerance zone,
consisting of all values of X_q for which h(X_q - T) ≤ r_0, where h(·) is a
positive function with the same scale as X_q and r_0 is a positive number.
The process is capable if

    P(h(X_q - T) ≤ r_0) ≥ 0.9973

so, taking r to be the minimum value for which

    P(h(X_q - T) ≤ r) ≥ 0.9973

a capability index can be defined as r_0/r. The formulation includes ellipsoi-
dal and hyperrectangular specification regions as special cases. Hubele et al.
[6] propose a three-component vector index for bivariate response processes.
The first component is an extension of C_p, namely the ratio of the area of the
specification rectangle to the area of the process rectangle. The second
component is the significance level of Hotelling's T^2 statistic testing for a
location shift, and the third is an indicator of whether or not the process
rectangle falls entirely within the specification rectangle. This last com-
ponent is necessary because the first component can give a C_p-like value
suitably greater than 1 despite one of the quality characteristics being, in
itself, not capable.
A completely different approach is taken by Bernardo and Irony [13],
who introduce a general multivariate Bayesian capability index. They use a
decision-theoretic formulation to derive the index

    C_b(D) = Φ^{-1}(P(X_q ∈ A | D))

where A is the specification region, D represents the data, and Φ is the
standard normal distribution function. The distribution of X_q can be of

any type, and exploration of the posterior predictive distribution of C_b given
D is limited only by available computing power.
Most of the above indices are not easy to use in practice and present
difficult problems in the exploration of their sampling distributions. Two
approaches that don't suffer from this are given by Boyles [14] and Veevers
[15]. Boyles moves away from capability assessment and promotes capabil-
ity improvement by using exploratory capability analysis. Further develop-
ments in this area are described by Boyles (in the present volume). Veevers'
approach is based on the concept of process viability, which is discussed in
the next section.

4. PROCESS VIABILITY

Veevers [15, 16] realized the difficulties associated with extensions of C_p and
C_pk to multiresponse processes and concluded that the reasons lay in the
logic underlying the structure of C_p and C_pk. This led to the notion of
process viability as a better way of thinking about process potential than
the logic underlying C_p. He introduced the viability index first for a single-
response process and then for a multiresponse process.
Basically, viability is an alternative to capability potential, leaving the
word "capability" to refer to capability performance. For a single-response
process it is easy to envisage a window of opportunity for targeting the
process mean. Consider the process distribution, which need not be normal,
and, conventionally, identify the lower 0.00135 quantile and the upper
0.99865 quantile. Place this distribution on a scale of measurement that
has the lower and upper specification limits (LSL and USL, respectively)
marked on it, with the lower quantile coincident with the LSL. If the USL is
to the right of the upper quantile, slide the distribution along the line until
the upper quantile coincides with the USL. The line segment traced out by
the mean of the distribution is the window of opportunity for targeting the
mean. The interpretation of the window is that if the mean is successfully
targeted anywhere in it, then the proportion of nonconforming items will be
no greater than 0.27%. A process for which a window of opportunity such
as this exists is said to be viable; i.e., all that needs to be done is to target the
mean in the allowable window. If, however, the USL is to the left of the
upper quantile (after the first positioning), then there is clearly more varia-
tion in the response than is allowed for by the specifications, and the process
is not viable. Sliding the distribution to the left until the upper quantile and
the USL coincide causes the mean to trace out a line segment that, this time,
can be thought of as a "negative" window of opportunity for targeting the
mean. Referring to the length of the window, in both cases, as w, a viable
240 Veevers

process will have a positive w and a nonviable process a negative w . The


viability index is defined as

11'
V, =
USL - LSL

If the process is comfortably viable, then w will be a reasonable portion of
USL − LSL, but if the process is only just viable, w will be zero and V_v = 0.
Processes that are not viable will have V_v negative.

If the quality characteristic has a normal distribution with standard
deviation σ, it is easy to see that 6σ + w = USL − LSL for both positive
and negative w; hence V_v = 1 − 1/C_p. Some readers will know that an
early capability ratio was C_r = 1/C_p (see, e.g., Amsden et al. [17]), so
V_v = 1 − C_r. Statistical properties of estimators of V_v are relatively
straightforward to establish, as indicated in Veevers [15]. It must be
remembered that the viability index is a measure of capability potential
and addresses only question 1. The knowledge that a process is viable is
valuable even if an unacceptable proportion of nonconforming parts is
produced when the process is operating. It means that the process must
be targeted better (question 2) to achieve acceptable capability perfor-
mance, but there is no need, at this stage, to reduce variation. Of course,
in a continuous improvement environment, steps would be taken to reduce
variation in the longer term, but that is separate from the point being
made here.
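To make the single-response construction concrete, here is a minimal
Python sketch, assuming a normal quality characteristic and the conventional
0.00135/0.99865 quantiles; the function name and the numbers are illustrative,
not from Veevers [15].

# A sketch of the single-response viability calculation, assuming normality.
from scipy.stats import norm

def viability(sigma, lsl, usl, q_low=0.00135, q_high=0.99865):
    """Return the window length w and the viability index V_v."""
    width = (norm.ppf(q_high) - norm.ppf(q_low)) * sigma  # distribution width (~6 sigma)
    w = (usl - lsl) - width        # window of opportunity; negative if not viable
    return w, w / (usl - lsl)      # V_v = w / (USL - LSL)

w, vv = viability(sigma=1.0, lsl=0.0, usl=8.0)
cp = (8.0 - 0.0) / (6 * 1.0)       # C_p = 1.33 for the same process
print(round(vv, 2), round(1 - 1 / cp, 2))   # both 0.25, illustrating V_v = 1 - 1/C_p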
Extension of V_v to multiresponse processes requires the definition of a
multidimensional window of opportunity for targeting the mean.
Because the process is viable, the distribution of X_q can be located almost
entirely within the hyperrectangular specification region, A. And since
targeting is not at issue, the distribution can be thought of as free to
move around. The shape of the distribution will not change with this
movement, only its location. In particular, because the correlations are
fixed, the orientation of a process ellipsoid for a multivariate normal dis-
tribution will remain constant during location shifts. The window of
opportunity for targeting the mean of the distribution consists of all points
μ for which the proportion of nonconforming items would be less than
0.27%. The boundary of the window can be envisaged as the locus of μ as
the distribution is moved around inside A while keeping exactly 0.27% of
the probability mass outside A and 99.73% inside A. Figure 2 shows the
window of opportunity for a viable bivariate normally distributed process.
The window is almost a rectangle, with sides parallel to the specification
rectangle, except that its corners are rounded due to simultaneous breach-
ing of the two marginal specifications.

Figure 2 The window of opportunity (dotted rectangle) for targeting the mean for
a viable bivariate process. The solid rectangle is the specification region.

The viability index for a q-dimensional multiresponse process is
defined as

$$V_{rq} = \frac{\text{volume of } w}{\text{volume of } A}$$

A process is viable only if it is separately viable in all its individual quality
characteristics. Otherwise it is not viable, and variation must be reduced,
at least in the characteristics that prompted the nonviable decision. In
order to produce a practically useful index, Veevers [15] represents the
process distribution by a process rectangle that has as its sides the widths
of the one-dimensional marginal distributions. The width of a univariate
distribution is the difference between the 0.99865 quantile and the 0.00135
quantile (or as appropriate, depending on the amount of probability to be
excluded).

The window of opportunity for a viable process can thus be envisaged
by sliding this rectangle around just inside the specification region and
ignoring the rounding at the corners. For a viable process this leads to
the expression
$$V_{rq} = \prod_{i=1}^{q} V_v(X_i)$$

where V_v(X_i) is the viability index for the ith quality characteristic X_i. For
nonviable processes, Veevers [15] defines negative windows of opportunity
in such a way as to ensure that the viability value obtained for a (q − 1)-
dimensional process is the same as would be obtained from the q-dimen-
sional process by setting the marginal variance of the qth characteristic
equal to zero. Hence, V_rq is defined in all nonviable cases to be

$$V_{rq} = 1 - \prod_{i=1}^{q} [1 - V_v(X_i)]^{\delta_i}$$

where

$$\delta_i = \begin{cases} 0 & \text{if } V_v(X_i) \ge 0 \\ 1 & \text{if } V_v(X_i) < 0 \end{cases}$$

As with any index for multiresponse processes, the viability index is best
used in a comparative fashion. In a process improvement campaign the
viabilities can be compared after each improvement cycle, thus providing
a simple measure of the progress being made. V_rq depends only on the
marginal viabilities and is therefore independent of the correlation structure
of X_q. The correlation coefficients do, however, affect the proportion of
nonconforming items that would occur if the process was in production.
If an upper bound of 0.27% is required, then a conservative choice of
quantiles to use for the calculation of the marginal viabilities is 0.00135/q
and 1 − 0.00135/q. The specific choice in an improvement campaign is
unimportant, since the emphasis is on changes in V_rq rather than the pro-
portion nonconforming.
Having had some experience with multiresponse viability calculations,
the following modification to the V_rq index is proposed. First, note that a
viable process with, say, q = 6 and marginal viabilities of 0.25 each (corre-
sponding to C_p values of 1.33) has V_r6 = 0.00024. It is difficult to relate this
small number to the reasonable level of viability it represents. Further, it
depends on q, and for larger values of q the viability index would be very
small. These difficulties can be overcome by defining a modified index

$$V^*_{rq} = (V_{rq})^{1/q}$$

for viable processes. This has the benefit of being interpretable on the scale
of V_v, independently of q. For nonviable processes, V_rq is negative, so V*_rq
must be defined as

$$V^*_{rq} = \mathrm{sign}(V_{rq})\,|V_{rq}|^{1/q}$$

which is also valid for viable processes and provides a general definition of
V*_rq. A plot of V*_rq for a two-response process is shown in Figure 3. If desired,
V*_rq can be converted to a capability potential index, C*_rq, by C*_rq =
1/(1 − V*_rq).
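As a numerical check of these definitions, here is a minimal Python
sketch (the helper name is illustrative) that reproduces the q = 6 example
above.

# A sketch of the multiresponse viability calculations; V_rq is the product
# of the marginal viabilities, and V*_rq = sign(V_rq)|V_rq|^(1/q).
import math

def modified_viability(marginals):
    q = len(marginals)
    v_rq = math.prod(marginals)                           # V_rq (viable case)
    v_star = math.copysign(abs(v_rq) ** (1 / q), v_rq)    # modified index V*_rq
    return v_rq, v_star, 1 / (1 - v_star)                 # and C*_rq = 1/(1 - V*_rq)

print(modified_viability([0.25] * 6))   # V_r6 = 0.00024, V*_r6 = 0.25, C*_r6 = 1.33

Applied to the rolled-plate marginal viabilities given in the example that
follows (0.147, 0.185, 0.111, 0.137), the same function gives V*_r4 ≈ 0.143
and C*_r4 ≈ 1.167.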
Viability calculations are illustrated in the following example used by
Sparks et al. [5] to demonstrate the dynamic biplot for multivariate process
monitoring. A flat rolled rectangular metal plate is supposed to be of uni-
form thickness (gauge) after its final roll. Measurements are made at four
positions on the plate, giving a four-dimensional response for the process.
The positions can be conveniently referred to as FL (front left), FR (front
right), BL (back left), and BR (back right). The original data are subject to a
confidentiality agreement, so they have been transformed before being
plotted as pairwise scatter diagrams in Figure 4. Typical specification limits
are superimposed, but it must be remembered that this is being done to
visualize process dispersion relative to specifications and does not represent
actual process performance with respect to targeting. The two-, three-, and
four-dimensional specification regions are squares, cubes, and a hypercube,
as appropriate.

The individual viabilities for FL, FR, BL, and BR are calculated as
0.147, 0.185, 0.111, and 0.137, respectively. This implies the existence of a
positive window of opportunity for targeting the mean and gives V_r4 =
0.000415 and V*_r4 = 0.143.

Figure 3 Theviability index V;; plottedagainstthe widths, W , and W2, of the


marginal distributions for a bivariate process with unit specification ranges.
Figure 4 Pairwise scatter diagrams of thickness data at four locations, FL, FR,
BL, and BR, on 100 metal sheets. Specification rectangles are superimposed.
Using the relationship between viability and capability, this corresponds to
a capability potential of C*_r4 = 1.167. Since all these values are intended
for use in comparative situations, suppose some process changes gave
individual viabilities for FL, FR, BL, and BR of 0.190, 0.225, 0.175, and
0.210, respectively. Then V_r4 = 0.00157 and V*_r4 = 0.199, indicating the
improvement in viability. Experience in the use of viability indices is
necessary in order to get a feel for the extent of the improvement.
Converting to a capability potential value gives C*_r4 = 1.248. Practitioners
used to working with C_p may feel more comfortable on this scale of mea-
surement in the first instance.

5. PRINCIPAL COMPONENT CAPABILITY

Although specification regions are generally hyperrectangular, support for
differently shaped regions determined by loss functions is growing. Consider
a situation where the marginal specifications have ranges 2a_i, i = 1, 2, ..., q.
By transforming X_q to Y_q, where the elements of Y_q are Y_i = X_i/a_i, the
specification region becomes a hypercube of side 2. If, on this scale of
measurement, the loss associated with an item is proportional to the dis-
tance between Y_q and the center of the region, then a hyperspherical toler-
ance region would be appropriate. The word "tolerance" is used here to
distinguish the region from the specification region, which remains a hyper-
cube.

For the purpose of developing a capability index there are several
choices of centered hyperspheres that approximate the specification region.
For example, there is the unit-radius inscribing hypersphere, the √2-radius
outscribing hypersphere, and the hypersphere with the same volume as the
specification hypercube.
If Y_q is adequately modeled by a multivariate normal distribution with
variance-covariance matrix Σ_Y, then the question of capability potential
revolves around whether or not the process ellipsoid will fit inside the hyper-
sphere. This is governed only by the "length" of the principal axis of the
ellipsoid. A suitable length can be obtained by taking a multiple of the
standard deviation of the first principal component, Z_1, of Σ_Y, since this
is along the principal axis of the ellipsoid. Denoting by λ_1 the eigenvalue
associated with Z_1, it follows that the standard deviation of Z_1 is √λ_1.
Hence, taking 6√λ_1 to be the length of the principal axis of the ellipsoid,
a capability potential index can be constructed by comparing this length
with the diameter of the tolerance region. Using the unit-radius hypersphere
gives
$$C = \frac{1}{3\sqrt{\lambda_1}}$$

and using the √2-radius hypersphere gives

$$C = \frac{\sqrt{2}}{3\sqrt{\lambda_1}}$$

each of which could be used in its own right as a capability index. However,
it seems a sensible compromise to take the average of these two as a measure
of capability potential. Hence, a principal component capability index is
defined as

$$C_{pc} = \frac{1 + \sqrt{2}}{6\sqrt{\lambda_1}}$$

More generally, C_pc could be defined as k/√λ_1, where k is a constant to be
determined from considerations of the maximum acceptable proportion of
the centered process distribution allowed to be outside the specification
region. Since C_pc is meant to be an index of capability potential that is
intended for use as a comparative measure, fine-tuning of k is unimportant
and will not be further considered here.
The sampling distribution of the natural estimator of C_pc can be stu-
died using the sampling distribution of the eigenvalue associated with the
first principal component of the estimated variance-covariance matrix Σ̂_Y.
The following example shows the spirit in which C_pc may be used. The
plastic bracket and metal fitting attached to a car's internal sun visor are
manufactured to specifications relating to the torque involved in the swivel
action. Four torque quality characteristics, X_4, are measured which, in dis-
guised units, have nominal values 2, 2.25, 2, 2.25 and specifications (1, 3),
(1, 3.5), (1, 3), (1, 3.5), respectively. Data on 30 items from a batch gave

$$\hat{\Sigma}_Y = \begin{bmatrix} 0.0390 & 0.0306 & -0.0008 & -0.0004 \\ 0.0306 & 0.0423 & -0.0032 & -0.0018 \\ -0.0008 & -0.0032 & 0.0589 & 0.0519 \\ -0.0004 & -0.0018 & 0.0519 & 0.0579 \end{bmatrix}$$

with entries rounded to four decimal places. From this, λ̂_1 = 0.1105, giving
Ĉ_pc = 1.21. As an absolute value this should be interpreted with caution, but
for process improvement purposes it is useful as a comparative value. A
sample of 25 items from a batch produced under slightly different conditions
gave λ̂_1 = 0.086 and Ĉ_pc = 1.37, showing a marked improvement. The man-
ufacturer's aim is to keep the process at these conditions, which show it to be
potentially capable, and then concentrate on targeting at the nominal values
to ensure a capable performance.
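A brief Python sketch (not part of the original analysis) reproduces λ̂_1 and
Ĉ_pc for the first batch, using the C_pc formula above.

# A sketch reproducing the principal component capability calculation.
import numpy as np

S = np.array([[ 0.0390,  0.0306, -0.0008, -0.0004],
              [ 0.0306,  0.0423, -0.0032, -0.0018],
              [-0.0008, -0.0032,  0.0589,  0.0519],
              [-0.0004, -0.0018,  0.0519,  0.0579]])

lam1 = np.linalg.eigvalsh(S)[-1]               # largest eigenvalue (ascending order)
cpc = (1 + np.sqrt(2)) / (6 * np.sqrt(lam1))   # C_pc as defined in this section
print(round(float(lam1), 4), round(float(cpc), 2))   # 0.1105 and 1.21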

6. CONCLUSION

Capability indices for multiresponse processes have been discussed. It has
been stressed that capability potential indices are useful in their own right
and should not be confused or unfairly compared with capability perfor-
mance indices. Most of the literature on indices for multiresponse processes
concerns extensions to C_p and C_pk. The viability index, V_v, however,
offers an alternative way of thinking about capability potential and extends
naturally to multiresponse processes. A modification to the multiresponse
viability index is proposed that makes it easier to interpret in practice.
Calculations are illustrated on real data from a rolling mill. A new principal
component capability index is presented that is based on a loss function
proportional to the distance from the process mean to the target point.
Another real example from the motor parts industry is used to illustrate
the use of this index. In all cases it is emphasized that capability indices for
multiresponse processes are best used in comparative fashion and should be
treated with caution as individual values.

REFERENCES

1. Kane VE. Process capability indices. J Qual Technol 18:41-52, 1986.
2. Kotz S, Johnson NL. Process Capability Indices. London: Chapman and Hall, 1993.
3. Chan LK, Cheng SW, Spiring FA. A new measure of process capability: C_pm. J Qual Technol 20:162-175, 1988.
4. Spiring FA. A unifying approach to process capability indices. J Qual Technol 29:49-58, 1997.
5. Sparks RS, Adolphson AF, Phatak A. Multivariate process monitoring using the dynamic biplot. Int Stat Rev 65:325-349, 1997.
6. Hubele NF, Shahriari H, Cheng C-S. A bivariate process capability vector. In: JB Keats, DC Montgomery, eds. Statistical Process Control in Manufacturing. New York: Marcel Dekker, 1991, pp 299-310.
7. Davis RD, Kaminsky FC, Saboo S. Process capability analysis for processes with either a circular or a spherical tolerance zone. Qual Eng 5:41-51, 1992.
8. Chan LK, Cheng SW, Spiring FA. A multivariate measure of process capability. J Modeling Simulation 11:1-6, 1991.
9. Taam W, Subbaiah P, Liddy JW. A note on multivariate capability indices. J Appl Stat 20:339-351, 1993.
10. Pearn WL, Kotz S, Johnson NL. Distributional and inferential properties of process capability indices. J Qual Technol 24:216-231, 1992.
11. Tang PF, Barnett NS. Capability indices for multivariate processes. Technical Report 49 EQRM 14, Victoria University of Technology, Melbourne, Australia, 1994.
12. Chen H. A multivariate process capability index over a rectangular solid tolerance zone. Stat Sin 4:749-758, 1994.
13. Bernardo JM, Irony TZ. A general multivariate Bayesian process capability index. Statistician 45:487-502, 1996.
14. Boyles RA. Exploratory capability analysis. J Qual Technol 28:3-98, 1996.
15. Veevers A. Viability and capability indexes for multi-response processes. J Appl Stat 25:545-558, 1998.
16. Veevers A. A capability index for multiple responses of a process. Technical Report DMS-D95/1, CSIRO Division of Mathematics and Statistics, Sydney, Australia, 1995.
17. Amsden RT, Butler HE, Amsden DM. SPC Simplified: Practical Steps to Quality. New York: Quality Resources, 1986.
15
Pattern Recognition and Its Applications
in Industry

R. Gnanadesikan
Rutgers University, New Brunswick, New Jersey
J. R. Kettenring
Telcordia Technologies, Morristown, New Jersey

1. INTRODUCTION

In a very general sense, pattern recognition is often considered to be the
essence of intelligence. For example, an often heard argument for the ability
of human chess masters to beat state-of-the-art computer programs is that
whereas the latter may be fast in enumerating a large number of moves and
consequences, the masters tend to rely on some innate "pattern recognition"
abilities based on extensive experience. In a more limited sense, pattern
recognition arises in many guises in industrial settings, e.g., robotics in
manufacturing, detection of errors in massive software systems, and widely
used image analysis applications in medicine and in such things as airport
luggage scanners.
For purposes of this chapter, the phrase "pattern recognition" is used
to indicate an even more specific statistical methodological area, that of
classification and clustering. The term "classification" is used for situations
wherein so-called training samples that can be labeled by their origin (the
case of "known" groups) are available and one is interested in using these as
the bases for classifying so-called test samples. Other terminology for this
class of pattern recognition methods includes discriminant analysis and
supervised learning. In the clustering scenario, on the other hand, all one
has are the data at hand, with no labels to identify sources (the case of
"unknown" groups), and the analysis leads to finding groupings of the
observations that are more similar within groups than across them. This
setting is also known as unsupervised learning. There are, of course, many
real-world situations that fall between the two scenarios, and often one
needs a combination of the two approaches to find useful solutions to the
problem at hand. For instance, while the early development of so-called
neural networks, which basically are automatic classifiers implemented in
either software or hardware, focused on supervised learning methods, the
current uses of these encompass both supervised and unsupervised learning
algorithms.
This chapter has three objectives. First, taking a broad view of busi-
ness and industry, it seeks to identify a variety of aspects of such enterprises,
as well as examples of specific problems arising in such facets, wherein
classification and clustering techniques are used to find appropriate solu-
tions. Second, using the theme of quality and productivity as a focus, it
describes a sample of applications (drawn from both the literature and
our experience) in which this theme is a clear objective of using such tech-
niques. Third, it is aimed at discussing some methodological issues that cut
across applications and need to be addressed by practitioners to ensure
effective use of the methods as well as by researchers to improve the options
available to practitioners.

More specifically, Section 2 identifies areas of business and industry, as
well as some specific examples of problems in such areas, where classifica-
tion and clustering techniques have been used. It also describes in a bit more
detail a subset of the examples where assessment and improvement of qual-
ity, efficiency, and/or productivity are explicitly involved as a goal of the
analysis. Section 3 discusses some general methodological issues that need to
be considered. Section 4 consists of concluding remarks.

2. ASPECTS AND EXAMPLES OF BUSINESS AND INDUSTRIAL
PROBLEMS AMENABLE TO PATTERN RECOGNITION

Perhaps the better known industrial applications of pattern recognition,
including some that were mentioned in the introduction, are in manufactur-
ing. However, one can identify a number of facets that are integral parts of
business and industry as a whole and give rise to problems that are amen-
able to the meaningful use of pattern recognition methods. Table 1 contains
a partial list of different facets of a business enterprise and some specific
examples of applications of classification and clustering methods in each
category. A subset of the examples (identified by asterisks) in Table 1,
Table 1 Applications of Classification and Clustering Methods Within a
Business Enterprise

Finance
Use of discriminant analysis for effective development of credit ratings of individuals
and firms, including bond ratings [See, e.g., Chapters IV and V of Altman et al.
(1981).]
*Use of discriminant analysis and clustering for developing "comparable risk"
groups of companies for the purpose of determining appropriate "rates of return"
(Chen et al., 1973, 1974; Cohen et al., 1977)

Marketing
Use of cluster analysis for market segmentation on the basis of geodemographic
similarity [See, e.g., Chapter 12 of Curry (1993).] and the recent development of
database marketing
*Use of cluster analysis for identifying "lead users" and for product development in
light of the needs of such lead users (Urban and Von Hippel, 1988)

Resource allocation
Utilization of robotics (entailing the recognition of "shapes" and "sizes" of objects
to be assembled into a product) in assembly line manufacturing, with gains in
quality and productivity arising from decreased variability and speed as well as
lower costs in the long run [See, e.g., Dagli et al. (1991).]
Niche applications of neural networks for such things as speech and writing recogni-
tion (e.g., voice-activated dialing of telephones; automatic verification of payments
of bills paid by customers via checks)
Use of cluster analysis for grouping similar jobs prior to the development of regres-
sion models for aiding assessment and improvement of utilization of computing
resources (Benjamini and Igbaria, 1991)
*Use of cluster analysis in the development of a curriculum that better meets job
needs and is likely to enhance worker productivity (Kettenring et al., 1976)

Software engineering
Use of fuzzy clustering to improve the efficiency of a database querying system
(Kamel et al., 1990)
Use of discriminant analysis for predicting which software modules are error-prone
(Conte et al., 1986)
*Use of neural networks for "clone" recognition in large software systems (Carter et
al., 1993; Barson et al., 1995)

Strategic planning
*Use of cluster analysis for identifying efficient system-level technologies (Mathieu,
1992; Mathieu and Gibson, 1993)
wherein assessment and improvement of quality, efficiency, or productivity
was an explicit goal, is now described in a bit more detail.

2.1. Finance
As noted in Table 1, classification and clustering are used to establish
categories of comparable risk so as to determine appropriate rates of
return.
Historically, and particularly during the 1970s, one role of governmen-
tal regulatory bodies in the United States was to set allowed rates of return
on equity for the companies they regulated. The regulated companies argued
that in order to attract investors they needed higher rates of return, while the
regulators felt pressured to keep them low. An accepted tenet for resolving
the two conflicting aims was that the rate of return should be commensurate
with the "risk" associated with the firm. For implementing this principle,
one formal approach employs the capital assets pricing model espoused by
Lintner (1965), Markowitz (1959), and Sharpe (1964). Chen et al. (1973)
took a different and more empirical approach by using data concerning
several variables that are acknowledged to be risk-related (e.g., debt ratio,
price/earnings ratio, stock price variability) and finding companies with
similar risk characteristics that could then be compared in terms of their
rates of return. Standard & Poor's COMPUSTAT database pertaining to
over 100 utilities and over 500 industrials was the source, and a particular
interest of the analysis was to compare AT&T's rate of return within the
group of firms that shared its risk characteristics.
At an initial, general level of analysis, Chen et al. (1973) addressed the
question of AT&T's classification as belonging to either the utility group or
the industrial group through the use of discriminant analysis. They found
strong evidence that AT&T belonged with the industrials. To provide a
different look, one could use cluster analysis to find groups of firms with
similar risk features and further investigate the particular cluster to which
AT&T belongs. Since the primary interest of the authors was in the latter,
and also partly because the number of firms was large, an attempt was made
to find a "local" cluster near AT&T in terms of the risk measures rather
than clustering all the firms [see Cohen et al. (1977) for details of the algo-
rithm involved]. This analysis led to detecting a cluster of 100 industrial
firms with risk comparable to AT&T's. In terms of the performance measure
of rate of return, AT&T's value was found to lie below the median of the
rates of return of this cluster, thus providing a quantitative basis for arguing
for a higher rate of return.

2.2. Marketing
In market research, classification and clustering can serve as aids in product
development in light of the needs of lead users.

Urban and Von Hippel (1988) describe an innovative approach to
product development in situations where the technology may be changing
very rapidly. Efficiency in developing a product with an eye to capturing a
significant share of the market is the desired goal. The efficiency arises from
studying a carefully chosen subset of the potential market and yet ending up
having a product that is likely to satisfy the needs of and be adopted by a
much larger group of customers. The main steps of the approach proposed
by Urban and Von Hippel are to use cluster analysis for identifying a set of
"lead" users of the product, then seek information from such users about
what features and capabilities they would like the product to have, and
finally apply this information not only to develop the product but also to
test its appeal and utility for a wider group of users. The specific product
used to illustrate the approach is software for computer-aided design of
printed circuit boards (PC-CAD). Careful choice of variables that are likely
to indicate "lead" users is a key part of and reason for the success of the
initial cluster analysis. Variables used included measures of in-house build-
ing of PC-CAD systems, willingness to adopt systems at early stages of
development, and degree of satisfaction with commercially available sys-
tems. A total of 136 firms were clustered on the basis of such variables.
Both two- and three-cluster solutions were studied, and the former was
chosen as satisfactory, with one of the two clusters being predominantly
"lead" users. Treating the two clusters as if they were prespecified, i.e.,
the discriminant analysis framework, for instance, the authors report that
the fraction correctly classified in the two clusters was almost 96%. More
interesting, when information gathered from the lead users was used to
design a new PC-CAD system and this new design was presented to the
participants in the study, about 92% of the lead user group and 80% of the
non-lead group rated it as their first choice! Urban and Von Hippel (1988)
also discuss the advantages and disadvantages of their lead-user methodol-
ogy in general contexts.

2.3. Resource Allocation


Kettenring et al. (1976) describe the role of cluster analysis to assess the
current validity of course objectives in a multifaceted industrial training
curriculum for workers with evolving training needs. The approach involves
three major components: (1) careful preparation of an inventory of the p
current elements of the job, (2) collection of data about the nature of their
jobs and training needs from a sample of n workers engaged in the job at
which the training is directed, and (3) cluster analysis of the resulting (p × n)
matrices in various ways. In one analysis, the p = 169 rows of a matrix
indicating elements performed on the job by the sample of n = 452 workers
yielded insights into clusters of elements of the job that fit together and
might potentially be taught together as a module. These helped identify
gaps in the existing curriculum where new resources were needed. In another
analysis, the n = 452 workers were clustered into groups with common train-
ing needs. The range of needs across the clusters suggested that a training
program with flexible options would be an efficient way to train the workers.

2.4. Software Engineering
Carter et al. (1993) (see also Barson et al., 1995) tackle the problem of clone
detection in large telecommunications software systems. A clone is a unit of
software source code that is very similar to some other unit of code in the
same system. In large systems with a long history, it may happen that there
are several clones of the same piece of software. These can unnecessarily
inflate the size of the overall system and make it less efficient to maintain.
For example, should there be a fault in one of the clones, it would probably
be present and need to be corrected in the others as well.
The two papers mentioned above discuss different neural network
approaches to software clone detection. In Carter et al. (1993), an unsuper-
vised neural net is used to form clusters of software units based on a set of
features or variables. The variables characterize different aspects of a unit of
source code, such as its physical layout. New units of code can be compared
against existing clusters to see if they fall within one of these clusters. The
overall approach is attractive, even though it does not yet appear to have
been widely applied.

2.5. Strategic Planning
Mathieu (1992) (see also Mathieu and Gibson, 1993) discusses an interesting
use of cluster analysis for prioritizing critical technologies in national policy
making and guiding the choice of an efficient system-level technology. One
of the prime difficulties in such situations is the interdependencies among the
technologies. This work claims to be the first in the literature to provide a
systematic quantitative method for explicitly identifying "high perfor-
mance" technologies for aiding national policy making. As stated by
Mathieu, "the purpose of using cluster analysis in technology planning is
to determine natural groupings of system level technologies based upon the
scientific interdependencies that link these technologies." The particular
application discussed in this work concerned satellite technologies and pol-
icy making related to these in Washington, DC. Thirty (= n) system-level
technologies were considered for the clustering, and 72 (= p) binary vari-
ables that measure the presence or absence of 72 element-level support
technologies in each of the system-level technologies were used for the clus-
ter analysis. The analysis led to six clusters of the system-level technologies,
with the smallest of the clusters containing only two technologies and the
largest group containing seven.
For aiding the identification of high performance technologies, two
variables extraneous to the cluster analysis were introduced, market share
and sales growth rate, and average values of these for all U.S. companies for
the system-level technologies grouped in each cluster were computed.
Mathieu (1992) used an interesting graphical scheme (see Fig. 1) for a
two-dimensional display of these averages for the six clusters. The six clus-
ters are represented by circles and labeled with the names given to them by
Mathieu. The circles are centered at the average values with diameter pro-
portional to the total U.S. market size for each technology group and thick-
ness proportional to a measure of cluster "tightness." The display thus
contains information on four characteristics. Relatively large and thick
circles located toward the top left corner of the display would indicate the

Figure 1 Mathieu's six clusters of system-level technologies, plotted by average
market share and average sales growth. (Copyright 1991 IEEE.)
system-level technologies that were preferred. From the configuration
shown here, Mathieu concluded that while no single technology group is
uniformly dominant with respect to all four characteristics, the two labeled
"onboard satellite communications equipment" and "scientific satellites"
appear to be favorable choices, while the two labeled "remote sensing"
and "transmission equipment" are clearly ruled out in terms of the desire
to choose high performance technologies.

3. SOME STATISTICAL METHODOLOGICAL ISSUES

The discussion in the previous section was designed to leave the impression
that methods of pattern recognition are used in many facets of business and
are having considerable impact on matters of quality and productivity.
Indeed, if one takes a reasonably holistic view of quality management, it
is not a stretch to conclude that these methods are a potent part of the
arsenal of tools for quality improvement.

At the same time, practitioners of these methods need to be aware of
the care that is necessary for their successful use. The applications literature,
unfortunately, is not reassuring in this regard; subtle details are seldom
discussed, and canned programs appear to be heavily, even totally, relied
upon.
The difficulties start at the earliest part of the analysis when a commit-
ment is made to what data and which variables to use. The temptation is to
include every variable of possible value to avoid missing out on an impor-
tant one. The price one pays for this ranges from a needlessly watered down
analysis to full-blown distortion of the results. In cluster analysis, the risk is
particularly severe: Clear-cut clusters confined to a subspace of the variables
can be completely overlooked.
The traditional methods of discriminant analysis have the nice math-
ematical property of being invariant under nonsingular linear transforma-
tion of the data. However, in most cluster analysis procedures, this is not the
case. There is explicit or implicit commitment to a metric that at one
extreme may be invariant but otherwise without rationale (as when one
uses the total covariance matrix of the entire data set to form a weighting
matrix for the metric) and at the other may involve no reweighting of the
variables and therefore no such invariance (as in the case of Euclidean
distance). An intermediate, and far too popular, example is autoscaling, or
weighting to equalize the total sample variances of all the variables. This
works against detecting clusters by all methods that take autoscaled data, or
distances derived from them, as input. Rather than putting the variables on
an equal footing according to their within-cluster variation (which is what
one would prefer to do), it places variables with cluster structure on the
same overall footing as those without such structure and thereby makes
it more difficult to find the clusters via standard algorithms. See
Gnanadesikan et al. (1995) for further discussion and mitigating alterna-
tives.
Another worry is which method or algorithm to choose for an analy-
sis. Neural networks? Classical discriminant analysis, or classification
trees? A hierarchical method, or a partitioning method of cluster analysis?
There are many choices. Users need to be sensitive to the pros and cons of
them and to resist having the analysis driven by the content of the nearest
software package. A very appealing strategy in pattern recognition work is,
in fact, to apply a thoughtful variety of methods to the data. The hope is
that major well-formed patterns will emerge from different looks at the data,
and others that are less pronounced but still potentially noteworthy will
reveal themselves in at least one of the alternative calculations.
The findings can also be made more credible by subjecting them to a
variety of sensitivity analyses for a particular method. For example, con-
trolled jiggling of the data or systematic deletion of variables and/or obser-
vations followed by reapplication of the method can help one to appreciate
just how stable or fragile the results are (see Gnanadesikan et al., 1977;
Cohen et al., 1977).
As the number of variables, p, or observations, n, grows (and this is
clearly the trend in many industrial applications) a much more daunting
challenge arises. Many of the standard pattern recognition methods become
impractical or literally break down. The irony of this is that with massive
sets of data one needs just such pattern recognition approaches to bring the
data under control by dividing them into manageable chunks.
To illustrate the point, consider what is probably the most popular
and widely available form of clustering, hierarchical cluster analysis. This
method operates on n(n - 1)/2 interpoint distances to produce hierarchical
trees with n leaves at the top and one trunk at the bottom. The distances
present data management challenges when n is large and the trees, which
ought to be studied, become so big that they cannot be readily drawn or
digested.
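As a small illustration (synthetic data, illustrative names) of the quadratic
growth in the number of distances that such an algorithm must manage:

# A sketch: hierarchical clustering consumes all n(n-1)/2 interpoint distances.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 3)),    # two well-separated groups,
               rng.normal(5, 1, (50, 3))])   # n = 100 observations in all

d = pdist(X)                                  # condensed distance vector
tree = linkage(d, method="average")           # hierarchical tree with n leaves
labels = fcluster(tree, t=2, criterion="maxclust")
print(d.size, np.bincount(labels)[1:])        # 4950 distances; clusters of 50 and 50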
Other popular algorithms, such as k-means, may be more suitable as n
increases, but they are not a panacea. Brand new approaches are really
needed. For example, "localizing" the analysis so that one is looking for
patterns of a particular type in a particular region of space may be one
effective way to reduce the problem to a reasonable size. See Section 2.1
for an example.
When p is too large, other complexities arise. As indicated already,
masking of patterns is a serious limitation, and available methods for
variable selection and dimensionality reduction, whether graphical or
numerical, are unlikely to work well.

To make matters worse, even current practice for reducing the number
of variables when p is only moderately large is open to criticism. Again in
the context of cluster analysis, a widely advocated and practiced technique is
to reduce dimensionality via principal components analysis. Although this
can work well in some situations, the logic of this approach is suspect, and it
is easy to give examples of when it fails.

The relative size of n and p can matter a lot for some types of pattern
recognition problems. If n is small relative to p, the already suspect reduc-
tion of variables via principal components will also suffer from numerical
instability problems. When both are very large, entirely new approaches to
pattern recognition may be the answer. For example, one can envisage
extensive distributed computations of massive data sets. Local exploration
may be handled by burrowing deeply into the local detail. The global solu-
tion would be obtained by ultimately stitching the local solutions together.

In summary, there is much to worry about in terms of methodological
issues if one is to take advantage of pattern recognition techniques in com-
plex industrial problems. A "black box" or "canned program" approach
will not cut it and can easily do more harm than good.

4. CONCLUDING REMARKS

Pattern recognition methods are natural ones for helping to improve quality
and productivity in industrial settings. Applications are prevalent, and sev-
eral rather different ones were given to illustrate this point. Nevertheless,
careful attention to detail is needed to ensure that the methods, which are
far from infallible, are effectively applied. When they are, they can be
powerful tools in the search for total quality management.

REFERENCES

Altman EI, Avery RB, Eisenbeis RA, Sinkey JF Jr. (1981). Applications of Classification Techniques in Business, Banking and Finance. Greenwich, CT: JAI Press.
Barson P, Davey N, Frank R, Tansley DSW. (1995). Dynamic competitive learning applied to the clone detection program. Proceedings International Workshop on Applications of Neural Networks to Telecommunications, pp 234-241.
Benjamini Y, Igbaria M. (1991). Clustering categories for better prediction of computer resources utilizations. Appl Stat 40:295-307.
Carter S, Frank RJ, Tansley DSW. (1993). Clone detection in telecommunications software systems. Proceedings International Workshop on Applications of Neural Networks to Telecommunications, pp 273-280.
Chen HJ, Gnanadesikan R, Kettenring JR. (1973). A Statistical Study of Groupings of Corporations. Bell Labs Technical Memorandum.
Chen H, Gnanadesikan R, Kettenring JR. (1974). Statistical methods for grouping corporations. Sankhya B 36:1-28.
Cohen A, Gnanadesikan R, Kettenring JR, Landwehr JM. (1977). Methodological developments in some applications of clustering. In: Krishnaiah PR, ed. Applications of Statistics. New York: North-Holland, pp 141-162.
Conte SD, Dunsmore HE, Shen VY. (1986). Software Engineering Metrics and Models. Menlo Park, CA: Benjamin/Cummings.
Curry DJ. (1993). The New Marketing Research Systems. New York: Wiley.
Dagli CH, Kumara SRT, Shin YC. (1991). Intelligent Engineering Systems Through Artificial Neural Networks. New York: ASME Press.
Gnanadesikan R, Kettenring JR, Landwehr JM. (1977). Interpreting and assessing the results of cluster analyses. Bull Int Stat Inst 47:451-463.
Gnanadesikan R, Kettenring JR, Tsao SL. (1995). Weighting and selection of variables for cluster analysis. J Classif 12:113-136.
Kamel M, Hadfield B, Ismail M. (1990). Fuzzy query processing using clustering techniques. Info Process Manage 26:279-293.
Kettenring JR, Rogers WH, Smith ME, Warner JL. (1976). Cluster analysis applied to the validation of course objectives. J Educ Stat 1:39-57.
Lintner J. (1965). The valuation of risky assets and the selection of risky investments in stock portfolios and capital budgets. Rev Econ Stat 47:13-37.
Markowitz H. (1959). Portfolio Selection: Efficient Diversification of Investments. New York: Wiley.
Mathieu RG. (1992). A method based on cluster analysis for national and regional technology policy development. Proceedings of the 1991 Portland International Conference on Management of Engineering and Technology, PICMET'91. Piscataway, NJ: IEEE, pp 685-688.
Mathieu RG, Gibson JE. (1993). A methodology for large-scale R&D planning based on cluster analysis. IEEE Trans Eng Manage 40:283-292.
Sharpe W. (1964). Capital assets prices: A theory of market equilibrium under conditions of risk. J Finance 19:425-442.
Urban GL, Von Hippel E. (1988). Lead user analysis for the development of new industrial products. Manage Sci 34:569-582.
16
Assessing Process Capability with
Indices
Fred A. Spiring
The University of Manitoba, Winnipeg, and Pollard Banknote Limited,
Manitoba, Canada

1. GENESIS

The automotive industry has been a leading promoter of process capability
indices as tools for quality improvement. It is no longer alone, as process
capability indices are now embraced by a wide variety of industries inter-
ested in assessing the ability of a process to meet customers' requirements.
The popularity of these indices is generally attributed to their ability to
provide a single-number summary that relates process performance to pro-
cess requirements. Practitioners use the single-number summary in many
ways, including (1) awarding supplier audit points based on the magnitude
of the summary value, (2) documented evidence of process performance
relative to customers' requirements, and (3) identifying processes in
need of improvement.

The use of single-number summaries to assess the overall performance
of a process has been criticized; however, when used in conjunction with
other quality tools, the information provided by these summaries can be
invaluable. Under the assumption that meeting or exceeding customer
requirements is the focus of most quality programs, and considering process
capability indices to be the quantification of the process's ability to meet
customer requirements, the increasing use of process capability measures
seems only natural. Unfortunately, users of process capability indices
have developed several "bad habits," in part due to a lack of practical,
statistically sound techniques.

2. PROCESS CAPABILITY INDICES

Process capability indices are used to assess a process's ability to meet a set
of requirements. When used correctly these indices provide a measure of
process performance that in turn can be used in the ongoing assessment of
process improvement. Indices allow statistically based inferences to be used
in the assessment of process capability as well as in the identification of
changes in the ability of the process to meet requirements.
It is generally acknowledged that Japanese companies initiated the use
of process capability indices when they began relating process variation to
customer requirements in the form of a ratio. The ratio, now referred to as
the process capability index, is defined to be

$$C_p = \frac{USL - LSL}{6\sigma}$$

where the difference between the upper specification limit (USL) and the
lower specification limit (LSL) provides a measure of allowable process
spread (i.e., customer requirements) and 6σ, σ² being the process variance,
a measure of the actual process spread (see Fig. 1).

Figure 1 Allowable process spread versus actual process spread.

C_p uses only the customer's USL and LSL in its assessment of process
capability and fails to consider a target value. The five processes depicted by

the numbered normal curves in Figure 2 have identical values of σ² and
hence identical values of C_p. However, because the means of processes 2, 3,
4, and 5 all deviate from the target (T), these processes would be considered
less capable of meeting customer requirements than process 1.

Processes with poor proximity to the target have sparked the deriva-
tion of several indices that attempt to incorporate a target into their assess-
ment of process capability. The most common process capability indices
assume T to be the midpoint of the specification limits and include

$$C_{pm} = \frac{USL - LSL}{6[\sigma^2 + (\mu - T)^2]^{1/2}}$$

$$C_{pu} = \frac{USL - \mu}{3\sigma}$$

$$C_{pl} = \frac{\mu - LSL}{3\sigma}$$

$$C_{pk} = \min(C_{pl},\, C_{pu})$$

and

$$C'_{pk} = (1 - k)C_p$$

Figure 2 Five processes with identical values of C_p.


where k = 2|T − μ|/(USL − LSL) and μ represents the process mean such
that LSL < μ < USL. The two definitions C_pk and C'_pk are numerically
equivalent when 0 ≤ k ≤ 1.

Individually, C_pu and C_pl consider only unilateral tolerances (i.e., USL
or LSL, respectively) when assessing process capability. Both use 3σ as a
measure of actual process spread, while the distance from where the process
is centered (μ) to the USL (for C_pu) or to the LSL (for C_pl) is used as a
measure of allowable process spread. Both C_pu and C_pl compare the length
of one tail of the normal distribution (3σ) with the distance between the
process mean and the respective specification limit (see Fig. 3). In the case of
bilateral tolerances, C_pu and C_pl have an inverse relationship and individu-
ally do not provide a complete assessment of process capability. However,
conservatively taking the minimum of C_pu and C_pl results in the bilateral
tolerance measure defined as C_pk.

Similar to C_p, C_pm uses USL − LSL as a measure of allowable process
spread but replaces the process variance in the definition of C_p with the
process mean square error around the target. For all processes, C_p and
C_pm are identical when the process is centered at the target [i.e.,
μ = T = (USL + LSL)/2]; however, as the process mean drifts from T,
C_pm becomes smaller while C_p remains unchanged.

Figure 3 Target is the midpoint of the specification limits.


The generalized analogs of these measures do not assume T to be the
midpoint of the specifications (see Fig. 4) and are of the form

$$C_{pm} = \frac{\min[USL - T,\; T - LSL]}{3[\sigma^2 + (\mu - T)^2]^{1/2}}$$

$$C_{pu} = \frac{USL - T}{3\sigma}\left(1 - \frac{|\mu - T|}{USL - T}\right)$$

$$C_{pl} = \frac{T - LSL}{3\sigma}\left(1 - \frac{|\mu - T|}{T - LSL}\right)$$

and

$$C_{pk} = \min(C_{pl},\, C_{pu})$$

Note that the original definitions of C_pm, C_pu, C_pl, and C_pk are special cases
of the generalized analogs with T = (USL + LSL)/2.
The process capability indices C_p, C_pu, C_pl, C_pk, and C_pm and their
generalized analogs belong to the family of indices that relate customer
requirements to process performance as a ratio. As process performance
improves, through either reductions in variation and/or moving closer to

Figure 4 Target is not the midpoint of the specification limits.


the target, these indices increase in magnitude for fixed customer require-
ments. In each case larger index values indicate a more capable process.

Many modifications to the common indices, as well as several newly
developed indices, have been proposed but are not widely used in practice.
With remarkably few exceptions these recent developments can be repre-
sented using the generic process capability index

$$C_{pw} = \frac{\min[USL - T,\; T - LSL]}{3[\sigma^2 + w(\mu - T)^2]^{1/2}}$$

where w is a weight function. Allowing the weight function to take on
different values permits C_pw to assume equivalent computational forms
for a host of potential capability measures. For example, with
T = (USL + LSL)/2 and w = 0, C_pw is simply C_p, while for w = 1, C_pw
assumes the generalized form of C_pm. Letting p = |μ − T|/σ denote a mea-
sure of "off-targetness," the weight function

$$w = \frac{|a|(2d - |a|)}{p^2 (d - |a|)^2}$$

for 0 < p, where d = (USL − LSL)/2 and a = μ − (USL + LSL)/2, allows
C_pw to represent C'_pk. The weight function

$$w = \frac{k(2 - k)}{(1 - k)^2 p^2}$$

for 0 < k < 1 allows C_pw to represent C_pk, or alternatively, defining w as a
function of C_p,

$$w = \frac{6C_p - p}{p(3C_p - p)^2}$$

for 0 < p/3 < C_p, again results in C_pw representing C_pk.
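As a quick numerical check (illustrative values, T at the midpoint), the
following sketch confirms that the weight function w = k(2 − k)/[(1 − k)²p²]
makes C_pw coincide with C_pk.

# A sketch verifying that the C_pk weight function makes C_pw equal C_pk.
import math

usl, lsl, mu, sigma = 10.0, 0.0, 6.0, 1.0
T = (usl + lsl) / 2
k = 2 * abs(T - mu) / (usl - lsl)            # k = 0.2
p = abs(mu - T) / sigma                      # off-targetness, p = 1
w = k * (2 - k) / ((1 - k) ** 2 * p ** 2)    # weight function for C_pk
cpw = min(usl - T, T - lsl) / (3 * math.sqrt(sigma**2 + w * (mu - T)**2))
cpk = min(usl - mu, mu - lsl) / (3 * sigma)
print(round(cpw, 4), round(cpk, 4))          # both 1.3333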


A recent refinement that combines properties of both C_pk and C_pm is
defined to be

$$C_{pmk} = \frac{\min[USL - \mu,\; \mu - LSL]}{3[\sigma^2 + (\mu - T)^2]^{1/2}}$$

and can be represented by C_pw using the weight function

$$w = \frac{p^2 + k(2 - k)}{p^2 (1 - k)^2}$$

3. INTERPRETING PROCESS CAPABILITY INDICES

Traditionally, process capability indices have been used to provide insights
into the number (or proportion) of product beyond the specification limits
(i.e., nonconforming). For example, practitioners cite a C_p value of 1 as
representing 2700 parts per million (ppm) nonconforming, while 1.33 repre-
sents 63 ppm; 1.66 corresponds to 0.6 ppm; and 2 indicates < 0.1 ppm. C_pk
has similar connotations, with a C_pk of 1.33 representing a maximum of
63 ppm nonconforming.
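These ppm figures follow from the tails of the normal distribution for a
centered process; a brief Python check (illustrative values):

# A sketch of the ppm interpretation: the fraction outside the limits is
# 2*Phi(-3*C_p) for a centered normal process.
from scipy.stats import norm

for cp in (1.0, 4/3, 5/3, 2.0):
    ppm = 2 * norm.cdf(-3 * cp) * 1e6
    print(f"C_p = {cp:.2f}: {ppm:.2f} ppm")   # 2699.80, 63.34, 0.57, 0.00 ppm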
Practitioners, in turn, use the value of the process capability index and
its associated number conforming to identify capable processes. A process
with C_p ≥ 1 has traditionally been deemed capable, while C_p < 1 indicates
that the process is producing more than 2700 ppm nonconforming and is
deemed incapable of meeting customer requirements. In the case of C_pk, the
automotive industry frequently uses 1.33 as a benchmark in assessing the
capability of a process. Several difficulties arise when process capability
indices are used in this manner, including (1) the robustness of the indices
to departures from normality, (2) the underlying philosophy associated with
converting index values to ppm nonconforming, and (3) the use of estimates
as parameters.

The number of parts per million nonconforming is determined directly
from the properties of the normal distribution. If the process measurements
do not arise from a normal distribution, none of the indices provides a valid
measure of ppm. The problem lies in the fact that the actual process spread
(6σ) does not provide a consistent measure of ppm nonconforming across
distributions. For example, suppose that 99.73% of the process measure-
ments fall within the specification limits for five processes, where the statis-
tical distributions associated with the processes are (1) uniform, (2)
triangular, (3) normal, (4) logistic, and (5) double exponential (see Fig. 5).
The values of C_p for the five processes are 0.5766, 0.7954, 1.0000, 1.2210,
and 1.4030, respectively. Hence as long as 6σ carries a practical interpreta-
tion when assessing process capability and the focus is on ppm nonconform-
ing, none of the indices should be considered robust to departures from
normality.
Figure 5 Five processes with equivalent nonconforming but different values of C_p.

Inherent in any discussion of the number nonconforming as a measure
of process capability is the assumption that product produced just inside the
specification limit is of equal quality to that produced at the target. This is
equivalent to assuming a square-well loss function (see Fig. 6) for the quality
variable. In practice, the magnitudes of C_p, C_pl, C_pu, and C_pk are interpreted
as a measure of ppm nonconforming and therefore follow this square-well
loss function philosophy. Any changes in the magnitude of these indices
(holding the customer requirements constant) are due entirely to changes in
the distance between the specification limits and the process mean. C_p, C_pu,
C_pl, and C_pk do not consider the distance between μ and T but are used to
identify changes in the amount of product beyond the specification limits
(not proximity to the target) and are therefore consistent with the square-
well loss function.

Figure 6 Square-well loss function.
Taguchi uses the quadratic loss function (see Fig. 7) to motivate the
idea that a product imparts "no loss" only if that product is produced at its
target. He maintains that small deviations from the target result in a loss of
quality and that as the product increasingly deviates from its target there are
larger and larger losses in quality. This approach to quality and quality
assessment is different from the traditional approach, where no loss in
quality is assumed until the product deviates beyond its upper or lower
specification limit (i.e., square-well loss function). Taguchi's philosophy
highlights the need to have small variability around the target. Clearly in
this context the most capable process will be one that produces all of its
product at the target, with the next best being the process with the smallest
variability around the target.

The motivation for C_pm does not arise from examining the number of
nonconforming product in a process but from looking at the ability of the
process to be in the neighborhood of the target. This motivation has little to
do with the number of nonconforming, although upper bounds on the
number of nonconforming can be determined for numerical values of
C_pm. The relationship between C_pm and the quadratic loss function and its
affinity with the philosophies that support a loss in quality for any departure
from the target set C_pm apart from the other indices.
Figure 7 Quadratic loss function.

C_pk and C_pm are often called second generation measures of process
capability whose motivations arise directly from the inability of C_p to con-
sider the target value. The differences in their associated loss functions
demarcate the two measures, while the magnitudinal relationships between

C_p and C_pk, C_pm are also different. C_pk and C_pm are functions of C_p that
penalize the process for not being centered at the target. Expressing C_pm and
C_pk as

$$C_{pm} = \frac{C_p}{[1 + p^2]^{1/2}}$$

and

$$C_{pk} = \left(1 - \frac{2|\mu - T|}{USL - LSL}\right) C_p$$

illustrates the "penalizing" relationship between C_p and C_pm, C_pk, respec-
tively. As the process mean drifts from the target (measured by
p = |μ − T|/σ), both C_pm and C_pk decline as a percentage of C_p (Fig. 8).
In the case of C_pm, this relationship is independent of the magnitude of C_p,
while C_pk declines as a percentage of C_p with the rate of decline dependent
on the magnitude of C_p. For example, in Figure 8, C_pk(5) represents the
relationship between C_pk and C_p for C_p = 5, and is different from C_pk(1),
which represents the relationship between C_pk and C_p for C_p = 1.
C_pk and C_pm have different functional forms, are represented by dif-
ferent loss functions, and have different relationships with C_p as the process
drifts from the target. Hence although C_pm and C_pk are lumped together as
second generation measures, they are very different in their development
and assessment of process capability.

Figure 8 Relationships between C_p and C_pk, C_pm.

4. ANALYZING PROCESS CAPABILITY STUDIES

The usual estimators of the process capability indices are

$$\hat{C}_p = \frac{USL - LSL}{6s}, \qquad \hat{C}_{pu} = \frac{USL - \bar{x}}{3s}, \qquad \hat{C}_{pl} = \frac{\bar{x} - LSL}{3s}$$

$$\hat{C}_{pm} = \frac{USL - LSL}{6[s^2 + (\bar{x} - T)^2]^{1/2}}, \qquad \hat{C}_{pk} = \min(\hat{C}_{pl},\, \hat{C}_{pu})$$

or

$$\hat{C}'_{pk} = (1 - \hat{k})\hat{C}_p$$

where k̂ = 2|T − x̄|/(USL − LSL), s is the sample standard deviation, and x̄
is the sample mean. The probability density functions (pdfs) of Ĉ_p, Ĉ_pu, Ĉ_pl,
and Ĉ_pm are easily determined, assuming the process measurements follow a
normal distribution. However, the distributions of Ĉ_pk and Ĉ'_pk raise some
challenges, as their pdfs are functions of dependent noncentral t distribu-
tions for which only asymptotic solutions currently exist.

4.1. ConfidenceIntervals
Several inferential techniques have recently been developed, most of which
havehadlittleimpact onthepractice of judging a processcapable. In
defense of the practitioners, several notable texts promote the use of esti-
mates as parameterswith the proviso that large samplesizes (i.e., I I > 50) are
required. A general confidence interval approach for the common indices
can be developed using C,,,. and its associated estimator C,,,,.. The general
form of the estimator for C,,,,. is

USL - LSL
=
‘/J!Y
+
6[6’ w(S - T)2]1/2

where $\hat{\sigma}^2 = \sum_{i=1}^{n}(x_i - \bar{X})^2/n$ and $\bar{X} = \sum_{i=1}^{n} x_i/n$. Assuming that the
process measurements are normally distributed, it follows that (1)
$\hat{\sigma}^2 \sim (\sigma^2/n)\chi^2_{n-1}$, (2) $\bar{X} \sim N[\mu, \sigma^2/n]$, and (3) X̄ and σ̂² are independent.

Assuming w and T to be nonstochastic, it follows that
$(\bar{X} - T)^2 \sim (\sigma^2/n)\chi^2_{1,\lambda}$, with noncentrality parameter λ = n(μ − T)²/σ²,
and $w(\bar{X} - T)^2 \sim (w\sigma^2/n)\chi^2_{1,\lambda}$. Defining

$$Q^2_{n,\lambda} = \frac{n[\hat{\sigma}^2 + w(\bar{X} - T)^2]}{\sigma^2}$$

$Q^2_{n,\lambda}$ is a linear combination of two independent chi-square distributions,
$\chi^2_{n-1} + w\chi^2_{1,\lambda}$, whose cumulative distribution function (cdf) $Q^2_{n,\lambda}(x)$ can be
expressed as a mixture of central chi-square distributions with the general
form

$$Q^2_{n,\lambda}(x) = \sum_{i=0}^{\infty} d_i \Pr[\chi^2_{n+2i} \le x]$$

The $d_i$'s are simply weights such that $\sum d_i = 1$ and are functions of the
degrees of freedom (n − 1 and 1), the noncentrality parameter (λ), and the
weight function (w) of the linear combination of chi-square distributions.
The functional form of the $d_i$'s for the general $Q^2_{n,\lambda}(x)$, for i = 1, 2, 3, . . .,
appears as the summand in the code below, where l denotes the value of the
noncentrality parameter and w the value of the weight function. The values
of the $d_i$'s and $Q^2_{n,\lambda}(x)$ can be calculated using the following Mathematica
code:
be calculated using the following Mathematica code:

In[1]:=
(* To determine the d_i's for the specified number of terms, first set *)
(* the noncentrality parameter l (lambda), the weight function w, and  *)
(* the number of terms i. The values below are placeholders.           *)
l = 11.4; w = 10/9; i = 20;
Do[Print[Sum[Sum[Exp[-l/2] ((l/2)^(b - k)) (((b - k)!)^(-1))*
     (w^(-.5 - b + k)) ((1 - w^(-1))^(k + g - b)) Gamma[.5 + g - b]*
     Binomial[b - 1, k]/(Gamma[g - b + 1] Gamma[.5]),
    {k, 0, b}], {b, 0, g}]], {g, 1, i}]

In[2]:=
(* Approximate the value of the distribution by replacing the infinite *)
(* sum with a finite sum of i+1 terms, using values of n, a, l, and w. *)
(* In Mathematica versions before 6, first load the package            *)
(* Statistics`ContinuousDistributions`.                                *)
l = 11.4; w = 10/9; n = 10; a = .95; i = 20;
Sum[Quantile[ChiSquareDistribution[n + 2 g], a]*
    Sum[Sum[Exp[-l/2] ((l/2)^(b - k)) (((b - k)!)^(-1))*
       (w^(-.5 - b + k)) ((1 - w^(-1))^(k + g - b)) Gamma[.5 + g - b]*
       Binomial[b - 1, k]/(Gamma[g - b + 1] Gamma[.5]),
      {k, 0, b}], {b, 0, g}], {g, 1, i}] +
  Exp[-l/2] (w^(-0.5)) Quantile[ChiSquareDistribution[n], a]

The pdf of Ĉpw can then be expressed as a function of $Q^2_{n,\lambda}(x)$, allow-
ing confidence intervals and statistical criteria to be used in assessing Cpw,
while also providing small sample distribution properties for Ĉp, Ĉpm, Ĉpk,
and Ĉ′pk. Returning to the general form of the index,

$$C_{pw} = \frac{USL - LSL}{6[\sigma^2 + w(\mu - T)^2]^{1/2}}$$

it follows that $(1 + w\lambda/n)^{1/2}C_{pw} = (USL - LSL)/6\sigma$. By considering

$$\Pr\left[Q^2_{n,\lambda}(\alpha/2) \le \frac{n[\hat{\sigma}^2 + w(\bar{X} - T)^2]}{\sigma^2} \le Q^2_{n,\lambda}(1 - \alpha/2)\right] = 1 - \alpha$$

where $Q^2_{n,\lambda}(\beta)$ represents the value of the $Q^2_{n,\lambda}(x)$ variate for n, λ, and prob-
ability β, it follows that

$$\hat{C}^2_{pw} = C^2_{pw}\,\frac{n + w\lambda}{Q^2_{n,\lambda}}$$

which implies

a general confidence interval for Cpw of the form

$$\hat{C}_{pw}\left[\frac{Q^2_{n,\lambda}(\alpha/2)}{n + w\lambda}\right]^{1/2} \le C_{pw} \le \hat{C}_{pw}\left[\frac{Q^2_{n,\lambda}(1 - \alpha/2)}{n + w\lambda}\right]^{1/2} \qquad (1)$$

For w = 0, Cpw = Cp = (USL − LSL)/6σ, and the confidence interval in
Eq. (1) becomes

$$\hat{C}_p\left[\frac{\chi^2_{n-1}(\alpha/2)}{n}\right]^{1/2} \le C_p \le \hat{C}_p\left[\frac{\chi^2_{n-1}(1 - \alpha/2)}{n}\right]^{1/2}$$

where Ĉp = (USL − LSL)/6s.
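For the w = 0 case, the interval involves only central chi-square quantiles and is easy to compute directly. The summary values n, Ĉp, and α below are assumed for illustration:

(* sketch: confidence interval for Cp (the w = 0 case above) *)
n = 50; cphat = 1.2; alpha = 0.05;   (* assumed illustrative values *)
{cphat Sqrt[Quantile[ChiSquareDistribution[n - 1], alpha/2]/n],
 cphat Sqrt[Quantile[ChiSquareDistribution[n - 1], 1 - alpha/2]/n]}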


Similarly, for w = 1, Cpw = Cpm,

with confidence interval

$$\hat{C}_{pm}\left[\frac{Q^2_{n,\lambda}(\alpha/2)}{n + \lambda}\right]^{1/2} \le C_{pm} \le \hat{C}_{pm}\left[\frac{Q^2_{n,\lambda}(1 - \alpha/2)}{n + \lambda}\right]^{1/2}$$

for

$$\hat{C}_{pm} = \frac{USL - LSL}{6\{s^2 + [n/(n - 1)](\bar{X} - T)^2\}^{1/2}}$$
The weight function

$$w = \frac{k(2 - k)}{(1 - k)^2\delta^2}$$

for 0 < k < 1, and assuming δ and k to be nonstochastic, results in
Cpw = C′pk with confidence interval

$$\hat{C}'_{pk}\left[\frac{Q^2_{n,\lambda}(\alpha/2)}{n + w\lambda}\right]^{1/2} \le C'_{pk} \le \hat{C}'_{pk}\left[\frac{Q^2_{n,\lambda}(1 - \alpha/2)}{n + w\lambda}\right]^{1/2}$$

where Ĉ′pk = (1 − k̂)Ĉp.


For the weight function chosen so that Cpw equals the minimum-based index,
assuming that δ (0 < δ) and k are known (i.e., nonstochastic), Cpw = Cpk
results in the confidence interval

$$\hat{C}_{pk}\left[\frac{Q^2_{n,\lambda}(\alpha/2)}{n + w\lambda}\right]^{1/2} \le C_{pk} \le \hat{C}_{pk}\left[\frac{Q^2_{n,\lambda}(1 - \alpha/2)}{n + w\lambda}\right]^{1/2}$$

where Ĉpk = min(Ĉpu, Ĉpl), with Ĉpu = (USL − X̄)/3s and Ĉpl = (X̄ − LSL)/3s.


The weight function may have to be estimated on occasion. However,
it is often possible to obtain good information regarding the weight func-
tion from the data used to ensure that the process is in control. Since we
require that the process be in a state of statistical control prior to determin-
ing any process capability measure, this generally requires that control
charts be kept on the process. In most situations the control charts will
provide very good information regarding the values needed to determine
the weight function. For example, x̿ and s̄ from the control chart can pro-
vide information regarding μ and σ, respectively, which in turn provides an
alternative method for determining the distribution function and associated
confidence interval for each of the estimated indices.

4.2. Monitoring Process Capability


A criticism of the traditional process capability study is that it provides only
a snapshot in time of the process's ability to meet customer requirements.
Process capability studies are often conducted at startup and then again
during a supplier's audit or after changes have been made to the process.
As a result, practitioners have little knowledge of the process's capability
over time. With the advent of small-sample properties for the various

measures of process capability, it is now easier to incorporate stochastic
inferences into the assessment and analysis of process capability measures
and to assess capability on a continuous basis.
If all other requirements are met, it is possible to estimate process
capability using the information gathered at the subgroup level of the tradi-
tional control charts. The usual control chart procedures are used to first
verify the assumption that the process is in control. If the process is deemed
in control, then estimates of the process capability can be calculated from
the subgroup information. These estimates are then plotted, resulting in a
chart that provides insights into the nature of a process's capability over its
lifetime. The proposed chart is easily appended to an X̄&R (or X̄&s) control
chart and facilitates judgments regarding the ability of a process to meet
requirements and the effect of changes to the process, while also providing
visual evidence of process performance.
Letting x_1, x_2, x_3, . . . , x_n represent the observations in subgroup t of
an X̄&s control chart used to monitor a process, consider

$$\hat{C}_{pm}(t) = \frac{\min[USL - T, T - LSL]}{3[s_t^2 + n(\bar{X}_t - T)^2/(n - 1)]^{1/2}}$$

where $s_t^2$ is the subgroup sample variance and $\bar{X}_t$ the average of the observa-
tions in subgroup t. If an X̄&R chart is used, consider

$$\hat{C}_{pm}(t) = \frac{\min[USL - T, T - LSL]}{3[(R_t/d_2)^2 + n(\bar{X}_t - T)^2/(n - 1)]^{1/2}}$$

where R_t denotes the range for subgroup t and d_2 the usual control chart
constant. Each subgroup in the process provides a measure of location, X̄_t,
and a measure of variability (either R_t or s_t). Hence an estimate of Cpm can
be determined for each subgroup, which results in a series of estimates for
Cpm over the life of the process.
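A one-function Mathematica version of the X̄&s subgroup estimate is sketched below (the function name is ours); mapping it over the subgroups produces the series plotted on the capability chart.

(* sketch: subgroup estimate of Cpm for an X-bar & s chart *)
cpmSub[xs_, usl_, lsl_, t_] := Module[{n, xbar, s},
  n = Length[xs]; xbar = Mean[xs]; s = StandardDeviation[xs];
  Min[usl - t, t - lsl]/(3 Sqrt[s^2 + n (xbar - t)^2/(n - 1)])]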
A mean line as well as upper and lower limits can be created for a
capability chart using information gathered from the control chart. Similar
to Shewhart control charts, the upper and lower limits for Ĉpm will represent
the interval expected to contain 99.73% of the estimates if the process has
not been changed or altered. The mean line, denoted C̄pm, will be

$$\bar{C}_{pm} = \frac{\min[USL - T, T - LSL]}{3\{(\bar{s}/c_4)^2 + [n/(n - 1)](\bar{\bar{x}} - T)^2\}^{1/2}} \qquad (2)$$

when using an X̄&s chart. Assuming equal subgroup sizes, x̿ denotes the
average of the subgroup averages X̄_t, s̄ the average of the subgroup standard
deviations s_t, and c_4 the traditional constant.
Assuming that the process measurements are X ~ N[μ, σ²] and using
$\hat{\sigma}^2 = \sum_{i=1}^{n}(x_i - \bar{X})^2/(n - 1)$, Eq. (2) can be rewritten and simplified.

The upper (U_l) and lower (L_l) limits for Ĉpm used in conjunction with an X̄&s
chart depend on the subgroup size n and the noncentrality parameter λ.
Analogous to the use of x̿ and s̄ in Shewhart charts, the noncentrality para-
meter λ = n(μ − T)²/σ² can be estimated from the control chart using
n[(x̿ − T)/(s̄/c_4)]².
When using X̄&R charts with equal subgroup sizes,

$$\bar{C}_{pm} = \frac{\min[USL - T, T - LSL]}{3\{(\bar{R}/d_2)^2 + [n/(n - 1)](\bar{\bar{x}} - T)^2\}^{1/2}}$$

where R̄ denotes the average of the subgroup ranges R_t
and d_2 the traditional constant. The upper and lower limits for Ĉpm in
conjunction with an X̄&R chart are of the form U_l = J_3 C̄pm and
L_l = J_2 C̄pm, where J_2, J_3 are constants that depend on the subgroup size
and a noncentrality parameter λ. Again analogous to the use of x̿ and R̄ in
Shewhart charts, the noncentrality parameter λ = n(μ − T)²/σ² can be esti-
mated from the control chart using n[(x̿ − T)/(R̄/d_2)]².

5. EXAMPLE
5.1. The Process
In this example 20 subgroups of size 10 were gathered from a process for
which the customer had indicated that USL = 1.2, T = 1, and LSL = 0.8.
In this case T is the midpoint of the specification limits; however, all calcu-
lations use the general definitions in determining Ĉpm(t) and the associated
limits. From the 20 subgroups we found x̿ = 1.1206 and s̄ = 0.11, which
resulted in an upper control limit of 1.230 and a lower control limit of
1.014 for X̄, and an upper control limit of 0.189 and a lower limit of 0.031
for s. Looking first at the s chart, the process variability does not appear
unusual (i.e., no out-of-control signals), which also seems to be the case with
the X̄ chart. The control limits and centerlines for the X̄&s charts are
included in Figure 9.
Since the process appears to be in control, we proceed to determine
Ĉpm(t) for each subgroup. In the case of subgroup 1, X̄_1 and s_1 were found
to be 1.15 and 0.136, respectively, resulting in

$$\hat{C}_{pm}(1) = \frac{\min[1.2 - 1, 1 - 0.8]}{3[0.136^2 + 10(1.15 - 1)^2/9]^{1/2}} = 0.32$$

Ĉpm(1) and the subsequent 19 subgroup values of Ĉpm(t) are plotted in
Figure 9.


Using (1) the customer's requirements USL = 1.2, T = 1, and LSL = 0.8,
(2) the process results x̿ = 1.1206 and s̄ = 0.11, and (3) the constants n = 10
and c_4 = 0.9727, we determined that

$$\hat{\lambda} = n\left(\frac{\bar{\bar{x}} - T}{\bar{s}/c_4}\right)^2 = 10\left(\frac{1.1206 - 1}{0.11/0.9727}\right)^2 = 11.4$$

and

$$\bar{C}_{pm} = \frac{\min[USL - T, T - LSL]}{3\{(\bar{s}/c_4)^2 + [n/(n - 1)](\bar{\bar{x}} - T)^2\}^{1/2}} = \frac{0.2}{0.5104} = 0.3918$$
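All numbers in the following sketch come from the summary values just quoted, so it simply reproduces λ̂ and C̄pm:

(* sketch: reproducing lambda-hat and Cpm-bar for the example *)
usl = 1.2; lsl = 0.8; t = 1; n = 10; c4 = 0.9727;
xddbar = 1.1206; sbar = 0.11;
lambdahat = n ((xddbar - t)/(sbar/c4))^2              (* 11.4 *)
cpmbar = Min[usl - t, t - lsl]/
  (3 Sqrt[(sbar/c4)^2 + (n/(n - 1)) (xddbar - t)^2])  (* 0.3918 *)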

Figure 9 Capability chart appended to an X̄&s chart.



The values of the upper and lower limit constants for n = 10 and λ = 11.4
are 2.23985 and 0.6120, respectively, resulting in the limits

U_l = 2.23985(0.3918) = 0.87757  and  L_l = 0.6120(0.3918) = 0.2398

which are sketched in Figure 9.

5.2. Observations and Insights
Several things are evident from Figure 9. Clearly, the estimates of the pro-
cess's capability vary from subgroup to subgroup. Except for subgroup 19,
the fluctuations in Ĉpm appear to be due to random causes. In period 19 the
process capability appears to have increased significantly and warrants
investigation. Practitioners would likely attempt to determine what caused
the capability to rise significantly and recreate that situation in the future.
If the estimated process capability had dropped below L_l, this would
signal a change in the process, and if the process capability was not at the
level required by the customer, changes in the process would be required. In
a continuous improvement program the process capability should be under
constant pressure to increase. The capability chart used in conjunction with
the traditional Shewhart variables charts will provide evidence of improve-
ment. It may also assist in ending the unfortunate practice of including
specification limits on the X̄ chart, as the additional chart will incorporate
the limits and target into the calculation of process capability.
Much like the effect of first-time control charts, practitioners will see
that process capability will vary over the life of the process, illustrating the
idea that the estimates are not parameter values and should not be treated as
such. The procedures provide evidence of the level of process capability
attained over the lifetime of the process rather than at snapshots taken,
for example, at the beginning of the process and not again until some change
in the process has been implemented. They will also provide evidence of the
ongoing assessment of process capability for customers. The effect of any
changes to the process will also show up on the chart, thereby providing
feedback to the practitioner regarding the effect that changes to the process
have on process capability.

6. COMMENTS

Several ideas have been presented that address some concerns of two dis-
tinguished quality practitioners in the area of process capability, Vic Kane
(Kane, 1986) and Bert Gunter (Gunter, 1991). Unfortunately, as noted by
Nelson (1992), much of the current interest in process capability indices is
focused on determining competing estimators and their associated distribu-
tions, and little work has dealt with the more pressing problems associated
with the practical shortcomings of the indices. Continuous monitoring of
process capability represents a step toward more meaningful capability
assessments. However, much work is needed in this area. In particular, as
practitioners move to measures of process capability that assess clustering
around the target, the effect of non-normality may be less problematic.
Currently, however, meaningful process capability assessment in the pre-
sence of non-normal distributions remains a research problem.

REFERENCES AND SELECTED BIBLIOGRAPHY

Boyles RA. (1991). The Taguchi capability index and corrigenda. J Qual Technol 23:17-26.
Chan LK, Cheng SW, Spiring FA. (1988). A new measure of process capability: Cpm. J Qual Technol 20:162-175.
Chou YM, Owen DB, Borrego SA. (1990). Lower confidence limits on process capability indices. J Qual Technol 22:223-229.
Gunter BH. (1991). Statistics corner (a five-part series on process capability studies). Qual Prog.
Johnson T. (1992). The relationship of Cpm to squared error loss. J Qual Technol 24:211-215.
Juran JM. (1979). Quality Control Handbook. New York: McGraw-Hill.
Kane VE. (1986). Process capability indices and corrigenda. J Qual Technol 18:41-52, 265.
Kotz S, Johnson NL. (1993). Process Capability Indices. London: Chapman & Hall.
Nelson PR. (1992). Editorial. J Qual Technol 24:175.
Rodriguez RN. (1992). Recent developments in process capability analysis. J Qual Technol 24:176-187.
Spiring FA. (1995). Process capability: A total quality management tool. Total Qual Manage 6(1):21-33.
Spiring FA. (1997). A unifying approach to process capability indices. J Qual Technol 29:49-58.
Vannman K. (1995). A unified approach to capability indices. Stat Sin 5(2):805-820.
17
Experimental Strategies for Estimating
Mean and Variance Function
G. Geoffrey Vining
Virginia Polytechnic Institute and State University, Blacksburg, Virginia
Diane A. Schaub
University of Florida, Gainesville, Florida
Carl Modigh
Arkwright Enterprises Ltd., Paris, France

1. INTRODUCTION

An important approach for optimizing an industrial process seeks to find
operating conditions that achieve some target condition for the expected
value of a quality characteristic (the response) and minimize the process
variability. Vining and Myers (1990) suggest that the response and the
process variance form a dual response system. They use the dual response
methodology proposed by Myers and Carter (1973) to find appropriate
operating conditions. This dual response approach allows the analyst to
see where the process can achieve the target condition and where the process
variability is acceptable. As a result, the engineer can make explicit com-
promises. Del Castillo and Montgomery (1993) extend this method by show-
ing how to use the generalized reduced gradient, which is available in some
spreadsheet programs such as Microsoft Excel, to find the appropriate oper-
ating conditions. Lin and Tu (1995) suggest a mean squared error approach
within this context. Copeland and Nelson (1996) suggest a direct function
minimization of the mean squared error with a bound on how far the esti-
mated response can deviate from the desired target value.
Vining and Myers (1990) advocate replicating a full second-order
design. Such an approach is often prohibitively expensive in terms of the
overall number of design runs. Vining and Schaub (1996) note that often the
process variance follows a lower order model than the response. They sug-
gest replicating only a first-order portion of standard response surface
designs, which significantly reduces the overall design size. This chapter
extends the work of Vining and Schaub by exploring alternative ways of
choosing the portion of the design to replicate.

2. CRITERION FOR EVALUATING DESIGNS

Suppose we run an appropriate experiment with a total of n runs. Let n_v be
the number of distinct settings that are replicated. Consider as our model for
the response

$$y = X\beta + \varepsilon$$

where y is the n × 1 vector of responses, X is the n × p_r model matrix, β is the
p_r × 1 vector of unknown coefficients, and ε is the n × 1 vector of normally
distributed random errors. Similarly, consider as the model for the process
variance

$$\tau = Z\gamma$$

where τ is the n_v × 1 vector of linear predictors, Z is the n_v × p_v model
matrix for the linear predictors, and γ is the p_v × 1 vector of unknown
coefficients. We relate the ith linear predictor, τ_i, to the ith process variance
σ_i² by

$$\sigma_i^2 = h(\tau_i)$$

where h is a twice differentiable monotonic function. Define h_i′ to be the first
derivative of h with respect to the ith τ. Often, analysts use the exp function
for h, which is similar to using a log transformation on the observed sample
variances. Throughout this chapter, we follow this convention; thus,

$$\sigma_i^2 = \exp(\tau_i)$$

This approach guarantees that σ_i² > 0.
Consider the joint estimation of β and γ. The expected information
matrix, J, is

$$J = \begin{bmatrix} X'W_{11}X & 0 \\ 0 & Z'W_{22}Z \end{bmatrix}$$

where W_11 and W_22 are diagonal matrices with nonzero elements 1/σ_i² and
(h_i′/σ_i²)²/2, respectively. Vining and Schaub (1996) prefer to use M, the
expected information matrix expressed on a per-unit basis, where

$$M = \frac{1}{n}J$$

In some sense, M represents a moment matrix. One problem with this
approach, however, is that we use all n of the experimental runs to estimate
the response, but we use only n_v distinct settings to estimate the process
variance. In this chapter, we propose an alternative moment matrix, M*,
defined by

$$M^* = \begin{bmatrix} \frac{1}{n}X'W_{11}X & 0 \\ 0 & \frac{1}{n_v}Z'W_{22}Z \end{bmatrix}$$

which is a block diagonal matrix with separate moment matrices for each
model on the diagonals.
One definition of an "optimal" design is that it is one that maximizes
the information present in the data. Much of optimal design theory uses
appropriate matrix norms to measure the size of the information matrix.
The determinant is the most commonly used matrix norm in practice, which
leads to D-optimal designs. In this particular case, we must note that M*
depends on the weights w_i, which in turn depend on γ through the τ_i's. However,
we cannot know γ prior to the experiment; hence, we encounter a problem
in determining the optimal design. One approach proposed by Vining and
Schaub (1996) assumes that τ_i = τ_0 for i = 1, 2, . . . , n. Essentially, this
approach assumes that in the absence of any prior information about the
process variance function, the function could assume any direction over the
region of interest. By initially assuming that the process variance function is
constant, the analyst does not bias the weights in any particular direction.
With this assumption, we can establish an appropriate D-optimality-
based criterion, D, for evaluating designs.

This criterion provides an appropriate basis for comparing designs. By its
definition, we are able to compare in a meaningful fashion designs that use
different numbers of total runs and different replication schemes.
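The sketch below shows one way such a criterion can be coded. The constant-variance assumption τ_i = τ_0 is imposed by dropping the then-constant weight matrices, and the (p_r + p_v)th-root scaling is our assumption about the normalization, not a formula quoted from Vining and Schaub (1996):

(* hedged sketch of a D-type criterion on the block moment matrix M* *)
dCrit[x_, z_, n_, nv_] := Module[{pr, pv, mstar},
  {pr, pv} = {Dimensions[x][[2]], Dimensions[z][[2]]};
  (* tau_i = tau0 makes W11, W22 constant; they are omitted here *)
  mstar = ArrayFlatten[{{Transpose[x].x/n, 0}, {0, Transpose[z].z/nv}}];
  Det[mstar]^(1/(pr + pv))]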

3. COMPUTER-GENERATED DESIGNS

We used this criterion within a modified DETMAX (Mitchell, 1974) algo-
rithm to generate optimal finite-run designs. Figures 1-7 display the three-
factor designs generated by this algorithm over a cuboidal region of interest
for n = 14, n = 15, n = 18, n = 22, n = 26, n = 32, and n = 59, respectively.
Taken together, these figures suggest how the optimal design evolves with
additional design runs.
Figure 1 indicates that the computer starts with a Notz (1982) design
with a resolution III fraction replicated. The Notz design is interesting
because it uses seven out of the eight cube or factorial points. It adds
three axial points in order to estimate the pure quadratic terms. Figure 2
shows what happens as we add the next point to the design. As one should
expect, it brings in the other factorial point. Figure 3 shows the optimal
design for n = 18 total runs. Interestingly, it starts adding the face centers of
the cube defined by the factorial runs. The resulting design is a central
composite design with a resolution III portion replicated, which Vining
Figure 1 The three-factor D-optimal design for 14 runs over a cuboidal region.

Figure 2 The three-factor D-optimal design for 15 runs over a cuboidal region.

Figure 3 The three-factor D-optimal design for 18 runs over a cuboidal region.

Figure 4 The three-factor D-optimal design for 22 runs over a cuboidal region.

and Schaub call a replicated factorial design. Figure 4 shows that at n = 22
the design replicates all of the cube points, as opposed to the replicated
factorial, which would replicate only a resolution III fraction of the full
factorial. Interestingly, Figure 5 shows that at n = 26, the computer adds
midpoints of edges. Vining and Schaub recommend their replicated factorial
design for this situation. The optimal design takes a slightly different strat-
egy. Figures 6 and 7 show that as we continue to add runs, the computer
moves to a 3^3 factorial with replicated cube points. It appears that the
proposed criterion favors replicating the cube points and then augmenting
with points from the full 3^3.
Figure 8 summarizes the D values for the three-factor computer-gen-
erated designs over a cuboidal region. Interestingly, the D value actually
seems to peak around n = 32 total runs, with D = 0.5851. The initial
increase in D with n makes a lot of sense because the extra runs provide
necessary symmetries. As the cube points are replicated more and more, we
presume that some imbalance in information results between the strict first-
order terms and the strict second-order terms. This imbalance may explain
why the D values drop slightly from n = 32 to the largest sample size stu-
died, n = 80.
Figures 9 and 10 extend this study to the computer-generated designs
for four and five factors, respectively. In each case, D increases as n
increases. The largest values for D observed were 0.5806 for the four-factor
case and 0.5816 for the five-factor case. These figures suggest either that D

Figure 5 The three-factor D-optimal design for 26 runs over a cuboidal region.

Figure 6 The three-factor D-optimal design for 32 runs over a cuboidal region.

Figure 7 The three-factor D-optimal design for 59 runs over a cuboidal region.

Figure 8 Plot of the value of D for the three-factor computer-generated design over a cuboidal region.

Figure 9 Plot of the value of D for the four-factor computer-generated design over a cuboidal region.

Figure 10 Plot of the value of D for the five-factor computer-generated design over a cuboidal region.

approaches some asymptote or that D may peak at some sample size larger
than the ones studied.

4. COMPARISONS OF DIFFERENT REPLICATION
STRATEGIES

Figures 11-13 use the D criterion to compare the following design strategies
for three, four, and five factors over a cuboidal region:

A fully replicated central composite design (CCD)
A fully replicated Notz (1982) design
A replicated axial design (a CCD with only the axial points replicated)
A replicated factorial design (a CCD with only a resolution III fraction
replicated)
A replicated 3/4 design (a CCD with only a 3/4 fraction replicated)
A replicated full factorial (a CCD with the entire factorial portion
replicated)

The fully replicated CCD should always be a "near-optimal" design for each
situation. In some sense, it provides a "gold standard" for comparisons.
However, replicating a full CCD is rather expensive in terms of overall
design size. The Notz design is a minimum run D-optimal design for the
second-order model over a cuboidal region. Replicating a minimal point
design is one logical way to reduce the overall design size. Vining and
Schaub (1996) note that the replicated Notz design performs surprisingly
well in the joint estimation of the two models. Vining and Schaub proposed
the replicated axial and the replicated factorial as alternative designs for
reducing the total number of runs. The replicated 3/4 design is another
possible alternative. The optimal designs generated in the previous section
strongly suggest the replicated full factorial strategy.
Figure 11 summarizes the three-factor results. In this figure, m refers to
the number of runs at each replicated setting. We evaluated each design
using 4, 8, and 12 replicates. As expected, the replicated CCD appears to
be the best overall design. Interestingly, the replicated full factorial actually
was better for m = 4. The designs that replicated only a portion of their runs
all became less efficient as the replication increased. We believe that this is
due to an increase in the imbalance in these designs. The replicated full
factorial performed slightly better than the other partially replicated designs.
The replicated 3/4 and the replicated factorial performed very similarly. The
replicated axial performed quite poorly. The replicated Notz performed
almost as well as the replicated CCD.

Figure 11 Comparisons of designs in terms of D for the three-factor cuboidal case.

Table 1 summarizes the number of runs required by each design. The
replicated factorial requires the fewest, and the replicated CCD requires the
most. Our D criterion takes the total sample size into account and thus
provides a fair comparison for these designs. In many situations, the experi-
menter cannot afford large numbers of total runs due to either time or cost.
The replicated factorial appears to be relatively competitive in terms of the
D criterion while at the same time minimizing the total number of runs. In
this light, the replicated factorial is often a very attractive design for this
type of experimentation.
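The totals in Table 1 (below) can be reproduced by simple counting; the per-strategy formulas in this sketch are our reading of the three-factor design structures, with m runs at each replicated setting:

(* sketch: three-factor design sizes as functions of m *)
sizes[m_] := {4 m + 10,  (* replicated factorial: 4 of 8 cube pts replicated *)
  6 m + 8,               (* replicated axial: 6 axial pts replicated *)
  6 m + 8,               (* replicated 3/4: 6 of 8 cube pts replicated *)
  8 m + 6,               (* replicated full factorial *)
  10 m,                  (* Notz: all 10 points replicated *)
  14 m}                  (* CCD: all 14 points replicated *)
sizes /@ {4, 8, 12}      (* reproduces the rows of Table 1 *)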
Table 1 Design Sizes for the Three-Factor Case

                               m
Design                      4      8     12
Replicated factorial       26     42     58
Replicated axial           32     56     80
Replicated 3/4             32     56     80
Replicated full factorial  38     70    102
Notz                       40     80    120
CCD                        56    112    168

Figure 12 summarizes the four-factor results. Here, the replicated
CCD performs uniformly best. Once again, the performance of all the

Figure 12 Comparisons of designs in terms of D for the four-factor cuboidal case.

designs that replicate only a portion of their runs decreases with greater
replication. The replicated full factorial, replicated 3/4 factorial, and repli-
cated factorial designs all perform similarly, with the replicated full factorial
performing slightly better than the others and the replicated factorial per-
forming slightly worse. The replicated axial performs very poorly. Once
again, the replicated Notz performs similarly to the replicated CCD.
Table 2 summarizes the total number of runs for each design. In this
case, the replicated factorial and the replicated axial require exactly the same
number of runs. They in turn require fewer runs than any other design. Once
again, taking into account Figure 12 and Table 2, the replicated factorial
appears to be a reasonable design strategy in many situations.

Table 2 Design Sizes for the Four-Factor Case

                               m
Design                      4      8     12
Replicated factorial       48     80    112
Replicated axial           48     80    112
Replicated 3/4             60    108    156
Replicated full factorial  72    136    200
Notz                       60    120    180
CCD                        96    192    288

Figure 13 Comparisons of designs in terms of D for the five-factor cuboidal case.

Figure 13 summarizes the results for the five-factor case. Interestingly,
the replicated Notz design performed best, edging out the replicated CCD.
The replicated axial again performed worst. We see bigger differences in
performance among the other three, with the replicated full factorial per-
forming uniformly better than the replicated 3/4, which in turn uniformly
outperformed the replicated factorial.
Table 3 summarizes the total number of runs required by each design.
The replicated CCD here uses a resolution V fraction of the 2^5 factorial
design. The replicated factorial, however, must use the full 2^5 factorial
design in order to minimize the number of replicated points.
Consequently, the replicated factorial is not always the smallest design.

Table 3 Design Sizes for the Five-Factor Case

                               m
Design                      4      8     12
Replicated factorial       66     98    130
Replicated axial           56     96    136
Replicated 3/4            114    210    306
Replicated full factorial  74    138    202
Notz                       84    168    252
CCD                       104    208    312

The real message of Table 3 is that all of the design strategies require a large
number of runs. In many situations, the total is prohibitive.

5. CONCLUSIONS

Our research suggests the following conclusions. First, the proposed D cri-
terion suggests that if we fit a second-order model to the response and a
first-order model to the process variance, then we need to replicate only a
subset of the base second-order design. Second, this criterion appears to
prefer replicating the full factorial as the sample size permits. Third, the
replicated factorial and the replicated 3/4 factorial designs tend to perform
well for small to moderate amounts of replication. Finally, for large
amounts of replication, we may want to consider replicating at least a
resolution V fraction (the replicated full factorial).

REFERENCES

Copeland KAF, Nelson PR. (1996). Dual response optimization via direct function minimization. J Qual Technol 28:331-336.
Del Castillo E, Montgomery DC. (1993). A nonlinear programming solution to the dual response problem. J Qual Technol 25:199-204.
Lin DKJ, Tu W. (1995). Dual response surface optimization. J Qual Technol 27:34-39.
Mitchell TJ. (1974). An algorithm for the construction of D-optimal experimental designs. Technometrics 16:211-220.
Myers RH, Carter WH Jr. (1973). Response surface techniques for dual response systems. Technometrics 15:301-317.
Notz W. (1982). Minimal point second order designs. J Stat Planning Inf 6:47-58.
Vining GG, Myers RH. (1990). Combining Taguchi and response surface philosophies: A dual response approach. J Qual Technol 22:38-45.
Vining GG, Schaub D. (1996). Experimental designs for estimating both mean and variance functions. J Qual Technol 28:135-147.
18
Recent Developments in
Supersaturated Designs
Dennis K. J. Lin
The Pennsylvania State University, University Park, Pennsylvania

1. AGRICULTURAL AND INDUSTRIAL EXPERIMENTS

Industrial management is becoming increasingly aware of the benefits of
running statistically designed experiments. Statistical experimental designs,
developed by Sir R. A. Fisher in the 1920s, largely originated from agricul-
tural problems. Designing experiments for industrial problems and design-
ing experiments for agricultural problems are similar in their basic concerns.
There are, however, many differences. The differences listed in Table 1 are
based on the overall characteristics of all problems. Exceptions can be found
in some particular cases, of course.

Industrial problems tend to contain a much larger number of factors
under investigation and usually involve a much smaller total number
of runs.
Industrial results are more reproducible; that is, industrial problems con-
tain a much smaller replicated variation (pure error) than that of agri-
cultural problems.
Industrial experimenters are obliged to run their experimental points in
sequence and naturally plan their follow-up experiments guided by
previous results; in contrast, agricultural problems harvest all results
at one time. Doubts and complications can be resolved in industry by
immediate follow-up experiments. Confirmatory experimentation is
readily available for industrial problems and becomes a routine
procedure to resolve assumptions.

305

Table 1 Differences Between Agricultural and Industrial Experiments

Subject              Agriculture    Industry
Number of factors    Small          Large
Number of runs       Large          Small
Reproducibility      Less likely    More likely
Time taken           Long           Short
Blocking             Natural        Not obvious
Missing values       Often          Seldom

The concept of blocking arose naturally in agriculture but often is not
obvious for industrial problems. Usually, industrial practitioners need
certain specialized training to recognize and handle blocking variables.
Missing values seem to occur more often in agriculture (mainly due to
natural losses) than in industry. Usually, such problems can be avoided
for industrial problems by carrying out well-designed experiments.
The supersaturated design method considered in this chapter suggests
one kind of screening method for industrial problems involving a large
number of potentially relevant factors. It may not be an appropriate
proposal for some agricultural problems.

2. INTRODUCTION

Consider the simple fact that where there is an effect, there is a cause.
Quality engineers are constantly faced with distinguishing between the
effects that are caused by particular factors and those that are due to
random error. The "null" factors are then adjusted to lower the cost; the
"non-null" (effective) factors are used to yield better quality. To distinguish
between them, a large number of factors can often be listed as possible
sources of effects. Preliminary investigations (e.g., using professional knowl-
edge) may quickly remove some of these "candidate factors." It is not
unusual, however, to find that more than 20 sources of effects exist and
that among those factors only a small portion are actually active. This is
sometimes called "effect sparsity." A problem frequently encountered in this
area is that of how to reduce the total number of experiments. This is
particularly important in situations where an individual run is expensive
(e.g., with respect to money or time). With powerful statistical software
readily available for data analysis, there is no doubt that data collection is
the most important part of such problems.

To obtain an unbiased estimate of the main effect of each factor, the
number of experiments must exceed (or at least be equal to) the number of
factors plus one (for estimating the overall grand average). When the two
numbers are equal, the design is called a saturated design; it is the minimum
effort required to estimate all main effects. The standard advice given to
users in such a screening process is to use the saturated design, which is
"optimal" based on certain theoretical optimality criteria. However, the
nonsignificant effects are not of interest. Estimating all main effects may
be wasteful if the goal is simply to detect the few active factors. If the
number of active factors is indeed small, then the use of a slightly biased
estimate will still allow one to accomplish the identification of the active
factors but significantly reduce the amount of experimental work.
Developing such screening designs has long been a well-recognized problem,
certainly since Satterthwaite (1959).
When all factors can be reasonably arranged into several groups, the
so-called group screening designs can be used (see, e.g., Watson, 1961). Only
those factors in groups that are found to have large effects are studied
further. The grouping scheme seems to be crucial but has seldom
been discussed. The basic assumptions (such as assuming that the directions
of possible effects are known), in fact, depend heavily on the grouping
scheme. While such methods may be appropriate in certain situations
(e.g., blood tests), we are interested in systematic supersaturated designs
for two-level factorial designs that can examine k factors in N < k + 1
experiments in which no grouping scheme is needed. Recent work in this
area includes, for example, that of Lin (1991, 1993a, 1993b, 1995, 1998),
Tang and Wu (1997), Wu (1993), Deng and Lin (1994), Chen and Lin
(1998), Cheng (1997), Deng et al. (1994, 1996a, 1996b), Yamada and Lin
(1997), and Nguyen (1996).

3. SUPERSATURATED DESIGNS USING HADAMARD
MATRICES

Lin (1993a) proposed a class of special supersaturated designs that can be
easily constructed via half-fractions of the Hadamard matrices. These
designs can examine k = N − 2 factors with n = N/2 runs, where N is the
order of the Hadamard matrix used. The Plackett and Burman (1946)
designs, which can be viewed as a special class of Hadamard matrices, are
used to illustrate the basic construction method.
Table 2 shows the original 12-run Plackett and Burman design. If we
take column 11 as the branching column, then the runs (rows) can be split
into two groups: group I with the sign of +1 in column 11 (rows 2, 3, 5, 6, 7,

Table 2 A Supersaturated Design Derived from the Hadamard Matrix of Order 12

Run  Row   1  2  3  4  5  6  7  8  9 10 11
      1    +  +  -  +  +  +  -  -  -  +  -
 1    2    +  -  +  +  +  -  -  -  +  -  +
 2    3    -  +  +  +  -  -  -  +  -  +  +
      4    +  +  +  -  -  -  +  -  +  +  -
 3    5    +  +  -  -  -  +  -  +  +  -  +
 4    6    +  -  -  -  +  -  +  +  -  +  +
 5    7    -  -  -  +  -  +  +  -  +  +  +
      8    -  -  +  -  +  +  -  +  +  +  -
      9    -  +  -  +  +  -  +  +  +  -  -
     10    +  -  +  +  -  +  +  +  -  -  -
 6   11    -  +  +  -  +  +  +  -  -  -  +
     12    -  -  -  -  -  -  -  -  -  -  -

and 11) and group II with the sign of −1 in column 11 (rows 1, 4, 8, 9, 10,
and 12). Deleting column 11 from group I causes columns 1-10 to form a
supersaturated design to examine N − 2 = 10 factors in N/2 = 6 runs (runs
1-6, as indicated in Table 2). It can be shown that if group II is used, the
resulting supersaturated design is an equivalent one. In general, a Plackett
and Burman (1946) design matrix can be split into two half-fractions
according to a specific branching column whose signs equal +1 or −1.
Specifically, take only the rows that have +1 in the branching column.
Then, the N − 2 columns other than the branching column will form a
supersaturated design for k = N − 2 factors in N/2 runs. Judged by a
criterion proposed by Booth and Cox (1962), these designs have been shown
to be superior to other existing supersaturated designs.
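The branching-column construction is easy to carry out directly. The sketch below builds the 12-run Plackett and Burman design cyclically from the first row of Table 2 (an assumed generation scheme, though it reproduces the rows shown), extracts group I, and reports the largest absolute cross product among the ten design columns:

(* sketch: Lin (1993a) construction from the 12-run PB design *)
row1 = {1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1};
pb12 = Append[Table[RotateLeft[row1, i], {i, 0, 10}],
   ConstantArray[-1, 11]];
groupI = Select[pb12, #[[11]] == 1 &];      (* rows 2, 3, 5, 6, 7, 11 *)
design = groupI[[All, 1 ;; 10]];            (* 10 factors in 6 runs *)
Max[Abs[Flatten[Table[design[[All, i]].design[[All, j]],
    {i, 1, 9}, {j, i + 1, 10}]]]]           (* largest |c_i'c_j| = 2 *)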
The construction methods here are simple. However, knowing in
advance that Hadamard matrices entertain many "good" mathematical
properties, the optimality properties of these supersaturated designs deserve
further investigation. For example, the half-fraction Hadamard matrix of
order n = N/2 = 4t is closely related to a balanced incomplete block design
with (v, b, r, k) = (2t − 1, 4t − 2, 2t − 2, t − 1) and λ = t − 1. Consequently,
the E(s²) value (see Section 4) for a supersaturated design from a half-frac-
tion Hadamard matrix is n²/(2n − 3), which can be shown to
be the minimum within the class of designs of the same size. Potentially
promising theoretical results seem possible for the construction of a half-
fraction Hadamard matrix. Theoretical implications deserve detailed
scrutiny and are discussed below. For more details regarding this issue,
please consult Cheng (1997) and Nguyen (1996).

Note that the interaction columns of Hadamard matrices are only
partially confounded with other main-effect columns. Wu (1993) makes
use of such a property and proposes a supersaturated design that consists
of all main-effect and two-factor interaction columns from any given
Hadamard matrix of order N. The resulting design has N runs and can
accommodate up to N(N − 1)/2 factors. When there are k < N(N − 1)/2
factors to be studied, choosing columns becomes an important issue to be
addressed.

4. CAPACITY CONSIDERATIONS

As mentioned, when a supersaturated design is used, the abandonment of
perfect orthogonality is inevitable. The designs given in Lin (1993a) based
on half-fractions of Hadamard matrices have a very nice mathematical
structure but can be used only to examine N − 2 factors in N/2 runs,
where N is the order of the Hadamard matrix used. Moreover, these designs
do not control the value of the maximal pairwise correlation r, and, in fact,
large values of r occur in some cases.
Consider a two-level k-factor design in n observations with maximal
pairwise correlation r. Given any two of the quantities (n, k, r), Lin (1995)
presents the possible values that can be achieved for the third quantity.
Moreover, designs given in Lin (1995) may be adequate to allow examina-
tion of many prespecified two-factor interactions. Some of the results are
summarized in Table 3.
Table 3 shows the maximum number of factors, k_max, that can be
accommodated when both n and r are specified, for 3 ≤ n ≤ 25 and 0 ≤ r ≤
1/3 (Table 3a for even n and Table 3b for odd n). We see that for r ≤ 1/3,
many factors can be accommodated. For fixed n, as the value of r increases,
k_max also increases. That is, the larger the nonorthogonality, the more fac-
tors can be accommodated. In fact, k_max increases rapidly in this setting.
Certainly, the more factors accommodated, the more complicated are the
biased estimation relationships that occur, leading to more difficulty in data
analysis. On the other hand, for fixed r, the value of k_max increases rapidly as
n increases. For r ≤ 1/3, one can accommodate at most 111 factors in 18
runs or 66 factors in 12 runs; for r ≤ 1/4, one can accommodate 42 factors
in 16 runs; for r ≤ 1/5, one can accommodate 34 factors in 20 runs.
Provided that these maximal correlations are acceptable, this can be an
efficient design strategy.

Table 3 Maximal Number of Factors Found, k_max, as a Function of n and nr,
for 3 ≤ n ≤ 25 and r ≤ 1/3

(a) Even n

Number of          Maximum absolute cross product, nr = |c_i'c_j|
runs n           0      2      4      6      8
 4               3      -      -      -      -
 6               -     10      -      -      -
 8               7      -      -      -      -
10               -     12      -      -      -
12              11      -     66      -      -
14               -     13    113      -      -
16              15      -     42      -      -
18               -     17      -    111      -
20              19      -     34      -      -
22               -     20      -     92      -
24              23      -     33      -    276

(b) Odd n

Number of          Maximum absolute cross product, nr = |c_i'c_j|
runs n           1      3      5      7
 3               3      -      -      -
 5               4      -      -      -
 7               7     15      -      -
 9               7     12      -      -
11              11     14      -      -
13              12     14      -      -
15              15     15     37      -
17              15     17     50      -
19              19     19     33      -
21              19     19     34     92
23              23     23     33     94
25              23     23     32     76

5. OPTIMALITY CRITERIA

When a supersaturated design is employed, as previously mentioned, the
abandonment of orthogonality is inevitable. It is well known that lack of
orthogonality results in lower efficiency; therefore we seek a design that is as
"nearly orthogonal" as possible. One way to measure the degree of non-
orthogonality between two columns, c_i and c_j, is to consider their cross
product, s_ij = c_i'c_j; a larger |s_ij| implies less orthogonality. Denote the largest
|s_ij| among all pairs of columns for a given design by s; we desire a
minimum value for s (s = 0 implies orthogonality). The quantity s can be
viewed as the degree of orthogonality that the experimenter is willing to give
up; the smaller, the better. This is by nature an important criterion. Given
any two of the quantities (n, k, s), it is of interest to determine what value
can be achieved for the third quantity. Some computational results were
reported by Lin (1995). No theoretical results are currently available, how-
ever. It is believed that some results from coding theory can be very helpful
in this direction. Further refinement is currently under investigation.
If two designs have the same value of s, we prefer the one in which the
frequency of |s_ij| = s is a minimum. This is intimately connected with the
expectation of s², E(s²), first proposed by Booth and Cox (1962) and
computed as

$$E(s^2) = \sum_i f_i s_i^2 \Big/ \binom{k}{2}$$

where f_i is the frequency of s_i² among all pairs of columns.
Intuitively, E(s²) gives the increment in the variance of estimation arising
from nonorthogonality. It is, however, a measurement for pairwise relation-
ships only. More general criteria were obtained by Wu (1993) and Deng et
al. (1994, 1996b). Deng and Lin (1994) outlined eight criteria useful for
supersaturated designs: s = max |c_i'c_j|; E(s²); ρ (Lin, 1995); the D, A, and E
criteria (Wu, 1993); the B criterion (Deng et al., 1996a, 1996b); and the
r rank (see Section 8). Further theoretical justification is currently under
study. Optimal designs in light of these approaches deserve further
investigation. In addition, the notion of multifactor (non)orthogonality is
closely related to multicollinearity in linear model theory.
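For the 10-factor, 6-run design of Section 3, both s and E(s²) are quick to compute; the sketch below repeats the construction so that it stands alone:

(* sketch: s and the Booth-Cox E(s^2) for the design of Table 2 *)
row1 = {1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1};
pb12 = Append[Table[RotateLeft[row1, i], {i, 0, 10}],
   ConstantArray[-1, 11]];
des = Select[pb12, #[[11]] == 1 &][[All, 1 ;; 10]];
ip = Flatten[Table[des[[All, i]].des[[All, j]], {i, 1, 9}, {j, i + 1, 10}]];
{Max[Abs[ip]], Total[ip^2]/Binomial[10, 2]}  (* {2, 4}; 4 = n^2/(2n - 3) *)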

6. DATA ANALYSIS METHODS

Several methods have been proposed to analyze the k effects, given only the
n (< k) observations, from the random balance design context (see, e.g.,
Satterthwaite, 1959). These methods can also be applied here. Quick meth-
ods such as these provide an appealing, straightforward comparison among
factors, but it is questionable how much available information can be
extracted using them; combining several of these methods provides a
more satisfying result. In addition, three data analysis methods for data
resulting from a supersaturated design are discussed in Lin (1995): (1)
normal plotting, (2) stepwise selection, and (3) ridge regression.
To study so many columns in only a few runs, the probability of a false
positive reading (type I error) is a major risk here. An alternative to the
forward selection procedure to control these false positive rates is as follows.
Let N = {i_1, i_2, . . . , i_m} and A = {i_{m+1}, . . . , i_k} denote indexes of inert
and active factors, respectively, so that N ∪ A = {1, . . . , k} = S. If X denotes
the n × k design matrix, our model is Y = μ1 + Xβ + ε, where Y is the n × 1
observable data vector, μ is the intercept term, 1 is an n-vector of 1's, β is a
k × 1 fixed and unknown vector of factor effects, and ε is the noise vector. In
the multiple hypothesis testing framework, we have null and alternative
pairs H_j: β_j = 0 and H_j^A: β_j ≠ 0, with H_j true for j ∈ N and H_j^A true for
j ∈ A.
Forward selection proceeds by identifying the maximum F statistic at
successive stages, where F_j^(s) denotes the F statistic for testing H_j at stage s.
In addition, if the first s variables are forced and the test is used to
evaluate the significance of the next entering variable (of the remaining k − s
variables), the procedure is again exact under the complete null hypothesis
of no effects among the k − s remaining variables. The exactness disappears

with simulated p values, but the errors can be made very small, particularly
with control variates. The analysis of data from supersaturated designs
along this direction can be found in Westfall et al. (1998).
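As a small illustration of the first stage only (the helper name is ours, and the simulated p values are not shown), the stage-1 F statistic for each candidate column follows from its simple-regression R² with the response:

(* sketch: stage-1 F statistics, one per candidate column *)
fStage1[y_, cols_] := Module[{n = Length[y], r2},
  Table[r2 = Correlation[cols[[j]], y]^2;
    (n - 2) r2/(1 - r2), {j, Length[cols]}]]
(* the maximum of these is the forward selection entry statistic *)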

7. EXAMPLES

Examples of supersaturated designs applied to real data sets can be found in
Lin (1993a, 1995). Here we apply the concept of supersaturated design to
identify interaction effects from a main-effect orthogonal design. This
example is adapted from Lin (1998). Consider the experiment in Hunter et
al. (1982). A 12-run Plackett and Burman design was used to study the
effects of seven factors (designated here as A, B, . . ., G) on the fatigue life of
weld-repaired castings. The design and responses are given in Table 4
(temporarily ignore columns 8-28). For the details of the factors and level
values, see Hunter et al. (1982).
Plackett and Burman designs are traditionally known as main-effect
designs, because if all interactions can temporarily be ignored, they can be
used to estimate all main effects. There are many ways to analyze such a
main-effect design. One popular way is the normal plot [see Hamada and
Wu (1992), Figure 1]. Using this method, it appears that factor F is the only
significant main effect. Consequently a main-effect model is fitted as follows:
ŷ = 5.73 + 0.458F, with R² = 44.5%.
Note that this low R² is not very impressive. Is it safe to ignore the
interaction effects? Hunter et al. claim that the design did not generate
enough information to identify specific conjectured interaction effects. If
this is not the case here, is it possible to detect significant interaction effects?
Hamada and Wu (1992) introduced the concept of effect heredity. After
main effects were identified, they used forward selection regression to iden-
tify significant effects among a group consisting of (1) the effects already
identified and (2) the two-factor interactions having at least one component
factor appearing among the main effects already identified. In this
particular example, a model for factor F and interaction FG was chosen:

ŷ = 5.7 + 0.458F − 0.459FG,  R² = 89%    (1)

Now, if we generate all interaction columns, AB, AC, . . ., FG, together with
all main-effect columns, A, B, . . ., G, we have 7 + 21 = 28 columns. Treat all
of those 28 columns in 12 runs as a supersaturated design (Lin, 1993a) as
shown in Table 4. The largest correlation between any pair of the design
columns is ±1/3. The results from a regular stepwise regression analysis
(with α = 5% for entering variables) yields the model
Table 4 The seven main-effect columns (1-7) and 21 two-factor interaction columns (8-28) of the 12-run Plackett and Burman design, treated as a supersaturated design.

ŷ = 5.73 + 0.394F − 0.395FG − 0.191AE,  R² = 95%    (2)

a significantly better fit to the data than Eq. (1). An application of the
adjusted p-value method (Westfall et al., 1998) reaches the same conclusion
in this example.
Note that the AE interaction, in general, would never be chosen under
the effect heredity assumption. Of course, most practitioners may consider
adding main effects A, E, and G to the final model because of the signifi-
cance of interactions FG and AE. The goal here is only to identify potential
interaction effects. In general, for most main-effect designs, such as Plackett
and Burman type designs (except for 2^(k-p) fractional factorials), one can
apply the following procedure [see Lin (1998) for the limitations]:

Step 1. Generate all interaction columns and combine them with the
main-effect columns. We now have k(k + 1)/2 design columns.
Step 2. Analyze these k(k + 1)/2 columns with n experimental runs as a
supersaturated design. Data analysis methods for such a supersatu-
rated design are available.

Note that if the interactions are indeed inert, the procedure will work well,
and if the effect heredity assumption is indeed true, the procedure will end
up with the same conclusion as that of Hamada and Wu (1992). The pro-
posed procedure will always result in better (or equal) performance than
that of Hamada and Wu's procedure.
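Step 1 is mechanical. The sketch below generates the 21 interaction columns from the seven main-effect columns of the 12-run design (built cyclically as before) and confirms that the largest pairwise correlation among all 28 columns is 1/3, as stated above:

(* sketch: Step 1 for the castings example *)
row1 = {1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1};
pb12 = Append[Table[RotateLeft[row1, i], {i, 0, 10}],
   ConstantArray[-1, 11]];
main = pb12[[All, 1 ;; 7]];                        (* A, ..., G *)
inter = Flatten[Table[main[[All, i]] main[[All, j]],
    {i, 1, 6}, {j, i + 1, 7}], 1];                 (* 21 interactions *)
allCols = Join[Transpose[main], inter];            (* 28 columns *)
Max[Abs[Table[allCols[[i]].allCols[[j]],
    {i, 1, 27}, {j, i + 1, 28}]]]/12               (* = 1/3 *)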

8. THEORETICAL CONSTRUCTION METHODS

Deng et al. (1994) proposed a supersaturated design of the form
X_c = [H, RHC], where H is a normalized Hadamard matrix, R is an ortho-
gonal matrix, and C is an n × (n − c) matrix representing the operation of
column selection. Besides the fact that some new designs with nice properties
can be obtained this way, the X_c matrix covers many existing supersaturated
designs. This includes the supersaturated designs proposed by Lin (1993a),
Wu (1993), and Tang and Wu (1997). Some justifications of its optimal
properties have been obtained as follows.
It can be shown that

$$X_c'X_c = \begin{bmatrix} nI_n & WC \\ C'W' & nI_{n-c} \end{bmatrix}$$

where W = H'RH = (w_ij) = (h_i'Rh_j) and h_j is the jth column of H. Further,
the following theorem can be demonstrated.

Theorem
Let H be a Hadamard matrix of order n and B = (b_1, . . . , b_r) be an n × r
matrix with all entries ±1, and let V = H'B = (v_ij) = (h_i'b_j). Then
1. For any fixed 1 ≤ j ≤ r, n² = Σ_{i=1}^n v_ij².
2. In particular, let B = RH and W = H'RH = (w_ij). We have
   a. (1/n)W is an n × n orthogonal matrix.
   b. n² = Σ_{i=1}^n w_ij² = Σ_{j=1}^n w_ij².
   c. w_ij is always a multiple of 4.
   d. If H is column-balanced, then ±n = Σ_{i=1}^n w_ij = Σ_{j=1}^n w_ij.

Corollary
For any R and C such that (1) R'R = I and (2) rank(C) = n − c, all X_c
matrices have an identical E(s²) value.

This implies that the popular E(s²) criterion used in supersaturated designs
is invariant for any choice of R and C. Therefore, it is not effective for
comparing supersaturated designs. In fact, following the argument in
Tang and Wu (1997), the designs given here will always have the minimum
E(s²) values within the class of designs of the same size. One important
feature of the goodness of a supersaturated screening design is its projection
property (see Lin, 1993b). We thus consider the r-rank property as defined
below.

Definition
Let X be a column-balanced design matrix. The resolution rank (or r rank,
for short) of X is defined as d − 1, where d is the minimum number of
columns in a subset of columns that are linearly dependent.

The following results are provided by Deng et al. (1994).

1. If no column in any supersaturated design X is fully aliased, then
   the r rank of X is at least 3.
2. nRh_l = Σ_{i=1}^n w_il h_i.
3. Let W = H'D(h_l)H, where D(h_l) is the diagonal matrix associated
   with h_l, namely, the lth column vector of H, and n = 4t. Then
   a. If t is odd, then there can be exactly three 0's in each row, or
      each column, of W. The rest of the w_ij can only be of the form
      ±8k + 4, for some nonnegative integer k.
   b. If t is even, then every entry w_ij in W can be of the form ±8k,
      for some nonnegative integer k.
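Part 2c of the theorem is easy to spot-check numerically. The sketch below forms a Hadamard matrix of order 12 by prepending a column of 1's to the Plackett and Burman design, takes a permutation matrix as one convenient orthogonal choice of R, and confirms that every entry of W = H'RH is a multiple of 4:

(* sketch: spot-check of Theorem 2c for n = 12 *)
row1 = {1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1};
pb12 = Append[Table[RotateLeft[row1, i], {i, 0, 10}],
   ConstantArray[-1, 11]];
h = Map[Prepend[#, 1] &, pb12];                       (* Hadamard, order 12 *)
r = Permute[IdentityMatrix[12], Cycles[{{1, 2, 3}}]]; (* orthogonal R *)
w = Transpose[h].r.h;
Union[Mod[Flatten[w], 4]]                             (* {0} *)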

These results are only the first step. Extension of these results to a more
general class of supersaturated designs of the form X_K = (R_1 H C_1, . . . ,
R_K H C_K) is promising.

9. COMPUTER ALGORITHMIC CONSTRUCTION METHODS

More and more researchers are benefiting from using computer power to
construct designs for specific needs. Unlike some cases from the optimal
design perspective (such as D-optimal design), computer construction of
supersaturated designs is not well developed yet. Lin (1991) introduced
the first computer algorithm to construct supersaturated designs. Denote
the largest correlation in absolute value among all design columns by r, as
a simple measure of the degree of nonorthogonality that can willingly be
given up. Lin (1995) examines the maximal number of factors that can be
accommodated in such a design when r and n are given.
Al Church at GenCorp Company used the projection properties in Lin
and Draper (1992, 1993) to develop a software package named DOE0 to
generate designs for mixed-level discrete variables. Such a program has been
used at several sites in GenCorp. A program named DOESS is one of the
results and is currently in a test stage. Dr. Nam-Ky Nguyen (CSIRO,
Australia) also works independently on this subject. He uses an exchange
procedure to construct supersaturated designs and near-orthogonal arrays.
As a result, a commercial product called Gendex is available for sale to the
public. Algorithmic approaches to constructing supersaturated designs seem
to have been a hot topic in recent years. For example, Li and Wu (1997)
developed a so-called columnwise-pairwise exchange algorithm. Such an
algorithm seems to perform well for constructing supersaturated designs
under various criteria.

10. CONCLUSION

1. Using supersaturated designs involves more risk than using
   designs with more runs. However, it is far superior to other experi-
   mental approaches such as subjectively selecting factors or chang-
   ing factors one at a time. The latter can be shown to have
   unresolvable confounding patterns, though such confounding pat-
   terns are important for data analysis and follow-up experiments.
2. Supersaturated designs are very useful in early stages of experi-
   mental investigation of complicated systems and processes invol-
   ving many factors. They are not used for a terminal experiment.
   Knowledge of the confounding patterns makes possible the inter-
   pretation of the results and provides the understanding of how to
   plan the follow-up experiments.
3. The success of a supersaturated design depends heavily on the
   "effect sparsity" assumption. Consequently, the projection prop-
   erties play an important role in designing a supersaturated experi-
   ment.
4. Combining several data analysis methods to analyze the data
   resulting from a supersaturated design is always recommended.
   Besides the stepwise selection procedure [and other methods men-
   tioned in Lin (1995)], PLS (partial least squares), the adjusted p
   value [see Westfall et al. (1998)], and Bayesian approaches are
   promising procedures for use in identifying active factors.
5. Another particularly suitable use for these designs is in testing
   "robustness," where the objective is not to identify important
   factors but to vary all possible factors so that the response will
   remain within the specifications.

REFERENCES

Booth KHV, Cox DR. (1962). Some systematic supersaturated designs. Technometrics 4:489-495.
Chen JH, Lin DKJ. (1998). On identifiability of supersaturated designs. J Stat Planning Inference 72:99-107.
Cheng CS. (1997). E(s^2)-optimal supersaturated designs. Stat Sin 7:929-939.
Deng LY, Lin DKJ. (1994). Criteria for supersaturated designs. Proceedings of the Section on Physical and Engineering Sciences, American Statistical Association, pp 124-128.
Deng LY, Lin DKJ, Wang JN. (1994). Supersaturated design using Hadamard matrix. IBM Res Rep RC19470, IBM Watson Research Center.
Deng LY, Lin DKJ, Wang JN. (1996a). Marginally oversaturated designs. Commun Stat 25(11):2557-2573.
Deng LY, Lin DKJ, Wang JN. (1996b). A measurement of multifactor orthogonality. Stat Probab Lett 28:203-209.
Hamada M, Wu CFJ. (1992). Analysis of designed experiments with complex aliasing. J Qual Technol 24:130-137.
Hunter GB, Hodi FS, Eager TW. (1982). High-cycle fatigue of weld repaired cast Ti-6Al-4V. Metall Trans 13A:1589-1594.
Li WW, Wu CFJ. (1997). Columnwise-pairwise algorithms with applications to the construction of supersaturated designs. Technometrics 39:171-179.
Lin DKJ. (1991). Systematic supersaturated designs. Working Paper No. 264, College of Business Administration, University of Tennessee.
Lin DKJ. (1993a). A new class of supersaturated designs. Technometrics 35:28-31.
Lin DKJ. (1993b). Another look at first-order saturated designs: The p-efficient designs. Technometrics 35:284-292.
Lin DKJ. (1995). Generating systematic supersaturated designs. Technometrics 37:213-225.
Lin DKJ. (1998). Spotlight interaction effects in main-effect designs. Qual Eng 11(1):133-139.
Lin DKJ, Draper NR. (1992). Projection properties of Plackett and Burman designs. Technometrics 34:423-428.
Lin DKJ, Draper NR. (1993). Generating alias relationships for two-level Plackett and Burman designs. Comput Stat Data Anal 15:147-157.
Nguyen N-K. (1996). An algorithmic approach to constructing supersaturated designs. Technometrics 38:69-73.
Plackett RL, Burman JP. (1946). The design of optimum multifactorial experiments. Biometrika 33:305-325.
Satterthwaite F. (1959). Random balance experimentation. Technometrics 1:111-137 (with discussion).
Tang B, Wu CFJ. (1997). A method for construction of supersaturated designs and its E(s^2) optimality. Can J Stat 25:191-201.
Watson GS. (1961). A study of the group screening methods. Technometrics 3:371-388.
Westfall PH, Young SS. (1993). Resampling-Based Multiple Testing. New York: Wiley.
Westfall PH, Young SS, Lin DKJ. (1998). Forward selection error control in the analysis of supersaturated designs. Stat Sin 8:101-117.
Wu CFJ. (1993). Construction of supersaturated designs through partially aliased interactions. Biometrika 80:661-669.
Yamada S, Lin DKJ. (1997). Supersaturated designs including an orthogonal base. Can J Stat 25:203-213.
Youden WJ, Kempthorne O, Tukey JW, Box GEP, Hunter JS. (1959). Discussion of "Random balance experimentation" by Satterthwaite. Technometrics 1:157-184.
Statistical Methods for Product Development: Prototype Experiments
David M. Steinberg
Tel Aviv University, Tel Aviv, Israel
Søren Bisgaard
University of St. Gallen, St. Gallen, Switzerland

1. INTRODUCTION

Many authors have emphasized the importance of product development for long-term business survival [1-4]. The rapid pace of technological progress in today's economy makes it increasingly important to reduce development time and get new products to market quickly. Page [5] discovered that most of the development cycle was devoted to the physical development of the product. In our experience, much of that effort goes into experiments whose goals may include improving performance, comparing design alternatives, increasing reliability, or verifying that the product meets its stated goals and specifications. Thus efficient methods of experimentation can be of great value in ramping up the learning curve and accelerating the product development process [6, 7].
In this chapter we focus in particular on the use of factorial experiments for prototype testing, building on the ideas in Bisgaard and Steinberg [8]. Prototype tests provide design engineers with valuable information about the performance of products before they are sent further downstream for tooling and ramp-up for production. The knowledge acquired from these tests can be used to optimize and robustify products. Often a sequence of prototypes is built, beginning with computer-aided design (CAD) drawings and leading to the construction of a full-scale product. Since prototype tests can be run from early on in the development cycle, they can help eliminate potential quality problems without the large costs and delays that are usually incurred when problems are discovered in the later phases of the design-to-production cycle.
The common paradigm for prototype testing is to build and evaluate a single model at each stage. This approach is implicit in the excellent account by Wheelwright and Clark [3, Chapter 10] of the role of prototypes in product development.
It is our experience that great gains can be made by using factorial experiments to study and improve product design at the prototype stage. Several design alternatives can be made, varying important design factors according to a factorial plan. The results of such experiments can substantially accelerate the path from concept development to finished product and can significantly lower the risk of discovering serious quality problems late in the development cycle.
A striking example of the importance of rapid feedback at early stages in the design process is presented by Clark and Fujimoto [9, Chapter 7] in their comprehensive study of auto manufacturers. They found that the lead time for developing a new car was about 25% less in Japan than in the United States. One major reason for this difference was that the Japanese companies were much more successful than their American counterparts at rapidly reducing the number of design problems early in the development process. Clark and Fujimoto credited this difference to the prototyping strategies that were prevalent in the two countries. The U.S. companies built few prototypes and treated them as master models; the Japanese companies built many prototypes and used them to provide information for finding and solving design problems. Our approach couples the power of statistical experiments with the Japanese strategy.
Prototype experiments have two interesting statistical features. First, it is typically much more expensive to build a prototype than to test it. Thus there is good reason, once a prototype is built, to test it extensively. The relevant test conditions, which can often be laid out in a factorial plan, will then be nested within the prototype configurations, in what is known as a split-plot structure. Second, interest often focuses on a performance curve rather than on a single-number output. In motor testing, for example, the test might examine fuel consumption as a function of load or rpm, torque as a function of rpm, compression ratio as a function of a single 360° stroke, or the curve trace of the torque or power delivered through a gear shift cycle from forward through neutral to reverse and back again. Other examples include the hysteresis curve in the testing of transformers, the spectrum of the emitted light in the testing of light bulbs, the hardness as a function of depth in ion implantation of steel, the pressure versus time curve in a pyrotechnic chain, and the characteristic curve in the testing of transistors.

Experiments that include factors related to product design along with factors that reflect test settings have received some attention within the robust parameter design strategy of Taguchi [10]. The paradigm recommended by Taguchi is to use a factorial design to prepare product or system configurations and then to run each configuration at different settings (following a second factorial plan) of noise and signal factors. The noise factors might reflect possible variations in the production or use environment, and the signal factors represent adjustable inputs that the product user can control to produce a desired response (e.g., the force applied to a brake pedal). This paradigm is similar to the setting we have in mind, in particular what Taguchi has called "dynamic experiments," which study the performance curve of a product with respect to a signal factor. However, our method of analysis differs from that proposed by Taguchi. An approach similar to ours was proposed by Miller and Wu [11] for robust design experiments with signal factors.
In this chapter we describe the general statistical methodology proposed by Bisgaard and Steinberg [8] for prototype tests. We begin in the next section with a general discussion of the product design process and the role of prototype testing. Section 3 presents a number of examples of prototype experiments. Section 4 describes a simple two-stage analysis that is appropriate when the experiment focuses on a performance curve and the test conditions are nested within prototypes. Section 5 illustrates the analysis with an experiment to improve an engine starting system [12]. Some concluding remarks follow in Section 6.

2. THE PRODUCT DESIGN PROCESS

In Figure 1 we show a schematic representation of the product development process first introduced by Bisgaard [6]. The steps shown there are the same ones found in most traditional texts on product development, but we emphasize in Figure 1 that the development process is best viewed as one that is cyclical and ongoing, not a linear procession with a distinct beginning and end. Most products evolve from similar predecessors, go through a sequence of improvement cycles, and ultimately spawn new products. These cycles within the product development process have much in common with the Plan-Do-Check-Act cycle of Deming [13] and Shewhart [20].
One of the most important features of the development process shown in Figure 1 is the acquisition of new knowledge at each stage. Experiments often play a key role in unlocking these secrets of nature. Even when the source of insight is a theoretical breakthrough or comes from observational data, experiments will typically be needed to verify the theory. In our own contacts with design engineers, we regularly see experiments used to test new concepts, compare designs, evaluate new materials, optimize performance, improve quality and reliability, and verify performance specifications. Efficient experimentation can be a crucial tool in the quest to bring high quality products to market ahead of the competition. Carefully planned factorial experiments can provide invaluable knowledge throughout the development cycle. See Bisgaard and Ellekjaer [7] for a broad conceptual account.

Figure 1 A conceptual view of the product development process as a cyclic learning process, analogous to Shewhart and Deming's Plan-Do-Check-Act cycle.

The prototype stage is especially well suited to experimental work. Typically prototypes are built fairly early in the development of a new product, when it is easiest to make design changes. Factorial experiments on prototypes can be an ideal method for comparing design alternatives and shaping the direction of future development. Once that direction is set and large amounts of time and money have been invested, it becomes
increasingly difficult to make any fundamental changes to the product design. Thus the biggest payoff from additional knowledge, and hence from good experiments, is at the early stages in the development cycle when prototypes are being built and studied.

3. PROTOTYPE EXPERIMENTS: SOME EXAMPLES
3.1. Airplane Wing
Initial prototype development often takes the form of CAD drawings rather than actual physical mock-ups. Software that simulates the proposed operating environment can then be used to study the performance of the design on the computer. The experiment in question here was carried out by a team of engineers at the "concept design" stage. The two main goals were to improve the performance of the wing, as measured by thrust per unit weight, and to minimize the cost per unit performance. Five different aspects of the wing were studied: the sizes of three physical dimensions, the number of strength supports on the wing, and the type of material used in construction. Two possible values were considered for each of these factors, and eight prototypes were then defined, in accord with a standard 2^(5-2) fractional factorial experiment. Each prototype was carefully drawn by the design team using CAD software. The weight and cost of each prototype wing were then calculated, and finite element analysis was used to compute the thrusts.

3.2. Engine Throttle Handle
Bisgaard [14] described an experiment to improve the performance of the throttle handle for an outboard motor. The goal of the experiment was to derive appropriate tolerances for seven physical dimensions by studying their effects on friction in the handle. The throttle handle is assembled from three parts: a knob, a handle, and a tube. Of the dimensions studied, three were related to the knob, three to the handle, and one to the tube. An interesting feature of this experiment is that separate experimental plans were set up for making prototypes of each of the three components (a 2^3 plan for the knobs, a 2^(3-1) plan for the handles, and a 2^1 plan for the tubes). All possible matchings of the prototype components were then assembled and tested for friction.

3.3. Engine Exhaust
Taguchi [10, p. 131] described an experiment to reduce the CO content of engine exhaust. Seven different characteristics of the engine design were studied using a saturated two-level design that specified eight prototype engines. Each engine was then run at three different driving modes, which constituted the test conditions for this study. Bisgaard and Steinberg [8] analyzed the results from this experiment and found that one of the factors had an interesting, and statistically significant, effect on the shape of the response curve, as shown in Figure 2. With this factor at its low level, the response curve was lower at the middle driving mode but higher at the high mode. The engineering significance of this effect depends on which driving modes will be encountered most often. The lower driving modes likely correspond to the sort of stop-and-start traffic common in large cities, and it might then be desirable to choose the factor at its low level to reduce the CO content at these modes.

3.4. Kitchen Mixer
Ott [15] described an experiment to improve a kitchen mixer. Each mixer was assembled from three components: a top unit, a bottom unit, and gears. An experiment was run to determine which of these three components was the cause of inefficient operation. Forty-eight mixers were used in the study, half of them efficient and half inefficient. Each mixer was disassembled, and


Figure 2 The estimated response curves for CO exhaust versus driving mode at the two levels of factor A for the engine exhaust experiment. The response curve with A at its high level (solid line) is lower than the curve with A at its low level (dashed line) across most of the driving modes but shows a sharp increase at high driving modes.

then 48 new mixers were assembled, swapping parts from the original mixers to form a 2^3 factorial design whose factors were the three components. The two levels for each factor were determined by the source of the component in an efficient (or inefficient) mixer. The experiment clearly pointed to the tops as the source of the problem.

3.5. Pyrotechnic Device
Milman et al. [16] reported on an experiment to improve the safety of a pyrotechnic device. It was known that the safety improvements could be achieved by using a new type of initiator, but there was concern that this change would adversely affect the performance of the device. An experiment was run to test 24 prototype devices, mating each of three safe initiators with four types of main charge and two types of secondary charge. The observed response for each prototype device was a trace of pressure against time.

3.6. Fluid Flow Controller
Bisgaard and Steinberg [8] described an experiment to study how prototype fluid flow control devices respond to changes in electrical input and flow rate. The controller was assembled from two components. Two experimental factors described dimensions of the first component, and a third factor described a dimension of the second component. As in the engine throttle experiment, the eight prototype controllers were formed by making four versions of the first component (following a 2^2 plan) and two versions of the second component and then mating all possible pairs of components. Each prototype was subjected to six test conditions formed by crossing three levels of the electrical input with two flow rates.

3.7. Hearing Aid
A remote control unit developed to permit easy control of a new, miniaturized hearing aid suffered from poor reception. A factorial experiment was carried out to test several conjectures as to the source of the problem, in particular that the difficult-to-control variation in the receptor coil was causing variations in the transmission frequency and that the type of cover used was affecting reception. The experiment showed that coil variation was the major problem and that it could be easily remedied by exploiting a large interaction between the coil and the transmission program (another factor in the experiment). The choice of cover was found to have no effect at all.

3.8. Bearing Manufacture
Although we have emphasized throughout this chapter the use of factorial experiments for prototype products, the same ideas can be applied to prototype process development. Hellstrand [17] described an experiment conducted at SKF, one of the world's largest manufacturers of ball bearings, to improve a production process. The goal of the experiment was to improve bearing life, and three factors were studied in a 2^3 plan: heat treatment, osculation, and cage design. The experiment uncovered a large interaction effect between heat treatment and osculation that led to a fivefold increase in bearing life.

4. ANALYSIS OF PROTOTYPE EXPERIMENTS
4.1. Standard Experimental Plans
Some prototype experiments are standard factorials or fractional factorials (e.g., the airplane wing and throttle handle experiments). No special methods are needed for the analysis of these experiments.

4.2. Two-Stage Analysis for Nested Test Conditions


Prototypes are typically much more expensive to make than to test, and it will then be advisable to apply a sequence of test conditions to each prototype. This scheme generates a split-plot structure in which the test conditions are nested within the prototype design. The analysis should correctly account for the nesting.
We suggest a simple, yet general, two-stage analysis method for experiments with nested test conditions:
1. Estimate the effects of the test factors for each prototype. We discuss below some useful ways to summarize these effects.
2. Use the effects found in stage 1 as "data" in a standard factorial analysis to study the effects of the design factors that guided the construction of the prototypes.
As an example, suppose there is a single test factor t and interest focuses on the performance curve that describes its relationship to an output y. For each prototype, fit a polynomial performance curve. The model equation for the ith prototype is

    y_{ij} = Σ_{l=0}^{L} β_{il} g_l(t_j) + ε_{ij},    j = 1, ..., s

where g_l(t) is a polynomial of degree l. We define the polynomials so that they are orthogonal with respect to the levels of the test factor. An advantage of this is that only the mean level effects involve interprototype ("whole plot") error. Any effects related to the slope or curvature or higher order properties of the performance curve will involve only intraprototype ("subplot") error. We also scale the orthogonal polynomials so that

    Σ_j g_l(t_j)^2 = 1,    l = 1, ..., L

where the sum runs over all the test settings. The scaling guarantees that all the coefficients (except the constant) will have the same variance, a property that is important at the second stage of the analysis.
The use of orthogonal polynomials with our scaling convention leads to simple coefficient estimates. If we denote by y_i' = (y_{i1}, ..., y_{is}) the observations on the ith prototype at each of the s test conditions, the least squares estimates of the coefficients are given by

    β̂_{i0} = (1/s) Σ_{j=1}^{s} y_{ij},    β̂_{il} = Σ_{j=1}^{s} g_l(t_j) y_{ij},  l = 1, ..., L

The constant term is the average of the s observations, and the polynomial coefficients are simple linear contrasts.
At the second stage of the analysis, each of the polynomial coefficients found above is treated as a response variable, and a separate analysis is carried out for the coefficients of each degree. The analysis of the constant terms reveals which factors affect the mean level of the performance curve, the analysis of the linear coefficients shows which factors affect slope, etc. Important effects that stand out from error can be identified with standard tools such as normal probability plots and analysis of variance (ANOVA). Note that the effects on the mean level include "whole plot" error, but effects on other aspects of the performance curve, including average coefficients, involve only "subplot" error. ANOVA can account for this situation by doing a split-plot analysis. For the graphical analysis, separate plots must be prepared for the two sets of effects. Our scaling convention from stage 1 implies that all the performance curve coefficients have the same variance.

We take similar care at the second stage to ensure that the effects have the same variance and can thus be combined on a single probability plot. We recommend computing the average value of each coefficient (for ease of interpretation) and then scaling all the design factor contrasts to have the same variance as the average. This property can be checked by setting up the regression matrix Z for the design factor effects with all elements in the first column equal to 1 and then verifying that Z'Z = nI, where I is the identity matrix. Each row of the matrix (Z'Z)^{-1}Z' = (1/n)Z' then gives one of the factor effects.
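The following sketch outlines the two-stage computation under simplifying assumptions: a single test factor with s settings, orthonormal polynomial contrasts obtained by Gram-Schmidt (QR), and a ±1-coded design matrix satisfying Z'Z = nI. The names `y`, `t`, and `design` are placeholders; this is not code from Bisgaard and Steinberg [8].

```python
# A minimal sketch of the two-stage analysis for nested test conditions.
import numpy as np

def scaled_contrasts(t, degree):
    # Gram-Schmidt (via QR) on the Vandermonde matrix gives orthogonal
    # polynomial contrasts whose squared entries sum to 1, matching the
    # scaling convention; signs are fixed so each contrast ends positive
    # (assumes the last test setting is not a zero of any contrast).
    V = np.vander(np.asarray(t, float), degree + 1, increasing=True)
    Q, _ = np.linalg.qr(V)
    return Q * np.sign(Q[-1])

def stage1(y, t, degree=2):
    # y: (n prototypes) x (s test settings). With orthonormal columns,
    # least squares reduces to simple linear contrasts of the observations.
    coef = y @ scaled_contrasts(t, degree)
    coef[:, 0] = y.mean(axis=1)     # report the constant term as the average
    return coef

def stage2(coef, design):
    # Each stage 1 coefficient becomes a "response" in a standard factorial
    # analysis; with Z'Z = nI, each row of (1/n)Z' is one factor effect.
    n = design.shape[0]
    Z = np.column_stack([np.ones(n), design])
    assert np.allclose(Z.T @ Z, n * np.eye(Z.shape[1]))
    return (Z.T @ coef) / n         # rows: average, then one effect per factor
```

For three-level or six-level factors, the ±1 columns of `design` would be replaced by linear and quadratic contrasts scaled as described in the text.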
Orthogonal polynomials are a convenient choice to describe a performance curve, but other sets of orthogonal functions could also be used. For some of the engine testing applications described in Section 1, we would naturally expect periodic behavior. In that case, trigonometric functions could be used to generate orthogonal contrasts in the test conditions.
Some experiments involve more than one test factor. Examples above are the fluid flow controller and the engine starting system studies. For these experiments, the natural approach is to estimate the effect of each test factor for each of the prototypes. Interactions among the test factors can also be included if the test array permits their estimation. The analysis will then reveal which product characteristics can be used to affect the dependence of the response on the various test factors. For example, in the fluid flow controller experiment, one important goal was to obtain accurate predictions of the relationship between the response and the test conditions so that controllers could be designed to meet any desired response pattern.
The two-stage analysis has an appealing simplicity. It can also be justified more formally using theory developed for growth curve models in our performance curve context. Bisgaard and Steinberg [8] showed that, for these models, the two-stage analysis actually computes generalized least squares estimates of the parameters (maximum likelihood estimates if the data are normally distributed). We refer interested readers to that article for details on the statistical model and its analysis.
Our analysis approach shares some common ground with that recommended by Taguchi [10] for robust design experiments, but there are some important differences that we would like to point out. The approach taken by Taguchi is to compute, for each prototype, a single summary measure across all the test conditions. This summary measure, which he calls a signal-to-noise ratio, is then taken as a response variable much as in our stage 2 analysis. The major difference between Taguchi's approach and ours is that we compute a complete, multicoefficient summary at our first stage, as opposed to Taguchi's use of a univariate summary. This difference may appear small but is in fact substantial. The single-number summary can throw away much valuable information that is captured by the complete summary. Steinberg and Bursztyn [18] and Bisgaard and Steinberg [8] showed that Taguchi's approach can miss important effects and identify spurious effects that are easily handled by the multicoefficient summary.

4.3. Analysis with Analog Traces
The observed response for each prototype may be a continuous analog trace against time, as in the pyrotechnic experiment. These curves can be analyzed by applying the methods of Section 4.2 to a digitized version of the response along a grid of time points.
An alternative strategy that is often useful is to take as response variables particular features of the observed performance curves that are of interest. In the experiment on the pyrotechnic device, an important feature was the delay time (i.e., the time from activation until the pressure first begins to increase). Feature analysis has the advantage of focusing attention on the most salient aspects of the performance curves. Most features will involve both whole plot and subplot error components and will have differing variances. So it will not in general be possible to combine estimated effects for different features (as we do above for the performance curve effects).
Feature analysis can also be applied when physical considerations suggest a nonlinear model that, modulo some unknown parameters, describes the response curve. The estimated parameters can then be taken as the first-stage summaries of the performance curves for the prototypes. Box and Hunter [19] applied this approach for nonlinear models.

5. EXAMPLE: THE ENGINE STARTING SYSTEM EXPERIMENT

In this section we show how our two-stage analysis method can be applied to an experiment on engine starting systems that was described by Grove and Davis [12, p. 329]. For additional examples, we refer the interested reader to Bisgaard and Steinberg [8].
The goal of the engine starting system experiment was to reduce the sensitivity of the system to variations in ambient temperature. The performance of the system was evaluated via the relationship between the air-to-fuel (AF) ratio at the tip of the spark plug and the fuel mass pulse, which is controlled by the electronic engine management system. This measure was adopted because the automotive engineers knew that it was a key indicator of ignition success. The experiment studied seven components of the starting system: injector type, distance from injector tip to valve head, injection timing, valve timing, spark plug reach, spark timing, and fuel rail pressure. Six different injector types were used; three levels were used for each of the remaining factors. The L18 orthogonal array was used to define the experimental plan for the prototype starting systems. Each of the 18 systems was then tested at six conditions, formed by crossing three fuel mass pulses (30, 45, and 60 msec) with two temperatures (-15°C and +15°C). Two tests were run at each condition, so there are 12 results for each prototype.
The full data set, additional details on the experiment, and a number of alternative analyses can be found in Grove and Davis [12]. We proceed here only with our approach.
Increasing the fuel mass pulse (FMP) injects more fuel into the engine, and initial plots of the data for each prototype show, as expected, a negative correlation between the AF ratio and the FMP. They also show that the AF ratio is typically higher at -15°C than at +15°C. A number of possible models might be considered linking the AF ratio to the FMP, and there is not clear evidence in the experiment to prefer one model over another. For some prototypes, the AF ratio is almost a linear function of the FMP; for others the inverse of the AF ratio is nearly linear, and for others the log of the ratio is most nearly linear. We elected to work with the relationship between the logarithm of the AF ratio and the logarithm of the FMP, which seemed to be most appropriate for the full set of prototypes, both for achieving linearity and for reducing the dependence of residual variation on the mean level of response. But we caution that other metrics could also be used and might lead to somewhat different conclusions.
The first stage of our analysis is to estimate for each prototype the effects of log FMP and temperature, including their interactions, on log AF ratio. The levels of FMP were equally spaced (30, 45, and 60 msec), and if we had kept FMP on its original scale we could have used standard polynomial contrasts to compute its linear and quadratic effects. For example, the linear effect would be proportional to the average of the results at 60 msec minus the average of the results at 30 msec. The logarithms of the FMP levels are 3.40, 3.81, and 4.09, and the resulting scaled contrasts are (-0.372, 0.040, 0.332) (linear) and (0.169, -0.406, 0.237) (quadratic). The main effect contrast for temperature is (-0.289, 0.289). The interaction contrasts are similar to the FMP contrasts, but multiplied by 1 or -1, according to whether the temperature is high or low, respectively. Each of the contrasts, when squared and summed over the 12 test points, gives a sum of 1, in accord with our scaling convention.
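As a quick numerical check of the scaling convention, the short computation below reproduces the contrast values quoted above; the ordering of the 12 test points (FMP level slowest, temperature alternating) is an arbitrary illustrative choice.

```python
# Reproduce the scaled contrasts for the engine starting system experiment.
import numpy as np

logfmp = np.log([30.0, 45.0, 60.0])           # 3.40, 3.81, 4.09
x = np.repeat(logfmp, 4)                      # each level: 2 temps x 2 replicates
Q, _ = np.linalg.qr(np.vander(x, 3, increasing=True))
lin = Q[:, 1] * np.sign(Q[-1, 1])             # fix sign: increasing in log FMP
quad = Q[:, 2] * np.sign(Q[-1, 2])
print(np.unique(lin.round(3)))                # [-0.372  0.04   0.332]
print(np.unique(quad.round(3)))               # [-0.406  0.169  0.237]
temp = np.tile([-1.0, 1.0], 6) / np.sqrt(12)  # temperature main effect contrast
print(temp[:2].round(3))                      # [-0.289  0.289]
# each contrast has unit sum of squares over the 12 test points
assert np.isclose((lin ** 2).sum(), 1) and np.isclose((temp ** 2).sum(), 1)
```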
The second stage of our analysis estimates the effects of the design factors on each of the first-stage coefficients. Since there are 18 prototypes, the "average" contrast in the effects computation has each element equal to 1/18. All the remaining factor effect contrasts are scaled to have the same sum of squares. The linear contrast for each three-level factor is (-0.068, 0, 0.068), and the quadratic contrast is (0.0393, -0.0786, 0.0393). Injector type, the six-level factor, is represented by five orthogonal contrasts. These contrasts are formed by taking the main effects and interactions of the two- and three-level columns that were used at the design stage to assign the levels of this factor.
Figure 3 shows a normal probability plot of the effects on mean level (i.e., on the constant terms from the within-prototype regressions). None of the contrasts sharply deviates from a straight line through the origin. Only the two lowest values hint at statistical significance. The strongest contrast is one that corresponds to injector type and indicates that types 4, 5, and 6 have lower average AF ratios than do types 1, 2, and 3. The other large contrast is for the linear effect of fuel rail pressure and indicates lower average AF ratios with higher pressure.

Figure 3 A normal probability plot of the factor effects on the mean level of response from our stage 2 analysis of the engine starting system experiment.

Figure 4 shows a normal probability plot for the effects related to the performance curve. The contrasts for the linear effect of log FMP and for the effect of temperature are clearly significant and dominate all the others. Figure 5 shows a normal probability plot without the two very large contrasts and helps to clarify which contrasts stand out from noise. The only contrasts that appear to be statistically significant are the three largest and the two smallest, all of which correspond to interaction effects with temperature. The factors that interact with temperature are the injector type (two significant contrasts), the distance from the injector tip to the valve head (both the linear and quadratic components), and the valve timing (the linear component). The next largest negative contrast is the interaction between temperature and the quadratic component of the valve timing, so it seems prudent to also take account of this effect in developing a model for the system.

Figure 4 A normal probability plot of the factor effects on the performance curve from our stage 2 analysis of the engine starting system experiment.

Figure 5 A normal probability plot of the factor effects on the performance curve from our stage 2 analysis of the engine starting system experiment, after deleting the two large effects due to the linear contrasts for fuel mass pulse and temperature.

We can now use the above information to compare different system configurations. First, we observe that the experiment has indeed borne out a clear linear relationship between log AFR and log FMP. The average relationship estimates log AFR by 3.83 - 1.13 g_1(log FMP), where g_1 is the scaled linear contrast from the first stage. For the three fuel mass pulses used in the experiment, the resulting estimates of log AFR are 4.25 (at 30 msec), 3.79 (at 45 msec), and 3.46 (at 60 msec). There is no detectable curvature in the log AFR-log FMP relationship, and the only possible dependence on the design factors is that the mean level of the line may decrease when injector 4, 5, or 6 is used and when fuel rail pressure is increased. The design factors have no effect on the slope of the line. Overall, we conclude that the relationship is quite consistent across the prototype conditions.
There is also a strong relationship between log AFR and temperature, but it is affected by interactions with three of the design factors. It is easiest to study and model those effects by computing the average stage 1 temperature effect at each level of the relevant factors; these averages are listed in Table 1. The average temperature effect was -1.173. Since the goal of the experiment was to reduce sensitivity to temperature variation, we seek levels of the three factors that make the temperature effect closer to 0. The best choice is to take an injector of type 6 and use the middle tip-to-head distance and the low level of valve timing (the middle level is almost equally good). If we assume that the design factors have additive effects on the temperature effect, the estimated increases in that effect from each of these choices are 0.296 (from injector type), 0.225 (from the tip-to-head distance), and 0.153 (from the valve timing). The estimated temperature effect is then -0.499, about 60% closer to 0 than its average value. Thus the experiment has identified factor settings that substantially reduce the sensitivity to temperature, resulting in less variation in product response and more uniform starting performance.
It is worth noting that if we place the mean level effects and the performance curve effects on the same probability plot (after appropriate scaling of the mean level effects), many of the mean level effects stand out from the line through the origin, contrary to our earlier conclusion that at most two contrasts are significant, and then just barely. This finding suggests that the within-prototype error, on which we base the statistical significance of the performance curve contrasts, is too small for judging the mean level contrasts. That, in turn, implies that a substantial amount of the variability in the data may be at the interprototype level. This information could be valuable for future efforts to make the performance curves still more uniform.

Table 1 Average Estimated Temperature Effect from the Stage 1 Analysis at Each Level of the Three Factors That Had Significant Interactions with Temperature in the Stage 2 Analysis

                               Level
Factor                    1        2        3        4        5        6
Injector type          -1.581   -1.058   -1.268   -1.202   -1.055   -0.877
Tip-to-head distance   -1.507   -0.948   -1.066
Valve timing           -1.020   -1.030   -1.469
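The additive prediction in the preceding paragraph can be verified directly from the entries of Table 1, as in this short check (assuming additivity, as in the text).

```python
# Arithmetic check of the additive temperature-effect prediction.
avg = -1.173                                   # overall average temperature effect
chosen = {"injector type 6": -0.877,
          "tip-to-head distance (middle)": -0.948,
          "valve timing (low)": -1.020}
gains = {name: round(v - avg, 3) for name, v in chosen.items()}
print(gains)                                   # 0.296, 0.225, 0.153
print(round(avg + sum(gains.values()), 3))     # -0.499, about 60% closer to 0
```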

6. CONCLUSIONS

Prototype testing is an important stage in the development of new products and production processes. Great gains are possible by exploiting factorial designs in prototype studies. Engineers can use these studies to compare design options, to increase the feedback from the prototypes, and to accelerate the design process.
Statistical methods for prototype experiments must take account of the fact that prototypes, being expensive to build but often cheap to test, may be run through a battery of test conditions, which themselves constitute a factorial design. Our two-stage analysis provides a simple scheme for modeling the ensuing performance curve and its dependence on the design factors. It correctly accounts for the split-plot error structure that arises when the test conditions are nested within the prototype design and permits quick identification of important effects from normal probability plots.

ACKNOWLEDGMENTS

The research of D. M. Steinberg was carried out in part while he was visiting the Center for Quality and Productivity Improvement, University of Wisconsin-Madison. He is grateful to the Center for providing excellent research facilities. The research of S. Bisgaard was carried out in part under grant number DMI 950014 from the U.S. National Science Foundation.

REFERENCES

1. GL Urban, JR Hauser. Design and Marketing of New Products. Englewood Cliffs, NJ: Prentice-Hall, 1993.
2. JW Wesner, JM Hiatt, DC Trimble. Winning with Quality: Applying Quality Principles in Product Development. Reading, MA: Addison-Wesley, 1994.
3. SC Wheelwright, KB Clark. Revolutionizing Product Development. New York: The Free Press, 1992.
4. WI Zangwill. Lightning Strategies for Innovation. New York: Lexington Books, 1993.
5. AL Page. Assessing new product development practices and performances: Establishing crucial norms. J Prod Innov Manag 10:273-290, 1993.
6. S Bisgaard. A conceptual framework for the use of quality concepts and statistical methods in product design. J Eng Design 3:33-47, 1992.
7. S Bisgaard, MR Ellekjaer. Designing quality into products during the design and development phase. Proc Eur Org Qual, Trondheim, Norway 2:285-296, 1997.
8. S Bisgaard, DM Steinberg. The design and analysis of 2^(k-p) x s prototype experiments. Technometrics 39:52-62, 1997.
9. KB Clark, T Fujimoto. Product Development Performance: Strategy, Organization and Management in the World Auto Industry. Boston: Harvard Business School Press, 1991.
10. G Taguchi. Introduction to Quality Engineering. White Plains, NY: Kraus International Publications, 1986.
11. A Miller, CFJ Wu. Parameter design for signal-response systems: A different look at Taguchi's dynamic parameter design. Stat Sci 11:122-136, 1996.
12. DM Grove, TP Davis. Engineering Quality and Experimental Design. Burnt Mill, Harlow, UK: Longman Scientific and Technical, 1992.
13. WE Deming. Out of the Crisis. Cambridge, MA: Massachusetts Institute of Technology, Center for Advanced Engineering Study, 1986.
14. S Bisgaard. Designing experiments for tolerancing assembled products. Technometrics 39:142-152, 1997.
15. ER Ott. A production experiment with mechanical assemblies. Ind Qual Control 9:124-130, 1953.
16. B Milman, I Sirota, DM Steinberg. Improving the safety of a pyrotechnic ignitor through a controlled experiment. Propel Explos Pyrotech 20:294-299, 1995.
17. C Hellstrand. The necessity of modern quality improvement and some experience with its implementation in the manufacture of rolling bearings. Phil Trans Roy Soc (Lond) A 327:529-537, 1989.
18. DM Steinberg, D Bursztyn. Dispersion effects in robust design experiments with noise factors. J Qual Tech 26:12-20, 1994.
19. GEP Box, WG Hunter. A useful method for model-building. Technometrics 4:301-318, 1962.
20. WA Shewhart. Statistical Method from the Viewpoint of Quality Control. Washington, DC: Graduate School, U.S. Department of Agriculture, 1939.
20
Optimal Approximate Designs for B-Spline Regression with Multiple Knots
Norbert Gaffke and Berthold Heiligers
Universität Magdeburg, Magdeburg, Germany

1. INTRODUCTION

Piecewise polynomial regression may serve as an alternative to nonlinear regression models in the case of a single real regressor variable, since polynomial splines possess excellent approximation properties. If the knots have been chosen appropriately, the spline model is linear in the parameters, and hence tools from linear model analysis and experimental design can be utilized. For an overview on the use of polynomial splines in regression modeling, the reader is referred to Ref. 1.
Let [a, b] be a compact interval (a, b ∈ R, a < b) with associated partition by given knots,

    a = κ_0 < κ_1 < ... < κ_ℓ = b                                      (1)

where ℓ ≥ 1. A polynomial spline (with respect to the knots κ_0, ..., κ_ℓ) of degree at most d ≥ 1 is a function on [a, b] that coincides on each subinterval [κ_i, κ_{i+1}] with some polynomial of degree at most d, 0 ≤ i ≤ ℓ - 1, and that satisfies some smoothness conditions at the interior knots κ_1, ..., κ_{ℓ-1}, stated next. Let s_1, ..., s_{ℓ-1} be given integers with 0 ≤ s_i ≤ d - 1 for all i = 1, ..., ℓ - 1, where s_i denotes the desired degree of smoothness at knot κ_i of the spline functions considered. We abbreviate κ = (κ_0, ..., κ_ℓ) for the vector of knots and s = (s_1, ..., s_{ℓ-1}) for the vector of smoothness degrees. Let S_d(κ, s) be the set of all polynomial splines of degree at most d with respect to the knots κ being s_i times continuously differentiable at κ_i for all i = 1, ..., ℓ - 1. Note that s_i = 0 means simply continuity at κ_i, and ℓ = 1 describes ordinary dth degree polynomial regression. Obviously, S_d(κ, s) is a linear space, and its dimension is known to be (cf. Ref. 2, Theorem 5)

    k = d + 1 + Σ_{i=1}^{ℓ-1} (d - s_i)
To define the particular B-spline basis B_1, ..., B_k of S_d(κ, s) to be employed, we assign multiplicity d - s_i to each interior knot κ_i, i = 1, ..., ℓ - 1, and multiplicity d + 1 to both boundary knots. Consider the extended knot vector t having the knots κ_0, ..., κ_ℓ as components, where each knot is repeated according to its multiplicity, i.e.,

    t = (t_1, ..., t_{k+d+1}),    t_1 ≤ t_2 ≤ ... ≤ t_{k+d+1}          (2)

Now a family B_{i,q}, i = 1, ..., k + d - q; q = 0, 1, ..., d, of functions on [a, b] is recursively defined as follows (the usual Cox-de Boor recursion):

    B_{i,0}(x) = 1 if t_i ≤ x < t_{i+1}, and B_{i,0}(x) = 0 otherwise   (3a)
    B_{i,q}(x) = ((x - t_i)/(t_{i+q} - t_i)) B_{i,q-1}(x)
               + ((t_{i+q+1} - x)/(t_{i+q+1} - t_{i+1})) B_{i+1,q-1}(x) (3b)

where a fraction with zero denominator is taken to be zero. Then the B-spline basis B_1, ..., B_k of S_d(κ, s) is given by

    B_i = B_{i,d},    i = 1, ..., k                                     (4)

(cf. Ref. 2, Theorems 10 and 11).


It is not difficult to see that the basis enjoys the properties

    0 ≤ B_i(x) ≤ 1    for all i = 1, ..., k and all x ∈ [a, b]          (5a)
    B_1(x) = 1    if and only if x = a                                  (5b)
    B_k(x) = 1    if and only if x = b                                  (5c)
    Σ_{i=1}^{k} B_i(x) = 1    for all x ∈ [a, b]                        (5d)
    {x ∈ [a, b] : B_i(x) > 0} = [a, t_{d+2})        if i = 1
                               = (t_i, t_{i+d+1})    if i = 2, ..., k - 1
                               = (t_k, b]            if i = k           (5e)

We note that the small support property (5e) is a particular feature of the basic splines B_i. Figure 1 shows the B-splines for a special case.
Figure 1 B-splines for d = 3, ℓ = 3, κ = (0, 0.5, 0.7, 1), and s = (2, 1).

A further favorable property of the B-spline basis, Eq. (4), is its equivariance under affine-linear transformation of the knot vector κ. That is, if the interval [a, b] (and its knots κ_i, i = 0, ..., ℓ) are transformed to another interval [ã, b̃] with knots κ̃_i = L(κ_i), i = 0, ..., ℓ, by the affine-linear transformation L, then the B-spline basis B̃_1, ..., B̃_k of S_d(κ̃, s) defined correspondingly by Eqs. (3) and (4) is

    B̃_i(x) = B_i(L^{-1}(x))    for all i = 1, ..., k and all x ∈ [ã, b̃]    (6)

Hence Eq. (6) allows us to standardize the interval [a, b], e.g., to [0, 1].
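A minimal sketch, assuming the standard Cox-de Boor form of recursion (3), evaluates the basis for the Figure 1 setup (d = 3, κ = (0, 0.5, 0.7, 1), s = (2, 1), hence k = 7) and checks the partition of unity (5d). The function and variable names are ours, not from Refs. 2-4; indices are 0-based in the code.

```python
# Evaluate the B-spline basis via the Cox-de Boor recursion.
import numpy as np

def extended_knots(kappa, s, d):
    # Boundary knots get multiplicity d + 1; interior knot kappa_i gets
    # multiplicity d - s_i, as in Eq. (2).
    t = [kappa[0]] * (d + 1)
    for ki, si in zip(kappa[1:-1], s):
        t += [ki] * (d - si)
    t += [kappa[-1]] * (d + 1)
    return np.array(t)

def bspline(i, q, t, x):
    b = t[-1]
    if q == 0:
        # Indicator of [t_i, t_{i+1}); closed at the right boundary so that
        # the basis also sums to 1 at x = b.
        return float(t[i] <= x < t[i + 1] or (x == b and t[i] < t[i + 1] == b))
    v = 0.0
    if t[i + q] > t[i]:
        v += (x - t[i]) / (t[i + q] - t[i]) * bspline(i, q - 1, t, x)
    if t[i + q + 1] > t[i + 1]:
        v += (t[i + q + 1] - x) / (t[i + q + 1] - t[i + 1]) * bspline(i + 1, q - 1, t, x)
    return v

d, kappa, s = 3, [0.0, 0.5, 0.7, 1.0], [2, 1]
t = extended_knots(kappa, s, d)
k = len(t) - d - 1                                # k = 7 here
for x in np.linspace(0.0, 1.0, 6):
    B = [bspline(i, d, t, x) for i in range(k)]
    assert abs(sum(B) - 1.0) < 1e-12              # property (5d)
```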
The spline regression model states that a regression function y is a member of the space S_d(κ, s), i.e.,

    y(x) = Σ_{i=1}^{k} θ_i B_i(x)

for some coefficient vector θ = (θ_1, ..., θ_k)', which has to be estimated from the data (the prime denotes transposition). Under the standard statistical assumptions that the observations of the regression function at any x values are uncorrelated and have equal (but possibly unknown) variance σ², the ordinary least squares estimator of θ will be used. So for designing the experiment, i.e., for choosing the x values at which the observations of y(x) are to be taken, the concepts of optimal linear regression design apply. For mathematical and computational tractability we restrict ourselves to the approximate theory. An approximate design ξ consists of a finite set of distinct support points x_1, ..., x_r ∈ [a, b] (where the support size r ≥ 1 may depend on ξ) and corresponding weights ξ(x_1), ..., ξ(x_r) > 0 with Σ_{i=1}^{r} ξ(x_i) = 1. The design ξ calls for ξ(x_i) × 100% of all observations of the regression function at x_i for all i = 1, ..., r. The moment matrix (or information matrix) of ξ is given by

    M(ξ) = Σ_{i=1}^{r} ξ(x_i) B(x_i)B(x_i)'                              (7)

where B(x) = (B_1(x), ..., B_k(x))'. Note that, by Eq. (5e), for all x ∈ [a, b] the matrix B(x)B(x)' has a principal block of size (d + 1) × (d + 1) outside which all the entries of B(x)B(x)' vanish. Hence the moment matrix M(ξ) of the design ξ is a band matrix with d diagonals above and below the main diagonal, i.e., the (i, j)th entries of M(ξ) are zero whenever |i - j| > d.
Under a design ξ, all coefficients θ_i, i = 1, ..., k, are estimable if and only if the moment matrix of ξ is nonsingular, or equivalently if and only if it is positive definite. Among those designs ξ [with M(ξ) being positive definite], an optimal design is one that minimizes Φ(M(ξ)), where Φ is a given (real-valued) optimality criterion defined on the set PD(k) of all

positive definite k × k matrices. We are concerned here with Kiefer's Φ_p criteria (-∞ ≤ p ≤ 1), including the most popular D, A, and E criteria through p = 0, -1, -∞, respectively. These are defined by

    Φ_p(M) = [(1/k) Σ_{i=1}^{k} λ_i(M)^p]^{-1/p}    for p ≠ 0, -∞
    Φ_0(M) = [det(M)]^{-1/k},    Φ_{-∞}(M) = 1/λ_1(M)

where λ_1(M) ≤ λ_2(M) ≤ ... ≤ λ_k(M) denote the eigenvalues of M ∈ PD(k), arranged in ascending order. We note that Φ_p(M) is continuous as a function of p. In particular, Φ_{-∞}(M) = lim_{p→-∞} Φ_p(M) for all M ∈ PD(k), and hence the nonsmooth E criterion can be approximated by a smooth Φ_p (with, e.g., p = -50).
In Section 2 we describe the algorithm and discuss the numerical results. Some results on the support of optimal designs for special cases are proved in Section 3, thus providing a first step toward a theoretical explanation of the numerical results.
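Continuing the sketch above (and reusing its `bspline`, `t`, `d`, and `k`), the moment matrix of Eq. (7) and the Φ_p criteria can be computed as follows. The support points are only a rough stand-in for the D-optimal design reported in Table 1 below, and the rescaled eigenvalue formula is our own rearrangement to avoid overflow for very negative p.

```python
# Moment matrix M(xi) from Eq. (7) and the Phi_p criteria.
import numpy as np

def moment_matrix(support, weights):
    M = np.zeros((k, k))
    for x, w in zip(support, weights):
        B = np.array([bspline(i, d, t, x) for i in range(k)])
        M += w * np.outer(B, B)            # band matrix: zero when |i - j| > d
    return M

def phi_p(M, p):
    lam = np.linalg.eigvalsh(M)            # eigenvalues, ascending
    if p == 0:
        return float(np.prod(lam)) ** (-1.0 / len(lam))    # D criterion
    r = lam / lam[0]                       # r >= 1, so r**p <= 1 for p < 0
    return float(np.mean(r ** p)) ** (-1.0 / p) / lam[0]

support = [0.0, 0.163, 0.430, 0.630, 0.759, 0.909, 1.0]
M = moment_matrix(support, [1.0 / 7] * 7)
print(phi_p(M, 0), phi_p(M, -1), phi_p(M, -50))   # D, A, and near-E values
```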

2. COMPUTING NUMERICALLY OPTIMAL DESIGNS

The basic algorithm we used is that of Gaffke and Heiligers [3], with necessary adaptations to the present situation of polynomial spline regression, as described in detail in Ref. 4. So we only briefly outline the method.
A sequence of moment matrices M_n, n = 1, 2, ..., is computed, corresponding to some approximate designs ξ_n, n = 1, 2, .... The current design ξ_n, however, is not computed (except for the final iteration, when the algorithm terminates). Thus an increasing set of support points calling for some clustering or elimination rules is avoided. For twice continuously differentiable optimality criteria Φ having compact level sets (as, e.g., the Φ_p criteria with -∞ < p < 1), the generated sequence of moment matrices M_n has been shown to converge to an optimal solution to

    Minimize Φ(M)                                                        (8a)
    Subject to M ∈ Conv{B(x)B(x)' : x ∈ [a, b]} ∩ PD(k)                  (8b)

where Φ is the optimality criterion under consideration and Conv S denotes the convex hull of a set S of matrices (cf. Ref. 3, Theorem 2.2). Note that, by Eq. (7), restriction (8b) just expresses that a feasible matrix M is nonsingular and is the moment matrix of some approximate design.

So the algorithm solves problem (8) numerically. Additionally, for the final iterate M*, say, a decomposition is computed (see below),

    M* = Σ_{i=1}^{r} w_i* B(x_i*)B(x_i*)'                                (9)

with r ∈ N, a ≤ x_1* < ... < x_r* ≤ b, and w_1*, ..., w_r* > 0 such that Σ_{i=1}^{r} w_i* = 1. A numerically optimal design is then given by ξ* having support points x_i* and weights ξ*(x_i*) = w_i*, i = 1, ..., r.
Any starting point M_1 is chosen from the feasible set (8b), e.g., M_1 = M(ξ_1) with an initial design ξ_1 whose support contains k distinct points x_1 < ... < x_k such that B_i(x_i) > 0 for all i = 1, ..., k (see Lemma 1 in Section 3). Given n ∈ N and the current (feasible) iterate M_n, a feasible search direction M̄_n is computed as the optimal solution of a quadratic convex problem

    Minimize g_n'(m̄ - m_n) + (1/2)(m̄ - m_n)'H_n(m̄ - m_n)               (10a)
    Subject to m̄ ∈ Conv{m_n, m(x_1), ..., m(x_r)}                        (10b)

Here we have denoted by lowercase letters m_n, m̄, and m(x_i) the moment vectors obtained from M_n, M̄, and M(x_i) = B(x_i)B(x_i)', respectively, by a usual vector operation turning matrices into column vectors. Owing to the symmetry and the band structure of the moment matrices, it suffices to apply the vector operation to the main diagonal and the d diagonals above the main diagonal. So the vector operator considered here selects that part of a symmetric matrix A and arranges the entries in some fixed order, resulting in a vector vec(A) ∈ R^K, where K = (d + 1)(k - d/2). In Eqs. (10) we have

    g_n = V vec(G_n)

where x_1, ..., x_r are certain points from [a, b] to be described next (note that these points, including their total number r, depend on n, but this dependence is dropped here to simplify the notation), and G_n denotes the gradient of Φ at M_n in the space of symmetric k × k matrices endowed with the scalar product (A, B) = tr(AB). The matrix V occurring when vectorizing the gradient is a fixed K × K diagonal matrix with diagonal entries equal to 1 or 2, such that those components of vec(G_n) coming from the diagonal of G_n receive weight 1 while the off-diagonal elements are weighted by 2. This is to ensure that g_n is the gradient at m_n of the function

    φ(m) = Φ(vec^{-1}(m))                                                (11)

where vec^{-1} is the inverse operation of vec, converting a K-dimensional vector into a band matrix [m being restricted in Eq. (11) to the set of all vectors obtained by vectorizing positive definite band matrices]. Note that although M_n is a band matrix, this is not true in general for the gradient G_n; e.g., for the Φ_p criteria with -∞ < p < 1, G_n is a negative multiple of M_n^{p-1} (of M_n^{-1} for p = 0), which is in general a full matrix.
The points x_i, i = 1, ..., r, in (10b) are most crucial for obtaining a good search direction by solving the quadratic problem. Their choice is guided by the equivalence theorem, i.e., the first-order optimality conditions for problem (8). A feasible moment matrix M* is an optimal solution if and only if

    B(x)'(-G*)B(x) ≤ tr(-G*M*)    for all x ∈ [a, b]                     (12)

where G* is the gradient of Φ at M*. Moreover, if M* is an optimal solution, then for any representation of M* as

    M* = Σ_{i=1}^{r} w_i* B(x_i*)B(x_i*)'

with some points x_i* ∈ [a, b] and positive weights w_i* summing to 1, one has

    B(x_i*)'(-G*)B(x_i*) = tr(-G*M*)    for all i = 1, ..., r

From this it appears reasonable to choose in (10b) the local maximum points x_1, ..., x_r of the function

    x → B(x)'(-G_n)B(x)                                                  (13)

(including, of course, its global maximum points). In fact, computing these is not too difficult, since B(x)'(-G_n)B(x) is a polynomial spline of degree at most 2d, i.e., a polynomial of degree at most 2d on each subinterval [κ_i, κ_{i+1}], i = 0, ..., ℓ - 1. Thus, standard routines for computing all zeros of polynomials can be used. Figure 2 shows an example of the function (13) for an early iterate and for the final one.
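A crude grid scan can stand in for exact root finding on the piecewise polynomial (13) when experimenting with the choice of the points x_1, ..., x_r. The sketch below reuses the objects defined in the previous sketches and takes the Φ_{-1} (A criterion) gradient, a negative multiple of M^{-2}; positive factors cancel on both sides of (12), so the check is unaffected by them.

```python
# Grid scan of x -> B(x)'(-G)B(x) and a check of condition (12).
import numpy as np

G = -np.linalg.matrix_power(np.linalg.inv(M), 2)    # Phi_{-1} gradient direction
xs = np.linspace(0.0, 1.0, 2001)
vals = np.array([B @ (-G) @ B
                 for B in (np.array([bspline(i, d, t, x) for i in range(k)])
                           for x in xs)])
bound = float(np.trace(-G @ M))
print(vals.max() <= bound + 1e-9)    # Eq. (12): True iff M is Phi_{-1}-optimal
# Local maxima of (13) on the grid: candidate support points for (10b).
loc = [xs[i] for i in range(1, len(xs) - 1)
       if vals[i] >= vals[i - 1] and vals[i] >= vals[i + 1]]
print(np.round(loc, 3))
```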
The matrix H_n in (10a) is a positive definite K × K matrix, which should be an approximation of the Hessian matrix of φ from (11) at m_n. A good job is done by the Broyden-Fletcher-Goldfarb-Shanno (BFGS) update

    H_{n+1} = H_n - (H_n δ_n δ_n' H_n)/(δ_n' H_n δ_n) + (γ_n γ_n')/(γ_n' δ_n)

where δ_n = m_{n+1} - m_n and γ_n = g_{n+1} - g_n, with any positive definite initial choice of H_1.


The quadratic minimization problem (10) can be solved by the Higgins-Polak method as described in Ref. 3. Let m̄_n be the solution obtained.

Figure 2 The function (13) for iterate n = 10 (dotted line) and for the final iterate n = 43 (solid line). Under consideration is the cubic spline model as in Figure 1, and the optimality criterion is the A criterion (p = -1).

We note that the Higgins-Polak method also provides weights w_0, w_1, ..., w_r ≥ 0 summing up to 1 and such that

    m̄_n = w_0 m_n + Σ_{i=1}^{r} w_i m(x_i)                               (14)

but this is used only in the final step (see below). Let M̄_n = vec^{-1}(m̄_n). Now, a search along the line segment

    {M_n + α(M̄_n - M_n) : 0 ≤ α ≤ ᾱ}

(with some fixed ᾱ < 1, usually close to 1) is performed to obtain the next iterate M_{n+1}.
To summarize, the method for solving (8) is a modified quasi-Newton method. The search direction is based on a local second-order approximation of the objective function Φ. The constraint set in (10b) over which the quadratic approximation is minimized may be viewed as a polyhedral neighborhood of the current vector iterate m_n. It may appear more natural to minimize that quadratic approximation over the set of all moment vectors

    m = vec(M),    M ∈ Conv{B(x)B(x)' : x ∈ [a, b]}

This, however, is practically impossible.


After termination of the algorithm with a final iterate M_n (for stopping criteria see Ref. 3), a corresponding numerically optimal design ξ* is computed by applying the Higgins-Polak method to the problem of minimizing the final quadratic approximation (10a) over the slightly smaller set

    Conv{m(x_1), ..., m(x_r)}                                            (15)

that is, the final vector iterate m_n is removed from the generator set in (10b). This has proved to be favorable, since otherwise a positive w_0 may occur in (14) that could prevent the identification of a corresponding design. We thus obtain an optimal solution m*, say, to that quadratic problem, a non-empty subset I of indices from {1, ..., r}, and positive weights w_i*, i ∈ I, summing up to 1 and such that

    m* = Σ_{i∈I} w_i* m(x_i)

In all our numerical experiments we observed that m* is very close to the final vector iterate m_n and shares numerically the same value of φ. Hence, a numerically optimal design is given by ξ* supported by x_i, i ∈ I, and with weights ξ*(x_i) = w_i*.
The algorithm shows good convergence behavior, in particular a good local convergence rate, as is usually observed for a quasi-Newton method. For instance, the D-optimal designs for spline degree d = 2 and one single interior knot (i.e., ℓ = 2, s_1 = 1) derived theoretically in Ref. 5, page 43, and in Ref. 6, Theorem 2, are found very accurately by the algorithm. For degrees d = 3, 4, 5 and one single interior knot, D-optimal designs within the class of designs with minimum support size k were found numerically by Lim [6]. The present algorithm computed precisely these designs as the numerically D-optimal ones in the class of all designs (up to two printing errors in the tables on page 176 of Ref. 6).

Table 1 Numerically Optimal Designs in the Spline Model (Fig. 1)

            D                   A                   E
  Support    Weight     Support    Weight     Support    Weight
  0.00000   0.14286     0.00000   0.08848     0.00000   0.07361
  0.00000   0.14286     0.00000   0.09128     0.00000   0.07361
  0.16329   0.14286     0.15315   0.16875     0.14473   0.17424
  0.14473   0.14286     0.14473   0.16962     0.14473   0.17424
  0.43037   0.14286     0.43415   0.18454     0.43418   0.20559
  0.43418   0.14286     0.43418   0.18134     0.43418   0.20559
  0.62989   0.14286     0.63363   0.14444     0.63316   0.17364
  0.63316   0.14286     0.63316   0.14328     0.63316   0.17364
  0.75929   0.14286     0.75807   0.15269     0.75720   0.14992
  0.75720   0.14286     0.75720   0.14820     0.75720   0.14992
  0.90894   0.14286     0.91179   0.17049     0.91907   0.15293
  0.91907   0.14286     0.91907   0.17121     0.91907   0.15293
  1.00000   0.14286     1.00000   0.09062     1.00000   0.07008
  1.00000   0.14286     1.00000   0.09506     1.00000   0.07008

Note: Under consideration are the D, A, and E criteria. The second row of each pair gives the optimal designs supported by the Chebyshev points.

Table 1 shows a few numerical results for the D and A criteria and the approximate E criterion in the cubic spline model as in Figures 1 and 2. The designs given in the second row of each pair in Table 1 are the D-, A-, and E-optimal designs within the subclass of those designs concentrated on the Chebyshev points, i.e., supported by the k extremal points of the equioscillating spline in S_d(κ, s) (cf. Ref. 7, Section 2). For the D and A criteria these are computed by a simplified variant of the above algorithm, fixing x_1, ..., x_r (r = k) to those Chebyshev points, while the E-optimal design is from Ref. 7, Theorem 4. By that theorem the E-optimal design (among all designs) is supported by the Chebyshev points. We see from Table 1 that the design computed under the approximate E criterion numerically coincides with the E-optimal design. For the D and A criteria the Chebyshev-restricted designs do not differ much from the numerically optimal designs. The D efficiency of the former with respect to the latter is 0.99335, and the A efficiency is 0.99476. Similar results hold true for other spline setups.
In all the cases we considered, the numerically optimal design has minimum support size, and the boundary points a and b are support points. For D optimality, the minimum support size property has been conjectured in Ref. 5, page 45, Conjecture 1. In our final section we present some first results toward a theoretical foundation of the observed phenomena.

3. SOME RESULTS ON OPTIMAL SPLINE REGRESSION DESIGNS

The B-spline basis B_1, ..., B_k of S_d(κ, s) defined by (4) enjoys the fundamental property of total positivity; i.e., for any points x_1, ..., x_k such that a ≤ x_1 < ... < x_k ≤ b the collocation matrix

is totally positive. Recall that a k × k matrix A = (a_{i,j})_{i,j=1,...,k} is said to be totally positive if and only if all its minors are nonnegative, i.e., if and only if for any p ∈ {1, ..., k} and all p row and column indices 1 ≤ i_1 < ... < i_p ≤ k and 1 ≤ j_1 < ... < j_p ≤ k one has

det (a_{i_u, j_v})_{u,v=1,...,p} ≥ 0

Moreover, by Ref. 2, Theorem 12, the collocation matrix (16) is nonsingular if and only if its diagonal elements are positive. From this we obtain

Lemma 1
For any design ξ, the moment matrix M(ξ) of ξ from Eq. (7) is nonsingular (and hence positive definite) if and only if there are support points z_1 < ... < z_k of ξ such that B_i(z_i) > 0 for all i = 1, ..., k.
Proof. Arrange the support points of ξ in increasing order, a ≤ x_1 < ... < x_r ≤ b, say. We may write

M(ξ) = N(ξ) W(ξ) N(ξ)′    (17)

where

N(ξ) = (B_i(x_j))_{i=1,...,k; j=1,...,r}  and  W(ξ) = diag(ξ(x_1), ..., ξ(x_r))

Obviously, M(ξ) is nonsingular if and only if the rows of N(ξ) are linearly independent, or equivalently, if and only if there exist k column indices 1 ≤ j_1 < ... < j_k ≤ r such that the submatrix

(B_i(x_{j_u}))_{i,u=1,...,k}

is nonsingular. As noted above, this is equivalent to B_i(z_i) > 0 for all i = 1, ..., k with z_i = x_{j_i}, and the lemma is proved. □
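As a small numerical illustration of the representation (17) and of Lemma 1, the following Python sketch (our own addition; the cubic degree, the single interior knot at 0.4 with smoothness 2, and the uniform candidate design are illustrative assumptions, not values from the chapter) builds the collocation matrix N(ξ) and the moment matrix M(ξ) and checks nonsingularity.

import numpy as np
from scipy.interpolate import BSpline

d = 3                                   # spline degree
# boundary knots with multiplicity d+1; interior knot 0.4 with
# multiplicity d - s_1 = 1 (smoothness s_1 = 2)
t = np.array([0., 0., 0., 0., 0.4, 1., 1., 1., 1.])
k = len(t) - d - 1                      # dimension of S_d(kappa, s); here k = 5

x = np.linspace(0.0, 1.0, k)            # support points of a candidate design xi
w = np.full(k, 1.0 / k)                 # design weights xi(x_i)

# N(xi)[i, j] = B_i(x_j); each row evaluates one B-spline basis function
N = np.array([BSpline(t, np.eye(k)[i], d)(x) for i in range(k)])
M = N @ np.diag(w) @ N.T                # moment matrix M(xi) of Eq. (17)

# Lemma 1: M(xi) is positive definite iff B_i(z_i) > 0 for increasing
# support points z_1 < ... < z_k; here the diagonal of N plays that role
print("diag N(xi):", np.diag(N))
print("det M(xi) =", np.linalg.det(M))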
A design ξ is said to be admissible for S_d(κ, s) if and only if there is no design η such that M(ξ) ≤ M(η) and M(ξ) ≠ M(η). That is, the admissible designs are precisely those whose moment matrices are maximal with respect to the Loewner partial ordering in the set of all moment matrices of designs. The Loewner partial ordering in the set of all symmetrical k × k matrices is defined by

A ≤ B if and only if B − A is positive semidefinite

Note that admissibility of a design does not depend on the particular choice of the basis of the spline space S_d(κ, s). For, if we choose another basis f = (f_1, ..., f_k)′ (e.g., the truncated power basis as in Ref. 8), then this is related to our B-spline basis B = (B_1, ..., B_k)′ by a linear transformation, i.e., f = TB, for some nonsingular k × k matrix T. Hence the resulting moment matrices of designs under basis f,
matrices of designs under basis f ,

(x_1, ..., x_r being the support points of ξ) are related to the moment matrices M(ξ) under the B-spline basis by

M_f(ξ) = T M(ξ) T′  for all designs ξ    (18)

Obviously, for the Loewner partial ordering we have

A ≤ B  if and only if  TAT′ ≤ TBT′

for any symmetrical k × k matrices A and B.


Any reasonable optimality criterion Φ is decreasing with respect to the Loewner partial ordering, i.e.,

If A, B are positive definite and A ≤ B, then Φ(A) ≥ Φ(B).    (19)

Many optimality criteria Φ are strictly decreasing; i.e., if additionally A ≠ B in (19), then Φ(A) > Φ(B). Examples are the Φ_p criteria for finite p we used in Section 2. If Φ is strictly decreasing, then obviously any Φ-optimal design is admissible.
The result of Ref. 8, Theorem 1.1, states that a design ξ is admissible for S_d(κ, s) if and only if

where supp(ξ) denotes the support of ξ and ⌊s⌋ is the largest integer ≤ s. For the case that s_i ∈ {0, 1} for all i = 1, ..., ℓ − 1, the observed minimum support size property of Φ_p-optimal designs (where p < ∞) is explained by the following result (cf. also Ref. 8, pp. 1558-1559).

Lemma 2
Let s_i ∈ {0, 1} for all i = 1, ..., ℓ − 1. If ξ is admissible for S_d(κ, s) and the moment matrix M(ξ) is nonsingular, then the support size of ξ is equal to k [the dimension of S_d(κ, s)], and the boundary knots κ_0, κ_ℓ and all the interior knots κ_i with smoothness s_i = 0 are in the support of ξ.
Proof. By (1), k = ℓd + 1 − a, where a denotes the number of interior knots κ_i with s_i = 1. Consider the β = ℓ + 1 − a knots

κ_{i_1} < ... < κ_{i_β}

which are the end knots of the interval and the interior knots with smoothness zero. By Eq. (20), for all u = 1, ..., β − 1,

where a_u denotes the number of knots of smoothness 1 in the interval (κ_{i_u}, κ_{i_{u+1}}). Hence,

Since M(ξ) is nonsingular, we have #supp(ξ) ≥ k, and thus κ_{i_1}, ..., κ_{i_β} ∈ supp(ξ) and #supp(ξ) = k. □
For polynomial spline regression with higher smoothness, a theoretical explanation of the minimum support size property of Φ_p-optimal designs is still outstanding. It has not even been proved that the support of a Φ_p-optimal design necessarily includes the boundary points of the interval [a, b]. However, for D optimality (p = 0) the latter can be proved (see Lemma 3 below; see also Ref. 6, Lemma 1).
For the rest of the chapter we will be concerned with D-optimal designs for polynomial spline regression. As is well known [and is obvious from Eq. (18)], D optimality of a design (within any class of designs) does not depend on the particular choice of the basis of the space S_d(κ, s); thus, we will use the notion of a D-optimal design for S_d(κ, s). The following result has been stated by Lim [6, Lemma 1]. However, the proof given in that paper is not convincing, in our view, and we give a different proof here.

Lemma 3
The D-optimal design for S_d(κ, s) (with arbitrary degree, knots, and associated multiplicities) has both boundary points κ_0 = a and κ_ℓ = b among its support points.
Proof. Let ξ be any design with nonsingular moment matrix M(ξ), and let x_1 < ... < x_r be the support points of ξ. Consider the representation (17) of M(ξ). In the following we denote by N_ξ(i_1,...,i_p; j_1,...,j_p) the submatrix of N(ξ) with respective row and column indices 1 ≤ i_1 < ... < i_p ≤ k and 1 ≤ j_1 < ... < j_p ≤ r (where 1 ≤ p ≤ k), i.e.,

By (17) and the Cauchy-Binet formula, we have

det M(ξ) = Σ_{1 ≤ j_1 < ... < j_k ≤ r} [det N_ξ(1,...,k; j_1,...,j_k)]² ξ(x_{j_1}) ··· ξ(x_{j_k})    (21)

Suppose that x_1 > a. We will prove that ξ cannot be D-optimal.
Let η be the design obtained from ξ by replacing the support point x_1 by a, i.e., η(a) = ξ(x_1) and η(x_i) = ξ(x_i) for all i = 2, ..., r. By (21) and its version for η, we obtain

det M(ξ) − det M(η)

From (5a)-(5e) we see that the first column of N_η(1,...,k; 1,j_2,...,j_k) is the first unit vector in R^k; thus

Since each collocation matrix N_ξ(i_1,...,i_p; j_1,...,j_p) is totally positive, we have by the Hadamard-type inequality (cf. Ref. 9, p. 191) and by (5a),

Moreover, by (5b), the last inequality in (24) is strict whenever the matrix N_ξ(2,...,k; j_2,...,j_k) is nonsingular. In fact, such indices 2 ≤ j_2 < ... < j_k ≤ r exist. For, by (21), since M(ξ) is nonsingular, there exist indices 1 ≤ j_1 < j_2 < ... < j_k ≤ r such that N_ξ(1,...,k; j_1,...,j_k) is nonsingular, and again by applying the Hadamard-type inequality to the latter totally positive matrix we obtain

Together with Eqs. (22)-(24), it follows that

det M(ξ) < det M(η)

and thus ξ cannot be D-optimal. The case x_r < b is treated analogously. □

Some results on D-optimal designs for S_d(κ, s) within the class of minimum support designs were derived in Ref. 10. Actually, in that paper different polynomial degrees d_i on each subinterval [κ_i, κ_{i+1}], i = 0, ..., ℓ − 1, were admitted, but we will not follow this extension here. For short, a design with support size k = dim S_d(κ, s) that is D-optimal within the subclass of all designs with support size k will be called a D-optimal minimum support design for S_d(κ, s). As is well known, a D-optimal minimum support design assigns equal weights 1/k to each of its support points. As the proof of Lemma 3 shows, the result of that lemma pertains also to a D-optimal minimum support design. Hence, by (17), a D-optimal minimum support design for S_d(κ, s) is determined by its support points

where x* = (x_1*, ..., x_k*)′ is an optimal solution to the problem

Maximize det N(x)    (25a)
Subject to x_1 = a < x_2 < ... < x_{k−1} < x_k = b    (25b)

For the case of merely continuous polynomial spline regression (that is, s_i = 0 for all i = 1, ..., ℓ − 1) it was claimed by Park [10, p. 152], by somewhat heuristic arguments, that the D-optimal minimum support design in S_d(κ, 0) is obtained by putting together the support points of the D-optimal designs for ordinary dth-degree polynomial regression on the subintervals [κ_i, κ_{i+1}], i = 0, ..., ℓ − 1. We give a proof thereof in Corollary 5. We start with a more general result.
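Problem (25a)-(25b) is also easy to attack numerically. The following Python sketch is our own illustration (the quadratic spline with one merely continuous interior knot at 0.5 and the use of a generic simplex optimizer are assumptions, not part of the chapter):

import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize

d, a, b = 2, 0.0, 1.0
# one interior knot at 0.5 with smoothness s_1 = 0, i.e., multiplicity d
t = np.r_[np.repeat(a, d + 1), np.repeat(0.5, d), np.repeat(b, d + 1)]
k = len(t) - d - 1                      # support size of a minimum support design

def N(x):                               # collocation matrix N(x)
    return np.array([BSpline(t, np.eye(k)[i], d)(x) for i in range(k)])

def neg_logdet(u):                      # objective over the interior points
    x = np.r_[a, np.sort(u), b]         # enforce x_1 = a, x_k = b, and ordering
    sign, logdet = np.linalg.slogdet(N(x))
    return np.inf if sign <= 0 else -logdet

u0 = np.linspace(a, b, k)[1:-1]         # equispaced interior starting points
res = minimize(neg_logdet, u0, method="Nelder-Mead")
print(np.r_[a, np.sort(res.x), b])      # support points; each has weight 1/k

For this setup, Corollary 5 below predicts the support {0, 0.25, 0.5, 0.75, 1}, the union of the D-optimal quadratic-regression points of the two subintervals.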

Lemma 4
Let i_0 ∈ {1, ..., ℓ − 1} be such that the interior knot κ_{i_0} has smoothness s_{i_0} = 0. Then the support of the D-optimal minimum support design for S_d(κ, s) is the union of the supports of the D-optimal minimum support designs for S_d(κ^(1), s^(1)) and for S_d(κ^(2), s^(2)), respectively, where

κ^(1) = (κ_0, κ_1, ..., κ_{i_0}),    s^(1) = (s_1, ..., s_{i_0 − 1})

κ^(2) = (κ_{i_0}, κ_{i_0 + 1}, ..., κ_ℓ),    s^(2) = (s_{i_0 + 1}, ..., s_{ℓ − 1})

(If i_0 = 1 or i_0 = ℓ − 1, the sets S_d(κ^(1), s^(1)) or S_d(κ^(2), s^(2)) are to be understood as the space of all dth-degree polynomials over [κ_0, κ_1] or [κ_{ℓ−1}, κ_ℓ], respectively.)
Proof. Consider the vector t = (t_1, ..., t_{k+d+1}) of multiple knots from Eq. (2). Let k_0 be the index for which

From (5e) we see that

B_1, ..., B_{k_0 − 1} vanish on [κ_{i_0}, b]

and

B_{k_0 + 1}, ..., B_k vanish on [a, κ_{i_0}]

Also, observing (5a) and (5d), we have

Let x = (x_1, ..., x_k) satisfy (25) and be such that the collocation matrix N(x) is nonsingular, i.e., B_i(x_i) > 0 for all i = 1, ..., k. By (27a)-(27c), x_{k_0 − 1} < κ_{i_0} and x_{k_0 + 1} > κ_{i_0}. Hence the Hadamard-type inequality for totally positive matrices entails, using notations for submatrices as in the proof of Lemma 3,

0 < det N(x)    (28a)

≤ det N_x(1,...,k_0−1; 1,...,k_0−1) B_{k_0}(x_{k_0}) det N_x(k_0+1,...,k; k_0+1,...,k)    (28b)

≤ det N_x(1,...,k_0−1; 1,...,k_0−1) det N_x(k_0+1,...,k; k_0+1,...,k)    (28c)

If x_{k_0} = κ_{i_0}, then the column vector (B_1(κ_{i_0}), ..., B_k(κ_{i_0}))′ is the k_0th unit vector in R^k, as follows from (27c), (5a), and (5d). Hence, if x_{k_0} = κ_{i_0}, then there is equality in (28b) and (28c), but otherwise there is strict inequality throughout. Consequently, an optimal solution x* to (25a) and (25b) must satisfy x*_{k_0} = κ_{i_0}. Now, for any x satisfying (25b) and x_{k_0} = κ_{i_0}, we may write

Hence

det N(x) = det N_1(x^(1)) det N_2(x^(2))    (29)

where we have denoted

Equation (29) ensures that an optimal solution x* of (25) must be such that x*^(1) is an optimal solution to the problem

and x*^(2) is an optimal solution to the problem

Now the assertion follows by observing that the matrices N_1(x^(1)) and N_2(x^(2)) are the collocation matrices to x^(1) and x^(2) under special bases of the spline spaces S_d(κ^(1), s^(1)) and S_d(κ^(2), s^(2)), respectively. For, note that by (26) and (1), k_0 is the dimension of the space S_d(κ^(1), s^(1)). The B-splines B_1, ..., B_{k_0} restricted to the interval [a, κ_{i_0}] are clearly members of S_d(κ^(1), s^(1)); they are linearly independent [since by (29) there is a nonsingular collocation matrix in these splines], and hence they are a basis of the space S_d(κ^(1), s^(1)). Similarly, it can be seen that the dimension of S_d(κ^(2), s^(2)) is equal to k − k_0 + 1, and the B-splines B_{k_0}, ..., B_k restricted to the interval [κ_{i_0}, b] form a basis of the space S_d(κ^(2), s^(2)). □
Repeated application of Lemma 4 for merely continuous polynomial spline regression yields

Corollary 5
The support of the D-optimal minimum support design for S_d(κ, 0) is the union of the supports of the D-optimal designs for ordinary dth-degree polynomial regressions over the subintervals [κ_i, κ_{i+1}], i = 0, ..., ℓ − 1.

REFERENCES

1. RL Eubank. (1984). Approximate regression models and splines. Commun Stat Theor Methods 13:433-484.
2. K Morken. (1996). Total positivity and splines. In: M Gasca, CA Micchelli, eds. Total Positivity and Its Applications. Dordrecht: Kluwer, pp 47-84.
3. N Gaffke, B Heiligers. (1996). Second order methods for solving extremum problems from optimal linear regression design. Optimization 36:41-57.
4. N Gaffke. (1998). Numerical computation of optimal approximate designs in polynomial spline regression. Journal of Combinatorics, Information and System Sciences, Special Volume "Design of Experiments and Related Combinatorics" 23:85-94.
5. VK Kaishev. (1989). Optimal experimental designs for the B-spline regression. Comput Stat Data Anal 8:39-47.
6. YB Lim. (1991). D-Optimal design in polynomial spline regression. Korean J Appl Stat 4:171-178.
7. B Heiligers. (1998). E-Optimal designs for polynomial spline regression. Journal of Statistical Planning and Inference 75:159-172.
8. WJ Studden, DJ VanArman. (1969). Admissible designs for polynomial spline regression. Ann Math Stat 40:1557-1569.
9. T Ando. (1987). Totally positive matrices. Linear Algebra Appl 90:165-219.
10. SH Park. (1978). Experimental designs for fitting segmented polynomial regression models. Technometrics 20:151-154.
On Dispersion Effects and Their
Identification
Bo Bergman
Linköping University, Linköping, and Chalmers University of
Technology, Gothenburg, Sweden

Anders Hynen
ABB Corporate Research, Västerås, Sweden

1. INTRODUCTION

Understanding variation is fundamental to quality improvement and customer satisfaction. That was realized early by Shewhart (1931) and later emphasized by, for example, Deming (1986, 1993). While Shewhart and Deming mainly concentrated on the reduction of variation by removing so-called assignable or special causes of variation, Taguchi (1986) suggested a systematic way to make products and processes insensitive to sources of variation (see also Taguchi and Wu (1980)). This strategy is usually called robust design methodology or robust design engineering; see, for instance, Kackar (1985), Phadke (1989), and Nair (1992). An important step is to identify factors, controllable by the designer or process developer, that affect the dispersion of a response variable y of interest.
Let x denote a vector of control factors, and let z be a vector of environmental variables that vary in a way usually not controllable by the designer, although some of its components might be controllable during the course of an experiment. A quite general way to describe the outcome y is

y = f(x) + g(z) + h(x, z) + ε    (1)

where f(x) is the expectation of y, and f(x) + g(z) + h(x, z) is the conditional expectation of y given z; here h(x, z) corresponds to the interaction between x and z. In robust design methodology we want to determine levels of the factors, i.e., components of x, such that the effect on y of the variation of ε and z is made as small as possible while f(x) is kept on target. Assume that it is possible to vary all components of z in an experiment. Then the interaction between x and z is important in order to identify a robust design; see, for example, Box et al. (1988) and Bergman and Holmqvist (1988). Very often, however, we cannot vary all components of z; we have to find factors (components of x) that affect the dispersion of y, i.e., variables having a dispersion effect. To clarify this approach we expand the variance of y by conditioning on the environmental factors z:

Var[y] = E[Var[y|z]] + Var[E[y|z]]    (2)

The two terms in the variance of y can be interpreted as follows. The first term on the right, E[Var[y|z]], portrays how the variance of y, given z, is affected by dispersion effects, i.e., factors affecting the spread of the data. The second term on the right, Var[E[y|z]], portrays how the variance of y is affected by parameters in the location model, including fixed effects of z such as design by environmental interaction effects. The approach is motivated by the incorporation of dispersion effects, since direct location modeling of both design factors and environmental factors is allowed; thus this standpoint reduces the risk of confounding location effects and dispersion effects. Theoretical justification for the approach is also provided by Shoemaker et al. (1991), Box and Jones (1992), and Myers et al. (1992).
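The decomposition in Eq. (2) is easy to check numerically. The following Python sketch is our own illustration, with an invented model of the form y = f(x) + g(z) + h(x, z) + ε and one control-factor setting held fixed:

import numpy as np

rng = np.random.default_rng(1)

def response(x, z, eps):
    # y = f(x) + g(z) + h(x, z) + eps, with h an x-by-z interaction
    return 2.0 + 0.5 * x + z + 0.8 * x * z + eps

x = 1.0                                  # one fixed control-factor setting
z = rng.normal(size=200_000)             # uncontrollable environmental variable
eps = rng.normal(scale=0.3, size=z.size)

total = np.var(response(x, z, eps))
within = 0.3 ** 2                        # E[Var[y|z]]: here just Var[eps]
between = np.var(response(x, z, 0.0))    # Var[E[y|z]], the location-model term
print(total, within + between)           # the two sides of Eq. (2) agree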
In this chapter we discuss how to identify control factors, i.e., product or process parameters, having dispersion effects; in particular, we discuss how dispersion effects can be identified using unreplicated experimental designs in the 2^(k-p) series of fractional factorial designs (see Bergman and Hynen, 1997). For some extensions to more general designs, see Blomkvist et al. (1997) and Hynen and Sandvik Wiklund (1996).

2. IMPROVING ROBUSTNESS THROUGH DISCOVERY OF DISPERSION EFFECTS

When it is possible to vary environmental (noise) factors in an experiment, robustness improvement is possible through location effect modeling if interaction effects are found. However, in this section improving robustness through minimization of the first variance term in (2) is considered. It was not until fairly recently that dispersion effect modeling became a central issue in parameter design; originally it was not emphasized even by Taguchi. Historically, there are many anecdotes associated with dispersion effect modeling, but many of these are merely anecdotes or aimed at making the estimation of location effects as efficient as possible. During the past decade this problem area experienced a rapid growth of interest, as shown by the number of applications and published papers. In general, there are two approaches to dispersion effect modeling: Either the experiment is replicated, or it is not. Major emphasis in this chapter is placed on the latter case; however, for the sake of completeness both approaches are considered.
In a replicated experiment, identification of dispersion effects is fairly straightforward. Depending on the error structure of the experiment, e.g., on whether or not the replicates are carried out fully randomized, the identified dispersion effects are effects measuring variability either between or within trials. Some may use the terms replicates and duplicates, or genuine and false replicates, respectively. If we compute sample variances, under each treatment combination, on which new effects can be computed, the analysis is rather uncomplicated. Taking the logarithm prior to computing the effects improves estimation (see Bartlett and Kendall, 1946). The new effects, which can be seen as dispersion effect estimates, can be plotted on normal probability paper to discriminate between large and small effects or analyzed with other techniques such as analysis of variance. For more background on this topic, see Nair and Pregibon (1988) and Bisgaard and Fuller (1995).
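As a sketch of this replicated-case computation (our own Python illustration with simulated data; the planted dispersion effect in factor A is an assumption of the example):

import numpy as np
from itertools import product

rng = np.random.default_rng(7)
design = np.array(list(product([-1, 1], repeat=3)))    # 2^3 design, factors A, B, C
sigma = np.exp(0.5 * design[:, 0])                     # true dispersion effect in A
y = rng.normal(0.0, sigma[:, None], size=(8, 5))       # five replicates per run

log_s2 = np.log(y.var(axis=1, ddof=1))                 # log sample variances
effects = design.T @ log_s2 / 4                        # contrast effects (n/2 = 4)
print(dict(zip("ABC", np.round(effects, 2))))          # A should dominate; in
                                                       # practice, plot these on
                                                       # normal probability paper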
If the problem of dispersion effect modeling is a fresh arrival, identification of dispersion effects from unreplicated experiments is of even more recent date. Rather pioneering, Box and Meyer (1986b) published a paper addressing dispersion effect identification from unreplicated two-level fractional factorial experiments in the 2^(k-p) series. Their contribution was not entirely unique; Daniel (1976), Glejser (1969), and many others had touched upon the subject earlier, but Box and Meyer were the first to propose dispersion effect identification from unreplicated experiments as an important aspect of parameter design. In a paper by Bergman and Hynen (1997), in which the problem area is surveyed and a new method is introduced, dispersion effects from unreplicated designs in the 2^(k-p) series can now be identified with well-known statistical significance testing techniques (see Section 3). It is still too early to judge the significance of the new method, but compared to existing methods the new proposal does not rely on distributional approximations or model discrimination procedures that are entirely ad hoc. There is, however, an assumption of normality that is rather critical (see Hynen, 1996). Moreover, the method proposed in Bergman and Hynen (1997) is generalized to experimental designs other than the two-level designs from the 2^(k-p) series by Blomkvist et al. (1997) and to the inner and outer array setup by Hynen and Sandvik Wiklund (1996). The use of normal probability plotting and transformations in combination with the method of Bergman and Hynen (1997) is considered by Blomkvist et al. (1997).
A different approach to the same problem was taken by Nelder and Lee (1991) and by Engel and Huele (1996). In both papers a generalized linear model approach is taken; see also McCullagh and Nelder (1989).
Overall, even if many problems remain to be solved, the contributions provided by the papers cited above constitute a technique that can be useful for many purposes. It can be used to identify general heteroscedasticity, to relate heteroscedasticity to certain factors studied in the experiment, or simply to provide an additional component to the design engineer's toolbox useful for identifying the most robust design solution. Also note that the techniques used for unreplicated designs may be used in conjunction with duplicated designs to identify different components. The method suggested by Bergman and Hynen (1997) is discussed in the following section.

3. DISPERSION EFFECTS IN TWO-LEVEL FRACTIONAL FACTORIAL DESIGNS

Let i denote one of the factors in an unreplicated two-level fractional factorial design. Define σ²_{i+} as the average variance of the observations when factor i is at its high level, and let σ²_{i-} be defined correspondingly. Factor i is said to have a dispersion effect if σ²_{i+} ≠ σ²_{i-}. Natural but naive indicators for σ²_{i+} and σ²_{i-} are the sample variances based on all observations when factor i is at its high and low level, respectively, i.e., s²(i+) and s²(i-). Box and Meyer (1986b) suggested the use of ratios F_i = s²(i+)/s²(i-) to identify dispersion effects. However, despite the notation, they noted that the F ratios did not belong to an F distribution owing to the presence of dispersion and location effect aliasing. The location effects had to be eliminated before estimating dispersion effects. Therefore, estimates were calculated from residuals obtained after eliminating suspected location effects. Later, some alternatives to Box and Meyer's approach were given. Nair and Pregibon (1988) extended the method to the case with replications. Furthermore, essential contributions are given by Wang (1989) and Wiklander (1994), who propose alternatives for the unreplicated case.

3.1. Location Effects
Let y be the (n × 1) response vector from a complete or fractional factorial experiment with an (n × n) design matrix X with column vectors x_0, ..., x_{n-1}. Column x_0 is a column of 1's, and the remaining columns represent contrasts for estimating the main and interaction effects. We assume that the observations y_1, ..., y_n are independent with variances V[y_u] = σ²_u, u = 1, ..., n. Possibly, σ²_u depends on the factors varied in the experiment. Note that, for example,

Let z = (1/n)X′y be the vector of estimated mean response, main, and interaction effects. As usual, we denote E[z] = β, whereupon z_0, ..., z_{n-1} are independent with equal variances

As noted by Box and Meyer (1993), the "vital few and trivial many" principle suggested by Juran (the Pareto principle) ensures that in most cases only a few β's are nonnegligible. Therefore, we can use the normal plotting technique suggested by Daniel (1976) to find these β's (see also Daniel, 1959). Of course, there may be problems due to confoundings when highly fractionated designs are used, but this issue is not discussed further here. See, for example, Box and Meyer (1986a, 1993), who give an interesting approach to these problems using Bayesian techniques.
Under the Pareto principle, only a few degrees of freedom are used to estimate nonnegligible β values. Therefore, the remainder of the contrasts can be used to estimate the variance σ², i.e.,

In order to identify dispersion effects, we shall use additional contrasts that are based on linear combinations of those column vectors in X associated with negligible location effects.

3.2. Dispersion Effects

Box and Meyer (1986b) created new column vectors based on columns from the original design matrix X:

x_{i|j+} = (x_j + x_{i·j})/2  and  x_{i|j-} = (x_j − x_{i·j})/2    (1)

where x_{i·j} is the column vector corresponding to the row-wise (Hadamard) product of x_i and x_j; i.e., if x_i and x_j correspond to main location effects, then x_{i·j} corresponds to the i × j interaction effect. Note, for example, that the uth element of x_{i|j+} is equal to x_{uj} if x_{ui} = +1 and zero otherwise.
Let us now introduce the set Γ_i of nonordered pairs of column vectors {x_j, x_{i·j}} from X, such that the pair {x_0, x_{i·0}} is excluded and neither of the corresponding contrasts z_j = x′_j y and z_{i·j} = x′_{i·j} y has been judged to estimate nonnegligible location effects, i.e., their expected values are judged to be zero:

E[x′_j y] = E[x′_{i·j} y] = 0    (2)

Note that there are (n − 2)/2 members of Γ_i if all location effects are judged to be negligible, i.e., if we have E[x′_j y] = 0 for all j. It is straightforward to show that the contrasts corresponding to (1), z_{i|j+} = x′_{i|j+} y and z_{i|j-} = x′_{i|j-} y, have variances

Var[z_{i|j+}] = (n/2)σ²_{i+}  and  Var[z_{i|j-}] = (n/2)σ²_{i-},  respectively    (3)

Now, let x_i be associated with a studied factor, i.e., let x′_i y estimate one of the main effects. If σ²_{i+} and σ²_{i-} are different, this factor has a dispersion effect. Therefore, the difference between z²_{i|j+} and z²_{i|j-} gives information about the magnitude of this dispersion effect. If we can find many indices j such that {x_j, x_{i·j}} belongs to Γ_i, then all the corresponding z²_{i|j+} and z²_{i|j-} can be used to estimate the difference between σ²_{i+} and σ²_{i-}. Moreover, since the column vectors x_{i|j+} are orthogonal, the contrasts z_{i|j+} are independent. Therefore, we can use an F test for testing H_{i0}: σ²_{i+} = σ²_{i-} against H_{i1}: σ²_{i+} ≠ σ²_{i-} with the test statistic

F_i = ( Σ_{j∈Γ_i} z²_{i|j+} ) / ( Σ_{j∈Γ_i} z²_{i|j-} )    (4)

Under H_{i0}, the distribution of F_i is an F distribution with (m, m) degrees of freedom, where m is the number of elements {x_j, x_{i·j}} in Γ_i. Note that the test is double-sided; i.e., both large and small values of F_i supply evidence against the null hypothesis.
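A compact computational sketch of the statistic (4) follows (our own Python illustration, not code from Bergman and Hynen (1997)). It assumes a full two-level factorial in which every location effect has been judged negligible, so that all admissible pairs enter Γ_i; the column bookkeeping is kept deliberately simple.

import numpy as np
from itertools import combinations, product

def dispersion_f(X, y, i):
    # F_i of Eq. (4) for the factor in column i of the n x n matrix X,
    # whose column 0 is the all-ones column x_0; assumes every location
    # effect has been judged negligible, so all pairs enter Gamma_i
    n = X.shape[1]
    num = den = 0.0
    m = 0
    for j in range(1, n):
        if j == i:
            continue                         # the pair {x_0, x_i} is excluded
        xij = X[:, i] * X[:, j]               # Hadamard product x_{i.j}
        l = next(c for c in range(n) if np.array_equal(X[:, c], xij))
        if l <= j:
            continue                         # count each nonordered pair once
        zp = (X[:, j] + xij) / 2 @ y          # z_{i|j+}
        zm = (X[:, j] - xij) / 2 @ y          # z_{i|j-}
        num, den, m = num + zp ** 2, den + zm ** 2, m + 1
    return num / den, m                       # F_i with (m, m) degrees of freedom

# usage: build the 8 columns x_0, A, B, C, AB, AC, BC, ABC of a 2^3 design
lev = np.array(list(product([-1, 1], repeat=3)))
cols = [np.ones(8)] + [np.prod(lev[:, list(s)], axis=1)
                       for r in (1, 2, 3) for s in combinations(range(3), r)]
X = np.column_stack(cols)
y = np.random.default_rng(0).normal(size=8)
print(dispersion_f(X, y, i=1))                # tests factor A; here m = 3

For this 2^3 design, each factor has m = (n − 2)/2 = 3 pairs in Γ_i, so F_i is compared with an F(3, 3) distribution, rejecting for both large and small values.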
3.3. Alternative Expressions
The intuitive understanding of the above expressions might be somewhat vague. However, more intuitive expressions exist. Compute new "residuals," ε̂_u, u = 1, ..., n, based on a location model including the active location effects expanded with the effects associated with column i and all interaction terms between i and the active location effects. Then the statistic D_i^BH may be computed as

A third interpretation is the following. Given the identified location model, fit separate regressions to the two sets of data associated with the high and low levels of column i, respectively; i.e., use column i as a branching column. Compute the corresponding residual vectors, ε̂_{i+} and ε̂_{i-}, and calculate D_i^BH as

This alternative is, in fact, a generalization of the parametric test suggested by Goldfeld and Quandt (1965). They proposed a similar approach for identifying heteroscedasticity in a more general regression model.
Regarding the three alternatives, we see that the second one intuitively explains the differences between D_i^BH and the methods based directly on residuals. That is, it is necessary to adjust the original residuals to obtain independence between the two sets of residuals. This independence is, of course, conditional on the judgments made in (2) but provides the sufficient requirements for D_i^BH being F-distributed. The third alternative is the most natural way forward to generalize the proposed method to designs other than the 2^(k-p) series, e.g., to nongeometric Plackett and Burman designs and to factorial designs with more than two levels (see Blomkvist et al., 1997).

4. AN ILLUSTRATION FROM DAVIES (1956)

In Davies (1956), data from an improvement study concerning the quality of dyestuff was presented. The outcome was also given an interesting reanalysis by Wiklander (1994). We use the same data set to illustrate our method. The improvement study was carried out as a 2^(5-1) fractional factorial experiment without replicates involving five factors, labeled A-E. The defining relation was chosen as I = -ABCDE. The quality of the dyestuff was measured by a photoelectric spectrometer, which gave a quality characteristic of "the smaller the better" type; i.e., the lower the value recorded, the better the quality. Responses and all 15 orthogonal columns concerning location main and interaction effects are given in Table 1.
Since no independent error estimate is available, the normal probability plot of contrasts suggested by Daniel (1976) is a convenient tool for analysis (see Fig. 1). From this plot it appears that factor D is the only location effect present in the data; hence columns other than D can be used for estimating dispersion effects. In Davies (1956) only location effects were considered, but Wiklander (1994) detected and showed evidence of a dispersion effect from factor E. Further investigations will be conducted using our method.
An estimate of the dispersion effect from factor E becomes available on combining certain columns according to Eq. (1). The pairs of columns included must be judged to belong to the set Γ_E, i.e., judged not to correspond to active location effects. These new contrasts and their calculated values appear in Table 2.
For illustration, the contrast z_{A|E+} is derived by combining columns A and AE, i.e.,

Furthermore, testing H_{i0}: σ²_{i+} = σ²_{i-} against H_{i1}: σ²_{i+} ≠ σ²_{i-} for other factors will require calculations analogous to those in Table 2 but based on other contrasts. The results from such a procedure are presented in Table 3. Note, however, that the five F tests are not independent.
however, that the five F tests are not independent.
FromTable 3, wesee thatfactor E hasadispersion effect that is
difficult to disregard. Wiklander (1994) detected this dispersion effect and
found it significant. However, she used only (3, 3) degrees of freedom in a
similar test. Furthermore, even factor D might have a dispersion effect that
was not detected by Wiklander (1994). However, a complete analysis ofdata
shouldalwaysinvolveresidualanalysis,whichhere reveals apossible
abnormality in observation 11. Treating y I I as a missing observation and
recalculating it by setting some negligible contrast to zero (see Draper and
Stoneman, 1964) shows that the dispersion effect from D becomes insignif-
icant.Furthermore.thedispersion effect from E is fairly insensitive to
changes in y II , and it is therefore reasonable to considerE as the only active
dispersion effect on the dyestuff data.Of course, there is also always therisk
of overestimating the significance due to the multiple test effect.
Table 1 Design Matrix, Responses, and Confounding Structure up to Two-Factor Interactions for the Dyestuff Data

 u   A  B  C  D   AB AC AD BC BD CD  -DE -CE -BE -AE  -E     y_u
 1   -  -  -  -   +  +  +  +  +  +   -   -   -   -    +   201.5
 2   +  -  -  -   -  -  -  +  +  +   +   +   +   -    -   178.0
 3   -  +  -  -   -  +  +  -  -  +   +   +   -   +    -   183.5
 4   +  +  -  -   +  -  -  -  -  +   -   -   +   +    +   176.0
 5   -  -  +  -   +  -  +  -  +  -   +   -   +   +    -   188.5
 6   +  -  +  -   -  +  -  -  +  -   -   +   -   +    +   178.5
 7   -  +  +  -   -  -  +  +  -  -   -   +   +   -    +   174.5
 8   +  +  +  -   +  +  -  +  -  -   +   -   -   -    -   196.5
 9   -  -  -  +   +  +  -  +  -  -   -   +   +   +    -   255.5
10   +  -  -  +   -  -  +  +  -  -   +   -   -   +    +   240.5
11   -  +  -  +   -  +  -  -  +  -   +   -   +   -    +   208.5
12   +  +  -  +   +  -  +  -  +  -   -   +   -   -    -   244.0
13   -  -  +  +   +  -  -  -  -  +   +   +   -   -    +   274.0
14   +  -  +  +   -  +  +  -  -  +   -   -   +   -    -   257.5
15   -  +  +  +   -  -  -  +  +  +   -   -   -   +    -   256.0
16   +  +  +  +   +  +  +  +  +  +   +   +   +   +    +   274.5

Figure 1 Normal probability plot of contrasts for the dyestuff data.

Table 2 Contrasts of Use for Estimating the Dispersion Effect from Factor E

E = "+"     E = "-"
  -7.5        11.0
   0.5       -61.0
  37.5        75.0
   9.5       124.0
  26.5        -2.0
  35.0         6.5

Table 3 F Ratios for the Five Factors from the Dyestuff Data

Factor    F ratio
A          0.36
B          2.83
C          0.37
D          4.47
E          0.14

5. GENUINE REPLICATES AND SPLIT-PLOT DESIGNS

Genuine replicates require full randomization both between runs and within replications, which entails a large amount of experimental work (see, e.g., Box et al., 1978, p. 319). When experiments are expensive, as is often the case in industry, the randomization procedure within replicates is sometimes neglected and the experiment is given a split-plot structure. As seen in one of the examples provided by Bergman and Hynen (1997), this does not have to be a disadvantageous property but can instead be used to estimate two different variance components. Earlier analytic techniques did not support this special property, for which reason split-plot designs have received some criticism. However, some constructive remarks were made by Box and Jones (1992), Lucas and Ju (1992), and Anbari and Lucas (1994).
The method presented in this chapter is applicable to experiments with both genuine and split-plot replicates. Genuine replicates simply increase the degrees of freedom associated with the test statistic, Eq. (4), while split-plot replicates enable estimation of one additional variance component. Therefore, the latter of these two techniques ought to give the greatest increase in knowledge of how the system really works.

6. ON THE PLANNING OF ROBUST DESIGN EXPERIMENTS

The area of robust design methodology is constantly developing; thus a routine for planning experiments is very difficult to establish. In particular, developments enabling new methods for dispersion effect estimation will require changes in existing robust design techniques. We do not claim that the method presented in this chapter is the final step within this area. On the contrary, further research is necessary to fully understand the impact of dispersion estimation on experimental work. In this chapter, we have focused mainly on identification, although the success of an experiment is dependent on thorough planning. Therefore, some effects on the planning phase are worth mentioning.
Finding new techniques for testing and estimating dispersion effects from unreplicated experiments is a large step toward improving design economy. For instance, at the screening stage of sequential experimentation, replicates for identifying dispersion effects will not be necessary. Furthermore, it becomes possible to estimate additional variance components, which gives new perspectives on the use of some special designs such as split-plot designs as well as Taguchi's cross-product designs. Finally, and probably the most important issue to keep in mind, no technique is so perfect that sequential experimentation becomes unimportant. Problem solving is an iterative learning process, where "all-encompassing" solutions seldom come instantaneously. The Plan-Do-Study-Act cycle, or the Deming cycle (see Deming, 1993), is a model for every learning process, even the experimental one.

ACKNOWLEDGMENTS

This study has been financed by the Swedish Research Council for Engineering Sciences. We also wish to thank the participants in a project on design of experiments and robust design methodology supported by a number of Swedish industrial firms. We are also grateful to our colleagues at the Division of Quality Technology for their valuable support.

REFERENCES

Anbari FT, Lucas JM. (1994). Super-efficient designs: How to run your experiment and higher efficiency and lower cost. In: 1994 ASQC 48th Annual Quality Congress Proceedings, May 24-26, 1994, Las Vegas, Nevada.
Bartlett MS, Kendall DG. (1946). The statistical analysis of variance-heterogeneity and the logarithmic transformation. J Roy Stat Soc Ser B 8:128-138.
Bergman B, Hynen A. (1997). Dispersion effects from unreplicated designs in the 2^(k-p) series. Technometrics 39(2).
Bergman B, Holmqvist L. (1988). A Swedish programme on robust design and Taguchi methods. In: Bendell T, ed. Taguchi Methods. Proceedings of the 1988 European Conference, London, 13-14 July 1988. Amsterdam: Elsevier Applied Science.
Bisgaard S, Fuller HT. (1995). Quality quandaries-Reducing variation with two-level factorial experiments. Qual Eng 8(2):373-377.
Blomkvist O, Hynen A, Bergman B. (1997). A method to identify dispersion effects from unreplicated multilevel experiments. Qual Reliab Eng Int 13(2).
Box GEP, Jones S. (1992). Split-plot designs for robust product experimentation. J Appl Stat 19(1):3-26.
Box GEP, Meyer RD. (1986a). An analysis for unreplicated fractional factorials. Technometrics 28(1):11-18.
Box GEP, Meyer RD. (1986b). Dispersion effects from fractional designs. Technometrics 28(1):19-27.
Box GEP, Meyer RD. (1993). Finding the active factors in fractionated screening experiments. J Qual Technol 25(2):94-105.
Box GEP, Hunter WG, Hunter JS. (1978). Statistics for Experimenters-An Introduction to Design, Data Analysis, and Model Building. New York: Wiley.
Box GEP, Bisgaard S, Fung C. (1988). An explanation and critique of Taguchi's contribution to quality engineering. Qual Reliab Eng Int 4(2):123-131.
Cook RD, Weisberg S. (1983). Diagnostics for heteroscedasticity in regression. Biometrika 70(1):1-10.
Daniel C. (1959). Use of half-normal plots in interpreting factorial two-level experiments. Technometrics 1(4):311-341.
Daniel C. (1976). Applications of Statistics to Industrial Experimentation. New York: Wiley.
Davies OL, ed. (1956). Design and Analysis of Industrial Experiments. London: Oliver and Boyd.
Deming WE. (1986). Out of the Crisis. Cambridge, MA: Cambridge University Press.
Deming WE. (1993). The New Economics, for Industry, Government, Education. Cambridge, MA: Massachusetts Institute of Technology.
Draper NR, Stoneman DM. (1964). Estimating missing values in unreplicated two-level factorial and fractional factorial designs. Biometrics 20(3):443-458.
Engel J, Huele AF. (1996). A generalized linear modeling approach to robust design. Technometrics 38(4):365-373.
Glejser H. (1969). A new test for heteroscedasticity. J Am Stat Assoc 64:316-323.
Goldfeld SM, Quandt RE. (1965). Some tests for homoscedasticity. J Am Stat Assoc 60(310):539-547.
Hynen A. (1996). A note on non-normality and dispersion effect identification in unreplicated factorial experiments. No 6, RQT&M Res Rep Ser, Division of Quality Technology and Management, Linkoping University, Sweden.
Hynen A, Sandvik Wiklund P. (1996). On dispersion effects from inner and outer array experiments. No 9, RQT&M Res Rep Ser, Division of Quality Technology and Management, Linkoping University, Sweden.
Kackar RN. (1985). Off-line quality control, parameter design, and the Taguchi method (with discussion). J Qual Technol 17(4):176-209.
Lucas JM, Ju HL. (1992). Split plotting and randomization in industrial experiments. In: 1992 ASQC 46th Annual Quality Congress Transactions, May 18-20, 1992, Nashville, TN. Milwaukee, WI: ASQC.
McCullagh P, Nelder JA. (1989). Generalized Linear Models. 2nd ed. London: Chapman and Hall.
Myers R, Khuri AI, Vining G. (1992). Response surface alternatives to the Taguchi robust parameter design approach. Am Stat 46(2):131-139.
Nair VN. (1992). Taguchi's parameter design: A panel discussion. Technometrics 34(2):127-161.
Nair VN, Pregibon D. (1988). Analyzing dispersion effects from replicated factorial experiments. Technometrics 30(3):247-257.
Nelder JA, Lee Y. (1991). Generalized linear models for the analysis of Taguchi-type experiments. Appl Stochastic Models Data Anal 7:107-120.
Phadke MS. (1989). Quality Engineering Using Robust Design. Englewood Cliffs, NJ: Prentice-Hall.
Shewhart WA. (1931). Economic Control of Quality of Manufactured Product. New York: Van Nostrand. (A 1981 reprint is available from the American Society for Quality Control.)
Shoemaker AC, Tsui K-L, Wu CFJ. (1991). Economical experimentation methods for robust design. Technometrics 33(4):415-427.
Taguchi G. (1981). On-Line Quality Control During Production. Tokyo: Japanese Standards Association.
Taguchi G. (1986). Introduction to Quality Engineering. Tokyo: Asian Productivity Organization.
Taguchi G, Wu Y. (1980). Introduction to Off-Line Quality Control. Nagoya, Japan: Central Japan Quality Control Association.
Wang PC. (1989). Tests for dispersion effects from orthogonal arrays. Comput Stat Data Anal 8:109-117.
Wiklander K. (1994). Models for dispersion effects in unreplicated two-level factorial experiments. Thesis No. 1994:1, ISSN 1100-2255, The University of Gothenburg, Sweden.
22
A Graphical Method for Model Fitting in
Parameter Design with Dynamic
Characteristics
Sung H. Park
Seoul National University, Seoul, Korea

Je H. Choi
Samsung Display Devices Co., Ltd., Suwon, Korea

ABSTRACT

Detecting the relationship between the mean and variance of the response and finding the control factors with dispersion effects in parameter design and analysis for dynamic characteristics are important. In this paper, a graphical method, called the multiple mean-variance plot, is proposed to detect the relationship between the mean and variance of the response. Also, to find the control factors with dispersion effects, the analysis of covariance method is proposed, and its properties are studied compared with the dynamic signal-to-noise ratio. A case study is presented to illustrate the proposed methods.

1. INTRODUCTION

Achieving high product quality at low cost is a very important goal in modern industry. One of the most popular statistical methods using an experimental design approach to reach this goal is parameter design, which is often called robust parameter design. Parameter design was proposed by Taguchi (1986, 1987) and explained further by Box (1988), Leon et al. (1987), Nair (1992), Phadke (1989), and Park (1996), among many others. The main idea of parameter design is to determine the setting of control factors (or design parameters) of a product or process in which the response characteristic is robust to the uncontrollable variations caused by the noise factors and hence has a small variability.
The parameter design uses the S/N (signal-to-noise) ratio 10 log(μ²/σ²) for the static characteristics (hereafter it will be called the static S/N ratio), where μ and σ² are the mean and variance of the response, respectively. S/N is the ratio of the power of the signal to the power of the noise. The S/N ratio for the dynamic characteristics, which will hereafter be called the dynamic S/N ratio, is defined as 10 log(β²/σ²) under the model y = α + βM + ε or y = βM + ε, where y is the response, M is the signal factor, and σ² is the variance of the error ε. Here β² implies the power of the signal, and σ² implies the power of the noise.
The usefulness of the dynamic S/N ratio has been proved, since many engineering systems can be adequately described as dynamic characteristic problems. See, for instance, many case studies presented in the American Supplier Institute (1991) symposium on Taguchi methods.
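For concreteness, a small Python sketch (our own, with invented numbers) computing the dynamic S/N ratio under the through-the-origin model y = βM + ε:

import numpy as np

M = np.array([1., 1., 2., 2., 3., 3.])          # signal factor levels
y = np.array([2.1, 1.9, 4.2, 3.8, 6.3, 5.9])    # observed responses

beta = (M @ y) / (M @ M)                         # least squares slope estimate
resid = y - beta * M
sigma2 = resid @ resid / (len(y) - 1)            # error variance estimate
print("dynamic S/N (dB):", 10 * np.log10(beta**2 / sigma2))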

2. DESCRIPTION OF THE DYNAMIC CHARACTERISTICS SYSTEM

In the parameter design, the experimental factors are classified according to their roles into the following three classes.
1. Signal factor (M). This factor influences the average value but not the variability of the response. It is also called the target-control factor.
2. Noise factor (N). This factor has an influence over the response variability but cannot be controlled in actual applications.
3. Control factor (x). This factor can be controlled and manipulated by the engineer, and its level is selected to make the product's response robust to noise factors. It is the goal of the experiment to determine the best levels of the control factors that are robust to noise factors under the existence of a signal factor.
Parameter design systems are classified into two categories according to the nature of the target value of the response. One is the static system, which has a fixed target value, and the other is the dynamic system, which has varied target values according to the levels of the signal factor. The dynamic system is shown in Figure 1. In this section the dynamic characteristic problem, which has a continuous signal input and a continuous output with some control factors and noise factors, is considered.
Figure 1 Dynamic system of parameter design.

3. UNKNOWN VARIANCE FUNCTION AND DETECTION

Let y_ijk denote the response corresponding to the ith setting of the control factors, jth level of the signal factor, and kth noise factor or repetition, for i = 1, ..., I; j = 1, ..., m; and k = 1, ..., n. Then the data structure of the response in the dynamic system is assumed to be expressed as

The data structure in this section has the following assumptions:

1. The error has zero expectation and variance σ²_ij.
2. The expectation of y_ijk is f_i(M_j) for all k and can be expressed as a polynomial, especially the first-order polynomial α_i + β_i M_j or β_i M_j.
3. The effect of the noise factors is included in the error variance σ²_ij, so the subscript of the variance term σ²_ij does not contain k.
4. The variance of the response depends on its expected value and can be expressed as σ²_i × V[E(y_ijk)], where the variance function V(·) represents the relationship of the variance of the response to its mean, and the term σ²_i represents the remaining part, which depends on the ith control factor setting.
The experimenter is interested in finding the control factor setting that makes σ²_i small and minimizes σ²_i × V[E(y_ijk)]. In general the relationship between the mean and variance is unknown, and the detection and modeling of the variance function V(·) is important.

For the detection of V(·) and model fitting, the following three-step optimization procedure is proposed.
Step 1. Detect the relationship between mean and variance by constructing a multiple mean-variance plot.
Step 2. Find the control factors with dispersion effects by the analysis of covariance (ANCOVA) method.
Step 3. Fit the response as a function f_i(M_j) of the signal factor M to adjust the sensitivity of the response to the signal factor M.

3.1 Detecting the Relationship Between Mean and Variance by Using a Multiple Mean-Variance Plot
To detect the relationship V(·) between the variance and the mean of the response, a multiple mean-variance plot (MMVP) is suggested. Nair and Pregibon (1986) proposed the mean-variance plot, and Lunani et al. (1995) proposed the sensitivity-standard deviation (SS) plot for the dynamic characteristic problems. Lunani et al. considered the model where the variance structure satisfies the relationship

Under this model there is a logarithmic relationship between the sensitivity measure (β_i) and the standard deviation (s_i),

log(s_i) = log(σ_i) + (θ/2) log(β_i)    (2)

where β_i and s_i are obtained from the regression fitting for each control factor setting i. Lunani et al. plotted [log(β_i), log(s_i)] for each control factor and visually examined the plots to check the nature of the relationship. They noticed that when some control factors have dispersion effects, the intercepts log(σ_i) can vary from one control factor setting to another, making it possible to have several parallel lines with a common slope θ/2 in the SS plot under model (2).
The MMVP is proposed for model (1). It is the combination of the mean-variance plot and the multiple SS plot. Under model (1), there is a logarithmic relationship between ȳ_ij and s²_ij,

where s²_ij = Σ_k (y_ijk − ȳ_ij)²/(n − 1). Note that the expected values of ȳ_ij and s²_ij are E(ȳ_ij) = f_i(M_j) = μ_ij and E(s²_ij) = V[f_i(M_j)]σ²_i = V(μ_ij)σ²_i, where μ_ij = E(y_ijk). The procedure used for the MMVP is as follows.
1. Get I × m data pairs (ȳ_ij, s²_ij) for each control factor setting i and each signal factor level j.
2. Plot these paired data [log(ȳ_ij), log(s²_ij)] on the scatter plot for each control factor.
3. Identify the points of the frame of each control factor according to its levels.
4. Detect the variance relationship V(·).
For example, if the orthogonal array L18 as the inner array and a three-level signal factor are used for experiments, a total of 54 (= 18 × 3) paired data [log(ȳ_ij), log(s²_ij)] are obtained. By plotting [log(ȳ_ij), log(s²_ij)], detection of the form of V(·) is possible. If the points are scattered like an exponential function, replacing log(ȳ_ij) by ȳ_ij would make the points linear. If that is the case, then V(μ) = exp(θμ) is selected as the proper variance function. For an example see Figure 2 in Section 4.
Like the SS plot of Lunani et al., if the variance function is properly selected and the assumption of model (1) holds, then the points on the frame of the control factor with dispersion effects are identified on separate lines. Then the control factors with dispersion effects can be easily found.
Note that when the objective of the analysis is focused on the variance of the response, the term log(s²) is usually used rather than s² for certain statistical reasons. One reason is that the effect on dispersion may be reasonably considered a multiplicative effect rather than an additive effect. Moreover, a linear model on log(s²) can be easily used without constraint. In addition, the performance of log(s²) is stable when the heteroscedasticity problem occurs. Logothetis (1989) showed that the mean of log(s²) depends on log(σ²) and n, that the variance of log(s²) is stable, depending only on n, and furthermore that log(s²) converges to approximate normality as n increases.
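The MMVP construction described above reduces to a grouped scatter plot of log cell means against log cell variances. The following Python sketch is our own illustration (the simulated data set, the array shapes, and the single two-level factor A are invented assumptions):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
I, m, n = 18, 3, 4                         # settings, signal levels, repetitions
A = np.repeat([0, 1], 9)                   # levels of one control factor
mu = np.linspace(10, 60, m)[None, :, None] * (1 + 0.2 * rng.random((I, 1, 1)))
sd = 0.1 * (1 + A)[:, None, None] * np.sqrt(mu)   # planted dispersion effect in A
y = mu + rng.normal(0.0, sd, size=(I, m, n))      # responses y_ijk

ybar = y.mean(axis=2)                      # cell means ybar_ij
s2 = y.var(axis=2, ddof=1)                 # cell variances s2_ij
for lev, marker in zip((0, 1), ("o", "+")):
    sel = A == lev
    plt.scatter(np.log(ybar[sel]), np.log(s2[sel]), marker=marker,
                label=f"factor A level {lev}")
plt.xlabel("log(mean)"); plt.ylabel("log(s^2)"); plt.legend()
plt.show()
# two parallel point clouds with different intercepts indicate a
# dispersion effect in the plotted control factor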

3.2. Finding the Control Factors with Dispersion Effects by the ANCOVA Method
In the second step, finding the control factors with dispersion effects, the implementation of the ANCOVA method is proposed, where the covariate is determined from the variance function selected at step 1. This method is an extension of the models of Logothetis (1989) and Engel (1992).
Logothetis (1989) thought that the relationship could be detected by using the regression model on log(s²_i) with an independent variable log(ȳ_i):

Engel (1992) noticed that the parameter log(σ) is a nonconstant term in the Logothetis model and replaced it by the term log(σ²_i), which is a linear function of the control factors.

When the logarithm is taken on the variance term in model (1), the following equation is obtained.

Here x_i is the row vector of the control factors, and γ is a parameter vector. When this model is applied to practical applications, the fitting model (7) is used in the form of an ANCOVA. Here the log sample variance log(s²_ij) is the dependent variable, the control factors are the factors given in the vector x_i, and the sample mean ȳ_ij or its function h(ȳ_ij) is the covariate, where h(·)^θ = V(·):

The control factors of significance are selected to have dispersion effects from the analysis of the model.
Note that this model has two main differences from Engel's model:
1. The variance function V(μ) is a general function instead of μ^θ.
2. The coefficient θ_i is considered a nonconstant parameter.
The variance function V(μ) cannot be easily detected in the static system, so the power-of-the-mean model (5) is mainly used in Engel's paper. But in the dynamic system V(μ) can be detected, and it can have a general form. When several V(μ)'s are candidates, for example, V(μ) = μ^θ or V(μ) = exp(θμ), the variance function V(μ) can also be detected at step 1, and the selection of V(μ) can be done by some variable selection technique of the regression with log(s²_ij) as the dependent variable. That variance function V(μ) is preferred which separates the plotted points into parallel lines with a common slope and different intercepts, because the control factors with dispersion effects can then be detected and well separated there. Taking V(μ) as μ^θ is the direct generalization of the model of Engel.
A nonconstant θ_i has the practical meaning that as the mean increases, the variance can increase at a different rate at each level of some control factors. If the coefficients of the lines look identical from the search at step 1,

then fitting model (7) is the ANCOVA method without interaction between the covariate and the factors. At the beginning of the analysis, model (7) with constant term θ is used. If the points are separated into lines with different slopes in the frame of some control factors, then changing the variance function V(μ) or extending the term θ into θ_i may be considered.
When V(μ) = μ^θ and the parameter θ is taken as equal to 2 beforehand, the ANCOVA method is equivalent to the procedure for finding the control factors to maximize the static S/N ratio. We can observe that the following model is derived from model (7):

The dynamic characteristic approach has some merits compared to the static characteristic approach for detecting the variance function. One is that it has a large number of degrees of freedom when the dispersion effects of control factors are checked. In the example of Engel (1992), the inner array is saturated, so there is no degree of freedom allowed for the covariate log(ȳ_i). However, the ANCOVA method has a large number of degrees of freedom when the number of signal factor levels is large. Another merit is the distribution of the mean response. As the mean value is more widely spread, the precision of estimation of the variance function increases (Davidian and Carroll, 1987). In the static system the response is usually distributed around the fixed target value and less spread. But in the dynamic system the target value varies according to the signal input value, and the response is widely spread according to the signal factor level. Taguchi's optimization procedure with the dynamic S/N ratio does not enjoy these merits. The sample variances of each signal factor level are combined into one quantity, the dynamic S/N ratio. Here the ANCOVA method is proposed to utilize these merits by taking the sample mean and the sample variance at each signal factor level.
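In practice the fit of model (7) amounts to an ordinary least squares regression of log(s²_ij) on the control-factor codes plus the covariate. The following Python sketch is our own illustration using statsmodels; the dict-based data layout and the choice h(ȳ_ij) = ȳ_ij, so that the covariate enters as log(ȳ_ij), are assumptions for the example, not prescriptions from the paper.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def ancova_dispersion(ybar, s2, factors):
    # ybar, s2: (I, m) arrays of cell means and variances; factors: dict
    # mapping a factor name to its length-I level codes
    I, m = ybar.shape
    df = pd.DataFrame({"log_s2": np.log(s2).ravel(),
                       "log_mean": np.log(ybar).ravel()})
    for name, levels in factors.items():
        df[name] = np.repeat(levels, m)          # one level code per setting
    formula = "log_s2 ~ log_mean + " + " + ".join(f"C({f})" for f in factors)
    return smf.ols(formula, data=df).fit()       # significant factor terms
                                                 # indicate dispersion effects

# e.g.: fit = ancova_dispersion(ybar, s2, {"A": A, "B": B}); print(fit.summary())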

3.3. Fitting the Response as a Function of the Signal Factor
When the variance of the response is a function of the levels of the signal factor, the use of weighted least squares (WLS) is recommended to estimate f_i(M_j) for each i. After the control factors with dispersion effects are chosen at step 2, the variance of response y_ijk is estimated as σ̂²_i V(ȳ_ij). Then the weights are the inverse of the estimated variance of each response, and the WLS method is applied for each control factor setting i to adjust the sensitivity of the response to the signal factor M.
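A minimal sketch of this WLS step (our own, assuming the no-intercept model y = β_i M + ε and variance estimates carried over from step 2):

import numpy as np

def wls_slope(M, y, var_hat):
    # M, y, var_hat: one entry per observation y_ijk for a single setting i;
    # var_hat = sigma2_i * V(ybar_ij) from the ANCOVA step
    w = 1.0 / var_hat
    return np.sum(w * M * y) / np.sum(w * M * M)   # weighted LS through origin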

4. AN EXAMPLE: CHEMICAL CLEANING EXPERIMENT

In this section, the data set from the chemical cleaning process for Kovar metal components (American Supplier Institute, 1991) is reanalyzed to show how to use the ANCOVA method and multiple mean-variance plot proposed in Section 3 to find the control factors with dispersion effects and the functional relationship between the mean and the variance.
The response y is the amount of the material removed as a result of the chemical cleaning process. The inner array is L18, including a two-level factor A and three-level factors B, C, D, E, F, and G. The outer array consists of a three-level signal factor M crossed with L4 for a compound array of three two-level noise factors X, Y, Z. The signal factor M is the acid exposure time, which is known to have a linear impact on the expected value of the response. By imposing the linearity of the signal factor, the process becomes predictable and more controllable from the engineering knowledge. The information about the experimental factors and the raw data are given in Tables 1 and 2.

Table 1  Experimental Factors and Levels for Chemical Cleaning Experiment

Factor and description                   Level 0     Level 1     Level 2
Control factors
  A  Part status at Brite-dip            Dry         Wet
  B  Descale acid exposure time          B0          B1          B2
  C  Descale acid strength               C0          C1          C2
  D  Descale acid temperature            Low         Med         High
  E  Nitric/acetic (ratio)               E0          E1          E2
  F  Percent in Brite-dip acid           Low         Med         High
  G  Brite-dip acid temperature          Standard    Remachine
Noise factors
  X  Descale acid age                    New         Used
  Y  Brite-dip acid age                  New         Used
  Z  Part type                           Stamped     Machined
Signal factor
  M  Exposure time in Brite-dip          M1          M2          M3

Table 2  Experimental Layout and Raw Data for Chemical Cleaning Experiment

(The L18 inner array assigns the control factors A-G to 18 runs; each run has
12 responses, one for each combination of the three signal factor levels M1-M3
with the four compound noise conditions determined by X, Y, and Z. The raw
response values are not legibly reproduced here; see American Supplier
Institute (1991) for the original data.)

The response data y_ijk (i = 1, ..., 18; j = 1, 2, 3; k = 1, 2, 3, 4) are
summarized into 18 × 3 paired data [log(ȳ_ij), log(s_ij)] for each control factor
setting i and signal factor level j. These paired data [log(ȳ_ij), log(s_ij)] and
[ȳ_ij, log(s_ij)] are plotted in Figures 2a and 2b. These figures show clearly
that a functional relationship exists between the variance and the mean of the
response and can be explained by a linear function on the log-log scale. We
can therefore assume that h(μ) = μ rather than h(μ) = exp(μ) for model (7).
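
In code, this summarization amounts to group means and variances on the log scale; the following sketch (ours, with made-up response values) also fits the log-log line whose slope estimates θ in V(μ) = μ^θ.

```python
import numpy as np

# y[i, j, k]: response for control setting i, signal level j, replicate k.
rng = np.random.default_rng(0)
y = rng.gamma(shape=9.0, scale=4.0, size=(18, 3, 4))   # made-up data

ybar = y.mean(axis=2)                   # sample means, an 18 x 3 array
s = y.std(axis=2, ddof=1)               # sample standard deviations

# Least squares line of log s^2 on log ybar; an approximately linear
# cloud with slope theta supports the power function V(mu) = mu^theta.
slope, intercept = np.polyfit(np.log(ybar).ravel(), np.log(s**2).ravel(), 1)
```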
In Figure 3, the multiple mean-variance plots of [log(ȳ_ij), log(s_ij)]
show which control factors have dispersion effects. In the frame of factor A,
the points for level 1 (symbol +) are shown along with the points for level 0
(symbol o). Two separate fitting lines can be drawn, with a common slope and
different intercepts. When the level of factor A is 0, the response has a
smaller variance. By similar reasoning, level 0 of factor B can be selected. For
factor C the difference between levels 0 and 2 does not look large, and

Figure 2  Plots of [log(ȳ_ij), log(s_ij)] and [ȳ_ij, log(s_ij)] for the chemical
cleaning experiment (panels a and b: log of mean and mean, respectively,
against log of standard deviation).

either of those may be selected. In the other frames of Figure 3, the points
are not divided into separate lines according to the levels of factors D, E, F,
and G.
The results from the analysis of the dynamic S/N ratio are presented in
Table 4. These results show that A, B, C, and D are the important factors
with respect to the S/N ratio. The best levels selected are A0, B0, C, and D1,
which are similar to the levels selected by the ANCOVA
method except for factor D. But in the ANCOVA method, factors C and
D are not very significant (their p values are 0.053 and 0.057, respectively),
and other levels of these factors may be selected.
Figure 3  Multiple mean-variance plots of [log(ȳ_ij), log(s_ij)] for the
chemical cleaning experiment, one frame for each control factor A-G.

Table 3  ANOVA Table for the ANCOVA Method with Covariate log(ȳ_ij)

Source          DF    Adjusted SS     F        p value
Covariate        1      4.81013     100.12     0.000
A                1      0.32313       5.66     0.022
B                2      0.91742       7.01     0.002
C                2      0.35064       3.15     0.053
D                2      0.30866       3.06     0.057
E                2      0.09540
F                2      0.14475
G                2      0.05400
A × B            2      0.18098
(e)             37      2.59806
Pooled error    45      3.09112
T               53     17.92377

Table 4  ANOVA Table for the Dynamic Signal-to-Noise Ratio

Source          DF    Adjusted SS     F        ρ (%)
A                1     14.010       14.84      12.13
B                2     30.166       15.98      26.27
C                2     32.417       17.17      28.34
D                2     23.101       12.24      19.84
E                2      0.002
F                2      2.620
G                2      1.667
A × B            2      3.542        2.42       1.93
(e)              2      0.803
Pooled error     8      5.9071                 11.49
T               17    109.132                 100.00

REFERENCES

American Supplier Institute. (1991). Taguchi Symposium: Case Studies and Tutorials. Dearborn, MI: ASI.
Box GEP. (1988). Signal-to-noise ratios, performance criteria, and transformations (with discussion). Technometrics 30:1-40.
Davidian M, Carroll RJ. (1987). Variance function estimation. J Am Stat Assoc 82:1079-1091.
Engel J. (1992). Modelling variation in industrial experiments. Appl Stat 41(3):579-593.
León RV, Shoemaker AC, Kacker RN. (1987). Performance measures independent of adjustment: An explanation and extension of Taguchi's signal-to-noise ratios (with discussion). Technometrics 29:253-285.
Logothetis N. (1989). Establishing a noise performance measure. Statistician 38:155-174.
Lunani M, Nair VN, Wasserman GS. (1995). Robust Design with Dynamic Characteristics: A Graphical Approach to Identifying Suitable Measures of Dispersion. Tech Rep 253, University of Michigan.
Nair VN, ed. (1992). Taguchi's parameter design: A panel discussion. Technometrics 34:127-160.
Nair VN, Pregibon D. (1986). A data analysis strategy for quality engineering experiments. AT&T Tech J 65:73-84.
Park SH. (1996). Robust Design and Analysis for Quality Engineering. London: Chapman & Hall.
Phadke MS. (1989). Quality Engineering Using Robust Design. Englewood Cliffs, NJ: Prentice-Hall.
Taguchi G. (1986). Introduction to Quality Engineering. Tokyo: Asian Productivity Organization.
Taguchi G. (1987). System of Experimental Design. White Plains, NY: Unipub/Kraus International.
23
Joint Modeling of the Mean and
Dispersion for the Analysis of Quality
Improvement Experiments
Youngjo Lee
Seoul National University, Seoul, Korea

John A. Nelder
Imperial College, London, England

1. INTRODUCTION

The Taguchi method for analyzing quality improvement experiments has
been much discussed. It first defines summarizing quantities called performance
measures (PMs) and then analyzes them using analysis of variance.
PMs are defined as functions of the response y; however, we believe that
they should be regarded as quantities of interest derived after analysis of the
basic response and defined as functions of the fitted values or parameter
estimates. One of Taguchi's signal-to-noise ratios (SNRs) involves Σ y_i^{-2}.
This is not a good estimate of μ^{-2}, though μ̂^{-2} might be acceptable.
However, as Box (1988) showed, Taguchi's signal-to-noise ratios make sense
only when the log of the response is normally distributed. The correct
statistical procedure is (1) to analyze the basic responses using appropriate
statistical models and then (2) to form quantities of interest and measures of
their uncertainty. Taguchi's procedure inverts this established process of
statistical analysis by forming the PMs first and then analyzing them.
However, most writers concentrate on the analysis of PMs, though they
may use other than signal-to-noise ratios. Miller and Wu (1996) refer to
the Taguchi approach as performance measure modeling (PMM) and the


established statistical approach as response function modeling (RFM).
They, of course, recommend RFM. However, what they actually do
seems to be closer to the PMM approach. The major difference is that
they consider statistical models for responses before choosing PMs.
Because of the initial data reduction to PMs, their primary tool for analysis
is restricted to graphical tools such as the normal probability plot.
Interpretation of such plots can be subjective. Because information on the
adequacy of the model is in the residuals, analysis using PMs makes testing
for lack of fit difficult or impossible.
In 1991, we (Nelder and Lee, 1991) published a paper giving a general
method that allows analysis of data from Taguchi experiments in a statistically
natural way, exploiting the merits of standard statistical methods. In
this chapter, we provide a detailed exposition of our method and indicate
how to extend the analysis to Taguchi experimental data for dynamic systems.

2. THE MODEL

Taguchi robust parametric design aims to find the optimal setting of control
(i.e., controllable) factors that minimizes the deviation from the target value
caused by uncontrollable noise factors. Robustness means that the resulting
products are then less sensitive to the noise factors. Suppose a response
variable y can be modeled by a GLM with E(y_i) = μ_i and
var(y_i) = φ_i V(μ_i), where the φ_i are dispersion parameters and V() the variance
function. The variance of y_i is thus the product of two components:
V(μ_i) expresses the intrinsic variability due to the functional dependence of
the variance on the mean μ_i, while φ_i expresses the extrinsic variability,
which is independent of the range of means involved. Suppose we have
control factors C_1, ..., C_p and noise factors N_1, ..., N_q. In our 1991 paper
we considered the following joint models for the mean and the dispersion:

    g(μ_i) = f_1(C_1, ..., C_p, N_1, ..., N_q)        (1)
    log φ_i = f_2(C_1, ..., C_p, N_1, ..., N_q)       (2)

where g() is the link function for the mean, and f_1() and f_2() are
linear models for experimental designs, e.g., the main-effects model is
C_1 + ... + C_p + N_1 + ... + N_q. The log link is assumed for the dispersion as a
default; there are often insufficient data to discriminate between different
link functions. We need to choose for each model a variance function, a link
function, and terms in the linear predictor. By choosing an appropriate
variance function for the mean, we aim to eliminate unnecessary complications
in the model due to functional dependence between the mean and
variance [the separation of Box (1988)]. It is useful if the final mean and
dispersion models have as few common factors as possible. The link function
for the mean should give the simplest additive model [the parsimony of
Box (1988)].
Control factors occurring in f_2() only or in both f_1() and f_2() are used
to minimize the extrinsic variance, and control factors occurring in f_1() only
are then used to adjust the mean to a target without affecting the extrinsic
variability.
If we analyze PMs such as SNRs, calculated over the noise factors for
each combination of the control factors, it is then impossible to make inferences
about the noise factors in the model for the mean. This reduction of
data leads to the number of responses for the dispersion analysis being only
a fraction of those available for the mean. We do not have such problems
since we analyze the entire set of data; see Lee and Nelder (1998).

3. THE ALGORITHM

When a GLM family of distributions does not exist for a given V(μ_i),
Wedderburn's (1974) quasi-likelihood (QL) is often used for inference
from the mean model (1). However, it cannot be used for joint inference
from both mean and dispersion models; for this we need Nelder and
Pregibon's (1987) extended quasi-likelihood (EQL), defined by

    -2Q⁺ = Σ_i { d_i/φ_i + log[2π φ_i V(y_i)] }

where d_i = 2 ∫_{μ̂_i}^{y_i} (y_i − u)/V(u) du denotes the GLM deviance component.
For given φ_i, the EQL is, apart from a constant, the quasi-likelihood
(QL) of Wedderburn (1974) for a GLM with variance function V(μ_i). Thus
maximizing Q⁺ with respect to β will give us the QL estimators with prior
weights 1/φ_i, satisfying the usual GLM estimating equations.

The EQL provides a scaled deviance with components d_i/φ_i, and this
deviance may be used as a measure of discrepancy, so that we can create
an analysis-of-deviance table for a nested set of models, as with GLMs. The
differences of such deviances allow us to identify significant experimental
factors on the same link scale and to compare different link functions for the
mean model (1).
For given μ_i, the EQL gives a GLM with the gamma distribution for
the deviance components d_i, and this forms the basis of the dispersion
model. Thus, with the EQL we can identify significant experimental factors
for both the mean and the dispersion models. However, when the number of
mean parameters is relatively large compared with the sample size, dispersion
estimators can be seriously biased without appropriate adjustment for
the degrees of freedom. The REML technique removes this bias for mixed
linear models (Patterson and Thompson, 1971). Cox and Reid (1987)
extended the REML idea to a wider class of models that satisfy an orthogonality
relation of the form E(∂²Q⁺/∂β ∂γ) = 0. The Cox-Reid adjusted
profile EQL becomes

    Q* = [Q⁺ + (1/2) log |2π(X'WX)^{-1}|] evaluated at β = β̂_γ

where W is an n × n diagonal matrix with ith element {1/(φ_i V(μ_i))}(∂μ_i/∂η_i)².
Thus for inference from the dispersion model (2) we (Lee and
Nelder, 1998) use Q*; then ∂Q*/∂γ = 0 gives estimating equations for γ,
(−∂²Q*/∂γ²)^{-1} a variance estimate for γ̂, and −2Q* the basis of a deviance test. To
overcome the slow computation of REML estimation, we (Lee and Nelder,
1998) have developed an efficient approximation.
The EQL is the true likelihood for the normal and inverse Gaussian
distributions, so our estimators (deviance tests) for β and γ are the ML and
REML estimators (likelihood ratio and adjusted likelihood ratio tests),
respectively. The EQL also gives good approximations for the remaining
distributions of the GLM family. There are two approximations in the
assumed model for the dispersion. The first lies in assuming that
E(d_i) = φ_i; in general the bias is small except in extreme cases, e.g.,
Poisson errors with small μ. Such biases enter the analysis for the mean
only through the weights and do not much affect the estimates of β. The
second approximation is the assumption of a gamma error for the dispersion
analysis, regardless of the error chosen for the mean. The justification for
this is the effectiveness of the deviance transform in inducing a good approximation
to normality for all the GLM distributions (Pierce and Schafer,
1986), excluding extreme cases such as binary data; see also the simulation
study of Nelder and Lee (1992).

In summary, our model consists of two interlinked GLMs, one for the
mean and one for the dispersion, as follows. The two connections, one in
each direction, are marked. The deviance component from the model for the
mean becomes the response for the dispersion model, and the inverses of the
fitted values from the dispersion model give prior weights for the mean model.
(See Table 1.) In consequence, we can use all the methods for GLMs for
inferences from the joint models, including various model-checking procedures.
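
As a rough illustration of this interlinked scheme (our sketch, not code from the chapter), the following Python function alternates the two GLM fits with statsmodels; the Gaussian mean model, the supplied design matrices, and the fixed iteration count are assumptions made purely for the example.

```python
import numpy as np
import statsmodels.api as sm

def joint_glm(y, X_mean, X_disp, n_iter=10):
    # Alternate between the two interlinked GLMs: a mean model fitted
    # with prior weights 1/phi_hat, and a gamma dispersion GLM (log
    # link) fitted to the deviance components d_i of the mean model.
    phi = np.ones(len(y))
    for _ in range(n_iter):
        # Mean GLM; a Gaussian family with identity link is assumed
        # here for illustration, so d_i = (y_i - mu_i)^2.
        mean_fit = sm.GLM(y, X_mean, family=sm.families.Gaussian(),
                          var_weights=1.0 / phi).fit()
        d = (np.asarray(y) - mean_fit.fittedvalues) ** 2
        # Dispersion GLM: gamma errors, log link, response d (the d_i
        # must be positive), prior weight 1.
        disp_fit = sm.GLM(d, X_disp,
                          family=sm.families.Gamma(link=sm.families.links.Log())).fit()
        phi = disp_fit.fittedvalues        # feeds back as prior weights
    return mean_fit, disp_fit
```

The two marked connections of Table 1 appear directly: d flows from the mean fit into the dispersion fit, and 1/φ̂ flows back as the prior weight.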

4. STATISTICAL MODELS FOR DYNAMIC SYSTEMS

Recently, there has been an emphasis on making the system robust over a
range of input conditions, so the relationship between the input (signal
factor) and output (response) is of interest. Following Lunani et al.
(1997), we refer to this as a dynamic system. Miller and Wu (1996) and
Lunani et al. (1997) have studied Taguchi's method for dynamic systems.
Suppose we have a continuous signal factor M, measured at m values. These
researchers consider models analogous to the mean and dispersion models

    g(μ_i) = f_1(C_1, ..., C_p, N_1, ..., N_q) * I(M)        (3)

and

    log φ_i = f_2(C_1, ..., C_p, N_1, ..., N_q)              (4)

where g() is the link function for the mean and I() is the function describing
the relationship between the input (signal factor) and output (response).

Table 1  The Two Interlinked GLMs

                 Mean model    Dispersion model
Response         y             d (deviance component from the mean model)
Mean             μ             φ
Variance         φV(μ)         2φ²
Prior weight     1/φ̂           1

Thefunction I ( ) may be knownaprioriormayhaveto be identified.


Lunani et al. assume that I ( M ) = M P , i.e., that it is a linear function with-
out an intercept, andMiller and Wu select I ( M ) = bo + p, M + p 2 M 2 .The *
operator in Eq. (3) represents the fact that parameters ofI ( ) are modeled as
functions of C, and N / . In dynamic systems the signal factor is used to adjust
the mean using the mean model (3), and control factors are set to optimize
the sensitivity measure [see Miller and Wu (1996) and Lunani et al. (1997)l.
The fitting of dynamic systems has so far been done i n two stages; in
stage I parameters in I ( ) are estimated for each run, and in stage I1 models
are fitted separately to eachset of stage I parameter estimates. For example,
+ +
with Z ( M ) ,= Po & , M &M’ we fit I ( M ) for each individual run, com-
puting bo, P I , and p2, and then fit separate models for these as functions of
C, and N , . However, the model chosenby this approach may not fit the data
well because data reduction to PMs under the wrong model makes testing
for lack of fit difficult. Our method analyzes thewhole data set and does not
require two stages of fitting. All that is necessary is that the software allow
the specification of compound terms of the form A . s in the linear predictor,
denoting that the slope for s varies with the level of the factor A .
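
In the hypothetical Python/statsmodels sketch below, the formula term C(A):M plays the role of the compound term A·s; the data frame and factor names are invented for illustration.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Invented long-format data: one row per observation, with response y,
# signal level M, and a two-level control factor A.
df = pd.DataFrame({
    "y": [1.1, 2.3, 3.2, 0.9, 1.7, 2.6, 1.0, 2.1, 3.0, 1.2, 1.9, 2.8],
    "M": [1, 2, 3] * 4,
    "A": ["lo"] * 3 + ["hi"] * 3 + ["lo"] * 3 + ["hi"] * 3,
})

# C(A):M is a compound term "A . M": a separate slope for M at each
# level of A, estimated in a single stage on the full data set.
fit = smf.ols("y ~ C(A) + C(A):M", data=df).fit()
print(fit.params)
```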

5. ADVANTAGES OF THE GLM APPROACH

The advantages of analyzing all the individual responses using two interlinked
GLMs over the analysis of variance of PMs (with possible transformation
of the data) are as follows:

1. Box's (1988) two criteria, separation and parsimony, cannot
necessarily both be achieved by a single data transformation,
while the GLM analysis achieves them separately by choosing
appropriate variance and link functions for the two interlinked
GLMs. Analysis is thus always carried out on the original data.
2. Any GLM can be used for modeling the means. Thus counts,
proportions, and positive continuous quantities can be incorporated
naturally into the model.
3. Our model uses all the information in the data. For example, the
dispersion analysis has a response for each observation, just as
with the mean. Compare this with the use of s_i² calculated over
the noise factors for each combination of the control factors; this
leads to the number of responses for the dispersion analysis being
only a fraction of that available for the mean. Such s_i² do not use
random variation but are functions of arbitrarily selected levels for
the noise factors. Furthermore, when the s_i² (β̂_i) are computed over
noise (signal) factors in static (dynamic) systems, it is impossible to
make inferences about those noise (signal) factors in the model for
the dispersion (mean). With this approach, the signal factor cannot
be included in the dispersion model (4) for dynamic systems.
4. The model is defined for any design. For example, our method can
be used for dynamic systems as easily as for static systems, and
we can also consider more general models such as log(φ) =
f_2(C_1, ..., C_p, N_1, ..., N_q, M) for (4).
5. The use of a GLM for fitting the dispersion model means that
model-checking techniques, such as residual plots, developed for
GLMs generally can be applied directly to both parts of the joint
model.

6. CONCLUSION

Data from Taguchi experiments should be analyzed in a statistically natural
way so that existing statistical methods can be used; this allows statistically
efficient likelihood inferences, such as the likelihood-ratio test, model-checking
diagnostics to test the adequacy of the model, and maximum likelihood
or restricted maximum likelihood estimation to be used.
Our method supports these desirable aims.

REFERENCES

Box GEP. (1988). Signal-to-noise ratios, performance criteria and transformations. Technometrics 30:1-17.
Cox DR, Reid N. (1987). Parameter orthogonality and approximate conditional inference. J Roy Stat Soc Ser B 49:1-39.
Lee Y, Nelder JA. (1998). Generalized linear models for the analysis of quality-improvement experiments. Can J Stat 26:95-105.
Lunani M, Nair VN, Wasserman GS. (1997). Graphical methods for robust design with dynamic characteristics. J Qual Technol 29:327-338.
McCullagh P, Nelder JA. (1989). Generalized Linear Models. London: Chapman and Hall.
Miller A, Wu CFJ. (1996). Improving a measurement system through designed experiments. Stat Sci 11:122-136.
Nelder JA, Lee Y. (1991). Generalized linear models for the analysis of Taguchi-type experiments. Appl Stochast Models Data Anal 7:107-120.
Nelder JA, Lee Y. (1992). Likelihood, quasi-likelihood and pseudo-likelihood: Some comparisons. J Roy Stat Soc Ser B 54:273-284.
Nelder JA, Pregibon D. (1987). An extended quasi-likelihood function. Biometrika 74:221-232.
Patterson HD, Thompson R. (1971). Recovery of interblock information when block sizes are unequal. Biometrika 58:545-554.
Pierce DA, Schafer DW. (1986). Residuals in generalized linear models. J Am Stat Assoc 81:977-986.
Wedderburn RWM. (1974). Quasi-likelihood functions, generalized linear models and the Gauss-Newton method. Biometrika 61:439-447.
24
Modeling and Analyzing the Generalized Interaction
Chihiro Hirotsu
University of Tokyo, Tokyo, Japan

1. INTRODUCTION

The analysis of interaction is the key in a wide variety of statistical problems,
including the analysis of two-way contingency tables, the comparison of
multinomial distributions, and the usual two-way analysis of variance. It
seems, however, that it has been paid much less attention than it deserves.
In the usual analysis of variance, both of the two-way factors have
generally been assumed to be controllable, and the combination that gives
the highest productivity has been searched for. We should, however, also
consider the possibilities that the factors may be indicative or variational. By
an indicative factor we mean a fixed but uncontrollable factor such as the
region in the adaptability test of rice varieties, where the problem is to choose
the best level of the controllable factor (the variety of rice) for each level of
the indicative factor (region) by considering the interaction between these
two factors. Then a procedure is desired for grouping the levels of the
indicative factor whose responses against the levels of the controllable factor
are similar, so that a common level of the controllable factor can be assigned
to every level of the indicative factor within a group.
By a variational factor we mean a factor that is fixed and indicative
within an experiment but acts as if it were a random noise when the result is
extended to the real world. A typical example is the noise factor in Taguchi's
parameter design, where the problem is to choose the level of the controllable
factor to give not only the highest but also the most stable responses
against the wide range of levels of the noise factor. For all these problems
the usual omnibus F test for interaction is not very useful, and row-wise
and/or columnwise multiple comparison procedures have been proposed
(Hirotsu, 1973, 1983a, 1991a). Those procedures are also useful for modeling
and analyzing contingency tables and multinomial distributions not
restricted narrowly to the analysis of variance (Hirotsu, 1983b, 1993).
Another interesting problem is detecting a two-way changepoint for
the departure from a simple additive or multiplicative model when there are
intrinsic natural orderings among the levels of the two-way factors.
Detecting a change in the sequence of events is an old problem in statistical
process control, and there is a large body of literature dealing with this.
These works, however, are mostly for univariate series of independent random
variables such as normal, gamma, Poisson, or binomial [e.g., see
Hawkins (1977), Worsley (1986), and Siegmund (1986)]. Therefore in this
chapter I discuss an approach to detecting a two-way changepoint.

2. MODELING THE INTERACTION IN THE ANALYSIS OF
VARIANCE FRAMEWORK

Suppose that we are given two-way observations with replications and
assume the model
    y_ijk = μ_ij + ε_ijk,   i = 1, ..., a;  j = 1, ..., b;  k = 1, ..., r

where the ε_ijk are independently distributed as N(0, σ²). The μ_ij may be
modeled simply by μ_ij = μ + α_i + β_j if the hypothesis of no interaction is
accepted. When it is rejected, however, we are faced with a more complicated
model, and it is desirable to have a simplified interaction model with
fewer degrees of freedom. Several models have been proposed along this
line, including those of Tukey (1949), Mandel (1969), and Johnson and
Graybill (1972). The block interaction model obtained as a result of the
row-wise and columnwise multiple comparisons is also a useful alternative
(Hirotsu, 1973, 1983a, 1991a).
For row-wise multiple comparisons we define an interaction element
between two rows, the mth and the nth, say, by

    L(m; n) = (1/√2) P_b (μ_m − μ_n)

where μ_i = (μ_i1, ..., μ_ib)' and P_b is a (b − 1) × b matrix satisfying
P_b P_b' = I_{b−1} and P_b' P_b = I_b − b^{-1} j_b j_b', with I an identity matrix
and j a vector of 1's. Then a multiple comparison procedure for testing
L(m; n) = 0 is given in Hirotsu (1983a) to obtain homogeneous subgroups
of rows so that in each of them all interaction elements are zero. The columns
can be dealt with similarly. Then the resulting model can be expressed as

    y_ijk = μ + α_i + β_j + (αβ)_ij + ε_ijk        (1)

with (αβ)_i. = 0, (αβ)_.j = 0, and (αβ)_ij = (αβ)_i'j' if i, i' ∈ G_u and j, j' ∈ J_v,
where G_u, u = 1, ..., A, and J_v, v = 1, ..., B, denote the homogeneous subgroups
of rows and columns, respectively. We use the usual dot and bar notation
throughout the chapter. Model (1) may be called the block interaction model
with df (A − 1)(B − 1) for interaction. The row-wise and/or columnwise multiple
comparisons seem particularly useful for dealing with indicative or
variational factors; see Hirotsu (1991a, 1991b, 1992) for details.

3. THE GENERALIZED INTERACTION

We encounter two-way table analysis even in the one-way analysis of variance
framework if only we take the nonparametric approach.
The data in Table 1 are the half-life of the drug concentration in blood
for low and high doses of an antibiotic. This is a simple two-sample
problem. In the nonparametric approach, however, we change those data
into rank data as given in Table 2. In Table 2 we are interested in whether
the 1's are more likely to occur to the right than to the left for the high dose
relative to the low dose, since that would suggest that the high dose is more
likely to prolong the half-life.
Table 3 is the result of a dose-response experiment and gives the same
type of data as Table 2, where the ordered categories are thought of as tied
ranks. Again the high categories seem to occur more frequently at the higher
dose. It is the problem of analyzing interaction to confirm these observations
statistically.
The outcome of a Bernoulli trial can also be expressed in a similar way
to Table 2. We give an example in Table 4, where the probability of occurrence
changes from 0.2 to 0.4 at the 11th trial.
The outcomes of an independent binomial sequence are also summarized
similarly to Table 4. We give an example in Table 5, which is taken from
a clinical trial for heart disease.

Table 1  Half-life of Antibiotic Drug

Dose (mg/(kg·day))    Half-life
 25                   1.55  1.63  1.49  1.53  2.14
200                   1.78  1.93  1.80  2.07  1.70

Table 2  Rank Data Obtained from Table 1

                         Rank
Dose     1   2   3   4   5   6   7   8   9   10
 25      1   1   1   1   0   0   0   0   0    1
200      0   0   0   0   1   1   1   1   1    0

Table 3  Usefulness in a Dose-Finding Experiment

Drug      Undesirable  Slightly undesirable  Not useful  Slightly useful  Useful  Excellent  Total
AF 3 mg        7                4                33            21           10        1        76
AF 6 mg        5                6                21            16           23        6        77

Table 4  Outcome of Bernoulli Trial with Probability Change at the 11th Trial

(A two-row 0-1 indicator table over the ordered runs, laid out as in Table 2;
the individual outcomes are not legible in the source.)

Table 5  Independent Binomial Sequence

                     Dose level (mg)
Outcome     100   150   200   225   300
Failure      16    18     9     9     5
Success      20    23    27    26     9
Total        36    41    36    35    14

In any of Tables 2-5 we denote by p_ij the occurrence probability of the
(i, j) cell, i = 1, 2; j = 1, ..., k. Then in Tables 2 and 3 we are interested in
comparing the two multinomials (p_i1, ..., p_ik) (Σ_j p_ij = 1), i = 1, 2, and
Tables 4 and 5 are concerned with comparisons of k binomials (p_1j, p_2j)
(p_1j + p_2j = 1), j = 1, ..., k.
Regardless of the differences between the sampling schemes, however, we
are interested in both cases in testing the null hypothesis

    H_0: p_21/p_11 = p_22/p_12 = ... = p_2k/p_1k        (2)

against the ordered alternative

    H_1: p_21/p_11 ≤ p_22/p_12 ≤ ... ≤ p_2k/p_1k        (3)

taking into account the natural ordering in columns. In (3) we assume
that at least one inequality is strict. It then includes as its important special
case a changepoint model,

    p_21/p_11 = ... = p_2J/p_1J < p_2,J+1/p_1,J+1 = ... = p_2k/p_1k        (4)

where J is an unknown changepoint, the detection of which is an old
problem in statistical process control.
The hypotheses (2), (3), and (4) can be expressed in terms of the
interaction parameters in the log-linear model

    log p_ij = μ + α_i + β_j + (αβ)_ij

The interaction term (αβ)_ij can be interpreted as an odds ratio parameter in
this context. Thus we can generalize the usual analysis of interaction into the
analysis of odds ratio parameters in multinomials, where an ordered alternative
hypothesis is often of particular interest.
Under the null hypothesis, Eq. (2), we base our statistical inference on
the conditional distribution given sufficient statistics. Regardless of the sampling
schemes, this leads to the hypergeometric distribution given all the row
and column marginal totals [see Plackett (1981)]. This is why we need not
distinguish Tables 2 and 3 from Tables 4 and 5.

4. A SAMPLE PROBLEM

Given half-life data (1.21, 1.63, 1.37, 1.50, 1.81) at a dose level of 50 mg/
(kg·day) in addition to Table 1, we obtain Table 6. We also have placebo
data in the dose-response experiment, with which we obtain Table 7.
Next suppose that the products from an industrial process are classified
into three classes (1st, 2nd, 3rd) and their probabilities of occurrence are
changed from (1/3, 1/3, 1/3) to (2/3, 1/6, 1/6) at the 11th trial. An example of
the outcome is shown in Table 8. This is regarded as an independent
sequence of trinomials.
It should be noted that in all three examples the row-wise and/or
columnwise multiple comparisons are essential. Noting the existence of
the natural orderings in both rows and columns, we are particularly interested
in testing the null hypothesis that all the odds ratios p_ij p_i'j'/(p_i'j p_ij')
are equal to 1 against the ordered alternative

    p_ij p_i'j'/(p_i'j p_ij') ≥ 1   for all i < i' and j < j'

Table 7  Usefulness in a Dose-Finding Experiment

Drug      Undesirable  Slightly undesirable  Not useful  Slightly useful  Useful  Excellent  Total
Placebo        3                6                31             9           15        1        71
AF 3 mg        7                4                33            21           10        1        76
AF 6 mg        5                6                21            16           23        6        77

Table 8  Products Classified into Three Classes

(Indicator table over 20 runs: for each run, exactly one of the rows 1st, 2nd,
3rd contains a 1, so each column total is 1; the individual entries are not
legible in the source.)

with at least one inequality strict. Again the alternative hypothesis includes
as its special case a two-way changepoint model in which the inequality is
strict only when i ≤ I, i' ≥ I + 1 and j ≤ J, j' ≥ J + 1, where (I, J) is the
unknown changepoint (5). This is a natural extension of the one-way changepoint
model (4).

5. TESTING THE ORDERED ALTERNATIVE FOR
INTERACTION-TWO-SAMPLE CASE

The analyses of interaction in the analysis of variance model and in the log-linear
model are parallel to some extent, at least for two-way tables [see
Hirotsu (1983a, 1983b)], and here we give only the procedure for the latter
for brevity.

5.1. Comparing Treatments

The most popular procedure for comparing treatments is Wilcoxon's rank
sum test. In that procedure the jth category is given the score of the midrank,

    w_j = y_.1 + ... + y_.,j−1 + (y_.j + 1)/2

and the rank sum of each treatment is defined by

    W_i = Σ_j w_j y_ij,   i = 1, 2

where y_ij is the observed frequency in the (i, j)th cell. The standardized
difference of the rank sums is then defined by

    W(1; 2) = {W_2 − E(W_2)} / {Var(W_2)}^{1/2}

where the mean and variance are those of the permutation distribution given
the marginal totals, namely E(W_2) = y_2.(N + 1)/2 and
Var(W_2) = {y_1. y_2./[N(N − 1)]} Σ_j y_.j {w_j − (N + 1)/2}², with N = y.. the
total sample size.
For evaluating the p-value of W(1; 2) we can use a normal approximation.
The network algorithm of Mehta et al. (1989) can also be applied to give the
exact p-value. As an example, for the data of Table 4 we obtain W(1; 2) =
1.320 with the two-sided p-value 0.187 by the normal approximation.
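
As a check on the arithmetic, the following Python sketch (ours, not the chapter's) computes the standardized Wilcoxon statistic for a 2 × k table from the midrank scores and permutation moments given above; applied to the Table 3 counts it reproduces the value W(1; 2) = 2.488 reported in Table 10.

```python
import numpy as np
from scipy.stats import norm

def wilcoxon_2xk(y1, y2):
    # Standardized Wilcoxon rank sum statistic for a 2 x k table of
    # counts over ordered columns, using midrank scores.
    y1, y2 = np.asarray(y1, float), np.asarray(y2, float)
    col = y1 + y2                                 # column totals y_.j
    N = col.sum()
    w = np.cumsum(col) - col + (col + 1.0) / 2.0  # midranks w_j
    W2 = np.sum(w * y2)                           # rank sum of treatment 2
    mu = y2.sum() * (N + 1.0) / 2.0               # permutation mean
    var = (y1.sum() * y2.sum() / (N * (N - 1.0))
           * np.sum(col * (w - (N + 1.0) / 2.0) ** 2))
    z = (W2 - mu) / np.sqrt(var)
    return z, 2.0 * norm.sf(abs(z))               # two-sided normal p-value

z, p = wilcoxon_2xk([7, 4, 33, 21, 10, 1], [5, 6, 21, 16, 23, 6])
print(z, p)   # z is approximately 2.488
```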
Another possible approach is the cumulative chi-square method
(Hirotsu, 1982). For this method we partition the original table at the jth
column to obtain a 2 × 2 table by pooling columns as in Table 9 and
calculate the goodness-of-fit chi-square statistic χ_j². Then the cumulative
chi-square statistic is defined by

    χ*² = χ_1² + ... + χ_{k−1}²

The null distribution of χ*² is well approximated by the distribution of
a constant times a chi-square variable, dχ²_ν, where the constant d and the
degrees of freedom ν are given as follows:

Table 9  Calculating the Cumulative Chi-square Statistic

                     Columns pooled
Row       (1, ..., j)           (j + 1, ..., k)          Total
1         y_11 + ... + y_1j     y_1,j+1 + ... + y_1k     y_1.
2         y_21 + ... + y_2j     y_2,j+1 + ... + y_2k     y_2.
Total     y_.1 + ... + y_.j     y_.,j+1 + ... + y_.k     y_..

    ν = (k − 1)/d

with d determined so that the first two moments of dχ²_ν match those of χ*².
When the y_.j are all equal, as in Table 4, χ*² is well characterized by the
expansion

    χ*² = c_1 χ²_(1) + c_2 χ²_(2) + ...

with known decreasing constants c_1 > c_2 > ..., where χ²_(1), χ²_(2), ... are the
linear, quadratic, etc., chi-square components, each with one degree of freedom
(df), which are asymptotically mutually independent; see Hirotsu (1986) for
details. More specifically, χ²_(1) is just the square of the standardized
Wilcoxon statistic. Thus the statistic is used to test mainly, but not exclusively,
the linear trend in p_2j/p_1j with respect to j. For the data of Table 4,
χ*² = 30.579 and the constants are obtained as d = 6.102 and ν = 3.114.
The approximated two-sided p-value is then obtained as 0.183.
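
A corresponding sketch for the cumulative statistic is given below (again ours; it uses the ordinary Pearson chi-square for each pooled 2 x 2 table, whereas Hirotsu's components, built from cumulative efficient scores, may be standardized slightly differently).

```python
import numpy as np

def cumulative_chisq(y1, y2):
    # chi2_j for each of the k-1 column cut points of a 2 x k table
    # (pooling columns 1..j against j+1..k), their sum chi*2, and the
    # maximal component max_j chi2_j used for changepoint analysis.
    y1, y2 = np.asarray(y1, float), np.asarray(y2, float)
    chi2 = []
    for j in range(1, len(y1)):
        a, b = y1[:j].sum(), y1[j:].sum()
        c, d = y2[:j].sum(), y2[j:].sum()
        n = a + b + c + d
        chi2.append(n * (a * d - b * c) ** 2 /
                    ((a + b) * (c + d) * (a + c) * (b + d)))
    chi2 = np.array(chi2)
    return chi2, chi2.sum(), chi2.max()
```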

5.2. Changepoint Analysis

The maximal component of the cumulative chi-square statistic,

    χ²_max = max_j χ_j²

is known as the likelihood ratio test statistic for changepoint analysis and
has been widely applied for the analysis of multinomials with ordered categorical
responses, since it is a very easy statistic to interpret. Some exact and
efficient algorithms have been obtained for calculating its p-value, which are
based on the Markov property of the sequence of the chi-square components
χ_1², ..., χ²_{k−1} [see Worsley (1986) and Hirotsu et al. (1992)]. Applying
those algorithms to Table 4, we obtain the two-sided p-value 0.135 for
χ²_max = 5.488, which gives moderate evidence for the change in the probability
of occurrence.
In comparing the three statistics introduced above for testing the
ordered alternatives (3), the Wilcoxon statistic tests exclusively a linear
trend, max χ² is appropriate for testing the changepoint model (4), and
χ*² keeps a high power over a wide range of the ordered alternatives. As
an example of comparing two multinomials with ordered categorical
responses, the three methods are applied to the data of Table 3, and the
results are summarized in Table 10. For reference, the usual goodness-of-fit

Table 10  Three Methods Applied to Table 3

Test statistic              Two-sided p-value
W(1; 2) = 2.488                 0.0128
χ*² = 18.453                    0.0096
χ²_max = 10.303                 0.0033
χ² = 12.762                     0.0257

chi-square value is shown at the bottom of the table; it does not take the
natural ordering into account and as a consequence is not as efficient as the
other three methods for these data.

6. TESTING THE ORDERED ALTERNATIVE FOR
INTERACTION-GENERAL CASE

6.1. Comparing Treatments on the Whole

As an overall test for the association between ordered rows and columns,
rank correlations such as Spearman's ρ or Kendall's τ and the Jonckheere
test are well known. Here we introduce a doubly cumulative chi-square
statistic defined by

    χ**² = Σ_{i=1}^{a−1} Σ_{j=1}^{k−1} χ²_ij

for an a × k table with grand total y.. of observations. The (i, j)th component
χ²_ij is the goodness-of-fit chi-square value for the 2 × 2 table obtained in the
same way as Table 9 by partitioning and pooling the original a × k data at
the ith row and the jth column.
The statistic χ**² is again well approximated by dχ²_ν, with d = d_1 d_2
and ν = (a − 1)(k − 1)/d, where d_1 and d_2 are the constants of the row-wise
and columnwise cumulative chi-squares.

As an example, the doubly cumulative chi-square method is applied to Table
7. For calculating the χ²_ij it is convenient to prepare Table 11.
The constants are obtained as

    d = d_1 × d_2 = 1.5125 × 1.2431 = 1.8802,   ν = (3 − 1)(6 − 1)/1.8802 = 5.319

Then the p-value of χ**² = 0.00773 + ... + 1.41212 = 31.36087 is evaluated
as 0.0065 by the distribution 1.8802 χ²_5.319. This is highly significant, suggesting
the dose dependence of the responses.
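
The doubly cumulative statistic extends the same computation over every row and column cut point of an a x k table; a sketch under the same Pearson chi-square convention as above:

```python
import numpy as np

def doubly_cumulative_chisq(Y):
    # 2 x 2 chi-squares for all (a-1)(k-1) row/column partitions of an
    # a x k table Y, their sum chi**2, and max max chi2_ij.
    Y = np.asarray(Y, float)
    a, k = Y.shape
    total = Y.sum()
    comps = np.empty((a - 1, k - 1))
    for i in range(1, a):
        for j in range(1, k):
            A, B = Y[:i, :j].sum(), Y[:i, j:].sum()
            C, D = Y[i:, :j].sum(), Y[i:, j:].sum()
            comps[i - 1, j - 1] = (total * (A * D - B * C) ** 2 /
                                   ((A + B) * (C + D) * (A + C) * (B + D)))
    return comps, comps.sum(), comps.max()
```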

6.2. Multiple Comparisons of Treatments

Although the doubly cumulative chi-square value generally behaves well in
suggesting any relation between ordered rows and columns, it cannot point
out the optimum level of treatment. For the dose-response experiment an
interesting approach is to detect the dose levels between which the most
significantly different responses, or the steepest slope change, are observed.
A possible approach to this is to partition the rows between i and i + 1, to obtain
an appropriate statistic S(1, ..., i; i + 1, ..., k) comparing the two groups of rows
(1, ..., i) and (i + 1, ..., k), and then to make multiple comparisons of
S(1, ..., i; i + 1, ..., k) for i = 1, ..., k − 1. For the rank-based approach, S
can naturally be taken as the Wilcoxon statistic, which we denote by

Table 11  Pooled Two-Column Tables for the Ordered Categories of Table 7

Dose   (1)(2-6)   (1, 2)(3-6)   (1-3)(4-6)   (1-4)(5, 6)   (1-5)(6)   Total

(Each column corresponds to one cut point of the six ordered categories; the
pooled cell frequencies for each dose group are not legibly reproduced here.)

W(1, ..., i; i + 1, ..., k). The statistic S can also be based on the cumulative
chi-square statistic, which we denote by χ*²(1, ..., i; i + 1, ..., k). They are
calculated as two-sample test statistics between the two subgroups of rows
(1, ..., i) and (i + 1, ..., k). The formula to obtain the asymptotic p-value of
max W(1, ..., i; i + 1, ..., k) is given in Hirotsu et al. (1992), and the one for
max χ*²(1, ..., i; i + 1, ..., k) in Hirotsu and Makita (1992), where the maximum
is taken over i = 1, ..., k − 1. The multiple comparison approaches
applied to the data of Table 7 are summarized in Table 12.

6.3. Two-way Changepoint Analysis

The maximal component of the doubly cumulative chi-square statistic,
denoted by max max χ²_ij, can be useful for testing the two-way changepoint
model, Eq. (5). An efficient algorithm to obtain the exact p-value of max max
χ²_ij is proposed in Hirotsu (1994, 1997). Applying it to Table 8, the one-sided
p-value of

    max_i max_j χ²_ij = 7.500

is obtained as 0.0476, which suggests an increased probability of occurrence
of the first class in later periods.
The max max chi-square value can also be used in the context of the
dose-response experiment. When applied to the data of Table 7, the exact p-value
of max max χ²_ij = 10.033 is evaluated as 0.014; see Hirotsu (1997) for
details.

6.4. Modeling by the Generalized Linear Model

Another useful approach for modeling multinomials with ordered categories
is to use a generalized linear model, such as the proportional odds and
proportional hazards models. The goodness of fit of the block interaction
model applied to the taste-testing data of five foods in five ordered categorical
responses by Bradley et al. (1962) has been compared with the fitting of
the proportional odds model of Snell (1964) and its extension
(McCullagh, 1980); see Hirotsu (1990, 1992) for details.

Table 12  Multiple Comparisons of Three Dose Levels

Test statistic                        Two-sided p-value
max W = W(1, 2; 3) = 2.7629               0.011
max χ*² = χ*²(1, 2; 3) = 24.010           0.005

7. SOME EXTENSIONS

7.1. General Isotonic Inference

A monotonicity hypothesis in a dose-response relationship can be
naturally extended to the convexity hypothesis (Hirotsu, 1986) and the
downturn hypothesis (Simpson and Margolin, 1986), which are stated in
the one-way analysis of variance setting as

    H_c: μ_2 − μ_1 ≤ μ_3 − μ_2 ≤ ... ≤ μ_a − μ_{a−1}

and

    H_d: μ_1 ≤ ... ≤ μ_{τ+1} ≥ μ_{τ+2} ≥ ... ≥ μ_a,   τ = 1, ..., a − 1

respectively. In Hirotsu (1986) a statistic is introduced for testing those
hypotheses, and an application of its maximal component is also discussed
in Hirotsu and Marumo (1995). These ideas can be extended to two-way
tables, and a row-wise multiple comparisons procedure was introduced in
Hirotsu et al. (1996) for classifying subjects based on the 24 h profiles of their
blood pressure, which return to approximately their starting level after 24 h,
where the cumulative chi-square and linear trend statistics are obviously
inappropriate. For a more general discussion of isotonic inference,
one should refer to Hirotsu (1998).

7.2. Higher Way Layout

The ideas of the present chapter can be naturally extended to higher way
layouts. As one of those examples, a three-way contingency table with age at
four levels, existence of metastasis into a lymph node at two levels, and
staging grade at three levels is analyzed in Hirotsu (1992). An example of
highly fractional factorial experiments with ordered categorical responses is
given in Hamada and Wu (1990); see also the discussion following that
article.

8. CONCLUSION

The analysis of interaction seems to have been paid much less attention than
it deserves. First, the character of the two-way factors should be taken into
account in making statistical inferences to answer actual problems most
appropriately. Row-wise and/or columnwise multiple comparisons are particularly
useful when one of the factors is indicative or variational. Second,
analysis of the generalized interaction is required even in the one-way analysis
of variance framework if the responses are ordered categorical, which
includes rank data as an important special case. Then testing the ordered
alternatives for interaction is of particular interest, and the cumulative chi-square
statistic and its maximal component are introduced in addition to the
well-known rank sum statistic. Based on these statistics, a method of multiple
comparisons of ordered treatments is introduced as well as an overall
homogeneity test. Third, an independent sequence of multinomials can be
dealt with similarly to multinomial data with ordered categories. For
example, a sequence of Bernoulli trials can be dealt with as two multinomials
with cell frequencies all zero or unity. In this context we are interested
in changepoint analysis, for which the maximal component of the cumulative
chi-square statistic is useful. When there are natural orderings in both
rows and columns, the maximal component of the doubly cumulative chi-square
statistic is introduced for detecting a two-way changepoint. Finally,
those row-wise and/or columnwise multiple comparisons are useful not only
for comparing treatments but also for defining the block interaction model.

REFERENCES

Bradley RA, Katti SK, Coons IJ. (1962). Optimal scaling for ordered categories. Psychometrika 27:355-374.
Hamada M, Wu CFJ. (1990). A critical look at accumulation analysis and related methods (with discussion). Technometrics 32:119-130.
Hawkins DM. (1977). Testing a sequence of observations for a shift in location. J Am Stat Assoc 72:180-186.
Hirotsu C. (1973). Multiple comparisons in a two-way layout. Rep Stat Appl Res JUSE: 1-10.
Hirotsu C. (1982). Use of cumulative efficient scores for testing ordered alternatives in discrete models. Biometrika 69:567-577.
Hirotsu C. (1983a). An approach to defining the pattern of interaction effects in a two-way layout. Ann Inst Stat Math A 35:77-90.
Hirotsu C. (1983b). Defining the pattern of association in two-way contingency tables. Biometrika 70:579-589.
Hirotsu C. (1986). Cumulative chi-squared statistic as a tool for testing goodness of fit. Biometrika 73:165-173.
Hirotsu C. (1990). Discussion on Hamada and Wu's paper. Technometrics 32:133-136.
Hirotsu C. (1991a). Statistical methods for quality control: Beyond the analysis of variance. Proc 2nd IIASA Workshop, St. Kirik, pp. 213-227.
Hirotsu C. (1991b). An approach to comparing treatments based on repeated measures. Biometrika 78:583-594.
Hirotsu C. (1992). Analysis of Experimental Data, Beyond Analysis of Variance (in Japanese). Tokyo: Kyoritsu-Shuppan.
Hirotsu C. (1993). Beyond analysis of variance techniques: Some applications in clinical trials. Int Stat Rev 61:183-201.
Hirotsu C. (1994). Two-way changepoint analysis: The alternative distribution (in Japanese). Proc Annu Meeting Jpn Soc Math, Stat Math Branch, pp. 153-154.
Hirotsu C. (1997). Two-way change-point model and its application. Aust J Stat (to appear).
Hirotsu C. (1998). Isotonic inference. In: Encyclopedia of Biostatistics. New York: Wiley, to appear.
Hirotsu C, Makita S. (1992). Multiple comparison procedures based on the cumulative chi-squared statistic (in Japanese). Proc Annu Meeting Jpn Soc Appl Stat, pp. 13-17.
Hirotsu C, Kuriki S, Hayter AJ. (1992). Multiple comparison procedures based on the maximal component of the cumulative chi-squared statistic. Biometrika 79:381-392.
Hirotsu C, Marumo K. (1995). Changepoint analysis for subsequent mean differences and its application (in Japanese). Proc 63rd Annu Meeting Jpn Soc Stat, pp. 333-334.
Hirotsu C, Aono K, Adachi E. (1996). Profile analysis of the change pattern of the blood pressure within a day. Proc 64th Annu Meeting Jpn Soc Stat, pp. 50-51.
Johnson DE, Graybill FA. (1972). An analysis of a two-way model with interaction and no replication. J Am Stat Assoc 67:862-868.
Mandel J. (1969). The partitioning of interaction in analysis of variance. J Res Natl Bur Stand B 73:309-328.
McCullagh P. (1980). Regression models for ordinal data. J Roy Stat Soc B 42:109-142.
Mehta CR, Patel NR, Tsiatis AA. (1989). Exact significance testing to establish treatment equivalence with ordered categorical data. Biometrics 40:819-825.
Plackett RL. (1981). The Analysis of Categorical Data, 2nd ed. London: Griffin.
Siegmund D. (1986). Boundary crossing probabilities and statistical applications. Ann Stat 14:361-404.
Simpson DG, Margolin BH. (1986). Recursive nonparametric testing for dose-response relationships subject to downturns at high doses. Biometrika 73:589-601.
Snell EJ. (1964). A scaling procedure for ordered categorical data. Biometrics 20:592-607.
Tukey JW. (1949). One degree of freedom for non-additivity. Biometrics 5:232-242.
Worsley KJ. (1986). Confidence regions and tests for a change point in a sequence of exponential family random variables. Biometrika 73:103-117.
25
Optimization Methods in Multiresponse
Surface Methodology
André I. Khuri
University of Florida, Gainesville, Florida

Elsie S. Valeroso
Montana State University, Bozeman, Montana

1. INTRODUCTION

One of the primary objectives in a response surface investigation is the
determination of the optimum of a response of interest. Such an undertaking
may also be carried out when several responses are under consideration.
For example, in a particular chemical experiment, a resin is required to have
a certain minimum viscosity, a high softening point temperature, and a high
percentage yield (see Chitra, 1990, p. 107). The actual realization of the optimum
depends on the nature of the response(s) and the form of the hypothesized
(empirical) model(s) being fitted to the data at hand.
Optimization in response surface methodology (RSM) has received a
great deal of attention, particularly from experimental researchers. This is
evidenced by the numerous articles on optimization that have appeared in a
variety of professional journals. See, for example, Fichtali et al. (1990),
Floros (1992), Floros and Chinnan (1988a, 1988b), Guillou and Floros
(1993), Mouquet et al. (1992), and the two review articles by Khuri (1996)
and Myers et al. (1989), to name just a few.
For the most part, current optimization techniques in RSM apply
mainly to single-response models. There are, however, many experimental
situations where several response variables are of interest and can subsequently
be measured for each setting of a group of control variables. Such

experiments are referred to as multiresponse experiments. For example, the
quality of a product may depend on several measurable characteristics
(responses). Hill and Hunter (1966) were perhaps the first authors to
make reference to multiresponse applications in chemistry and chemical
engineering. A review of RSM techniques applicable to multiresponse
experiments is given by Khuri (1996). See also Khuri and Cornell (1996,
Chapter 7).
The optimization problem in a multiresponse setting is not as well
defined as in the single-response case. In particular, when two or more
responses are considered simultaneously, their data are multivariately distributed.
In this case, the meaning of "optimum" is unclear, because there is
no unique way to order such data. Obviously, the univariate approach of
optimizing the responses individually and independently of one another is
not recommended. Conditions that are optimal for one response may be far
from optimal, or even physically impractical, for the other responses from the
experimental point of view.
The purpose of this chapter is to provide a comprehensive survey of
the various methods of multiresponse optimization currently in use in RSM.
A comparison of some of these methods is made in Section 3 using two
numerical examples from the semiconductor and food science industries.

2. METHODS OF MULTIRESPONSE OPTIMIZATION

Multiresponse optimization requires finding the settings of the control variables
that yield optimal, or near optimal, values for the responses under
consideration. Here, "optimal" is used with reference to conditions deemed
more acceptable, or more desirable, than others with respect to a certain
criterion. Multiresponse optimization techniques can be graphical or analytical.

2.1 Graphical Techniques

In the graphical approach to optimization, response models are fitted individually
to their respective data. Contour plots are generated and then
superimposed to locate one or more regions in the factor space where all
the predicted responses attain a certain degree of "acceptability." There can
be several candidate points from which the experimenter may choose. Note
that these plots limit consideration of the control variables to only two. If
there are more, then the remaining variables are assigned fixed values. In
this case, a large number of plots will have to be generated.

Contour plotting was initially used in the early development of RSM.
For example, it was described by Hill and Hunter (1966) in reference to an
article by Lind et al. (1960). More recently, an improved graphical technique
was deployed using computer-generated contour surfaces, with three control
variables, instead of two, represented on the same diagram. This technique
was discussed, for example, by Floros and Chinnan (1988b), who credited
Box (1954) and Box and Youle (1955) for being the originators of this idea.
It is worth noting here that renewed interest in the graphical approach
has evolved in recent years due to advances in computer technology. This
approach is simple and easily adaptable to most commonly used computer
software packages. However, it has several disadvantages. For example, its
capability is limited in large systems involving several control variables and
responses. Also, since only two or three control variables can be represented
in the same plot, the number of generated plots can be quite large, as was
mentioned earlier. This makes it difficult to identify one set of conditions as
being optimal. Furthermore, the graphical approach does not account for
the possibility of having correlated responses, which may also be heteroscedastic.
Obviously, graphs based on such responses are not very reliable and
may adversely affect the finding of optimum conditions. In particular, failure
to recognize multicollinearities among the responses can lead to meaningless
results in the fitting of the response models (see Box et al., 1973) and
hence in the determination of optimum conditions.

2.2 Analytical Techniques

Analytical techniques apply mainly to linear multiresponse models. Let r
denote the number of response variables, and let x = (x_1, x_2, ..., x_k)' be a
vector of k related control variables. The model for the ith response is of the
form

    y_i = f_i'(x)β_i + ε_i,   i = 1, 2, ..., r        (1)

where f_i(x) is a vector of order p_i × 1 whose elements consist of powers and
products of powers of the elements of x up to degree d_i (≥ 1), β_i is a vector of
p_i unknown constant coefficients, and ε_i is a random experimental error.
Suppose that there are n sets of observations on y_1, ..., y_r. The corresponding
design settings of x are denoted by x_1, x_2, ..., x_n. From (1) we have

    y_ui = f_i'(x_u)β_i + ε_ui,   i = 1, 2, ..., r;  u = 1, 2, ..., n        (2)

where y_ui is the uth observation on y_i. Model (2) can be written in vector
form as

    y_i = X_i β_i + ε_i,   i = 1, 2, ..., r        (3)

where y_i and ε_i are the vectors of y_ui's and ε_ui's, respectively, and X_i is a
matrix of order n × p_i. It is assumed that X_i is of full column rank and that
E(ε_i) = 0 and Var(ε_i) = σ_ii I_n, where I_n is the identity matrix (i = 1, 2, ..., r).
Furthermore, we assume that Cov(ε_i, ε_j) = σ_ij I_n, i ≠ j. Let Σ = (σ_ij). The
models in Eq. (3) can be combined into a single linear multiresponse model
of the form

    y = Xβ + ε        (4)

where X is a block-diagonal matrix, diag(X_1, X_2, ..., X_r), β = [β_1' : β_2' :
... : β_r']', and ε = [ε_1' : ε_2' : ... : ε_r']'. Hence, Var(ε) = Σ ⊗ I_n, where ⊗ denotes
the direct product of matrices. The best linear unbiased estimator (BLUE) of β
is given by (see Khuri and Cornell, 1996, Chapter 7)

    β̂ = [X'(Σ^{-1} ⊗ I_n)X]^{-1} X'(Σ^{-1} ⊗ I_n)y        (5)

In general, β̂ depends on the variance-covariance matrix Σ, which is
unknown and must therefore be estimated. Zellner (1962) proposed the
estimate Σ̂ = (σ̂_ij), where

    σ̂_ij = (1/n) y_i'[I_n − X_i(X_i'X_i)^{-1}X_i'][I_n − X_j(X_j'X_j)^{-1}X_j']y_j

Srivastava and Giles (1987, p. 16) showed that Σ̂ is singular if r > n. They
demonstrated that r ≤ n is a necessary, but not sufficient, condition for the
nonsingularity of Σ̂. Using Σ̂ in place of Σ in Eq. (5) produces the following
estimate of β:

    β̃ = [X'(Σ̂^{-1} ⊗ I_n)X]^{-1} X'(Σ̂^{-1} ⊗ I_n)y        (6)

This is known as Zellner's seemingly unrelated regression (SUR) estimate of
β. It is also referred to as an estimated generalized least squares (EGLS)
estimate of β. It can be computed using PROC SYSLIN (SAS, 1990a). In
particular, if X_i = X_0 (i = 1, 2, ..., r), then it is easy to show that (5) reduces
to

    β̂ = [I_r ⊗ (X_0'X_0)^{-1}X_0']y        (7)

In this case, the BLUE of β_i coincides with its ordinary least squares (OLS)
estimate, which does not depend on Σ, that is,

    β̂_i = (X_0'X_0)^{-1}X_0'y_i,   i = 1, 2, ..., r

This special case occurs when the response models in (1) are of the same
degree and form and are fitted using the same design. A numerical sketch of
the general SUR computation is given below.
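
As a plain-numpy illustration of Eqs. (3)-(6) (our sketch; the chapter's computations use PROC SYSLIN), the SUR/EGLS estimate can be obtained as follows:

```python
import numpy as np

def sur_egls(ys, Xs):
    # ys: list of r response vectors of length n; Xs: list of the r
    # design matrices X_i. Returns Zellner's EGLS estimate of beta.
    n, r = len(ys[0]), len(ys)
    # OLS residuals per response give Zellner's Sigma_hat = resid'resid / n.
    resid = np.column_stack([y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
                             for y, X in zip(ys, Xs)])
    Sigma = resid.T @ resid / n
    # Stack the r models into the block-diagonal system y = X beta + eps.
    p = [X.shape[1] for X in Xs]
    X = np.zeros((n * r, sum(p)))
    col = 0
    for i, Xi in enumerate(Xs):
        X[i * n:(i + 1) * n, col:col + p[i]] = Xi
        col += p[i]
    y = np.concatenate(ys)
    # Var(eps) = Sigma (x) I_n, so the EGLS weight is its inverse.
    W = np.kron(np.linalg.inv(Sigma), np.eye(n))
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta, Sigma
```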
From Eqs. (1) and (7), the ith predicted response, ŷ_i(x), at a point x in
a region R is given by

    ŷ_i(x) = f'(x)β̂_i,   i = 1, 2, ..., r

where β̂_i is the portion of β̂ in Eq. (7) that corresponds to β_i.
Now by a multiresponse optimization of the responses we mean finding
an x in R at which the ŷ_i(x), i = 1, 2, ..., r, attain certain optimal values. The
term "optimal" is defined according to some criterion. In the next two sections,
two optimality criteria are defined and discussed.

The Desirability Function Approach

The desirability function approach (DFA) was introduced by Harrington
(1965). The response models in (1) are first fitted individually using OLS
estimates of the β_i's, namely,

    β̂_i = (X_i'X_i)^{-1}X_i'y_i,   i = 1, 2, ..., r

The corresponding predicted responses are

    ŷ_i(x) = f_i'(x)β̂_i,   i = 1, 2, ..., r

The ŷ_i(x)'s are then transformed into desirability functions denoted by d_i(x),
where 0 ≤ d_i(x) ≤ 1, i = 1, 2, ..., r. The value of d_i(x) increases as the "desirability"
of the corresponding response increases. In a production process,
the responses y_1, y_2, ..., y_r usually measure particular characteristics of a
product.
The choice of the desirability function is subjective and depends on
how the user assesses the desirability of a given product characteristic.
Harrington (1965) used exponential-type desirability transformations.
Later, Derringer and Suich (1980) introduced more general transformations
that offer the user greater flexibility in setting up desirability values.
Derringer and Suich considered one-sided and two-sided desirability trans-

formations. The former are employed when the ŷ_i(x)'s are to be maximized.
In this case, d_i(x) is defined by

    d_i(x) = 0                                    if ŷ_i(x) < u_i
           = [(ŷ_i(x) − u_i)/(v_i − u_i)]^s       if u_i ≤ ŷ_i(x) ≤ v_i
           = 1                                    otherwise

where u_i is the minimum acceptable value of ŷ_i and v_i is such that higher
values of ŷ_i would not lead to further increase in the desirability of the ith
response (i = 1, 2, ..., r). The value s is specified by the user. Note that if the
minimization of ŷ_i(x) is desired, then d_i(x) is chosen as

    d_i(x) = 1                                    if ŷ_i(x) < u_i
           = [(v_i − ŷ_i(x))/(v_i − u_i)]^s       if u_i ≤ ŷ_i(x) ≤ v_i
           = 0                                    otherwise

where u_i and v_i are specified values (i = 1, 2, ..., r). Two-sided desirability
transformations are used when y_i has both minimum and maximum constraints.
The corresponding d_i(x) is given by

    d_i(x) = [(ŷ_i(x) − u_i)/(c_i − u_i)]^s       if u_i ≤ ŷ_i(x) ≤ c_i
           = [(ŷ_i(x) − v_i)/(c_i − v_i)]^t       if c_i ≤ ŷ_i(x) ≤ v_i
           = 0                                    otherwise

where here u_i and v_i are, respectively, minimum acceptable and maximum
acceptable values of ŷ_i, c_i is the value of ŷ_i considered "most desirable"
(the target value), and s and t are specified by the user.
Once the desirability functions for all the responses have been chosen,
the d_i(x)'s are then combined into a single function, denoted by d(x), which
measures the overall desirability of the responses. Derringer and Suich
(1980) adopted the geometric mean of the d_i(x)'s as such a function, that is,

d(x) = [d_1(x)\, d_2(x) \cdots d_r(x)]^{1/r}

We note that 0 \le d(x) \le 1 and that d(x) = 0 if any of the d_i(x)'s is equal to
zero. Thus if a product does not meet a specified characteristic, it is deemed
unacceptable. Large values of d correspond to a highly desirable product.
Hence, optimum conditions are found by maximizing d(x) over the experi-
mental region. The multiresponse optimization problem has therefore been
reduced to the maximization of the single function d(x).
More recently, Derringer (1994) referred to the desirability function
approach as the desirability optimization methodology. He also provided
information concerning software availability for its computer implementa-
tion. Note that the actual maximization of d(x) can be carried out only by
using search methods, as opposed to gradient-based methods, because d(x)
is not differentiable at certain points. Del Castillo et al. (1996) proposed
modified desirability functions that are everywhere differentiable so that
more efficient gradient-based optimization procedures can be used.
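As an illustration of the mechanics just described, the following sketch codes the two-sided Derringer-Suich transformation, combines two desirabilities by the geometric mean, and maximizes d(x) with a derivative-free Nelder-Mead search (appropriate since d(x) need not be differentiable). The fitted response functions and the bounds u_i, c_i, v_i are hypothetical placeholders, not values from this chapter.

import numpy as np
from scipy.optimize import minimize

def two_sided_d(yhat, u, c, v, s=1.0, t=1.0):
    # Derringer-Suich two-sided desirability with target value c
    if u <= yhat <= c:
        return ((yhat - u) / (c - u)) ** s
    if c < yhat <= v:
        return ((v - yhat) / (v - c)) ** t
    return 0.0

# Hypothetical fitted predicted responses (placeholders)
y1_hat = lambda x: 1.0 + 0.4 * x[0] - 0.2 * x[0] ** 2 + 0.1 * x[1]
y2_hat = lambda x: 0.5 - 0.3 * x[1] + 0.2 * x[1] ** 2

def overall_d(x):
    d1 = two_sided_d(y1_hat(x), u=0.5, c=1.2, v=1.5)
    d2 = two_sided_d(y2_hat(x), u=0.1, c=0.4, v=0.9)
    return (d1 * d2) ** 0.5          # geometric mean for r = 2 responses

# Direct search, since d(x) is not differentiable at the breakpoints
res = minimize(lambda x: -overall_d(x), x0=np.zeros(2), method="Nelder-Mead")
print(res.x, -res.fun)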

The Generalized Distance Approach

The generalized distance approach (GDA) was introduced by Khuri and
Conlon (1981). The responses are assumed to be adequately represented by
polynomial models of the same degree and form within the experimental
region R. In this case, the X_i's in models (3) are equal to a common matrix
X_0. The estimates of \beta_i, i = 1, 2, ..., r, and the corresponding expressions
for the predicted responses are given by Eqs. (9) and (11), respectively.
If the assumptions made earlier in Section 2.2 concerning the distribu-
tions of the responses are valid, then

Cov[\hat{y}_i(x), \hat{y}_j(x)] = \sigma_{ij}\, f'(x)(X_0'X_0)^{-1}f(x), \qquad i, j = 1, 2, ..., r

where f(x) is the common form of the f_i(x), i = 1, 2, ..., r, and \sigma_{ij} is the (i, j)th
element of \Sigma, the variance-covariance matrix of the responses. Hence, if \hat{y}(x) = [\hat{y}_1
(x) : \hat{y}_2(x) : ... : \hat{y}_r(x)]' is the vector of predicted responses, then its variance-
covariance matrix is given by

Var[\hat{y}(x)] = f'(x)(X_0'X_0)^{-1}f(x)\, \Sigma     (14)

Since X_i = X_0 for i = 1, 2, ..., r, an unbiased estimator of \Sigma is given by

\hat{\Sigma}_0 = \frac{Y'[I_n - X_0(X_0'X_0)^{-1}X_0']\, Y}{n - p_0}     (15)

where Y = [y_1 : y_2 : ... : y_r] is the n x r matrix of multiresponse data and p_0
is the number of columns of X_0 [see Khuri and Conlon (1981), formula 2.3].
If r \le n - p_0, then \hat{\Sigma}_0 will be nonsingular provided that Y is of rank r. Using
\hat{\Sigma}_0 in place of \Sigma in (14), an unbiased estimator of Var[\hat{y}(x)] is obtained,
namely,

\widehat{Var}[\hat{y}(x)] = f'(x)(X_0'X_0)^{-1}f(x)\, \hat{\Sigma}_0

The main idea behind the generalized distance approach is based on
measuring the distance of \hat{y}(x) from the so-called ideal optimum, which is
defined as follows: Let \hat{\phi}_i denote the optimum value of \hat{y}_i(x) obtained indi-
vidually over a region R, i = 1, 2, ..., r. Let \hat{\phi} = (\hat{\phi}_1, \hat{\phi}_2, ..., \hat{\phi}_r)'. If these
individual optima are attained at the same point in R, then an ideal opti-
mum is said to be achieved. In general, the occurrence of such an optimum is
very rare, since the \hat{y}_i(x)'s attain their individual optima at different locations
in R. In this case, we search for the location of a near-ideal optimum, a point
x_0 in R at which \hat{y}(x) is "closest" to \hat{\phi}. Here, "closeness" is determined by a
metric \rho[\hat{y}(x), \hat{\phi}] defined as follows:

\rho[\hat{y}(x), \hat{\phi}] = \left\{ [\hat{y}(x) - \hat{\phi}]' \{\widehat{Var}[\hat{y}(x)]\}^{-1} [\hat{y}(x) - \hat{\phi}] \right\}^{1/2}     (16)

Thus the multiresponse optimization problem in this approach has been
reduced to the minimization of \rho[\hat{y}(x), \hat{\phi}] with respect to x over R.
Optimum conditions found in this manner result in a so-called compromise
ideal optimum.
Several other metrics were proposed in Khuri and Conlon (1981), for
example,

\rho_1[\hat{y}(x), \hat{\phi}] = \left[ \sum_{i=1}^{r} \frac{[\hat{y}_i(x) - \hat{\phi}_i]^2}{\hat{\sigma}_{ii}\, f'(x)(X_0'X_0)^{-1}f(x)} \right]^{1/2}, \qquad \rho_2[\hat{y}(x), \hat{\phi}] = \left[ \sum_{i=1}^{r} \frac{[\hat{y}_i(x) - \hat{\phi}_i]^2}{\hat{\phi}_i^2} \right]^{1/2}

where \hat{\sigma}_{ii} is the ith diagonal element of \hat{\Sigma}_0 (i = 1, 2, ..., r). The metric \rho_1 is
appropriate whenever the responses are statistically independent. The metric
\rho_2 measures the total relative deviation of \hat{y}(x) from \hat{\phi}. It can be used when
\hat{\Sigma}_0 is ill-conditioned.
Remark 1. It should be noted that in the generalized distance
approach, \hat{\phi}_i is treated as a fixed quantity, when in fact it is random
(i = 1, 2, ..., r). To account for the randomness in the elements of \hat{\phi},
Khuri and Conlon (1981) developed a rectangular confidence region, C_\phi,
on \phi, the vector of true individual optima over the region R. For a fixed x in
R, the maximum of \rho[\hat{y}(x), \psi] is obtained with respect to \psi in C_\phi. This
maximum provides a conservative estimate of \rho[\hat{y}(x), \phi], the metric that
should be minimized with respect to x instead of \rho[\hat{y}(x), \hat{\phi}]. The maximum
so obtained, which is a function of x, is minimized with respect to x over R.
A more detailed discussion concerning this max-min approach is also given
in Khuri and Cornell (1996, Chapter 7).
The computer implementation of Khuri and Conlon's (1981) general-
ized distance approach, including the use of the confidence region C_\phi, is
available through the MR (for multiple responses) software written by
Conlon (1988). A copy of the MR code along with the accompanying tech-
nical report and examples can be downloaded from the Internet at ftp://
ftp.stat.ufl.edu/pub/mr.tar.Z. Note that the mr.tar.Z file is compressed. It
should be uncompressed and then compiled. Furthermore, MR fits a sec-
ond-degree polynomial model to each response.

An Extension of Khuri and Conlon's (1981) GDA. The generalized
distance approach (GDA) described earlier requires that all fitted response
models be of the same form and degree and depend on all the control
variables under consideration. Valeroso (1996) extended the GDA by mak-
ing it applicable to models that are not necessarily of the same degree or
form. The following is a summary of Valeroso's extension.
The models considered are of the form given in (1). The SUR (or
EGLS) estimates of \beta_i are obtained from formula (7). The expressions for
the predicted responses are given by formula (10). Let \hat{y}_e(x) = [\hat{y}_{e1}(x),
\hat{y}_{e2}(x), ..., \hat{y}_{er}(x)]'. Then \hat{y}_e(x) = \Lambda'(x)\hat{\beta}, where \Lambda'(x) = diag[f_1'(x), f_2'(x), ..., f_r'(x)].
An estimate of the variance-covariance matrix of \hat{y}_e(x) is approximately of the form

\widehat{Var}[\hat{y}_e(x)] = \Lambda'(x)[X'(\hat{\Sigma}^{-1} \otimes I_n)X]^{-1}\Lambda(x)     (17)

where the elements of \hat{\Sigma} are given in (6). The metric \rho defined in (16) is now
replaced by

\rho_e[\hat{y}_e(x), \hat{\phi}_e] = \left\{ [\hat{y}_e(x) - \hat{\phi}_e]' \{\Lambda'(x)[X'(\hat{\Sigma}^{-1} \otimes I_n)X]^{-1}\Lambda(x)\}^{-1} [\hat{y}_e(x) - \hat{\phi}_e] \right\}^{1/2}     (18)

where \hat{\phi}_e = (\hat{\phi}_{e1}, \hat{\phi}_{e2}, ..., \hat{\phi}_{er})' and \hat{\phi}_{ei} is the individual optimum of \hat{y}_{ei}(x)
over the region R. Minimizing the metric \rho_e over R results in a simultaneous
optimization of the r predicted responses.
Valeroso's (1996) extension also includes an accounting for the ran-
domness of \hat{\phi}_e by applying a max-min approach similar to the one described
in Remark 1.
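A minimal sketch of how the extended metric \rho_e of (18) could be evaluated and minimized numerically, assuming the SUR quantities are already available; the model vectors, the stacked estimate beta, the matrix C standing in for [X'(\hat{\Sigma}^{-1} \otimes I_n)X]^{-1}, and the individual optima phi_e below are all hypothetical placeholders.

import numpy as np
from scipy.optimize import minimize

def f1(x):  # model vector of response 1: intercept, x1, x2
    return np.array([1.0, x[0], x[1]])

def f2(x):  # model vector of response 2: intercept, x1, x2, x1^2
    return np.array([1.0, x[0], x[1], x[0] ** 2])

beta = np.array([1.0, 0.5, -0.2,          # placeholder beta_1
                 2.0, -0.3, 0.1, 0.4])    # placeholder beta_2
C = 0.01 * np.eye(7)         # placeholder for [X'(Sigma^-1 (x) I_n)X]^-1
phi_e = np.array([1.8, 1.6])              # placeholder individual optima

def rho_e(x):
    # Lambda'(x) = diag[f1'(x), f2'(x)], so y_e(x) = Lambda'(x) beta
    L = np.zeros((2, 7))
    L[0, :3], L[1, 3:] = f1(x), f2(x)
    yhat = L @ beta
    V = L @ C @ L.T                       # Var-hat[y_e(x)], as in (17)
    dev = yhat - phi_e
    return float(np.sqrt(dev @ np.linalg.solve(V, dev)))

# Minimize rho_e over a spherical region ||x||^2 <= 2
cons = {"type": "ineq", "fun": lambda x: 2.0 - x @ x}
res = minimize(rho_e, x0=np.zeros(2), method="SLSQP", constraints=cons)
print(res.x, res.fun)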

2.3. Other Optimization Procedures

There are other optimization procedures that involve more than one
response. Some of these procedures, however, are not truly multivariate in
nature, since they do not seek simultaneous optima in the same fashion as in
Section 2.2.

The Dual Response Approach


The dual response approach (DRA) was introduced by Myers and Carter
(1973). It concerns the optimization of a single response, identified as the
primary response, subject to equality constraints on another response
labeled the secondary response. Both responses are fitted to second-degree
models. Biles (1975) extended this idea by considering more than one sec-
ondary response.
Del Castillo and Montgomery (1993) presented an alternative way to
solve the DRA problem by using a nonlinear optimization procedure called
the generalized reduced gradient (GRG) algorithm. They demonstrated the
advantages of this algorithm and made a reference to software packages for
its computer implementation.
The DRA can be used in experimental situations where both the mean
and variance of a process are of interest. One is considered the primary
response and the other the secondary response [see Vining and Myers
(1990) and Myers et al. (1992)]. Previously, the DRA was used by Khuri
and Myers (1979) to provide an improvement to the method of ridge ana-
lysis, which is an optimization procedure for a single response represented
by a second-degree model within a spherical region [see Draper (1963)]. The
modification imposed certain quadratic constraints for the purpose of limit-
ing the size of the prediction variance. More recently, several authors ela-
borated further on the use of the DRA in conjunction with the modeling of
both the mean and variance. For example, Lin and Tu (1995) suggested
using the mean squared error (MSE) as a new objective function to be
minimized. This MSE is the sum of the estimated process variance and the
square of the difference between the estimated process mean and some target
value. Copeland and Nelson (1996) proposed using direct function minimi-
zation based on Nelder and Mead's (1965) simplex method. Lin and Tu
(1995, p. 39) made an interesting comment by stating that the use of the
DRA for solving the mean-variance problem can work well only when the
mean and variance are independent.
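The constrained formulation is easy to prototype with a general-purpose nonlinear solver standing in for the GRG algorithm. In the sketch below, the two second-degree models and the target value are hypothetical placeholders; the primary response is maximized subject to the secondary response being held at its target.

import numpy as np
from scipy.optimize import minimize

# Hypothetical second-degree models (placeholders)
primary = lambda x: 1.0 + 0.6 * x[0] + 0.3 * x[1] - 0.5 * (x @ x)
secondary = lambda x: 0.8 - 0.2 * x[0] + 0.1 * (x @ x)

target = 0.9   # required value of the secondary response
cons = {"type": "eq", "fun": lambda x: secondary(x) - target}
res = minimize(lambda x: -primary(x), x0=np.array([0.1, 0.1]),
               method="SLSQP", constraints=cons)
print(res.x, -res.fun)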

Optimization via Constrained Confidence Regions


Optimization via constrained confidence regions (Del Castillo, 1996) is
somewhat related to the DRA. The responses are fitted individually using
either first-degree or second-degree models. Confidence regions on the loca-
tions of the constrained stationary points for the individual responses are
obtained if their corresponding models are of the second degree. If some of
the models are of the first degree, then confidence cones on the directions of
steepest ascent (or descent) are used. These regions (or cones) are then
treated as constraints in a nonlinear programming problem where one
response is defined as a primary response. The next step requires finding a
solution that lies inside all the confidence regions and/or cones.

A Fuzzy Modeling Approach


The fuzzy modeling approach of Kim and Lin (1998) is based on the so-
called fuzzy multiobjective optimization methodology. It is assumed that the
degree of satisfaction of the experimenter with respect to the ith response is
maximized when \hat{y}_i^*(x) [see formula (11)] is equal to its target value T_i and
decreases as \hat{y}_i^*(x) moves away from T_i, i = 1, 2, ..., r. If y_i^min and y_i^max denote
lower and upper bounds on the ith response, respectively, then the degree of
satisfaction with respect to the ith response is defined by a function called
the membership function, which we denote by m_i[\hat{y}_i^*(x)], i = 1, 2, ..., r. This
function equals 1 when \hat{y}_i^*(x) = T_i, decreases toward 0 as \hat{y}_i^*(x) moves from
T_i toward y_i^min or y_i^max, and equals 0 outside the interval [y_i^min, y_i^max].
The values of y_i^min and y_i^max can be chosen as the individual optima of \hat{y}_i^*(x)
over a region R. We note that the definition of this function is similar to that
of the desirability function. Simultaneous optimization of the responses is
achieved by maximizing the minimum degree of satisfaction, that is, by
maximizing min_i {m_i[\hat{y}_i^*(x)], i = 1, 2, ..., r} over x. Additional constraints
may be added to this formulation as appropriate.
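A small sketch of the max-min step, with hypothetical predicted responses and piecewise-linear memberships of the kind described above (the exact membership shape used by Kim and Lin may differ); a coarse grid search is used since the min operator makes the objective nonsmooth.

import numpy as np

# Hypothetical predicted responses of one control variable (placeholders)
yhat = [lambda x: 1.0 + 0.5 * x - 0.4 * x ** 2,
        lambda x: 0.6 - 0.3 * x + 0.2 * x ** 2]
T, ymin, ymax = [1.1, 0.4], [0.5, 0.2], [1.4, 0.8]   # targets and bounds

def member(i, x):
    # Piecewise-linear membership: 1 at the target, 0 at or beyond the bounds
    y = yhat[i](x)
    if ymin[i] <= y <= T[i]:
        return (y - ymin[i]) / (T[i] - ymin[i])
    if T[i] < y <= ymax[i]:
        return (ymax[i] - y) / (ymax[i] - T[i])
    return 0.0

# Maximize the minimum degree of satisfaction over a grid on [-1, 1]
xs = np.linspace(-1.0, 1.0, 401)
vals = [min(member(0, x), member(1, x)) for x in xs]
best = int(np.argmax(vals))
print(xs[best], vals[best])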

The Procedure of Chitra (1990)


The procedure of Chitra (1990) is similar to the generalized distance
approach. Chitra defined different types of objective functions to be mini-
mized. These functions measure deviations of the responses from their target
values. The procedure allows the inclusion of several constraints on the
responses and control variables.
Remark 2. The generalized distance approach is the only multire-
sponse optimization procedure that takes into account the variance-covar-
iance structure of the responses. We recall that this structure affects the fit of
the models. It should therefore be taken into consideration in any simulta-
neous optimization. Also, in order to avoid any difficulties caused by multi-
collinearities among the responses, the multiresponse data should first be
checked for linear dependences among the columns of Y [see formula (15)].
Khuri and Cornell (1996, pp. 255-265) provide more details about this and
show how to drop responses considered to be linearly dependent on other
responses.
The extension of the generalized distance approach in Section 2.2
makes it now possible to apply this procedure to models that are not of
the same form or dependent on the same control variables. On the other
hand, the desirability function approach, although simple to apply, is sub-
jective, as it depends on how the user interprets desirabilities of the various
responses. The user should be very familiar with the product whose char-
acteristics are measured by the responses under consideration. Derringer
(1994, p. 57) provided some insight into the choice of desirability values.
He stated that "the process of assigning desirability curves and their weights
is best done by consensus in the early stages of product conception. The
consensus meeting should include an expert facilitator and representatives
from all functional areas involved with the product." Care should therefore
be exercised in setting up desirability functions. Improperly assessed desir-
abilities can lead to inaccurate optimization results.

It should be recalled that in Derringer and Suich (1980), no account
was given of the variance-covariance matrix of the responses, not even at the
modeling stage. Del Castillo et al. (1996, p. 338), however, recommended
using Zellner's (1962) SUR estimates to fit the models in (1) [see formula
(7)]. Furthermore, the desirability function approach has no built-in proce-
dure for detecting those responses, if any, that are either linearly dependent
or highly multicollinear. Ignoring such dependences can affect the overall
desirability and hence the determination of optimum conditions.

3. EXAMPLES

In this section, we illustrate the application of the extended generalized
distance approach (GDA) and the desirability function approach (DFA)
of Section 2.2 and the dual response approach (DRA) using the GRG
algorithm of Section 2.3. We present two examples, one from the semicon-
ductor industry and the other from the food industry.

3.1. A Semiconductor Example

An experiment was conducted to determine the performance of a tool used
to polish computer wafers. Three control variables were studied: x_1 = down
force, x_2 = table speed, and x_3 = slurry concentration. The measured
responses were removal rate of metal (RR), oxide removal rate
(OXRATE), and within-wafer standard deviation (WIWSD). The objective
of the experiment was to maximize y_1 = selectivity and minimize y_2 = non-
uniformity, where

y_1 = \frac{RR}{OXRATE} \qquad and \qquad y_2 = \frac{WIWSD}{RR}

A Box-Behnken design with eight replications at the center and two replica-
tions at each noncentral point was used. Each treatment run required two
wafers. The first wafer was used to measure RR and WIWSD. The second
wafer was used to measure OXRATE. The design points and corresponding
values of y_1 and y_2 are given in Table 1.
Before determining the optima associated with y_1 and y_2, we need to
select models that provide good fits to these responses. Since the models
are fitted using Zellner's (1962) seemingly unrelated regression (SUR)
parameter estimation [see formula (7)], measures of the goodness of fit
for SUR models should be utilized. These include Sparks' (1987) PRESS
statistic and McElroy's (1977) R^2 statistic. The latter is interpreted the
same way as the univariate R^2 in that it represents the proportion of the
total variation explained by the SUR multiresponse model. These mea-
sures provide the user with multivariate variable selection techniques,
which, in general, require screening a large number of subset models.
To reduce the number of models considered, Sparks (1987) recommends
using the univariate R^2, adjusted R^2, and Mallows' C_p statistics to identify

Table 1  Experimental Design and Response Values (Semiconductor Example)

Coded control variables        Responses
  x_1   x_2   x_3         y_1        y_2
   0     0     0        0.49074    0.18751
   0     0     0        0.39208    0.19720
   1     0     1        0.85866    0.12090
   1     0     1        0.74129    0.16544
  -1     0     1        0.33484    0.65322
  -1     0     1        0.29645    0.75198
   1    -1     0        0.57887    0.15566
   1    -1     0        0.62203    0.10841
   1     1     0        0.70656    0.14648
   1     1     0        0.88189    0.09600
   0     0     0        0.43939    0.24803
   0     0     0        0.46587    0.23759
  -1     1     0        0.30218    0.55831
  -1     1     0        0.36169    0.71183
   0     1     1        0.60465    0.23622
   0     1     1        0.53486    0.26489
   0    -1    -1        0.48908    0.24406
   0    -1    -1        0.43681    0.38756
  -1     0    -1        0.25005    0.63051
  -1     0    -1        0.19546    0.72421
   0    -1     1        0.52298    0.25327
   0    -1     1        0.42990    0.25019
   0     0     0        0.45782    0.32923
   0     0     0        0.46910    0.29522
   1     0    -1        0.63714    0.12583
   1     0    -1        0.79454    0.19912
   0     1    -1        0.88856    0.27198
   0     1    -1        0.84218    0.29578
  -1    -1     0        0.13258    0.62442
  -1    -1     0        0.13665    0.53618
   0     0     0        0.49810    0.29392
   0     0     0        0.46321    0.37023

"good" subset models. For each combination of such models, Sparks'
PRESS and McElroy's R^2 statistics are computed. The "best" multire-
sponse model is the one with the smallest PRESS statistic value and a
value of McElroy's R^2 close to 1. On this basis, the following models were
selected for y_1 and y_2:

\hat{y}_{e1}(x) = 0.4410 + 0.2382x_1 + 0.1109x_2 - 0.0131x_3 + 0.0429x_2^2 + 0.0912x_3^2 - 0.0773x_2x_3     (19)

\hat{y}_{e2}(x) = 0.2727 - 0.2546x_1 + 0.0014x_2 - 0.0114x_3 + 0.1216x_1^2     (20)

The SUR parameter estimates, their estimated standard errors, the values of
the univariate R^2, adjusted R^2, and C_p statistics, and values of McElroy's R^2
and Sparks' PRESS statistics are given in Table 2. Note that the SUR
parameter estimates were obtained using PROC SYSLIN in SAS
(1990a), and the univariate R^2, adjusted R^2, and C_p statistics were computed
using PROC REG in SAS (1989). From Table 2 it can be seen that models
(19) and (20) provide good fits to the two responses.

On the basis of models (19) and (20), the individual optima of \hat{y}_{e1}(x)
and \hat{y}_{e2}(x) over the region R = {(x_1, x_2, x_3) | \sum_{i=1}^{3} x_i^2 \le 2} are given in Table
3. These values were computed using a Fortran program written by Conlon
(1992), which is based on Price's (1977) optimization procedure. The simul-
taneous optima of \hat{y}_{e1}(x) and \hat{y}_{e2}(x) over R were determined by using the
extension of the GDA (see Section 2.2). The minimization of \rho_e in (18) was
Table 2  SUR Parameter Estimates and Values of C_p, R^2, and Adjusted R^2
(Semiconductor Example)

                          Responses^a
Parameter        \hat{y}_{e1}           \hat{y}_{e2}
Intercept        0.4410 (0.0190)      0.2727 (0.0135)
x_1              0.2382 (0.0155)     -0.2546 (0.0135)
x_2              0.1109 (0.0155)      0.0014 (0.0135)
x_3             -0.0131 (0.0155)     -0.0114 (0.0135)
x_2x_3          -0.0773 (0.0219)
x_1^2                                 0.1216 (0.0191)
x_2^2            0.0429 (0.0219)
x_3^2            0.0912 (0.0219)
C_p              6.17                 4.39
R^2              0.91                 0.93
Adj. R^2         0.89                 0.91

^a The number in parentheses is the standard error.
Note: McElroy's R^2 = 0.9212; Sparks' PRESS statistic = 103.9.

carried out using a program written in PROC IML of SAS (1990b). The
results are shown in Table 3.
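Since models (19) and (20) are given explicitly, the individual optima can be checked with any constrained optimizer. The sketch below uses a general-purpose solver over the ball \sum x_i^2 \le 2; because Price's controlled random search is stochastic, the values and locations it returned for Table 3 may differ slightly from a deterministic solver's output.

import numpy as np
from scipy.optimize import minimize

# Fitted SUR models (19) and (20)
y1 = lambda x: (0.4410 + 0.2382 * x[0] + 0.1109 * x[1] - 0.0131 * x[2]
                + 0.0429 * x[1] ** 2 + 0.0912 * x[2] ** 2
                - 0.0773 * x[1] * x[2])
y2 = lambda x: (0.2727 - 0.2546 * x[0] + 0.0014 * x[1] - 0.0114 * x[2]
                + 0.1216 * x[0] ** 2)

cons = {"type": "ineq", "fun": lambda x: 2.0 - x @ x}   # region R
max_y1 = minimize(lambda x: -y1(x), x0=np.array([0.5, 0.5, -0.5]),
                  method="SLSQP", constraints=cons)
min_y2 = minimize(y2, x0=np.array([0.5, 0.0, 0.5]),
                  method="SLSQP", constraints=cons)
print(-max_y1.fun, max_y1.x)   # compare with Max = 0.8776 in Table 3
print(min_y2.fun, min_y2.x)    # compare with Min = 0.1302 in Table 3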
To apply the DFA, we use formulas (12) and (13) for d_1(x) and d_2(x),
respectively, where d_1(x) is a one-sided transformation of the form (12) in
which \hat{y}_{e1} values at or above 0.95 receive desirability 1, and d_2(x) is a
one-sided transformation of the form (13) with u_2 = 0.20 and v_2 = 1.0, so
that \hat{y}_{e2} values at or above 1.0 receive desirability 0.

Note that the values 0.95 and 0.20 in d_1 and d_2, respectively, are of the same
order of magnitude as the individual maxima and minima of \hat{y}_{e1} and \hat{y}_{e2},
respectively. Note also that, on the basis of a recommendation by Del
Castillo et al. (1996, p. 338), we have used the SUR predicted responses, \hat{y}_{e1}
(x) and \hat{y}_{e2}(x), instead of \hat{y}_1^*(x) and \hat{y}_2^*(x). The latter two are the ones nor-
mally used in the DFA and are obtained by fitting the models individually
[see formula (11)]. The overall desirability function d(x) = [d_1(x)d_2(x)]^{1/2}
was maximized over R using the Fortran program written by Conlon
(1992). Alternatively, Design-Expert (Stat-Ease, 1993) software can also
be used to maximize d(x). The DFA results are given in Table 4.
be used to maximize d ( x ) . The DFA results are given in Table 4.
The results for the DRA are given in Table 5. In applying this procedure
to the present example, each of the two responses was considered as the
Table 3  Individual and GDA Simultaneous Optima for the Semiconductor
Example

Response           Optimum           Location
Individual optima
\hat{y}_{e1}(x)    Max = 0.8776      (0.7888, 0.9031, -0.7479)
\hat{y}_{e2}(x)    Min = 0.1302      (0.9443, -0.0468, 0.9689)
Simultaneous optima (GDA)
\hat{y}_{e1}(x)    Max = 0.8641      (0.9976, 0.9127, -0.3961)
\hat{y}_{e2}(x)    Min = 0.1463      (0.9976, 0.9127, -0.3961)

Note: Minimum value of \rho_e in Eq. (18) is 0.8610.

Table 4  DFA Simultaneous Optima for the Semiconductor Example

Response           Optimum           Location
\hat{y}_{e1}(x)    Max = 0.8772      (0.8351, 0.7951, -0.8172)
\hat{y}_{e2}(x)    Min = 0.1556      (0.8351, 0.7951, -0.8172)

Note: The maximum of d(x) over R is 0.9609.

primary response. Its optimum value was then obtained over R using the
constraint that the other response is equal to its individual optimum from
Table 3. Values of the DRA optima in Table 5 were computed on the basis
of the GRG algorithm using the "solver" tool, which is available in the
Microsoft Excel (Microsoft, 1993) spreadsheet program. For more details
on how to use this tool, see Dodge et al. (1995).
The results of applying GDA, DFA, and DRA are summarized in
Table 6. We note that the results are similar to one another. The maxima
of \hat{y}_{e1}(x) under GDA and DFA are close, and both are higher than the
maximum under DRA. Their overall desirability values are also higher.

3.2. A Food Industry Example


Tseo et al. (1983) investigated the effects of x_1 = washing temperature, x_2 =
washing time, and x_3 = washing ratio on springiness (y_1), thiobarbituric
acid number (y_2), and percent cooking loss (y_3) for minced mullet flesh. It
is of interest to simultaneously maximize y_1 and minimize y_2 and y_3. The
design settings in the original and coded variables and the corresponding
multiresponse data are given in Table 7. Note that the design used is a
central composite design with three center point replications and an axial
parameter equal to 1.682. The same data set was reproduced in Khuri and
Cornell (1996, pp. 295-296).
The multivariate variable selection techniques [Sparks' (1987) PRESS
statistic and McElroy's (1977) R^2 statistic] mentioned in the previous section
were used, and the following SUR models were obtained:

Table 5  DRA Optima for the Semiconductor Example

Response           Optimum    Location
\hat{y}_{e1}(x)    0.7639     (1.0, 0.3299, 0.9440)
\hat{y}_{e2}(x)    0.1488     (1.0, 0.7751, -0.6319)

Table 6  Comparison of GDA, DFA, and DRA Results for the Semiconductor
Example

                          GDA                   DFA                    DRA
Optimal response values   (0.86, 0.15)          (0.88, 0.16)           (0.76, 0.15)
Optimal settings          (1.0, 0.91, -0.40)    (0.84, 0.80, -0.82)    See Table 5
Minimum metric (\rho_e)   0.8610                1.1768                 Not applicable
Overall desirability      0.9537                0.9609                 0.8967

\hat{y}_{e1}(x) = 1.8807 - 0.0974x_1 - 0.0009x_2 + 0.0091x_3 - 0.1030x_1^2 + 0.0013x_2^2 + 0.0028x_3^2

\hat{y}_{e2}(x) = 22.5313 + 5.6609x_1 - 0.1719x_2 - 1.2268x_3 + 7.8739x_1^2 + 0.1489x_2^2 + 2.6920x_1x_2 + 0.1752x_2x_3

\hat{y}_{e3}(x) = 17.8118 + 0.7442x_1 - 0.0120x_2 - 1.0710x_3 + 3.4798x_1^2 + 0.8288x_2^2 + 1.6731x_3^2 + 1.3020x_1x_2 + 1.9716x_1x_3

Table 7  Experimental Design and Response Values (Food Industry Example)

Original control variables    Coded control variables         Responses
  X_1    X_2    X_3        x_1      x_2      x_3         y_1     y_2     y_3
 26.0    2.8   18.0      -1.000   -1.000   -1.000       1.83   29.31   29.50
 40.0    2.8   18.0       1.000   -1.000   -1.000       1.73   39.32   19.40
 26.0    8.2   18.0      -1.000    1.000   -1.000       1.85   25.16   25.70
 40.0    8.2   18.0       1.000    1.000   -1.000       1.67   40.81   27.10
 26.0    2.8   27.0      -1.000   -1.000    1.000       1.86   29.82   21.40
 40.0    2.8   27.0       1.000   -1.000    1.000       1.77   32.20   24.00
 26.0    8.2   27.0      -1.000    1.000    1.000       1.88   22.01   19.60
 40.0    8.2   27.0       1.000    1.000    1.000       1.66   40.02   25.10
 21.2    5.5   22.5      -1.682    0.000    0.000       1.81   33.00   24.20
 44.8    5.5   22.5       1.682    0.000    0.000       1.37   51.59   30.60
 33.0    1.0   22.5       0.000   -1.682    0.000       1.85   20.35   20.90
 33.0   10.0   22.5       0.000    1.682    0.000       1.92   20.53   18.90
 33.0    5.5   14.9       0.000    0.000   -1.682       1.88   23.85   23.00
 33.0    5.5   30.1       0.000    0.000    1.682       1.90   20.16   21.20
 33.0    5.5   22.5       0.000    0.000    0.000       1.89   21.72   18.50
 33.0    5.5   22.5       0.000    0.000    0.000       1.88   21.21   18.60
 33.0    5.5   22.5       0.000    0.000    0.000       1.87   21.55   16.80

The estimated standard errors of the parameter estimates, the values of the
univariate R^2, adjusted R^2, and C_p statistics, and values of McElroy's R^2
and Sparks' PRESS statistics are given in Table 8. We can see that the fits of
the three models are quite good.

The individual optima and the GDA simultaneous optima over the
region R = {(x_1, x_2, x_3) | \sum_{i=1}^{3} x_i^2 \le 3} are given in Table 9.
The results of the DFA are presented in Table 10. Here, the desirabil-
ity values were computed using the functions

d_1(x) = \begin{cases} \dfrac{\hat{y}_{e1}(x) - 1.3}{2.5 - 1.3} & 1.3 < \hat{y}_{e1}(x) < 2.5 \\ 0 & \text{otherwise} \end{cases}

d_2(x) = \begin{cases} \dfrac{\hat{y}_{e2}(x) - 51}{17 - 51} & 17 < \hat{y}_{e2}(x) < 51 \\ 0 & \text{otherwise} \end{cases}

Table 8  SUR Parameter Estimates and Values of C_p, R^2, and Adjusted R^2
(Food Industry Example)

                               Responses^a
Parameter     \hat{y}_{e1}            \hat{y}_{e2}             \hat{y}_{e3}
Intercept     1.8807 (0.0207)       22.5313 (0.8854)       17.8118 (0.8097)
x_1          -0.0974 (0.0097)        5.6609 (0.5538)        0.7442 (0.3818)
x_2          -0.0009 (0.0097)       -0.1719 (0.5538)       -0.0120 (0.3818)
x_3           0.0091 (0.0097)       -1.2268 (0.5538)       -1.0710 (0.3818)
x_1x_2                               2.6920 (0.7234)        1.3020 (0.4370)
x_1x_3                                                      1.9716 (0.4323)
x_2x_3                               0.1752 (0.7158)
x_1^2        -0.1030 (0.0107)        7.8739 (0.5823)        3.4798 (0.4198)
x_2^2         0.0013 (0.0107)        0.1489 (0.5823)        0.8288 (0.4198)
x_3^2         0.0028 (0.0107)                               1.6731 (0.4162)
C_p           6.61                   7.82                   8.59
R^2           0.93                   0.95                   0.87
Adj. R^2      0.88                   0.91                   0.74

^a The number inside parentheses is the standard error.
Note: McElroy's R^2 = 0.9271; Sparks' PRESS statistic = 230.21.

Table 9  Individual and GDA Simultaneous Optima for the Food Industry
Example

Response           Optimum            Location
Individual optima
\hat{y}_{e1}(x)    Max = 1.9263       (-0.4661, -0.3418, 1.6276)
\hat{y}_{e2}(x)    Min = 18.8897      (-0.5347, 1.1871, 1.1415)
\hat{y}_{e3}(x)    Min = 17.4398      (-0.2869, 0.2365, 0.4970)
Simultaneous optima (GDA)
\hat{y}_{e1}(x)    Max = 1.9136       (-0.5379, 1.0435, 0.8622)
\hat{y}_{e2}(x)    Min = 19.3361      (-0.5379, 1.0435, 0.8622)
\hat{y}_{e3}(x)    Min = 17.9834      (-0.5379, 1.0435, 0.8622)

Note: Minimum value of \rho_e in Eq. (18) is 0.9517.

d_3(x) = \begin{cases} \dfrac{\hat{y}_{e3}(x) - 30}{14 - 30} & 14 < \hat{y}_{e3}(x) < 30 \\ 0 & \text{otherwise} \end{cases}

In setting up these functions, we assumed that the ranges of acceptable
values for the three responses are 1.3 < y_1 < 2.5, 17 < y_2 < 51, and 14
< y_3 < 30.
Finally, for the DRA, each of the three responses was considered to be
the primary response, and its optimum value over R was obtained under the
constraints that the other two responses are equal to their respective indi-
vidual optima from Table 9. The results are shown in Table 11.

A summary of the optimization results of applying GDA, DFA, and
DRA to this example is given in Table 12. Here also the results are similar,
with the GDA and DFA providing slightly smaller minima for y_2 and y_3
than the DRA.

Table 10  DFA Simultaneous Optima for the Food Industry Example

Response           Optimum            Location
\hat{y}_{e1}(x)    Max = 1.9127       (-0.4504, 0.6176, 0.8081)
\hat{y}_{e2}(x)    Min = 19.8768      (-0.4504, 0.6176, 0.8081)
\hat{y}_{e3}(x)    Min = 17.6386      (-0.4504, 0.6176, 0.8081)

Note: The maximum of d(x) over R is 0.7121.

Table 11  DRA Optima for the Food Industry Example

Response           Optimum    Location
\hat{y}_{e1}(x)    1.91       (-0.5617, 1.1228, 0.9415)
\hat{y}_{e2}(x)    20.55      (-0.3514, 0.2824, 0.5605)
\hat{y}_{e3}(x)    18.61      (-0.5077, 1.0716, 1.2625)

Table 12  Comparison of GDA, DFA, and DRA Results for the Food Industry
Example

                          GDA                    DFA                    DRA
Optimal response values   (1.91, 19.34, 17.98)   (1.91, 19.88, 17.64)   (1.91, 20.55, 18.61)
Optimal settings          (-0.54, 1.04, 0.86)    (-0.45, 0.62, 0.81)    See Table 11
Minimum metric (\rho_e)   0.9517                 1.2832                 Not applicable
Overall desirability      0.7098                 0.7121                 0.6885

ACKNOWLEDGEMENT

We acknowledge the help of Ms. Terri L. Moore in providing the technical
background for the example in Section 3.1.

REFERENCES

Biles WE. (1975). A response surface method for experimental optimization of multi-
response processes. Ind Eng Chem, Process Des Dev 14:152-158.
Box GEP. (1954). The exploration and exploitation of response surfaces: Some gen-
eral considerations and examples. Biometrics 10:16-60.
Box GEP, Youle PV. (1955). The exploration and exploitation of response surfaces:
An example of the link between the fitted surface and the basic mechanism of
the system. Biometrics 11:287-323.
Box GEP, Hunter WG, MacGregor JF, Erjavec J. (1973). Some problems associated
with the analysis of multiresponse data. Technometrics 15:33-51.
Chitra SP. (1990). Multi-response optimization for designed experiments. Am Stat
Assoc Proc Stat Comput Sect, pp 107-112.
Conlon M. (1988). MR: Multiple response optimization. Tech Rep No. 322,
Department of Statistics, University of Florida, Gainesville, FL.
Conlon M. (1992). The controlled random search procedure for function optimiza-
tion. Commun Stat Simul Comput B21: 919-923.

Copeland KAF, Nelson PR. (1996). Dual response optimization via direct function
minimization. J Qual Technol 28: 331-336.
Del Castillo E. (1996). Multiresponse process optimization via constrained confi-
dence regions. J Qual Technol 28:61-70.
Del Castillo E, Montgomery DC. (1993). A nonlinear programming solution to the
dual response problem. J Qual Technol 25: 199-204.
Del Castillo E, Montgomery DC, McCarville DR. (1996). Modified desirability
functions for multiple response optimization. J Qual Technol 28:337-345.
Derringer GC. (1994). A balancing act: Optimizing a product's properties. Qual Prog
27:51-58.
Derringer GC, Suich R. (1980). Simultaneous optimization of several response vari-
ables. J Qual Technol 12:214-219.
Dodge M. Kinata C. Stinson C. (1995). Running Microsoft Excel for Windows 95.
Washington, DC: Microsoft Press.
Draper NR. (1963). “Ridge analysis” of response surfaces. Technometrics 5: 469-
479.
Fichtali J, Van de Voort FR, Khuri AI. (1990). Multiresponse optimization of acid
casein production. J Food Process Eng 12: 247-258.
Floros JD. (1992). Optimization methods in food processing and engineering. In:
Hui YH, ed. Encyclopedia of Food Science and Technology, Vol. 3. New
York: Wiley, pp 1952-1965.
Floros JD, Chinnan MS. (1988a). Seven-factor response surface optimization of a
double-stage lye (NaOH) peeling process for pimiento peppers. J Food Sci
53:631-638.
Floros JD, Chinnan MS. (1988b). Computer graphics-assisted optimization for pro-
duct and process development. Food Technol 42:72-78.
Guillou AA, Floros JD. (1993). Multiresponse optimization minimizes salt in natural
cucumber fermentation and storage. J Food Sci 58:1381-1389.
Harrington EC. (1965). The desirability function. Ind Qual Control 21:494-498.
Hill WJ, Hunter WG. (1966). A review of response surface methodology: A literature
survey. Technometrics 8:571-590.
Khuri AI. (1996). Multiresponse surface methodology. In: Ghosh S, Rao CR, eds.
Handbook of Statistics, Vol. 13. Amsterdam: Elsevier Science, pp 377-406.
Khuri AI, Conlon M. (1981). Simultaneous optimization of multiple responses repre-
sented by polynomial regression functions. Technometrics 23:363-375.
Khuri AI, Cornell JA. (1996). Response Surfaces. 2nd ed. New York: Marcel
Dekker.
Khuri AI, Myers RH. (1979). Modified ridge analysis. Technometrics 21:467-473.
Kim KJ, Lin DKJ. (1998). Dual response surface optimization: A fuzzy modeling
approach. J Qual Technol 30:1-10.
Lin DKJ, Tu W. (1995). Dual response surface optimization. J Qual Technol 27:34-
39.
Lind EE, Goldin J, Hickman JB. (1960). Fitting yield and cost response surfaces.
Chem Eng Prog 56: 62-68.

McElroy MB. (1977). Goodness of fit for seemingly unrelated regressions: Glahn's
R^2_{y,x} and Hooper's r^2. J Econometrics 6:381-387.
Microsoft (1993). Microsoft Excel User's Guide, Version 4.0. Redmond, WA:
Microsoft Corporation.
Mouquet C, Dumas JC, Guilbert S. (1992). Texturization of sweetened mango pulp:
Optimization using response surface methodology. J Food Sci 57:1395-1400.
Myers RH, Carter WH. (1973). Response surface techniques for dual response sys-
tems. Technometrics 15:301-317.
Myers RH, Khuri AI, Carter WH. (1989). Response surface methodology: 1966-
1988. Technometrics 31:137-157.
Myers RH, Khuri AI, Vining G. (1992). Response surface alternatives to the Taguchi
robust parameter design approach. Am Stat 46:131-139.
Nelder JA, Mead R. (1965). A simplex method for function minimization. Comput J
7:308-313.
Price WL. (1977). A controlled random search procedure for global optimization.
Comput J. 20:367-370.
SAS (1989). SAS/STAT User's Guide, Vol. 2, Version 6. 4th ed. Cary, NC: SAS
Institute, Inc.
SAS (1990a). SAS/ETS. Version 6. Cary, NC: SAS Institute, Inc.
SAS (1990b). SAS/IML Software, Version 6. Cary, NC: SAS Institute, Inc.
Sparks RS. (1987). Selecting estimators and variables in the seemingly unrelated
regression model. Commun Stat Simul Comput B16:99-127.
Srivastava VK, Giles DEA. (1987). Seemingly Unrelated Regression Equations
Models. New York: Marcel Dekker.
Stat-Ease (1993). Design-Expert User's Guide, Version 4.0. Minneapolis, MN: Stat-
Ease, Inc.
Tseo CL, Deng JC, Cornell JA, Khuri AI, Schmidt RH. (1983). Effect of washing
treatment on quality of minced mullet flesh. J Food Sci 48:163-167.
Valeroso ES. (1996). Topics in multiresponse analysis and optimization.
Unpublished PhD Thesis, Department of Statistics, University of Florida,
Gainesville, FL.
Vining GG, Myers RH. (1990). Combining Taguchi and response surface philoso-
phies: A dual response approach. J Qual Technol 22:38-45.
Zellner A. (1962). An efficient method of estimating seemingly unrelated regressions
and tests for aggregation bias. J Am Stat Assoc 57:348-368.
26
Stochastic Modeling for Quality
Improvement in Processes
M. F. Ramalhoto
Technical University of Lisbon, Lisbon, Portugal

1. INTRODUCTION

In any service industry there are essentially two types of products to be
considered, product service and product supply. Product service can be
defined as how the service has been provided, and product supply as what
has been provided (this is, in many cases, what is commonly called the product).
The product service is usually provided through the service delivery process of
a queuing system. The service delivery process is essentially described by a
queuing model. This chapter deals only with the product service.

To develop policies that provide consistently high product service for a
wide range of customer types and arrival and service rates at "reasonable"
cost is one of the ultimate targets of most queuing system managers.
Usually, these are not easy targets. The present chapter presents a metho-
dology to address them.
In Section 2 the differences between product service and product sup-
ply are discussed. In Section 3 a way is provided of quantifying delay and
discomfort in the queuing system of the service industry in order to achieve a
product service of high quality. Six external queuing system quality dimen-
sions and four internal queuing system quality dimensions are defined to
address delay and discomfort. The external quality dimensions (perfor-
mance, flexibility, serviceability (responsiveness), reliability, courtesy (empa-
thy), and appearance (tangibles)) provide a way to establish a kind of
channel of communication between the queuing system managers and
operators and their customers (they allow the managers to understand
their customers' expectations and perceptions of the queuing system). The
first three internal quality dimensions (timeliness, integrity, and predictabil-
ity) provide a way to establish a kind of channel of communication between
the managers and the actual physics of the queuing system (they allow the
managers to identify and understand the limitations of the production pro-
cess). The fourth internal quality dimension (customer satisfaction)
provides a way to establish a kind of channel of communication between
the managers and their market competitors. Once we have established the
channels of communication, we have to learn how to use them to commu-
nicate efficiently and to find the solution or the way of coping with the
identified problems. Most of those problems have to do with the design of
the service delivery process.
Behind a service delivery process there is usually a queuing model
responsible for its failure or its success. In Section 4 the most relevant
queuing models addressing the reduction of delay and discomfort, and
their functional relationship with the basic queuing model parameters, are
presented and discussed (two analytical queuing models that consider the
quality dimension flexibility, one queuing model that considers the custo-
mers' perceptions of waiting and service, and a brief reference to approx-
imations and bounds for queuing models with time-dependent arrival rates
and to retrial queuing models). Usually, there is more than one queuing
model able to respond to the needs of a particular service delivery process.
Each queuing model option might lead to different levels of delay and
discomfort reduction, impact on customer satisfaction, and costs. The aim
is to find the "optimal" choice that balances it all. In Section 5 a simulation-
decision framework, called total quality queue management, is described
that explicitly considers and evaluates alternative queuing model options
and makes the necessary decisions by selecting those particular options
that provide the best projected performance scores, in terms of specified
scoring criteria, based on measures linked to the quality dimensions selected.
Section 6 consists of conclusions and further remarks.

2. PRODUCT SERVICE AND PRODUCT SUPPLY


2.1. Distinguishing Product Service and Product Supply
There might be situations where a clear cutoff between the product service
and the product supply is too difficult to achieve. However, usually the
product supply is an object and the product service is not. Also, in most
cases, a poor product service might ruin an excellent quality product supply
and vice versa. Therefore, both the quality improvement of product service
and that of product supply have to be looked for and considered equally
important.

Quality improvement of the product supply is linked to stochastic
maintenance, reliability, quality control, and experimental design techniques.
Furthermore, an important problem is how to achieve a high-quality product
supply without increasing cost. In many situations the study of interactions
among maintenance, reliability, and control charts, through a total quality
management (TQM) approach, might help to reach that goal. However, that
is not the concern of this chapter, which deals only with the product service.

It has been recognized by several authors, including Deming (see, e.g.,
Ref. 1), that people who work in queuing systems are usually not aware that
they too have a product to sell and that this product is the service they are
providing. The product service is frequently invisible to the operators. They
have difficulties in seeing the impact of their performance on the success or
failure of the organization that employs them, on the security of their jobs,
and on their wages. Perhaps it would make sense to propose a quality index
(based on some of the quality dimensions to be defined next) for most of the
relevant queuing systems of common citizens' everyday life (that would also
help their operators to understand better the importance of their mission).
Just imagine all the queuing systems relevant to our everyday life operating
under the customer satisfaction criterion efficiently, adequately, and at
controlled costs.

2.2. Identification of Differences


Product service cannot be stored, so apparently at least some measurements
must be almost immediate. In fact, product service is intangible and ephem-
eral or perishable. It cannot be stockpiled and must be produced on demand
(it should be noted that similar constraints now exist on the production of at
least some product supply, owing to the new requirements in manufacturing
production, such as just-in-time or zero inventory). Frequently, the delivery
of the product service involves the customer and begins a very time-sensitive
relationship with the customer. The involvement of the customer also makes
the definition of quality of the product service vary over time much more
quickly than that of the product supply. Customers also add uncertainties to
the process, because it is often difficult to determine their exact requirements
and what they regard as an acceptable standard for the product service. This
problem is magnified by the fact that standards are very often subjective,
based on personal preferences or moods rather than on technical perfor-
mance that can be easily measured [2]. Whereas a product service may have
completely satisfied a customer yesterday, exactly the same product service
may not do so today because of the customer's mood. On the other hand,
with the same equipment and for the same required service, because of the
mood of the operator (if the operator is a human), the product service might
be of poor quality today even if usually it is not. Queuing and waiting in
general are at the same time personal and emotional. Qualitative and quan-
titative aspects of human behavior toward waiting have to be addressed. In
most cases, if customers are pleasantly occupied while waiting (entertain-
ment, socially relevant information, opportunity to make interesting con-
tacts, job opportunities, extra information about the queuing system itself,
etc.), their perception of the length of the waiting time and of whether it is
"reasonable" may differ substantially. Unlike the product supply, which can
usually be sampled and tested for quality, the product service cannot, at
least not easily. The record of an inspection of the product service cannot be
assumed to be a "true" reflection of its quality. For instance, during inspec-
tion the operator (if a human) might be quicker, more courteous, and more
responsive to customers than if left alone. (However, if the operator feels
pleasure in providing a high quality product service and is proud of con-
tributing to the higher standards of the queuing system, he or she works well
even without any kind of inspection.) Moreover, unlike the control of qual-
ity in the product supply [1], the quality of the product service depends both
on the operator and on the customer. Also, product service can be classified
as poor by some and good by others. Indeed, its qualification, good or
faulty, need not be consistent.

On assessing the effectiveness of a product service, quantitative and
qualitative factors have to be taken into consideration. It is also expected
that different individuals will have different judgments and different opi-
nions about many factual issues. Nevertheless, if the process continues long
enough, the observers are expected to independently arrive at very similar
interpretations. That, obviously, encourages the development of mechan-
isms of communication between the system's management and their custo-
mers. Moreover, product service is delivered at the moment it is produced.
Any quantification or measurement taken is thus too late to avoid a failure
or defect with that particular customer. However, that situation might be
alleviated if a communication mechanism is already in operation (for
instance, at the exit the customer could be asked, or given a short and
clear questionnaire, to quantify the product service just received according
to the quality dimensions to be defined in the next section and to briefly state
what he or she would like to see improved in it; means of contacting the
customer for mutually relevant communication in the future should also be
recorded if the customer is interested). The success of the communication
mechanism depends heavily on showing customers that they have been
heard by the system managers and that their relevant opinions really
make a difference.

Nevertheless, product service quality must always be balanced
between customer expectations and customers' perception of the product service
received. A higher quality product service is one with which the customers'
perceptions meet or exceed their expectations. It is obvious that it is much
more difficult to define quantitative terms for the features that contribute to
the quality of product service than to quantify the quality of the product
supply. Therefore, the primary area of difficulty is that of identifying appro-
priate quality "measures" (quantities resulting from measurements or quan-
tification), which we call here quality dimensions. These quality dimensions
also serve as a common language among the customers, operators, and
managers.

3. QUALITY DIMENSIONS

I shall classify the quality dimensions into external and internal.

3.1. External Quality Dimensions

The quality dimensions performance, flexibility, serviceability (responsive-
ness), reliability, courtesy (empathy), and appearance (tangibles) are here
called external quality dimensions and defined as follows, in a slightly dif-
ferent way than in Refs. 3 and 4. Note that all external quality dimensions
are defined from the customer's viewpoint.
Performance is the primary operating characteristic of the queuing
system. It can be "measured" by, for instance, the "absence or perceived
absence of waiting time," "total sojourn time in the system not exceeding
x units of time," "competitive price," etc.

Flexibility is the queuing system's built-in ability to quickly respond to
changes in demand. It can be "measured" by, for instance, the duration of
a traffic peak (how quickly the peak is gotten rid of).

Serviceability (responsiveness) is the ability of the queuing system to
respond to the individual needs of a particular customer. It can be mea-
sured by, for instance, the time to respond to those individual needs,
including the length of time to answer enquiries or to answer complaints.

Reliability is the ability to always perform the product service depend-
ably, knowledgeably, and accurately, as expected by the customer.

Courtesy (empathy) is the caring, individualized attention provided to
the customer, the effort to understand the customer's needs, and the ability to
convey trust and confidence. These are factors more linked to standards
of preferential human behavior, which are most subjective and difficult to
control and evaluate. They need separate attention and joint research
work with other specialists in order to set up adequate ways of quantify-
ing them.

Appearance (tangibles) is the quality appearance of the physical envir-
onment and materials, facilities, equipment, personnel, and communica-
tions used to produce the product service. To quantify this quality
dimension, joint research work with other specialists is also required to
set up the right questions that lead to an adequate way of quantifying it.
The first four dimensions are mainly concerned with the cost-benefit
characteristics of the particular queuing system under study. In fact, in
many situations, once they reach reasonably high ranks it is easier to
improve the last two dimensions. Otherwise, a very kind operator who
does not know the job well will very soon be considered to be of little use
to the customer. An office full of well-dressed operators and sophisticated
equipment is not necessarily the most important factor for the customer,
particularly if the first four dimensions are not ranked high. They might
even represent an insult to the customer who knows that, directly or indir-
ectly, he or she is paying for that luxury.

Those quality dimensions are of great value as facilitators of system
improvement but not in the ongoing business of monitoring and improving
product service quality and cost reduction. They can be obtained only after
the product service is delivered. Also, they reflect the views of the customer
and not necessarily the real state of the system. They indicate the targets,
from the point of view of the customers, that must be aimed for. However, a
lot more might be learned by comparing the ranking of those quality
dimensions with the "real" state of the system (for instance, by establishing
priority targets and identifying the need to add more relevant quality
dimensions). In fact, other external quality dimensions could be envisaged,
such as managers', operators', and, when applicable, customers' commit-
ment to quality. That is, of course, another external quality dimension that
is difficult to measure but not so difficult to quantify.

3.2. Internal Quality Dimensions

We need "measures" that will help us to deliver what the customer expects or
to improve the queuing system beyond customers' expectations at reasonable
prices. For that, the quality dimensions timeliness, integrity, predictability,
and "customer satisfaction," called here internal quality dimensions, are
adopted. The quality dimension timeliness has been referred to by several
authors as one of the most influential components in the quality of a product
service, because the product service has to be produced on demand.

Timeliness is formed by the access time, which is the time taken to gain
attention from the system; the queuing time, which is the time spent waiting
for service (and which can be influenced by the length of the queue and/or
its integrity); and the action time, which is the time taken to provide the
required product service.

Integrity deals with the completeness of service and must set out what
elements are to be included in order for the customer to regard the service
as satisfactory. This quality dimension will set out precisely what features
are essential to the product service.

Predictability refers to the consistency of the service and also the per-
sistence or frequency of the demand. Standards for predictability identify
the proper processes and procedures that need to be followed. They may
include standards for the availability of people, materials, and equipment
and schedules of operation.

Customer satisfaction is defined here as the way to provide the targets
of success, which may be based on relative market position for the provi-
sion of a specific queuing system.

So far, we have established external and internal channels of commu-
nication and "measures" that tie together, in equal terms though with dif-
ferent roles, the managers, their operators (as part of the production
process), customers, and market competitors. The aim is to build up a
fair partnership of system managers, operators, market competitors, and
customers, all able to communicate among themselves and committed to
quality improvement and cost reduction of the system. Let me call this the
manager tetrahedron concept (see Fig. 1). This concept allows a TQM

Figure 1  Manager tetrahedron: the managers linked with the customers, the
operators (the production process), and the market competitors.

approach to the quality of queuing systems in the way discussed, for
instance, in Refs. 4-6.

Furthermore, the first internal quality dimension is clearly part of the
theory of queues. Namely, access time has to do with the theory of retrial
queues, and queuing time and action time are waiting time and service time,
respectively. Unlike manufacturing, the production process in queuing sys-
tems of the service industry is usually quite visible to customers, since they
are often part of this process. Therefore, it is crucial to place some quality
improvement efforts on improving the production process. The service
delivery process might be seen as the process of producing the product
service. Parasuraman et al. [7], through external quality dimensions, have
also identified the service delivery process as the key to improving product
service quality and building customer loyalty. To improve the service deliv-
ery process essentially means to improve the queuing model behind it.
Timeliness provides basic measures of its performance.

Let me now give examples of queuing model studies relevant to the
quality improvement of the service delivery process.

4. SOME EXAMPLES OF IMPORTANT QUEUING MODELS
IN QUALITY SERVICE

Some product service failures or defects are very often linked to "unaccep-
table access time," "unacceptable queuing time," "unacceptable action
time," and "unacceptable sojourn time in the system." All are clearly mea-
sured in queuing theory terms. Those failures or defects, as already men-
tioned, might ruin the ranking of most of the other quality dimensions. The
way to prevent those failures or defects rests in the quality of the design of
the process delivery of the queuing system. Often, if nothing is done to
spread out the arrival pattern, or to change the service rate, or to modify
the service discipline, the queuing system experiences very uneven traffic
flows and serious failures or defects occur in the product service. All of
those possible failures or defects have costs. Very often the cost of delay
is to lose customers.

4.1. Two Queuing Models that Consider the Quality
Dimension Flexibility

Queuing models that address queuing system quality have to be able to
deal efficiently with the peak durations that might occur in those systems.
Very often, the rate of arrival to the system is very uneven, subject to
random fluctuation, or periodically time-dependent. Designing such a queu-
ing system specially to meet the peak demands is not always the best action
to take, because it can be costly and the excess capacity can have negative
psychological effects on the customer.

On the other hand, a poor rank in flexibility might lead to poor
ranks in almost all the other quality dimensions. Most traditional queuing
models are unable to respond quickly to changes in their environment.
(The basic queuing parameter, namely, the number of operators, is usually
assumed to be unchanged no matter what is happening in the queuing
system.) The result is unacceptable queue sizes and waiting times. Long
queues are, with few exceptions (e.g., the restaurant with excellent food,
product supply at a good price), always considered an indication of poor
product service.

Ramalhoto and Syski [8] show how quality management concepts
of satisfying the customer can be incorporated into the design of queu-
ing models. They propose and study a queuing model that aims to
provide managers with a way of dealing with some temporary peak
situations, that is to say, to rank high in the flexibility quality
dimension. The model is essentially a G/G/c/FCFS (or a G/G/c/c+d/
FCFS, i.e., a first come first served queuing model with c operators
and d waiting positions; d is omitted when equal to zero or infinite)
queuing model under the following additional decision rule, called here
rule 1.

Rule 1. If the queue size exceeds b (the action line), introduce another
server (or k servers, k >= 1); when it falls below a (the prevention or alarm
line), withdraw one server (or the k servers, k >= 1), b > a.

For the M/M/c/infinity queuing model (i.e., a first come first served
queuing model with a Poisson arrival process, an exponential service
time distribution, c operators, and infinitely many waiting positions)
under rule 1, the equilibrium distribution of the state of the two-
dimensional Markov process that characterizes the queuing model is
derived. Some first-passage-time problems useful in the quality design
of the queuing system are solved. Several extensions of these analytical
results to more general settings, including nonhomogeneous Poisson
arrivals, are discussed.

For the M/M/c queue under rule 1, let the arrival rate be denoted by
\lambda and the service rate by \mu, and let \rho = \lambda/[(c + k)\mu] and z = \lambda/(c\mu), with
\rho < z and \rho < 1. Let p_{i,n}, for i = 0, 1, 2, ... and n = c, c + k, denote the
steady-state probability of having i customers in the queuing system with
n operators serving. Ramalhoto and Syski prove, among other results,
closed-form expressions for these probabilities [Ref. 8, p. 163,
Eqs. (9) and (10)].
A measure of preference for using c + k operators for a short period of
time [Ref. 8, p. 164, Eqs. (18) and (19)] is given by D(b, c+k), the entrance
probability to the set of states (i, c) for i = 0, ..., a - 1, before entering the
set of states (i, c + k) for i = b + 1, b + 2, ..., when starting from the bound-
ary state (b, c + k). The value of D(b, c+k) gives an indication of the tendency
toward c operators when starting with c + k operators. By letting z -> 1,
Ramalhoto and Syski [8] obtained a limiting expression for D(b, c+k) that
depends on the ratio p = c/(c + k).
Other rules could be considered as alternatives to rule 1; for instance:

Rule 2. When the queue size exceeds b (the action line), shorten the
service time (for instance, by deferring some tasks to be worked out later, by
dividing and scheduling the service when it can be provided in multiple sepa-
rate segments, or by reducing the quality of service).

Rule 3. Identify classes of service needed by customers (each class
requiring a different service time and being of different "value"), and treat
the customers in separate queues when the total queue length exceeds b (the
action line).

Which rule is preferable? Section 5 addresses this question.
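To compare candidate rules within a framework such as the one in Section 5, a discrete-event simulation is a natural tool. The following is a minimal sketch of an M/M/c queue operating under rule 1, with hypothetical parameter values; it exploits the memoryless property to simulate the two-dimensional Markov process directly and records timeliness-related measures.

import numpy as np

rng = np.random.default_rng(1)
lam, mu = 7.0, 1.0           # hypothetical arrival and service rates
c, k, a, b = 8, 2, 2, 6      # base servers, extra servers, alarm/action lines

i, n, t = 0, c, 0.0          # customers in system, active servers, clock
area_q, t_extra, T = 0.0, 0.0, 10000.0
while t < T:
    rate = lam + min(i, n) * mu          # total transition rate of the CTMC
    dt = rng.exponential(1.0 / rate)
    area_q += max(i - n, 0) * dt         # accumulate queue-length area
    t_extra += dt if n == c + k else 0.0
    t += dt
    if rng.random() < lam / rate:        # next event is an arrival
        i += 1
    else:                                # next event is a departure
        i -= 1
    if i - n > b:                        # rule 1: add the k extra servers
        n = c + k
    elif i - n < a:                      # rule 1: withdraw them
        n = c
print("mean queue length:", area_q / T)
print("fraction of time with c+k servers:", t_extra / T)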

Affinity Operators

There are several important examples of queuing systems in the service
industry where it is "more efficient" to have a customer serviced by one
particular operator than by any other. Thus the system schedules customers
on the queue of their affinity operator. To address the inevitable imbalance
in the number of customers assigned to each operator, there are several
policies that can be considered. Any conventional queuing model under
rule 3 might also be seen as a related model. Nelson and Squillante [9]
consider a general threshold policy that allows overloaded operators to
transfer some of their customers to underloaded operators. They vary four
policy control parameters. Decomposition and matrix-geometric techniques
yield closed-form solutions. They illustrate the potential sojourn time
benefits even when the costs of violating affinities are large, and they
experimentally determine optimal threshold values. One of the important
applications of those models is in maintenance after sales, which has
become a significant portion of manufacturing quality.

4.2. An Analytical Queuing Model that Considers the Customers' Perceptions of Waiting and Service
Conventional queuing control theory considers the costs of waiting in terms of time and money. For instance, Kitaev and Rykov [10] collect the newest results of the theory of Markov (semi-Markov and semi-regenerative) decision processes related to queuing models and show its applications to the control of arrivals, service mechanism, and service discipline. The theory of Markov decision processes claims that under certain conditions there exists an optimal Markov stationary strategy that can be constructed according to an optimal principle based on an optimality equation. Usually this approach does not account for customers' perceptions of waiting time and service.

Carmon et al. [11] examine how the service should be divided and scheduled when it can be provided in multiple separate segments. They analyze variants of this problem by using a model with a conventional function describing the waiting cost, which is modified to account for some aspects of the psychological cost of waiting in line. They analytically show, in some particular cases, that considerations of the psychological cost can result in prescriptions that are inconsistent with those dictated by conventional queuing control. From these results and the comments in the previous sections, it is obvious that psychologically based queuing research has a very important role to play in quality improvement in service industries.

4.3. Numerical Approximations for Queuing Models with Time-Dependent Arrival Processes
In any real-life queuing system of a service industry, there are seasonal (daily, weekly, and so on) patterns of traffic, with rush hours and slack times. Queuing models with nonhomogeneous stochastic process arrivals better reflect these time-dependent traffic situations. However, the analysis of time-dependent behavior is very difficult and very often impossible, even for the simplest conventional queuing models. Nevertheless, the infinite server queue with nonhomogeneous Poisson arrivals and general service time distribution is one of the very rare exceptions, where the time-dependent analysis is completely known and useful in practice.
In Ref. 12, it is shown that in the ergodic M/M/r/r + d queuing model, on the one hand, the distribution of almost any relevant queuing characteristic can be rewritten in terms of the third Erlang formula (the probability of nonimmediate service), which depends only on r and r\rho, where \rho is the traffic intensity. On the other hand, the number of waiting customers, number of servers occupied, number of customers in the system, waiting time in the queue, and total sojourn time in the system, in the stationary state, are sums of the corresponding random variables of the M/M/r/r loss queuing model (well approximated by the infinite-server queue for almost any value of the basic parameters involved, and even for the time-dependent case) and of the M/M/1/1 + (d - 1) queuing model, respectively, weighted by the third Erlang formula. The third Erlang formula value also indicates a "heavy/low" traffic situation. An extension of some of those results to the M/M/r/r + d queuing model with constant retrial rate is presented in Ref. 13, where the probability of not avoiding the orbit parallels the role of the third Erlang formula. In both models the decomposition's physical properties seem to be robust to several generalizations, including the time-dependent (transient) case. However, in many situations, namely, the time-dependent ones, there is no closed formula for most of the probability distributions. Therefore, exact comparisons are not possible. Approximations and bounds might be obtained through this decomposition approach.
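For reference, the third Erlang formula (the probability of nonimmediate service, widely known as the Erlang C formula) is straightforward to compute from r and the offered load r\rho. A minimal Python sketch follows; the function name and the example values are ours.

```python
def erlang_c(r: int, rho: float) -> float:
    """Third Erlang (Erlang C) formula: steady-state probability of
    nonimmediate service in an M/M/r queue with traffic intensity
    rho = lambda/(r*mu); it depends only on r and the load a = r*rho."""
    if rho >= 1.0:
        return 1.0                    # no equilibrium: every arrival waits
    a = r * rho                       # offered load in erlangs
    term = acc = 1.0                  # running term a**j / j!, starting at j = 0
    for j in range(1, r):
        term *= a / j
        acc += term
    term *= a / r                     # term is now a**r / r!
    wait = term / (1.0 - rho)
    return wait / (acc + wait)

print(erlang_c(3, 0.8))   # three servers at 80% utilization: ~0.65
```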
It is well known that many queuing system practitioners empirically approximate the Mt/M/r/r + d queuing model with nonhomogeneous Poisson arrivals by the infinite-server queuing model with nonhomogeneous Poisson arrivals. Based on this practice and on the results presented in Ref. 12, Ref. 14 provides a simple-to-use empirical approximation method for obtaining bounds and approximations for the Mt/G/r/r + d queuing model. Other authors, such as Whitt and his coauthors at the AT&T Laboratories, have developed other, more sophisticated approaches to

tackle the problem of obtaining approximations and bounds for the Mt/G/r/r + d queuing model. A lot of research work is still needed on this queuing model. Its great importance in the service industry has already been shown, for instance, in Ref. 15.

4.4. The Retrial Queuing Model


As shown in the previous section, the access time to the queuing system is one of the components of the internal quality dimension timeliness. In fact, usually, a customer whose first call for access to the queuing system is unsuccessful will repeat the call, once or several times, in quick succession, thus giving rise to the phenomenon of repeated attempts. The retrial queuing model studies this phenomenon. The effect of repeated attempts is to lead to additional theoretical difficulties, even for the M/M/1/1 queuing model with constant retrials. The study of the M/M/r/r queuing model with retrials involves multidimensional random walks. Approximations and numerical methods for this queuing model date back to 1947 [16], but Ref. 17 is the first book completely dedicated to retrial queues.

When a queuing system is very successful, it is usually because more customers are seeking access. If not properly controlled, the number of customers seeking access might eventually ruin the queuing system's quality reputation. Therefore, it is crucial to understand the interplay of the basic parameters (\lambda, arrival rate; \nu, service rate; and \alpha, access rate) and their influence on the most relevant quality dimensions. Figures 2-4 illustrate the kind of three-dimensional surfaces that represent the mean and the variance of the waiting time, as functions of \alpha and \nu, for the M/M/1/1 (one server and no waiting position), M/M/1/1 + 1 (one server and one waiting position), and M/M/2/2 retrial queuing models with constant retrial rate \alpha and for different ergodicity intensities.
Figure 2 Mean value of the waiting time in the M/M/1/1 queuing model with constant retrial rate and \lambda = 1.5.
Figure 3 Variance of the waiting time in the M/M/1/1 queuing model with constant retrial rate and \lambda = 1.5.

Figure 4 Mean value of the waiting time in the M/M/1/1 + 1 queuing model with constant retrial rate and \lambda = 1.5.


Figure 5 Variance of the waiting time in the M/M/1/1 + 1 queuing model with constant retrial rate and \lambda = 1.5.


Figure 6 Mean value of the waiting time in the M/M/2/2 queuing model with constant retrial rate and \lambda = 1.5.

Results of this kind help to evaluate the range of arrival, retrial, and service rates that provide consistently high product service quality in an increasingly successful queuing system. (Also, for example, providing k extra servers, as in Section 4.1, when \lambda and/or \alpha increase beyond a certain threshold might be an adequate short-term policy to maintain the high quality of the product service in an increasingly successful queuing system.)


Figure 7 Variance of the waiting time in the M/M/2/2 queuing model with constant retrial rate and \lambda = 1.5.

Remark 1. Perhaps it should be noted that if the design of the delivery process is no longer fit for the purposes required, it will cause a kind of "common cause variation." Temporary changes in the arrival or service mechanism will cause a kind of "special cause variation." Both causes of variation have to be addressed.

Remark 2. The following types of robustness are desirable: (1) The queuing model behind the delivery process is robust if its expected performance is not too much affected by "reasonable" changes in the arrival and departure rates. (2) The operator is robust if its performance is not too much affected by "reasonable" product service changes required by the customer.

Remark 3. Instead of setting up direct inspection of the operators, promote channels of communication among customers and operators to build a joint commitment to improving the quality of the product service.

Remark 4. Whenever possible, eliminate or substantially reduce waiting time and queue size. Managers and operators should network with customers through, for instance, new technologies in order to have customers' arrivals as close as possible to the instant they begin service.
Remark 5. Specific goals should be set for certain quality dimensions, such as access time not greater than x, duration of peaks not greater than y, queuing time (waiting time) not greater than z, action time (service time) not greater than h, and delivery process idle time not greater than u. A cost-benefit analysis should be established for queuing systems in monopolistic or urgently needed service industries.
Remark 6. An efficiently run queuing system should inform its customers at arrival that (1) on average, the waiting time to initiate service is shorter than a certain value and (2) its queue size (if not visible) is shorter than a certain value. Whenever needed, it should spread out arrivals, for instance, by (3) setting up appointment schemes, (4) pricing at peak load intervals, when applicable, and (5) establishing priority schemes for special classes of customers.

Remark 7. Build on the ISO 9000 gains by introducing a request for a good understanding of customers' needs as well as operators' limitations (by the manager tetrahedron concept) and the use of an adequate delivery process design.
Queuing theory certainly has a role to play in the search for models better adjusted to the needs of quality management of service industry queuing systems. However, the probabilistic results needed to understand and control the stochastic behavior of those queuing systems cannot all be determined analytically and need an interdisciplinary approach. They have to be obtained by a mixture of educated intuition (based on some of the queuing analytical and algorithmic results available), heuristics, simulation, and decision making guided by research findings on the psychology of waiting.

In fact, what seems to be required here is the creation of a framework with the ability to jointly consider data management (from the internal and external quality dimensions selected), process delivery design (robust queuing models, including psychologically based queuing models), and decision making also based on cost-benefit analysis. As already stressed, in most situations the service delivery process is the one that, more often than not, needs special attention.

5. EMPIRICAL MODEL BUILDING FOR THE QUALITY IMPROVEMENT OF QUEUING SYSTEMS

Usually more than one queuing model is capable of responding to the need to improve or redesign a particular service delivery process. Each queuing model option might lead to different levels of reduction of delay and discomfort, impact on customer satisfaction, and costs. The aim, in most cases, is to find the "optimal" solution that balances the customer delay and discomfort against operator idleness at the same cost.

Ramalhoto [18] formulated a practical simulation decision framework that considers and evaluates alternative queuing model options and makes the necessary decisions by selecting those particular options that provide the best projected performance scores, in terms of specified scoring criteria, based on measures linked to the quality dimensions selected. The queuing model options are defined as "control parameters" in this framework. For instance, the queuing models corresponding to rules 1, 2, and 3, respectively, defined in Section 4.1, can be represented quantitatively by the following basic control parameters: X1, the regular size of the service staff; X2, the percentage by which the service times for each customer are to be reduced or expedited (as a function of queue length or any other relevant quantity); X3, the amount by which the regular service staff is augmented by other personnel (such as secretarial or clerical staff to meet periods of heavy demand); X4, the number of different classes of service needed by customers; and X5, the percentage of the regular service staff to allocate to each of those different classes of service. This framework is called total quality queue management.

5.1. The Total Quality Queue Management Framework


Basically, the total quality queue management framework consists of four components: a stochastic demand model, a decision system, an outcome calculator, and a scoring system. The stochastic demand model represents our projection (and the uncertainties in our projection) of the rates of arrival and service requirements of the customers. The decision system searches systematically over the multidimensional space defined by the control parameters X1, ..., X5 to find an optimal combination of values, X1*, ..., X5*, for these control parameters that will yield the "best" system performance given the stochastic demand that has been specified for the particular problem.

To enable the decision system to compute and evaluate the consequences of any specific set of control parameter values, it has to use the results of the outcome calculator and the scoring system. The outcome calculator and the scoring system have to be constructed as entirely separate and independent systems.
The outcome calculator calculates (or projects) the specific outcome(s) that will result from any specific assumptions concerning customer demand and any specific decisions concerning the values of the control parameters. In particular, for any such combination of assumptions, the outcome calculator must be able to compute the pertinent outcome parameters (which are defined in terms of objective physical quantities such as queue length, customer waiting time, service cost, and other pertinent descriptors of the outcomes) that may be needed to evaluate the queuing system performance in terms of the selected quality dimensions. Clearly, the outcome calculator is concerned with the objective physical outcomes of the queuing system (in principle, it has nothing to do with the customers' goals, objectives, priorities, or expectations). It should be able to provide the real ranking value of the quality dimensions selected.

The scoring system has to be concerned with the subjective desirability of the outcomes in terms of customers' expectations, perceptions of waiting and service, and current goals and objectives. That should be done through, for instance, a careful analysis of complaints, behavioral queuing research, and relevant customer questionnaires and surveys addressing the quality dimensions selected. The purpose of the scoring system is to assign to each outcome a ranking of the quality dimensions selected that corresponds, as accurately as possible, to the customers' real objectives and expectations for that particular queuing system.

The actual implementation of the total quality queue management framework for a specific queuing system might be done, for instance, following a "value-driven" approach. However, other approaches might be envisaged.
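The four-component architecture translates naturally into software. The sketch below is a deliberately crude Python skeleton under invented assumptions: the demand model, the outcome formula (a single-queue M/M/1-style proxy), the scoring weights, and the restriction to the two staffing parameters X1 and X3 are all ours, intended only to show how the components stay separate while the decision system searches over them.

```python
import itertools
import random

def demand_model(seed):
    """Stochastic demand model: one scenario of arrival/service rates."""
    rng = random.Random(seed)
    return rng.uniform(1.0, 2.0), rng.uniform(0.8, 1.2)   # lam, mu

def outcome_calculator(x1, x3, lam, mu):
    """Objective physical outcomes for regular staff x1 plus x3 helpers
    (a crude single-queue proxy, not a full queuing analysis)."""
    rho = lam / ((x1 + x3) * mu)
    mean_queue = rho / (1.0 - rho) if rho < 1.0 else float("inf")
    return {"mean_queue": mean_queue, "staff": x1 + x3}

def scoring_system(outcome):
    """Subjective desirability; real weights would come from complaint
    analysis and customer surveys, not from these made-up constants."""
    return -(outcome["mean_queue"] + 0.5 * outcome["staff"])

def decision_system(x1_grid, x3_grid, n_scenarios=200):
    """Search the control-parameter space for the best expected score."""
    def expected_score(x1, x3):
        return sum(scoring_system(outcome_calculator(x1, x3, *demand_model(s)))
                   for s in range(n_scenarios)) / n_scenarios
    return max(((expected_score(x1, x3), x1, x3)
                for x1, x3 in itertools.product(x1_grid, x3_grid)))

print(decision_system(x1_grid=[1, 2, 3], x3_grid=[0, 1, 2]))
```

Keeping the outcome calculator and the scoring system as separate functions mirrors the requirement above that the objective physics and the subjective desirability be constructed independently.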

One of the interesting features of this framework is that we have two ranking schemes for the quality dimensions selected. The first is an inevitable consequence of the structure of the queuing system and its relevant physical laws (it reflects the voice of the real system), and the second reflects the customers' perceptions and expectations of the queuing system (it reflects the voice of the customer). So the comparison of the two rankings might be very important to the queuing system's learning process.

The total quality queue management framework is expected to help managers gain insight into the main factors that influence product service quality and identify process changes that will improve it.

6. CONCLUSIONS AND FURTHER REMARKS

Studies have shown that indicators often distort a program from the beginning by forcing a focus on the indicators rather than on the true underlying goals. The result is generally a lack of sustained success. And in many cases there is no success at all save in the artificial indicators, which can often be manipulated with little effect on the underlying process. Unfortunately, in several situations the harm caused by those artificial indicators is very painful. That is indeed a serious risk to be avoided. Therefore, an effective process of judging the costs and consequences of the choices necessarily incorporates a learning process. An important result of such learning is a shared vision with the managers, operators (many operators know a lot about their jobs and about the queuing system they are working with and also have the capability of taking direct action), and customers about how the process works and how it should work in order to confront the challenges it faces.

In this chapter, product service is treated as "manufacturing in the field." It is advocated that it should be carefully planned, audited for quality control, and regularly reviewed for performance improvement and customer reaction. The methodology presented is an attempt to construct a learning queuing system that is able to assess (internally and externally) its own actions and judge and adjust the process through which it acts. It relies on teamwork among customers, operators, and managers to unify some goals, on a scientific approach, and on decision making based on reliable data. In fact, it is based on analysis, simulation, data, policies, and options. The idea is also to question policies whenever appropriate. Adequate data have to be collected and studied statistically, and options have to be analyzed, including the option to change policies.
In real life, changes are very often costly in terms of money, time, psychological tensions, and so on, for many reasons (e.g., the new changes in practice do not perform as well as expected), and many things can go irreversibly wrong. Therefore, whenever possible, the total quality queue management framework, or any other adequate systems thinking framework, should be used (and its solutions tested, including the budgeted costs) in connection with virtual reality experimentation technology.
Remark 8. There is also a need to systematically judge all the other aspects of the queuing system, namely, the product supply and information technology involved. Key benchmark measures and standards are also needed.

Remark 9. When an appointment scheme for the arrivals is being used, most likely the manager will prefer a tight schedule that limits idle time, while a customer may prefer to arrive late to avoid waiting. When commitment on both sides is lacking, cost penalties on both sides often lead to more successful appointment schemes with "reasonable" average idle time and waiting time.

Remark 10. In a queuing system there are essentially two main reasons for customer dissatisfaction: (1) a waiting time that exceeds a threshold level and (2) dissatisfaction with the service received. The latter can be caused by the poor quality of either the product service or the product supply or by an excessively high cost.

Remark 11. Interdisciplinary advanced studies in the fields of data analysis, decision analysis, queuing theory, quality management, and the psychology of individuals, time, and change are needed to create more successful queuing systems for the service industries. Communication, information, and commitment are also important tools. Queuing system studies could also incorporate the latest behavioral queuing research (accumulated across the fields of psychology, marketing, economics, and sociology) to alleviate the human tensions and humiliation of waiting.
Remark 12. Future customer visits to any queuing system when alternatives exist (nonmonopolistic or non-urgently needed service) heavily depend on the price and quality of the product service and the product supply provided by this queuing system. It is well accepted by all that having to wait beyond certain limits is one of the crucial factors in customer satisfaction. However, as shown in Ref. 19, customers' final dissatisfaction with waiting for service is also very highly correlated with their global retrospective (dis)satisfaction judgments, which affect their future actions.

Remark 13. Little has been written on how queuing system quality is related to conventional productivity, profitability, and sales performance measures. The direct effect of queuing system redesign initiatives on those measures needs deeper investigation.
Remark 14. It should be noted that the service industry is spreading over to manufacturing. In the "service factory" concept, the service is identified as the "fifth competitive priority" (as opposed to the traditional four competitive priorities: cost, quality, delivery, and flexibility) in manufacturing strategy. The idea is that the manufacturing organization can become more competitive by employing a broader range of services provided by the factory personnel and facilities (for instance, maintenance after sales of their own goods).

Remark 15. It should be emphasized that a high quality queuing system is only the result of constant effort to control and to improve each single aspect of the queuing system as well as not to disregard synergies, integral (total) management, and strong market awareness.

ACKNOWLEDGEMENT

The author thanks Professor Gomez-Corral for the elaboration of Figures 2 to 7.

The author has also benefited from discussions with colleagues from Princeton University, Maryland University, and Rutgers University during her half-year sabbatical in 1998 at Princeton University.

This work was carried out with support from the "Fundação para a Ciência e a Tecnologia" under contract number 134-94 of the Marine Technology and Engineering Research Unit, Research Group on Queueing Systems and Quality Management, and the project INTAS 96-828, 1997-2000, on "Advances in Retrial Queueing Theory."

REFERENCES

1. WE Deming. Out of the Crisis. Cambridge, MA: MIT Center for Advanced Engineering Studies, 1986.
2. CA King. Service quality assurance is different. Qual Prog June: 14-18, 1985.
3. MF Ramalhoto. Queueing system of the service industry—A TQM approach. In: GK Kanji, ed. Total Quality Management. Proceedings of the First World Congress. London: Chapman & Hall, 1995, pp 407-411.
4. B Bergman, B Klefsjö. Quality from Customer Needs to Customer Satisfaction. New York: McGraw-Hill, 1994.
5. GK Kanji. Quality and statistical concepts. In: GK Kanji, ed. Total Quality Management. Proceedings of the First World Congress. London: Chapman & Hall, 1995, pp 3-10.
6. K Dahlgaard, K Kristensen, GK Kanji. TQM-leadership. In: GK Kanji, ed. Total Quality Management. Proceedings of the First World Congress. London: Chapman & Hall, 1995, pp 73-84.
7. A Parasuraman, LL Berry, VA Zeithaml. Understanding customer expectations of service. Sloan Manage Rev Spring: 39-48, 1991.
8. MF Ramalhoto, R Syski. Queueing and quality service. Invest Oper 16(2): 155-172, 1995.
9. RD Nelson, MS Squillante. Stochastic analysis of affinity scheduling and load balancing in parallel queues. (Submitted for publication.)
10. MY Kitaev, VV Rykov. Controlled Queueing Systems. Boca Raton, FL: CRC Press, 1995.
11. Z Carmon, JG Shanthikumar, TF Carmon. A psychological perspective on service segmentation models: The significance of accounting for consumers' perceptions of waiting and service. Manage Sci 41(11): 1806-1815, 1995.
12. MF Ramalhoto. Generalizations of Erlang formulae and some of their 2nd order properties. In: A Bachem, U Derigs, M Jünger, R Schrader, eds. Operations Research '93, GMOOR. Heidelberg: Physica-Verlag, 1993, pp 412-417.
13. MF Ramalhoto, A Gomez-Corral. Some decomposition formulae for the M/M/r/r + d queue with constant retrial rate. Communications in Statistics—Stochastic Models 14(1&2): 123-145, 1998.
14. MF Ramalhoto. The state of the Mt/G/∞ queue and its importance to the study of the M/G/r/r + d queue. (Submitted for publication.)
15. L Green, P Kolesar, A Svoronos. Some effects of nonstationarity on multiserver Markovian queueing systems. Oper Res 39: 502-511, 1991.
16. L Kosten. On the influence of repeated calls in the theory of probabilities of blocking. De Ingenieur 59: 1-25, 1947.
17. GI Falin, JGC Templeton. Retrial Queues. London: Chapman & Hall, 1997.
18. MF Ramalhoto. Stochastic modelling in the quality improvement of service industries—Some new approaches. Proceedings of the International Conference on Statistical Methods and Statistical Computing for Quality and Productivity Improvement, ICSQP '95, Seoul, 1995, pp 27-35.
19. Z Carmon, D Kahneman. The experienced utility of queueing: Experience profiles and retrospective evaluation of simulated queues. Working paper, Fuqua School of Business, Duke University, 1993.
Recent Developments in Response
Surface Methodology and Its
Applications in Industry
Angela R. Neff
General Electric, Schenectady, New York
Raymond H. Myers
Virginia Polytechnic Institute and State University, Blacksburg, Virginia

1. INTRODUCTION

It is interesting to note that there are a limited number of areas of statistics that are almost entirely motivated by and dependent on real problems. They do not progress merely because of innovative mathematical rigor; rather, their development is a function of the increased complexity of problems faced by practitioners. Such is the case with response surface methodology (RSM). The fundamental goal remains the same as it was in the late 1940s and early 1950s: to find optimum process conditions through experimental design and statistical analysis. While the term "quality improvement" became a classic and overused term in the 1980s and 1990s, RSM dealt with quality improvement problems 30 years earlier.

There is no question that RSM has received unprecedented attention in recent years and has been the beneficiary of Genichi Taguchi and the quality era. It has been put forth as a serious alternative to specific Taguchi methodology. In fact, the RSM approach has been suggested as a collection of tools that will allow for the adoption of Taguchi principles while providing a more rigorous approach to statistical analysis. Much progress continues to be made as RSM benefits from mathematical optimization

methods, statistical graphics, robust fitting, new design ideas, Bayesian sta-
tistics, optimal designtheory,generalizedlinearmodels,andmanyother
advances. Researchers in all fields are able to focus on applicationsof RSM
because of thesubstantialimprovement in thesoftwarethat isused for
RSM. There is no doubt that high qualitysoftware is one of thebetter
communication links between the statistics researcher and the user.
In this chapter we discuss and review some of the recent developments
in RSM and how they are having and will continue to have an impact on
applications in industry.

2. MEAN AND VARIANCE MODELING AND ROBUST PARAMETER DESIGN

Along with the realization that product quality depends on understanding process variation as well as targeting of the mean came the concept of response surface modeling for both the process mean and variance. Taguchi's clever consideration and use of noise models allowed this area to advance. Robust parameter design (RPD) is a principle that emphasizes proper choice of levels of controllable process variables (parameters in Taguchi's terminology) in order to manufacture a product with minimal variation around a predetermined target. These controllable process variables (controlled in experiments as well as in product and process design) are referred to as control factors. It is assumed that most of the variation around the target is due to the inability to control a second set of variables called noise factors. Some examples of noise factors are environmental conditions, raw material properties, variables related to how the consumer handles or uses the product, and even the tolerances around control factors. [The reader is referred to Myers and Montgomery (1995) for illustrations of control and noise variables for various applications.] The objective in RPD is to design the process by selecting levels of the control factors in order to achieve robustness (insensitivity) to the inevitable changes in the noise factors. This can be achieved through the appropriate design and analysis of experiments that include noise as well as control factors, since even the noise factors are often within our control for purposes of experimentation. This philosophy is perhaps Taguchi's greatest contribution to the quality movement.

Compared to the design and analysis techniques utilized by Taguchi (Taguchi and Wu (1980)), response surface methods can accomplish RPD through more rigorous analysis and efficient experimentation. For more on the RSM approach compared to Taguchi's methods, read Vining and Myers (1990), Myers et al. (1992a), Khattree (1996), and Lucas (1994).

Independent of the approach taken, however, the ability to incorporate robustness to noise factors into a process design depends on the existence and detection of at least one control × noise interaction. It is the structure of these interactions that determines the nature of the nonhomogeneity of process variance that characterizes the parameter design problem. For illustration, consider a problem involving one control factor, x, and one noise factor, z. Figure 1 shows two potential outcomes of the relationship between factors x and z and their effects on the response, y. In Figure 1a, it can be seen that the response y is robust to variability in the noise factor z when the variable x is controlled at its low level. When x is at its high level, however, the change in z has an effect of 15 units on the response. In other words, the presence of the xz interaction indicates that there is an opportunity to reduce the response variability through proper choice of the level of the control factor. In contrast, Figure 1b shows that when there is no control × noise interaction, the variability in y induced by the noise factor cannot be "designed out" of the system, since the variability is the same (i.e., homogeneous) at both levels of the control factor.

While the estimation of control × noise interactions is important for understanding how best to control process variance, the control factor main effects, as well as interactions among control factors, are equally important for understanding how to drive the response mean to its target. The dual response surface approach, which addresses both process mean and variance, begins with the response model

y(x, z) = \beta_0 + x'\beta + x'Bx + z'\gamma + x'Az + \epsilon     (1)
In the response model, x and z represent the r_x × 1 and r_z × 1 vectors of control and noise factors, respectively.

Figure 1 (a) Control by noise interaction. (b) No control by noise interaction.

The r_x × r_x matrix B contains coefficients of control × control interactions (which includes quadratics), while the matrix A is an r_x × r_z matrix of control × noise interactions. While it is possible to have interactions or even quadratics among noise factors, the previously defined response model will accommodate many real-life applications. It is assumed that \epsilon \sim N(0, \sigma^2 I), implying that any nonconstancy of variance of the process is due to an inability to control the noise variables. The assumption on the noise variables is such that the experimental levels of each z_i are centered at some mean \mu_i, with the \pm 1 levels set at \mu_i \pm c\sigma_{z_i}, where c is a constant (1, 2, etc.). As a result, it is assumed that

E(z) = 0,   Var(z) = \sigma_z^2 I_{r_z}

thus implying that the noise variables are uncorrelated with known variance.

Taking expectation and variance operators on the response model in (1), we can obtain estimates of the mean and variance response surfaces as

\hat{E}_z[y(x)] = b_0 + x'b + x'\hat{B}x

and

\widehat{Var}_z[y(x)] = (\hat{\gamma} + \hat{A}'x)' V (\hat{\gamma} + \hat{A}'x) + \hat{\sigma}_\epsilon^2

An equivalent form of the variance model, under the assumption that V = \sigma_z^2 I, is given by

\widehat{Var}_z[y(x)] = \sigma_z^2 \, \hat{l}'(x)\hat{l}(x) + \hat{\sigma}_\epsilon^2

where \hat{l}(x) = (\hat{\gamma} + \hat{A}'x), which is the vector of partial derivatives of \hat{y}(x, z) with respect to z. In these equations, b_0, b, \hat{\gamma}, \hat{B}, and \hat{A} contain regression coefficients from the fitted model of Eq. (1), with \hat{\sigma}_\epsilon^2 representing the error mean square from this model fit. Notice the role that \hat{A} plays in the variance model, recalling that it contains the coefficients of the important control × noise interactions. Running the process at the levels of x that minimize \|\hat{l}(x)\| will in turn minimize the process variance. If, however, A = 0, the process variance does not depend on x, and hence one cannot create a robust process by choice of settings of the control factors (illustrated previously with the simple example in Figure 1).
Various analytical techniques have been developed for the purpose of process understanding and optimization based on the dual response surface models. Vining and Myers (1990) proposed finding conditions in x that minimize \widehat{Var}_z[y(x)] subject to \hat{E}_z[y(x)] being held at some acceptable level. Lin and Tu (1996) consider a mean squared error approach for the "target is best" case. Other methods, given in Myers et al. (1997), focus on the distribution of response values in the process. These include the development of prediction intervals for future response values as well as the development of tolerance intervals to include at least a specified proportion of the process

values with some specified probability. An example taken from Myers et al. (1997) will be used to graphically illustrate the dual response surface approach and the usefulness of the analytical measures previously mentioned.

The data for this example, taken from Montgomery (1997), come from a factorial experiment conducted in a U.S. pilot plant to study the factors thought to influence the filtration rate of a chemical bonding substance. Four factors were varied in this experiment: pressure (x_1), formaldehyde concentration (x_2), stirring rate (x_3), and temperature (z). There is interest in maximizing filtration rate while also dealing with the variation transmitted by fluctuations of temperature in the process. For this reason, temperature is treated as a noise variable. All four factors are varied at the \pm 1 levels in a 2^4 factorial arrangement, with the \pm 1 levels of temperature assumed to be at \pm\sigma_z, representing temperature variability in the process. The fitted response model is given by (Montgomery, 1997)

\hat{y} = 70.025 + 10.8125 z + 4.9375 x_2 + 7.3125 x_3 - 9.0625 x_2 z + 8.3125 x_3 z - 0.5625 x_2 x_3

with R^2 = 0.9668 and \hat{\sigma}_\epsilon = 4.5954. Note that there are two control × noise interactions present in the model, indicating that the variability transmitted from temperature fluctuations can be reduced through proper choice of formaldehyde concentration (x_2) and stirring rate (x_3). Pressure (x_1) was found to have no significant effect on filtration rate (y). The estimated mean and variance models are therefore given by

\hat{E}_z[y(x_2, x_3)] = 70.025 + 4.9375 x_2 + 7.3125 x_3 - 0.5625 x_2 x_3

and

\widehat{Var}_z[y(x_2, x_3)] = (10.8125 - 9.0625 x_2 + 8.3125 x_3)^2 + (4.5954)^2
Figure 2 shows the overlaid contour plots for the response surface models of the process mean and standard deviation. The trade-off between maximizing filtration rate while attempting to minimize variance is evident. Figure 3 contains a contour plot of mean filtration rate along with the locus of points \hat{l}(x_2, x_3) = 0, defining a line of minimum estimated process variance. The shaded region represents a 95% confidence interval around this line of minimum variance. From Figure 3, the mean-variance trade-off becomes even more clear, since we can achieve barely more than 73 gal/hr for the estimated process mean while minimizing the process variance (with coordinates x_2 = 1, x_3 = -0.21).

In Figure 4 we see lower 95% one-sided prediction limits, while Figure 5 depicts lower 95% tolerance limits on filtration rate with probability 0.95. Both of these illustrations indicate that the process should be operated at a


Figure 2 Contour plot of both the mean filtration rate and the process standard
deviation.

Figure 3 Contour plot of mean filtration rate and the line of minimum process
variance with its 95% confidence region.


Figure 4 Contour plot of lower 95% one-sided prediction limits.

high concentration of formaldehyde (x_2) with reasonable flexibility in the operating level of stirring rate (x_3).
Figure 5 Contour plot of 0.95 content lower 95% one-sided tolerance limits.

Combining the information from the four plots provides powerful insights into the process, namely, that operating the process at the

(x_2 = 1.0, x_3 = -0.21) condition will minimize the process variance, with promising results indicated by the prediction and tolerance limits.
Taguchi's parameter design has had a profound effect on the rise in interest and use of RSM in industry. There are developments in other areas of interest, however, that should and likely will enhance its role, not only in traditional quality improvement but also in biostatistics and biomedical applications.

3. NEW DEVELOPMENTS IN RSM

3.1. Role of Computer-Generated Designs
The computer has been an important tool in the construction of experimental designs since the early 1980s. However, the focus has been almost entirely on criteria that have their underpinnings steeped in normal theory linear models. In this situation, of course, the alphabetic optimality criteria developed by Kiefer (1959) and others can be applied without knowledge of the parameters. However, as we emphasize in what follows, many of the response surface applications in the present and the future involve nonlinear and/or non-normal theory applications in which optimal designs depend on knowledge of the parameters. Uncertainty about model parameters in these cases, as well as uncertainties in more standard cases about model assumptions, goals, the presence of outliers, or missing design points, results in the need for consideration of design robustness as a serious alternative to optimal design. Almost without exception, commercial computer software deals with design optimality and does not address robustness. It is clear that computer-generated design cannot reach its full potential without considering these matters as well as dealing with various kinds of graphical methodology that allow the practitioner to compare and evaluate experimental designs. In what follows we discuss computer graphics that relate to RSM designs and provide some insight into new developments. These new developments necessitate design robustness as a companion to the RSM analysis tools that are currently finding use in industry.

3.2. Role of Creative Computer Graphics


Practitioners of RSM are undoubtedly familiar with the use of three-dimensional and contour plots for visualizing a predicted response. In a multi-response optimization problem, the practice of overlaying multiple contour plots is extremely helpful for visualizing any potential compromises that must be made in order to determine the process optimum. Statistical software packages such as Design-Expert and Minitab (version 12) have built-in features for generating these overlaid plots.

There are also graphical techniques that are extremely useful for evaluating the prediction capability of experimental designs. Two such graphical methods that are discussed here are variance dispersion graphs and prediction variance contour plots. Both of these graphical techniques enable the user to visualize the stability of prediction variance throughout the design space, thus providing a mechanism for comparing competing designs.
The graphical technique referred to as the variance dispersion graph (VDG) was developed by Giovannitti-Jensen and Myers (1989) and Myers et al. (1992b). A variance dispersion graph for an RSM design displays a "snapshot" of the stability of the scaled prediction variance, v(x) = N Var \hat{y}(x)/\sigma^2, and how the design compares to an "ideal." For a spherical design [see Rozum (1990) and Rozum and Myers (1991) for extensions to cuboidal designs], the VDG contains four graphical components:

1. A plot of the spherical variance V^r against the radius r. The spherical variance is essentially v(x) averaged (via integration) over the surface of a sphere of radius r.
2. A plot of the maximum v(x) on a radius r against r.
3. A plot of the minimum v(x) on a radius r against r.
4. A horizontal line at v(x) = p, to represent the "ideal" case.

Figure 6 illustrates the utility of VDGs for comparison of two spherical designs for k = 3 variables: the CCD with \alpha = \sqrt{3} and three center points, and the Box-Behnken design, also with three center points. Both designs have been scaled so that points are at a radius \sqrt{3} from the design center. The following represent obvious conclusions from the two VDGs in Figure 6 (Myers and Montgomery, 1995):

1. Note that there is very little difference between the minimum, average, and maximum of v(x) for the CCD, indicating that it is nearly rotatable. This should not be surprising, since \alpha = 1.682 results in exact rotatability.
2. The values of v(x) are very comparable for the two designs near the design center. Any difference is accounted for by the difference in sample sizes (N = 17 for the CCD, N = 15 for the BBD).
3. The CCD appears to be the better design for prediction from radius 1.0 to \sqrt{3}, based on greater stability in v(x) and a max v(x) that is smaller than that of the BBD.
4. The comparison with the ideal design [v(x) = 10.0] is readily seen for both designs.

"1"_
- CCD
.. ._ BED

0.0 0.2 0.4 0.6 0.8 1.0

Figure 6 Variance dispersion graphs for CCD and Box-Behnken designs fork = 3
design variables.

Another graphical method of displaying the stability of the prediction variance is a display of contours of constant prediction variance. Like the VDG, this technique enables the user to visualize the behavior of v(x) over the design space. Unlike the VDG, a contour plot of v(x) allows one to determine the direction in which v(x) is most unstable. This technique is now illustrated through a comparison of two competing designs of equal size, the 311A hybrid with one additional center point and a D-optimal design. The 12-run D-optimal design was generated using SAS Proc Optex, assuming the three-factor full quadratic model. The candidate list from which the design was selected was structured to be similar to the spherical space encompassed by the hybrid design. Contour plots of the unscaled prediction standard error, [v(x)/N]^{1/2}, were generated for each design, under the assumption of a full quadratic model. Figures 7a and 7b contain these contour plots for the hybrid and D-optimal designs, respectively. Note that each contour plot represents a slice of the design space where factor C is fixed at its midpoint condition, and therefore the center contour represents the standard error of prediction at the center of the design space. Studying these plots provides information about two key aspects of the designs: (1) nearness to rotatability and (2) stability/consistency of prediction variance throughout the space.
The hybrid design, known to be nearly rotatable, also has very stable and consistent prediction variance throughout the space. The D-optimal

Figure 7 Contours of standard error of prediction for (a) hybrid 311A and (b) D-optimal design.

design, in contrast, is not rotatable, which can be seen by the inconsistency of the prediction variance in the corners of the plotted space. In addition to the D-optimal design being unstable in the center of the design space, we also observe an overall higher degree of prediction variability throughout, relative to that of the hybrid design.

Independent of the designs studied, however, the power of the graphical techniques is evident. Graphical tools such as those presented in this section allow the researcher to quickly gain information about design performance and characteristics of the response surface.

3.3. Bayesian or Two-Stage Design


In more and more applications, the ability to design an experiment depends on a priori knowledge of the response surface model. For example, when designing experiments for nonlinear models, the parameters of the non-normal error models must be known. Even for the case of the linear model, identification of "optimal" designs depends on knowledge of the model regressors. In fact, we can say that it is rare when we truly know enough to design the experiment effectively without invoking prior information or conducting a preliminary experiment.

Consider the following logistic regression model, used frequently in biomedical applications:

y_i = p_i + \epsilon_i,   i = 1, 2, ..., n

where y_i \in \{0, 1\} indicates whether the ith subject responded to dose x_i of a given drug. It is therefore assumed that \epsilon_i is approximately Bernoulli (0, p_i(1 - p_i)), where

p_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_i)}}

The corresponding Fisher information matrix is given by

I(\beta) = \sum_{i=1}^{n} p_i (1 - p_i) \begin{pmatrix} 1 & x_i \\ x_i & x_i^2 \end{pmatrix}

Note that the information matrix is a function of the unknown \beta's. This makes it impossible to directly use traditional design optimality criteria for generating an efficient design, since they depend on being able to optimize some norm on the Fisher information matrix. For example, construction of the D-optimal design for the above model would require that the doses x_1, x_2, ..., x_n be chosen such that Det[N^{-1} I(\beta)] is maximized. In order to do this,

the scientist would be forced to make his or her best guess at the values of \beta_0 and \beta_1. The resulting design will be D-optimal for the specified values, which unfortunately may be very different from the truth, thus resulting in an inefficient design.

Chaloner and Verdinelli (1995) review a Bayesian approach to design optimality that incorporates prior information about the unknown parameters in the form of a probability distribution. This provides a mechanism for building in robustness to parameter misspecification, since a distribution of the parameter is specified, not merely a point estimate. The resulting Bayesian design optimality criterion is a function of the Fisher information matrix, integrated over the prior distribution on the parameters. For example, the Bayesian D-optimal design for the previously defined logistic model is found by choosing the levels of x that will maximize the expression

\int \log \mathrm{Det}[I(\beta)] \, \pi(\beta) \, d\beta

where \pi(\beta) is the prior probability distribution of \beta = [\beta_0, \beta_1]'. Other creative approaches have been taken that provide robustness to parameter misspecification. For example, a minimax approach is provided by Sitter (1992).
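The integrated criterion is straightforward to approximate by Monte Carlo. The sketch below averages log Det[I(\beta)] over prior draws and searches two-point dose designs on a grid; the prior moments, the grid, and the restriction to two distinct doses are illustrative assumptions of ours, not prescriptions from Chaloner and Verdinelli.

```python
import numpy as np

def fisher_info(doses, beta):
    """Fisher information for the two-parameter logistic model."""
    info = np.zeros((2, 2))
    for x in doses:
        p = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x)))
        f = np.array([1.0, x])
        info += p * (1.0 - p) * np.outer(f, f)
    return info

def bayes_d(doses, prior_draws):
    """Monte Carlo estimate of E_pi[ log det I(beta) ]."""
    return np.mean([np.linalg.slogdet(fisher_info(doses, b))[1]
                    for b in prior_draws])

rng = np.random.default_rng(0)
# Illustrative prior: beta0 ~ N(0, 1), beta1 ~ N(1, 0.5^2).
prior = np.column_stack([rng.normal(0.0, 1.0, 300),
                         rng.normal(1.0, 0.5, 300)])

grid = np.linspace(-4.0, 4.0, 17)
crit, lo_dose, hi_dose = max((bayes_d([a, b], prior), a, b)
                             for a in grid for b in grid if a < b)
print(f"Bayesian D-optimal two-point design: doses {lo_dose:.2f}, {hi_dose:.2f}")
```

Because the criterion is averaged over a distribution of plausible \beta values rather than evaluated at a single guess, the chosen doses tend to be less fragile to misspecification than a locally D-optimal pair.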
A two-stage design is another method used to achieve robustness to parameter misspecification. The strategy behind designing in two stages is to generate parameter information from data in the first stage that can then be used to select the remaining experimental runs with maximum efficiency. A two-stage procedure may implement any pair of design criteria that meet the first-stage objective as well as the objective of the combined design. For example, Abdelbasit and Plackett (1983) and Minkin (1987) studied the efficiency of two-stage D-optimal designs for binary responses, thus applying D optimality to both stages. Myers et al. (1996) developed a two-stage procedure for the logistic regression model that uses D optimality in the first stage followed by Q optimality in the second.

To illustrate the two-stage method, a brief description of the two-stage D-optimal design (D-D optimality) procedure for the logistic model is now given. The first step in the D-D (and also D-Q) procedure is the selection of a first-stage D-optimal design. In order to implement D optimality in the first stage, the experimenter must estimate the unknown \beta with a best guess, b_0. The N_1 runs for the first-stage design are then chosen to satisfy the first-stage D-optimality criterion, given by

\max_{D} \mathrm{Det}[N_1^{-1} I(\beta)]_{\beta = b_0}

with \beta replaced by b_0 and D representing all possible designs of size N_1. After design and execution of the first-stage experiment, N_1 observations are available to estimate \beta. The "best guess" of \beta is updated by replacing b_0 with the MLE of \beta, denoted b. The second stage of the two-stage process uses b, thus making it conditional on the results of the first stage. To complete the D-D procedure, it is necessary to choose a set of N_2 second-stage design points that will create a combined design that is conditionally D optimal. The N_2 points are chosen to satisfy

\max_{D} \mathrm{Det}[I_1(\beta) + I_2(\beta)]_{\beta = b}

where D is now the set of all possible designs of size N_2 and I_1(\beta) is fixed after the first stage.

Letsinger (1995) and Myers et al. (1996) evaluated the efficiency of two-stage procedures relative to their single-stage competitors. In doing so, they showed that the best performance of the two-stage designs was achieved when the first-stage design contained only 30% of the combined design size, thus reserving 70% of the observations for the second stage, when more parameter information is present.
Even for the normal linear model, successful implementation of design optimality criteria is often difficult in practice. This is due to the fact that the model content must be known a priori. In other words, the experimenter must be able to specify which regressors are needed to model the response, in order to generate the most efficient design for constructing the specified model. If too many regressors are specified, some design points (and consequently valuable resources) may be wasted on estimation of unimportant terms. If too few regressors are specified, then some terms that are needed in the model may not even be estimable.
Suppose an experimenter identifies a set of regressors, x, containing all p + q regressors he or she believes might be needed in modeling the behavior of a response y. The linear model is written as y = X\beta + \epsilon, with y denoting the n observations to be collected in an experiment, under the assumption that y|\beta, \sigma^2 \sim N(X\beta, \sigma^2 I). The model matrix, X, has dimensions n \times (p + q), with the p + q columns defined by the set of regressors, x. Quite often, the experimenter has knowledge of the process or system that allows him or her to identify p of the regressors as primary terms. These are the terms that the experimenter strongly believes are needed in modeling the response. The remaining terms are the potential terms, i.e., those terms about which the experimenter has uncertainty. For example, the experimenter may know from past experience that certain process variables must be included in the model as main effects (i.e., linear terms) but is uncertain if higher

order interactions (such as quadratics) are needed. The key is to incorporate this information into the experimental design, so that limited resources are first focused on estimation of the primary terms (in this case the main effects), while also using some resources for estimation of the potential terms.

DuMouchel and Jones (1994) proposed a Bayesian D-optimality criterion for the efficient estimation of both primary and potential terms. Let \beta_{pri} and \beta_{pot} represent the parameters corresponding to primary and potential terms, respectively. The approach taken by DuMouchel and Jones is to assume a diffuse prior distribution (arbitrary prior mean with infinite prior variance) for \beta_{pri}. This is reasonable because these parameters are expected to be significantly different from zero, but no assumption of direction of effect is made. The potential terms, however, are perceived to have smaller coefficients than the coefficients of primary terms. For this reason, \beta_{pot} is assigned an N(0, \sigma^2\tau^2 I) prior, with \sigma^2 and \tau^2 known. Fortunately, the design can be constructed independently of \sigma^2. The value of \tau^2, however, affects the choice of the design, since it reflects the degree of uncertainty associated with the potential terms relative to \sigma^2. Under the assumption that primary and potential terms are uncorrelated (achieved through proper scaling of the x's), the joint prior distribution assigned to \beta_{pri} and \beta_{pot} is characterized by the prior precision matrix K/(\sigma^2\tau^2), where K is a (p + q) \times (p + q) diagonal matrix whose first p diagonal elements equal 0 and whose remaining q diagonal elements equal 1. Under the assumption that y|\beta, \sigma^2 \sim N(X\beta, \sigma^2 I), the resulting posterior distribution of \beta = [\beta_{pri}, \beta_{pot}]' is also normal, with mean b = (X'X + K/\tau^2)^{-1} X'y and variance V = \sigma^2 (X'X + K/\tau^2)^{-1}. The Bayesian D-optimal design is that which minimizes the Bayes risk, proportional to

\log \mathrm{Det}[V] = \log \mathrm{Det}[\sigma^2 (X'X + K/\tau^2)^{-1}]
In practice, the appropriate design may be found by selecting the rows of X from a predefined candidate list, so that |V| is minimized. Note that the diagonals of V associated with \beta_{pot} are somewhat stabilized through prior information (given through \tau), identical to the technique used in ridge regression. The other diagonals of V, associated with \beta_{pri}, are more dependent on the design. As a result, the Bayesian D-optimal design will support estimation of both \beta_{pri} and \beta_{pot}, but with higher priority given to \beta_{pri}.
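A small amount of code makes the mechanics concrete. The Python sketch below applies the criterion to the three-factor setting described in the next paragraph (p = 7 primary terms, q = 3 potential quadratics), greedily growing a 16-run design from a 3^3 candidate list so that log|X'X + K/\tau^2| is maximized; the greedy construction and the tiny ridge term are our simplifications of the exchange algorithms a package such as Proc OPTEX would use.

```python
import itertools
import numpy as np

def model_row(x):
    """Intercept, main effects, and two-factor interactions (primary),
    followed by the three quadratics (potential)."""
    x1, x2, x3 = x
    return np.array([1, x1, x2, x3, x1*x2, x1*x3, x2*x3,
                     x1**2, x2**2, x3**2], dtype=float)

def objective(rows, K, tau):
    """log|X'X + K/tau^2|; maximizing it minimizes log|V|.  A tiny ridge
    keeps the matrix nonsingular while the design is still small."""
    X = np.array(rows)
    A = X.T @ X + K / tau**2 + 1e-8 * np.eye(K.shape[0])
    return np.linalg.slogdet(A)[1]

candidates = [model_row(x) for x in itertools.product([-1, 0, 1], repeat=3)]
K = np.diag([0.0] * 7 + [1.0] * 3)     # diffuse primary, informative potential
tau, N = 2.0 / 3.0, 16

design = []
while len(design) < N:                 # greedy point-by-point construction
    design.append(max(candidates,
                      key=lambda c: objective(design + [c], K, tau)))

print("achieved log|X'X + K/tau^2| =", round(objective(design, K, tau), 3))
```

Rerunning with a larger \tau shifts points toward better support of the quadratic terms, which is exactly the trade-off the following comparison illustrates.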
Consider an application in which three factors are to be studied, with emphasis being placed on the estimation of main effects and interactions (\beta_{pri} = [intercept, \beta_1, \beta_2, \beta_3, \beta_{12}, \beta_{13}, \beta_{23}]') while there is still some interest in the estimation of quadratics (\beta_{pot} = [\beta_{11}, \beta_{22}, \beta_{33}]'). The performance of the Bayesian D-optimal designs versus the familiar face-centered cubic (fcc) design is compared in Table 1 for various "true models." All designs contain

N = 16 runs, and all Bayesian D-optimal designs were produced by SAS (Proc OPTEX). The metric used for design comparison is the scaled D criterion, N[\mathrm{Det}(X'X)^{-1}]^{1/p}, calculated for the true model in each case.

From Table 1 we see that in almost every case the Bayesian D-optimal designs outperform the fcc design. The performance of the two Bayesian D-optimal designs depends on the accuracy of the experimenter's prior knowledge about the relative significance of primary and potential terms, reflected through the choice of the parameter \tau. For example, if it is believed that the quadratics are all within \pm 2\sigma of zero (i.e., most likely insignificant), and one therefore defines \tau = 2/3, the resulting design will be most efficient when the true model contains no quadratic terms. This design is not the best choice, however, if all quadratic terms truly belong in the model. In that case a larger value of \tau, such as \tau = 5, would have been a better choice for controlling the design construction.
This weakness in the Bayesian D-optimal designs should not at all detract, however, from the work of DuMouchel and Jones. In fact, their greatest contribution was to provide a basis for the development of more efficient Bayesian design criteria, such as two-stage procedures, for the purpose of generating efficient designs under model (regressor) uncertainty. Consider the value of adopting the method of DuMouchel and Jones to produce a first-stage design with robustness to regressor uncertainty. Analysis of the first-stage data could then provide additional information about the relative importance of the p + q regressors, enabling the remaining design points (second-stage design) to be chosen with greater efficiency. The second-stage design could then be generated from any optimality procedure that incorporates the improved model knowledge.
The two-stage approach described above was developed by Neff et al. (1997) for the purpose of developing numerous Bayesian two-stage design optimality procedures for the normal linear model under regressor uncertainty. Their work suggests that efficiency and robustness are gained from a two-stage design of size N = 2(p + q + 2), with half of the design points

Table 1 Values of Determinant for Evaluation of FCC and Bayesian D-Optimal Designs

Parameters contained in the true model | FCC | Bayesian D-opt (\tau = 5) | Bayesian D-opt (\tau = 2/3)
ResponseSurfaceMethodology 473

allocated to each stage of the design. One such two-stageBayesian approach


is illustrated by a brief description of a Bayes D-D optimality procedure.
Using this procedure, the first-stage design is chosen to be D-optimal according to the method of DuMouchel and Jones. Consequently, the first-stage posterior distribution of β = [β′_pri, β′_pot]′ is normal, with E(β | y1) = (X1′X1 + K/τ²)⁻¹X1′y1 and variance V1 = σ²(X1′X1 + K/τ²)⁻¹. Basing inferences on the first-stage posterior of β, the p + q standardized estimates of the model parameters (coefficients) after the first stage are

β̂*_j = b_j / (σ√c_jj),  j = 1, 2, ..., p + q

where c_jj is the jth diagonal element of (1/σ²)V1. Since the estimated effect of any regressor x_j is proportional to its standardized estimated coefficient, the relative importance of the various model terms can be estimated by the relative sizes of the β̂*_j's (in absolute value). Normalizing these β̂*_j's (in absolute value) produces a set of discrete scores or "weights of evidence" that quantify the relative importance of each model term. In other words, a new set of τ's, {τ1, τ2, ..., τ_{p+q}}, is produced based on this updated prior information. Going into the second stage, beliefs about the relative importance of the p + q model terms are expressed as β | σ², τ1, ..., τ_{p+q} ~ N(0, σ²T), where T is a (p + q) × (p + q) diagonal matrix with τ1², τ2², ..., τ²_{p+q} appearing on the diagonals. Setting the prior mean to zero at this point is arbitrary, since it will have no impact on the second-stage design criterion. Still under the assumption of a normal linear model, the second-stage posterior distribution is also normal, with posterior covariance matrix V2 = σ²(X1′X1 + X2′X2 + T⁻¹)⁻¹. Thus the second-stage conditionally D-optimal design is found by selecting the rows of X2 from a candidate list such that |V2| is minimized. Due to the structure of T⁻¹, the diagonals of V2 corresponding to less important regressors are already somewhat stabilized. Design points that provide information about the more important regressors and thus stabilize the corresponding diagonals will be chosen for the second-stage design. For a performance comparison of this procedure as well as other two-stage Bayesian design procedures relative to their single-stage competitors, the reader is referred to Neff et al. (1997).
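The weight update at the heart of this procedure is straightforward to sketch. The fragment below, written under the same assumptions as the text (and with σ² treated as known for simplicity), turns first-stage estimates into normalized τ weights and then scores a candidate second-stage block; function and variable names are ours, and this is an illustration of the idea rather than the exact algorithm of Neff et al.

    import numpy as np

    def second_stage_taus(X1, y1, n_primary, tau, sigma2=1.0):
        # Standardize first-stage posterior-mean coefficients and normalize
        # their absolute values into new tau "weights of evidence".
        p = X1.shape[1]
        K = np.diag([0.0] * n_primary + [1.0] * (p - n_primary))
        A = X1.T @ X1 + K / tau**2
        b = np.linalg.solve(A, X1.T @ y1)            # first-stage posterior mean
        c = np.diag(np.linalg.inv(A))                # c_jj: diagonal of (1/sigma^2) V1
        std_est = b / (np.sqrt(sigma2) * np.sqrt(c)) # standardized coefficients
        return np.abs(std_est) / np.abs(std_est).sum()

    def second_stage_logdet_v2(X1, X2, taus):
        # log det V2 up to additive constants; candidate X2 blocks giving
        # smaller values are preferred for the second stage.
        T_inv = np.diag(1.0 / taus**2)
        _, logdet = np.linalg.slogdet(X1.T @ X1 + X2.T @ X2 + T_inv)
        return -logdet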

3.4. Generalized Linear Models


The normal linear model is the model that has been most commonly used in response surface applications. The assumptions underlying this model are, of course, that the model errors are normally distributed with constant variance. In many quality improvement applications in industry, however, the quality characteristic or response most naturally follows a probability
distribution other than the normal. Consider, for example, a quality improvement program at a plastics manufacturer focused on reducing the number of surface defects on injection-molded parts. The response in this case is the defect count per part, which most naturally follows a Poisson distribution, where the variance is not constant but is instead equal to the mean. Consider also applications in the field of reliability, in which the equipment's time to failure is the quality response under study. Again, the most natural error distribution is not the normal, but instead the exponential or gamma, both of which have nonconstant variance structures. These types of problems nicely parallel similar problems that exist in the biomedical field, particularly in the area of dose-response studies and survival analysis.
Regression models based on distributions such as the Poisson, gamma, exponential, and binomial fall into a family of distributions and models known as generalized linear models (GLM). See McCullagh and Nelder (1989) for an excellent text on the subject. In addition, the reader is referred to Myers and Montgomery (1997) for a tutorial on GLM. In fact, all distributions belonging to the exponential family are accommodated by GLM. These models have already been used a great deal in biomedical fields but are just now drawing interest in manufacturing areas. In the past, the approach has been to normalize the response through transformation, so that OLS model parameter estimates could be calculated. Hamada and Nelder (1997) show several examples in which the appropriate transformation either did not exist or produced unsatisfactory results compared to the appropriate GLM model. They also point out that with the progress that has been made in computing in this area, GLM models are now just as easily fit as the OLS model to the transformed data. A few example software packages with GLM capability are GLIM, SAS PROC GENMOD, S-Plus, and ECHIP.
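To make the fitting step concrete, here is a minimal sketch of iteratively reweighted least squares for a Poisson log-linear model of defect counts. This is the standard GLM fitting algorithm rather than any particular package's implementation, and the simulated data are purely illustrative.

    import numpy as np

    def fit_poisson_glm(X, y, n_iter=25):
        # IRLS for a Poisson GLM with log link: each step solves a weighted
        # least-squares problem with weights W = mu and working response z.
        beta = np.zeros(X.shape[1])
        for _ in range(n_iter):
            eta = X @ beta
            mu = np.exp(eta)
            z = eta + (y - mu) / mu      # working (adjusted) response
            XtW = X.T * mu               # X'W with W = diag(mu), since Var(y) = mu
            beta = np.linalg.solve(XtW @ X, XtW @ z)
        return beta

    # Hypothetical defect-count data: intercept plus two coded process factors
    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(40), rng.uniform(-1, 1, 40), rng.uniform(-1, 1, 40)])
    y = rng.poisson(np.exp(X @ np.array([1.0, 0.5, -0.3])))
    print(fit_poisson_glm(X, y))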
It is interesting that some work has been done that provides a connection between generalized linear models and robust parameter design. This relationship between the two fields is extremely important, as it allows the response surface approach to Taguchi's parameter design to be generalized to the clearly non-normal applications discussed previously in this section. Engel and Huele (1996) build a foundation for this important area, and there will certainly be other developments.
The difficulty comes in designing experiments for GLM models. Design optimality criteria become complex, and designs are not simple to construct even in the case of only two design variables. See, for example, Sitter and Torsney (1992) and Atkinson and Haines (1996). One must constantly be aware that even if an optimal design is found, it requires parameter
specifications. As a result, the use of robust or two-stage designs will likely, in the end, be the most practical approach.

3.5. Nonparametric and Semiparametric Response Surface Methods
Consider a response surface problem in which the quality characteristic (response) of interest is expected to behave in a highly nonlinear fashion as a function of a set of process variables. Although the model form is unknown, the model structure is of less importance than the ability to locate the process conditions that result in the optimum response value. The primary interest is in prediction of the response and understanding the general nature of the response surface. Additionally, in many of these kinds of problems the ranges of the design variables are wider than in traditional RSM, in which local approximations are sought.
In the problem above, greater model flexibility is required than can be achieved with a low-order polynomial model. Nonparametric and semiparametric regression models can be combined with standard experimental design tools to provide a more flexible approach to the optimization of complex problems. Some of the nonparametric modeling methods that may be considered are thin-plate spline models, Gaussian stochastic process models, neural networks, generalized additive models (GAMs), and multivariate adaptive regression splines (MARS). The reader is referred to Haaland et al. (1994) for a brief description of each model type. Vining and Bohn (1996) introduced a semiparametric as well as a nonparametric approach to mean and variance modeling. The semiparametric strategy involved the use of a nonparametric method to obtain variance estimates, which then became inputs to modeling the response mean via weighted least squares. As an alternative approach they suggested utilizing a nonparametric method for modeling the response mean as well as the variance.
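A rough sketch of the semiparametric strategy might look as follows: smooth squared OLS residuals nonparametrically to estimate the variance at each design point, then reweight the mean fit. The Gaussian-kernel smoother is only one possible choice, and the fragment illustrates the general idea rather than Vining and Bohn's exact algorithm; all names are ours.

    import numpy as np

    def kernel_smooth(x, targets, values, h=0.5):
        # Nadaraya-Watson smoother with a Gaussian kernel (bandwidth h)
        w = np.exp(-0.5 * ((x[:, None] - targets[None, :]) / h) ** 2)
        return (w * values[None, :]).sum(axis=1) / w.sum(axis=1)

    def semiparametric_fit(x, X, y, h=0.5):
        # Step 1: smooth squared OLS residuals into nonparametric variance estimates.
        # Step 2: weighted least squares for the mean with weights 1 / var_hat.
        beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
        resid2 = (y - X @ beta_ols) ** 2
        var_hat = kernel_smooth(x, x, resid2, h)
        XtW = X.T * (1.0 / var_hat)
        beta_wls = np.linalg.solve(XtW @ X, XtW @ y)
        return beta_wls, var_hat

Here x holds a single design variable for the smoother, while X is the polynomial model matrix for the mean; a multivariate kernel would replace kernel_smooth when several process variables are active.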
Haaland et al. (1996) point out that the experimental designs used for nonparametric response surface methods can include some of the traditional designs. For example, one may execute a series of fractional factorials followed by a central composite design, then develop a global model using a nonparametric method. An alternative to this design approach is to execute a single space-filling design, which covers the entire region of operability in one large experiment. This type of design is not based on any model form but instead contains points that are spread out uniformly (in some sense) over the experimental region. The intent is that no point in the experimental region will be very far from a design point. Space-filling designs have primarily been used in computer experiments but have also been applied in physical experiments in the pharmaceutical and biotechnology industries.
See Haaland et al. (1994) for references. Among the space-filling designs is a class of distance-based design criteria that focus on selection of a set of design points that have adequate coverage and spread over the experimental (or operability) region. Two software packages that will construct distance-based designs are SAS PROC OPTEX and Design-Expert.
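The flavor of a distance-based criterion is easy to convey in a few lines: the greedy maximin sketch below picks, at each step, the candidate point whose nearest design point is farthest away, which tends to spread points over the region. It illustrates the coverage idea only and is not the algorithm used by either package.

    import numpy as np

    def greedy_maximin(candidates, n_points):
        # Greedy maximin space-filling design: start from the first candidate,
        # then repeatedly add the point farthest from the current design.
        design = [0]
        for _ in range(n_points - 1):
            d = np.linalg.norm(
                candidates[:, None, :] - candidates[design][None, :, :], axis=2)
            nearest = d.min(axis=1)      # distance to the closest chosen point
            design.append(int(nearest.argmax()))
        return candidates[design]

    # Example: pick 8 spread-out points from a 21 x 21 grid on [-1, 1]^2
    g = np.linspace(-1, 1, 21)
    grid = np.array([[a, b] for a in g for b in g])
    print(greedy_maximin(grid, 8))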

3.6. Hard-to-Change or Hard-to-Control Design Variables


In the design and analysis of industrial experiments, one often encounters variables that are hard to change or hard to control. Consider the following example. A product engineer for a plastics manufacturer is conducting an experiment to determine the effect of extrusion conditions on various physical properties of the resulting plastic pellets. The three independent variables to be studied are screw design, screw speed, and extrusion rate. Minimal screw design changes can occur during the experiment, since each change requires costly line downtime. For this reason, screw design is referred to as a "hard-to-change" variable. Also, since screw designs vary between plant sites, the product engineer has no control over which screw design will ultimately be used at each site. For this reason, screw design is also labeled a "hard-to-control" variable.
This issue has been emphasized in recent years due to the important role of noise variables that are hard to control. Box and Jones (1992) investigated the use of split-plot designs as an alternative to Taguchi's crossed arrays for more efficiently studying noise and control variables. Lucas and Ju (1992) pointed out that the designs for these situations are often not completely randomized but are rather quite like a split plot, and yet we analyze them incorrectly as CRDs.
Strictly speaking, the hard-to-control variables are whole-plot variables with levels that are randomly assigned to larger whole-plot experimental units (EUs). The appropriate levels of the easier-to-control variables are randomly assigned to smaller experimental units within each whole plot (thus making them subplot variables). As discussed by Letsinger et al. (1996), this birandomization structure leads to complications in analysis, since the error assumptions associated with the basic response surface model [i.e., all ε_i ~ N(0, σ²)] are no longer valid. Let σ_δ² be the whole-plot error variance and σ_ε² the subplot error variance resulting from the first and second randomization, respectively. The model and error assumptions then become

y = Xβ + δ + ε

where

δ + ε ~ N(0, V)

and

V = σ_δ² J + σ_ε² I

Assuming that there are j whole plots, then J is a block-diagonal matrix with nonzero blocks of the form 1_{b_i}1′_{b_i} (a b_i × b_i matrix of ones), where b_i is the number of observations in the ith whole plot, i = 1, 2, ..., j. Note that while observations belonging to different whole-plot EUs are independent, the b_i observations within a given whole plot are correlated.
Practitioners may be tempted to ignore the birandomization error structure, analyzing the data as if they came from a completely randomized design (CRD). The analysis of a split-plot design as a CRD, however, can lead to erroneously concluding that whole-plot factors are significant when in fact they are not, while at the same time erroneously eliminating from the model significant subplot terms, including whole-plot × subplot interactions. Unlike model estimation for the CRD, the error variances play a major role in the estimation of coefficients in the birandomization model. Under the assumption of normal errors, the maximum likelihood estimate (MLE) of the model is now obtained through the generalized least squares (GLS) estimating equations

b = (X′V⁻¹X)⁻¹X′V⁻¹y

and

Var(b) = (X′V⁻¹X)⁻¹

Note that both estimating equations depend on σ_δ² and σ_ε² through the matrix V; therefore proper estimation of these error variances becomes a priority.
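Given values for the two variance components, the GLS computation itself is direct. The sketch below builds V for known whole-plot sizes and evaluates the two estimating equations above; the variance components are treated as known here, which is precisely the estimation problem the following paragraphs address.

    import numpy as np
    from scipy.linalg import block_diag

    def split_plot_gls(X, y, wp_sizes, sigma2_delta, sigma2_eps):
        # GLS for the birandomization model y = X beta + delta + eps, with
        # V = sigma2_delta * J + sigma2_eps * I and J block-diagonal in ones-blocks.
        J = block_diag(*[np.ones((b, b)) for b in wp_sizes])
        V = sigma2_delta * J + sigma2_eps * np.eye(sum(wp_sizes))
        Vinv = np.linalg.inv(V)
        M = X.T @ Vinv @ X
        b = np.linalg.solve(M, X.T @ Vinv @ y)   # b = (X'V^-1 X)^-1 X'V^-1 y
        var_b = np.linalg.inv(M)                 # Var(b) = (X'V^-1 X)^-1
        return b, var_b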
Appropriateness of various model and error estimation methods depends on the structure of the birandomization design (BRD). The general class of BRDs is divided into two subclasses: the crossed and the noncrossed. The distinguishing characteristic is that in the case of the crossed BRD, subplot conditions (i.e., factor level combinations) are identical across whole plots. This is the familiar split-plot design, which may result from restricted randomization of a 2^k, 3^k, or mixed-level factorial design. In the case of the noncrossed BRD, each whole plot may have a different number of subplot EUs as well as different factor combinations. Such a design could result from restricted randomization of a 2^(k-p) fractional factorial design or a second-order design such as the central composite design (CCD) or Box-Behnken design.
For the crossed BRD, Letsinger et al. (1996) show that GLS = OLS under certain model conditions, and therefore error variance knowledge is not essential for model estimation. Model editing, however, does depend on the availability of estimates of σ_δ² and σ_ε². One approach to estimating these variances makes use of whole-plot and subplot lack of fit. See Letsinger et al. (1996) for details.

In general, model estimation and editing are more complex for the noncrossed BRD. It is interesting to point out, however, that when the model is first-order, parameter estimation can be accomplished using the equivalency GLS = OLS (as in the crossed case). Once again, model lack of fit can be used to develop estimators for the error variances, although the procedure is more complex than that for the crossed BRD. Both estimation and editing of a second-order model, however, depend on estimates of σ_δ² and σ_ε² through the matrix V. Three competing methods are mentioned here: OLS, iterated reweighted least squares (IRLS), and restricted maximum likelihood (REML).
One can argue that in some cases OLS is an acceptable method, even though it ignores the dependence among observations within each whole plot of the BRD. In fact, OLS provides an unbiased estimator of β. Also, for designs that provide little or no lack-of-fit information (for estimation of σ_δ² and σ_ε²), the researcher may be better served by not trying to estimate V than by introducing more variability into the analysis. The IRLS method begins with an initial OLS estimate of β, then uses an iterative procedure for estimating σ_δ², σ_ε², and β until convergence is reached in β̂. The REML method, first developed by Anderson and Bancroft (1952) and Russell and Bradley (1958), is similar to MLE except that it uses the likelihood of a transformation of the response, y. Refer to Searle et al. (1992) for a discussion of REML and its relationship to MLE. The PROC MIXED procedure in SAS (1992) can be adapted to calculate REML estimators. Letsinger et al. (1996) give details on the use of PROC MIXED for the analysis of a BRD.
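For readers working outside SAS, the same REML computation is available in other environments: a random intercept for whole plots reproduces the compound-symmetric V discussed above. The fragment below uses Python's statsmodels as a stand-in for PROC MIXED; the data file and column names are hypothetical.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical data frame: response y, fixed factors x1 and x2, and a
    # column 'wp' identifying the whole plot each observation belongs to.
    df = pd.read_csv("extrusion_experiment.csv")

    # A random whole-plot intercept gives V = sigma2_delta * J + sigma2_eps * I.
    model = smf.mixedlm("y ~ x1 + x2 + x1:x2", data=df, groups=df["wp"])
    fit = model.fit(reml=True)   # REML estimates of both variance components
    print(fit.summary())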
The recent reminder that many RSM problems are accompanied by designs that are not completely randomized will hopefully produce new and useful tools for the practitioner. In that regard it is of great interest to note the similarity between the split-plot RSM problem (as far as analysis is concerned) and the approach taken with generalized estimating equations, which find applications in the biostatistical and biomedical fields. The analysis is very similar, though in the longitudinal data applications there generally is no designed experiment. Liang and Zeger (1986) and others extend this work to generalized linear models and indeed assume various correlation structures rather than the exchangeable correlation structure induced by the approach discussed above. RSM practitioners can benefit greatly by borrowing from their colleagues in other fields.

4. CONCLUSION

Response surface methodology is growing. More statistical researchers are getting involved, dealing with a wider variety of complex problems. RSM will always play a large role in quality improvement. Much more development work is needed, however, to ensure that the methods are flexible enough to meet the challenges presented by fields other than the traditional areas of application. In addition, strong communication is needed to solidify the growing interest of practitioners in the biological and biomedical fields.

REFERENCES

Abdelbasit KM, Plackett RL. (1983). Experimental design for binary data. J Am Stat Assoc 78:90-98.
Anderson RL, Bancroft TL. (1952). Statistical Theory in Research. New York: McGraw-Hill.
Atkinson AC, Haines LM. (1996). Designs for nonlinear and generalized linear models. In: Ghosh S, Rao CR, eds. Handbook of Statistics, vol 13. Amsterdam: Elsevier, pp 437-475.
Box GEP, Jones S. (1992). Split-plot designs for robust product experimentation. J Appl Stat 19:3-26.
Chaloner K, Verdinelli I. (1995). Bayesian experimental design: A review. Stat Sci 10:273-304.
DuMouchel W, Jones B. (1994). A simple Bayesian modification of D-optimal designs to reduce dependence on an assumed model. Technometrics 36:37-47.
Engel J, Huele AF. (1996). A generalized linear modeling approach to robust design. Technometrics 38:365-373.
Giovannitti-Jensen A, Myers RH. (1989). Graphical assessment of the prediction capability of response surface designs. Technometrics 31:159-171.
Haaland PD, McMillan N, Nychka D, Welch W. (1994). Analysis of space-filling designs. Comput Sci Stat 26:111-120.
Haaland PD, Clarke RA, O'Connell MA, Nychka DW. (1996). Nonparametric response surface methods. Paper presented at 1996 ASA Meeting, Chicago, IL.
Hamada M, Nelder JA. (1997). Generalized linear models for quality improvement experiments. J Qual Technol 29:292-308.
Khattree R. (1996). Robust parameter design: A response surface approach. J Qual Technol 28:187-198.
Kiefer J. (1959). Optimum experimental designs (with discussion). J Roy Stat Soc Ser B 21:272-319.
Letsinger JD, Myers RH, Lentner M. (1996). Response surface methods for birandomization structures. J Qual Technol 28:381-397.
Letsinger WC. (1995). Optimal one- and two-stage designs for the logistic regression model. Dissertation, Virginia Polytechnic Institute and State University.
Liang KY, Zeger SL. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73(1):13-22.
Lin D, Tu W. (1995). Dual response surface optimization. J Qual Technol 27:34-39.
Lucas JM. (1994). How to achieve a robust process using response surface methodology. J Qual Technol 26:248-260.
Lucas JM, Ju HL. (1992). Split plotting and randomization in industrial experiments. ASQC Quality Congress Transactions 27:34-39.
McCullagh P, Nelder JA. (1989). Generalized Linear Models. 2nd ed. New York: Chapman and Hall.
Minkin S. (1987). Optimal designs for binary data. J Am Stat Assoc 82:1098-1103.
Montgomery DC. (1997). Design and Analysis of Experiments. 4th ed. New York: Wiley.
Myers RH, Montgomery DC. (1995). Response Surface Methodology: Process and Product Optimization Using Designed Experiments. New York: Wiley.
Myers RH, Montgomery DC. (1997). A tutorial on generalized linear models. J Qual Technol 29:274-291.
Myers RH, Khuri AI, Vining GG. (1992a). Response surface alternatives to the Taguchi robust parameter design approach. Am Stat 46:131-139.
Myers RH, Vining GG, Giovannitti-Jensen A, Myers SL. (1992b). Variance dispersion properties of second-order response surface designs. J Qual Technol 24:1-11.
Myers RH, Kim Y, Griffiths KL. (1997). Response surface methods and the use of noise variables. J Qual Technol 29:429-440.
Myers WR, Myers RH, Carter WH Jr, White KL Jr. (1996). Two-stage designs for the logistic regression model in single-agent bioassays. J Biopharm Stat, April issue.
Neff AR, Myers RH, Ye K. (1997). Bayesian two-stage designs under model uncertainty. VPI & SU Tech Rep 97-33.
Rozum MA. (1990). Effective design augmentation for prediction. PhD thesis, Virginia Tech.
Rozum MA, Myers RH. (1991). Variance dispersion graphs for cuboidal regions. Paper presented at ASA Meeting, Atlanta, GA.
Russell TS, Bradley RA. (1958). One-way variances in the two-way classification. Biometrika 45:111-129.
SAS (1992). Tech Rep P-229. SAS/STAT Software, Release 6.07. Cary, NC.
Searle SR, Casella G, McCulloch CE. (1992). Variance Components. New York: Wiley.
Sitter RS. (1992). Robust designs for binary data. Biometrics 48:1145-1155.
Sitter RS, Torsney B. (1992). D-optimal designs for generalized linear models. In: Kitsos CP, Muller WG, eds. Advances in Model-Oriented Data Analysis. Heidelberg: Physica-Verlag, pp 87-102.
Taguchi G, Wu Y. (1980). Introduction to Off-Line Quality Control. Central Japan Quality Control Association. (Available from American Supplier Institute, Dearborn, MI.)
Vining GG, Bohn L. (1996). Response surfaces for the mean and the process variance using a nonparametric approach. J Qual Technol 30:282-291.
Vining GG, Myers RH. (1990). Combining Taguchi and response surface philosophies: A dual response approach. J Qual Technol 22:38-45.
Index

Acceptance region, 196
Additive disturbance, 87
Admissible for 2¹ design, 350
Affine-linear transformation, 341
Affinity operator, 444
Allowance A, 4
Alternative moment matrix, 293 (see also moment matrix)
Analysis of covariance (ANOCOVA), 377
Analysis of variance (ANOVA), 329, 376, 395
Approximate designs, 343
ARIMA model, 145
Autocorrelated data, 145, 224
Autocorrelation, 224
Autocorrelation function, 142
  sample, 144
Autoregressive model:
  first-order, 146
  second-order, 147
Average loss by defectives, 14
Average number of defective units, 11
Average number of observations to signal (ANOS), 121
Average number of samples to signal (ANSS), 125
Average problem occurrence interval, 8
Average run length (ARL), 100, 121, 175, 178, 195
  for the residual chart, 148
  performance, 152
Average sample number (ASN), 131
Average time to signal (ATS), 100, 125
  steady-state (SSATS), 126
Bayes D-D optimality procedure, 473
Bayesian design, 469
Behavior:
  closed-loop, 81
  open-loop, 81
Best linear unbiased estimator (BLUE), 414
Biased drift parameter estimate, 94
Birandomization structure, 476, 477
Block interaction model, 396
Box-Behnken design, 465
B-spline, 340
Business excellence:
  six principles for, 48
  success criteria for, 50
Candidate factors, 30
Capability of the process, 24


Capability potential, 2
Capability performance, 242
Capability indices, 242, 246
Central composite design (CCD), 294, 465, 477
  fully replicated, 300
Changepoint analysis, 403
  two-way, 406
Changepoint model, 399
  two-way, 401
  one-way, 401
Chart:
  cause-selecting Shewhart, 163-164
  control, 117, 139
  cause-selecting T², 165-166
  moving centerline EWMA, 151 (see also EWMA)
  multivariate T², 165
  T², 170
  two kinds of, 163
  unweighted batch means (UBM), 156
  CUSUM, 119-121 (see also CUSUM):
    Bernoulli CUSUM, 120
      designing, 122
      properties of, 121
    binomial CUSUM, 125
      designing, 126
  exponentially weighted moving averages (EWMA), 189
    omnibus, 194
    EWMA bull's-eye, 193, 198, 206
    multivariate EWMA T², 194, 204
  multivariate Shewhart T², 203
  p, 118
  Shewhart bull's-eye, 193, 205
  SPRT, 120, 129 (see also SPRT):
    designing, 133
    properties of, 131
  two-sided CUSUM mean, 196
  two-sided CUSUM variance, 196
  weighted batch means (WBM), 153-154
  x̄, 180
Classification, 257, 259
Classification and clustering methods, 259
Columnwise-pairwise exchange algorithm, 317
Combined CUSUM scheme with a rectangular acceptance region (CC), 196
Comparing treatment, 401
Compensatory variable, 94
Completely randomized design (CRD), 477
Computer-generated designs, 294, 464
Computer-generated contour surface, 413
Computer-aided design (CAD), 321
Confidence intervals procedure, 211
Confidence intervals of process capability indices, 279
Conditional T², 228
Constrained confidence region, 421
Continuous improvement cycle, 41
Continuous improvement process, 72
Contour plot, 412
Control factors, 359, 374, 458
Control × noise interaction, 459
Control of deterministic trend, 90
Control of random walk with drift, 91
Corrected diffusion (CD), 122
Cumulative chi-square method, 402
Cumulative sum (CUSUM), 189
  Bernoulli CUSUM chart, 120
  binomial CUSUM chart, 125
  combined CUSUM scheme with a rectangular acceptance region, 196
  chart, 119-121
  discrete, 181
  two-sided, 181, 185
  two-sided CUSUM mean chart, 196
  two-sided CUSUM variance chart, 196
  one-sided, 183
Customer requirements, 269
Customer satisfaction, 36, 40, 109
  index, 23

Degree of acceptability, 412
Discriminant analysis, 259
Decision rules, 217
Design factors, 360
Desirability, 415
  function approach (DFA), 415
  optimization methodology, 417
  transformation, 415-416
Deterministic drift model, 86
Deterministic trend model, 87, 95, 98
DETMAX algorithm, 294
Diagnosis and adjustment, 8
Dispersion effect, 360, 363, 377
Dispersion effect modeling, 360
D-optimality based criterion, 293
D-optimal design, 352, 468
  Bayesian, 471
  twelve-run, 466
D-optimal minimum support design, 354
Dose-response experiment, 400
Doubly cumulative chi-square statistic, 404
Diagnosis theory with two kinds of quality (DTTQ), 167
Dual response approach, 291, 420
Dual response surface approach, 459
Dynamic experiments, 323
Dynamic characteristics, 374
Dynamic S/N ratio, 374
Dynamic system, 391
Effect sparsity, 306
Efficient design strategy, 309
Elliptical, 196
Employee satisfaction indices (ESI), 24
Employees' ideality, 64
Environmental factors, 360
Environmental interaction effect:
  design by, 360
Environmental variables, 359
Estimated generalized least squares (EGLS) estimate, 414
European Quality Award (EQA), 51
EWMA, 149
  EWMA bull's-eye chart, 193, 198, 206
  moving centerline EWMA control chart, 157
  multivariate EWMA T², 194, 204
  omnibus, 194
  prediction error, 150
Excellent leadership profile (ELP), 60, 61
Functional algorithm, 218
Fuzzy modeling approach, 421
Fuzzy multiobjective optimization methodology, 421
Generalized distance approach (GDA), 417
Generalized interaction, 397
Generalized least squares (GLS), 477
Generalized linear model (GLM), 388, 406, 474
Generalized reduced gradient (GRG) algorithm, 420
Generic process capability index, 274
Group screening designs, 307
Hadamard matrices, 307
Hard-to-change variable, 476
Hard-to-control variable, 476
Hybrid design, 466
Improvement:
  circle, 22
  continuous, 38
  process, 36
Information matrix, 342
  expected, 293
Interaction, 395
Isotonic inference, 407
Iterated reweighted least squares (IRLS), 478

JIS (Japanese Industrial Standards), 6
Joint models for the mean and the dispersion, 388
Knots, 339
Large fraction of defectives, 13
Leadership model, 50
  excellent, 56-57
Leadership styles, 52
Likelihood function, 113
Linear multiresponse model, 413
Linkage algorithm, 220
Link function for the mean, 388
Location effects, 363
Loss function, 3
Lower control limit, 118
Lower specification limit (LSL), 270
Manager tetrahedron concept, 441
Maximum likelihood estimate, 113
Maximum likelihood method, 112
Marked point process, 81
Markov chain, 175
  finite, 176
  finite and ergodic, 175
Mean absolute deviation (MAD), 152
Mean and variance modeling, 458
Mean and variance response surface, 460
Mean-variance plot, 376
Measurement, 39
  of quality, 21
Measuring equipment, 7
Minimum mean square error (MMSE) controller, 89
Minimum variance controller, 89
Mixed model, first-order, 147
Monitoring process capability, 283
Model-based approach, 145
Model-free approach, 145
Model-free process-monitoring procedure, 156
Moment matrix, 293, 342 (see also information matrix):
  alternative, 293
Moving average model:
  first-order, 147
  first-order integrated, 148
  autoregressive integrated, 148
Multi-index system, 161
Multi-index production line, 162
Multioperation production line, 161
Multiple comparison of treatments, 405
Multiple mean-variance plot (MMVP), 376, 381
Multiresponse capability indices, 243
Multiresponse experiment, 412
Multiresponse optimization, 412
Multiresponse processes, 242
  q-dimensional, 249
Multivariate Bayesian capability index, 246
Multivariate capability indices, 244
Multivariate control chart:
  nonparametric, 215
Multivariate control procedure for autocorrelated processes, 227
Multivariate processes, 224
Multivariate quality control procedure, 210
  distribution-free, 214
Multivariate stepwise diagnosis, 167
Noise factor, 374, 458
Nonparametric procedure, 215
Normal probability plots, 329
Notz design, 294
  fully replicated, 300
Numerically optimal design, 344
Optimal B-spline regression designs, 349
Optimality criteria in designs, 311
Optimum diagnosis interval, 9
Optimum measuring interval, 16
Optimum modifying quantity, 16
Ordinary least squares (OLS), 415

Orthogonal decomposition of the T² value, 227
Out-of-tolerance, 2
Parameter design, 373
Pattern recognition, 257
Performance measures (PMs), 387
Performance measure modeling (PMM), 387
PDCA-leadership cycle, 70
People-based management, 37
Piecewise polynomial regression, 339
Plackett and Burman design, 307, 313
Plan-Do-Check-Act cycle of Deming, 323
Plan-Do-Check-Action (PDCA), 67
Plan-Do-Study-Act cycle, 370
Polynomial splines, 339
Potential terms, 470
Prevention, 39
Preventive maintenance system, 10
Primary terms, 470
Principal component capability index, 254
Process variable with cycle, 225
Process capability, 243
  index, 2, 269, 270, 273
Process change, 80
  in EPC models, 82
Process control, 6, 243
  algorithmic statistical (ASPC), 79
  automatic (APC), 77
  engineering (EPC), 77
  run-to-run, 79
  statistical (SPC), 77
    generalized, 79
Process decay, 235
Process performance, 269, 273
Process variability, 291
Process variance, 291
  model for, 292
Process viability, 247
Product development, 323
Product development process, 324
Product service, 435
Product supply, 435
Products:
  American, 1
  Japanese, 1
Prototype experiments, 322
Prototype tests, 323
Prototype experiments, 325
  analysis of, 328
Q optimality, 469
Quadratic loss function, 277
Quality:
  total, 162
  partial, 162
  two kinds of, 162
Quality control:
  cost, 8
  on-line, 6
Quality dimension, 439
  external, 439
  internal, 440
Quality function deployment (QFD), 56
Quality improvement, 437, 457
  experiments, 387
Quasi-likelihood (QL), 389
  extended quasi-likelihood (EQL), 389
  Cox-Reid adjusted profile EQL, 390
Queuing model:
  analytical, 445
  in quality service, 442
  retrial, 447
  with time-dependent arrival process, 445
Queuing system, 435
Random walk model, 96
Reliability, 107
  analysis, 110
Renewal, 81
Replicated axial design, 300
Replicated factorial design, 294, 300
Replicated 3/4 design, 300
Replicated full factorial, 300

Response function modeling (RFM), 388
Response surface methods:
  nonparametric, 475
  semiparametric, 475
Response surface methodology (RSM), 411, 457
Restricted maximum likelihood (REML) technique, 390, 478
Robust design experiments, 323
  planning of, 369
  engineering, 359
Robustness, 458
  improvement, 360
Robust parameter design, 373, 458
Robust parametric design, 388
Satisfaction and loyalty model, 25
Satisfaction process, 29
Sensitivity-standard deviation (SS) plot, 376
Sequential probability ratio test (SPRT), 120
  critical inequality of the, 131
Service delivery process, 435
Shift in drift parameters, 86
Shift in trend parameter, 86
Short-term leader, 20
Short-term leadership, 20
Signal factor, 374
Signal-to-noise (S/N) ratio, 374
Simultaneous optimization, 421
Space-filling design, 475
SPC-EPC integration, 78
Spline model, 339
Spline regression model, 342
Split-plot designs, 368
Spread ratio, 245
Square-well loss function, 276
Stage decay process, 235
Static characteristics (static S/N ratio), 374
Stationary transition probability, 177
Statistical process control and diagnosis (SPCD), 166
Statistical understanding, 42
Supersaturated designs, 307
  examples of, 313
  computer construction of, 317
Supervised learning, 258
Systematic supersaturated designs, 307
Taguchi's philosophy, 277
Teamwork, 40
Total quality management (TQM), 19, 35, 45
  European model for, 47
  5 principles of TQM, 46
  pyramid, 46
  pyramid principles of, 37
Total quality queue management, 451
Transition matrix, 177
Two-stage analysis method for experiments, 32
Two-stage design, 469
Two-stage D-optimal design (D-D optimality), 469
Two-way table analysis, 397
Unconditional T², 227
Uniform decay process, 226, 230
  example of, 230
Unsupervised learning, 258
Upper specification limit (USL), 270
Variance dispersion graph (VDG), 465
Variance function, 375, 389
Variation, 4
Viability index, 247, 248, 249
Viable, 247
Viable bivariate process, 249
Vital few and trivial many, 363
Warranty data, 109
White noise, 81
Whole-plot variables, 476

Whole-plot experimental units, 476
x̄ & R (or s) control to monitor process capability, 284
Zellner's seemingly unrelated regression (SUR) estimate, 414
