Multivariate Process Control Chart For Controlling The False Discovery Rate
http://dx.doi.org/10.7232/iems.2012.11.4.385
2012 KIIE
1. INTRODUCTION
Traditionally, quality has been an essential part of manufacturing in various industries, such as the chemical, semiconductor, automobile, computer, and cell phone industries. These days, the importance of quality is also emphasized in service industries such as banking, telecommunications, and health care. Customers' demand for better quality products is growing stronger, especially since they can now share knowledge about quality through the Internet and social networks. Therefore, quality improvement is one of the most important aspects of business. Montgomery (2007) defines quality improvement as the reduction of variability in processes and products. There are two causes of variability in processes. The first is chance cause, which derives from random effects such as weather conditions and is considered natural. The other is assignable cause, such as a failure in machinery or faulty raw materials. Assignable cause can be controlled
advantages over the traditional control charts. First, a p-value approach offers better graphical displays of the performance of the process and can incorporate more complex control procedures (Benjamini and Kling, 1999). Li et al. (2012) showed that, using univariate cumulative sum (CUSUM) charts, we can determine how strong the signal is and how stably the process performs at a given time. If we adopt a p-value approach, the control chart can be regarded as a sequence of single hypothesis tests. Therefore, if we can establish the distribution of the plotted statistics, we can control quality more easily by setting only the type I error (Lee and Jun, 2010, 2012). Finally, we can apply a multiple comparison procedure to control charts by testing the single hypotheses simultaneously (Benjamini and Kling, 1999).
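To make the p-value view of a control chart concrete, the following minimal sketch treats each subgroup of a univariate mean chart as a single two-sided hypothesis test. It assumes a known process standard deviation and normally distributed data; the function name `xbar_p_value`, the sample data, and the level 0.0027 (the type I error of conventional 3-sigma limits) are illustrative choices, not taken from the paper.

```python
import math

def xbar_p_value(sample, mu0, sigma):
    """Two-sided p-value of a subgroup mean under H0: mean = mu0 (sigma assumed known)."""
    n = len(sample)
    xbar = sum(sample) / n
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # 2 * (1 - Phi(|z|)) expressed through the complementary error function
    return math.erfc(abs(z) / math.sqrt(2))

ALPHA = 0.0027  # illustrative type I error, equivalent to 3-sigma limits

# Hypothetical subgroups: the second one has a shifted mean
subgroups = [[10.1, 9.8, 10.0, 10.2], [11.6, 11.9, 11.4, 11.7]]
for j, sample in enumerate(subgroups, start=1):
    p = xbar_p_value(sample, mu0=10.0, sigma=1.0)
    print(j, round(p, 4), "signal" if p < ALPHA else "in control")
```

Plotting the p-values instead of the raw statistic means that only the type I error has to be chosen; the control limit is implied by the distribution of the plotted statistic.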
In statistics, the multiple comparisons or multiple testing problem occurs when one considers a set of statistical inferences simultaneously (Miller, 1981). Since testing each hypothesis at its own significance level inflates the chance of a false positive as the number of hypotheses grows, the familywise error rate (FWER) is used in multiple comparison procedures. The FWER is the probability that at least one false positive, or type I error, occurs among all the hypotheses tested. Many procedures to control the FWER have been proposed, such as the Bonferroni, Sidak, Tukey's, Holm's step-down, and Hochberg's step-up procedures. However, these procedures are not widely used because they give increasingly conservative results as the number of hypotheses increases, which reduces the utility of testing. To overcome this weakness of the FWER, Benjamini and Hochberg (1995) proposed using the false discovery rate (FDR). The FDR is the expected proportion of false positives among all the rejected hypotheses. They also proposed a procedure for controlling the FDR.
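The step-up procedure of Benjamini and Hochberg (1995) can be sketched as follows, next to a Bonferroni correction for comparison. This is a minimal illustration; the function names and the example p-values are hypothetical and do not come from the paper.

```python
def benjamini_hochberg(p_values, fdr_level):
    """Return a list of booleans, True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= k * fdr_level / m
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr_level / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected

def bonferroni(p_values, alpha):
    """FWER control: reject only p-values at or below alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

# Hypothetical p-values for ten simultaneous tests
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sum(benjamini_hochberg(p_values, 0.05)), "rejections with BH")
print(sum(bonferroni(p_values, 0.05)), "rejections with Bonferroni")
```

Because the rule looks for the largest rank that satisfies the inequality, a hypothesis can be rejected even when its own p-value exceeds its rank threshold, provided a later ordered p-value passes. This is what makes the procedure less conservative than FWER-controlling corrections.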
There has been an effort to apply the FDR to univariate control charts. Lee and Jun (2010, 2012) proposed procedures to control the FDR for univariate $\bar{X}$-charts and exponentially weighted moving average (EWMA) charts. They showed that, by controlling the FDR, $\bar{X}$-charts and EWMA charts perform better than traditional control charts.
Motivated by these results, the objective of this paper is to provide a new multivariate process control chart that controls the FDR. The remainder of this paper is organized as follows. Section 2 interprets multivariate process control in terms of p-values and proposes a new multivariate control scheme for controlling the FDR. Section 3 compares the performance of the new control scheme with that of the conventional chart using numerical experiments. Finally, Section 4 gives the conclusion.
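$$\bar{X}_j = \frac{1}{n}\sum_{i=1}^{n} X_{ij} \tag{1}$$
$$S_j = \frac{1}{n-1}\sum_{i=1}^{n} (X_{ij} - \bar{X}_j)(X_{ij} - \bar{X}_j)^T \tag{2}$$
(3) [equation body missing in the source]
$$T^2 \sim T^2_{q,\,n-1} \tag{4}$$
$$\frac{n-q}{q(n-1)}\,T^2 \sim F_{q,\,n-q} \tag{5}$$
$$T^2 > \frac{q(n-1)}{n-q}\,F_{\alpha,\,q,\,n-q} \tag{6}$$
(7) [equation body missing in the source]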
Vol 11, No 4, December 2012, pp.385-389, 2012 KIIE
If the null hypothesis $\mu_j = \mu_0$ is true, then the statistic $T_j^2$ follows the Hotelling $T^2$ distribution with degrees of freedom (q, n-q). Therefore, each subgroup's p-value is
$$p_j = P\{T^2 > T_j^2 \mid H_0\} = G_{q,\,n-q}\!\left(\frac{n-q}{q(n-1)}\,T_j^2\right) \tag{8}$$
$$p_{(i)} \le q\,i/r, \quad (i = 1, \ldots, r)$$
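The p-value of a single subgroup can be computed without a statistics library in the two-dimensional case (q = 2), because the upper tail of the F distribution with (2, m) degrees of freedom has the closed form (1 + 2x/m)^(-m/2). The sketch below makes three assumptions that are not confirmed by the text above: that T_j^2 is the usual one-sample Hotelling statistic n(mean - mu0)^T S^{-1} (mean - mu0), that G denotes the upper-tail probability of the F distribution, and that the subgroup has at least three observations. The function name and the data are hypothetical.

```python
def hotelling_t2_p_value_2d(subgroup, mu0):
    """Return (T^2, p-value) for one subgroup of 2-dimensional observations."""
    n = len(subgroup)
    q = 2
    # Sample mean vector
    m1 = sum(x[0] for x in subgroup) / n
    m2 = sum(x[1] for x in subgroup) / n
    # Sample covariance matrix with divisor n - 1
    s11 = sum((x[0] - m1) ** 2 for x in subgroup) / (n - 1)
    s22 = sum((x[1] - m2) ** 2 for x in subgroup) / (n - 1)
    s12 = sum((x[0] - m1) * (x[1] - m2) for x in subgroup) / (n - 1)
    det = s11 * s22 - s12 ** 2
    d1, d2 = m1 - mu0[0], m2 - mu0[1]
    # T^2 = n * d^T S^{-1} d, using the explicit inverse of a 2x2 matrix
    t2 = n * (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det
    # Scale to an F(q, n - q) variate
    f_stat = (n - q) / (q * (n - 1)) * t2
    # Upper tail of F(2, n - 2): (1 + 2x / (n - 2)) ** (-(n - 2) / 2)
    p = (1 + 2 * f_stat / (n - q)) ** (-(n - q) / 2)
    return t2, p

# Hypothetical subgroup of five observations centred at the origin
subgroup = [(1, 0), (-1, 0), (0, 1), (0, -1), (0, 0)]
print(hotelling_t2_p_value_2d(subgroup, mu0=(0, 0)))    # T^2 = 0, p = 1
print(hotelling_t2_p_value_2d(subgroup, mu0=(-1, -1)))  # T^2 = 20, p about 0.068
```

For q greater than 2 the tail probability no longer has this simple form, and a numerical F distribution routine would be needed instead.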
3. NUMERICAL EXPERIMENTS
In this section, numerical experiments are performed to compare the BH scheme with the multivariate Shewhart control chart. A theoretical average run length (ARL) for the multivariate Shewhart control chart is compared with a simulated ARL for the BH scheme. Since the ARL of the BH scheme is difficult to compute analytically, a Monte Carlo simulation approach is used. Two- and three-dimensional quality variables are used in the experiments.
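A Monte Carlo estimate of the ARL of a p-value based Shewhart chart can be sketched as below. This is a simplified stand-in, not the paper's exact setup: it assumes individual two-dimensional observations with a known covariance matrix (1 rho; rho 1), so that the plotted statistic is chi-square with 2 degrees of freedom and its p-value is exp(-statistic/2). The function names, the seed, and the number of runs are illustrative, and the estimates are not expected to reproduce the values reported in the tables.

```python
import math
import random

def run_length(shift, rho, alpha, rng):
    """Number of samples drawn until the p-value first falls below alpha."""
    t = 0
    while True:
        t += 1
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Correlated pair with mean (shift, shift) and correlation rho
        x1 = shift + z1
        x2 = shift + rho * z1 + math.sqrt(1 - rho ** 2) * z2
        # Quadratic form x^T Sigma^{-1} x with in-control mean (0, 0)
        stat = (x1 * x1 - 2 * rho * x1 * x2 + x2 * x2) / (1 - rho ** 2)
        p = math.exp(-stat / 2)  # upper tail of chi-square with 2 degrees of freedom
        if p < alpha:
            return t

def simulated_arl(shift, rho, alpha=0.005, runs=2000, seed=1):
    """Average run length over independent simulated runs."""
    rng = random.Random(seed)
    return sum(run_length(shift, rho, alpha, rng) for _ in range(runs)) / runs

print(simulated_arl(0.0, 0.5))  # in control: close to 1 / 0.005 = 200
print(simulated_arl(1.0, 0.5))  # shifted mean: noticeably shorter
```

With independent samples the run length is geometric, so the in-control ARL equals the reciprocal of the significance level; the simulation mainly serves as a check of the p-value computation before moving to schemes whose ARL has no closed form.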
$$\lambda = n(\mu - \mu_0)^T \Sigma^{-1} (\mu - \mu_0) \tag{9}$$
First, the theoretical ARL is compared with the simulated ARL using the p-value approach for the conventional multivariate Shewhart control chart. For two-dimensional quality variables, the mean vector (0, 0)^T and the covariance matrices (1 0.2; 0.2 1), (1 0.5; 0.5 1), and (1 0.8; 0.8 1) are used to compare the theoretical ARL and the simulated ARL. For the theoretical ARL, Eqs. (9) and (10) are used. For the simulated ARL, 10,000 iterations replicated 20 times are performed to compute an average value. Various mean shift sizes (0, 0.5, 1, 1.5, 2, 2.5, 3, 4, 5) are used. If the mean shift size is 0.5, the shifted process has a mean vector of (0.5, 0.5)^T, and the covariance matrix is unchanged. The critical value for the p-value was set at 0.005, which corresponds to an in-control ARL of 1/0.005 = 200. The results of this experiment are provided in Table 1. Table 1 indicates that the simulated results are almost exactly
(10)
Table 1. ARL of multivariate Shewhart control chart for two-dimensional quality variables

Covariance matrix   (1 0.2; 0.2 1)             (1 0.5; 0.5 1)             (1 0.8; 0.8 1)
Mean shift          Simulation   Theoretical   Simulation   Theoretical   Simulation   Theoretical
0                   199.6347     200.0000      199.5738     200.0000      199.6769     200.0000
0.5                 74.7893      74.8447       86.6555      86.4125       96.0968      96.0559
1                   22.0309      22.0235       27.6506      27.7327       33.1505      33.2083
1.5                 8.9513       8.9563        11.5473      11.5426       14.1588      14.1679
2                   4.6744       4.6762        5.9996       6.0001        7.3766       7.3789
2.5                 2.9133       2.9144        3.6920       3.6715        4.4844       4.4709
3                   2.0582       2.0665        2.5402       2.5360        3.0394       3.0371
4                   1.3552       1.3565        1.5612       1.5643        1.7967       1.7932
5                   1.1151       1.1150        1.2119       1.2115        1.3263       1.3258
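The ordering of the columns in Table 1 can be read off from a short calculation. Writing the common mean shift as $\delta$, the correlation as $\rho$, and the noncentrality parameter per unit of subgroup size as $\lambda/n$ (the symbols $\delta$, $\rho$, and $\lambda$ are notation introduced here for illustration), the quadratic form for the shift vector $(\delta, \delta)^T$ is:

```latex
\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}, \qquad
\Sigma^{-1} = \frac{1}{1-\rho^{2}} \begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix},
\\[6pt]
\frac{\lambda}{n}
  = (\delta,\ \delta)\, \Sigma^{-1} \begin{pmatrix} \delta \\ \delta \end{pmatrix}
  = \frac{2\delta^{2}(1-\rho)}{1-\rho^{2}}
  = \frac{2\delta^{2}}{1+\rho},
\\[6pt]
\rho = 0.2:\ \frac{\lambda}{n} \approx 1.667\,\delta^{2}, \qquad
\rho = 0.5:\ \frac{\lambda}{n} \approx 1.333\,\delta^{2}, \qquad
\rho = 0.8:\ \frac{\lambda}{n} \approx 1.111\,\delta^{2}.
```

For a fixed shift size, a larger positive correlation therefore gives a smaller noncentrality parameter, which is consistent with the ARL increasing from the first covariance matrix to the third in every shifted row of Table 1.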
BH scheme is smaller than that of the multivariate Shewhart control chart for all mean shift sizes. Therefore, the BH scheme performs better for two-dimensional quality variables. Also, for the BH scheme, a larger span size results in better performance when the mean shift is small. These tendencies hold across the covariance matrices considered.
3.2 Three-Dimensional Quality Variables
For three-dimensional quality variables, the same procedure is adopted to compare the performance of the BH scheme and the multivariate Shewhart control chart. The mean vector (0, 0, 0)^T and a single covariance matrix (1 0.5 0.5; 0.5 1 0.5; 0.5 0.5 1) are used. Other conditions, such as the number of iterations, number of replications, critical level, and mean shift sizes, are the same as those in the two-dimensional analysis above. Table 4 lists the results of this experiment. The interpretation of these results is also the same as in the case of two-dimensional quality variables. In other words, the BH scheme performs better for three-dimensional quality variables. Also, for the BH scheme, a larger span size results in better performance when the mean shift is small.

Covariance matrix   1st eigenvalue   2nd eigenvalue   Difference between maximum and minimum eigenvalue
(1 0.2; 0.2 1)      1.2              0.8              0.4
(1 0.5; 0.5 1)      1.5              0.5              1.0
(1 0.8; 0.8 1)      1.8              0.2              1.6

Table 3. ARLs of BH scheme and multivariate Shewhart control chart for two-dimensional quality variables

Mean shift   BH scheme (r = 10)   BH scheme (r = 20)   BH scheme (r = 30)   Multivariate Shewhart
0            200.8141             200.5796             201.6201             200.0000
0.5          114.9605             106.2124             98.7953              123.3485
1            50.0415              42.6149              36.5857              57.5694
1.5          23.7864              18.4067              15.2016              30.6586
2            12.8410              9.7025               8.9444               18.6685
2.5          7.8120               6.4639               6.4241               12.5258
3            5.3399               4.9407               4.9288               9.0127
4            3.3348               3.3243               3.3280               5.3909
5            2.5065               2.5162               2.5115               3.6733

Table 4. ARLs of BH scheme and multivariate Shewhart control chart for three-dimensional quality variables

Mean shift   BH scheme (r = 10)   BH scheme (r = 20)   BH scheme (r = 30)   Multivariate Shewhart
0            200.8141             200.5796             201.6201             200.0000
0.5          114.9605             106.2124             98.7953              123.3485
1            50.0415              42.6149              36.5857              57.5694
1.5          23.7864              18.4067              15.2016              30.6586
2            12.8410              9.7025               8.9444               18.6685
2.5          7.8120               6.4639               6.4241               12.5258
3            5.3399               4.9407               4.9288               9.0127
4            3.3348               3.3243               3.3280               5.3909
5            2.5065               2.5162               2.5115               3.6733
4. CONCLUSION
This paper proposed a new multivariate process control scheme that controls the false discovery rate. The BH procedure is incorporated to control the FDR in the sense of multiple hypothesis testing. First, simulation studies showed that the use of p-values in the multivariate Shewhart control chart is appropriate. Finally, it was shown that the proposed control scheme outperforms the conventional chart for two- and three-dimensional quality variables in terms of ARL.
ACKNOWLEDGMENTS
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea, funded by the Ministry of Education, Science and Technology (Project No. 2012-0001665).
REFERENCES
Anderson, T. W. (1958), An Introduction to Multivariate
Statistical Analysis, Wiley, New York, NY.
Benjamini, Y. and Hochberg, Y. (1995), Controlling the
false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal