A Comparative Study of Various Tests For Normality
To cite this article: S. S. Shapiro, M. B. Wilk & Mrs. H. J. Chen (1968) A Comparative Study of Various
Tests for Normality, Journal of the American Statistical Association, 63:324, 1343-1372
A COMPARATIVE STUDY OF VARIOUS TESTS FOR NORMALITY
S. S. SHAPIRO,* M. B. WILK* AND MRS. H. J. CHEN
Computer Applications Inc. and Bell Telephone Laboratories, Inc.
Results are given of an empirical sampling study of the sensitivities
of nine statistical procedures for evaluating the normality of a complete
sample. The nine statistics are W (Shapiro and Wilk, 1965),
$\sqrt{b_1}$ (standard third moment), $b_2$ (standard fourth moment), KS
(Kolmogorov-Smirnov), CM (Cramér-von Mises), WCM (weighted
CM), D (modified KS), CS (chi-squared) and u (Studentized range).
Forty-five alternative distributions in twelve families and five sample
sizes were studied. Results are included on the comparison of the statistical
procedures in relation to groupings of the alternative distributions,
on means and variances of the statistics under the various alternatives,
on dependence of sensitivities on sample size, on approach to
normality as measured by the W statistic within some classes of distribution,
and on the effect of misspecification of parameters on the
Downloaded by [Moskow State Univ Bibliote] at 14:55 04 December 2013
1. INTRODUCTION
1344 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1968
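0. W
1. $\sqrt{b_1}$
4. Cramér-von Mises, Cramér (1928)
   $CM = \int_{-\infty}^{\infty} n\,[F_n(y) - F(y)]^2 \, dF(y)$
   $F_n$ is the empirical distribution function.
6. Durbin, Durbin (1961)
   $D = \max_i \left( i/n - \sum_{j=1}^{i} g_j \right)$, $i = 1, 2, \ldots, n$
   $g_j = (n + 2 - j)(c'_j - c'_{j-1})$, $j = 1, 2, \ldots, n$
   $0 = c'_0 \le c'_1 \le \cdots \le c'_{n+1}$ obtained by ordering
   $c_1 = u_1$, $c_2 = u_2 - u_1$, $\ldots$, $c_{n+1} = 1 - u_n$
   $u_i = F(y_i)$, $i = 1, 2, \ldots, n$
7. Chi-squared (equiprobable cells)
   $CS = \sum_{i=1}^{k} (n_i - n/k)^2 / (n/k)$
   $n_i$ = number of observations in cell $i$, $k$ = number of cells
8. u, David et al. (1954)
   $u = (y_n - y_1)/s$, $s$ = sample standard deviation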
TESTS FOR NORMALITY 1345
number of cells (k) used. For this study, the selected values of k for the various
sample sizes (n) were: (n, k) = (10, 4), (15, 5), (20, 5), (35, 7) and (50, 9).
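As an illustration of how the cell counts above enter the CS procedure, the following is a minimal sketch, not the authors' program. It assumes the usual Pearson form of the chi-squared statistic with k equiprobable cells under a fully specified normal null; the function name `equiprobable_chi_squared` and the use of Python's `statistics.NormalDist` are choices made here for illustration only.

```python
from statistics import NormalDist

# (n, k) pairs quoted in the text: sample size -> number of cells
CELLS_BY_SAMPLE_SIZE = {10: 4, 15: 5, 20: 5, 35: 7, 50: 9}


def equiprobable_chi_squared(sample, mu=0.0, sigma=1.0):
    """Pearson chi-squared statistic with k equiprobable cells under N(mu, sigma^2).

    Assumption: the standard form sum((n_i - n/k)^2 / (n/k)) is used.
    """
    n = len(sample)
    k = CELLS_BY_SAMPLE_SIZE[n]
    null = NormalDist(mu, sigma)
    counts = [0] * k
    for y in sample:
        # F(y) is uniform on (0, 1) under the null, so cell i covers [i/k, (i+1)/k)
        cell = min(int(null.cdf(y) * k), k - 1)
        counts[cell] += 1
    expected = n / k
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    sample = [-1.5, -1.0, -0.5, -0.3, 0.1, 0.3, 0.5, 1.0, 1.5, -0.1]
    print(equiprobable_chi_squared(sample))
```

Because the cells are equiprobable under the null, every cell has the same expected count n/k, which is why only n and k need to be fixed in advance.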
A notation common to all the definitions is that $y_1 \le y_2 \le \cdots \le y_n$ denotes the
ordered observations from a complete sample of size n, $\bar{y}$ is the sample mean,
and $F_n(\,)$ is the empirical distribution function.
The results of the study are summarized in Section 9, which may be read
independently of the more detailed presentations.
2. ALTERNATIVE DISTRIBUTIONS STUDIED
the $\mathrm{Sech}^2$ distribution was studied amongst the logistic family. Because the null
distribution is symmetric, the majority of the alternative distributions were
chosen to be symmetric to provide stringent conditions for comparison of the
methods.
Table 2 also gives the $\sqrt{\beta_1}$ and $\beta_2$ values for the distributions, i.e., the
standardized third and fourth moments. The $\sqrt{\beta_1}$ values range up to 6.62 (for
WE(.5)), while $\beta_2$ values lie between 1.75 (Tu(1, 1.5)) and 113.94 (LN).
Also several misspecified normal distributions were used to study the
effect of small errors in the assumed values of the normal parameters in
testing the simple hypothesis that the distribution was N(0, 1). The alternative
parameters used were: $(\mu, \sigma)$ = (0, 1.2), (0, 1.3), (.15, 1.0), (.18, 1.2), (.195, 1.3),
(.3, 1), (.36, 1.2), (.39, 1.3); note that $\mu/\sigma$ has the values 0, .15 and .30.
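The remark about the ratio of mean to standard deviation can be checked directly. The short sketch below is only an arithmetic check of the parameter pairs listed above; the names `ALTERNATIVES` and `standardized_shifts` are introduced here for illustration.

```python
# (mu, sigma) pairs for the misspecified normal alternatives listed in the text
ALTERNATIVES = [(0, 1.2), (0, 1.3), (.15, 1.0), (.18, 1.2), (.195, 1.3),
                (.3, 1.0), (.36, 1.2), (.39, 1.3)]


def standardized_shifts(pairs):
    """Distinct values of mu / sigma, rounded to absorb floating-point noise."""
    return sorted({round(mu / sigma, 6) for mu, sigma in pairs})


if __name__ == "__main__":
    print(standardized_shifts(ALTERNATIVES))
```

Only three distinct ratios occur, so the eight alternatives vary the scale error at each of three standardized location errors.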
3. DETAILS OF THE SAMPLING STUDY
The results given are based on empirical sampling. Samples from the various
distributions were generated by a system of procedures developed by Fowlkes
(1965), in which the basic input was the Rand Corporation's (1955) normal
and uniform deviates. In generating samples of a given size from a distribution,
reuse of the same deviates was avoided. However, the same deviates were
reused for the differing sizes of samples.
The study involved sample sizes of n = 10, 15, 20, 35 and 50. For convenience,
the null distributions of eight of the nine statistics were obtained by empirical
sampling; for the CS test tabulated chi-squared values were used. For all
statistics except W, the empirical null distribution was based on M = 500 samples
for each sample size; for the W statistic, it was based on M = 5000 for $n \le 20$
and on M = [100,000/n] for $20 < n \le 50$.
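The rule for the number of null samples can be written out as a small worked example. This is a sketch under one stated assumption: the square brackets in [100,000/n] are read as "integer part". The function name `null_sample_count` is hypothetical.

```python
def null_sample_count(n, statistic):
    """Number M of samples behind the empirical null distribution.

    Assumption: [100,000/n] denotes the integer part of 100,000/n.
    Returns None for CS, whose null values were taken from chi-squared tables.
    """
    if statistic == "CS":
        return None
    if statistic != "W":
        return 500
    if n <= 20:
        return 5000
    return 100_000 // n


if __name__ == "__main__":
    for n in (10, 15, 20, 35, 50):
        print(n, null_sample_count(n, "W"), null_sample_count(n, "KS"))
```

Under this reading, the W null distribution rests on 2857 samples at n = 35 and 2000 samples at n = 50, so that roughly 100,000 deviates are consumed at each of the larger sample sizes.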
For the non-null distributions, the empirical c.d.f.s of the various statistics
were based on M=200 samples. The same samples were submitted to each of
the statistics.
For a typical null empirical distribution the output consisted of a listing of
the quantiles corresponding to values of the c.d.f. of p = .005, .01(.01).05(.025)
.15(.05).30(.10).70(.05).85(.025).95(.01).99, .995.
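The compact notation a(step)b for the c.d.f. levels can be expanded mechanically. The following sketch, with the hypothetical helper `expand_grid`, assumes the conventional reading in which a(step)b means "from a to b in increments of step".

```python
def expand_grid(start, segments):
    """Expand 'a(step)b(step)c...' notation into an explicit list of points.

    segments is a list of (step, stop) pairs applied one after another.
    """
    points = [start]
    for step, stop in segments:
        base = points[-1]
        count = round((stop - base) / step)
        # round to 3 decimals to remove floating-point noise in the levels
        points.extend(round(base + step * i, 3) for i in range(1, count + 1))
    return points


SEGMENTS = [(0.01, 0.05), (0.025, 0.15), (0.05, 0.30), (0.10, 0.70),
            (0.05, 0.85), (0.025, 0.95), (0.01, 0.99)]
CDF_LEVELS = [0.005] + expand_grid(0.01, SEGMENTS) + [0.995]

if __name__ == "__main__":
    print(len(CDF_LEVELS), CDF_LEVELS)
```

The grid is finest in the two tails, where significance levels of practical interest lie, and coarsest in the middle of the distribution.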
For a typical non-null run, the empirical null distribution was part of the
input. The output results consisted of a listing giving, for each null quantile
Distribution                           Parameters              $\sqrt{\beta_1}$   $\beta_2$
                                                                         1.80
                                                                         1.85
                                                                         1.93
                                                                         2.00
                                                                         2.14
                                                                         2.40
                                                                         2.36
(2) BI(k)                              k = 4                   0         2.50
                                       k = 8                   0         2.75
                                       k = 12                  0         2.83
                                       k = 20                  0         2.90
(3) Chi-squared $\chi^2(v)$            v = 1                   2.83      15.00
                                       v = 2 (exponential)     2.00      9.00
                                       v = 4                   1.41      6.00
                                       v = 10                  0.89      4.20
(4) Double Chi-Squared ($\beta$)                               0         1.88
                                                               0         2.19
                                                               0         3.00
                                                               0         6.00
                                                               0         8.56
                                                               0         12.26
                                                               0
(5) Ref. Johnson (1949);                                       0.65      2.13
    normal variable, 0 < x < 1                                 0         2.63
                                                               0.73      2.91
                                                               0.28      2.77
(6) Logistic (a, $\beta$)              L                       0         4.2
    $\beta e^{a+\beta x}/(1+e^{a+\beta x})^2$, $-\infty < x < \infty$
(7) Log Normal ($\mu$, $\sigma^2$)     $\mu = 0$, $\sigma^2 = 1$   6.18      113.94
(8) Non-Central Chi-Squared (v, $\lambda$)  v = 1, $\lambda = 16$  0.73      3.72
(9) Poisson ($\lambda$)                $\lambda = 1$           1.00      4.00
                                       $\lambda = 4$           0.50      3.25
                                       $\lambda = 10$          0.32      3.10
(10) Student T(v)                      v = 1 (Cauchy)          0         --
                                       v = 2                   0         --
                                       v = 4                   0         --
                                       v = 10                  0         4.00
(11) Tukey (a, $\lambda$)              a = 1, $\lambda = 0.1$  0         3.[illegible]
     variates defined by transformation  a = 1, $\lambda = 0.2$  0       2.71
     $y = aR^{\lambda} - (1-R)^{\lambda}$ where R is uniform  a = 1, $\lambda = 0.7$  0  1.92
     on the unit interval              a = 1, $\lambda = 1.5$  0         1.75
     Ref. Hastings, et al. (1947)      a = 1, $\lambda = 3.0$  0         2.06
                                       a = 1, $\lambda = 5.0$  0         2.90
                                       a = 1, $\lambda = 10.0$ 0         5.33
(12) Weibull (k, $\lambda$)            $\lambda = 1$, k = 0.5  6.62      87.72
                                       $\lambda = 1$, k = 2.0  0.63      3.25
z(p), the corresponding non-null empirical c.d.f. value. Thus, for example, for
n = 10 with the W statistic used in testing samples from P(1), the results were
z(p)    p       p_p
.746    .005    .163
.781    .01     .233
.806    .02     .422
.820    .03     .460
etc.
tion of times that the significance level p lies below 1% is about .233; i.e., that
the power of a 1% test is about .233.
This output was available for computer plotting, using the Stromberg-Carlson
4020. Thus one could obtain so-called "merit curves," which are plots
of $p_p$ (the non-null c.d.f. value) versus p (the null c.d.f. value) at each value
of z(p). Clearly then, the value of the ordinate, $p_p$, corresponding to an abscissa
of p, is the power of a 100p% test in the usual Neyman-Pearson framework.
Alternatively, if one regards the significance level, p, attained in a test, as a
random variable, then $p_p$ represents the c.d.f. of p.
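The construction of a merit curve from the two sets of simulated statistics can be sketched as follows. This is an illustration only, not the authors' program: it assumes a lower-tail critical region (as is appropriate for W), and the quantile convention in `empirical_quantile` is one of several reasonable choices. Both function names are hypothetical.

```python
import math


def empirical_quantile(sorted_values, p):
    """Smallest order statistic whose empirical c.d.f. value is at least p."""
    m = len(sorted_values)
    index = max(0, min(m - 1, math.ceil(p * m) - 1))
    return sorted_values[index]


def merit_curve(null_stats, alt_stats, levels):
    """Return (p, z(p), p_p) triples.

    z(p) is the empirical null quantile at level p and p_p is the fraction of
    non-null statistics at or below z(p), i.e. the estimated power of a
    lower-tail test at significance level p.
    """
    null_sorted = sorted(null_stats)
    points = []
    for p in levels:
        z_p = empirical_quantile(null_sorted, p)
        p_p = sum(1 for s in alt_stats if s <= z_p) / len(alt_stats)
        points.append((p, z_p, p_p))
    return points


if __name__ == "__main__":
    null_stats = list(range(1, 11))          # toy null statistics
    alt_stats = [0, 1, 1, 2, 5, 9, 12, 0.5]  # toy non-null statistics
    for point in merit_curve(null_stats, alt_stats, [0.1, 0.5]):
        print(point)
```

For a statistic that rejects for large values, the same idea applies with the comparison reversed and the upper null quantile used in place of the lower one.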
Such empirical c.d.f.'s and merit curves were obtained for each statistic,
alternative distribution and sample size as listed above. Only a subset of this
information is included in the paper.¹
4. TYPES OF RESULTS
A subset of merit curves is given in Figures 1.01 to 1.36, each showing the
comparative performance of the nine statistics for a selected alternative dis-
tribution and sample size. The sensitivity of a test statistic is indicated by the
height of the merit curve.
A compilation of the power of a 10% level test is given in Table 3 for each of
the 5 sample sizes and the 44 alternative distributions studied.
A partial summary of the effects of sample size is provided in Figures 2.01
to 2.39, each giving, for a specified alternative distribution, a plot of the power
of a 5% test as a function of log n, with all nine procedures shown in the same
graph. Because the comparative sensitivities of the procedures are roughly the
same at all significance levels, the plots at the 5% level are generally indicative.
Another compilation of results, directed towards approach to normality
within a family of distributions, is provided by Figures 3.01 to 3.06. Each plot
deals with a given family, indexed by a parameter represented as abscissa.
The ordinate is the power of a 5% test using the W statistic.
Results on the effect of misspecification of null distribution parameters for
those statistics which can be used only with simple hypotheses are summarized
in Table 4, the entry being the actual probability of exceeding the null 5%
point.
¹ More details are available on request from the authors.
[Figures 1.01 to 1.36: merit curves of $p_p$ (ordinate, 0 to 1.0) against p (abscissa, 0 to 0.20) for selected alternative distributions and sample sizes. Plotting symbols: W = 0, $\sqrt{b_1}$ = 1, $b_2$ = 2, KS = 3, CM = 4, WCM = 5, D = 6, CS = 7, u = 8.]
TABLE 3
10 PERCENT POWERS IN %, W, $\sqrt{b_1}$ AND $b_2$ TESTS
TEST W $\sqrt{b_1}$ $b_2$
SAMPLE SIZE 10 15 20 35 50 10 15 20 35 50 10 15 20 35 50
ALT. DISTN.
15 28 39 76 96 5 2 0 2 1 20 22 38 80 93
1 16 25 30 66 90
14 17 27 46 82
9 13 20 38 61
13 12 12 22 37
23 30 47 78 96
6 3 1 1 1
3 3 1 1
2 1 2 1 1
5 3 0 3 1
17 11 21 43 59
19 22 33 76 92
18 18 28 58 82
12 15 26 57 69
16 11 15 33 51
22 14 21 26 29
16 21 19 22 42 7 6 8 837 14 11 10 25 29
58 74 93 * * 910 4 4 2 11 8 5 10 14
34 45 51 78 96 8 8 8 4 8 11 7 6 9 10
27 32 39 58 76 10 g 8 11 9 14 6 7 11 7
20 22 24 32 37 10 6 11 10 6 10 5 10 12 8
82 94 99 * 66 89 92 * 45 55 64 80 93
57 82 89 * * 43 71 82 96 * 27 38 44 69 81
33 43 65 84 97 29 50 64 83 95 19 27 35 49 51
24 30 40 56 74 19 35 35 59 71 12 15 25 33 37
18 18 23 61 84 6 5 1 3 2 19 19 28 66 88
14 11 14 27 37 6 3 4 .2 3 16 13 21 37 48
25 27 32 48 49 24 28 35 44 44 18 27 3 54 60
31 44 52 68 77 31 44 50 57 64 2 40 43 75 84
39 50 66 78 85 42 47 54 58 69 3.? 49 60 77 85
10 27 29 57 85 4 3 2 0 1 18 22 41 74 89
44 70 85 * 13 35 34 46 67 29 21 38 49 51
13 513 7 1 1 7 6 6 4 10 3 6 10 10
13 :1 12 12 8 10 16 11 10 9 10 9 g 16 8
12 12 12 14 19 7 9 14 11 17 10 8 14 13 11
L 12 14 16 20 20 16 19 1527 29 12 13 13 31 34
LN 72 88 96 60 83 91 * 46 59 67 89 90
NCX2 (1,16) 37 63 74 95 26 56 61 85 95 22 30 29 41 47
84 99 * * *
:It1
P lo) 24 31 38 57 80
21 16 16 27 24
22 36 37 68 81
9 16 15 23 35
10 12 11 15 17
20 17 16 37 30
10 12 12 17 18
14 8 g 13 10
i1):
42 81 92 99 * 65 76 82 90 94 56 82 88 97 99
41 51 58 80 84 42 51 54 74 75 38 53 58 87 92
T 10) 23 26 27 38 42 27 30 29 43 49 20 22 30 42 60
13 19 18 17 21 12 18 18 22 28 10 13 15 26 28
13 13 12 g 8 12 12 14 13 14 10 8 9 13 13
10 12 15 g 12 1 1 8 9 6 5 10 6 11 15 g
15 16 25 57 78 3 2 0 2 19 16 32 66 81
17 36 44 82 97 3 4 2 2 1 19 29 53 87 98
13 8 14 28 55 5 2 2 1 1 13 12 18 47 60
12 12 14 22 24 1 1 6 8 8 3 g 6 710 2
70 79 90 96 50 49 48 37 43 44 53 61 88 go
9 4 * * * * 76 96 99 * 54 74 8 98 98
18 23 24 40 59 16 23 23 38 51 11 15 12 22 18
TABLE 3. (continued)
10 PERCENT POWER IN %, KS, CM AND WCM TESTS
TEST KS CM WCM
SAMPLE SIZE 10 15 20 35 50 10 15 20 35 50 10 15 20 35 50
ALT. DISTN.
14 11 19 18 26 11 8 19 14 20 14 8 19 18 27
17 14 19 13 9 1 13 12 14 7 17 14 20
i6313 17 18 24
19 13 17 14 20
16 13 13 13 1
18 1 13 10 1 z 17
18
ii ii 15 21
11 13 10 18
12 7 17 14 14
10 10 14 16 22
9 217 13 10
16 12 20 15 21
11 g 17 15 12
17 13 20 18 28
12 6 8 12 10 8 9 11 10 8 12 8 12 8 11
26 48 58 86 17 18 31 57 95 20 18 33 56 *
23 33 38 47 88 16 14 15 21 26 18 13 17 -21 35
10 1 28 42 72 9 8 19 13 22 11 9 19 14 26
8 12 23 20 40 8 10 17 7 14 9 16 20 7 19
41 51 58 * 30 41 62 83 98 42 48 72 93
20 24 33 48 64 23 26 41 61 86
32 30 37 51 6
20 22 26 33 4
15 18 19 22 34
2 15 16 21 30
9 13 15 16
36
25
14 16 23 34 47
13 11 16 15 27
18 14 15 26 27 1 13 14 21 17 12 16 22 30
19 14 12 12 18
13 7 13 10 18
2
1 12 12
12 7 15
9 11
8 14
20
17 12 13 10 14
36 20 37 34 52
12 13 20 28 50 4 11 17 27 40 10 12 23 33 53
23 20 30 42 59 12 15 26 34 50 22 14 32 42 63
11 13 15 18 30 11 14 12 12 22 10 14 15 15 28
20 28 38 46 63 14 21 31 39 57 17 17 56 45 73
9 810 7 9 8 11 15 10 11 8 9 15 9 13
1011 11 8 14 10 6 g 7 '11 13 5 12 6 12
13 10 9 13 14 12 11 11 13 11 13 11 14 12 12
L 9 6 9 9 8 1 0 9 8 9 5 1 3 8 9 9 7
LN 28 47 57 * * 25 41 67 90 99 34 43 81 98 *
NCX2 (1 16) 24 25 32 36 45 18 18 24 33 35 21 20 28 32 52
y1
P .lo)
37 52 75 90
13 26 25 29 50
13 14 15 19 29
18 21 38 69 *
9 9 11 12 17
8 6 12 6 11
20 26 52 88
11 g 18 16 22
10 6 11 7 13
30 47 65 86 95 32 46 71 92 95 98 99 *
15 18 23 24 43 18 19 28 25 63 69 81 92 99
11 9 11 19 22 10 7 9 15 lh 14 7 11 17 20
9 611 6 g 6 811 7 g 8 11 13 9 9
9 7 810 8 7 7 1 2 10 7 9 7 13 10 8
10 12 10 9 11 10 11 12 6 a 11 10 14 6 10
11 8 17 13 16 12 7 18 1 21
15 9 17 16 26
16 17 15 21 27
14 8 10 10 14
~
14 16 '15 15 18
14 8 15 11a 14 7 16 10 15
2
16 15- 16 1 28
9 9 14 14 9 12 8 1 16 17
31 45 63 86 99 2
23 33 5 83 95
58 65 * * * 57 73 96 * 63 77 99 * *
12 14 15 17 21 10 11 14 11 16 11 11 15 12 16
TABLE 3. (continued)
BI 20 261310 22 16 12 13 12
78 * 17 17 21 15 * 16 13 15 13 10
66 83 93 * 41 94 97 99 18 15 19 31 29
31 39 56 88 93 81 43 43 97 96 15 18 12 30 21
14 16 25 47 48 12 23 20 46 69 9 9 13 21 17
10 14 10 12 21 14 11 14 21 28 13 8 12 11 19
14 14.11 18 20 17 15 14 27 31 20 23 42 70 go
15 9 5 15 12 12 16 9 12 13 24 14 19 27 38
23 21 24 33 35 14 17 25 25 35 L6 23 26 42 51
13
30
20 22 45
29 39 55
45
59
19
32
33 35 56 66
55 71 78
41
1
2z 29
31
35 63 76
45 67 75
12 14 11 15 19 12 20 19 19 27 %' 33 46 77 94
27 27 47 77 89 20 25 29 64 * 41 43 68 92 gg
15 9 a 13 9 12 11 6 12 12 14 7 8 10 7
13 8 8 12 8 11 12 10 7 8 10 7 8 11 10
9 10 9 11 10 8 13 8 10 13 14 10 12 9 13
L 13 4 10 14 12 12 12 8 12 10 14 12 11 22 29
LN 57 76 88 * 4 80 94 98 99 * 13 11 14 35 43
NCX2 (1,16) 21 27 32 51 66 19 23 17 42 93 21 13 17 29 28
* * * + * 3 4 4 * * * 36 36 44 48 56
89 * * * 20 12 10 26 * 15 13 16 13 18
50 93 * * 15 15 13 19 23 19 9 8 12 7
5 84 91 9 * 20 46 2 92 7
$0 41 52 25 66
12 10 11 21 20
23 46 54 95 99
1 18 22 31 59
12 20 18 20 32
19 36 24 78 87
17 16 24 36 51
13 8 9 15 12 9 10 9 13 12 13 9 14 22 24
10 12 6 10 13 12 10 9 12 12
12 8 9 9 12 L4
lo la 8 lo
1 12 l2 12 14 5 10 13 17
17 8 8 18 17 12 14 15 17 18 22 69 87
20 43
14 12 13 30 41 13 23 ig 28 29 26 43 61
91 99
10 6 9 14 13 14 14 9 13 14 ig 12 20 52 75
18 18 18 25 22 11 23 16 26 31 13 11 6 9 5
72 77 87 97 99 3' 83 89 99 37 40 46 68 58
89 99 * * 94 97 gg ig 1017 43 44
10 7 11 13 18 11 15 12 13 17 11 12 10 9 14
* 100%
[Figures 3.01 to 3.06: 5% power of the W statistic plotted against the family parameter. Legible panel titles: Fig. 3.01, BE(p, p); Fig. 3.04, $D\chi^2(\beta)$. Abscissa labels, in order: parameter p, parameter k, parameter v, parameter $\beta$, parameter $\lambda$, parameter v.]
Summary discussions based on the results referenced above are given in the
ensuing sections concerning comparisons of test statistics, effects of sample
size, approach to normality and effect of parameter misspecification.
None of the tests showed much sensitivity against the alternatives in this
group. Against the L distribution with n = 50, the $b_2$ test achieved the highest
5% power, namely, 24%; in the same case, u had 20%; $\sqrt{b_1}$, 18%; W, 16%;
and the rest gave 7% or less (Figure 1.24). Against L for n = 35, $\sqrt{b_1}$ had a
5% power of 23%, followed by $b_2$ with 19% and W with 15%. The only other
case where a test achieved as much as 20% power even for n = 50 was against
the Tu(1, 5) distribution, using the CS procedure. The power of D was 15%,
of W was 14%, while the remainder ranged between 0 and 11% against this
alternative. The 5% powers for n = 50 for the SB(1, 2) alternative ranged from
5% to 14% with W the highest (Figure 1.23).
5.6 Group 6: Discrete Distributions
This group includes the BI(4), BI(8), BI(12), BI(20), P(1), P(4) and P(10)
distributions, some of whose merit curves are included in Figures 1.01 to 1.36.
The D statistic was far superior to all others for this group and had 100%
power for almost all sample sizes against these alternatives. This is believed
to be due to the fact that the series of transformations applied to a sample
containing a large proportion of identical values yields a large proportion of
zeros.
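The mechanism described above can be seen in a small sketch of the transformation, using the definition legible in Table 1, $g_j = (n + 2 - j)(c'_j - c'_{j-1})$, where the $c'_j$ are the ordered spacings of $u_i = F(y_i)$. This is an illustration under assumptions made here, not the authors' program: the null is taken as N(0, 1), the function name `durbin_g` is hypothetical, and only the transformed values (not the final D statistic) are computed.

```python
from statistics import NormalDist


def durbin_g(sample, mu=0.0, sigma=1.0):
    """Transformed values g_1, ..., g_n for a fully specified normal null."""
    n = len(sample)
    null = NormalDist(mu, sigma)
    u = sorted(null.cdf(y) for y in sample)
    # spacings c_1 = u_(1), c_j = u_(j) - u_(j-1), c_{n+1} = 1 - u_(n)
    edges = [0.0] + u + [1.0]
    spacings = sorted(edges[j] - edges[j - 1] for j in range(1, n + 2))
    c = [0.0] + spacings  # c[0] plays the role of c'_0 = 0
    return [(n + 2 - j) * (c[j] - c[j - 1]) for j in range(1, n + 1)]


if __name__ == "__main__":
    tied_sample = [0, 0, 0, 1, 1, 2]  # discrete data with repeated values
    print(durbin_g(tied_sample))
```

Each repeated observation contributes a spacing of exactly zero, and after ordering these zeros come first, so the leading transformed values are zero as well. In the toy sample above, three ties produce three zero values.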
The CS procedure performs erratically against many of the members of this
group, showing poor power for one sample size and jumping to 100% power
for the next. Thus, for example, the 5% power for CS against BI(8) is 9% at
n = 20 and jumps to 100% at n = 35. This peculiarity is probably due to a combined
effect from the discreteness of the data and the arbitrarily chosen number
of cells, depending on n, used in the test.
Both the $\sqrt{b_1}$ and $b_2$ procedures appear to have no power against any of the
binomial alternatives in any sample size and performed relatively poorly
against the Poisson alternatives. The W test performed well against this group
while the u procedure did not. For example, the 5% powers, for n = 20, for
BI(4) were 100% for D, 100% for CS, 71% for W, and 20% for u (Figure 1.05).
Of the distance tests (not including D) only the KS procedure performed
relatively well. For example, the 5% power against the P(1) alternative with
n = 20 was 100% for D, 100% for CS, 99% for W, and 55% for KS, with the
remaining tests having power less than 30%.
Since the comparatively remarkable sensitivity of the D statistic is a result
of its response to the discreteness of the sample, one would expect it to show
similar sensitivity in testing other continuous null distributions against dis-
crete alternatives.
5.7 General Comments
The W statistic exhibits sensitivity to non-normality over a wide range of
alternative distributions. In most cases it has power as good as or better than
the other eight procedures. For continuous alternatives, it is the only test
which never has very low power where another test shows high power. Even
for discrete alternatives it shows poorly only against the results for the D
test and occasionally the CS statistic.
The $\sqrt{b_1}$ statistic is a good measure of non-normality against highly skewed
and also long-tailed distributions. However, it often has lowest sensitivity
against symmetric and asymmetric finite range distributions, often being
biased. Moreover, it has very poor performance with the discrete distributions.
The $b_2$ statistic performs comparatively well with finite range distributions,
as well as with symmetric long-tailed infinite range distributions. It is not
effective, relatively, against skewed and discrete distributions.
There does not appear to be a clear cut superiority of $\sqrt{b_1}$ versus $b_2$ as an
omnibus test for normality. Generally, $\sqrt{b_1}$ responds to skewness and $b_2$ to
kurtosis but in several cases their powers are quite similar. There are cases in
which the alternative distributions, though quite non-normal, give both
$\sqrt{b_1}$ and $b_2$ values which resemble the normal. For example, in the case of the
BE(2, 1) distribution for samples of size 20, $\sqrt{b_1}$ and $b_2$ have 5% powers of
8% and 13% respectively, while the W statistic has a power of 35% (Figure 1.03).
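For readers who wish to compute the two moment statistics discussed here, a minimal sketch follows. It assumes the common convention in which the sample central moments use divisor n; the paper's own definitions are not legible in this copy, so this convention and the function names are assumptions.

```python
def central_moment(sample, order):
    """Sample central moment with divisor n (an assumed convention)."""
    mean = sum(sample) / len(sample)
    return sum((y - mean) ** order for y in sample) / len(sample)


def sqrt_b1(sample):
    """Standardized third moment (skewness)."""
    return central_moment(sample, 3) / central_moment(sample, 2) ** 1.5


def b2(sample):
    """Standardized fourth moment (kurtosis); equals 3 for the normal."""
    return central_moment(sample, 4) / central_moment(sample, 2) ** 2


if __name__ == "__main__":
    symmetric = [1, 2, 3, 4, 5]
    skewed = [0, 0, 0, 0, 10]
    print(sqrt_b1(symmetric), b2(symmetric))
    print(sqrt_b1(skewed), b2(skewed))
```

A symmetric sample gives a skewness of zero regardless of its tail behaviour, which is why the skewness statistic alone cannot detect symmetric departures from normality.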
The distance tests, KS, CM, WCM and D, exhibit surprisingly poor power,
with some exceptions mainly in connection with discrete alternatives in the
case of the D statistic. In general, the D procedure does improve the power
of the KS test for highly skewed continuous distributions but usually not to
the level achieved by other tests.
The D statistic has exceptional performance for discrete alternatives. Pre-
sumably this would apply relative to any null continuous distribution. A
possible explanation for this, as has been noted above, is that the transforma-
tions involved in computing D lead to a large number of exact zeros in the
case of samples from discrete distributions.
The very arbitrariness of definition of the CS statistic makes it difficult to
comment generally on its properties. These evidently depend mainly on the
choice of class intervals. From the present results, one may infer that the CS
test performs well against the very highly skewed distributions but in general
does not have good sensitivity overall.
The u statistic has, comparatively, good sensitivity against symmetric
short-tailed alternatives, typified by the uniform distribution. Its performance
against symmetric long-tailed distributions is comparable to that of W,
though in general not as good. The u test fails badly in the case of skewed dis-
tributions, having very low power even for alternatives as badly asymmetric
as $\chi^2(1)$.
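The u statistic is simple to compute: it is the ratio of the sample range to the sample standard deviation (David et al., 1954). The sketch below assumes the standard deviation uses divisor n - 1; that convention and the function name `studentized_range` are assumptions made here.

```python
import math


def studentized_range(sample):
    """Ratio of sample range to sample standard deviation.

    Assumption: the standard deviation uses divisor n - 1.
    """
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((y - mean) ** 2 for y in sample) / (n - 1))
    return (max(sample) - min(sample)) / s


if __name__ == "__main__":
    print(studentized_range([1, 2, 3, 4, 5]))
```

Since the range depends only on the two extreme observations, u reacts to tail length on both sides together, which is consistent with its weakness against one-sided (skewed) departures.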
The five groups for the continuous distribution alternatives were made up
based on a partition of the $|\sqrt{\beta_1}|$, $\beta_2$ space. Indeed one finds that these groups
correspond generally to varying non-normality as reflected by the sensitivity
of the W statistic. This indicates that, roughly speaking, the W test subsumes
the information provided by both $\sqrt{b_1}$ and $b_2$.
6. THE EFFECT OF SAMPLE SIZE
The effects of sample sizes are indicated in Figures 2.01 to 2.39 which give
plots of the 5% power as a function of log n, for all nine statistics for each of
39 alternative distributions, and by the tabular results on 10% power given
in Table 3.²
² Tabular results for 5% power analogous to Table 3 are available from the authors.
In most cases the sensitivity increases markedly with sample size. The
change with n does however vary considerably both in regard to the alterna-
tive distribution and the test procedure. Sometimes this effect is dramatic
even in the low sample size range of this investigation. Thus, for example, the
power of the W test against the Cauchy goes from 37% to 78% as the sample
size increases from 10 to 15 (Figure 2.29).
It is seen from Figures 2.01 to 2.39 that the 5% power is, loosely speaking,
linear in log n, at least up until powers close to 100% are realized.
Several of the test procedures give very little increase in power, for certain
alternatives, as n changes. For instance, Figure 2.01 for the BE(1, 1) (uniform)
alternative, shows that the 5% power of each of the $\sqrt{b_1}$, KS, CM, WCM, D
and CS statistics changes from a range of 2 to 10% at n = 10 to a range of
0 to 22% at n = 50. For contrast, the corresponding 5% power range for W,
$b_2$ and u for BE(1, 1) is 5 to 20% at n = 10 and 85 to 96% at n = 50. As another
the binomial distribution with k>20 tends to give rise to samples having a
near normal configuration.
Figure 3.03 for the $\chi^2(v)$ family shows a sharp dependence of the W index on
v over the range of v employed. However, in moderate size samples, $\chi^2(10)$ gives
rise to appreciably non-normal configurations. Thus, at n = 50, the 5% power
of W is 62% for $\chi^2(10)$.
Figure 3.04 gives measures of departure from normality for the $D\chi^2(\beta)$ family
(which includes the normal, $\beta = 0$). The curves rise rapidly on both sides of
$\beta = 0$ for large sample sizes. The change of the index is more gradual at smaller
sample sizes, especially for negative $\beta$. The general behaviour of the $D\chi^2(\beta)$
family is that it goes from a mass of 1 at 0 for $\beta \to -1$ through beta-like
distributions, becomes normal at $\beta = 0$, double exponential at $\beta = 1$ and develops
larger and larger tails as $\beta$ increases. The differential behaviour, in regard to
dependence on sample size, between positive and negative $\beta$ values for the
$D\chi^2(\beta)$ family is similar to comparison of results for the beta distributions
with those for the T(1) and T(2) distributions.
Figure 3.05 for the Poisson family, $P(\lambda)$, shows a sharp drop in the 5% power
of W, as $\lambda$ varies between 1 and 10, for each of the five sample sizes. With $\lambda = 10$
the 5% power of W is only 18% for n = 50, indicating that for $\lambda > 10$ even larger
samples would tend to have a normal-like configuration.
It is seen from Figure 3.06 that the W index of non-normality changes
rapidly for the Student's T(v) distribution as the parameter v goes from 1 to 4,
while for v larger than 4 the approach to normality becomes much more
gradual. For example, as judged in samples of n = 50, the 5% power of W is
99.7% for v = 1, 40.7% for v = 4 and 17.3% for v = 10. Presumably it would
require substantially larger sample sizes to develop appreciable power against
Student's T(v) distribution with v > 10.
The results given here should not be interpreted directly to indicate the
quality of a mathematical approximation of, say, the Poisson with $\lambda = 10$, by a
$N(\mu = 10, \sigma^2 = 10)$ distribution. Rather it is an index which assesses the
configuration of samples of specified size relative to normal samples and will not
respond, in small samples, to moderate systematic differences in the actual
c.d.f.s. Thus, in the case of, say, the T(v) family, the difference from the
normal is essentially in the tails of the distribution which, however, account for
only a small overall proportion of the population. Hence large sample sizes
may be needed to reflect even large relative differences in the tails.
Of the test statistics considered in the present study, the KS, CM, WCM,
D and CS procedures are appropriate directly only as tests of simple hypotheses.
Thus, in their use as tests for normality, the parameters, $\mu$ and $\sigma$, of the
hypothesized null distribution must be specified. In most applied statistics
circumstances where a test for normality might be of interest, prior information
on the parameters of the supposed normal distribution is usually not available.
In some cases one may be prepared to approximate the unknown parameters,
for purposes of execution of the test. The following results are of interest
sampling studies to gain insight and assurance concerning the usefulness of the
proposed procedure.
One of the major difficulties encountered in the present study was that of
organization, analysis and presentation of the voluminous results. This aspect
is reminiscent of the central features of most large scale statistical data analysis
problems.
REFERENCES
[1] Anderson, T. W. and Darling, D. A. (1954). A test of goodness of fit, J. Amer.
Stat. Assn. 49, pp. 765-69.
[2] Box, G. E. P. and Tiao, G. C. (1964). A Bayesian approach to the importance of
assumptions applied to the comparison of variances, Biometrika 51, pp. 153-67.
[3] Cramér, H. (1928). On the composition of elementary errors, Skand. Aktuar. 11,
pp. 141-80.
[4] David, H. A., Hartley, H. O., and Pearson, E. S. (1954). The distribution of the
ratio, in a single normal sample, of range to standard deviation, Biometrika 41, pp.
482-93.
[5] Durbin, J. (1961). Some methods of constructing exact tests, Biometrika 48, pp.
41-55.
[6] Fowlkes, E. B. (1965). A Fortran II system for the generation of random samples,
unpublished Bell Laboratories memorandum.
[7] Hastings, C., Mosteller, F., Tukey, J., and Winsor, C. (1947). Low moments for
small samples: a comparative study of order statistics, Annals of Math. Statist. 18,
pp. 413-26.
[8] Johnson, N. L. (1949). Systems of frequency curves generated by methods of
translation, Biometrika 36, pp. 149-76.
[9] Kolmogorov, A. N. (1933). Sulla determinazione empirica di una legge di
distribuzione, G. Ist. Ital. Attuari 4, pp. 83-91.
[10] Rand Corporation (1955). A Million Random Digits With 100,000 Normal Deviates.
The Free Press Publishers, Glencoe, Ill.
[11] Shapiro, S. S. and Wilk, M. B. (1965). An analysis of variance test for normality
(complete samples), Biometrika 52, pp. 591-611.