United States Department of Agriculture
Forest Service
Forest Products Laboratory
Research Paper FPL-RP-493

Two- and Three-Parameter Weibull Goodness-of-Fit Tests
James W. Evans
Richard A. Johnson
David W. Green
Abstract
Extensive tables of goodness-of-fit critical values for the two- and three-parameter Weibull
distributions are developed through simulation for the Kolmogorov-Smirnov statistic, the
Anderson-Darling statistic, and Shapiro-Wilk-type correlation statistics. Approximating formulas
for the critical values are derived and compared with values in the tables. Power studies using
several different distributional forms show that the Anderson-Darling statistic is the most
sensitive to lack of fit of a two-parameter Weibull and that correlation statistics of the
Shapiro-Wilk type are the most sensitive to departures from a three-parameter Weibull.
Research Highlights
This paper presents the results of a study to develop and evaluate goodness-of-fit tests for the
two- and three-parameter Weibull distributions. The study was initiated because of discrepancies
in published critical values for two-parameter Weibull distribution goodness-of-fit tests, the lack
of any critical values for a Shapiro-Wilk-type correlation statistic, and the lack of general
three-parameter Weibull distribution goodness-of-fit tests. The results of the study will be used
by Forest Products Laboratory (FPL) scientists to evaluate the goodness-of-fit of Weibull
distributions to experimental data. This will allow evaluation of distributional forms that may be
used in reliability-based design procedures.
Through computer simulation, extensive tables of critical values for three standard
goodness-of-fit statistics are developed for both the two- and three-parameter Weibull
distributions. The statistics used are the Kolmogorov-Smirnov D statistic, the Anderson-Darling
A² statistic, and a Shapiro-Wilk-type correlation statistic. The critical values for the statistics are
modeled through regression to provide equations to estimate the critical values. The equations
allow computer programs to evaluate the goodness-of-fit of data sets containing up to 400
observations. The abilities of the tests to detect poor fits (the “power of the tests”) are studied
using both the equations and the exact critical values of the test statistics. Finally, invariance
properties of the statistics are proved. These invariance properties show that the scope of the
simulation is adequate for all problems that the tests might be applied to.
Four general conclusions are readily apparent from the results of this study:
1. Of the three statistics considered, the Anderson-Darling A² statistic appears to be the statistic
of choice for testing the goodness-of-fit of a two-parameter Weibull distribution to a set of
data. The choice might be different if censored data were used since the correlation statistic
has the potential advantage of being easily modified for type I and type II censored
observations.
3. The critical value approximations appear to be very good for the range of sample sizes
considered.
4. The power of the tests is very dependent upon the true distributional form of the data.
November 1989
Evans, James W.; Johnson, Richard A.; Green, David W. Two- and three-parameter Weibull goodness-of-fit
tests. Res. Pap. FPL-RP-493. Madison, WI: U.S. Department of Agriculture, Forest Service, Forest
Products Laboratory; 1989. 27 p.
A limited number of free copies of this publication are available to the public from the Forest Products
Laboratory, One Gifford Pinchot Drive, Madison, WI 53705-2398. Laboratory publications are sent to more
than 1,000 libraries in the United States and elsewhere.
The Forest Products Laboratory is maintained in Madison, Wisconsin, by the U.S. Department of
Agriculture, Forest Service, in cooperation with the University of Wisconsin.
Background and Introduction
Two-parameter and three-parameter Weibull distributions are widely used to represent
the strength distribution of structural lumber and engineering-designed wood
subassemblies. As wood construction practices in the United States and Canada are
revised from deterministic to reliability-based design procedures, assessing the
goodness-of-fit of these Weibull distributional forms becomes increasingly important.
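In its three-parameter form, the family is represented by the density function
$$f(x; a, b, c) = \frac{a}{b}\left(\frac{x-c}{b}\right)^{a-1}\exp\left[-\left(\frac{x-c}{b}\right)^{a}\right], \qquad x > c, \tag{1}$$
where a is the shape parameter, b the scale parameter, and c the location parameter.
The family of two-parameter Weibull distributions follows from Equation (1) when
c = 0.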
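As an illustration of this parameterization, the following minimal sketch evaluates the three-parameter Weibull density and distribution function. The function names `weibull_pdf` and `weibull_cdf` are hypothetical and are not taken from the paper; setting `c = 0` gives the two-parameter form.

```python
import math


def weibull_pdf(x, a, b, c=0.0):
    """Density with shape a, scale b, location c (names are illustrative)."""
    if x <= c:
        # The density is taken as 0 at and below the location parameter.
        return 0.0
    z = (x - c) / b
    return (a / b) * z ** (a - 1.0) * math.exp(-(z ** a))


def weibull_cdf(x, a, b, c=0.0):
    """Distribution function matching weibull_pdf."""
    if x <= c:
        return 0.0
    return 1.0 - math.exp(-(((x - c) / b) ** a))


if __name__ == "__main__":
    # Two-parameter case (c = 0) and a shifted three-parameter case.
    print(weibull_cdf(1.0, a=2.0, b=1.0))
    print(weibull_cdf(3.0, a=2.0, b=1.0, c=2.0))
```

Both calls evaluate the distribution function one scale unit above the location parameter, so they return the same value.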
Shapiro-Wilk-Type Correlation Statistic
Studies such as the Monte Carlo study of Shapiro and others (1968) have consistently
shown that for testing goodness-of-fit of normal distributions, the Shapiro-Wilk
statistic has superior power to other statistics in detecting that the data come from a
wide range of other distributions. To develop a similar goodness-of-fit test for Weibull
distributions, we modify a simplified form of this statistic first suggested by Shapiro
and Francia (1972) but with approximate “scores” suggested by Filliben (1975).
$$R_w^2 = \frac{\left[\sum_{i=1}^{n}\left(X_{(i)}-\bar{X}\right)\left(m_{w,i}-\bar{m}_w\right)\right]^2}{\sum_{i=1}^{n}\left(X_{(i)}-\bar{X}\right)^2\,\sum_{i=1}^{n}\left(m_{w,i}-\bar{m}_w\right)^2} \tag{2}$$
where X_(1), ..., X_(n) are the ordered observations and m_{w,i} is a median score, in the
spirit of Filliben, except that these scores depend upon the
maximum likelihood estimate of the Weibull shape parameter a. The corresponding
Q-Q plot of (X_(i), m_{w,i}) should resemble a straight line if the underlying population is
Weibull. (A Q-Q plot is an abbreviated notation for a Quantile-Quantile probability
plot, where corresponding percentiles of one distribution are plotted against the
percentiles of the other, as discussed in Wilk and Gnanadesikan (1968).) The statistic
is the squared correlation coefficient of this plot. Note that the choice of m_{w,i}
follows the approximate scores suggested by Filliben (1975) and is slightly different
from that chosen by Smith and Bain (1976).
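The following sketch shows one way a squared Q-Q correlation of this kind could be computed. It rests on assumptions that are not confirmed by the text above: the plotting positions are Filliben's (1975) median approximations, the scores are obtained by passing those positions through a standard Weibull quantile function, and the shape estimate is supplied by the caller rather than computed by maximum likelihood here. All function names are hypothetical.

```python
import math


def filliben_positions(n):
    """Filliben's approximate medians of uniform order statistics."""
    p = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    p[-1] = 0.5 ** (1.0 / n)   # exact median of the largest order statistic
    p[0] = 1.0 - p[-1]         # exact median of the smallest order statistic
    return p


def weibull_scores(n, shape):
    """Assumed scores: standard Weibull quantiles at the median positions."""
    return [(-math.log(1.0 - p)) ** (1.0 / shape) for p in filliben_positions(n)]


def squared_correlation(xs, ys):
    """Squared Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def qq_correlation_statistic(sample, shape_estimate):
    """Squared correlation between ordered data and the assumed scores."""
    ordered = sorted(sample)
    return squared_correlation(ordered, weibull_scores(len(ordered), shape_estimate))
```

Values near 1 indicate a nearly straight Q-Q plot; the test rejects for small values, with critical values taken from the tables or formulas in the paper.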
(3)
and
Other Goodness-of-Fit Statistics
We include two other well-studied goodness-of-fit statistics for comparative purposes.
The modified Kolmogorov-Smirnov D statistic is given by
$$D = \sup_x \left| F_n(x) - F(x; \hat{b}, \hat{a}) \right|,$$
where $F_n(x)$ is the empirical distribution function of the sample and $F(x; \hat{b}, \hat{a})$ is the
fitted distribution.
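The modified Anderson-Darling A² statistic is given by
$$A^2 = n \int_{-\infty}^{\infty} \frac{\left[F_n(x) - F(x; \hat{b}, \hat{a})\right]^2}{F(x; \hat{b}, \hat{a})\left[1 - F(x; \hat{b}, \hat{a})\right]}\, dF(x; \hat{b}, \hat{a}).$$
Unlike the correlation statistic, neither D nor A² is affected by the choice of scale. Thus, we
can use either X or ln X in the calculations.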
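Both statistics can be computed from the fitted distribution function evaluated at the observations. The sketch below uses the standard computational forms of the Kolmogorov-Smirnov and Anderson-Darling statistics; it is not a reproduction of the authors' program. The `cdf` argument is a hypothetical callable standing in for the fitted Weibull distribution function, and the fitted values are assumed to lie strictly between 0 and 1.

```python
import math


def ks_and_ad(sample, cdf):
    """Return (D, A^2) for a sample against a fitted distribution function.

    `cdf` is any callable returning the fitted F(x); it is an assumption of
    this sketch that the parameters were estimated elsewhere.
    """
    u = sorted(cdf(x) for x in sample)   # estimated uniform variables
    n = len(u)
    # Largest gap between the empirical step function and the fitted values,
    # checked on both sides of each jump.
    d_plus = max((i + 1) / n - u[i] for i in range(n))
    d_minus = max(u[i] - i / n for i in range(n))
    d = max(d_plus, d_minus)
    # Computational form of the Anderson-Darling integral.
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
            for i in range(n))
    a2 = -n - s / n
    return d, a2
```

Because only the values F(x) enter the computation, any monotone change of scale that leaves those values unchanged leaves D and A² unchanged as well.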
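To calculate the critical values of the statistics, we used IMSL (1979) to generate a
fixed sample U_(1), U_(2), ..., U_(n) of n order statistics from a uniform distribution.
Then we transformed to the Weibull order statistics by inverting the Weibull distribution function,
$$X_{(i)} = b\left[-\ln\left(1 - U_{(i)}\right)\right]^{1/a}, \qquad i = 1, \ldots, n.$$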
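A minimal sketch of this generation step follows, assuming the transformation is the usual inversion of the Weibull distribution function. Python's `random` module stands in for the IMSL (1979) routines, which are not reproduced here, and the function name is hypothetical.

```python
import math
import random


def weibull_order_statistics(n, shape, scale=1.0, location=0.0, rng=None):
    """Generate n Weibull order statistics from sorted uniform variates."""
    if rng is None:
        rng = random.Random()
    # Sorting the uniforms first gives uniform order statistics; the inverse
    # distribution function is increasing, so the order is preserved.
    u = sorted(rng.random() for _ in range(n))
    return [location + scale * (-math.log(1.0 - ui)) ** (1.0 / shape) for ui in u]


if __name__ == "__main__":
    print(weibull_order_statistics(5, shape=2.0, rng=random.Random(1989)))
```

Passing a seeded `random.Random` instance makes a simulated sample reproducible, which is convenient when several statistics are to be computed on the same samples.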
These curves fit the tabled values reasonably well, as shown in Figure 1. We made no
attempt to force these curves to give the asymptotic values of Chandra and others
(1981). Those asymptotic values were estimated by an extrapolation procedure
described in their paper. Since the critical values in Table 3 are slightly above the
estimated asymptotic values for large samples, we decided to fit the data of Table 3
without forcing the curve through Chandra’s estimate.
Modeling the critical values of the correlation statistic produces the following
separate equations for the 0.10, 0.05, and 0.01 levels of significance:
Power Studies
To evaluate our results, we conducted a power study of the three statistics (D, A², and
the correlation statistic) using four different distributions:
1. Uniform distribution on 0 to 1
2. Truncated normal with a mean of 1.4 and standard deviation of 0.35 (The
distribution is truncated so that no values less than 0.00001 are allowed.)
3. Lognormal distribution where ln X has a mean of 1.6 and standard deviation of 0.4
4. Gamma distribution with shape parameter equal to 2 and scale parameter equal to 1
The procedure involved generating 5,000 pseudorandom samples of size n from each of
the four alternative distributions considered. We then calculated each of the three test
statistics and compared them to their critical values from Table 3 and to the
approximations to the critical values obtained from the equations for D, A², and the
correlation statistic. In each case, we counted the number of rejections of the null
hypothesis. We repeated this procedure for sample sizes of n = 20, 50, 80, 100, and 200.
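The counting procedure can be sketched as a generic Monte Carlo loop. In the sketch below, the four samplers follow the alternatives listed above, but the use of rejection sampling for the truncated normal is an assumption, and `estimate_power`, `ALTERNATIVES`, the `statistic` callable, and the `critical_value` argument are hypothetical names. Critical values would come from the tables or formulas in the paper.

```python
import random


def truncated_normal(rng):
    """Normal(1.4, 0.35) redrawn until the value is at least 0.00001."""
    while True:
        x = rng.gauss(1.4, 0.35)
        if x >= 0.00001:
            return x


# Samplers for the four alternative distributions; each takes a Random object.
ALTERNATIVES = {
    "uniform": lambda rng: rng.random(),
    "truncated_normal": truncated_normal,
    "lognormal": lambda rng: rng.lognormvariate(1.6, 0.4),
    "gamma": lambda rng: rng.gammavariate(2.0, 1.0),
}


def estimate_power(sampler, statistic, critical_value, n, reps=5000,
                   reject_if_above=True, rng=None):
    """Fraction of simulated samples for which the test rejects."""
    if rng is None:
        rng = random.Random()
    rejections = 0
    for _ in range(reps):
        sample = [sampler(rng) for _ in range(n)]
        value = statistic(sample)
        # D and A^2 reject for large values; a correlation statistic
        # rejects for small values (use reject_if_above=False).
        if reject_if_above:
            rejected = value > critical_value
        else:
            rejected = value < critical_value
        rejections += rejected
    return rejections / reps
```

A call such as `estimate_power(ALTERNATIVES["gamma"], my_statistic, my_critical_value, n=50)` would return the estimated power for one statistic at one sample size, where `my_statistic` and `my_critical_value` are placeholders supplied by the user.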
Results of the power study are presented in Tables 4 to 6, which show that the
Anderson-Darling A² statistic is generally superior to (that is, has a larger power than)
both the Kolmogorov-Smirnov D and the correlation statistics for the alternatives
presented. Neither D nor the correlation statistic appears to be very powerful across this group of
alternative distributions. One other feature illustrated in Tables 4 to 6 is the close
agreement in power between the actual critical values and the values from the
approximation formulas, which attests to the accuracy of the approximations.
Three-Parameter Weibull Distributions
Simulation Results
Developing critical values for the three-parameter Weibull distribution goodness-of-fit
statistics is more difficult because they depend upon the unknown shape
parameter. Selecting the scale on which to perform the tests is also a problem. In the
two-parameter case, using ln X(i) instead of X(i) made the correlation statistic
independent of the shape parameter. In the three-parameter case, the choice of which
scale to use is not obvious. For the Kolmogorov-Smirnov D and the Anderson-Darling
A² statistics, the two scales produce the same results. Therefore, we investigated D, A²,
and the correlation statistic on both scales, using a = 2.0, 2.8, 3.6, 4.4, and 5.2. We chose
b = 1.0 and c = 2.0.
Invariance properties derived in the Appendix show that the critical values of the
statistics depend only upon the shape parameter.
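To calculate the critical values of the statistics, IMSL (1979) routines were again used
to generate a fixed sample U_(1), U_(2), ..., U_(n) of n order statistics from a uniform
distribution. Then we transformed to the Weibull order statistics by inverting the Weibull distribution function,
$$X_{(i)} = c + b\left[-\ln\left(1 - U_{(i)}\right)\right]^{1/a}, \qquad i = 1, \ldots, n.$$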
Critical Value Approximations
We wanted to smooth the results and develop formulas that would approximate the
critical values. Because the critical values vary slightly for different shape parameters,
two models were tried for each statistic. One model ignores the differences in values
due to shape and just expresses the general trend due to sample size. For this simple
model, we determined that for the Kolmogorov-Smirnov D statistic and the
Anderson-Darling A² statistic, the two-parameter models could be modified by
subtracting a constant from the two-parameter model for the critical value. For the
correlation statistics, producing an entirely new model was better. A second model,
more complex, attempts to model the apparent quadratic nature of the effect of shape
on the critical values. This refinement was added to the simple model developed for
each of the three statistics. Thus, two models were created to predict the critical values
of the Kolmogorov-Smirnov, Anderson-Darling, and correlation (original and extreme
value scales) statistics.
Kolmogorov-Smirnov Models –
2. Two-parameter model plus a shift and an adjustment for the estimated shape
parameter values (labeled SHAPE in the formulas)
Anderson-Darling Models –
2. Two-parameter model plus a shift and an adjustment for shape
1. New models
Testing is done by comparing the correlation statistic calculated using X(i) to the critical values from the
formulas. We reject the hypothesis that a three-parameter Weibull fits the data if the statistic is
less than the critical value.
1. New models
2. New models plus an adjustment for shape
Testing is done by comparing the correlation statistic calculated using ln X(i) to the critical values from the
formulas. We reject the hypothesis that a three-parameter Weibull fits the data if the statistic is
less than the critical value.
Power Studies
We compared the four statistics and their approximations for critical values in a power
study using two Weibull distributions and two alternative distributions:
We chose Weibull distributions for use in the three-parameter case to evaluate the
difference between the simple and complex models for critical values. This was not
necessary in the two-parameter case, because the statistics did not depend on the true
shape parameter.
The procedure involved generating 2,000 pseudorandom samples of size n from each of
the four distributions considered. We then calculated each of the four test statistics and
compared them with the critical values obtained from the models presented in the
previous section. In each case we counted the number of rejections of the null
hypothesis. This procedure was repeated for n = 20, 50, 80, 100, and 200.
Results of the power study are presented in Tables 11 to 13, which show that nothing
was gained by using the more complex critical value formulas. The approximation
models for critical values do produce the correct percentage of rejections for the two
Weibull distributions, indicating the adequacy of the models. For the non-Weibull
distributions, the two correlation statistics are the most powerful of the test
statistics. The Anderson-Darling A² statistic is slightly more powerful than the
Kolmogorov-Smirnov D statistic.
Comparing the two-parameter results with the three-parameter results shows that for a
given sample size, the Anderson-Darling statistic is much more powerful in the
two-parameter case than in the three-parameter case. Conversely, the correlation test is
more powerful in the three-parameter case than in the two-parameter case.
Conclusions
Four general conclusions are readily apparent from the results of this study:
3. The critical value approximations appear to be very good for the range of sample
sizes considered. For the two-parameter Weibull distribution critical values, this is
seen in the close agreement of the plotted curves and the critical values as shown in
Figures 1 to 3. For the three-parameter Weibull distribution critical values, the
quality of the approximations can be seen in the power study where powers for the
Weibull distributions tested were near the significance level of the test.
4. The power of the tests is very dependent upon the true distributional form of the
data. For example, in the two-parameter Weibull distribution power study, power
against the uniform distribution was much higher than against the truncated normal.
References
Bush, J.G.; Woodruff, B.W.; Moore, A.H.; Dunne, E.J. 1983. Modified Cramer-von Mises
and Anderson-Darling tests for Weibull distributions with unknown location
and scale parameters. Communications in Statistics, Part A-Theory and Methods. 12:
2465-2476.
Chandra, M.; Singpurwalla, N.D.; Stephens, M.A. 1981. Kolmogorov statistics for
tests of fit for the extreme-value and Weibull distribution. Journal of the American
Statistical Association. 76: 729-731.
David, F.N.; Johnson, N.L. 1948. The probability integral transformation when
parameters are estimated from the sample. Biometrika. 35:182-190.
Filliben, J. 1975. The probability plot correlation coefficient test for normality.
Technometrics. 17:111-117.
Lawless, J.F. 1982. Statistical Models and Methods for Lifetime Data. New York:
John Wiley and Sons.
Littell, R.D.; McClave, J.R.; Offen, W.W. 1979. Goodness-of-fit test for the
two-parameter Weibull or extreme-value distribution with unknown parameters.
Communications in Statistics, Part B-Simulation and Computation. 8: 257-269.
Mann, N.R.; Scheuer, E.M.; Fertig, K.W. 1973. A new goodness-of-fit test for the two
parameter Weibull or extreme-value distribution with unknown parameters.
Communications in Statistics. 2:383-400.
Shapiro, S.S.; Francia, R.S. 1972. Approximate analysis of variance tests for
normality. Journal of the American Statistical Association. 67: 215-216.
Shapiro, S.S.; Wilk, M.B.; Chen, H. 1968. A comparative study of various tests for
normality. Journal of the American Statistical Association. 63: 1343-1372.
Smith, R.M.; Bain, L.J. 1976. Correlation type goodness-of-fit statistics with censored
sampling. Communications in Statistics, Part A-Theory and Methods. 5:119-132.
Stephens, M.A. 1977. Goodness-of-fit for the extreme value distribution. Biometrika.
64: 583-588.
Tiku, M.L.; Singh, M. 1981. Testing the two parameter Weibull distribution.
Communications in Statistics, Part A-Theory and Methods. 10:907-917.
Wilk, M.B.; Gnanadesikan, R. 1968. Probability plotting methods for the analysis of
data. Biometrika. 55:1-19.
Wozniak, P.J.; Warren, W.G. 1984. Goodness of fit for the two-parameter Weibull
distribution. Presented at the American Statistical Association National Meeting,
Philadelphia, PA.
Table 1–Two-parameter Weibull critical values for the Kolmogorov-Smirnov D statistic based on
maximum likelihood estimates. Footnote a: values multiplied by n^(1/2) to convert to the scale of
Chandra and others (1981). Footnote b: theoretical asymptotic values given by Chandra and others (1981).
Table 2–Two-parameter Weibull critical values for the Anderson-Darling A² statistic based on
maximum likelihood estimates.
Table 3–Two-parameter Weibull critical values for D, A², and the correlation statistic.
Table 4–Simulated powers of two-parameter Weibull test statistics, α = 0.10. Columns: distribution,
sample size, and powers for D, A², and the correlation statistic using critical values from the
table (T) and from the formula (F).
Table 5–Simulated powers of two-parameter Weibull test statistics, α = 0.05. Columns as in Table 4.
Table 6–Simulated powers of two-parameter Weibull test statistics. Columns as in Table 4.
Table 7–Three-parameter Weibull critical values for the Kolmogorov-Smirnov D statistic.
Table 8–Three-parameter Weibull critical values for the Anderson-Darling A² statistic.
Table 9–Three-parameter Weibull critical values for the correlation test statistic.
Table 10–Three-parameter Weibull critical values for the correlation test statistic.
Table 11–Simulated powers of three-parameter Weibull test statistics, α = 0.10. Columns:
distribution, sample size, and powers for D, A², and the two correlation statistics using critical
values from the simple (S) or complex (C) formula.
Table 12–Simulated powers of three-parameter Weibull test statistics. Columns as in Table 11.
Table 13–Simulated powers of three-parameter Weibull test statistics, α = 0.01. Columns as in Table 11.
Appendix – Some Invariance Properties of the Statistics
In this Appendix, we verify a number of invariance considerations that pertain to this
study. We present results related to the scale choice, null distributions, and power.
Invariance of D and A² Under a Change to the Log Scale
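Two-Parameter Case – In the two-parameter case, the estimate of the population
distribution function is
$$F(x; \hat{b}, \hat{a}) = 1 - \exp\left[-\left(x/\hat{b}\right)^{\hat{a}}\right],$$
where $\hat{b}$ and $\hat{a}$ are the maximum likelihood estimates of the scale and shape
parameters, respectively. Under a change to the log scale, the estimate becomes
$$1 - \exp\left\{-\exp\left[\hat{a}\left(\ln x - \ln \hat{b}\right)\right]\right\}.$$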
The estimated uniform variables are the same whichever scale is used. Equation (A. 1)
holds even if the underlying population is not Weibull.
Consequently,
The estimated uniform variables are the same whichever scale is used. Because the
uniform variables have the same values, A² and D are the same under both the original
and log scales.
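The equality of the estimated uniform variables on the two scales can be checked numerically. The sketch below assumes that the log-scale fit is the extreme-value distribution with location ln b and scale 1/a, as implied by the two-parameter Weibull form; the function names and the parameter values in the example are illustrative only.

```python
import math


def uniforms_original_scale(xs, scale, shape):
    """Estimated uniforms from the Weibull distribution function of x."""
    return [1.0 - math.exp(-((x / scale) ** shape)) for x in xs]


def uniforms_log_scale(xs, scale, shape):
    """Estimated uniforms from the extreme-value form applied to ln x."""
    loc, ev_scale = math.log(scale), 1.0 / shape
    return [1.0 - math.exp(-math.exp((math.log(x) - loc) / ev_scale)) for x in xs]


if __name__ == "__main__":
    data = [0.3, 1.0, 2.5, 4.2]
    u_x = uniforms_original_scale(data, scale=1.7, shape=3.2)
    u_y = uniforms_log_scale(data, scale=1.7, shape=3.2)
    # The two lists agree up to floating-point rounding.
    print(max(abs(p - q) for p, q in zip(u_x, u_y)))
```

Since D and A² are functions of these uniform variables alone, agreement of the two lists implies agreement of the statistics.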
Invariance When the Underlying Distribution Is Weibull
Two-Parameter Case – On the log scale
is distributed free of the location parameter log b and scale parameter a^(-1) (David and
Johnson 1948; Lawless 1982, p. 147). From Equation (A.1) the distribution of
estimated uniform observations does not depend on a and b, and hence the A² and D
statistics do not depend on a and b.
The correlation statistic, on the log scale, has a distribution that is free of the parameters since
Three-Parameter Case – For the three-parameter case, Lemon (1975) gives the pivotal
functions
where the equality in distribution is for the values of (a,b,c) indicated in the subscript.
In other words, these quantities depend only on the underlying population shape
parameter. Because
where the distribution of each term depends only on a, the estimated uniform variables
in Equation (A.2) and hence the distributions of A² and D depend only on a, not the
other parameters.
which is a function of the standard Weibull with parameters (a, 1, 0). Since R_w is the
correlation of these variables with the scores m_{w,i} and the scores depend only on the
estimated shape parameter, and hence on a, the distribution of R_w depends only on a.
has a distribution that depends only on a. Since R_w is the correlation of the variables
(A.7) with the scores ln m_{w,i}, its distribution also depends only on a.
Further,
by the properties in Equation (A.4). Consequently, the estimated uniform variables are
the same on both scales, so A² and D do not depend on the scale.
Thus, the power of each of the three tests remains the same for any lognormal
distribution. Also, the powers for the uniform (0, 1) alternative hold for the uniform
(0, θ) alternative with θ > 0 and, more generally, for
The basic relations between the maximum likelihood estimators on the two scales can
be shown directly. Let Y = Then
Because of this relation the estimated uniform variables are also equal.
maximize L(x | a, b, c). In this sense, the maximum likelihood estimators and
are equivariant.
According to Equation (A.9)
By Equation (A.2), the estimated uniform variables are equal, and we conclude that
power is the same under both G_0(x) and any location scale change