Inference for Log-Gamma Distribution Based on Progressively Type-II Censored Data
To cite this article: Chien-Tai Lin, Sam J. S. Wu & N. Balakrishnan (2006): Inference for Log-Gamma Distribution Based on Progressively Type-II Censored Data, Communications in Statistics - Theory and Methods, 35:7, 1271-1292
Communications in Statistics—Theory and Methods, 35: 1271–1292, 2006
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-0926 print/1532-415X online
DOI: 10.1080/03610920600692789
1. Introduction
The log-gamma distribution is the distribution of the logarithmic transformation of
a generalized gamma variable and has the probability density function (pdf)

\[
f(w; \kappa) = \frac{1}{\Gamma(\kappa)} \exp\bigl(\kappa w - e^{w}\bigr), \quad -\infty < w < \infty, \; \kappa > 0, \tag{1}
\]

with corresponding cumulative distribution function F(w; κ) = I_κ(e^w), where

\[
I_\kappa(t) = \frac{1}{\Gamma(\kappa)} \int_0^{t} e^{-z} z^{\kappa-1}\, dz, \quad 0 < t < \infty, \; \kappa > 0,
\]

is the incomplete gamma ratio. Introducing location and scale parameters through the
standardization commonly used for this family (see, e.g., Lawless, 1982), the
three-parameter log-gamma density becomes

\[
f(w; \mu, \sigma, \kappa) = \frac{\kappa^{\kappa - 1/2}}{\sigma\,\Gamma(\kappa)}
\exp\Bigl\{ \sqrt{\kappa}\,\frac{w-\mu}{\sigma} - \kappa\, e^{(w-\mu)/(\sigma\sqrt{\kappa})} \Bigr\},
\quad -\infty < w < \infty, \tag{2}
\]

where μ is the location parameter, σ (> 0) is the scale parameter, and κ is the shape
parameter.

The density function in (2) is the three-parameter log-gamma density function
that we will use, and we develop inference for the parameters μ, σ, and κ in this paper
based on progressively Type-II censored samples.
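As a quick numerical sanity check on this parameterization, the density in (2) can be evaluated directly and integrated over a wide grid. This is a minimal sketch; the grid limits, step size, and parameter values are our own choices, not from the paper:

```python
from math import exp, lgamma, log, sqrt

def loggamma_pdf(w, mu=0.0, sigma=1.0, kappa=2.0):
    """Three-parameter log-gamma pdf, Eq. (2)."""
    z = (w - mu) / sigma
    # log of the coefficient kappa^(kappa-1/2) / (sigma * Gamma(kappa))
    log_c = (kappa - 0.5) * log(kappa) - log(sigma) - lgamma(kappa)
    return exp(log_c + sqrt(kappa) * z - kappa * exp(z / sqrt(kappa)))

# Trapezoidal integration over a range wide enough to cover both tails.
a, b, n = -20.0, 12.0, 64000
h = (b - a) / n
total = sum(loggamma_pdf(a + i * h) for i in range(1, n)) * h
total += 0.5 * h * (loggamma_pdf(a) + loggamma_pdf(b))
print(round(total, 6))  # close to 1, confirming (2) is a proper density
```

Repeating the check for other values of κ and σ behaves the same way, since the coefficient κ^(κ−1/2)/(σΓ(κ)) is exactly the normalizing constant of (2).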
Inference for the special case of the extreme value distribution (the case κ = 1)
has been discussed in the literature quite extensively; see, for example, Bain (1972,
1978), Mann et al. (1974), Lawless (1982), Nelson (1982), Cohen and Whitten
(1988), and Balakrishnan and Cohen (1991). Lawless (1980, 1982) and Prentice
(1974) illustrated the usefulness of the log-gamma model in (2) as a lifetime
model and discussed the maximum likelihood estimation of the parameters based
on complete samples. Farewell and Prentice (1977) suggested the use of the
relative maximum likelihood method for estimating the shape parameter of this
model.
In Sec. 2, we discuss the MLEs of the parameters μ and σ assuming that the shape parameter
κ is known. After presenting the likelihood equations, we discuss the Newton–
Raphson method for the determination of the MLEs. Next, along the lines of
Balakrishnan and Varadan (1991), we derive approximate MLEs for μ and σ and
use them as initial values in the numerical procedure for determination of the
MLEs. We also discuss the EM algorithm and propose a modified EM algorithm
for determining the MLEs. In Sec. 3, we describe the interval estimation of the
parameters μ and σ based on some pivotal quantities. In Sec. 4, we present the
results on the bias and mean square error of the MLEs determined through an
extensive Monte Carlo simulation for different choices of the sample size n, the
effective sample size m, progressive censoring schemes, and values of the shape
parameter κ. We also examine the effectiveness of the intervals based on asymptotic
normality (of the MLEs), and show that they have very poor probability coverages
for small values of m. Finally, in Sec. 5, we present two examples to illustrate all the
methods of inference discussed in this paper.
2. Maximum Likelihood Estimation

Let y_{1:m:n} < y_{2:m:n} < ⋯ < y_{m:m:n} denote a progressively Type-II censored sample
from the log-gamma distribution in (2), obtained with censoring scheme (R_1, …, R_m).
The likelihood function is then

\[
L = C \prod_{i=1}^{m} f(y_{i:m:n}) \bigl[ 1 - F(y_{i:m:n}) \bigr]^{R_i},
\]

or equivalently, in terms of the standardized variables x_{i:m:n} = (y_{i:m:n} − μ)/σ,

\[
L = \frac{C}{\sigma^{m}} \prod_{i=1}^{m} f(x_{i:m:n}) \bigl[ 1 - F(x_{i:m:n}) \bigr]^{R_i}, \tag{3}
\]

where f(·) and F(·) now denote the standard (μ = 0, σ = 1) form of the pdf and cdf,
and C is a normalizing constant not involving the parameters.
Downloaded by [North Carolina State University] at 13:01 14 March 2013
The log-likelihood function is then

\[
\ln L = \mathrm{const} - m \ln\sigma + \sum_{i=1}^{m} \ln f(x_{i:m:n}) + \sum_{i=1}^{m} R_i \ln\bigl[ 1 - F(x_{i:m:n}) \bigr]. \tag{4}
\]

We can find the MLEs of μ, σ, and κ as the values μ̂, σ̂, and κ̂ that maximize the
log-likelihood function in (4) by solving the equations ∂ ln L/∂μ = 0, ∂ ln L/∂σ = 0,
and ∂ ln L/∂κ = 0. Writing x_i for x_{i:m:n}, the first two of these are

\[
\frac{\partial \ln L}{\partial \mu} = \frac{1}{\sigma} \left\{ \sqrt{\kappa} \sum_{i=1}^{m} \bigl[ e^{x_i/\sqrt{\kappa}} - 1 \bigr] + \sum_{i=1}^{m} \frac{R_i f(x_i)}{1 - F(x_i)} \right\} = 0, \tag{5}
\]
\[
\frac{\partial \ln L}{\partial \sigma} = \frac{1}{\sigma} \left\{ -m - \sqrt{\kappa} \sum_{i=1}^{m} x_i + \sqrt{\kappa} \sum_{i=1}^{m} x_i e^{x_i/\sqrt{\kappa}} + \sum_{i=1}^{m} \frac{R_i x_i f(x_i)}{1 - F(x_i)} \right\} = 0, \tag{6}
\]

while the third equation, ∂ ln L/∂κ = 0, denoted by (7), involves the digamma function
ψ(t) = (d/dt) ln Γ(t) = Γ′(t)/Γ(t).
Since Eqs. (5) to (7) cannot be solved analytically, some numerical method
must be employed. We observed in our simulations that it is difficult to solve all
three equations simultaneously, as the log-likelihood function becomes very flat as
a function of the shape parameter κ, a fact noticed earlier by some other authors
as well.
For this reason, Lawless (1982, p. 243) proposed a two-stage method to find the
MLEs. In the first stage, determine μ̂(κ) and σ̂(κ) with κ fixed. Repeating this
stage for various values of κ, we can then determine the estimate of κ as the value
κ̂ that maximizes L(μ̂(κ), σ̂(κ), κ). In addition, Lawless (1982, p. 239) also suggested
that one should find κ̂ restricted to the range (0, 20), since the density function in (2)
changes very little when κ is greater than 20.
To perform the first step, we discuss in the following four subsections the
Newton–Raphson method for determining the MLEs, the approximate maximum
likelihood estimators (which could be used as initial values in the Newton–Raphson
method), the EM algorithm, and a modified EM algorithm proposed for a faster
convergence in the EM algorithm.
2.1. Newton–Raphson Method

Since these equations cannot be solved explicitly, the Newton–Raphson method may
be used for the numerical determination of the MLEs. For this purpose, we need
the second derivatives of the log-likelihood function, which are

\[
\frac{\partial^{2} \ln L}{\partial \mu^{2}} = \frac{1}{\sigma^{2}} \Biggl\{ -\sum_{i=1}^{m} e^{x_i/\sqrt{\kappa}}
- \sum_{i=1}^{m} \frac{R_i f(x_i)}{1 - F(x_i)} \left[ \sqrt{\kappa} - \sqrt{\kappa}\, e^{x_i/\sqrt{\kappa}} + \frac{f(x_i)}{1 - F(x_i)} \right] \Biggr\}, \tag{8}
\]

\[
\frac{\partial^{2} \ln L}{\partial \sigma^{2}} = \frac{1}{\sigma^{2}} \Biggl\{ m + 2\sqrt{\kappa} \sum_{i=1}^{m} x_i
- 2\sqrt{\kappa} \sum_{i=1}^{m} x_i e^{x_i/\sqrt{\kappa}}
- \sum_{i=1}^{m} x_i^{2} e^{x_i/\sqrt{\kappa}}
- 2 \sum_{i=1}^{m} \frac{R_i x_i f(x_i)}{1 - F(x_i)}
- \sum_{i=1}^{m} \frac{R_i x_i^{2} f(x_i)}{1 - F(x_i)} \left[ \sqrt{\kappa} - \sqrt{\kappa}\, e^{x_i/\sqrt{\kappa}} + \frac{f(x_i)}{1 - F(x_i)} \right] \Biggr\}, \tag{9}
\]

\[
\frac{\partial^{2} \ln L}{\partial \mu\, \partial \sigma} = \frac{1}{\sigma^{2}} \Biggl\{ \sqrt{\kappa}\, m
- \sqrt{\kappa} \sum_{i=1}^{m} e^{x_i/\sqrt{\kappa}}
- \sum_{i=1}^{m} x_i e^{x_i/\sqrt{\kappa}}
- \sum_{i=1}^{m} \frac{R_i f(x_i)}{1 - F(x_i)}
- \sum_{i=1}^{m} \frac{R_i x_i f(x_i)}{1 - F(x_i)} \left[ \sqrt{\kappa} - \sqrt{\kappa}\, e^{x_i/\sqrt{\kappa}} + \frac{f(x_i)}{1 - F(x_i)} \right] \Biggr\}. \tag{10}
\]
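Evaluating the log-likelihood (4) for a given (μ, σ, κ) only requires the standard log-gamma pdf and cdf; the cdf is the regularized incomplete gamma function evaluated at κe^{x/√κ}. The sketch below is our own illustration (the series used for the incomplete gamma function and the tiny made-up sample and scheme are assumptions, not from the paper):

```python
from math import exp, lgamma, log, sqrt

def reg_lower_gamma(a, x, terms=400):
    """Regularized lower incomplete gamma P(a, x) via the standard series
    P(a, x) = x^a e^{-x} * sum_{n>=0} x^n / Gamma(a + n + 1)."""
    if x <= 0.0:
        return 0.0
    if x > 200.0:          # far right tail: P is 1 to machine precision here
        return 1.0
    term = exp(a * log(x) - x - lgamma(a + 1.0))
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
    return min(total, 1.0)

def cdf_std(x, kappa):
    """Standard (mu=0, sigma=1) log-gamma cdf: F(x) = I_kappa(kappa * e^{x/sqrt(kappa)})."""
    return reg_lower_gamma(kappa, kappa * exp(x / sqrt(kappa)))

def log_lik(mu, sigma, kappa, y, R):
    """Eq. (4): progressively Type-II censored log-likelihood, up to the constant C."""
    m = len(y)
    ll = -m * log(sigma)
    for yi, Ri in zip(y, R):
        x = (yi - mu) / sigma
        # log of the standard log-gamma pdf at x
        ll += (kappa - 0.5) * log(kappa) - lgamma(kappa) \
              + sqrt(kappa) * x - kappa * exp(x / sqrt(kappa))
        if Ri > 0:
            ll += Ri * log(1.0 - cdf_std(x, kappa))
    return ll

# Hypothetical observed sample with censoring scheme R = (2, 0, 1), kappa fixed at 2
y = [0.3, 0.9, 1.4]
R = [2, 0, 1]
ll = log_lik(0.5, 0.8, 2.0, y, R)
print(ll)
```

With such an evaluator, the Newton–Raphson step can use the analytic derivatives (5)–(6) and (8)–(10), or finite differences of `log_lik` as a cross-check.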
2.2. Approximate Maximum Likelihood Estimators

The approximate MLEs (AMLEs) are obtained by expanding the terms
exp(x_{i:m:n}/√κ) and f(x_{i:m:n})/[1 − F(x_{i:m:n})] appearing
in (5) and (6) by Taylor series around the points (see Balakrishnan and Aggarwala,
2000, for reasoning)

\[
\nu_i \equiv E(X_{i:m:n}) = F^{-1}\bigl( E(U_{i:m:n}) \bigr),
\]

where U_{i:m:n} is the corresponding ith progressively Type-II censored order statistic from
the uniform U(0, 1) distribution; in this case, E(U_{i:m:n}) is given by (see Balakrishnan
and Aggarwala, 2000)

\[
E(U_{i:m:n}) = 1 - \prod_{j=m-i+1}^{m} \frac{j + R_{m-j+1} + \cdots + R_m}{j + 1 + R_{m-j+1} + \cdots + R_m}, \quad i = 1, \ldots, m.
\]
Then we obtain the first-order approximations

\[
e^{x_{i}/\sqrt{\kappa}} \approx \alpha_i + \beta_i x_{i}
\qquad\text{and}\qquad
\frac{f(x_{i})}{1 - F(x_{i})} \approx \gamma_i + \delta_i x_{i},
\]

where

\[
\alpha_i = \Bigl( 1 - \frac{\nu_i}{\sqrt{\kappa}} \Bigr) e^{\nu_i/\sqrt{\kappa}}, \qquad
\beta_i = \frac{1}{\sqrt{\kappa}}\, e^{\nu_i/\sqrt{\kappa}},
\]
\[
\gamma_i = \frac{f(\nu_i)}{1 - F(\nu_i)} \left\{ 1 - \nu_i \left[ \sqrt{\kappa} - \sqrt{\kappa}\, e^{\nu_i/\sqrt{\kappa}} + \frac{f(\nu_i)}{1 - F(\nu_i)} \right] \right\},
\]
\[
\delta_i = \frac{f(\nu_i)}{1 - F(\nu_i)} \left[ \sqrt{\kappa} - \sqrt{\kappa}\, e^{\nu_i/\sqrt{\kappa}} + \frac{f(\nu_i)}{1 - F(\nu_i)} \right] \geq 0.
\]
Upon simplifying the approximate likelihood equations in (11) and (12), we obtain

\[
\tilde{\mu} - B + C\tilde{\sigma} = 0 \qquad\text{and}\qquad m\tilde{\sigma}^{2} + E\tilde{\sigma} - F = 0,
\]

where

\[
B = \sum_{i=1}^{m} \bigl( \sqrt{\kappa}\,\beta_i\, y_i + R_i \delta_i\, y_i \bigr) \Big/ M,
\]
\[
C = \sum_{i=1}^{m} \bigl( \sqrt{\kappa} - \sqrt{\kappa}\,\alpha_i - R_i \gamma_i \bigr) \Big/ M,
\]
\[
M = \sum_{i=1}^{m} \bigl( \sqrt{\kappa}\,\beta_i + R_i \delta_i \bigr),
\]
\[
E = \sum_{i=1}^{m} (y_i - B) \bigl( \sqrt{\kappa} - \sqrt{\kappa}\,\alpha_i - R_i \gamma_i - 2C\sqrt{\kappa}\,\beta_i - 2C R_i \delta_i \bigr),
\]

and

\[
F = \sum_{i=1}^{m} (y_i - B)^{2} \bigl( \sqrt{\kappa}\,\beta_i + R_i \delta_i \bigr),
\]

from which we obtain the AMLEs of μ and σ, μ̃ and σ̃, as

\[
\tilde{\mu} = B - C\tilde{\sigma}, \tag{13}
\]
\[
\tilde{\sigma} = \frac{-E + \sqrt{E^{2} + 4mF}}{2m}. \tag{14}
\]
2.3. EM Algorithm
The progressive Type-II censoring model can be viewed as an incomplete data
problem, with the progressively censored observations as missing data at m stages.
So, an alternative approach to finding the MLEs with feasible initial guess of
parameters is through the EM algorithm; see, for example, Dempster et al. (1977).
From (2), the log-likelihood function based on the complete data Y =
(y_1, …, y_n), with κ fixed, is

\[
\ell(\mu, \sigma \mid Y) = n \ln \frac{\kappa^{\kappa - 1/2}}{\Gamma(\kappa)} - n \ln\sigma
+ \sum_{i=1}^{n} \left[ \sqrt{\kappa}\, \frac{y_i - \mu}{\sigma} - \kappa \exp\!\left( \frac{y_i - \mu}{\sigma\sqrt{\kappa}} \right) \right]. \tag{15}
\]

Then the MLEs of μ and σ, μ̂ and σ̂, can be obtained by solving the equations

\[
\frac{\hat{\sigma}}{\sqrt{\kappa}} = \frac{\sum_{i=1}^{n} y_i \exp\bigl( y_i/(\hat{\sigma}\sqrt{\kappa}) \bigr)}{\sum_{i=1}^{n} \exp\bigl( y_i/(\hat{\sigma}\sqrt{\kappa}) \bigr)} - \frac{\sum_{i=1}^{n} y_i}{n}, \tag{16}
\]
\[
\exp\!\left( \frac{\hat{\mu}}{\hat{\sigma}\sqrt{\kappa}} \right) = \frac{1}{n} \sum_{i=1}^{n} \exp\!\left( \frac{y_i}{\hat{\sigma}\sqrt{\kappa}} \right). \tag{17}
\]
In the E-step, each censored observation is replaced by its conditional expectation,
where Z_{jh}, h = 1, …, R_j (for j = 1, …, m), are iid random variables with a left-
truncated log-gamma density function (truncated on the left at y_j), and

\[
E_{1j} = E\!\left[ \frac{Z_{jh} - \mu}{\sigma\sqrt{\kappa}} \,\Big|\, Z_{jh} > y_j \right]
= \frac{1}{1 - F(\xi_j)} \Biggl\{ \psi(\kappa) - \ln\kappa
- \exp\bigl( -\kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)
\sum_{p=0}^{\infty} \frac{\bigl( \kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)^{p+\kappa}}{\Gamma(p+\kappa+1)}
\Bigl[ \psi(\kappa) + \frac{\xi_j}{\sqrt{\kappa}} - \psi(p+\kappa+1) \Bigr] \Biggr\},
\]
\[
E_{2j} = E\!\left[ \exp\!\left( \frac{Z_{jh} - \mu}{\sigma\sqrt{\kappa}} \right) \Big|\, Z_{jh} > y_j \right]
= \frac{1}{1 - F(\xi_j)} \Biggl\{ 1 - \exp\bigl( -\kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)
\sum_{p=0}^{\infty} \frac{\bigl( \kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)^{p+\kappa+1}}{\Gamma(p+\kappa+2)} \Biggr\},
\]
\[
E_{3j} = E\!\left[ \frac{Z_{jh} - \mu}{\sigma\sqrt{\kappa}} \exp\!\left( \frac{Z_{jh} - \mu}{\sigma\sqrt{\kappa}} \right) \Big|\, Z_{jh} > y_j \right]
= \frac{1}{1 - F(\xi_j)} \Biggl\{ \psi(\kappa) + \frac{1}{\kappa} - \ln\kappa
- \exp\bigl( -\kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)
\sum_{p=0}^{\infty} \frac{\bigl( \kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)^{p+\kappa+1}}{\Gamma(p+\kappa+2)}
\Bigl[ \psi(\kappa) + \frac{1}{\kappa} + \frac{\xi_j}{\sqrt{\kappa}} - \psi(p+\kappa+2) \Bigr] \Biggr\}, \tag{18}
\]

with ξ_j = (y_j − μ)/σ. All the relevant algebraic details and derivations are presented
in the appendix.
Hence, in the (h + 1)th iteration of the EM algorithm, the value of σ̂^{(h+1)} is first
obtained by solving the equation

\[
\frac{\hat{\sigma}^{(h+1)}}{\sqrt{\kappa}}
= \frac{\displaystyle \sum_{j=1}^{m} y_j \exp\!\Bigl( \frac{y_j}{\hat{\sigma}^{(h+1)}\sqrt{\kappa}} \Bigr)
+ \sum_{j=1}^{m} R_j\, E\Bigl[ Z_{jh} \exp\!\Bigl( \frac{Z_{jh}}{\hat{\sigma}^{(h)}\sqrt{\kappa}} \Bigr) \Big|\, Z_{jh} > y_j;\ \hat{\mu}^{(h)}, \hat{\sigma}^{(h)} \Bigr]}
{\displaystyle \sum_{j=1}^{m} \exp\!\Bigl( \frac{y_j}{\hat{\sigma}^{(h+1)}\sqrt{\kappa}} \Bigr)
+ \sum_{j=1}^{m} R_j\, E\Bigl[ \exp\!\Bigl( \frac{Z_{jh}}{\hat{\sigma}^{(h)}\sqrt{\kappa}} \Bigr) \Big|\, Z_{jh} > y_j;\ \hat{\mu}^{(h)}, \hat{\sigma}^{(h)} \Bigr]}
- \frac{\displaystyle \sum_{j=1}^{m} y_j + \sum_{j=1}^{m} R_j\, E\bigl[ Z_{jh} \,\big|\, Z_{jh} > y_j;\ \hat{\mu}^{(h)}, \hat{\sigma}^{(h)} \bigr]}{n}, \tag{19}
\]

and then μ̂^{(h+1)} is obtained as

\[
\exp\!\Bigl( \frac{\hat{\mu}^{(h+1)}}{\hat{\sigma}^{(h+1)}\sqrt{\kappa}} \Bigr)
= \frac{1}{n} \Biggl\{ \sum_{j=1}^{m} \exp\!\Bigl( \frac{y_j}{\hat{\sigma}^{(h+1)}\sqrt{\kappa}} \Bigr)
+ \sum_{j=1}^{m} R_j\, E\Bigl[ \exp\!\Bigl( \frac{Z_{jh}}{\hat{\sigma}^{(h+1)}\sqrt{\kappa}} \Bigr) \Big|\, Z_{jh} > y_j;\ \hat{\mu}^{(h)}, \hat{\sigma}^{(h+1)} \Bigr] \Biggr\}.
\]
Ng et al. (2002) applied the Newton–Raphson method to solve for σ̂^{(h+1)}
in Eq. (19). They evaluated the conditional expectations by first plugging the values
of μ̂^{(h)} and σ̂^{(h)} into Eq. (18), and then used these values in each iteration of
the Newton–Raphson method to obtain a convergent value σ̂^{(h+1)}. It seems that we
can improve the convergence of their approach by updating the values of μ̂^{(h)} and σ̂^{(h)}
in the conditional expectations at each iteration of the Newton–Raphson method
until σ̂^{(h+1)} has been found. However, in some situations (for example, Cases C, E,
and G of Example 1 and Case H of Example 2 in Sec. 5), both these methods failed
to converge. Hence we employed fixed-point iteration (see Burden and Faires,
1997, pp. 58–60) to improve the convergence of the Newton–Raphson method.
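Fixed-point iteration is easiest to illustrate on the complete-sample scale equation (16): rewrite it as σ = g(σ) and iterate, with a little damping to tame the oscillation. The data below are the 23 ball-bearing log-lifetimes of Example 1 with κ = 1; the damping factor and starting value are our own choices, not from the paper:

```python
from math import exp, sqrt

y = [2.884, 3.365, 3.497, 3.726, 3.741, 3.820, 3.881, 3.948, 3.950, 3.991,
     4.017, 4.217, 4.229, 4.229, 4.232, 4.432, 4.534, 4.591, 4.655, 4.662,
     4.851, 4.852, 5.156]
kappa = 1.0
ybar = sum(y) / len(y)

def g(sigma):
    # Right-hand side of Eq. (16): exp-weighted mean of y minus the plain mean.
    w = [exp(yi / (sigma * sqrt(kappa))) for yi in y]
    return sqrt(kappa) * (sum(wi * yi for wi, yi in zip(w, y)) / sum(w) - ybar)

sigma = 0.5                              # starting value
for _ in range(500):
    sigma = 0.5 * (sigma + g(sigma))     # damped fixed-point step

print(round(sigma, 4))  # close to the sigma-hat reported for Case A with kappa = 1 in Table 2
```

The damping replaces σ ← g(σ) by σ ← (σ + g(σ))/2, which is exactly the averaging device that Burden and Faires recommend when the plain iteration oscillates.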
It should also be noted that the second-order derivatives presented in (8)–(10)
get used at every iteration of the Newton–Raphson method, but they only get used
at the final stage of the EM algorithm while computing the Fisher information
measure. Clearly, this is one advantage of the EM algorithm.
Even though the Newton–Raphson method would yield exactly the same values
for the MLEs and their asymptotic variances and covariance as the EM algorithm
would result in (through the Fisher information), the convergence of the iterative
process will be very different from that of the EM algorithm. As pointed out by
Little and Rubin (1983), the EM algorithm will converge reliably but rather slowly
as compared to the Newton–Raphson method when the amount of information
in the missing data is relatively large. For this reason, we propose the following
modification to the EM algorithm, which seems to result in a faster convergence for
the situations considered in this paper.
Figure 1. Convergence of the EM algorithm and the modified EM algorithm for Case A in
Example 1. (a) Trace plots for μ̂^{(h)} and σ̂^{(h)} under the EM algorithm. (b) Trace plots for
μ̂^{(h)} and σ̂^{(h)} under the modified EM algorithm.
The modification is as follows. Carry out two iterations of the EM algorithm to
produce iterates μ^{(1)} and μ^{(2)}. If these have not converged (say, to within
10^{−5}), then perform the next iteration to produce μ^{(3)} based on (μ^{(1)} + μ^{(2)})/2 (rather
than basing it on μ^{(2)}); perform the following iteration to produce μ^{(4)} basing it on
μ^{(3)}. If convergence has occurred between these two successive iterates, terminate
the iteration of the algorithm. Otherwise, continue with the next iteration by basing
it on (μ^{(3)} + μ^{(4)})/2, and so on. For the estimation of σ, follow the same modified
iterative procedure. As we will see later in our illustrative example, this modification
results in a faster convergence of the EM algorithm; see Fig. 1b.
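The effect of this averaging is easy to see on a toy fixed-point iteration with oscillating convergence; the map x ← cos x below is our own illustration, not from the paper. Basing every other step on the average of the two previous iterates sharply reduces the number of function evaluations:

```python
from math import cos

TOL = 1e-10

def plain(x, g, max_iter=10000):
    """Ordinary fixed-point iteration; returns (root, evaluations)."""
    for k in range(1, max_iter + 1):
        x_new = g(x)
        if abs(x_new - x) < TOL:
            return x_new, k
        x = x_new
    return x, max_iter

def modified(x, g, max_iter=10000):
    """Proposed scheme: after two ordinary steps, base the next step on the
    average of the last two iterates, then alternate."""
    evals = 0
    x1 = g(x); evals += 1
    x2 = g(x1); evals += 1
    while abs(x2 - x1) >= TOL and evals < max_iter:
        x3 = g(0.5 * (x1 + x2)); evals += 1   # step from the average
        x4 = g(x3); evals += 1                # ordinary step
        x1, x2 = x3, x4
    return x2, evals

root_p, n_p = plain(0.0, cos)
root_m, n_m = modified(0.0, cos)
print(n_p, n_m)  # the modified scheme needs noticeably fewer evaluations
```

The intuition matches the EM trace plots of Fig. 1: when successive errors alternate in sign, the average of two consecutive iterates lands much closer to the limit than either iterate alone.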
It is important to mention here that, with the MLEs of μ and σ computed by
any of the methods described above, we can readily obtain the MLE of the mean
lifetime as

\[
\hat{\mu}_Y = \widehat{E(Y)} = \hat{\mu} + \hat{\sigma}\sqrt{\kappa}\,\bigl[ \psi(\kappa) - \ln\kappa \bigr],
\]

and also its standard error from the estimated variances and covariance of μ̂ and σ̂
obtained from the Fisher information as

\[
\mathrm{SE}(\hat{\mu}_Y) = \Bigl\{ \widehat{\mathrm{Var}}(\hat{\mu})
+ \kappa \bigl[ \psi(\kappa) - \ln\kappa \bigr]^{2}\, \widehat{\mathrm{Var}}(\hat{\sigma})
+ 2\sqrt{\kappa}\, \bigl[ \psi(\kappa) - \ln\kappa \bigr]\, \widehat{\mathrm{Cov}}(\hat{\mu}, \hat{\sigma}) \Bigr\}^{1/2}.
\]
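Both expressions are straightforward to evaluate; the digamma function can be approximated by a central difference of ln Γ when no special-function library is at hand. In the sketch below, the parameter estimates and variance–covariance entries are made-up numbers for illustration only:

```python
from math import lgamma, log, sqrt

def digamma(x, h=1e-5):
    # Central-difference approximation to psi(x) = d/dx ln Gamma(x).
    return (lgamma(x + h) - lgamma(x - h)) / (2.0 * h)

def mean_lifetime(mu, sigma, kappa):
    # mu_Y = mu + sigma * sqrt(kappa) * (psi(kappa) - ln kappa)
    return mu + sigma * sqrt(kappa) * (digamma(kappa) - log(kappa))

def se_mean_lifetime(var_mu, var_sigma, cov, kappa):
    # Delta-method standard error; exact here since mu_Y is linear in (mu, sigma).
    c = digamma(kappa) - log(kappa)
    return sqrt(var_mu + kappa * c * c * var_sigma + 2.0 * sqrt(kappa) * c * cov)

# Hypothetical estimates (illustration only, not from the paper's tables)
mu_hat, sigma_hat, kappa = 4.23, 0.51, 5.0
var_mu, var_sigma, cov = 0.011, 0.006, -0.002
print(round(mean_lifetime(mu_hat, sigma_hat, kappa), 4))
print(round(se_mean_lifetime(var_mu, var_sigma, cov, kappa), 4))
```

For κ = 1 the factor √κ[ψ(κ) − ln κ] reduces to ψ(1) = −γ ≈ −0.5772, recovering the familiar Euler-constant shift of the extreme value mean.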
3. Interval Estimation
To develop interval estimates for the parameters μ and σ of the log-gamma
distribution in (2), we can consider suitable pivotal quantities and then use the
limiting normal distribution based on the asymptotic properties of the MLEs. For
this purpose, we first need to derive the asymptotic variance–covariance matrix of
the MLEs, which is related to the Fisher information matrix. From Eqs. (8) to
(10), the observed information matrix can be inverted to obtain the asymptotic
variance–covariance matrix of the MLEs as
\[
\begin{pmatrix} \hat{I}^{11} & \hat{I}^{12} \\ \hat{I}^{12} & \hat{I}^{22} \end{pmatrix}
=
\begin{pmatrix}
-\dfrac{\partial^{2} \ln L}{\partial \mu^{2}} & -\dfrac{\partial^{2} \ln L}{\partial \mu\, \partial \sigma} \\[2mm]
-\dfrac{\partial^{2} \ln L}{\partial \mu\, \partial \sigma} & -\dfrac{\partial^{2} \ln L}{\partial \sigma^{2}}
\end{pmatrix}^{-1}_{(\mu, \sigma) = (\hat{\mu}, \hat{\sigma})},
\]

and then consider the pivotal quantities

\[
P_1 = \frac{\hat{\mu} - \mu}{\hat{\sigma}\sqrt{\hat{I}^{11}}}, \qquad
P_2 = \frac{\hat{\mu} - \mu}{\sigma\sqrt{\hat{I}^{11}}}, \qquad
P_3 = \frac{\hat{\sigma} - \sigma}{\hat{\sigma}\sqrt{\hat{I}^{22}}},
\]
treated as standard normal, based on the asymptotic normality of (μ̂, σ̂). The pivotal quantities
P1 and P2 can be used to construct confidence intervals for the location parameter
μ when the scale parameter σ is unknown and known, respectively, while the pivotal
quantity P3 can be used to construct confidence intervals for σ.
In order to examine the effectiveness of these confidence intervals, we simulated
the probability coverages, P(−1.96 ≤ P_i ≤ 1.96) for i = 1, 2, 3, through 1,000 Monte
Carlo simulations. We expected to obtain coverages of approximately 95%.
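The coverage check itself is a routine Monte Carlo loop: simulate, form the pivot, and count how often it falls inside (−1.96, 1.96). The sketch below does this for the exactly-normal pivot (x̄ − μ)/(σ/√n) as a toy stand-in; with the log-gamma pivots above one would substitute the MLEs from Sec. 2 (seed, n, and replication count are our own choices):

```python
import random
from math import sqrt

random.seed(1)
n, reps = 20, 5000
mu, sigma = 0.0, 1.0
hits = 0
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    pivot = (sum(xs) / n - mu) / (sigma / sqrt(n))  # exactly N(0, 1) here
    if -1.96 <= pivot <= 1.96:
        hits += 1
coverage = hits / reps
print(coverage)  # should be near 0.95
```

For the log-gamma pivots P1 and P3 the same loop produces the much lower coverages reported in Table 1, because those pivots are far from normal in small samples.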
For obtaining interval estimates for the shape parameter κ, we can utilize the
likelihood ratio test for testing H₀: κ = κ₀ versus H₁: κ ≠ κ₀. The statistic

\[
\Lambda = -2 \ln R(\kappa_0),
\]

where

\[
R(\kappa_0) = \frac{L\bigl( \hat{\mu}(\kappa_0), \hat{\sigma}(\kappa_0), \kappa_0 \bigr)}{L(\hat{\mu}, \hat{\sigma}, \hat{\kappa})},
\]

is asymptotically distributed as chi-square with one degree of freedom, so that a 95%
confidence interval for κ consists of all values κ₀ with R(κ₀) ≥ exp(−3.841/2) ≈ 0.147.
Moreover, the graph of R(κ) and the line R(κ) = 0.147 readily indicate κ̂ and the 95%
confidence interval for κ.
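The 0.147 cutoff can be recomputed in two lines, taking the standard 95% chi-square critical value with one degree of freedom as a known tabulated constant:

```python
from math import exp

CHI2_1_95 = 3.841459  # 95th percentile of chi-square with 1 df (standard table value)
cutoff = exp(-CHI2_1_95 / 2.0)
print(round(cutoff, 4))  # 0.1465, i.e., the 0.147 used in the text
```

In practice one profiles R(κ) over a grid of κ values and reports the interval where the curve stays above this horizontal line.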
4. Simulation Results
An extensive Monte Carlo simulation study was carried out in order to evaluate
the performance of the MLEs discussed in Sec. 2. Progressively Type-II censored
samples from the log-gamma distribution (with μ = 0, σ = 1, and varying values of
κ) were generated using the algorithm of Balakrishnan and Sandhu (1995).
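The Balakrishnan–Sandhu (1995) algorithm produces a progressively Type-II censored uniform sample from independent uniforms and then transforms by the inverse cdf. A sketch for the κ = 1 (extreme value) case, where F⁻¹ has a closed form; the seed, scheme, and parameters are our own choices:

```python
import random
from math import log

random.seed(7)

def progressive_uniform(R):
    """Balakrishnan-Sandhu (1995): U_{i:m:n} for a censoring scheme R = (R_1, ..., R_m)."""
    m = len(R)
    W = [random.random() for _ in range(m)]
    # V_i = W_i^{1 / (i + R_m + R_{m-1} + ... + R_{m-i+1})}
    V = [W[i - 1] ** (1.0 / (i + sum(R[m - i:]))) for i in range(1, m + 1)]
    # U_{i:m:n} = 1 - V_m * V_{m-1} * ... * V_{m-i+1}
    U, prod = [], 1.0
    for i in range(1, m + 1):
        prod *= V[m - i]
        U.append(1.0 - prod)
    return U

def ev_quantile(u, mu=0.0, sigma=1.0):
    # kappa = 1 log-gamma (extreme value) inverse cdf: F(x) = 1 - exp(-e^x)
    return mu + sigma * log(-log(1.0 - u))

R = [2, 0, 1, 0, 2]                      # n = 10, m = 5
U = progressive_uniform(R)
sample = [ev_quantile(u) for u in U]
print(all(a < b for a, b in zip(sample, sample[1:])))  # True: an ordered sample
```

For general κ one would replace `ev_quantile` by the inverse of the log-gamma cdf, e.g., via the inverse regularized incomplete gamma function.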
The convergence of the Newton–Raphson method depended on the choice
of the initial values. For this reason, the AMLEs computed from Eqs. (13) and
(14) were used as starting values for the iterations, and the MLEs were obtained
by solving the nonlinear equations (5) and (6) using the Fortran library and the
IMSL nonlinear equation solver. We observed that the use of the AMLEs as
starting values resulted in a faster convergence of the Newton–Raphson method.
For the EM algorithm, we observed that after a sufficient number of iterations, the
estimates of and coincided with the values obtained from the Newton–Raphson
method.
The simulations were carried out for sample sizes n = 15, 20, 50, different
choices of the effective sample size m, and different progressive censoring
schemes, with the two extreme censoring schemes (0, 0, …, 0, n − m) and (n −
m, 0, …, 0) being included in every case. For simplicity in notation, we will
denote these schemes by ((m − 1)*0, n − m) and (n − m, (m − 1)*0), respectively;
for example, (4*0, 20) denotes the progressive censoring scheme (0, 0, 0, 0, 20). In
order to evaluate the effect of differing levels of censoring as compared to the
two extreme censoring schemes, we have also included a few other progressive
censoring schemes for some selected choices of n and m. For κ = 1, the log-gamma
distribution is the same as the extreme value distribution, and in this case the results
of simulation match those of Balakrishnan et al. (2004). It should be mentioned here
that in this case exact conditional inference for the parameters has been discussed
by Viveros and Balakrishnan (1994). Next, for the case κ = ∞, the log-gamma
distribution is the normal distribution, as mentioned earlier, in which case the results
of our simulation match those of Balakrishnan et al. (2003).
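The starred shorthand is convenient to expand programmatically when setting up schemes; a small helper (our own convenience, not from the paper):

```python
def expand_scheme(text):
    """Expand scheme shorthand such as "4*0, 20" into an explicit list,
    e.g. "4*0, 20" -> [0, 0, 0, 0, 20]."""
    out = []
    for tok in text.replace("(", "").replace(")", "").split(","):
        tok = tok.strip()
        if "*" in tok:
            count, value = tok.split("*")
            out.extend([int(value)] * int(count))
        else:
            out.append(int(tok))
    return out

print(expand_scheme("4*0, 20"))   # [0, 0, 0, 0, 20]
print(expand_scheme("10, 4*0"))   # [10, 0, 0, 0, 0]
```

A quick consistency check is that the expanded list has length m and sums to n − m.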
In Table 1, we have presented the average values of the MLEs of μ and σ
and their variances and covariance for κ = 1, 1.5, 5, 20, and ∞, determined from 1,000
Monte Carlo simulations. We also averaged the values of variances and covariance
determined from the observed Fisher information matrix, and these values are
presented in this table for comparison with the simulated variances and covariance.
Table 1
Averages, variances, and covariance of the MLEs, variances and covariance from observed Fisher information, and 95% coverage
probabilities for the pivotal quantities P1, P2, and P3

n   m   Scheme           κ     μ̂        σ̂       var(μ̂)  var(σ̂)  cov(μ̂,σ̂)  σ̂²Î¹¹   σ̂²Î²²   σ̂²Î¹²    P1    P2    P3
15  5   (4*0, 10)        1    −0.3145  0.8216  0.3676  0.1522   0.1549  0.3167  0.1507   0.1510  72.1  92.7  73.6
                         1.5  −0.2768  0.8238  0.7704  0.9710   0.0342  0.2619  0.1417   0.1316  73.2  93.4  74.0
                         5    −0.2189  0.8225  1.0638  1.7763  −0.0857  0.1848  0.1207   0.0998  75.4  94.0  73.5
                         20   −0.1922  0.8205  1.2951  2.5706  −0.1704  0.1551  0.1104   0.0867  76.3  94.0  73.0
                         ∞    −0.1690  0.8182  0.1594  0.1128   0.0858  0.1340  0.1028   0.0772  77.4  94.3  72.8
        (10, 4*0)        1    −0.1840  0.9180  0.2194  0.0855  −0.0004  0.1874  0.0816  −0.0111  86.3  92.0  83.8
                         1.5  −0.1708  0.9181  0.4580  1.0070  −0.1628  0.1801  0.0755  −0.0033  86.5  92.5  83.2
                         5    −0.1456  0.9113  0.6679  1.9198  −0.2978  0.1646  0.0645   0.0135  86.9  93.0  82.0
                         20   −0.1315  0.9059  0.8598  2.8178  −0.3997  0.1577  0.0615   0.0232  87.6  93.9  81.4
                         ∞    −0.1184  0.9002  0.1641  0.0695   0.0392  0.1536  0.0615   0.0324  87.7  94.3  81.0
        (2*0, 10, 2*0)   1    −0.2541  0.8677  0.2558  0.0874   0.0550  0.1906  0.0750   0.0378  79.9  90.1  77.3
                         1.5  −0.2283  0.8679  0.5486  0.9246  −0.1082  0.1774  0.0739   0.0420  80.7  91.0  77.2
                         5    −0.1842  0.8648  0.7879  1.7596  −0.2457  0.1540  0.0717   0.0484  82.0  92.5  76.8
                         20   −0.1628  0.8623  0.9920  2.5892  −0.3423  0.1433  0.0717   0.0518  82.1  93.6  76.2
                         ∞    −0.1442  0.8591  0.1578  0.0847   0.0650  0.1355  0.0734   0.0554  82.9  94.1  76.6
        (2, 2, 2, 2, 2)  1    −0.2683  0.8515  0.2814  0.1181   0.0896  0.2228  0.1112   0.0773  76.3  91.4  76.8
                         1.5  −0.2376  0.8533  0.5968  0.9552  −0.0550  0.1970  0.1056   0.0718  77.8  92.2  76.8
                         5    −0.1893  0.8515  0.8429  1.7845  −0.1837  0.1575  0.0929   0.0628  79.6  93.1  75.9
                         20   −0.1666  0.8492  1.0468  2.6058  −0.2747  0.1409  0.0872   0.0596  80.6  93.8  75.5
                         ∞    −0.1468  0.8464  0.1523  0.0946   0.0675  0.1288  0.0837   0.0577  81.3  94.0  75.2
        (4, 4, 2, 2*0)   1    −0.2200  0.8879  0.2323  0.0850   0.0326  0.1804  0.0763   0.0193  82.9  90.9  80.4
                         1.5  −0.1995  0.8879  0.4917  0.9539  −0.1261  0.1701  0.0734   0.0245  83.3  91.5  80.2
                         5    −0.1639  0.8834  0.7104  1.8180  −0.2589  0.1516  0.0681   0.0341  83.8  93.0  78.4
                         20   −0.1460  0.8799  0.9028  2.6733  −0.3547  0.1433  0.0670   0.0396  84.9  93.4  78.3
                         ∞    −0.1301  0.8759  0.1562  0.0775   0.0535  0.1376  0.0680   0.0451  85.1  93.1  77.4
20  5   (4*0, 15)        1    −0.3714  0.8187  0.4831  0.1550   0.2059  0.4320  0.1543   0.2029  71.2  93.6  73.4
                         1.5  −0.3268  0.8209  1.0120  0.9728   0.0840  0.3486  0.1461   0.1766  71.8  93.9  73.9
                         5    −0.2596  0.8198  1.3835  1.7773  −0.0436  0.2319  0.1254   0.1318  72.7  94.3  73.4
                         20   −0.2294  0.8179  1.6674  2.5709  −0.1342  0.1876  0.1147   0.1127  73.8  94.4  72.6
                         ∞    −0.2037  0.8155  0.1824  0.1161   0.1082  0.1565  0.1067   0.0986  74.4  94.3  72.7
        (15, 4*0)        1    −0.1946  0.9311  0.2214  0.0787   0.0051  0.1900  0.0753  −0.0056  86.1  91.9  85.1
                         1.5  −0.1815  0.9310  0.4657  1.0181  −0.1642  0.1827  0.0697   0.0020  86.7  92.3  84.9
                         5    −0.1568  0.9236  0.6805  1.9492  −0.3060  0.1670  0.0596   0.0187  86.9  93.0  83.3
                         20   −0.1431  0.9176  0.8767  2.8644  −0.4148  0.1599  0.0570   0.0284  87.2  93.6  83.2
                         ∞    −0.1305  0.9113  0.1649  0.0645   0.0445  0.1556  0.0573   0.0376  87.4  93.6  82.4
        (0, 15, 3*0)     1    −0.2369  0.9034  0.2443  0.0713   0.0368  0.1867  0.0612   0.0209  82.5  90.2  81.8
                         1.5  −0.2177  0.9016  0.5228  0.9551  −0.1354  0.1769  0.0591   0.0274  83.1  91.1  81.5
                         5    −0.1828  0.8939  0.7584  1.8332  −0.2805  0.1585  0.0570   0.0397  83.7  92.5  80.0
                         20   −0.1648  0.8892  0.9658  2.6981  −0.3871  0.1502  0.0579   0.0464  84.4  93.1  79.5
                         ∞    −0.1490  0.8841  0.1639  0.0700   0.0621  0.1445  0.0608   0.0530  84.8  93.5  79.0
    10  (9*0, 10)        1    −0.1202  0.9291  0.1302  0.0817   0.0477  0.1183  0.0811   0.0440  86.6  92.6  86.4
                         1.5  −0.1071  0.9295  0.2578  1.0206  −0.0210  0.1039  0.0759   0.0390  87.2  93.1  86.1
                         5    −0.0862  0.9262  0.3586  1.9510  −0.0855  0.0832  0.0650   0.0317  88.2  94.3  85.3
                         20   −0.0759  0.9236  0.4461  2.8705  −0.1334  0.0749  0.0598   0.0290  88.2  94.9  85.3
                         ∞    −0.0667  0.9204  0.0733  0.0600   0.0296  0.0689  0.0560   0.0273  88.5  95.1  84.8
        (10, 9*0)        1    −0.0898  0.9624  0.1051  0.0508  −0.0107  0.1020  0.0502  −0.0146  91.3  94.0  90.4
                         1.5  −0.0839  0.9622  0.2130  1.0230  −0.1038  0.0984  0.0469  −0.0102  91.8  94.4  90.5
                         5    −0.0722  0.9581  0.3112  1.9898  −0.1830  0.0913  0.0408  −0.0006  92.3  95.0  89.9
                         20   −0.0654  0.9548  0.4041  2.9468  −0.2457  0.0883  0.0387   0.0050  92.4  95.1  89.1
                         ∞    −0.0591  0.9506  0.0855  0.0403   0.0110  0.0865  0.0381   0.0103  92.4  95.3  88.7
        (0, 10, 8*0)     1    −0.0967  0.9606  0.1035  0.0470  −0.0036  0.0985  0.0451  −0.0079  90.3  93.8  90.0
                         1.5  −0.0902  0.9597  0.2104  1.0126  −0.0963  0.0944  0.0424  −0.0035  91.0  94.1  89.8
                         5    −0.0770  0.9540  0.3066  1.9729  −0.1752  0.0867  0.0381   0.0052  91.2  94.6  88.9
                         20   −0.0695  0.9502  0.3967  2.9215  −0.2367  0.0835  0.0370   0.0101  91.3  95.0  88.6
                         ∞    −0.0626  0.9458  0.0816  0.0405   0.0159  0.0814  0.0373   0.0148  91.1  94.9  87.4
(continued)
Table 1
Continued.

n   m   Scheme           κ     μ̂        σ̂       var(μ̂)  var(σ̂)  cov(μ̂,σ̂)  σ̂²Î¹¹   σ̂²Î²²   σ̂²Î¹²    P1    P2    P3
50  20  (19*0, 30)       1    −0.0759  0.9624  0.0769  0.0444   0.0354  0.0763  0.0432   0.0346  90.1  94.0  88.5
                         1.5  −0.0680  0.9630  0.1475  1.0107  −0.0064  0.0648  0.0408   0.0305  90.5  94.4  88.7
                         5    −0.0558  0.9610  0.2005  1.9730  −0.0474  0.0487  0.0354   0.0241  91.6  95.2  89.0
                         20   −0.0501  0.9593  0.2455  2.9287  −0.0794  0.0424  0.0327   0.0214  92.0  95.5  88.7
                         ∞    −0.0450  0.9569  0.0372  0.0320   0.0193  0.0377  0.0305   0.0194  92.0  95.5  88.4
        (30, 19*0)       1    −0.0508  0.9827  0.0511  0.0254  −0.0079  0.0521  0.0250  −0.0079  93.1  93.7  91.8
                         1.5  −0.0479  0.9830  0.1027  1.0129  −0.0634  0.0504  0.0235  −0.0057  93.2  93.9  92.5
                         5    −0.0426  0.9811  0.1506  1.9983  −0.1117  0.0473  0.0206  −0.0008  93.1  94.7  91.4
                         20   −0.0396  0.9794  0.1965  2.9789  −0.1520  0.0459  0.0195   0.0021  93.2  95.0  91.1
                         ∞    −0.0367  0.9766  0.0433  0.0194   0.0040  0.0452  0.0191   0.0050  93.3  95.1  91.1
    25  (24*0, 25)       1    −0.0530  0.9655  0.0482  0.0330   0.0187  0.0485  0.0332   0.0181  92.1  94.9  90.1
                         1.5  −0.0468  0.9667  0.0930  0.9945  −0.0158  0.0427  0.0313   0.0161  92.2  95.2  90.0
                         5    −0.0368  0.9668  0.1287  1.9542  −0.0476  0.0346  0.0272   0.0133  92.8  95.2  89.9
                         20   −0.0320  0.9667  0.1601  2.9121  −0.0711  0.0314  0.0252   0.0122  92.5  95.7  89.9
                         ∞    −0.0278  0.9656  0.0278  0.0234   0.0112  0.0290  0.0236   0.0115  92.6  95.8  90.4
        (25, 24*0)       1    −0.0328  0.9776  0.0397  0.0210  −0.0064  0.0417  0.0211  −0.0074  94.9  95.2  92.7
                         1.5  −0.0302  0.9789  0.0791  0.9946  −0.0434  0.0404  0.0199  −0.0056  95.0  95.4  92.0
                         5    −0.0256  0.9797  0.1159  1.9684  −0.0745  0.0382  0.0177  −0.0017  95.2  96.0  91.8
                         20   −0.0231  0.9800  0.1514  2.9427  −0.0990  0.0373  0.0168   0.0006  95.1  96.6  91.9
                         ∞    −0.0207  0.9791  0.0344  0.0159   0.0025  0.0369  0.0164   0.0029  94.8  96.7  92.8
From Table 1, we observe that both the bias and the variance of the estimates
decrease significantly as the effective sample proportion m/n increases, as one would
expect. A comparison of the simulated values of the variances and covariance of the
estimates with those obtained from the corresponding observed Fisher information
matrices reveals that the latter are much closer to the former for large values of m
even when n is small. However, when m is moderate, a larger value of n is required
for closeness between these two values.
We have also presented the simulated values (based on 1,000 Monte Carlo
simulations) of the coverage probabilities of the pivotal quantities P1, P2, and P3,
based on the asymptotic normality, in Table 1. When σ is unknown, the probability
coverages are extremely unsatisfactory, especially when the effective sample fraction
m/n is small. If σ is known, then the coverage probabilities for P2 are close to
the required levels. In most practical situations, however, σ is unknown; hence using
the normal approximation for the corresponding pivotal quantities is not advisable.
The distributions of the pivotal quantities are extremely skewed. Therefore our
recommendation, based on these observations, is to use simulated unconditional
percentage points of the pivotal quantities in order to construct confidence intervals
for the location and scale parameters.
5. Illustrative Examples
In order to illustrate the methods of inference developed in this paper, we will
present two examples in this section.
Example 1. Let us consider the data giving the number of million revolutions to
failure for each of a group of 23 ball bearings in a fatigue test (see Lieblein and
Zelen, 1956). The 23 log-lifetimes are
Case A: 2.884 3.365 3.497 3.726 3.741 3.820 3.881 3.948 3.950 3.991
4.017 4.217 4.229 4.229 4.232 4.432 4.534 4.591 4.655 4.662
4.851 4.852 5.156
Lieblein and Zelen (1956) assumed a two-parameter Weibull distribution for the
original data and hence an extreme value distribution for the log-lifetime data given
above. Lawless (1982) and Balakrishnan and Chan (1995a,b) assumed a generalized
gamma distribution for the original data and hence a log-gamma distribution for
the log-lifetimes.
With the above complete sample data (Case A), we generated two different
progressively Type-II censored samples as follows.
Case B: 2.884 3.497 3.726 3.741 3.820 3.881 3.948 3.950 3.991 4.017 4.217
4.229 4.229 4.232 4.432 4.534 4.591 4.655 4.662 4.851 4.852 5.156
Case C: 2.884 3.991 4.017 4.217 4.229 4.229 4.232 4.432 4.534 4.591 4.655
4.662 4.851 4.852 5.156
In each case, we present the estimate of κ as well as the 95% confidence interval
for κ, and the corresponding estimates of μ, σ, and the mean lifetime μ_Y and their standard errors,
in Table 2. In the complete sample situation (Case A), the estimates we obtained
agree with those of Lawless (1982). Further, Case D corresponds to the Type-II
right-censored sample case, in which case the estimates obtained agree with those of
Table 2
Estimates of parameters for different cases in Example 1

Case  m   Scheme     κ            μ̂ (s.e.)         σ̂ (s.e.)         μ̂_Y (s.e.)       95% interval of κ
A     23  (23*0)     κ̂ = 10.6204  4.2299 (0.1069)  0.5099 (0.0761)  4.1504 (0.1089)  (0.427, ∞)
                     κ = 1        4.4052 (0.1050)  0.4757 (0.0744)  4.1306 (0.1259)
                     κ = 1.5      4.3594 (0.1054)  0.4861 (0.0748)  4.1397 (0.1192)
                     κ = 5        4.2659 (0.1064)  0.5040 (0.0757)  4.1494 (0.1105)
                     κ = 20       4.2084 (0.1073)  0.5132 (0.0763)  4.1506 (0.1084)
                     κ = ∞        4.1504 (0.1087)  0.5215 (0.0769)  4.1388 (0.1088)
B     22  (1, 21*0)  κ̂ = 8.4504   4.2727 (0.1050)  0.4894 (0.0745)  4.1869 (0.1073)  (0.421, ∞)
                     κ = 1        4.4330 (0.1031)  0.4575 (0.0724)  4.1689 (0.1232)
                     κ = 1.5      4.3886 (0.1034)  0.4673 (0.0730)  4.1774 (0.1167)
                     κ = 5        4.2982 (0.1045)  0.4848 (0.0742)  4.1862 (0.1084)
                     κ = 20       4.2427 (0.1057)  0.4946 (0.0750)  4.1869 (0.1066)
                     κ = ∞        4.1865 (0.1075)  0.5046 (0.0758)  4.1752 (0.1075)
C     15  (8, 14*0)  κ̂ = 1.2447   4.5767 (0.1050)  0.3930 (0.0744)  4.3782 (0.1202)  (0.156, ∞)
Balakrishnan and Chan (1995a). For comparative purposes, we have also selected
five other values of κ from the confidence interval and presented the corresponding
estimates of the parameters μ, σ, and μ_Y and their standard errors in Table 2.
We notice that all these values are stable and quite close to each other for every
progressively censored sample considered. In particular, we notice that the estimate
of μ_Y and its standard error change very little for the different choices of κ taken
from the confidence interval.
Using the sample mean and sample standard deviation as the starting values,
we observed that the EM algorithm converged to the same estimates of μ and σ as
the Newton–Raphson method, and that the EM algorithm required 140 iterations
for Case A and 53 iterations for Case B, while the Newton–Raphson method
required four iterations for each. It should be noted here that the Newton–Raphson
method only required two iterations for each when the approximate maximum
likelihood estimates were used as the initial values. In Fig. 1, we have displayed the
convergence of the EM algorithm and the modified EM algorithm for Case A. The
modified EM algorithm resulted in improving the convergence quite significantly
and required only six and eight iterations for Cases A and B, respectively.
Example 2. From the above complete sample, we generated two different progressively
Type-II censored samples (Cases B and C).
The estimate of κ as well as the 95% confidence interval for κ, and the
corresponding estimates of μ, σ, and μ_Y and their standard errors, are presented in
Table 3. In the complete sample situation (Case A), the estimates obtained agree
with those of Balakrishnan and Chan (1995b). Further, Case D corresponds to the
Type-II right-censored sample, and in this case the estimates obtained agree as well
with those of Balakrishnan and Chan (1995b). Once again, we find that all these
values are reasonably stable and very close to each other for every progressively
Table 3
Estimates of parameters for different cases in Example 2

Case  m   Scheme     κ            μ̂ (s.e.)         σ̂ (s.e.)         μ̂_Y (s.e.)       95% interval of κ
A     40  (40*0)     κ̂ = 3.4254   5.1670 (0.4072)  2.5305 (0.2939)  4.4504 (0.4305)  (0.368, ∞)
                     κ = 1        5.7514 (0.3914)  2.3419 (0.2856)  4.3996 (0.4715)
                     κ = 1.5      5.5215 (0.3978)  2.4212 (0.2892)  4.4273 (0.4512)
                     κ = 5        5.0456 (0.4104)  2.5642 (0.2954)  4.4532 (0.4263)
                     κ = 20       4.7503 (0.4184)  2.6383 (0.2988)  4.4529 (0.4224)
                     κ = ∞        4.4504 (0.4272)  2.7021 (0.3021)  4.3900 (0.4273)
B     38  (2, 37*0)  κ̂ = 7.9696   5.1721 (0.3876)  2.3734 (0.2757)  4.7430 (0.3966)  (0.517, ∞)
                     κ = 1        5.9393 (0.3728)  2.1756 (0.2678)  4.6836 (0.4473)
                     κ = 1.5      5.7243 (0.3774)  2.2410 (0.2703)  4.7115 (0.4268)
                     κ = 5        5.2841 (0.3855)  2.3505 (0.2747)  4.7411 (0.4001)
                     κ = 20       5.0139 (0.3906)  2.4027 (0.2772)  4.7430 (0.3942)
                     κ = ∞        4.7410 (0.3965)  2.4459 (0.2798)  4.6863 (0.3966)
C     36  (4, 35*0)  κ̂ = 4.9162   5.4549 (0.3817)  2.2665 (0.2721)  4.9265 (0.3962)  (0.447, ∞)
censored sample considered. We also observe that the estimate of μ_Y and its standard
error vary very little for different choices of κ taken from the confidence interval.
With the sample mean and sample standard deviation used as the initial
values, we found that the EM algorithm converged to the same estimates of μ
and σ as the Newton–Raphson method, and that the EM algorithm required 116
iterations for Case A and 111 iterations for Case B, while the Newton–Raphson
method required 5 and 4 iterations, respectively. In contrast, we observed that
the iterations required for the Newton–Raphson method with the approximate
maximum likelihood estimates as starting values are 2 for Case A and 3 for Case B.
In order to facilitate faster convergence, we applied the modified EM algorithm and
observed that this method required only 12 and 14 iterations for Cases A and B,
respectively.
Appendix
The log-likelihood function in (15) based on the complete data, for a fixed value
of κ, can be expressed as follows:

\[
\ell(\mu, \sigma \mid Y) = n \ln \frac{\kappa^{\kappa - 1/2}}{\Gamma(\kappa)} - n \ln\sigma
+ \sum_{j=1}^{m} \left[ \sqrt{\kappa}\, \frac{y_j - \mu}{\sigma} - \kappa \exp\!\left( \frac{y_j - \mu}{\sigma\sqrt{\kappa}} \right) \right]
+ \sum_{j=1}^{m} \sum_{h=1}^{R_j} \left[ \sqrt{\kappa}\, \frac{z_{jh} - \mu}{\sigma} - \kappa \exp\!\left( \frac{z_{jh} - \mu}{\sigma\sqrt{\kappa}} \right) \right].
\]

The moment generating function of (Z_j − μ)/(σ√κ), conditional on Z_j > y_j, is

\[
M_{(Z_j - \mu)/(\sigma\sqrt{\kappa})}(t)
= \frac{\Gamma(t+\kappa)}{\kappa^{t}\,\Gamma(\kappa)\,\bigl[ 1 - F(\xi_j) \bigr]}
\left[ 1 - \frac{1}{\Gamma(t+\kappa)} \int_0^{\kappa \exp(\xi_j/\sqrt{\kappa})} w^{t+\kappa-1} e^{-w}\, dw \right].
\]

From the fact that \(\int_0^{t} x^{d-1} e^{-x}\, dx / \Gamma(d) = \sum_{p=0}^{\infty} t^{p+d} e^{-t} / \Gamma(p+d+1)\), we get

\[
M_{(Z_j - \mu)/(\sigma\sqrt{\kappa})}(t)
= \frac{\Gamma(t+\kappa)}{\kappa^{t}\,\Gamma(\kappa)\,\bigl[ 1 - F(\xi_j) \bigr]}
\Biggl\{ 1 - \exp\bigl( -\kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)
\sum_{p=0}^{\infty} \frac{\bigl( \kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)^{p+t+\kappa}}{\Gamma(p+t+\kappa+1)} \Biggr\}.
\]

The derivative of the above moment generating function with respect to t is then
given by

\[
\frac{d}{dt} M_{(Z_j - \mu)/(\sigma\sqrt{\kappa})}(t)
= \frac{\Gamma(t+\kappa)}{\kappa^{t}\,\Gamma(\kappa)\,\bigl[ 1 - F(\xi_j) \bigr]}
\Biggl\{ \psi(t+\kappa) - \ln\kappa
- \exp\bigl( -\kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)
\sum_{p=0}^{\infty} \frac{\bigl( \kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)^{p+t+\kappa}}{\Gamma(p+t+\kappa+1)}
\Bigl[ \psi(t+\kappa) + \frac{\xi_j}{\sqrt{\kappa}} - \psi(p+t+\kappa+1) \Bigr] \Biggr\}.
\]

Setting t = 0 in this derivative yields E_{1j}, while E_{2j} follows by setting t = 1 in the
moment generating function itself:

\[
E_{2j} = M_{(Z_j - \mu)/(\sigma\sqrt{\kappa})}(1)
= \frac{1}{1 - F(\xi_j)} \Biggl\{ 1 - \exp\bigl( -\kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)
\sum_{p=0}^{\infty} \frac{\bigl( \kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)^{p+\kappa+1}}{\Gamma(p+\kappa+2)} \Biggr\},
\]

since Γ(1+κ)/(κΓ(κ)) = 1. Finally,

\[
E_{3j} \equiv E\!\left[ \frac{Z_j - \mu}{\sigma\sqrt{\kappa}} \exp\!\left( \frac{Z_j - \mu}{\sigma\sqrt{\kappa}} \right) \Big|\, Z_j > y_j \right]
= E\!\left[ \frac{Z_j - \mu}{\sigma\sqrt{\kappa}} \exp\!\left( \frac{t(Z_j - \mu)}{\sigma\sqrt{\kappa}} \right) \Big|\, Z_j > y_j \right]\Bigg|_{t=1}
= \frac{d}{dt}\, E\!\left[ \exp\!\left( \frac{t(Z_j - \mu)}{\sigma\sqrt{\kappa}} \right) \Big|\, Z_j > y_j \right]\Bigg|_{t=1}
= \frac{d}{dt}\, M_{(Z_j - \mu)/(\sigma\sqrt{\kappa})}(t)\Big|_{t=1}
= \frac{1}{1 - F(\xi_j)} \Biggl\{ \psi(\kappa) + \frac{1}{\kappa} - \ln\kappa
- \exp\bigl( -\kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)
\sum_{p=0}^{\infty} \frac{\bigl( \kappa\, e^{\xi_j/\sqrt{\kappa}} \bigr)^{p+\kappa+1}}{\Gamma(p+\kappa+2)}
\Bigl[ \psi(\kappa) + \frac{1}{\kappa} + \frac{\xi_j}{\sqrt{\kappa}} - \psi(p+\kappa+2) \Bigr] \Biggr\},
\]

using ψ(1+κ) = ψ(κ) + 1/κ.
References
Ahn, H. (1996). Log-gamma regression modeling through regression trees. Comm. Statist.
Theor. Meth. 25:295–311.
Bain, L. J. (1972). Inferences based on censored sampling from the Weibull or extreme-value
distribution. Technometrics 14:693–702.
Bain, L. J. (1978). Statistical Analysis of Reliability and Life-Testing Models—Theory and
Practice. New York: Marcel Dekker.
Balakrishnan, N., Aggarwala, R. (2000). Progressive Censoring: Theory, Methods, and
Applications. Boston: Birkhäuser.
Balakrishnan, N., Cohen, A. C. (1991). Order Statistics and Inference: Estimation Methods.
San Diego, CA: Academic Press.
Balakrishnan, N., Varadan, J. (1991). Approximate MLEs for the location and scale
parameters of extreme value distribution with censoring. IEEE Trans. Reliab.
40:146–151.
Balakrishnan, N., Chan, P. S. (1995a). Maximum likelihood estimation for the log-
gamma distribution under Type-II censored samples and associated inference. In:
Balakrishnan, N. J., ed. Advances in Reliability. Boca Raton, FL: CRC Press,
pp. 409–437.
Balakrishnan, N., Chan, P. S. (1995b). Maximum likelihood estimation for the three-
parameter log-gamma distribution under Type-II censoring. In: Balakrishnan, N. J.,
ed. Advances in Reliability. Boca Raton, FL: CRC Press, pp. 439–451.
Balakrishnan, N., Sandhu, R. A. (1995). A simple simulational algorithm for generating
progressive Type-II censored samples. Amer. Statist. 49:229–230.
Balakrishnan, N., Kannan, N., Lin, C. T., Ng, H. K. T. (2003). Point and interval estimation
for Gaussian distribution, based on progressively Type-II censored samples. IEEE
Trans. Reliab. 52:90–95.
Balakrishnan, N., Kannan, N., Lin, C. T., Wu, S. J. S. (2004). Inference for the extreme value
distribution under progressive Type-II censoring. J. Statist. Comp. Simul. 74:25–45.
Burden, R. L., Faires, J. D. (1997). Numerical Analysis. Pacific Grove: Brooks/Cole.
Cohen, A. C., Whitten, B. J. (1988). Parameter Estimation in Reliability and Life Span Models.
New York: Marcel Dekker.
Dempster, A. P., Laird, N. M., Rubin, D. B. (1977). Maximum likelihood from incomplete
data via the EM algorithm. J. Royal Statist. Soc. Ser. B 39:1–38.
DiCiccio, T. J. (1987). Approximate inference for the generalized gamma distribution.
Technometrics 29:33–40.
Farewell, V. T., Prentice, R. L. (1977). A study of distributional shape in life testing.
Technometrics 19:69–75.
Jones, R. A., Scholz, F. W., Ossiander, M., Shorack, G. R. (1985). Tolerance bounds for log
gamma regression models. Technometrics 27:109–118.
Lawless, J. F. (1980). Inference in the generalized gamma and log-gamma distributions.
Technometrics 22:409–419.
Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data. New York: John
Wiley.
Lieblein, J., Zelen, M. (1956). Statistical investigation of the fatigue life of deep groove ball
bearings. J. Res. Nat. Bureau Standards 57:273–316.
Little, R. J. A., Rubin, D. B. (1983). Incomplete data. In: Kotz, S., Johnson, N. L., eds.
Encyclopedia of Statistical Sciences. Vol. 4. New York: John Wiley, pp. 46–53.
Mann, N. R., Schafer, R. E., Singpurwalla, N. D. (1974). Methods for Statistical Analysis of
Reliability and Life Data. New York: John Wiley.
Nelson, W. (1982). Applied Life Data Analysis. New York: John Wiley.
Ng, H. K. T., Chan, P. S., Balakrishnan, N. (2002). Estimation of parameters from
progressively censored data using EM algorithm. Comput. Statist. Data Anal.
39:371–386.
Prentice, R. L. (1974). A log gamma model and its maximum likelihood estimation.
Biometrika 61:539–544.
Smith, S. P., Hammond, K. (1988). Rank regression with log gamma residuals. Biometrika
75:741–751.
Taguchi, T., Sakurai, H., Nakajima, S. (1993). A concentration analysis of income
distribution model and consumption pattern. Introduction of logarithmic gamma
distribution and statistical analysis of Engel elasticity. Statistica 53:31–57.
Tiku, M. L., Tan, W. Y., Balakrishnan, N. (1986). Robust Inference. New York: Marcel
Dekker.
Viveros, R., Balakrishnan, N. (1994). Interval estimation of parameters of life from
progressively censored data. Technometrics 36:84–91.
Wingo, D. R. (1987). Computing maximum-likelihood parameter estimates of the generalized
gamma distribution by numerical root isolation. IEEE Trans. Reliab. 36:586–590.
Young, D. H., Bakir, S. T. (1987). Bias correction for a generalized log-gamma regression
model. Technometrics 29:183–191.