A Two Parameter Distribution Obtained by
DOI 10.1007/s00009-015-0665-5
© Springer International Publishing 2015
1. Introduction
There are many practical situations where well-known distributions do not
provide adequate fits to real data. Among the different methods for generating
new distributions, the compounding of some discrete and important lifetime
distributions has been very popular in lifetime modeling.
Adamidis and Loukas [1] introduced a two-parameter exponential–
geometric (EG) distribution by compounding the exponential and geometric
distributions. A more general family of these distributions was considered
by Marshall and Olkin [9]. Similarly, the exponential Poisson (EP) and
exponential logarithmic distributions were introduced by Kuş [6] and Tahmasbi
and Rezaei [15], respectively. Barreto-Souza et al. [3] proposed the Weibull
geometric (WG) model, while Lu and Shi [7] studied the Weibull–Poisson
(WP) distribution as a natural extension of the EG and EP distributions.
Further, Rodrigues et al. [13] defined the Weibull negative binomial
distribution, which includes the WG and WP distributions as sub-models. Nadarajah
B. V. Popović et al. MJOM
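$$F(x) = F(x;\lambda,\theta) = \int_0^{\infty} G(x\mid\alpha,\lambda)\,\theta e^{-\theta\alpha}\,d\alpha = \left[1 - \frac{1}{\theta}\log\left(1 - e^{-\lambda x}\right)\right]^{-1}, \qquad \lambda,\theta > 0. \tag{2}$$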
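As a minimal sketch (not part of the paper), the closed form in Eq. (2) can be evaluated and inverted directly. The names `log1mexp`, `gee_cdf` and `gee_quantile` are illustrative assumptions, and the quantile expression is obtained here simply by solving F(x) = u for x.

```python
import math

def log1mexp(t):
    """Return log(1 - exp(t)) for t < 0, using the numerically stable branch."""
    if t > -math.log(2.0):
        return math.log(-math.expm1(t))
    return math.log1p(-math.exp(t))

def gee_cdf(x, lam, theta):
    """Eq. (2): F(x) = [1 - (1/theta) * log(1 - exp(-lam * x))]^(-1) for x > 0."""
    if x <= 0.0:
        return 0.0
    return 1.0 / (1.0 - log1mexp(-lam * x) / theta)

def gee_quantile(u, lam, theta):
    """Solve F(x) = u for x, with 0 < u < 1 (derived here, not quoted from the paper)."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    t = theta * (1.0 - 1.0 / u)   # t = log(1 - exp(-lam * x)), always negative
    return -log1mexp(t) / lam

# Round trip at a few probability levels (parameter values are arbitrary).
round_trip = [gee_cdf(gee_quantile(u, 2.0, 0.5), 2.0, 0.5) for u in (0.1, 0.5, 0.9)]
```

The quantile function also gives a direct way to simulate from the distribution by inversion, feeding it uniform random numbers.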
$$\log(1 - z) = -z\sum_{k=0}^{\infty}\frac{z^{k}}{k+1}, \quad \text{if } |z| < 1 \tag{5}$$
and
$$\binom{-x}{n} = \frac{(-1)^{n}\,x^{(n)}}{n!}, \tag{6}$$
where $x^{(k)} = x(x+1)\cdots(x+k-1)$ denotes the rising factorial.
We shall use the exponential integral defined as $\mathrm{Ei}(x) = -\int_{-x}^{\infty}\frac{e^{-t}}{t}\,dt$.
The exponential integral can be easily computed in MATHEMATICA using
the function ExpIntegralEi[x].
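Outside MATHEMATICA, a rough stand-in for Ei can be sketched with the standard series Ei(x) = γ + log|x| + Σ_{k≥1} x^k/(k·k!), valid for x ≠ 0. The function name `exp_integral_ei` is an assumption made for illustration, and the series is only suitable for moderate |x| (for large negative x it suffers from cancellation). Where SciPy is available, `scipy.special.expi` would be the more robust choice.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def exp_integral_ei(x, terms=200):
    """Ei(x) for x != 0 via gamma + log|x| + sum_{k>=1} x^k / (k * k!)."""
    if x == 0.0:
        raise ValueError("Ei is singular at 0")
    total = 0.0
    term = 1.0                    # running value of x^k / k!
    for k in range(1, terms + 1):
        term *= x / k
        total += term / k
    return EULER_GAMMA + math.log(abs(x)) + total

ei_plus_one = exp_integral_ei(1.0)
ei_minus_one = exp_integral_ei(-1.0)
```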
In the sequel, we require the Whittaker function defined by
$$W_{k,m}(z) = \frac{e^{-z/2}\,z^{k}}{\Gamma\left(\frac{1}{2} - k + m\right)}\int_0^{+\infty} t^{-k-1/2+m}\left(1 + \frac{t}{z}\right)^{k-1/2+m} e^{-t}\,dt,$$
where Γ(·) is the gamma function. The values of the Whittaker function can
be obtained using the MATHEMATICA function WhittakerW[k,m,z].
We use an equation of Gradshteyn and Ryzhik ([4], equation 0.314) for
a power series raised to a positive integer power p
$$\left(\sum_{n\ge 0} a_n x^{n}\right)^{p} = \sum_{n\ge 0} c_{p,n}\,x^{n}, \tag{7}$$
where the coefficients $c_{p,n}$ can be determined using the following recurrence
relation (for $m \ge 1$ and $c_{p,0} = a_0^{p}$):
$$c_{p,m} = (m\,a_0)^{-1}\sum_{k=1}^{m}\left[k(p+1) - m\right]a_k\,c_{p,m-k}.$$
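The recurrence under Eq. (7) is easy to check numerically. The sketch below uses exact rational arithmetic; the function name `power_series_power` and the truncation argument `n_terms` are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction

def power_series_power(a, p, n_terms):
    """First n_terms coefficients c_{p,m} of (sum_n a[n] x^n)^p.

    Uses c_{p,0} = a_0^p and, for m >= 1,
    c_{p,m} = (m a_0)^(-1) * sum_{k=1}^{m} [k(p+1) - m] a_k c_{p,m-k}.
    Coefficients of `a` beyond len(a) are treated as zero.
    """
    a = [Fraction(v) for v in a]
    if a[0] == 0:
        raise ValueError("the recurrence requires a_0 != 0")

    def coef(k):
        return a[k] if k < len(a) else Fraction(0)

    c = [a[0] ** p]
    for m in range(1, n_terms):
        s = sum((k * (p + 1) - m) * coef(k) * c[m - k] for k in range(1, m + 1))
        c.append(s / (m * a[0]))
    return c

# (1 + x)^3 = 1 + 3x + 3x^2 + x^3
cube = power_series_power([1, 1], 3, 5)
# Square of the series with a_k = 1/(k+1), which is the series appearing in Eq. (5)
square = power_series_power([Fraction(1, k + 1) for k in range(3)], 2, 3)
```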
The rest of the paper is organized as follows. We discuss the shapes of
the pdf and cdf of the new distribution in Sect. 2. We provide in Sect. 3
several mathematical properties of the GEE model such as the ordinary and
incomplete moments, probability-weighted moments (PWMs), mean deviations,
Bonferroni and Lorenz curves, generating function and order statistics.
The Rényi and Shannon entropies are derived in Sect. 4. The parameters
Proof. Let us suppose that $F(x;\lambda_1,\theta_1) = F(x;\lambda_2,\theta_2)$ for all $x > 0$. We will
show that this condition implies that $\lambda_1 = \lambda_2$ and $\theta_1 = \theta_2$. First, we note
that this condition implies that $h(x;\lambda_1,\theta_1) = h(x;\lambda_2,\theta_2)$ for all $x > 0$. Now,
letting $x$ tend to $\infty$ on both sides, and using the result from Theorem 4
that $h(\infty;\lambda,\theta) = \lambda$, we obtain $\lambda_1 = \lambda_2$. Next, by (2), the condition
$F(x;\lambda_1,\theta_1) = F(x;\lambda_2,\theta_2)$ implies that
$\frac{1}{\theta_1}\log(1 - e^{-\lambda_1 x}) = \frac{1}{\theta_2}\log(1 - e^{-\lambda_2 x})$, and since $\lambda_1 = \lambda_2$,
we finally obtain $\theta_1 = \theta_2$. Thus, we have proved the identifiability of the
distribution function $F$.
3. Mathematical Properties
Without loss of generality, and to simplify the final expressions in
Sects. 3 and 4, we set λ = 1.
3.1. Moments
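By means of Theorem 2, the nth moment of X can be expressed as
$$E(X^{n}) = E\left\{\left[-\log\left(1 - e^{-V}\right)\right]^{n}\right\} = (-1)^{n}\,\frac{1}{\theta}\int_0^{+\infty}\frac{\left[\log\left(1 - e^{-v}\right)\right]^{n}}{\left(1 + \frac{1}{\theta}v\right)^{2}}\,dv.$$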
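The representation above lends itself to a quick Monte Carlo check (with λ = 1). The sketch assumes, as the integrand suggests, that V has density (1/θ)(1 + v/θ)^(-2) on (0, ∞), so that V = θU/(1 − U) for U uniform on (0, 1); the function names, sample size and seed are hypothetical choices.

```python
import math
import random

def log1mexp(t):
    """Return log(1 - exp(t)) for t < 0, using the numerically stable branch."""
    if t > -math.log(2.0):
        return math.log(-math.expm1(t))
    return math.log1p(-math.exp(t))

def sample_x_via_v(n, theta, seed=0):
    """Draw n values of X = -log(1 - exp(-V)), with V generated by inversion.

    Assumption: V has cdf v / (theta + v) on (0, inf).
    """
    rng = random.Random(seed)
    xs = []
    while len(xs) < n:
        u = rng.random()              # uniform on [0, 1)
        if u == 0.0:                  # skip the endpoint, where V = 0
            continue
        v = theta * u / (1.0 - u)
        xs.append(-log1mexp(-v))
    return xs

def cdf_unit_rate(x, theta):
    """F(x; 1, theta) from Eq. (2)."""
    return 1.0 / (1.0 - log1mexp(-x) / theta)

xs = sample_x_via_v(100_000, theta=2.0)
mc_mean = sum(xs) / len(xs)                          # Monte Carlo estimate of E(X)
mc_cdf_at_1 = sum(x <= 1.0 for x in xs) / len(xs)    # compare with cdf_unit_rate(1.0, 2.0)
```

If the assumed law of V is right, the empirical proportion `mc_cdf_at_1` should agree with the closed-form cdf up to Monte Carlo error, and `mc_mean` gives a reference value against which series expressions for E(X) can be compared.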
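By setting $1 + \frac{1}{\theta}v = u$, using both (5) and (7), the last equation reduces
to
$$E(X^{n}) = \sum_{k\ge 0} c_{n,k}\int_0^{+\infty}\frac{e^{-\theta(n+k)v}}{(v+1)^{2}}\,dv, \tag{8}$$
where (for $i \ge 1$)
$$c_{n,i} = \frac{1}{i}\sum_{j=1}^{i}\frac{[j(n+1) - i]}{(j+1)}\,c_{n,i-j}, \tag{9}$$
and $c_{n,0} = 1$. From the result (3.353.2) in [4], Eq. (8) can be expressed as
$$E(X^{n}) = \sum_{k\ge 0} c_{n,k}\left\{1 + \theta(n+k)\,e^{\theta(n+k)}\,\mathrm{Ei}[-\theta(n+k)]\right\}. \tag{10}$$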
Let $X_{i:n}$ denote the ith order statistic. From Eqs. (2) and (3), the power series
expansion of the pdf of $X_{i:n}$ can be expressed as
$$f_{i:n}(x) = \frac{K\,e^{-x}}{\theta\,(1 - e^{-x})}\sum_{j=0}^{n-i}\frac{(-1)^{j}\binom{n-i}{j}}{\left[1 - \frac{1}{\theta}\log(1 - e^{-x})\right]^{i+j+1}},$$
where $K = \frac{n!}{(i-1)!\,(n-i)!}$.
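We obtain the moments of the order statistics using an approach similar to
that of Sect. 2. We have
$$E\left(X_{i:n}^{n}\right) = K\sum_{j=0}^{n-i}\sum_{k\ge 0}\frac{(-1)^{i}\,c_{n,k}}{(i+j)!}\binom{n-i}{j}\,[\theta(n+k)]^{i+j}\left\{\sum_{l=1}^{i+j}\frac{(l-1)!}{[-\theta(n+k)]^{l}} - e^{\theta(n+k)}\,\mathrm{Ei}[-\theta(n+k)]\right\}. \tag{12}$$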
4. Entropy
Entropy is a measure of the variation or uncertainty of a random variable.
Two popular entropy measures are the Rényi and Shannon entropies [12,14].
The Rényi entropy of a random variable with pdf $f(\cdot)$ is defined as
$$I_R(\gamma) = \frac{1}{1-\gamma}\log\int_0^{\infty} f^{\gamma}(x)\,dx$$
for $\gamma > 0$ and $\gamma \ne 1$.
Here, we derive explicit expressions for the Rényi and Shannon entropies of
X. We have
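$$\begin{aligned}
I_R(\gamma) &= \frac{1}{1-\gamma}\log\left\{\left(\frac{1}{\theta}\right)^{\gamma}\int_0^{\infty}\frac{e^{-\gamma x}}{(1 - e^{-x})^{\gamma}\left[1 - \frac{1}{\theta}\log(1 - e^{-x})\right]^{2\gamma}}\,dx\right\}\\
&= \frac{1}{1-\gamma}\log\left\{\left(\frac{1}{\theta}\right)^{\gamma}\int_0^{\infty}\frac{e^{-(\gamma-1)x}}{(1 - e^{-x})^{\gamma-1}}\cdot\frac{e^{-x}}{(1 - e^{-x})\left[1 - \frac{1}{\theta}\log(1 - e^{-x})\right]^{2\gamma}}\,dx\right\}.
\end{aligned}$$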
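Then, using the binomial expansion of $(1 - e^{-x})^{1-\gamma}$ and the same algebra
which leads to (10), the last equation becomes
$$I_R(\gamma) = \frac{1}{1-\gamma}\log\left\{\left(\frac{1}{\theta}\right)^{\gamma-1}\sum_{i,j\ge 0}(-1)^{i+j}\binom{1-\gamma}{i}\binom{\gamma-1+i}{j}\int_0^{+\infty}\frac{e^{-\theta j v}}{(1+v)^{2\gamma}}\,dv\right\}. \tag{13}$$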
(14)
Taking the limit $\gamma \to 1$ in Eq. (14) is very complicated, so we instead derive
an explicit expression for the Shannon entropy directly from its definition. We
can write
$$E\{-\log[f(X)]\} = \log\theta + E(X) + E\left[\log\left(1 - e^{-X}\right)\right] + 2E\left\{\log\left[1 - \frac{1}{\theta}\log\left(1 - e^{-X}\right)\right]\right\}. \tag{15}$$
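The first expectation in (15) follows easily from (10) for n = 1. Using
Eq. (5) and the same approach as in Sect. 2, we obtain
$$E\left[\log\left(1 - e^{-X}\right)\right] = -\sum_{i,j\ge 0}\frac{(-1)^{j}}{i}\binom{i}{j}\left[1 + j\theta\,e^{j\theta}\,\mathrm{Ei}(-j\theta)\right]. \tag{16}$$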
Setting $1 - \frac{1}{\theta}\log(1 - e^{-x}) = u$, we easily obtain
$$E\left\{\log\left[1 - \frac{1}{\theta}\log\left(1 - e^{-X}\right)\right]\right\} = 1. \tag{17}$$
By inserting (10) (for n = 1), (16), and (17) into (15), we obtain the Shannon
entropy.
5. Estimation
In this section, we consider the maximum likelihood estimation (MLE) of the
unknown parameters λ and θ. Suppose that $x_1, \ldots, x_n$ is an observed sample of
size n from the distribution (3). The log-likelihood function for (λ, θ) is given by
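$$l(\lambda,\theta) = n\log\lambda - n\log\theta - \lambda\sum_{i=1}^{n}x_i - \sum_{i=1}^{n}\log\left(1 - e^{-\lambda x_i}\right) - 2\sum_{i=1}^{n}\log\left[1 - \frac{1}{\theta}\log\left(1 - e^{-\lambda x_i}\right)\right].$$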
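A direct transcription of this log-likelihood into code might look as follows. It is a sketch only: the names `log1mexp` and `gee_loglik` are illustrative, and no attempt is made to guard against invalid inputs such as non-positive observations.

```python
import math

def log1mexp(t):
    """Return log(1 - exp(t)) for t < 0, using the numerically stable branch."""
    if t > -math.log(2.0):
        return math.log(-math.expm1(t))
    return math.log1p(-math.exp(t))

def gee_loglik(xs, lam, theta):
    """Log-likelihood l(lam, theta) for positive observations xs."""
    n = len(xs)
    total = n * math.log(lam) - n * math.log(theta)
    for x in xs:
        log_g = log1mexp(-lam * x)    # log(1 - exp(-lam * x)), always negative
        # log[1 - (1/theta) * log_g] is computed as log1p(-log_g / theta)
        total += -lam * x - log_g - 2.0 * math.log1p(-log_g / theta)
    return total
```

Such a function can be passed (negated) to a general-purpose optimizer to obtain the joint estimates of λ and θ numerically.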
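So, the components of the score function satisfy the equations
$$\frac{\partial l(\lambda,\theta)}{\partial\lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n}\frac{x_i}{1 - e^{-\lambda x_i}} + \frac{2}{\theta}\sum_{i=1}^{n}\frac{x_i\,e^{-\lambda x_i}}{\left(1 - e^{-\lambda x_i}\right)\left[1 - \frac{1}{\theta}\log\left(1 - e^{-\lambda x_i}\right)\right]} = 0, \tag{18}$$
$$\frac{\partial l(\lambda,\theta)}{\partial\theta} = -\frac{n}{\theta} - \frac{2}{\theta^{2}}\sum_{i=1}^{n}\frac{\log\left(1 - e^{-\lambda x_i}\right)}{1 - \frac{1}{\theta}\log\left(1 - e^{-\lambda x_i}\right)} = 0. \tag{19}$$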
Now, we will study the existence and uniqueness of the MLE of each parameter
when the other parameter is known (or given).
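Theorem 6. If the parameter θ is known, then Eq. (18) has at least
one root in the interval (0, +∞).
Proof. One can readily verify that $\lim_{\lambda\to 0}\frac{\partial l(\lambda,\theta)}{\partial\lambda} = +\infty$ and
$\lim_{\lambda\to+\infty}\frac{\partial l(\lambda,\theta)}{\partial\lambda} = -\sum_{i=1}^{n}x_i < 0$. Since the left-hand side of (18) is
continuous in λ, there exists at least one solution in the interval (0, +∞).
This completes the proof.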
Theorem 7. Let us suppose that the parameter λ is known. Then, the root
of Eq. (19) lies in the interval $\left[-\log\left(1 - e^{-\lambda x_{(n)}}\right),\, -\log\left(1 - e^{-\lambda x_{(1)}}\right)\right]$
and is unique for all λ > 0, where $x_{(1)} = \min(x_1, x_2, \ldots, x_n)$ and
$x_{(n)} = \max(x_1, x_2, \ldots, x_n)$.
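Proof. Let us define the function $\psi(\theta) = -\frac{n}{2} + \sum_{i=1}^{n}\frac{y_i}{\theta + y_i}$, where
$y_i = -\log(1 - e^{-\lambda x_i})$. Then, the function $\frac{\partial l(\lambda,\theta)}{\partial\theta}$ can be represented as
$\frac{\partial l(\lambda,\theta)}{\partial\theta} = \frac{2}{\theta}\psi(\theta)$, which implies that we can derive its behavior through the
behavior of the function ψ(θ). The function ψ(θ) is almost surely decreasing on (0, +∞),
and it holds that $\lim_{\theta\downarrow 0}\psi(\theta) = \frac{n}{2}$ and $\lim_{\theta\uparrow+\infty}\psi(\theta) = -\frac{n}{2}$. Thus, the function ψ(θ)
has a unique root, which implies that Eq. (19) has a unique solution.
Also, since $y_{(1)} + y_i \le 2y_i$ for all i = 1, 2, . . . , n, we have
$$\psi\left(y_{(1)}\right) = -\frac{n}{2} + \sum_{i=1}^{n}\frac{y_i}{y_{(1)} + y_i} \ge -\frac{n}{2} + \sum_{i=1}^{n}\frac{1}{2} = 0,$$
and analogously
$$\psi\left(y_{(n)}\right) = -\frac{n}{2} + \sum_{i=1}^{n}\frac{y_i}{y_{(n)} + y_i} \le -\frac{n}{2} + \sum_{i=1}^{n}\frac{1}{2} = 0.$$
From the last two inequalities, the root of Eq. (19) lies in the interval
$[y_{(1)}, y_{(n)}]$, where $y_{(1)} = -\log(1 - e^{-\lambda x_{(n)}})$ and $y_{(n)} = -\log(1 - e^{-\lambda x_{(1)}})$.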
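Because ψ(θ) is decreasing and changes sign on the interval found above, the root of Eq. (19) for a given λ can be located by plain bisection. The following sketch assumes exactly that bracket; the function names, tolerance and the small data set are hypothetical.

```python
import math

def log1mexp(t):
    """Return log(1 - exp(t)) for t < 0, using the numerically stable branch."""
    if t > -math.log(2.0):
        return math.log(-math.expm1(t))
    return math.log1p(-math.exp(t))

def solve_theta_given_lambda(xs, lam, tol=1e-12, max_iter=200):
    """Bisection for the root of psi(theta) = -n/2 + sum_i y_i / (theta + y_i)."""
    ys = [-log1mexp(-lam * x) for x in xs]   # y_i = -log(1 - exp(-lam * x_i)) > 0
    n = len(ys)

    def psi(theta):
        return -0.5 * n + sum(y / (theta + y) for y in ys)

    lo, hi = min(ys), max(ys)                # psi(lo) >= 0 >= psi(hi) by Theorem 7
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if psi(mid) > 0.0:                   # psi is decreasing: root lies to the right
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, hi):
            break
    return 0.5 * (lo + hi)

demo_xs = [0.1, 0.5, 1.2, 2.0, 3.5]          # hypothetical data, for illustration only
theta_hat = solve_theta_given_lambda(demo_xs, lam=1.0)
```

Bisection is chosen here only because the bracket and monotonicity are already established; any bracketing root finder would serve equally well.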
Compounding the Generalized Exponential
Model     λ̂          α̂          θ̂          β̂          AIC      KS       p
GE        0.0187     0.7798     –          –          483.99   0.2042   0.0309
          (0.0036)   (0.1352)
GEE       0.0383     –          0.2158     –          473.57   0.1491   0.2162
          (0.0058)              (0.0883)
MOE       0.0326     2.6214     –          –          483.10   0.1617   0.1464
          (0.0070)   (1.0927)
Weibull   0.9490     –          –          44.9125    486      0.1928   0.0486
          (0.1095)                         (6.9465)
[Figure: plot with vertical axis "Density" (0.00 to 0.06) and horizontal range 0 to 80]
6. Application
Here, we use two real data sets to compare the fit of the GEE distribution
with those of the Weibull, generalized exponential (GE), and Marshall–
Olkin exponential (MOE) distributions. The parameters are estimated by
maximum likelihood and reported together with standard errors in parentheses
in Tables 2 and 4.
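[Figure: plot with vertical axis "Fn(x)" (0.0 to 1.0) and horizontal axis "x" (0 to 80)]
[Figure: plot with vertical range 0.00 to 0.10 and horizontal range 0 to 12]
[Figure: plot with vertical axis "Fn(x)" (0.0 to 1.0) and horizontal axis "x" (0 to 12)]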
Model     λ̂          α̂          θ̂          β̂          AIC      KS       p
GE        0.1570     0.8377     –          –          113.24   0.2493   0.1397
          (0.0467)   (0.2300)
GEE       0.4202     –          0.0974     –          100.41   0.1375   0.7958
          (0.0939)              (0.0707)
MOE       0.3984     8.0859     –          –          107.78   0.1472   0.7254
          (0.1046)   (5.7289)
Weibull   1.0892     –          –          5.8163     113.50   0.2205   0.2465
          (0.2210)                         (1.2218)
Figure 3 displays the histogram of the current data and the fitted pdfs.
Figure 4 displays the empirical cdf for the current data superimposed with
the fitted cdfs. These two figures reinforce that the GEE distribution provides
the best fit to the blood cancer data.
We compute the MLEs of the model parameters and adopt the Akaike
information criterion (AIC) and the p values of the Kolmogorov–
Smirnov (KS) test to compare the fitted models. The results in Table 2
indicate that the GEE distribution has the lowest AIC and the largest p value
among all fitted distributions.
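For readers who want to reproduce the ranking, the comparison can be sketched from the standard definition AIC = 2k − 2l, where k is the number of parameters and l the maximized log-likelihood. The AIC values below are copied from the first of the two tables of fitted models above; the assumption that every model has k = 2 parameters is inferred from the two estimates reported per row.

```python
def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 * (maximized log-likelihood)."""
    return 2.0 * n_params - 2.0 * loglik

# AIC column of the first fitted-model table (GE, GEE, MOE, Weibull).
aic_table = {"GE": 483.99, "GEE": 473.57, "MOE": 483.10, "Weibull": 486.0}

# The preferred model is the one with the smallest AIC.
best_model = min(aic_table, key=aic_table.get)

# Maximized log-likelihoods implied by the reported AIC values, assuming k = 2.
implied_loglik = {name: (2.0 * 2 - value) / 2.0 for name, value in aic_table.items()}
```

Since all four models are assumed to have the same number of parameters, ranking by AIC is here equivalent to ranking by maximized log-likelihood.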
7. Concluding Remarks
We introduce a new two-parameter model, called the generalized exponential
exponential (GEE) distribution, and study some of its structural properties.
We provide explicit expressions for the density function, moments and
incomplete moments, probability-weighted moments, generating function, mean
deviations, Bonferroni and Lorenz curves, and two measures of entropy. Our
formulas for the GEE model are manageable and, with the use of
modern computer resources with analytic and numerical capabilities, may
become useful tools for applied statisticians. The
model parameters are estimated by maximum likelihood, and the existence
of the ML estimates is proved. The distribution is very competitive with
other lifetime distributions. In fact, two examples with real data show that
it can provide better fits than some widely known lifetime distributions.
Acknowledgements
We are grateful to the Editor and the anonymous referees for comments that
greatly improved the quality of the paper. The authors are very grateful to
one of the referees for providing the proof of Theorem 7.
References
[1] Adamidis, K., Loukas, S.: A lifetime distribution with decreasing failure
rate. Stat. Probab. Lett. 39, 35–42 (1998)
[2] Aarset, M.V.: How to identify bath tub hazard rate. IEEE Trans. Reliab. 36(1),
106–108 (1987)
[3] Barreto-Souza, W., Morais, A.L., Cordeiro, G.M.: The Weibull-geometric
distribution. J. Stat. Comput. Simul. 81, 645–657 (2010)
[4] Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products. 7th
edn. Academic Press, San Diego (2007)
[5] Gupta, R.D., Kundu, D.: Generalized exponential distribution. Aust. N. Z. J.
Stat. 41, 173–188 (1999)
[6] Kuş, C.: A new lifetime distribution. Comput. Stat. Data Anal. 51, 4497–
4509 (2007)
[7] Lu, W., Shi, D.: A new compounding life distribution: the Weibull-Poisson
distribution. J. Appl. Stat. 39, 21–38 (2012)
[8] Marshall, A.W., Olkin, I.: Life Distributions. Springer, New York (2007)
[9] Marshall, A.W., Olkin, I.: A new method for adding a parameter to a family
of distributions with application to the exponential and Weibull families.
Biometrika 84, 641–652 (1997)
[10] Murthy, D.N.P., Xie, M., Jiang, R.: Weibull Models. Wiley, New York (2005)
[11] Nadarajah, S., Popović, B.V., Ristić, M.M.: Compounding: an R package for
computing continuous distributions obtained by compounding a continuous
and a discrete distribution. Comput. Stat. 28, 977–992 (2013)
[12] Rényi, A.: On measures of entropy and information. In: Proceedings of the 4th
Berkeley Symposium on Mathematical Statistics and Probability, Volume I,
pp. 547–561. University of California Press, Berkeley (1961)
[13] Rodrigues, C., Cordeiro, G.M., Demetrio, C.G.B., Ortega, E.M.M.: The
Weibull negative binomial distribution. Adv. Appl. Stat. 22, 25–55 (2011)
[14] Shannon, C.E.: Prediction and entropy of printed English. The Bell System
Tech. J. 30, 50–64 (1951)
[15] Tahmasbi, R., Rezaei, S.: A two-parameter lifetime distribution with decreasing
failure rate. Comput. Stat. Data Anal. 52, 3889–3901 (2008)
Božidar V. Popović
Faculty of Philosophy
University of Montenegro
Nikšić, Montenegro
Miroslav M. Ristić
Faculty of Sciences and Mathematics
University of Niš
Niš, Serbia
Gauss M. Cordeiro
Departamento de Estatı́stica
Universidade Federal de Pernambuco
Recife, PE, Brazil