ISSN 0104-0499
UNIVERSIDADE FEDERAL DE SÃO CARLOS
CENTRO DE CIÊNCIAS EXATAS E TECNOLÓGICAS
TECHNICAL REPORT, DEPARTAMENTO DE ESTATÍSTICA
THEORY AND METHOD, SERIES A
January 2011, No. 235
THE EXPONENTIATED EXPONENTIAL-GEOMETRIC DISTRIBUTION: A DISTRIBUTION WITH
DECREASING, INCREASING AND UNIMODAL HAZARD FUNCTION
Francisco Louzada-Neto, Vitor Marchi, Mari Roman
The Exponentiated Exponential-Geometric
Distribution: A distribution with decreasing,
increasing and unimodal hazard function
Francisco Louzada, Vitor Marchi and Mari Roman
Department of Statistics, Universidade Federal de Sao Carlos, Brazil
Abstract
In this paper we propose a new family of distributions, namely the Exponentiated Exponential-Geometric (E2G) distribution. The E2G distribution is a straightforward generalization of the EG distribution proposed by [1], and it accommodates increasing, decreasing and unimodal hazard functions. It arises in a latent competing risks scenario, where the lifetime associated with a particular risk is not observable and only the minimum lifetime value among all risks is observed. The properties of the proposed distribution are discussed, including a formal proof of its probability density function and explicit algebraic formulas for its survival and hazard functions, moments, rth moment of the ith order statistic, mean residual lifetime and modal value. Maximum likelihood inference is implemented straightforwardly. From a misspecification simulation study, performed in order to assess the extent of the misspecification errors when testing the EG distribution against the E2G one, we observed that it is usually possible to discriminate between both distributions even for moderate samples in the presence of censoring. The practical importance of the new distribution was demonstrated in three applications where we compare it with several existing lifetime distributions.
Key-words: Exponentiated Exponential Distribution, Geometric Distribution, Latent Competing Risks, Survival Analysis, Censored data, Unimodal failure rate function
1 Introduction
In recent years, several new classes of models grounded in the simple exponential distribution have been introduced. The exponential distribution is widely used for modeling lifetimes in many survival problems, but its hazard function is constant, so it does not provide a reasonable parametric fit in practical applications where the underlying hazard function is non-constant. The main idea of the new classes is to accommodate such applications. For instance, [1] proposed a variation of the exponential distribution, the exponential geometric (EG) distribution, with decreasing hazard function; [2] proposed a generalized exponential distribution, which can accommodate data with increasing and decreasing hazard functions; [3] proposed another modification of the exponential distribution with decreasing hazard function; and [4] generalized the distribution proposed by [3] by including a power parameter, which allows increasing, decreasing and unimodal hazard functions. [5] proposed the complementary exponential power distribution by exponentiating the exponential power distribution proposed by [6].
In this paper, we propose a new lifetime distribution family, a direct extension of the EG distribution [1], obtained by compounding the exponentiated exponential distribution [7] with the usual geometric distribution. Hereafter we shall refer to the new distribution as the exponentiated exponential geometric (E2G) distribution. Besides being able to accommodate increasing, decreasing and unimodal hazard functions, the E2G distribution can be seen as a mechanistic framework with an easy practical interpretation. Its genesis is based on a latent competing risks problem [8], in the sense that there is no information about which factor was responsible for the component failure and only the minimum lifetime value among all risks is observed. On many occasions this information is not available, or the true cause of failure cannot be specified by an expert. Further, the true cause of failure can be masked from our view.
The paper is organized as follows. In Section 2, we introduce the new E2G distribution and present some of its properties: we derive expressions for the probability density function, the survival function, the r-th raw moments and the modal value, and we present the inferential procedure. In Section 3 we discuss the relationship between the exponential, EG and E2G distributions based on the hazard function, and report the results of a misspecification simulation study performed to verify whether the EG and E2G distributions can be distinguished in the light of the data using a usual distribution comparison criterion. In Section 4 we fit the E2G distribution to three real datasets and compare it with the fits of several usual lifetime distributions, pointing out its relative superiority. Some final comments in Section 5 conclude the paper.
2 The E2G model
Let Y be a nonnegative random variable denoting the lifetime of a component in some population. The random variable Y is said to have an E2G distribution with parameters λ > 0, α > 0 and 0 < θ < 1 if its probability density function is given by

$$ f(y) = \frac{\alpha\lambda\theta\, e^{-\lambda y}\,(1 - e^{-\lambda y})^{\alpha-1}}{\left[1 - (1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)\right]^{2}}, \quad y > 0, \tag{1} $$

where λ is a scale parameter of the distribution, and α and θ are shape parameters. For α = 1 the E2G distribution reduces to the EG distribution [1]. Figure 1 (left panel) shows the E2G probability density function for θ = 0.01, 0.5, 0.99 and α = 0.3, 1, 10.
The survival function of an E2G distributed random variable is given by

$$ S(y) = \frac{\theta\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)}{1 - (1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)}, \tag{2} $$

where α > 0, θ ∈ (0, 1) and λ > 0.
From (1) and (2), the hazard function, according to the relationship h(y) = f(y)/S(y), is given by

$$ h(y) = \frac{\alpha\lambda\, e^{-\lambda y}\,(1 - e^{-\lambda y})^{\alpha-1}}{\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)\left[1 - (1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)\right]}. \tag{3} $$

Its initial value is not finite if α < 1; otherwise h(0) = λ/θ if α = 1 and h(0) = 0 if α > 1. The long-term hazard function value is h(∞) = λ. The hazard function (3) can be increasing, decreasing or unimodal, as shown in Figure 1 (right panel), which shows some hazard function shapes for θ = 0.01, 0.5, 0.99 and α = 0.3, 1, 10.
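The quantile function of the E2G distribution, that is, the inverse of the distribution function, which solves F(Q(u)) = u, is given by

$$ Q(u) = F^{-1}(u) = -\frac{1}{\lambda}\ln\left(1 - \left(\frac{u\theta}{1 - u + u\theta}\right)^{1/\alpha}\right), \quad 0 < u < 1, \tag{4} $$

where F(y) = 1 − S(y) is the cumulative distribution function of Y. If u is drawn from the uniform U(0, 1) distribution, then Q(u) has the E2G distribution.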
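The closed forms (1), (2) and (4) translate directly into code. The following is a minimal Python sketch, using only the standard library, that evaluates them and draws E2G values by inverse transform sampling. The function names (`e2g_pdf`, `e2g_survival`, `e2g_quantile`, `e2g_sample`) and the seed are illustrative choices and do not come from the paper.

```python
import math
import random

def e2g_survival(y, lam, alpha, theta):
    """Survival function (2) of the E2G distribution."""
    s_t = 1.0 - (-math.expm1(-lam * y)) ** alpha  # survival of one latent lifetime
    return theta * s_t / (1.0 - (1.0 - theta) * s_t)

def e2g_pdf(y, lam, alpha, theta):
    """Density (1), intended for y > 0."""
    w = -math.expm1(-lam * y)  # 1 - exp(-lam * y)
    denom = 1.0 - (1.0 - theta) * (1.0 - w ** alpha)
    return alpha * lam * theta * math.exp(-lam * y) * w ** (alpha - 1.0) / denom ** 2

def e2g_quantile(u, lam, alpha, theta):
    """Quantile function (4), for 0 <= u < 1."""
    v = (u * theta / (1.0 - u + u * theta)) ** (1.0 / alpha)
    return -math.log1p(-v) / lam

def e2g_sample(n, lam, alpha, theta, rng):
    """Draw n values by applying (4) to uniform random numbers."""
    return [e2g_quantile(rng.random(), lam, alpha, theta) for _ in range(n)]

rng = random.Random(235)  # arbitrary seed, only for reproducibility
print(e2g_sample(5, lam=1.0, alpha=3.0, theta=0.5, rng=rng))
```

Because (4) is an explicit inverse, no rejection step or numerical root finding is needed to simulate from the distribution.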
[Figure 1 panels, λ = 1.0: one panel for each of θ = 0.01, 0.5 and 0.99, with curves for α = 0.30, 1.00 and 10.0. Horizontal axis: Times (0 to 6). Vertical axis: Density (0 to 2) in the left panels and Hazard Function (0 to 4) in the right panels.]
Figure 1: Left panel: Probability density function of the E2G distribution. Right panel: hazard
function of the E2G distribution. We fixed λ = 1.
2.1 Genesis
In classical competing risks scenarios the lifetime associated with a particular risk is not observable; rather, we observe only the minimum lifetime value among all risks. Simplistically, in reliability, we observe only the minimum component lifetime of a series system. That is, the observable quantities for each component are the minimum lifetime value to failure among all risks and the cause of failure. Competing risks problems arise in several areas, and full statistical procedures and an extensive literature are available. Interested readers can refer to [9], [10] and [11]. A difficulty arises, however, if the risks are latent in the sense that there is no information about which factor was responsible for the component failure, which is often the case in field data. We call these latent competing risks data [8]. On many occasions this information is not available, or the true cause of failure cannot be specified by an expert. In reliability, the components can be totally destroyed in the experiment. Further, the true cause of failure can be masked from our view. In medical studies, a patient can die and the true cause can be attributed to multiple unknown risks.
In this context, our model can be derived as follows. Let M be a random variable denoting the number of failure causes, and assume that M has a geometric distribution with probability mass function

$$ P(M = m) = \theta(1-\theta)^{m-1}, \tag{5} $$

where 0 < θ < 1 and m = 1, 2, ....
Also consider t_i, i = 1, 2, 3, ..., realizations of random variables T_i denoting the failure times, i.e., the time-to-event due to the i-th latent competing risk, and assume that T_i has an exponentiated exponential distribution with probability density function, indexed by λ and α, given by

$$ f(t_i; \lambda, \alpha) = \alpha\lambda \exp\{-\lambda t_i\}\left(1 - \exp\{-\lambda t_i\}\right)^{\alpha-1}. \tag{6} $$

The exponentiated exponential distribution, first proposed by [2] and [7], is an alternative to the Weibull and the gamma distributions.
In the latent competing risks scenario, the number of causes M and the lifetime T_i associated with a particular cause are not observable (latent variables); only the minimum lifetime Y among all causes is usually observed. So, we only observe the random variable given by

$$ Y = \min\left(T_1, T_2, \ldots, T_M\right). \tag{7} $$

The following result shows that the random variable Y has probability density function given by (1).
Proposition 2.1. If the random variable Y is defined as in (7), then, considering (5) and (6), Y is distributed according to an E2G distribution, with probability density function given by (1).
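Proof 2.1. Taking the T_i to be independent, identically distributed and independent of M, the conditional density function of (7) given M = m is

$$ f(y \mid M = m) = m\,\alpha\lambda\, e^{-\lambda y}(1 - e^{-\lambda y})^{\alpha-1}\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)^{m-1}, \quad y > 0,\ m = 1, 2, \ldots $$

Then, the marginal probability density function of Y is

$$
\begin{aligned}
f(y) &= \sum_{m=1}^{\infty} m\,\alpha\lambda\, e^{-\lambda y}(1 - e^{-\lambda y})^{\alpha-1}\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)^{m-1} \times \theta(1-\theta)^{m-1} \\
&= \theta\alpha\lambda\, e^{-\lambda y}(1 - e^{-\lambda y})^{\alpha-1}\sum_{m=1}^{\infty} m\left[(1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)\right]^{m-1} \\
&= \theta\alpha\lambda\, e^{-\lambda y}(1 - e^{-\lambda y})^{\alpha-1}\left(\frac{1}{1 - (1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)}\right)^{2},
\end{aligned}
$$

where the last equality uses $\sum_{m \ge 1} m x^{m-1} = (1-x)^{-2}$ for |x| < 1. This completes the proof.
□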
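Proposition 2.1 can be checked by simulation. The sketch below, in Python with only the standard library, draws M from (5), draws M latent lifetimes from (6) by inverting the exponentiated exponential distribution function, keeps the minimum as in (7), and compares the empirical survival proportion with (2). The function names, the seed and the parameter values are illustrative assumptions, and independence of the latent lifetimes is assumed.

```python
import math
import random

def sample_geometric(theta, rng):
    """Number of causes M with P(M = m) = theta * (1 - theta)**(m - 1), m = 1, 2, ..."""
    m = 1
    while rng.random() >= theta:
        m += 1
    return m

def sample_ee(lam, alpha, rng):
    """One latent lifetime, by inverting F(t) = (1 - exp(-lam * t))**alpha."""
    u = rng.random()
    return -math.log1p(-u ** (1.0 / alpha)) / lam

def sample_min_lifetime(lam, alpha, theta, rng):
    """Observed lifetime Y = min(T_1, ..., T_M) as in (7)."""
    m = sample_geometric(theta, rng)
    return min(sample_ee(lam, alpha, rng) for _ in range(m))

def e2g_survival(y, lam, alpha, theta):
    """Survival function (2)."""
    s_t = 1.0 - (-math.expm1(-lam * y)) ** alpha
    return theta * s_t / (1.0 - (1.0 - theta) * s_t)

rng = random.Random(2011)  # arbitrary seed
lam, alpha, theta = 1.0, 3.0, 0.5
ys = [sample_min_lifetime(lam, alpha, theta, rng) for _ in range(20000)]
for y0 in (0.5, 1.0, 2.0):
    empirical = sum(y > y0 for y in ys) / len(ys)
    print(y0, empirical, e2g_survival(y0, lam, alpha, theta))
```

The two printed columns should agree up to Monte Carlo error, which is of order 0.005 for 20,000 replicates.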
2.2 Some Properties
Many of the interesting characteristics and features of a distribution can be studied through its moments, such as the mean and the variance. Expressions for the mathematical expectation, the variance and the rth moment about the origin of a nonnegative random variable X with survival function S(x) can be obtained using the well-known formula

$$ E[X^{r}] = r\int_{0}^{\infty} x^{r-1} S(x)\,dx. \tag{8} $$
A general expression for the r-th ordinary moment μ′_r = E(Y^r) of the variable Y, with density function given by (1), can be obtained analytically if we consider the binomial series expansion

$$ (1-x)^{-r} = \sum_{k=0}^{\infty} \frac{(r)_k}{k!}\, x^{k}, \tag{9} $$

which converges for |x| < 1, where (r)_k is the Pochhammer symbol, given by (r)_k = r(r + 1) · · · (r + k − 1), together with the identity

$$ (-r)_k = (-1)^{k}(r-k+1)_k. \tag{10} $$
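Both ingredients can be verified numerically. The short Python sketch below implements the rising factorial and a partial sum of (9); the function names and the number of terms are illustrative choices.

```python
def pochhammer(a, k):
    """Rising factorial (a)_k = a (a + 1) ... (a + k - 1), with (a)_0 = 1."""
    out = 1.0
    for i in range(k):
        out *= a + i
    return out

def binomial_series(x, r, terms=200):
    """Partial sum of (9): sum over k of (r)_k / k! * x**k, for |x| < 1."""
    total, coeff = 0.0, 1.0  # coeff holds (r)_k / k!
    for k in range(terms):
        total += coeff * x ** k
        coeff *= (r + k) / (k + 1)
    return total

# Compare the partial sum with the closed form, and both sides of (10).
print(binomial_series(0.3, 2.5), (1.0 - 0.3) ** (-2.5))
print(pochhammer(-5, 3), (-1) ** 3 * pochhammer(5 - 3 + 1, 3))
```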
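Proposition 2.2. For the random variable Y with E2G distribution, the rth moment is given by

$$ \mu'_r = \frac{\theta\, r!}{\lambda^{r}} \sum_{l=0}^{\infty}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{j=0}^{\infty} \frac{(-1)^{l+m}\,(j-l+2)_l\,(\alpha l + k - m + 1)_m\,(1-\theta)^{j}}{l!\, m!\, (m+1)^{r}}. $$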
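Proof 2.2. From (2) and (8), and using (9), we have that

$$
\begin{aligned}
\mu'_r &= r\int_{0}^{\infty} y^{r-1} S(y)\,dy \\
&= r\theta\int_{0}^{\infty} y^{r-1}\,\frac{1 - (1 - e^{-\lambda y})^{\alpha}}{1 - (1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)}\,dy \\
&= r\theta\int_{0}^{1}\left(-\frac{\ln(1-z)}{\lambda}\right)^{r-1}\frac{1 - z^{\alpha}}{\left[1 - (1-\theta)(1 - z^{\alpha})\right](1-z)\lambda}\,dz \\
&= \frac{r\theta(-1)^{r-1}}{\lambda^{r}}\sum_{k=0}^{\infty}\frac{(1)_k}{k!}\int_{0}^{1}\left(\ln(1-z)\right)^{r-1}\frac{z^{k}(1 - z^{\alpha})}{1 - (1-\theta)(1 - z^{\alpha})}\,dz \\
&= \frac{r\theta(-1)^{r-1}}{\lambda^{r}}\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}\frac{(1)_j}{j!}(1-\theta)^{j}\int_{0}^{1}\left(\ln(1-z)\right)^{r-1} z^{k}(1 - z^{\alpha})^{j+1}\,dz \\
&= \frac{r\theta(-1)^{r-1}}{\lambda^{r}}\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}\sum_{l=0}^{\infty}\frac{(-(j+1))_l}{l!}(1-\theta)^{j}\int_{0}^{1}\left(\ln(1-z)\right)^{r-1} z^{\alpha l + k}\,dz \\
&= \frac{r\theta}{\lambda^{r}}\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}\sum_{l=0}^{\infty}\frac{(-(j+1))_l}{l!}(1-\theta)^{j}\int_{0}^{\infty} u^{r-1}(1 - e^{-u})^{\alpha l + k} e^{-u}\,du \\
&= \frac{r\theta}{\lambda^{r}}\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}\sum_{l=0}^{\infty}\sum_{m=0}^{\infty}\frac{(-(j+1))_l}{l!}\,\frac{(-(\alpha l + k))_m}{m!}(1-\theta)^{j}\int_{0}^{\infty} u^{r-1} e^{-u(m+1)}\,du,
\end{aligned}
$$

where the third equality uses the substitution z = 1 − e^{−λy} and the seventh uses u = −ln(1 − z). The result follows by evaluating the last integral as a gamma integral, $\int_0^\infty u^{r-1}e^{-u(m+1)}\,du = \Gamma(r)/(m+1)^{r}$, and applying property (10), completing the proof.
□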
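The quadruple series is awkward to evaluate in practice, so a simple cross-check is to integrate (8) numerically with the survival function (2). The Python sketch below does this with a composite Simpson rule on a truncated range; the truncation point 60/λ, the number of subintervals and the function names are assumptions made for illustration. For α = 1 the mean has the closed form −θ ln θ / (λ(1 − θ)), obtained by integrating (2) directly, which gives a convenient test value.

```python
import math

def e2g_survival(y, lam, alpha, theta):
    """Survival function (2)."""
    s_t = 1.0 - (-math.expm1(-lam * y)) ** alpha
    return theta * s_t / (1.0 - (1.0 - theta) * s_t)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def e2g_moment(r, lam, alpha, theta):
    """r-th raw moment via (8); the tail beyond 60 / lam is neglected."""
    upper = 60.0 / lam
    return simpson(lambda y: r * y ** (r - 1) * e2g_survival(y, lam, alpha, theta), 0.0, upper)

lam, alpha, theta = 1.0, 1.0, 0.5
closed_form_mean = -theta * math.log(theta) / (lam * (1.0 - theta))  # valid for alpha = 1
print(e2g_moment(1, lam, alpha, theta), closed_form_mean)

mean = e2g_moment(1, 1.0, 3.0, 0.5)
variance = e2g_moment(2, 1.0, 3.0, 0.5) - mean ** 2
print(mean, variance)
```

For α < 1 the integrand has an unbounded derivative at the origin, so a finer grid or an adaptive rule may be preferable there.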
Order statistics are among the most fundamental tools in non-parametric statistics and inference. Let Y_1, . . . , Y_n be a random sample taken from the E2G distribution and let Y_{1:n}, . . . , Y_{n:n} denote the corresponding order statistics. Then, the probability density function f_{i:n}(y) of the ith order statistic Y_{i:n} is given by

$$ f_{i:n}(y) = \frac{n!}{(i-1)!\,(n-i)!}\, F(y)^{i-1}\left(1 - F(y)\right)^{n-i} f(y). $$

The rth moment of the ith order statistic Y_{i:n}, according to [12], can be represented as

$$ E\left[Y^{r}_{i:n}\right] = r\sum_{p=n-i+1}^{n}(-1)^{p-n+i-1}\binom{p-1}{n-i}\binom{n}{p}\int_{0}^{\infty} y^{r-1}\left[S(y)\right]^{p}\,dy. \tag{11} $$
Proposition 2.3. For the random variable Y with E2G distribution, the rth moment of the ith order statistic is given by

$$ E\left[Y^{r}_{i:n}\right] = \frac{r!}{\lambda^{r}}\sum_{p=n-i+1}^{n}\sum_{l=0}^{\infty}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}(-1)^{p-n+i-1+l+m}\binom{p-1}{n-i}\binom{n}{p}\,\theta^{p}\,\frac{(p)_j\,(p+j-l+1)_l\,(1-\theta)^{j}\,(\alpha l + k - m + 1)_m}{j!\, l!\, m!\, (m+1)^{r}}. $$

Proof 2.3. From (11), using (2) and (9), and proceeding in a similar way as in Proof 2.2, the result follows.
□
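Representation (11) can also be evaluated directly by numerical integration, which avoids the nested series. The following Python sketch assumes a composite Simpson rule on a truncated range (the cutoff 60/λ and the function names are illustrative). A useful sanity check is that the expected order statistics of a sample of size n add up to n times the population mean.

```python
import math

def e2g_survival(y, lam, alpha, theta):
    """Survival function (2)."""
    s_t = 1.0 - (-math.expm1(-lam * y)) ** alpha
    return theta * s_t / (1.0 - (1.0 - theta) * s_t)

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def order_stat_moment(r, i, n, lam, alpha, theta):
    """E[Y_{i:n}^r] from (11), integrating y**(r-1) * S(y)**p numerically."""
    total = 0.0
    for p in range(n - i + 1, n + 1):
        sign = (-1) ** (p - n + i - 1)
        weight = math.comb(p - 1, n - i) * math.comb(n, p)
        integral = simpson(
            lambda y: y ** (r - 1) * e2g_survival(y, lam, alpha, theta) ** p,
            0.0, 60.0 / lam)
        total += sign * weight * integral
    return r * total

lam, alpha, theta = 1.0, 2.0, 0.5
means = [order_stat_moment(1, i, 3, lam, alpha, theta) for i in (1, 2, 3)]
print(means, sum(means), 3 * order_stat_moment(1, 1, 1, lam, alpha, theta))
```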
Given that there was no failure prior to time t, the residual lifetime distribution of a random variable X with E2G distribution has survival function given by

$$ S_t(x) = \Pr[X > x + t \mid X > t] = \frac{S(x+t)}{S(t)} = \left(\frac{1 - (1 - e^{-\lambda(x+t)})^{\alpha}}{1 - (1 - e^{-\lambda t})^{\alpha}}\right)\left(\frac{1 - (1-\theta)\left(1 - (1 - e^{-\lambda t})^{\alpha}\right)}{1 - (1-\theta)\left(1 - (1 - e^{-\lambda(x+t)})^{\alpha}\right)}\right). $$

The mean residual lifetime of a continuous distribution with survival function S(x) is given by

$$ \mu(t) = E(X - t \mid X > t) = \frac{1}{S(t)}\int_{t}^{\infty} S(u)\,du. \tag{12} $$
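Proposition 2.4. For the random variable Y with E2G distribution, the mean residual lifetime is given by

$$ \mu(t) = \frac{1}{\lambda}\left(\frac{1 - (1-\theta)\left(1 - (1 - e^{-\lambda t})^{\alpha}\right)}{1 - (1 - e^{-\lambda t})^{\alpha}}\right)\sum_{k=0}^{\infty}\sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\frac{(1-\theta)^{i}(-1)^{j}(i-j+2)_j}{j!}\left(\frac{1 - (1 - e^{-\lambda t})^{k+\alpha j+1}}{k + \alpha j + 1}\right). $$

Proof 2.4. From (12) and using S(y) given by (2), the factor θ cancels and we have that

$$
\begin{aligned}
\frac{1}{S(t)}\int_{t}^{\infty} S(u)\,du &= \frac{1 - (1-\theta)\left(1 - (1 - e^{-\lambda t})^{\alpha}\right)}{1 - (1 - e^{-\lambda t})^{\alpha}}\int_{t}^{\infty}\frac{1 - (1 - e^{-\lambda u})^{\alpha}}{1 - (1-\theta)\left(1 - (1 - e^{-\lambda u})^{\alpha}\right)}\,du \\
&= \frac{1}{\lambda}\,\frac{1 - (1-\theta)\left(1 - (1 - e^{-\lambda t})^{\alpha}\right)}{1 - (1 - e^{-\lambda t})^{\alpha}}\int_{1 - e^{-\lambda t}}^{1}\frac{1 - x^{\alpha}}{\left(1 - (1-\theta)(1 - x^{\alpha})\right)(1 - x)}\,dx,
\end{aligned}
$$

with the substitution x = 1 − e^{−λu}. Now using (9) and proceeding in a similar way as in Proof 2.2, the result follows.
□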
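As with the moments, definition (12) can be evaluated numerically without the triple series. The Python sketch below integrates the survival function (2) from t onwards with a composite Simpson rule; the truncation at t + 60/λ, the numerically stable form of the survival function and the function names are illustrative assumptions. Two checks follow from the text: μ(0) equals the mean, and μ(t) approaches 1/λ for large t because the hazard tends to λ.

```python
import math

def e2g_survival(y, lam, alpha, theta):
    """Survival function (2), written to stay accurate for large y."""
    x = math.exp(-lam * y)
    if x >= 1.0:
        s_t = 1.0
    else:
        s_t = -math.expm1(alpha * math.log1p(-x))  # 1 - (1 - x)**alpha
    return theta * s_t / (1.0 - (1.0 - theta) * s_t)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def mean_residual_life(t, lam, alpha, theta):
    """mu(t) from (12); the tail beyond t + 60 / lam is neglected."""
    area = simpson(lambda u: e2g_survival(u, lam, alpha, theta), t, t + 60.0 / lam)
    return area / e2g_survival(t, lam, alpha, theta)

for t in (0.0, 1.0, 5.0, 10.0):
    print(t, mean_residual_life(t, 1.0, 3.0, 0.5))
```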
Proposition 2.5. For α > 1, the modal value of the variable Y with density given by (1) is

$$ y_{\text{mode}} = -\frac{1}{\lambda}\ln\left(\mathrm{Root}\left[\theta(\alpha x - 1) + (1-x)^{\alpha}\left(-1 + \theta + \alpha x(\theta - 1)\right)\right]\right), \tag{13} $$

where Root[g(x)] is the root in x of g(x) = 0, which corresponds to x = e^{−λy} and is evaluated by numerical methods.
Proof 2.5. The result is obtained directly by solving, from (1), the equation df(y)/dy = 0.
□
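One way to evaluate (13) numerically is bisection on x ∈ (0, 1): the bracketed expression equals −1 at x = 0 and θ(α − 1) > 0 at x = 1, so a sign change is guaranteed when α > 1. The Python sketch below takes this route; it assumes the root in (0, 1) is unique, and the function names and tolerance are illustrative.

```python
import math

def mode_equation(x, alpha, theta):
    """Expression inside Root[.] in (13), with x standing for exp(-lam * y)."""
    return theta * (alpha * x - 1.0) + (1.0 - x) ** alpha * (-1.0 + theta + alpha * x * (theta - 1.0))

def e2g_mode(lam, alpha, theta, tol=1e-12):
    """Modal value for alpha > 1, found by bisection on x in (0, 1)."""
    if alpha <= 1.0:
        raise ValueError("formula (13) applies only for alpha > 1")
    lo, hi = 0.0, 1.0  # the expression is negative at 0 and positive at 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mode_equation(mid, alpha, theta) < 0.0:
            lo = mid
        else:
            hi = mid
    return -math.log(0.5 * (lo + hi)) / lam

def e2g_pdf(y, lam, alpha, theta):
    """Density (1), used here only to inspect the neighbourhood of the mode."""
    w = -math.expm1(-lam * y)
    denom = 1.0 - (1.0 - theta) * (1.0 - w ** alpha)
    return alpha * lam * theta * math.exp(-lam * y) * w ** (alpha - 1.0) / denom ** 2

mode = e2g_mode(1.0, 3.0, 0.5)
print(mode, e2g_pdf(mode, 1.0, 3.0, 0.5))
```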
2.3 Inference
Assuming the lifetimes are independently distributed and are independent of the censoring mechanism, the maximum likelihood estimates (MLEs) of the parameters are obtained by direct maximization of the log-likelihood function given by

$$
\begin{aligned}
\ell(\theta, \lambda, \alpha) ={}& n\ln(\theta) + \ln(\alpha\lambda)\sum_{i=1}^{n} c_i - \lambda\sum_{i=1}^{n} c_i y_i + (\alpha - 1)\sum_{i=1}^{n} c_i\ln\left(1 - e^{-\lambda y_i}\right) \\
&+ \sum_{i=1}^{n}(1 - c_i)\ln\left(1 - (1 - e^{-\lambda y_i})^{\alpha}\right) - \sum_{i=1}^{n}(1 + c_i)\ln\left(1 - (1-\theta)\left(1 - (1 - e^{-\lambda y_i})^{\alpha}\right)\right),
\end{aligned}
\tag{14}
$$

where c_i is a censoring indicator, equal to 0 if the observation is censored and 1 if it is observed. The advantage of this procedure is that it runs immediately using existing statistical packages. We have used the optim routine of R [13]. Large-sample inference for the parameters is based on the MLEs and their estimated standard errors.
In order to compare distributions we consider the −max ℓ(.) values and the Akaike information criterion (AIC), defined by AIC = −2ℓ(.) + 2q, where ℓ(.) is the log-likelihood evaluated at the MLE vector of parameters of the respective distribution and q is the number of estimated parameters. The best distribution corresponds to the lowest −max ℓ(.) and AIC values.
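For readers who want to reproduce the fitting step outside R, the log-likelihood (14) and the AIC are easy to code. The sketch below is a plain Python version; the function names, the parameter ordering (θ, λ, α) and the toy data are assumptions for illustration, and the maximization itself is left to whichever general-purpose optimizer is available.

```python
import math

def e2g_loglik(params, times, observed):
    """Log-likelihood (14); observed[i] is 1 for a failure time and 0 for a censored one."""
    theta, lam, alpha = params
    total = 0.0
    for y, c in zip(times, observed):
        w = -math.expm1(-lam * y)                 # 1 - exp(-lam * y)
        s_t = 1.0 - w ** alpha                    # survival of one latent lifetime
        denom = 1.0 - (1.0 - theta) * s_t
        total += math.log(theta)
        total += c * (math.log(alpha * lam) - lam * y + (alpha - 1.0) * math.log(w))
        total += (1 - c) * math.log(s_t)
        total -= (1 + c) * math.log(denom)
    return total

def aic(max_loglik, q):
    """Akaike information criterion, -2 * loglik + 2 * q."""
    return -2.0 * max_loglik + 2.0 * q

# Hypothetical toy data: three lifetimes, the second one right-censored.
times = [0.5, 1.2, 2.0]
observed = [1, 0, 1]
print(e2g_loglik((0.4, 0.8, 2.0), times, observed))
```

Each observed lifetime contributes the log of the density (1) and each censored lifetime contributes the log of the survival function (2), which is what the grouping of terms in (14) expresses.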
6
3 On the relationship between the EG and E2G distribu-
tions
In this section, we discuss some relationships between the EG distribution [1], which motivated this article, and the E2G distribution proposed here.
The hazard function of the EG distribution is given by

$$ h(y) = \frac{\lambda}{1 - (1-\theta)e^{-\lambda y}}. \tag{15} $$

Proposition 3.1. The hazard functions (15) and (3) differ as y → 0 whenever α ≠ 1, and converge to the same value λ when y → ∞.
Proof 3.1. (a) For y → 0, for the EG model, lim_{y→0} h(y) = lim_{y→0} λ/(1 − (1 − θ)e^{−λy}) = λ/θ, while for the E2G distribution lim_{y→0} h(y) = λ/θ if α = 1 (the EG case), lim_{y→0} h(y) = ∞ if α < 1 and lim_{y→0} h(y) = 0 if α > 1, which proves the initial divergence between the EG and E2G hazard functions for α ≠ 1.
(b) For y → ∞, for the EG model, lim_{y→∞} h(y) = lim_{y→∞} λ/(1 − (1 − θ)e^{−λy}) = λ, and for the E2G distribution lim_{y→∞} h(y) = λ, which proves the convergence.
□
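The limits in Proposition 3.1 can be illustrated numerically by evaluating both hazard functions very close to zero and far in the tail. The Python sketch below uses (15) for the EG model and the ratio of the latent density to the product in the denominator of (3) for the E2G model; the evaluation points 1e-8 and 20 and the parameter values are arbitrary illustrative choices.

```python
import math

def eg_hazard(y, lam, theta):
    """EG hazard function (15)."""
    return lam / (1.0 - (1.0 - theta) * math.exp(-lam * y))

def e2g_hazard(y, lam, alpha, theta):
    """E2G hazard function, h(y) = f(y) / S(y), for y > 0."""
    w = -math.expm1(-lam * y)                     # 1 - exp(-lam * y)
    s_t = 1.0 - w ** alpha                        # survival of one latent lifetime
    f_t = alpha * lam * math.exp(-lam * y) * w ** (alpha - 1.0)
    return f_t / (s_t * (1.0 - (1.0 - theta) * s_t))

lam, theta = 1.0, 0.1
print("EG ", eg_hazard(1e-8, lam, theta), eg_hazard(20.0, lam, theta))
for alpha in (0.25, 1.0, 4.0):
    print(alpha, e2g_hazard(1e-8, lam, alpha, theta), e2g_hazard(20.0, lam, alpha, theta))
```

Near zero the E2G hazard is very large for α = 0.25, close to λ/θ for α = 1 and close to zero for α = 4, while all curves are close to λ at y = 20.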
Figure 2 shows the behavior of the hazard function of the EG model for θ = 0.1, 0.9 and of the hazard function of the E2G model for θ = 0.1, 0.9 and α = 0.25, 4. The EG hazard function is decreasing, while the E2G hazard function can be decreasing, increasing or unimodal, but both hazards converge to λ as y → ∞, in agreement with Proposition 3.1.
[Figure 2 panels, λ = 1.0. Left: EG hazard function for θ = 0.10 and 0.90. Right: E2G hazard function for (θ, α) = (0.10, 0.25), (0.90, 0.25), (0.10, 4.00) and (0.90, 4.00). Horizontal axis: Times (0 to 6). Vertical axis: Hazard Function (0 to 4).]
Figure 2: Comparing the hazard function of the EG and E2G distributions for same fixed θ values.
A misspecification study was performed in order to verify whether we can distinguish between the EG and E2G distributions, in the light of a dataset, based on a usual comparison criterion. We consider here the −max ℓ(.) values. The preferred distribution is the one with the smaller −max ℓ(.) value.
We generated 1,000 samples from each of the EG and E2G distributions. We considered different sample sizes, n, equal to 10, 20, 30, 50 and 100, and different censoring percentages, p, equal to 0.1, 0.2 and 0.3. We fixed λ = 0.5, and θ = 0.1 and θ = 0.9 for both distributions, and α = 3 for the E2G distribution. The two distributions were fitted to each sample and their −max ℓ(.) values were calculated. For θ = 0.1 and α = 3 the data have a unimodal hazard function, and for θ = 0.9 and α = 3 they have an increasing hazard function. In principle, both cases, θ = 0.1 and θ = 0.9 combined with α = 3, are better fitted by an E2G distribution. Table 1 shows the proportion of times that the distribution which originated the sample was the best fitted distribution according to the −max ℓ(.) values. We observe that it is usually possible to discriminate between the distributions even for moderate samples in the presence of censoring. We also performed the study considering the AIC instead of the −max ℓ(.) values. However, the results lead to similar conclusions and are omitted here.
Table 1: Proportion of times that the distribution which originated the sample (EG / E2G) was the best fitted distribution. Rows give the censoring proportion p and columns give the sample size n.
θ = 0.9
p \ n   10            20            30            50            100
0.10    0.524/0.639   0.616/0.884   0.737/0.940   0.760/0.975   0.848/0.999
0.20    0.432/0.482   0.516/0.707   0.637/0.778   0.671/0.879   0.784/0.961
0.30    0.428/0.402   0.453/0.545   0.574/0.628   0.599/0.696   0.712/0.826
θ = 0.1
p \ n   10            20            30            50            100
0.10    0.491/0.870   0.634/0.992   0.697/0.999   0.777/0.999   0.827/0.999
0.20    0.469/0.826   0.617/0.977   0.670/0.998   0.742/0.999   0.819/0.999
0.30    0.480/0.766   0.587/0.952   0.673/0.992   0.725/0.999   0.772/0.999
4 Applications
In this section, we compare the E2G distribution fit with those of several usual lifetime distributions on three datasets extracted from the literature: one with increasing hazard function, one with unimodal hazard function and one with decreasing hazard function.
The following lifetime distributions were considered: the exponential distribution, with probability density function $f(x) = \lambda e^{-\lambda x}$; the Weibull distribution, with probability density function $f(x) = \frac{\theta}{\lambda}\left(\frac{x}{\lambda}\right)^{\theta-1} e^{-(x/\lambda)^{\theta}}$; the gamma distribution, with probability density function $f(x) = \frac{1}{\lambda^{\theta}\Gamma(\theta)}\, x^{\theta-1} e^{-x/\lambda}$; the EG distribution [1], with probability density function $f(x) = \lambda\theta e^{-\lambda x}\left(1 - (1-\theta)e^{-\lambda x}\right)^{-2}$, which is (1) with α = 1; the modified Weibull (MW) distribution [14], with probability density function $f(x) = \alpha x^{\theta-1}(\theta + \lambda x)\, e^{\lambda x}\exp\{-\alpha x^{\theta} e^{\lambda x}\}$; and the generalized exponential-Poisson (GEP) distribution [4], with probability density function $f(x) = \frac{\alpha\beta\lambda}{(1 - e^{-\lambda})^{\alpha}}\left(1 - e^{-\lambda + \lambda\exp(-\beta x)}\right)^{\alpha-1} e^{-\lambda - \beta x + \lambda\exp(-\beta x)}$.
The first set, hereafter T1, consists of observed survival times for 65 breast cancer patients treated over the period 1929-1938, quoted by [15]. The second set, hereafter T2, consists of survival times for patients with bile duct cancer who took part in a study to determine whether a combination of a radiation treatment (R0Rx) and the drug 5-fluorouracil (5-FU) prolonged survival, extracted from [16]. Survival times, in days, are given for a control group. The third set, hereafter T3, consists of the number of successive failures of the air conditioning system of each member in a fleet of 13 Boeing 720 jet airplanes. The pooled data, with 214 observations, were considered by [1]. They were first analyzed by [17] and discussed further by [18], [19], [3] and [20].
Firstly, in order to identify the shape of the hazard function we consider a graphical method based on the TTT plot [21]. In its empirical version the TTT plot is given by

$$ G(r/n) = \frac{\sum_{i=1}^{r} Y_{i:n} + (n-r)\,Y_{r:n}}{\sum_{i=1}^{n} Y_{i:n}}, $$

where r = 1, . . . , n and Y_{i:n} represent the order statistics of the sample. It has been shown that the hazard function is increasing (decreasing) if the TTT plot is concave (convex). The left panels of Figure 3 show a concave TTT plot for T1, a concave then convex plot for T2 and a convex plot for T3, indicating increasing, unimodal and decreasing hazard functions, respectively.
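The empirical TTT transform is simple to compute from a sample. The Python sketch below returns the points (r/n, G(r/n)) that would be plotted; the function name and the three-value toy sample are illustrative, and the sketch treats every observation as an uncensored lifetime.

```python
def scaled_ttt(sample):
    """Return the points (r/n, G(r/n)), r = 1, ..., n, of the empirical TTT plot."""
    ys = sorted(sample)
    n = len(ys)
    total = sum(ys)
    points = []
    running = 0.0  # sum of the r smallest observations
    for r, y in enumerate(ys, start=1):
        running += y
        points.append((r / n, (running + (n - r) * y) / total))
    return points

for u, g in scaled_ttt([3.0, 1.0, 2.0]):
    print(u, g)
```

Plotting these points together with the diagonal from (0, 0) to (1, 1) makes the concave or convex shape easy to read off: a curve above the diagonal suggests an increasing hazard and a curve below it suggests a decreasing one.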
Table 2 provides the −max ℓ(.) and the AIC values for all fitted distributions. Both criteria provide evidence in favor of our E2G distribution for T2 and T3, and give similar values for the E2G and gamma distributions for T1, corroborating the fact that the E2G distribution can be seen as a competitive distribution of practical interest for the analysis of survival data. Figure 3 (right panels) shows the fitted survival functions superimposed on the empirical Kaplan-Meier survival function.
The MLEs (with their corresponding standard errors in parentheses) of the parameters α, θ and λ (×1000) of the E2G distribution are, respectively, 2.012 (0.719), 0.582 (0.551) and 41.11 (9.98) for T1; 1.978 (2.377), 0.021 (0.0489) and 0.85 (0.188) for T2; and 1.238 (0.273), 0.314 (0.141) and 6.90 (0.72) for T3.
Table 2: Values of −max ℓ(.) and AIC for all fitted distributions.
Dataset  Criterion   E         EG        Weibull   Gamma     E2G       MW        GEP
T1       −max ℓ(.)   284.6012  284.6012  280.6973  280.0386  279.7394  280.6974  279.9055
T1       AIC         571.2024  573.2024  565.3946  564.0772  565.4788  567.3947  565.8110
T2       −max ℓ(.)   172.6231  171.2032  172.2865  172.6161  167.3942  172.2865  168.7415
T2       AIC         347.2463  346.4064  348.5730  349.2322  340.7883  350.5731  343.4831
T3       −max ℓ(.)   1178.766  1175.925  1177.585  1178.291  1174.260  1177.585  1174.785
T3       AIC         2359.532  2355.849  2359.170  2360.582  2354.520  2361.170  2355.570
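The AIC rows of Table 2 can be recomputed from the −max ℓ(.) rows using AIC = 2(−max ℓ) + 2q. The Python sketch below does so, taking the number of parameters q of each model from the density functions listed in Section 4 (1 for E; 2 for EG, Weibull and gamma; 3 for E2G, MW and GEP). That parameter count and the variable names are assumptions of this sketch; the numbers are copied from the table.

```python
models = ["E", "EG", "Weibull", "Gamma", "E2G", "MW", "GEP"]
n_params = {"E": 1, "EG": 2, "Weibull": 2, "Gamma": 2, "E2G": 3, "MW": 3, "GEP": 3}

neg_max_loglik = {
    "T1": [284.6012, 284.6012, 280.6973, 280.0386, 279.7394, 280.6974, 279.9055],
    "T2": [172.6231, 171.2032, 172.2865, 172.6161, 167.3942, 172.2865, 168.7415],
    "T3": [1178.766, 1175.925, 1177.585, 1178.291, 1174.260, 1177.585, 1174.785],
}
reported_aic = {
    "T1": [571.2024, 573.2024, 565.3946, 564.0772, 565.4788, 567.3947, 565.8110],
    "T2": [347.2463, 346.4064, 348.5730, 349.2322, 340.7883, 350.5731, 343.4831],
    "T3": [2359.532, 2355.849, 2359.170, 2360.582, 2354.520, 2361.170, 2355.570],
}

summary = {}
for dataset, values in neg_max_loglik.items():
    recomputed = [2.0 * v + 2.0 * n_params[m] for m, v in zip(models, values)]
    # Largest gap between the recomputed and the reported AIC (rounding only).
    worst_gap = max(abs(a - b) for a, b in zip(recomputed, reported_aic[dataset]))
    best_model = models[reported_aic[dataset].index(min(reported_aic[dataset]))]
    summary[dataset] = (best_model, worst_gap)
    print(dataset, best_model, worst_gap)
```

The recomputed values match the reported ones up to rounding in the last printed digit, and the smallest AIC belongs to the gamma distribution for T1 and to the E2G distribution for T2 and T3, as discussed in the text.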
5 Concluding remarks
In this paper we propose a new lifetime distribution. The E2G distribution is a straightforward generalization of the EG distribution proposed by [1], and it accommodates increasing, decreasing and unimodal hazard functions. It arises in a latent competing risks scenario, where the lifetime associated with a particular risk is not observable and only the minimum lifetime value among all risks is observed. The properties of the proposed distribution are discussed, including a formal proof of its probability density function and explicit algebraic formulas for its survival and hazard functions, moments, rth moment of the ith order statistic, mean residual lifetime and modal value. Maximum likelihood inference is implemented straightforwardly. From a misspecification simulation study, performed in order to assess the extent of the misspecification errors when testing the EG distribution against the E2G one, we observed that it is usually possible to discriminate between both distributions even for moderate samples in the presence of censoring. The practical importance of the new distribution was demonstrated in three applications, where the E2G distribution provided a better or comparable fit in comparison with the EG distribution and several other existing lifetime distributions.
Acknowledgments: The research of Francisco Louzada is supported by the Brazilian organi-
zation CNPq.
[Figure 3 panels, one row per dataset. Left: empirical TTT plot, with both axes running from 0 to 1 (labeled n/r and G(n/r) in the original figure). Right: estimated S(t) against Time, with fitted curves for the E, EG, Weibull, Gamma, E2G, MW and GEP distributions; the time axis runs from 0 to 120 for T1, from 0 to 2500 for T2 and from 0 to 600 for T3.]
Figure 3: Left Panels: Empirical TTT-Plot. Right Panels: Kaplan-Meier and survival fitted curves.
Upper panels are for dataset T1, Middle panels are for dataset T2 and lower panels are for dataset
T3.
References
[1] K. Adamidis and S. Loukas. A lifetime distribution with decreasing failure rate. Statistics and Probability Letters, 39:35–42, 1998.
[2] R. Gupta and D. Kundu. Generalized exponential distributions. Australian and New Zealand Journal of Statistics, 41:173–188, 1999.
[3] C. Kus. A new lifetime distribution. Computational Statistics and Data Analysis, 51:4497–4509, 2007.
[4] W. Barreto-Souza and F. Cribari-Neto. A generalization of the exponential-Poisson distribution. Statistics and Probability Letters, 79:2493–2500, 2009.
[5] G. D. C. Barriga, F. Louzada-Neto, and V. G. Cancho. The complementary exponential power lifetime model. Computational Statistics and Data Analysis, 55:1250–1259, 2011.
[6] R. M. Smith and L. J. Bain. An exponential power life-testing distribution. Communications in Statistics, 4:469–481, 1975.
[7] R. D. Gupta and D. Kundu. Exponentiated exponential family: An alternative to gamma and Weibull distributions. Biometrical Journal, 43:117–130, 2001.
[8] F. Louzada-Neto. Poly-hazard regression models for lifetime data. Biometrics, 55:1121–1125, 1999.
[9] J. F. Lawless. Statistical Models and Methods for Lifetime Data, second edition. Wiley, New York, NY, 2003.
[10] M. J. Crowder, A. C. Kimber, R. L. Smith, and T. J. Sweeting. Statistical Analysis of Reliability Data. Chapman and Hall, London, 1991.
[11] D. R. Cox and D. Oakes. Analysis of Survival Data. Chapman and Hall, London, 1984.
[12] H. M. Barakat and Y. H. Abdelkader. Computing the moments of order statistics from nonidentical random variables. Statistical Methods and Applications, 12(1):15–26, 2004.
[13] R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2008.
[14] C. D. Lai, M. Xie, and N. P. Murthy. A modified Weibull distribution. IEEE Transactions on Reliability, 52:33–37, 2003.
[15] J. Boag. Maximum likelihood estimates of the proportion of patients cured by cancer therapy. Journal of the Royal Statistical Society, Series B, 11:15–44, 1949.
[16] T. R. Fleming, J. R. O'Fallon, P. C. O'Brien, and D. P. Harrington. Modified Kolmogorov-Smirnov test procedures with application to arbitrarily right-censored data. Biometrics, 36(4):607–625, 1980.
[17] F. Proschan. Theoretical explanation of observed decreasing failure rate. Technometrics, 5:375–383, 1963.
[18] A. Dahiya and J. Gurland. Goodness of fit tests for the gamma and exponential distributions. Technometrics, 14:791–801, 1972.
[19] L. J. Gleser. The gamma distribution as a mixture of exponential distributions. The American Statistician, 43:115–117, 1989.
[20] W. Barreto-Souza, A. L. de Morais, and G. M. Cordeiro. The Weibull-geometric distribution. Journal of Statistical Computation and Simulation, pages 1–14, 2008.
[21] M. V. Aarset. How to identify a bathtub hazard rate. IEEE Transactions on Reliability, 2:106–108, 1987.