
ISSN 0104-0499

UNIVERSIDADE FEDERAL DE SÃO CARLOS

CENTRO DE CIÊNCIAS EXATAS E TECNOLÓGICAS

TECHNICAL REPORT, DEPARTMENT OF STATISTICS

THEORY AND METHOD, SERIES A

January 2011, No. 235

THE EXPONENTIATED EXPONENTIAL-GEOMETRIC DISTRIBUTION: A DISTRIBUTION WITH DECREASING, INCREASING AND UNIMODAL HAZARD FUNCTION

Francisco Louzada-Neto, Vitor Marchi, Mari Roman

The Exponentiated Exponential-Geometric

Distribution: A distribution with decreasing,

increasing and unimodal hazard function

Francisco Louzada, Vitor Marchi and Mari Roman

Department of Statistics, Universidade Federal de Sao Carlos, Brazil

Abstract

In this paper we propose a new family of distributions, namely the Exponentiated Exponential-Geometric (E2G) distribution. The E2G distribution is a straightforward generalization of the EG distribution proposed by [1], and it accommodates increasing, decreasing and unimodal hazard functions. It arises in a latent competing risks scenario, where the lifetime associated with a particular risk is not observable and only the minimum lifetime among all risks is observed. The properties of the proposed distribution are discussed, including a formal proof of its probability density function and explicit algebraic formulas for its survival and hazard functions, moments, rth moment of the ith order statistic, mean residual lifetime and modal value. Maximum likelihood inference is implemented straightforwardly. From a misspecification simulation study, performed in order to assess the extent of the misspecification errors when testing the EG distribution against the E2G one, we observed that it is usually possible to discriminate between both distributions even for moderate samples in the presence of censoring. The practical importance of the new distribution was demonstrated in three applications where we compare it with several existing lifetime distributions.

Key-words: Exponentiated Exponential Distribution, Geometric Distribution, Latent Competing Risks, Survival Analysis, Censored data, Unimodal failure rate function

1 Introduction

In recent years, several new classes of models grounded in the simple exponential distribution, a widely used lifetime distribution in survival modeling, have been introduced. The main idea is to propose lifetime distributions that can accommodate practical applications in which the underlying hazard function is non-constant, for example monotone; the exponential distribution, with its constant hazard, does not provide a reasonable parametric fit in such cases. For instance, [1] proposed a variation of the exponential distribution, the exponential geometric (EG) distribution, with decreasing hazard function; [2] proposed a generalized exponential distribution, which can accommodate data with increasing and decreasing hazard functions; [3] proposed another modification of the exponential distribution with decreasing hazard function; and [4] generalized the distribution proposed by [3] by including a power parameter, which allows increasing, decreasing and unimodal hazard functions. [5] proposed the complementary exponential power distribution by exponentiating the exponential power distribution proposed by [6].

In this paper, we propose a new lifetime distribution family, a direct extension of the EG distribution [1], obtained by compounding the exponentiated exponential distribution [7] with the usual geometric distribution. Hereafter we refer to the new distribution as the exponentiated exponential geometric (E2G) distribution. Besides accommodating increasing, decreasing and unimodal hazard functions, the E2G distribution has a mechanistic interpretation that is easy to use in practice. Its genesis is based on a latent competing risks problem [8], in the sense that there is no information about which factor was responsible for the component failure and only the minimum lifetime among all risks is observed. On many occasions this information is not available, or it is impossible for an expert to specify the true cause of failure. Further, the true cause of failure can be masked from our view.

The paper is organized as follows. In Section 2, we introduce the new E2G distribution and present some of its properties: we derive expressions for the probability density, survival and hazard functions, the rth raw moments, the moments of order statistics, the mean residual lifetime and the modal value, and we present the inferential procedure. In Section 3 we discuss the relationship between the Exponential, EG and E2G distributions based on the hazard function, and report the results of a misspecification simulation study performed in order to verify whether we can distinguish between the EG and E2G distributions in the light of the data based on a usual distribution comparison criterion. In Section 4 we fit the E2G distribution to three real datasets and compare it with the fits of several usual lifetime distributions, pointing out its relative superiority. Some final comments in Section 5 conclude the paper.

2 The E2G model

Let Y be a nonnegative random variable denoting the lifetime of a component in some population. The random variable Y is said to have an E2G distribution with parameters λ > 0, α > 0 and 0 < θ < 1 if its probability density function is given by

\[
f(y) = \frac{\alpha\lambda\theta\, e^{-\lambda y}\,(1 - e^{-\lambda y})^{\alpha-1}}{\left[1 - (1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)\right]^{2}}, \tag{1}
\]

where λ is a scale parameter of the distribution, and α and θ are shape parameters. For α = 1 the E2G distribution reduces to the EG distribution [1]. Figure 1 (left panel) shows the E2G probability density function for θ = 0.01, 0.5, 0.99 and α = 0.3, 1, 10.

The survival function of an E2G distributed random variable is given by

\[
S(y) = \frac{\theta\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)}{1 - (1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)}, \tag{2}
\]

where α > 0, θ ∈ (0, 1) and λ > 0.

From (1) and (2), the hazard function, according to the relationship h(y) = f(y)/S(y), is given by

\[
h(y) = \frac{\alpha\lambda\, e^{-\lambda y}\,(1 - e^{-\lambda y})^{\alpha-1}}{\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)\left[1 - (1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)\right]}. \tag{3}
\]

Its initial value is not finite if α < 1. Otherwise it is given by h(0) = λ/θ if α = 1 and h(0) = 0 if α > 1. The long-term hazard function value is h(∞) = λ. The hazard function (3) can be increasing, decreasing or unimodal, as shown in Figure 1 (right panel), which displays some hazard function shapes for θ = 0.01, 0.5, 0.99 and α = 0.3, 1, 10.
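As an illustration only, the functions (1), (2) and (3) can be sketched in Python as below. The function names are hypothetical (they do not come from the report), and the hazard is coded with the denominator factor 1 − (1 − θ)(1 − G), which is what f(y)/S(y) gives and what reproduces h(0) = λ/θ for α = 1.

```python
import math

def e2g_parts(y, lam, alpha, theta):
    """Return (base, G, D) with base = 1 - exp(-lam*y), G = base**alpha and
    D = 1 - (1 - theta) * (1 - G), the bracket shared by (1), (2) and (3)."""
    base = -math.expm1(-lam * y)  # 1 - exp(-lam * y), accurate near y = 0
    g = base ** alpha
    d = 1.0 - (1.0 - theta) * (1.0 - g)
    return base, g, d

def e2g_pdf(y, lam, alpha, theta):
    """Density (1). For alpha < 1 use y > 0, since the density is unbounded at 0."""
    base, g, d = e2g_parts(y, lam, alpha, theta)
    return alpha * lam * theta * math.exp(-lam * y) * base ** (alpha - 1) / d ** 2

def e2g_sf(y, lam, alpha, theta):
    """Survival function (2)."""
    base, g, d = e2g_parts(y, lam, alpha, theta)
    return theta * (1.0 - g) / d

def e2g_hazard(y, lam, alpha, theta):
    """Hazard function (3), written out explicitly instead of as pdf / sf."""
    base, g, d = e2g_parts(y, lam, alpha, theta)
    return alpha * lam * math.exp(-lam * y) * base ** (alpha - 1) / ((1.0 - g) * d)
```

For very large y the quantity 1 − G suffers from floating-point cancellation, so this sketch is meant for moderate arguments rather than for the extreme tail.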

The quantile function of the E2G distribution, that is, the solution y = Q(u) of F(y) = u, is given by

\[
Q(u) = F^{-1}(u) = -\frac{1}{\lambda}\ln\left(1 - \left(\frac{u\theta}{1 - u + u\theta}\right)^{1/\alpha}\right), \tag{4}
\]

where u has the uniform U(0, 1) distribution and F(y) = 1 − S(y) is the cumulative distribution function of Y.
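The quantile function (4) gives a direct way to draw E2G samples by inversion. The sketch below is a minimal illustration under that reading; the function names are hypothetical, and the distribution function is coded in the algebraically equivalent form F(y) = G/(θ + (1 − θ)G), with G = (1 − e^{−λy})^α, obtained from 1 − S(y).

```python
import math
import random

def e2g_cdf(y, lam, alpha, theta):
    """F(y) = 1 - S(y), rewritten as G / (theta + (1 - theta) * G)."""
    g = (-math.expm1(-lam * y)) ** alpha
    return g / (theta + (1.0 - theta) * g)

def e2g_quantile(u, lam, alpha, theta):
    """Quantile function (4) for 0 <= u < 1."""
    g = u * theta / (1.0 - u + u * theta)
    return -math.log1p(-g ** (1.0 / alpha)) / lam

def e2g_sample(n, lam, alpha, theta, rng):
    """Draw n values by applying (4) to uniform variates from rng."""
    return [e2g_quantile(rng.random(), lam, alpha, theta) for _ in range(n)]
```

A call such as `e2g_sample(100, 0.5, 3.0, 0.1, random.Random(1))` would produce a sample with the parameter values used later in the simulation study.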

[Figure 1 plots omitted: six panels with λ = 1.0 and θ = 0.01, 0.5, 0.99; x-axis "Times" (0 to 6); y-axis "Density" (0 to 2) in the left panels and "Hazard Function" (0 to 4) in the right panels; curves for α = 0.30, 1.00 and 10.0.]

Figure 1: Left panel: Probability density function of the E2G distribution. Right panel: hazard function of the E2G distribution. We fixed λ = 1.

2.1 Genesis

In classical competing risks scenarios the lifetime associated with a particular risk is not observable; rather, we observe only the minimum lifetime among all risks. In reliability, for instance, we observe only the minimum component lifetime of a series system. That is, the observable quantities for each unit are the minimum lifetime to failure among all risks and the cause of failure. Competing risks problems arise in several areas, and full statistical procedures and an extensive literature are available; interested readers can refer to [9], [10] and [11]. A difficulty arises, however, if the risks are latent in the sense that there is no information about which factor was responsible for the failure, as often happens in field data. We call these latent competing risks data [8]. On many occasions this information is not available, or it is impossible for an expert to specify the true cause of failure. In reliability, the components can be totally destroyed in the experiment. Further, the true cause of failure can be masked from our view. In medical studies, a patient can die and the true cause can be attributed to multiple unknown risks.

In this context, our model can be derived as follows. Let M be a random variable denoting the number of failure causes, following a geometric distribution with probability mass function

\[
P(M = m) = \theta(1-\theta)^{m-1}, \tag{5}
\]

where 0 < θ < 1 and m = 1, 2, ....

Also, let t_i, i = 1, 2, 3, ..., be realizations of random variables T_i denoting the failure times, i.e., the time-to-event due to the ith latent competing risk, where each T_i has an exponentiated exponential distribution with probability density function, indexed by λ and α, given by

\[
f(t_i; \lambda, \alpha) = \alpha\lambda \exp\{-\lambda t_i\}\left(1 - \exp\{-\lambda t_i\}\right)^{\alpha-1}. \tag{6}
\]

The exponentiated exponential distribution, proposed by [2] and [7], is an alternative to the Weibull and gamma distributions.

In the latent competing risks scenario, the number of causes M and the lifetimes t_i associated with the particular causes are not observable (latent variables); only the minimum lifetime Y among all causes is observed. So, we only observe the random variable

\[
Y = \min(t_1, t_2, \ldots, t_M). \tag{7}
\]

The following result shows that the random variable Y has probability density function given by (1).

Proposition 2.1. If the random variable Y is defined as in (7), then, considering (5) and (6), Y is distributed according to an E2G distribution, with probability density function given by (1).

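Proof 2.1. Taking the T_i to be independent and identically distributed, and independent of M, the conditional density function of (7) given M = m is

\[
f(y \mid M = m; \lambda, \alpha) = m\,\alpha\lambda\, e^{-\lambda y}(1 - e^{-\lambda y})^{\alpha-1}\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)^{m-1}, \quad y > 0,\; m = 1, 2, \ldots
\]

Then, the marginal probability density function of Y is given by

\[
\begin{aligned}
f(y) &= \sum_{m=1}^{\infty} m\,\alpha\lambda\, e^{-\lambda y}(1 - e^{-\lambda y})^{\alpha-1}\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)^{m-1} \times \theta(1-\theta)^{m-1} \\
&= \theta\alpha\lambda\, e^{-\lambda y}(1 - e^{-\lambda y})^{\alpha-1} \sum_{m=1}^{\infty} m\left[(1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)\right]^{m-1} \\
&= \theta\alpha\lambda\, e^{-\lambda y}(1 - e^{-\lambda y})^{\alpha-1} \left(\frac{1}{1 - (1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)}\right)^{2},
\end{aligned}
\]

where the last equality uses \(\sum_{m=1}^{\infty} m x^{m-1} = (1-x)^{-2}\) for |x| < 1. This completes the proof.

□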
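The construction in Proposition 2.1 can be checked informally by simulation: draw M from (5), draw M latent times from (6), keep the minimum as in (7), and compare the empirical survival proportion with (2). The sketch below is only an illustration; the function names are hypothetical, and the latent times are assumed independent, as in the proof.

```python
import math
import random

def draw_latent_minimum(lam, alpha, theta, rng):
    """One draw of Y = min(T_1, ..., T_M) with M geometric as in (5)."""
    m = 1
    while rng.random() >= theta:  # P(M = m) = theta * (1 - theta)**(m - 1)
        m += 1
    # Inverse-CDF draws from the exponentiated exponential distribution (6),
    # whose distribution function is (1 - exp(-lam * t))**alpha.
    times = [-math.log1p(-rng.random() ** (1.0 / alpha)) / lam for _ in range(m)]
    return min(times)

def e2g_sf(y, lam, alpha, theta):
    """Survival function (2)."""
    g = (-math.expm1(-lam * y)) ** alpha
    return theta * (1.0 - g) / (1.0 - (1.0 - theta) * (1.0 - g))

def empirical_sf(draws, y):
    """Proportion of simulated lifetimes exceeding y."""
    return sum(d > y for d in draws) / len(draws)
```

With a few thousand draws, `empirical_sf(draws, y)` should agree with `e2g_sf(y, lam, alpha, theta)` up to Monte Carlo error.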

2.2 Some Properties

Many of the interesting characteristics and features of a distribution can be studied through its moments, such as the mean and variance. Expressions for the expectation, variance and the rth moment about the origin of a nonnegative random variable X can be obtained using the well-known formula

\[
E[X^{r}] = r\int_{0}^{\infty} x^{r-1} S(x)\,dx. \tag{8}
\]

A general expression for the rth ordinary moment µ′_r = E(Y^r) of the variable Y, with density function given by (1), can be obtained analytically if we consider the binomial series expansion

\[
(1 - x)^{-r} = \sum_{k=0}^{\infty} \frac{(r)_k}{k!}\, x^{k}, \tag{9}
\]

which converges for |x| < 1, where (r)_k is the Pochhammer symbol, (r)_k = r(r + 1) ⋯ (r + k − 1), and

\[
(-r)_k = (-1)^{k}\,(r - k + 1)_k. \tag{10}
\]

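Proposition 2.2. For the random variable Y with E2G distribution, the rth moment is given by

\[
\mu'_r = \frac{\theta\, r!}{\lambda^{r}} \sum_{l=0}^{\infty}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{j=0}^{\infty} \frac{(-1)^{l+m}\,(j - l + 2)_l\,(\alpha l + k - m + 1)_m\,(1-\theta)^{j}}{l!\,m!\,(m+1)^{r}}.
\]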

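Proof 2.2. From (2) and (8), using (9) and the substitutions z = 1 − e^{−λy} and u = −ln(1 − z), we have that

\[
\begin{aligned}
\mu'_r &= r\int_{0}^{\infty} y^{r-1} S(y)\,dy \\
&= r\theta\int_{0}^{\infty} y^{r-1}\,\frac{1 - (1 - e^{-\lambda y})^{\alpha}}{1 - (1-\theta)\left(1 - (1 - e^{-\lambda y})^{\alpha}\right)}\,dy \\
&= r\theta\int_{0}^{1} \left(\frac{-\ln(1-z)}{\lambda}\right)^{r-1} \frac{1 - z^{\alpha}}{\left[1 - (1-\theta)(1 - z^{\alpha})\right](1-z)\lambda}\,dz \\
&= \frac{r\theta(-1)^{r-1}}{\lambda^{r}} \sum_{k=0}^{\infty} \frac{(1)_k}{k!} \int_{0}^{1} \left(\ln(1-z)\right)^{r-1} \frac{z^{k}(1 - z^{\alpha})}{1 - (1-\theta)(1 - z^{\alpha})}\,dz \\
&= \frac{r\theta(-1)^{r-1}}{\lambda^{r}} \sum_{k=0}^{\infty}\sum_{j=0}^{\infty} \frac{(1)_j}{j!}\,(1-\theta)^{j} \int_{0}^{1} \left(\ln(1-z)\right)^{r-1} z^{k}(1 - z^{\alpha})^{j+1}\,dz \\
&= \frac{r\theta(-1)^{r-1}}{\lambda^{r}} \sum_{k=0}^{\infty}\sum_{j=0}^{\infty}\sum_{l=0}^{\infty} \frac{(-(j+1))_l}{l!}\,(1-\theta)^{j} \int_{0}^{1} \left(\ln(1-z)\right)^{r-1} z^{\alpha l + k}\,dz \\
&= \frac{r\theta}{\lambda^{r}} \sum_{k=0}^{\infty}\sum_{j=0}^{\infty}\sum_{l=0}^{\infty} \frac{(-(j+1))_l}{l!}\,(1-\theta)^{j} \int_{0}^{\infty} u^{r-1}(1 - e^{-u})^{\alpha l + k} e^{-u}\,du \\
&= \frac{r\theta}{\lambda^{r}} \sum_{k=0}^{\infty}\sum_{j=0}^{\infty}\sum_{l=0}^{\infty}\sum_{m=0}^{\infty} \frac{(-(j+1))_l}{l!}\,\frac{(-(\alpha l + k))_m}{m!}\,(1-\theta)^{j} \int_{0}^{\infty} u^{r-1} e^{-u(m+1)}\,du.
\end{aligned}
\]

The result follows from the gamma integral \(\int_{0}^{\infty} u^{r-1} e^{-u(m+1)}\,du = \Gamma(r)/(m+1)^{r}\) and property (10), which completes the proof.

□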
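In practice the moments can also be obtained by integrating (8) numerically, which gives a convenient cross-check on any truncated version of the series. The sketch below uses a plain composite Simpson rule and truncates the integral at 40/λ; both choices, and all function names, are assumptions of this illustration rather than part of the report. For α = 1 the integral in (8) with r = 1 has the closed form −θ ln θ / (λ(1 − θ)), which offers a simple sanity check.

```python
import math

def e2g_sf(y, lam, alpha, theta):
    """Survival function (2)."""
    g = (-math.expm1(-lam * y)) ** alpha
    return theta * (1.0 - g) / (1.0 - (1.0 - theta) * (1.0 - g))

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def e2g_raw_moment(r, lam, alpha, theta):
    """E[Y^r] via (8), with the upper limit truncated at 40 / lam.

    S(y) decays like exp(-lam * y), so the neglected tail is of order exp(-40).
    """
    upper = 40.0 / lam
    return r * simpson(lambda y: y ** (r - 1) * e2g_sf(y, lam, alpha, theta), 0.0, upper)
```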

Order statistics are among the most fundamental tools in non-parametric statistics and inference. Let Y_1, ..., Y_n be a random sample taken from the E2G distribution and let Y_{1:n}, ..., Y_{n:n} denote the corresponding order statistics. Then, the probability density function f_{i:n}(y) of the ith order statistic Y_{i:n} is given by

\[
f_{i:n}(y) = \frac{n!}{(i-1)!\,(n-i)!}\,F(y)^{i-1}\,(1 - F(y))^{n-i}\,f(y).
\]

The rth moment of the ith order statistic Y_{i:n}, according to [12], can be represented as

\[
E\left[Y_{i:n}^{r}\right] = r\sum_{p=n-i+1}^{n} (-1)^{p-n+i-1}\binom{p-1}{n-i}\binom{n}{p}\int_{0}^{\infty} y^{r-1}\,[S(y)]^{p}\,dy. \tag{11}
\]

Proposition 2.3. For the random variable Y with E2G distribution the rth moment of the ith order statistic is given by

\[
E\left[Y_{i:n}^{r}\right] = \frac{r!}{\lambda^{r}} \sum_{p=n-i+1}^{n}\sum_{l=0}^{\infty}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\sum_{j=0}^{\infty} (-1)^{p-n+i-1+l+m}\binom{p-1}{n-i}\binom{n}{p}\,\theta^{p}\, \frac{(p)_j\,(p + j - l + 1)_l\,(1-\theta)^{j}\,(\alpha l + k - m + 1)_m}{j!\,l!\,m!\,(m+1)^{r}}.
\]

Proof 2.3. From (11), using (2) and (9), and proceeding in a similar way as in the proof of Proposition 2.2, the result follows.

□
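Formula (11) can also be evaluated directly by numerical integration, without the series. The sketch below is an illustration under the same assumptions as before (hypothetical function names, composite Simpson rule, integral truncated at 40/λ). A useful check is that the moments of the n order statistics must add up to n times the corresponding moment of Y.

```python
import math

def e2g_sf(y, lam, alpha, theta):
    """Survival function (2)."""
    g = (-math.expm1(-lam * y)) ** alpha
    return theta * (1.0 - g) / (1.0 - (1.0 - theta) * (1.0 - g))

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def order_stat_moment(r, i, n, lam, alpha, theta):
    """E[Y_{i:n}^r] from (11), integrating y^(r-1) * S(y)^p numerically."""
    upper = 40.0 / lam  # truncation point; the neglected tail is negligible
    total = 0.0
    for p in range(n - i + 1, n + 1):
        coef = (-1) ** (p - n + i - 1) * math.comb(p - 1, n - i) * math.comb(n, p)
        integral = simpson(
            lambda y: y ** (r - 1) * e2g_sf(y, lam, alpha, theta) ** p, 0.0, upper
        )
        total += coef * integral
    return r * total
```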


Given that there was no failure prior to time t, the residual lifetime distribution of a random variable X with E2G distribution has survival function

\[
S_t(x) = \Pr[X > x + t \mid X > t] = \frac{S(x+t)}{S(t)} = \left(\frac{1 - (1 - e^{-\lambda(x+t)})^{\alpha}}{1 - (1 - e^{-\lambda t})^{\alpha}}\right)\left(\frac{1 - (1-\theta)\left(1 - (1 - e^{-\lambda t})^{\alpha}\right)}{1 - (1-\theta)\left(1 - (1 - e^{-\lambda(x+t)})^{\alpha}\right)}\right).
\]

The mean residual lifetime of a continuous distribution with survival function S(x) is given by

\[
\mu(t) = E(X - t \mid X > t) = \frac{1}{S(t)}\int_{t}^{\infty} S(u)\,du. \tag{12}
\]

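Proposition 2.4. For the random variable Y with E2G distribution the mean residual lifetime is given by

\[
\mu(t) = \frac{1}{\lambda}\left(\frac{1 - (1-\theta)\left(1 - (1 - e^{-\lambda t})^{\alpha}\right)}{1 - (1 - e^{-\lambda t})^{\alpha}}\right) \sum_{k=0}^{\infty}\sum_{i=0}^{\infty}\sum_{j=0}^{\infty} \frac{(1-\theta)^{i}(-1)^{j}(i - j + 2)_j}{j!}\left(\frac{1 - (1 - e^{-\lambda t})^{k+\alpha j+1}}{k + \alpha j + 1}\right).
\]

Proof 2.4. From (12) and using S(y) given by (2), the factor θ in S(u) cancels against the factor 1/θ in 1/S(t), and the substitution x = 1 − e^{−λu} gives

\[
\begin{aligned}
\frac{1}{S(t)}\int_{t}^{\infty} S(u)\,du &= \frac{1 - (1-\theta)\left(1 - (1 - e^{-\lambda t})^{\alpha}\right)}{1 - (1 - e^{-\lambda t})^{\alpha}} \int_{t}^{\infty} \frac{1 - (1 - e^{-\lambda u})^{\alpha}}{1 - (1-\theta)\left(1 - (1 - e^{-\lambda u})^{\alpha}\right)}\,du \\
&= \frac{1}{\lambda}\,\frac{1 - (1-\theta)\left(1 - (1 - e^{-\lambda t})^{\alpha}\right)}{1 - (1 - e^{-\lambda t})^{\alpha}} \int_{1-e^{-\lambda t}}^{1} \frac{1 - x^{\alpha}}{\left(1 - (1-\theta)(1 - x^{\alpha})\right)(1 - x)}\,dx.
\end{aligned}
\]

Now, expanding the integrand with (9) and (10) and integrating term by term, in a similar way as in the proof of Proposition 2.2, the result follows.

□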
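The mean residual lifetime (12) is straightforward to evaluate numerically, which is a useful check on a truncated series. The sketch below is illustrative only: function names are hypothetical, the integral is truncated at t + 40/λ, and a composite Simpson rule is used. Two sanity checks follow from the text: µ(0) equals the mean of Y, and µ(t) approaches 1/λ for large t because the hazard tends to λ.

```python
import math

def e2g_sf(y, lam, alpha, theta):
    """Survival function (2)."""
    g = (-math.expm1(-lam * y)) ** alpha
    return theta * (1.0 - g) / (1.0 - (1.0 - theta) * (1.0 - g))

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def mean_residual_life(t, lam, alpha, theta):
    """mu(t) from (12), with the upper limit truncated at t + 40 / lam."""
    upper = t + 40.0 / lam
    area = simpson(lambda u: e2g_sf(u, lam, alpha, theta), t, upper)
    return area / e2g_sf(t, lam, alpha, theta)
```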

Proposition 2.5. The modal value for the variable Y with density given by (1) is given by

\[
Y = -\frac{1}{\lambda}\ln\left(\operatorname{Root}\left[\theta(\alpha x - 1) + (1 - x)^{\alpha}\left(-1 + \theta + \alpha x(\theta - 1)\right)\right]\right), \quad \text{if } \alpha > 1, \tag{13}
\]

where Root[f(x)] is the root in x of f(x) = 0, which is evaluated by numerical methods.

Proof 2.5. The result is obtained directly by solving, from (1), the equation df(y)/dy = 0.

□
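One simple numerical method for (13) is bisection on x = e^{−λy} over (0, 1). The bracketed expression equals −1 at x = 0 and θ(α − 1) > 0 at x = 1 when α > 1, and its derivative in x is positive on (0, 1), so the root is unique. The sketch below is an illustration with hypothetical function names; the report does not say which root-finder was used.

```python
import math

def mode_equation(x, alpha, theta):
    """Bracketed expression in (13), as a function of x = exp(-lam * y)."""
    return theta * (alpha * x - 1.0) + (1.0 - x) ** alpha * (
        -1.0 + theta + alpha * x * (theta - 1.0)
    )

def e2g_mode(lam, alpha, theta, tol=1e-12):
    """Modal value from (13) by bisection; only defined here for alpha > 1."""
    if alpha <= 1:
        raise ValueError("(13) applies only for alpha > 1")
    lo, hi = 0.0, 1.0  # mode_equation(lo) < 0 < mode_equation(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mode_equation(mid, alpha, theta) < 0.0:
            lo = mid
        else:
            hi = mid
    return -math.log(0.5 * (lo + hi)) / lam
```

Because λ enters only through the final division, the mode scales as 1/λ, as expected for a scale parameter.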

2.3 Inference

Assuming the lifetimes are independently distributed and are independent of the censoring mechanism, the maximum likelihood estimates (MLEs) of the parameters are obtained by direct maximization of the log-likelihood function

\[
\begin{aligned}
\ell(\theta, \lambda, \alpha) ={}& n\ln(\theta) + \ln(\alpha\lambda)\sum_{i=1}^{n} c_i - \lambda\sum_{i=1}^{n} c_i y_i + (\alpha - 1)\sum_{i=1}^{n} c_i \ln(1 - e^{-\lambda y_i}) \\
&+ \sum_{i=1}^{n} (1 - c_i)\ln\left(1 - (1 - e^{-\lambda y_i})^{\alpha}\right) - \sum_{i=1}^{n} (1 + c_i)\ln\left(1 - (1-\theta)\left(1 - (1 - e^{-\lambda y_i})^{\alpha}\right)\right),
\end{aligned} \tag{14}
\]

where c_i is a censoring indicator, equal to 0 if the ith observation is censored and to 1 if it is observed. The advantage of this procedure is that it runs immediately using existing statistical packages. We have used the optim routine of R [13]. Large-sample inference for the parameters is based on the MLEs and their estimated standard errors.

In order to compare distributions we consider the −max ℓ(·) values and the Akaike information criterion (AIC), defined by AIC = −2ℓ(·) + 2q, where ℓ(·) is the log-likelihood evaluated at the MLE vector of parameters of the respective distribution and q is the number of estimated parameters. The best distribution corresponds to the lowest −max ℓ(·) and AIC values.
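The log-likelihood (14) and the AIC are simple to code. The sketch below is a hedged Python illustration with hypothetical function names; the report itself used the optim routine of R, so this is not the authors' implementation. In Python, the negative of `e2g_loglik` could be passed to a general-purpose optimizer such as `scipy.optimize.minimize`.

```python
import math

def e2g_loglik(params, times, observed):
    """Log-likelihood (14); params = (theta, lam, alpha).

    observed[i] is 1 for an observed failure and 0 for a censored time.
    Returns -inf outside the parameter space so an optimizer can reject the point.
    """
    theta, lam, alpha = params
    if not (0.0 < theta < 1.0 and lam > 0.0 and alpha > 0.0):
        return -math.inf
    total = 0.0
    for y, c in zip(times, observed):
        base = -math.expm1(-lam * y)  # 1 - exp(-lam * y); requires y > 0
        g = base ** alpha
        d = 1.0 - (1.0 - theta) * (1.0 - g)
        if c:   # observed failure: contributes log f(y)
            total += (math.log(theta) + math.log(alpha * lam) - lam * y
                      + (alpha - 1.0) * math.log(base) - 2.0 * math.log(d))
        else:   # censored time: contributes log S(y)
            total += math.log(theta) + math.log(1.0 - g) - math.log(d)
    return total

def aic(max_loglik, n_params):
    """Akaike information criterion, -2 * loglik + 2 * q."""
    return -2.0 * max_loglik + 2.0 * n_params
```

As a consistency check against Table 2, `aic(-279.7394, 3)` reproduces the AIC reported for the E2G fit to T1.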


3 On the relationship between the EG and E2G distributions

In this section, we discuss the relationship between the EG distribution [1], which motivated this article, and the E2G distribution proposed here.

The hazard function of the EG distribution is given by

\[
h(y) = \frac{\lambda}{1 - (1-\theta)e^{-\lambda y}}. \tag{15}
\]

Proposition 3.1. The hazard functions (15) and (3) take different limiting values as y → 0 whenever α ≠ 1, and converge to the same value λ as y → ∞.

Proof 3.1. (a) For y → 0, the EG model gives \(\lim_{y\to 0} h(y) = \lim_{y\to 0} \lambda/(1 - (1-\theta)e^{-\lambda y}) = \lambda/\theta\), while for the E2G distribution \(\lim_{y\to 0} h(y) = \lambda/\theta\) if α = 1 (the EG case), \(\lim_{y\to 0} h(y) = \infty\) if α < 1 and \(\lim_{y\to 0} h(y) = 0\) if α > 1, which proves the initial divergence between the EG and E2G hazard functions for α ≠ 1.

(b) For y → ∞, the EG model gives \(\lim_{y\to\infty} h(y) = \lim_{y\to\infty} \lambda/(1 - (1-\theta)e^{-\lambda y}) = \lambda\), and for the E2G distribution \(\lim_{y\to\infty} h(y) = \lambda\), which proves the convergence.

□
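The limits in Proposition 3.1 can be illustrated numerically by evaluating (15) and (3) near the origin and at a large time. The following sketch uses hypothetical function names and the parameter values of Figure 2; it is a numerical illustration, not a substitute for the proof.

```python
import math

def eg_hazard(y, lam, theta):
    """EG hazard function (15)."""
    return lam / (1.0 - (1.0 - theta) * math.exp(-lam * y))

def e2g_hazard(y, lam, alpha, theta):
    """E2G hazard function (3); use y > 0 when alpha < 1."""
    base = -math.expm1(-lam * y)  # 1 - exp(-lam * y)
    g = base ** alpha
    d = 1.0 - (1.0 - theta) * (1.0 - g)
    return alpha * lam * math.exp(-lam * y) * base ** (alpha - 1.0) / ((1.0 - g) * d)

lam, theta = 1.0, 0.1
near_zero = {
    "EG": eg_hazard(1e-8, lam, theta),                    # close to lam / theta
    "E2G, alpha=0.25": e2g_hazard(1e-8, lam, 0.25, theta),  # very large
    "E2G, alpha=4": e2g_hazard(1e-8, lam, 4.0, theta),      # close to 0
}
far_out = {
    "EG": eg_hazard(20.0, lam, theta),                    # close to lam
    "E2G, alpha=0.25": e2g_hazard(20.0, lam, 0.25, theta),
    "E2G, alpha=4": e2g_hazard(20.0, lam, 4.0, theta),
}
```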

Figure 2 shows the hazard function of the EG model for θ = 0.1, 0.9 and the hazard function of the E2G model for θ = 0.1, 0.9 and α = 0.25, 4. The EG hazard function is decreasing, while the E2G hazard function can be decreasing, increasing or unimodal; both hazards converge to λ as y → ∞, in agreement with Proposition 3.1.

[Figure 2 plots omitted: two panels with x-axis "Times" (0 to 6) and y-axis "Hazard Function" (0 to 4). Left panel: "EG, with λ=1.0", curves for θ = 0.10 and θ = 0.90. Right panel: "E2G, with λ=1.0", curves for (θ, α) = (0.10, 0.25), (0.90, 0.25), (0.10, 4.00) and (0.90, 4.00).]

Figure 2: Comparing the hazard function of the EG and E2G distributions for the same fixed θ values.

A misspecification study was performed in order to verify whether we can distinguish between the EG and E2G distributions, in the light of a dataset, based on a usual comparison criterion. We consider here the −max ℓ(·) values; the preferred distribution is the one with the smaller −max ℓ(·) value.

We generated 1,000 samples from each of the EG and E2G distributions. We considered sample sizes n equal to 10, 20, 30, 50 and 100, and censoring percentages p equal to 0.1, 0.2 and 0.3. We fixed λ = 0.5 and θ = 0.1 or θ = 0.9 for both distributions, and α = 3 for the E2G distribution. The two distributions were fitted to each sample and their −max ℓ(·) values were calculated. For θ = 0.1 and α = 3 the data have a unimodal hazard function, and for θ = 0.9 and α = 3 they have an increasing hazard function. In principle, data generated with α = 3 and either θ = 0.1 or θ = 0.9 should be better fitted by an E2G distribution. Table 1 shows the proportion of times that the distribution which originated the sample was the best fitted distribution according to the −max ℓ(·) values. We observe that it is usually possible to discriminate between the distributions even for moderate samples in the presence of censoring. We also performed the study considering the AIC instead of the −max ℓ(·) values; the results lead to similar conclusions and are omitted here.

Table 1: Proportion of times that the distribution which originated the sample (EG / E2G) was the best fitted distribution, by censoring percentage p (rows) and sample size n (columns).

θ = 0.9

| p \ n | 10 | 20 | 30 | 50 | 100 |
|---|---|---|---|---|---|
| 0.10 | 0.524/0.639 | 0.616/0.884 | 0.737/0.940 | 0.760/0.975 | 0.848/0.999 |
| 0.20 | 0.432/0.482 | 0.516/0.707 | 0.637/0.778 | 0.671/0.879 | 0.784/0.961 |
| 0.30 | 0.428/0.402 | 0.453/0.545 | 0.574/0.628 | 0.599/0.696 | 0.712/0.826 |

θ = 0.1

| p \ n | 10 | 20 | 30 | 50 | 100 |
|---|---|---|---|---|---|
| 0.10 | 0.491/0.870 | 0.634/0.992 | 0.697/0.999 | 0.777/0.999 | 0.827/0.999 |
| 0.20 | 0.469/0.826 | 0.617/0.977 | 0.670/0.998 | 0.742/0.999 | 0.819/0.999 |
| 0.30 | 0.480/0.766 | 0.587/0.952 | 0.673/0.992 | 0.725/0.999 | 0.772/0.999 |

4 Applications

In this section, we compare the E2G distribution fit with several usual lifetime distributions on three datasets extracted from the literature: one with increasing hazard function, one with unimodal hazard function and one with decreasing hazard function.

The following lifetime distributions were considered: the Exponential distribution, with probability density function $f(x) = \lambda e^{-\lambda x}$; the Weibull distribution, with probability density function $f(x) = \frac{\theta}{\lambda}\left(\frac{x}{\lambda}\right)^{\theta-1} e^{-(x/\lambda)^{\theta}}$; the gamma distribution, with probability density function $f(x) = \frac{1}{\lambda^{\theta}\Gamma(\theta)}\, x^{\theta-1} e^{-x/\lambda}$; the EG distribution [1], with probability density function $f(x) = \lambda\theta e^{-\lambda x}\left(1 - (1-\theta)e^{-\lambda x}\right)^{-2}$, that is, (1) with α = 1; the Modified Weibull (MW) distribution [14], with probability density function $f(x) = \alpha x^{\theta-1}(\theta + \lambda x)\, e^{\lambda x} \exp\{-\alpha x^{\theta} e^{\lambda x}\}$; and the generalized exponential-Poisson (GEP) distribution [4], with probability density function $f(x) = \frac{\alpha\beta\lambda}{(1 - e^{-\lambda})^{\alpha}}\left(1 - e^{-\lambda + \lambda\exp(-\beta x)}\right)^{\alpha-1} e^{-\lambda - \beta x + \lambda\exp(-\beta x)}$.

The first set, hereafter T1, consists of observed survival times for 65 breast cancer patients treated over the period 1929-1938, quoted by [15]. The second set, hereafter T2, consists of survival times for patients with bile duct cancer who took part in a study to determine whether a combination of a radiation treatment (R0Rx) and the drug 5-fluorouracil (5-FU) prolonged survival, extracted from [16]; the survival times, in days, are given for a control group. The third set, hereafter T3, consists of the number of successive failures of the air conditioning system of each member in a fleet of 13 Boeing 720 jet airplanes. The pooled data, with 214 observations, were considered by [1]. They were first analyzed by [17] and discussed further by [18], [19], [3] and [20].

Firstly, in order to identify the shape of the hazard function we consider a graphical method based on the TTT plot [21]. In its empirical version the TTT plot is given by

\[
G(r/n) = \frac{\sum_{i=1}^{r} Y_{i:n} + (n - r)\,Y_{r:n}}{\sum_{i=1}^{n} Y_{i:n}},
\]

where r = 1, ..., n and Y_{i:n} represent the order statistics of the sample. It has been shown that the hazard function is increasing (decreasing) if the TTT plot is concave (convex). The left panels of Figure 3 show a concave TTT plot for T1, a concave and then convex one for T2 and a convex one for T3, indicating increasing, unimodal and decreasing hazard functions, respectively.
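The empirical TTT transform is easy to compute from a sample of complete (uncensored) lifetimes. The following sketch returns the points (r/n, G(r/n)) that would be plotted; the function name is hypothetical and the handling of censored data is not addressed here.

```python
def ttt_transform(sample):
    """Return the points (r/n, G(r/n)), r = 1..n, of the empirical TTT plot."""
    ys = sorted(sample)            # order statistics Y_{1:n} <= ... <= Y_{n:n}
    n = len(ys)
    total = sum(ys)
    points = []
    running = 0.0                  # partial sum Y_{1:n} + ... + Y_{r:n}
    for r, y in enumerate(ys, start=1):
        running += y
        points.append((r / n, (running + (n - r) * y) / total))
    return points
```

For the toy sample 3, 1, 2 the ordinates are 0.5, 5/6 and 1; they lie above the diagonal, the pattern associated with an increasing hazard.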

Table 2 provides the −max ℓ(·) and AIC values for all fitted distributions. Both criteria provide evidence in favor of our E2G distribution for T2 and T3, and give similar values for the E2G and gamma distributions for T1, corroborating the fact that the E2G distribution can be seen as a competitive distribution of practical interest for the analysis of survival data. Figure 3 (right panels) shows the fitted survival functions superimposed on the empirical Kaplan-Meier survival function. The MLEs (with their corresponding standard errors in parentheses) of the parameters α, θ and λ (×1000) of the E2G distribution are given, respectively, by 2.012 (0.719), 0.582 (0.551) and 41.11 (9.98) for T1, by 1.978 (2.377), 0.021 (0.0489) and 0.85 (0.188) for T2, and by 1.238 (0.273), 0.314 (0.141) and 6.90 (0.72) for T3.

Table 2: Values of the −max ℓ(·) and AIC for all fitted distributions.

| Dataset | Criterion | E | EG | Weibull | Gamma | E2G | MW | GEP |
|---|---|---|---|---|---|---|---|---|
| T1 | −max ℓ(·) | 284.6012 | 284.6012 | 280.6973 | 280.0386 | 279.7394 | 280.6974 | 279.9055 |
| T1 | AIC | 571.2024 | 573.2024 | 565.3946 | 564.0772 | 565.4788 | 567.3947 | 565.8110 |
| T2 | −max ℓ(·) | 172.6231 | 171.2032 | 172.2865 | 172.6161 | 167.3942 | 172.2865 | 168.7415 |
| T2 | AIC | 347.2463 | 346.4064 | 348.5730 | 349.2322 | 340.7883 | 350.5731 | 343.4831 |
| T3 | −max ℓ(·) | 1178.766 | 1175.925 | 1177.585 | 1178.291 | 1174.260 | 1177.585 | 1174.785 |
| T3 | AIC | 2359.532 | 2355.849 | 2359.170 | 2360.582 | 2354.520 | 2361.170 | 2355.570 |

5 Concluding remarks

In this paper we propose a new lifetime distribution. The E2G distribution is a straightforward generalization of the EG distribution proposed by [1], and it accommodates increasing, decreasing and unimodal hazard functions. It arises in a latent competing risks scenario, where the lifetime associated with a particular risk is not observable and only the minimum lifetime among all risks is observed. The properties of the proposed distribution are discussed, including a formal proof of its probability density function and explicit algebraic formulas for its survival and hazard functions, moments, rth moment of the ith order statistic, mean residual lifetime and modal value. Maximum likelihood inference is implemented straightforwardly. From a misspecification simulation study, performed in order to assess the extent of the misspecification errors when testing the EG distribution against the E2G one, we observed that it is usually possible to discriminate between both distributions even for moderate samples in the presence of censoring. The practical importance of the new distribution was demonstrated in three applications where the E2G distribution provided a better fit than the EG distribution and several other existing lifetime distributions.

Acknowledgments: The research of Francisco Louzada is supported by the Brazilian organization CNPq.

[Figure 3 plots omitted: for each dataset, a left panel titled "TTT-Plot" with both axes running from 0 to 1 (axis labels n/r and G(n/r)), and a right panel of "S(t) estimated" (0 to 1) against "Time" with fitted curves for the E, EG, Weibull, Gamma, E2G, MW and GEP distributions. The time axes run from 0 to 120, from 0 to 2500 and from 0 to 600, respectively.]

Figure 3: Left panels: Empirical TTT-Plot. Right panels: Kaplan-Meier and fitted survival curves. Upper panels are for dataset T1, middle panels are for dataset T2 and lower panels are for dataset T3.

References

[1] K. Adamidis and S. Loukas. A lifetime distribution with decreasing failure rate. Statistics and Probability Letters, 39:35–42, 1998.

[2] R. Gupta and D. Kundu. Generalized exponential distributions. Australian and New Zealand Journal of Statistics, 41:173–188, 1999.

[3] C. Kus. A new lifetime distribution. Computational Statistics and Data Analysis, 51:4497–4509, 2007.

[4] W. Barreto-Souza and F. Cribari-Neto. A generalization of the exponential-Poisson distribution. Statistics and Probability Letters, 79:2493–2500, 2009.

[5] G. D. C. Barriga, F. Louzada-Neto, and V. G. Cancho. The complementary exponential power lifetime model. Computational Statistics and Data Analysis, 55:1250–1259, 2011.

[6] R. M. Smith and L. J. Bain. An exponential power life-testing distribution. Communications in Statistics, 4:469–481, 1975.

[7] R. D. Gupta and D. Kundu. Exponentiated exponential family: An alternative to gamma and Weibull distributions. Biometrical Journal, 43:117–130, 2001.

[8] F. Louzada-Neto. Poly-hazard regression models for lifetime data. Biometrics, 55:1121–1125, 1999.

[9] J. F. Lawless. Statistical Models and Methods for Lifetime Data, second edition. Wiley, New York, NY, 2003.

[10] M. J. Crowder, A. C. Kimber, R. L. Smith, and T. J. Sweeting. Statistical Analysis of Reliability Data. Chapman and Hall, London, 1991.

[11] D. R. Cox and D. Oakes. Analysis of Survival Data. Chapman and Hall, London, 1984.

[12] H. M. Barakat and Y. H. Abdelkader. Computing the moments of order statistics from nonidentical random variables. Statistical Methods and Applications, 12(1):15–26, 2004.

[13] R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2008.

[14] C. D. Lai, M. Xie, and N. P. Murthy. A modified Weibull distribution. IEEE Transactions on Reliability, 52:33–37, 2003.

[15] J. Boag. Maximum likelihood estimates of the proportion of patients cured by cancer therapy. Journal of the Royal Statistical Society, Series B, 11:15–44, 1949.

[16] T. R. Fleming, J. R. O'Fallon, P. C. O'Brien, and D. P. Harrington. Modified Kolmogorov-Smirnov test procedures with application to arbitrarily right-censored data. Biometrics, 36(4):607–625, 1980.

[17] F. Proschan. Theoretical explanation of observed decreasing failure rate. Technometrics, 5:375–383, 1963.

[18] A. Dahiya and J. Gurland. Goodness of fit tests for the gamma and exponential distributions. Technometrics, 14:791–801, 1972.

[19] L. J. Gleser. The gamma distribution as a mixture of exponential distributions. The American Statistician, 43:115–117, 1989.

[20] W. Barreto-Souza, A. L. de Morais, and G. M. Cordeiro. The Weibull-geometric distribution. Journal of Statistical Computation and Simulation, pages 1–14, 2008.

[21] M. V. Aarset. How to identify a bathtub hazard rate. IEEE Transactions on Reliability, 2:106–108, 1987.
