1 borgan and henderson: event history methodology lancaster, september 2006 session 1: event history...

31
1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

Upload: virgil-hodge

Post on 27-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

1

Borgan and Henderson:

Event History Methodology

Lancaster, September 2006

Session 1:

Event history data and

counting processes

Page 2: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

2

• A survival time is the time elapsed from an initial event to a well-defined end-point, e.g.– From birth to death (time=age)

– From birth to cancer diagnosis (time=age)– From cancer diagnosis to death (time=disease duration)

Survival analysis: censoring and truncation

• A special feature of survival data is that we usually have incomplete observation of the survival times due to censoring and truncation.

• Traditionally research in event history analysis has focused on situations where the interest is in a single event for each subject under study. This is called survival analysis.

Page 3: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

3

The most common form of incompleteness is right censoring: for some individuals we only know that their true survival times are larger than certain censoring times.

Censoring may be due to a number of reasons: termination of study (cf. figure below), withdrawals, lost to follow-up, etc.

Page 4: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

4

Due to censoring the commonly used statistical methods do not apply to censored survival data. (We cannot even calculate a mean!)

Another common form of incomplete observation is due to left truncation (delayed entry): some (all) individuals are not followed from time zero (in the study time scale), but only from later entry times.

We usually have left truncated data when age is the study time scale (as is often the case in epidemiology).

Left truncation usually occurs in combination with right censoring.

Page 5: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

5

Survival analysis: Examples

Lung cancer data:• 272 non-small-cell lung cancer patients in Yorkshire followed for 3 years in the early 1990s.• Observe time to cancer death or censoring (17%).• Covariates:

– Age: in years– Sex: 0=M, 1=F– Activity score: 0-4– Anorexia: 0=absent, 1=present– Hoarseness: 0=absent, 1=present– Metastases: 0=absent, 1=present

Page 6: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

6

Leukemia data:

• 1043 cases of acute myeloid leukaemia recorded in the North West Leukaemia Register between 1982 and 1998.

• Observe death or censoring (16%).• Covariates:

– Age: in years – Sex: 0=F, 1=M – White blood cell count: (50x109/L), truncated at 500 – Deprivation score: a measure of poverty/affluence for the residential location (low is good)

Page 7: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

7

Uncensored survival time T (assumed absolutely continuous).

Survival function: ( ) ( )S t P T t

Hazard rate:0

1( ) lim ( | )t t

t P T t t T t

Survival analysis: concepts

Heuristically: ( ) ( | )t dt P T t t T t

The hazard rate is the instantaneous probability of the event per unit of time.

Page 8: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

8

Relations between the survival function S(t), the hazard rate and the density function f(t):

0( ) exp ( )

tS t s ds

( ) ( )

( ) ( )( ) log ( )

f t S t

S t S tddt

t S t

0( )

( ) ( ) ( ) ( ) ( )t

s dsf t S t t S t t e

( )t

Page 9: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

9

Distribution Hazard Survival

Exponential

Weibull

Gompertz

Gamma

Survival analysis: some parametric models

te

We will not consider parametric models any further.

( )te1( )t

te exp{ (1 )}te ( , ) / ( )t

(upper) incomplete gamma function

1 / ( , )tt e t

Page 10: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

10

Survival analysis: clustered data

Usually survival data are assumed independent.

One important exception is when subjects are grouped into clusters (e.g. twins, litters, organs of an individual).

The Diabetic Retinopathy Study: – 179 patients with diabetic retinopathy– Laser photocoagulation randomly assigned to one

eye of each patient– Observe time to severe visual loss ("blindness") for

each eye (may be censored)– There are 179 clusters with 2 observations per

cluster

Page 11: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

11

Event history data

Connecting together several events for a subject as they occur over time yields event histories.

Events may be of the same type (recurrent events): - Births for a woman - Episodes of diarrhea for a child

- Recurrent tumours

The events may be of different types: - Marriage, divorce, new marriage, etc. - Cancer diagnosis, remission, relapse, death

Multistate models are used for analysis.

Page 12: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

12

The bladder cancer study:

– 86 patients with superficial bladder tumours

– Tumours were removed, and the patients randomized to placebo or treatment by thiotepa

– Patients were followed up, and the recurrence of tumours were registered

Recurrent event data: examples

Page 13: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

13

New episode:

Follow-up information on 10 children:

X

X

O

Under observation:

Ongoing episode:

Drop-out:

Infant diarrhoea data: – Data on episodes of diarrhoea on 926 children obtained

from household surveys, October 2000 to January 2002, in Salvador, Brazil.

– Various covariates collected at beginning of study.– Observations may be missing due to intermittent

missingness and drop-out.

Page 14: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

14

Recurrent event data: basic models

Two classic models for recurrent event data are the Poisson processes and the renewal processes.

In general one may obtain a model for recurrent events by specifying the intensity :

( ) (event in [ , ) | past events)t dt P t t dt

For a Poisson process is a fixed function not depending on past events.

For a renewal process where s(t) is the time since the last event and h(t) is a fixed function.

( ) ( ( ))t h s t

( )t

( )t

Page 15: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

15

Competing causes of death:

Data from the health screenings in three Norwegian counties 1974-78.

Followed-up to the end of 2000 by record linking to the cause of death registry at Statistics Norway.

Alive

0

Dead due to cancer

1

Dead due cardiovascular disease

2

Dead due to other medical causes

3

Alcohol abuse, accidents, violence

4

Multistate models: examples

Page 16: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

16

Page 17: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

17

Multistate models: the Markov case

Alive Dead 0 1

01( )t

is the hazard rate or transition intensity.01( )t

The survival analysis situation may be modelled by a Markov model with two states:

Page 18: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

18

With two or more causes of failure we get a model for competing risks:

Alive

Deadcause of interest

0

Deadother causes

1

2

02 ( )t

01( )t

and are the cause specific hazards or transition intensities (i.e. instantaneous probabilities of a transition per unit of time).

01( )t 02 ( )t

Page 19: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

19

An illness-death model:

Healthy Diseased

0

Dead

1

202 ( )t

01( )t

12 ( , )t dt = timed = duration

We have a Markov process if the transition intensities do not depend on duration in a state.

Page 20: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

20

Explicit expressions for the transition probabilities

are available for simple Markov models.

00 01 02

0 00 0

( , ) exp ( ) ( )

( , ) ( , ) ( )

t

s

t

j js

P s t u u du

P s t P s u u du

In general there are no explicit expressions for the transition probabilities, but we will see later how the transition probability matrix may be estimated from the matrix of empirical transition intensities.

E.g. for competing risks (j =1,2) :

( , ) (in state at time | in state at time )hjP s t P h t h s

Page 21: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

21

We may be interested in studying simultaneously the occurrences of two or more types of events.

Modelling event history data

When defining a general model for event history data, we need to take care of the fact the event history data are incompletely observed.

For now, however, we concentrate on one type of event, e.g. corresponding to a failure, a recurrent event or one specific transition in a multistate model.

Page 22: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

22

Joint model for the complete event history data and the observation process

Model for the complete event history data

Model for the observed event history data

Parameters of interest are defined for this model

Conditions on the observation process are defined for this model

Statistical methods are derived and studied for this model

We need to consider and relate models for three situations:

Page 23: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

23

We consider a cohort with n individuals, and let be the intensity of the event of interest for the i-th individual:

( )i t

( ) (event in [ , ) | )i tt dt P t t dt H

Here Ht- denotes all information on covariates and events that would have been available at t- if all event histories had been completely observed (note that equals 0 when the event cannot happen).

Model for complete event history data

The are our key model parameters, and an aim of a statistical analysis is to infer how these (and derived quantities) depend on covariates and vary over time.

( )i t

( )i t

Page 24: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

24

Joint model for the event history data and the observation processWe will allow the event histories to be left truncated and right censored, so an individual may enter the study at a time t > 0 (in the study time scale), and the observation of its event history may stop before death (or another terminal event).

We will also allow intermittent missingness, so that there may be intervals where events are not observed (cf. the diarrhoea data), and define the left-continous observation process:

,obs

1 observed "just before" time ( )

0 not observed "just before" time i

tY t

t

Page 25: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

25

Let Gt- denote all information that would have been available at t- on covariates, events and the observation process in the hypothetical situation where both the complete event histories and the observation process is observed.

(event in [ , ) | )

(event in [ , ) | )t

t

P t t dt G

P t t dt H

Assume that the observation process is independent in the sense that:

Thus the likelihood of an event occurring for an individual is not influenced by whether the individual is observed or not.

Page 26: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

26

Denote by Ft- all information actually available to the researcher at t- (on covariates, events, entries, exits, etc).

Then for individual i:

Modelling the observable data

,obs

,obs

,obs

,obs

(observe event in [ , ) | )

E{E[ ( ) {event in [ , )} | ]| }

E{ ( ) (event in [ , ) | )| }

E{ ( ) (event in [ , ) | )| }

E{ ( ) ( ) | }

t

i t t

i t t

i t t

i i t

P t t dt F

Y t I t t dt G F

Y t P t t dt G F

Y t P t t dt H F

Y t t dt F

Page 27: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

27

,obs ( ) ( ) ( ) ( )i i i iY t t Y t t

Let 1 at risk "just before" time

( )0 otherwisei

tY t

Then

Assume that both Yi(t) and are predictable processes, i.e. their values at time t are known from the information at hand "just before" time t (apart from unknown parameters).

(where "at risk" means that the event may happen and that the individual is under observation).

(observe event in [ , ) | )= ( ) ( )t i iP t t dt F Y t t

Then for individual i :

( )i t

Page 28: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

28

t

Ni(t)

0

1

2

Let Ni(t) count the observed occurrences of the event of interest for individual i as a function of (study) time t

Counting process formulation

The intensity process of Ni(t) is given by

where dNi(t) is the increment of Ni over [t,t+dt)

( ) ( ( ) 1| )i i tt dt P dN t F ( ) ( )i iY t t

( )i t

Page 29: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

29

( ) ( ( ) 1| )i i tt dt P dN t F ( ( ) | ) i tE dN t F

Cumulative intensity process:

0

0

( ) ( )

( ) ( )

t

i i

t

i i

t s ds

Y s s ds

dNi(t) is a binary variable (taking the values 0 or 1)

Thus

Page 30: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

30

Introduce Mi(t) = Ni(t) –

Thus Mi(t) is a martingale

( ( ) | )i tE dM t F

( ( ) | ) ( )i t iE dN t F t dt

( ( ) ( ) | )i i tE dN t t dt F

( ) ( ) i it dt t dt 0

( ) ( )) ( i i it dMdtdN t t At each time t we have the decomposition

observation signal noise

( )i t

Note: All martingales we will consider take the value 0 at t = 0, and hence have expected value 0 for all t

Page 31: 1 Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 1: Event history data and counting processes

31

Aggregated counting process:

1

( ) ( )n

ii

N t N t

Corresponding intensity process and martingale (by linearity of expectation):

1

( ) ( )n

ii

t t

1

( ) ( )n

ii

M t M t

( ) ( )( ) t d ddN t tt M

Decomposition: