Semi-Markov models under panel observation

Andrew Titman

Lancaster University

March 8, 2012

Overview

Multi-state modelling

Computational issues with semi-Markov models

Phase-type sojourn distributions

Phase-type approximations to parametric distributions

Application to data on post-lung-transplantation patients

Further extensions

Multi-state models

Generalisation of standard survival analysis

Model transition intensities between multiple states

Applications

Medical: e.g. chronic diseases, HIV, breast cancer screening, cognitive decline.
Financial: e.g. credit scoring models.
Social science/economics: e.g. employment status.

Multi-state models

Inference methods dependent on the observation scheme

Continuous observation up to right-censoring:

Natural generalisations of estimators from standard survival analysis available
Non-parametric estimation of baseline intensities commonly used.

Panel observation

State of individual only observed at discrete (irregularly spaced, patient specific) time points
Parametric estimation most common: Markov, time homogeneous (Kalbfleisch & Lawless, 1985).

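Example: Bronchiolitis obliterans syndrome in transplantation patients

[Diagram: three-state model with states 1 (BOS Free), 2 (BOS) and 3 (Death). Transition intensities: $q_{12}(t, \mathcal{F}_t)$ from BOS Free to BOS, $q_{21}(t, \mathcal{F}_t)$ from BOS to BOS Free, $q_{13}(t, \mathcal{F}_t)$ from BOS Free to Death and $q_{23}(t, \mathcal{F}_t)$ from BOS to Death.]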

Transition intensities

Multi-state models are typically parameterised via the transition intensities

$$q_{rs}(t, \mathcal{F}_t) = \lim_{\delta t \to 0} \frac{P(X(t + \delta t) = s \mid X(t) = r, \mathcal{F}_t)}{\delta t}$$

for process $X(t)$ with filtration (or history) $\mathcal{F}_t$.

Necessary to make some kind of assumptions

Homogeneous Markov: $q_{rs}(t, \mathcal{F}_t) = q_{rs}$
Markov: $q_{rs}(t, \mathcal{F}_t) = q_{rs}(t)$
Semi-Markov: $q_{rs}(t, \mathcal{F}_t) = q_{rs}(t, t^*)$, where $t^* < t$ is the time of entry into the current state.

Vast majority of work for panel observed data focusses on Markov cases.

Why consider a semi-Markov model?

Might be more realistic for particular applications

e.g. spells of a disease unlikely to be very short → exponential distribution not appropriate
e.g. people at less risk of disease the longer they have been disease free.

As a model diagnostic

Way of directly testing the Markov assumption
Likely to also pick up some frailty type effects

Likelihood for Markov model

The likelihood for a single individual observed in states $x_0, \ldots, x_{n_i}$ at time points $0 = t_0, t_1, \ldots, t_{n_i}$ is

$$\prod_{j=1}^{n_i} p_{x_{j-1} x_j}(t_{j-1}, t_j)$$

where $p_{rs}(t_1, t_2) = P(X(t_2) = s \mid X(t_1) = r)$.

$\mathbf{P}(t_1, t)$ relates to $\mathbf{Q}(t)$, the generator matrix of transition intensities, through the Kolmogorov forward equations (KFE)

$$\frac{d\mathbf{P}(t_1, t)}{dt} = \mathbf{P}(t_1, t)\,\mathbf{Q}(t), \qquad \mathbf{P}(t_1, t_1) = \mathbf{I}.$$

In the time homogeneous case $\mathbf{P}(t_1, t) = \exp((t - t_1)\mathbf{Q}_0)$, i.e. a matrix exponential.
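As an illustration of the product form above, here is a minimal Python sketch of the log-likelihood for one subject under a time homogeneous Markov model. The function name `panel_loglik` and the example generator are assumptions made for illustration, not taken from the talk; states are indexed from 0.

```python
import numpy as np
from scipy.linalg import expm


def panel_loglik(Q, states, times):
    """Log-likelihood of one subject's panel data under a homogeneous Markov model.

    Q      : generator matrix (rows sum to zero)
    states : observed states x_0, ..., x_n (0-based indices)
    times  : observation times t_0 < t_1 < ... < t_n
    """
    loglik = 0.0
    for j in range(1, len(states)):
        # P(t_{j-1}, t_j) = exp((t_j - t_{j-1}) Q) in the time homogeneous case
        P = expm((times[j] - times[j - 1]) * Q)
        loglik += np.log(P[states[j - 1], states[j]])
    return loglik


# Hypothetical three-state generator; the last state is absorbing
Q = np.array([[-0.30, 0.20, 0.10],
              [0.10, -0.35, 0.25],
              [0.00, 0.00, 0.00]])

ll = panel_loglik(Q, states=[0, 1, 1, 2], times=[0.0, 1.0, 2.5, 4.0])
print(ll)
```

Each factor needs only the elapsed time between consecutive observations, which is why irregular, patient specific observation times cause no difficulty in the Markov case.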

Likelihood for Semi-Markov model

No longer possible to factorise likelihood in terms of transition probabilities between pairs of events

$p_{rs}(t_1, t_2)$ now depends on time of entry into state r.

In general

$$P(X_1 = x_1, \ldots, X_n = x_n) = \sum_{H} \int_{S \mid H} L_h(s)\,ds$$

Sum over all possible paths, H, consistent with the observed history and for each history integrate over the possible sojourns in each state.

If no recovery possible then involves numerical quadrature (Foucher et al, 2010).

Current status data

Simplest possible interval censoring scenario is where the process is initiated in state 1 at time 0 and subjects are only observed once

Here likelihood can be expressed in terms of $p_r(t) = P(X(t) = r \mid X(0) = 1)$, which is the solution to a system of integral equations.

$$p_r(t) = \sum_{j \neq r} \int_0^t p_j(u)\, q_{jr}(t - u) \exp\{-Q_j(t - u)\}\,du + \delta_{1r} \exp\{-Q_1(t)\}$$

where $Q_j(t) = \sum_{k=1}^{R} \int_0^t q_{jk}(u)\,du$.

But more generally require nested equations because current time spent in each state is not known.

Computation for semi-Markov likelihood

Kang & Lagakos (2007) considered the direct integral equation approach, but with restrictions:

At least one state of the process has an exponential sojourn time - to allow partial factorisation.
Other states have a minimum sojourn time (guarantee time) - to limit the maximum number of jumps occurring between observations.

Some potential for simulation based approaches to the problem

e.g. Stopping-time resampling (Chen et al, 2005)

Phase type distribution

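Distribution of time to absorption of a time homogeneous Markov process

Matrix analytic representation

$$f(t) = \boldsymbol{\pi} \exp(t\mathbf{S})\,\mathbf{S}_0$$

$$S(t) = \boldsymbol{\pi} \exp(t\mathbf{S})\,\mathbf{1}$$

where $\boldsymbol{\pi}$ is the vector of initial state occupancy probabilities, $\mathbf{S}$ is the subgenerator matrix and $\mathbf{S}_0 = -\mathbf{S}\mathbf{1}$.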
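The matrix analytic formulas translate almost directly into code. The following is a small Python sketch; the function names `ph_density` and `ph_survival` and the two test distributions are assumptions chosen for illustration.

```python
import numpy as np
from scipy.linalg import expm


def ph_density(t, pi, S):
    """f(t) = pi exp(tS) S0, with S0 = -S 1."""
    S0 = -S @ np.ones(S.shape[0])
    return float(pi @ expm(t * S) @ S0)


def ph_survival(t, pi, S):
    """S(t) = pi exp(tS) 1."""
    return float(pi @ expm(t * S) @ np.ones(S.shape[0]))


# One phase with rate 2 is simply an exponential distribution with rate 2
pi1, S1 = np.array([1.0]), np.array([[-2.0]])
print(ph_density(0.5, pi1, S1), ph_survival(0.5, pi1, S1))

# Two phases in series, each with rate 1, give an Erlang(2, 1) distribution
pi2 = np.array([1.0, 0.0])
S2 = np.array([[-1.0, 1.0],
               [0.0, -1.0]])
print(ph_survival(1.5, pi2, S2))
```

The exponential and Erlang cases have closed forms, which makes them convenient checks before moving to less familiar subgenerators.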

Coxian phase-type distribution

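[Diagram: phases 1, 2, 3, ..., N in sequence with absorbing state N + 1. Phase i moves to phase i + 1 at rate $\xi_i$ (i = 1, ..., N − 1) and to the absorbing state N + 1 at rate $\mu_i$ (i = 1, ..., N).]

$\theta = (\mu_1, \mu_2, \ldots, \mu_N, \xi_1, \xi_2, \ldots, \xi_{N-1})$.

$\boldsymbol{\pi} = (1, 0, \ldots, 0)$.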
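A sketch of how the Coxian parameter vector θ might be turned into a subgenerator matrix, assuming (as the diagram suggests) that µ_i is the rate from phase i to the absorbing state and ξ_i the rate from phase i to phase i + 1. The function name `coxian_subgenerator` and the example rates are hypothetical.

```python
import numpy as np


def coxian_subgenerator(mu, xi):
    """Subgenerator S of an N-phase Coxian distribution.

    mu : absorption rates mu_1..mu_N (phase i -> absorbing state N + 1)
    xi : progression rates xi_1..xi_{N-1} (phase i -> phase i + 1)
    """
    mu = np.asarray(mu, dtype=float)
    xi = np.asarray(xi, dtype=float)
    N = len(mu)
    if len(xi) != N - 1:
        raise ValueError("need N - 1 progression rates for N phases")
    total_exit = mu + np.append(xi, 0.0)      # last phase has no progression
    # Exit rates on the diagonal, progression rates on the superdiagonal
    return np.diag(-total_exit) + np.diag(xi, k=1)


S = coxian_subgenerator(mu=[0.5, 1.0, 2.0], xi=[0.3, 0.4])
print(S)
print(-S @ np.ones(3))   # S0 = -S 1 recovers the absorption rates mu
```

With π = (1, 0, ..., 0) the process always starts in phase 1, so S alone determines the distribution.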

General idea

Phase-type distributions offer a very flexible class of waiting-time distributions.

If the sojourn times of the semi-Markov model are restricted to have phase-type distributions, then the likelihood remains tractable

Can be represented as an aggregated Markov model.
Hidden Markov model likelihood methods apply.

Likelihood

If each state, r, of the process has an N-phase phase-type sojourn distribution, can define sub-states $r_1, \ldots, r_N$ representing the phases of the phase-type distribution.

The latent process, $X^*$, of sub-states is Markov.

Observed process then has a hidden Markov model representation, e.g.

$$P(X_1, X_2, X_3) = \sum_{i,j,k} P(X_1^* = 1_i, X_2^* = 2_j, X_3^* = 3_k)$$

$$= \sum_{i,j,k} P(X_1^* = 1_i)\, P(X_2^* = 2_j \mid X_1^* = 1_i)\, P(X_3^* = 3_k \mid X_2^* = 2_j)$$

Can recursively evaluate summation by using the Forward algorithm.
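The recursion can be sketched in a few lines of Python. This is a minimal illustration assuming no misclassification and a time homogeneous latent generator; the function name `forward_likelihood`, its arguments and the example rates are hypothetical, and states are indexed from 0.

```python
import numpy as np
from scipy.linalg import expm


def forward_likelihood(Q_latent, obs_of, init, states, times):
    """Likelihood of a panel-observed path for an aggregated Markov model.

    Q_latent : generator of the latent sub-state process X*
    obs_of   : obs_of[k] is the observed state that latent sub-state k belongs to
    init     : distribution of X* at the first observation time
    states   : observed states, times : matching observation times
    """
    obs_of = np.asarray(obs_of)
    init = np.asarray(init, dtype=float)
    alpha = init * (obs_of == states[0])             # keep sub-states matching x_0
    for j in range(1, len(states)):
        P = expm((times[j] - times[j - 1]) * Q_latent)
        alpha = (alpha @ P) * (obs_of == states[j])  # propagate, then restrict
    return alpha.sum()


# Observed state 0 has a 2-phase Coxian sojourn; observed state 1 is absorbing.
# Latent ordering: phase 1 of state 0, phase 2 of state 0, state 1.
Q_latent = np.array([[-0.7, 0.5, 0.2],
                     [0.0, -0.6, 0.6],
                     [0.0, 0.0, 0.0]])
lik = forward_likelihood(Q_latent, obs_of=[0, 0, 1], init=[1.0, 0.0, 0.0],
                         states=[0, 0, 1], times=[0.0, 1.0, 3.0])
print(lik)
```

The cost per observation is one matrix exponential of the latent generator, so the computation stays cheap as long as the number of latent sub-states is moderate.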

Advantages

Computation of likelihood relatively fast

Provided the number of latent states is not excessive.

Often, in addition to panel observation, can have misclassification of the state.

$P(O_t = s \mid X_t = r) = e_{rs}$, and assumed that $O_1, \ldots, O_n$ are independent conditional on $X_1, \ldots, X_n$.

Very natural extension to these models under the phase-type framework because already using a hidden Markov model likelihood.

Some scope to fit these models in existing software e.g. msm package in R.

Disadvantages

When using phase-type distributions with multiple phases, run into identifiability problems very quickly

Many parameters close to being redundant or become redundant
Difficulties even for right-censored data

Only feasible for very simple phase-type distributions in the panel data case.

If comparing with Markov model cannot perform standard likelihood ratio test

Non-standard conditions - some parameters of the phase-type model are unidentifiable under the null Markov model.

Not uncommon to get boundary estimates e.g. 0 hazard of death from one state.

2-Phase Coxian distribution

Simplest non-trivial phase-type distribution

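Defined by three parameters which roughly determine the initial intensity, terminal intensity and the rate of change between these levels.

[Diagram: phases 1 and 2 with absorbing state 3. Phase 1 moves to phase 2 at rate $\xi$ and to state 3 at rate $\mu_1$; phase 2 moves to state 3 at rate $\mu_2$.]

$\xi$ not identifiable if $\mu_1 = \mu_2$.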
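The identifiability point can be checked numerically. The Python sketch below evaluates the survivor function and hazard of a 2-phase Coxian through the matrix exponential; the function names and parameter values are assumptions for illustration. When µ1 = µ2 the absorption rate is the same in both phases, so the sojourn is exponential whatever the value of ξ.

```python
import numpy as np
from scipy.linalg import expm


def coxian2_survival(t, mu1, mu2, xi):
    """Survivor function of the 2-phase Coxian, S(t) = pi exp(tS) 1."""
    S = np.array([[-(mu1 + xi), xi],
                  [0.0, -mu2]])
    return float(np.array([1.0, 0.0]) @ expm(t * S) @ np.ones(2))


def coxian2_hazard(t, mu1, mu2, xi):
    """Hazard f(t) / S(t), with f(t) = pi exp(tS) S0 and S0 = (mu1, mu2)."""
    S = np.array([[-(mu1 + xi), xi],
                  [0.0, -mu2]])
    p = np.array([1.0, 0.0]) @ expm(t * S)   # phase occupancy at time t
    return float(p @ np.array([mu1, mu2]) / p.sum())


# The hazard starts at mu1 ...
print(coxian2_hazard(0.0, 0.3, 1.2, 0.8))

# ... and with mu1 = mu2 the survivor function is exp(-mu1 * t) for any xi
for xi in (0.1, 1.0, 10.0):
    print(xi, coxian2_survival(2.0, 0.5, 0.5, xi), np.exp(-1.0))
```

This is the reason a standard likelihood ratio test against the Markov model is not available: under the null µ1 = µ2 the likelihood carries no information about ξ.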

2-Phase Semi-Markov model for the BOS data

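[Diagram: latent states $1_1$, $1_2$ (phases of state 1), $2_1$, $2_2$ (phases of state 2) and absorbing state 3. Labelled transition intensities: $\xi_1$, $\xi_2$, $\mu^{(12)}_1$, $\mu^{(12)}_2$, $\mu^{(21)}_1$, $\mu^{(21)}_2$, $\mu^{(13)}_1$, $\mu^{(13)}_2$, $\mu^{(23)}_1$, $\mu^{(23)}_2$.]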

Alternative approach

In stochastic control theory, the use of phase-type approximations to parametric distributions is common.

e.g. in the analysis of queues.

However, typically analysing a process with known waiting distribution.

Principle could be applied to estimating semi-Markov models.

Join phase-type approximations for different states together.

Approximation of Weibull distribution

Weibull hazard function is monotonically increasing or decreasing

Good phase-type approximation can be obtained with relatively few phases.

Here consider 5-phase Coxian distribution with 9 parameters.

Seek S(θ) that minimizes the Kullback-Leibler distance

Don’t need to fit to tails of distribution.
e.g. if follow-up in study is 10 years, don’t need to fit distribution beyond 10 years.
Just need accurate amount of mass after upper point
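One way to read the truncated criterion is as a Kullback-Leibler distance in which everything beyond the cut-off is collapsed into a single atom of probability. The Python sketch below evaluates that objective for a given subgenerator; this reading, the midpoint-rule integration, the function names and the rate 1 Weibull parameterisation $S(t) = \exp\{-t^\alpha\}$ are all assumptions for illustration, and no optimisation over θ is attempted here.

```python
import numpy as np
from scipy.linalg import expm


def weibull_pdf(t, alpha, lam=1.0):
    return alpha * lam * (lam * t) ** (alpha - 1) * np.exp(-(lam * t) ** alpha)


def weibull_sf(t, alpha, lam=1.0):
    return np.exp(-(lam * t) ** alpha)


def truncated_kl(alpha, S, cutoff=2.0, n=400):
    """KL distance from a Weibull(alpha, rate 1) to the phase-type law with
    subgenerator S (started in phase 1), treating (cutoff, inf) as one atom."""
    m = S.shape[0]
    pi, one = np.eye(m)[0], np.ones(m)
    S0 = -S @ one
    h = cutoff / n
    mid = (np.arange(n) + 0.5) * h                 # midpoint rule avoids t = 0
    f_w = weibull_pdf(mid, alpha)
    f_ph = np.array([pi @ expm(t * S) @ S0 for t in mid])
    body = np.sum(f_w * np.log(f_w / f_ph)) * h    # contribution of [0, cutoff]
    tail_w = weibull_sf(cutoff, alpha)
    tail_ph = pi @ expm(cutoff * S) @ one
    return body + tail_w * np.log(tail_w / tail_ph)  # mass beyond the cut-off


# A single phase with rate 1 reproduces the alpha = 1 (exponential) case exactly
S_exp = np.array([[-1.0]])
print(truncated_kl(1.0, S_exp), truncated_kl(1.5, S_exp))
```

In practice this objective would be passed to a numerical optimiser over the Coxian parameters; only the evaluation step is shown.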

B-spline family fit

In order to fit the semi-Markov model will want phase-type fits for a large range of Weibull distributions.

Impractical to do a custom fit for every point.

Too time consuming
Resulting likelihood not smooth

In general seek θ(α) that minimizes

$$\int_{\alpha_l}^{\alpha_u} \mathrm{KL}(f_{\alpha,\lambda}, f_{S(\theta)})\,d\alpha \qquad (1)$$

Find B-spline approximations to the solution of (1)

$$\theta_i(\alpha) = \sum_j w_{ij} B_{ij}(\alpha)$$

for i = 1, . . . , 9.

Demonstration of fit: α = 1.2

[Figure: two panels, log[f(t)] and S(t), each plotted against t over 0 to 2.]

Demonstration of fit: α = 1.8

[Figure: two panels, log[f(t)] and S(t), each plotted against t over 0 to 2.]

Demonstration of fit: α = 0.5

[Figure: two panels, log[f(t)] and S(t), each plotted against t over 0 to 2.]

Demonstration of fit: α = 0.5

[Figure: two panels, log[f(t)] and S(t), each plotted against t over 0 to 5.]

Demonstration of fit: Kullback-Leibler distance

[Figure: Comparison of approximations. Kullback-Leibler distance (KL) plotted against α over 0.5 to 2.0 for the pointwise and B-spline fits.]

Phase-type approximation to Weibull semi-Markov process

Optimisation to establish approximation quite large

But only has to be performed once:

Can fit for Weibull rate λ = 1 for a given cut-off point, e.g. t = 2.
Taking λS(θ) then gives optimal estimate for rate λ for cut-off t = 2/λ.

Resulting likelihood is differentiable so standard approaches to maximum likelihood estimation applicable e.g. BFGS, BHHH or other quasi-Newton methods.
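The reason a single fit at λ = 1 suffices is that multiplying a subgenerator by λ only rescales time. A short Python check of this property follows; the 2-phase subgenerator is a hypothetical stand-in for a fitted approximation, and `ph_sf` is an illustrative helper name.

```python
import numpy as np
from scipy.linalg import expm


def ph_sf(t, S):
    """Survivor function of a phase-type law started in phase 1."""
    m = S.shape[0]
    return float(np.eye(m)[0] @ expm(t * S) @ np.ones(m))


# Hypothetical 2-phase Coxian subgenerator standing in for a rate 1 fit
S_unit = np.array([[-1.5, 1.0],
                   [0.0, -0.7]])
lam = 0.25

# Survivor function under lam * S_unit at time t equals that under S_unit at
# time lam * t, so a fit that is good on [0, 2] for rate 1 is equally good on
# [0, 2 / lam] for rate lam.
for t in (1.0, 4.0, 8.0):
    print(t, ph_sf(t, lam * S_unit), ph_sf(lam * t, S_unit))
```

The Weibull family with rate λ has the same scaling, which is why the fitted approximation and its target stay matched after rescaling.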

Embedded system

Each (non-absorbing) state of the semi-Markov process is made up of 5 sub-states

If there are multiple destinations from a state:

Overall intensity out of state taken to be $\alpha_r \lambda_r^{\alpha_r} t^{\alpha_r - 1}$

Individual intensity from r → s

$$\alpha_r \lambda_{rs} \{\lambda_r t\}^{\alpha_r - 1}$$

where $\lambda_r = \sum_{j \neq r} \lambda_{rj}$

NB: Not the same as having competing Weibull intensities with separate shape parameters.
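A quick Python check that the individual intensities add up to the overall Weibull intensity, and that each destination takes a fixed share λ_rs / λ_r of it at every time since entry. The function name and the example rates are hypothetical.

```python
def weibull_intensities(t, alpha_r, lam_rs):
    """Intensities out of state r at time t since entry into the state.

    lam_rs : dict mapping destination s -> lambda_rs
    Returns the overall intensity and a dict of destination-specific ones.
    """
    lam_r = sum(lam_rs.values())
    overall = alpha_r * lam_r ** alpha_r * t ** (alpha_r - 1)
    individual = {s: alpha_r * lam * (lam_r * t) ** (alpha_r - 1)
                  for s, lam in lam_rs.items()}
    return overall, individual


# Hypothetical rates out of state 1 to states 2 and 3, decreasing hazard
overall, individual = weibull_intensities(t=1.5, alpha_r=0.8,
                                          lam_rs={2: 0.3, 3: 0.1})
print(overall, individual, sum(individual.values()))
```

Because the shares do not change with t, all destinations share the single shape parameter α_r, which is the sense in which this differs from competing Weibull intensities with separate shapes.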

Quality of approximation to the likelihood

Difficult to assess because “exact” likelihood very difficult to compute for examples of interest.

In simulations estimates based on maximising the approximate likelihood are close to unbiased and have accurate standard errors.

For a simple two state ‘switching’ model where all subjects observed at equally spaced intervals and one sojourn distribution is exponential can use direct simulation to get likelihood curve.

Sufficient statistic is simple.

Quality of approximation: Simple 2 state example

[Figure: Comparison of likelihood curves. Log-likelihood l(α) plotted against α over 0.6 to 1.2.]

Example: Post-lung transplantation patients

Bronchiolitis obliterans syndrome

Deterioration in lung function over time

364 double-lung or heart-lung transplantation patients.

6 month survivors

‘Normal’ lung function determined in first 6 months

BOS state defined by % of normal lung function based on FEV1 measurements.

Subject to misclassification between BOS free & BOS states.

2654 assessments on lung function, 193 deaths.

Results for BOS

              Markov     2-PH Semi-Markov   Weibull Semi-Markov
−2 × LL       3005.06    2976.5             2979.7
Parameters    9          13                 11

Clear evidence against homogeneous Markov model.

Fit of 2-phase Coxian and Weibull semi-Markov models quite comparable.

Results for BOS

Semi-Markov models estimate decreasing hazard with time since entry into the state for both the BOS-free and BOS states.

Possible interpretations:

Patient heterogeneity: some patients have rapid declines.
Problem with model assumptions regarding state misclassification.
Partly accounts for time non-homogeneity with respect to time since transplant.

Comparison of overall survival estimates

[Figure: Estimated survival for heart-lung transplant patients. S(t) plotted against time since transplant (years) over 0 to 10 for the Markov, 2-PH and Weibull models.]

Comparison of conditional survival estimates

[Figure: Estimated conditional survival given a 5 year sojourn in state 2. P(Alive) plotted against time (years) over 0 to 10 for the Weibull and 2PH models, each with 95% CI.]

Further extensions

Covariates on intensities straightforward provided assume

$$q_{rs}(t; z) = \alpha_r \lambda_{rs} \exp(\beta_{rs} z) \Big\{ \sum_j \lambda_{rj} \exp(\beta_{rj} z)\, t \Big\}^{\alpha_r - 1}$$

Not a proportional intensities model.

Alternative competing Weibull intensities possible in principle

But requires a much larger number of latent states (e.g. $5^N$ for N competing events).

Pattern mixture representation also possible.

Further extensions

Non-homogeneous semi-Markov models are possible by applying existing methods for non-homogeneous HMMs

Piecewise constant intensities:

$$q_{rs}(t) = \begin{cases} q_{rs1} & t < t_u \\ q_{rs2} & t \geq t_u \end{cases}$$

‘Time transformation’ models:

$$\mathbf{Q}(t) = \mathbf{Q}_0\, g(t), \qquad g(t) > 0.$$

Intensities of the observed process then depend both on time since entry in the state and time since initiation (or calendar time).
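For the piecewise constant case the transition probability matrix over an interval is a product of matrix exponentials, one per piece. Below is a minimal Python sketch with a single change point; the function name `piecewise_P` and the generators are hypothetical, and in the semi-Markov setting Q1 and Q2 would be generators of the latent sub-state process.

```python
import numpy as np
from scipy.linalg import expm


def piecewise_P(Q1, Q2, t_u, t1, t2):
    """P(t1, t2) when the generator is Q1 before t_u and Q2 from t_u onwards."""
    before = max(0.0, min(t2, t_u) - t1)   # time spent in [t1, t2] before t_u
    after = max(0.0, t2 - max(t1, t_u))    # time spent in [t1, t2] after t_u
    return expm(before * Q1) @ expm(after * Q2)


Q1 = np.array([[-0.4, 0.3, 0.1],
               [0.2, -0.5, 0.3],
               [0.0, 0.0, 0.0]])
Q2 = 2.0 * Q1   # a time transformation Q(t) = Q0 g(t) with step function g

P03 = piecewise_P(Q1, Q2, t_u=2.0, t1=0.0, t2=3.0)
print(P03)
# For the step-function time transformation only the integral of g matters
print(expm(Q1 * (2.0 + 2.0 * 1.0)))
```

These matrices simply replace the single matrix exponential in the forward recursion, so the hidden Markov model machinery is otherwise unchanged.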

Conclusions

Models with phase-type sojourn distributions can be used to obtain tractable likelihoods for semi-Markov models under panel observation due to equivalence with a class of hidden Markov models.

Can use either directly:

Simple 2-phase Coxian distribution

Indirectly as approximations to other parametric survival distributions:

One-off optimisation to establish B-spline family of approximation to Weibull distributions
These approximations then embedded within overall system.

Conclusions

Enables a way of checking (homogeneous) Markov assumption.

But doesn’t imply semi-Markov model is the best model.

Non-homogeneous Markov
Frailty/random effects
State misclassification

May depend on the application which is most preferable.

2-Phase Coxian and Weibull models give very similar results

Very slight improvement in efficiency for Weibull estimates.

References

Chen, Y., Xie, J., Liu, J.S. (2005). Stopping-time resampling for sequential Monte Carlo methods. JRSS B. 67: 199-217.

Foucher, Y., Giral, M., Soulillou, J.P., Daures, J.P. (2010). A flexible semi-Markov model for interval-censored data and goodness-of-fit testing. Statistical Methods in Medical Research. 19: 127-145.

Kalbfleisch, J.D., Lawless, J.F. (1985). The analysis of panel data under a Markov assumption. JASA. 80: 863-871.

Kang, M., Lagakos, S.W. (2007). Statistical methods for panel data from a semi-Markov process, with application to HPV. Biostatistics. 8: 252-264.

Titman, A.C., Sharples, L.D. (2010). Semi-Markov models with phase-type sojourn distributions. Biometrics. 66: 742-752.
