
Considerations Regarding Choice of the Primary Analysis in Longitudinal Trials With Dropouts: An FDA Perspective

Robert T. O’Neill, Ph.D.

Director, Office of Biostatistics

CDER, FDA

Presented at the FDA/Industry Workshop; Bethesda, Maryland; September 17-19, 2003

Disclaimer

The opinions expressed are mine and do not represent CDER policy on this issue. Ongoing research within CDER is directed toward more specific guidance which will supplement that contained in the ICH E9 Guidance “Statistical Principles for Clinical Trials”.

Outline

Issues about the dropout process in longitudinal clinical trials: focus on informative, treatment-related missingness

Terminology used to express missing data

The literature: problem formulation and approaches; different research approaches to the issues

Some general conclusions from this research

Concluding remarks on choosing a primary analysis and its pre-specification when the analysis may depend upon the data pattern

Issues

Regulatory setting places emphasis on confirmatory studies, pre-specification of objectives, hypotheses, analyses, documentation

How to specify in the protocol the primary strategy for dealing with missing data, if you presume that it will be informative and you have not observed the data yet

Considerable literature on the matter, unclear as to which approach to choose, when and why, and the considerable computational efforts involved

Documentation and reporting of the criteria, choice of approach (model), and certainty of the conclusions

Patient withdrawals from treatment exposure

Why do subjects stay in clinical trials, why do they withdraw from assigned therapy, when do they withdraw, and how do they differ from completers?

Efficacy (lack of or benefit from)

Safety (toxicity, tolerability, neither)

Both

Aggravation of the trial

When in time do they leave therapy in trial (short duration, long duration)

Symptomatic relief vs unperceived benefit

What is the shape of the response time pattern

Efficacy and Toxicity

If toxicity is time dependent, and efficacy is time dependent, withdrawal due to either side effects or perceived early efficacy censors the efficacy outcome, whose time dependency may follow a different pattern

Bivariate outcome

(Efficacy(t), Safety(t))

Four situations (E, S) at time T

(E, no S) at T: Patient perceives effective response AND no toxicity occurs

(E, S) at T: Patient perceives effective response AND experiences toxicity

may or may not leave depending upon tolerance and preferences

(no E, no S) at T: Patient perceives No effective response AND no toxicity occurs

(no E, S) at T: Patient perceives No effective response AND experiences toxicity

may or may not leave but is MORE likely to leave than patient above who perceives effective response

The problem formulation in the literature?

Which to choose as primary?

Likelihood based methods

mixed model repeated measures (MMRM)

Pattern-mixture models

Selection models

Ad hoc methods (LOCF, worst case imputation)

Single and multiple imputation approaches
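To make the two ad hoc methods named above concrete, here is a minimal sketch of LOCF and worst-case imputation for a single subject. The function names, the example scores and the worst-case value of 30.0 are illustrative assumptions, not taken from the talk.

```python
from typing import List, Optional


def locf(values: List[Optional[float]]) -> List[Optional[float]]:
    """Carry the last observed value forward over missing (None) visits."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # stays None until a first value is observed
    return filled


def worst_case(values: List[Optional[float]], worst: float) -> List[float]:
    """Replace every missing visit with a fixed worst-case score."""
    return [worst if v is None else v for v in values]


# Hypothetical subject: baseline plus 5 visits, dropped out after visit 2.
scores = [20.0, 18.0, 17.0, None, None, None]
locf_scores = locf(scores)
worst_scores = worst_case(scores, worst=30.0)
```

Both are single (fixed value) imputations: each missing visit receives one number and the analysis then treats it as if it had been observed.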

The terminology

Terminology for missing data

MCAR - missing completely at random

MAR - missing at random

MNAR - missing not at random

NIM - non-ignorable missingness

OC - observed cases

AAD - all available data

CC - complete cases

CAR - coarsened at random

LVCF (last value carried forward) - LOCF

DD - dependent dropouts

More terminology for missing data

Classification of drop-out mechanism

independent of data - IDA

dependent on observed data - ODA

dependent on missing data - MIDA

dependent on observed and missing data - OMI

dependent on observed data and covariates - ODACO
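The practical difference between these mechanisms can be illustrated with a small simulation. This is only a sketch under assumed settings: two visits, normal outcomes, and logistic dropout probabilities, all chosen for illustration. Reading IDA, ODA and MIDA as roughly MCAR, MAR and MNAR is likewise an assumed interpretation of the list above.

```python
import math
import random
from typing import Tuple


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def observed_mean_y2(mechanism: str, n: int = 20000, seed: int = 1) -> Tuple[float, float]:
    """Simulate two visits; return (full-data mean, observed-case mean) of y2."""
    rng = random.Random(seed)
    full, observed = [], []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)       # visit 1, always observed
        y2 = y1 + rng.gauss(0.0, 1.0)  # visit 2, may be missing
        if mechanism == "MCAR":        # dropout independent of the data
            p_drop = 0.3
        elif mechanism == "MAR":       # dropout depends on the observed y1
            p_drop = logistic(y1)
        elif mechanism == "MNAR":      # dropout depends on the unobserved y2
            p_drop = logistic(y2)
        else:
            raise ValueError(mechanism)
        full.append(y2)
        if rng.random() >= p_drop:     # subject stays and y2 is observed
            observed.append(y2)
    return sum(full) / len(full), sum(observed) / len(observed)


results = {m: observed_mean_y2(m) for m in ("MCAR", "MAR", "MNAR")}
```

In this setup the observed-case mean tracks the full-data mean only under MCAR. Under MAR the bias can in principle be removed by conditioning on y1; under MNAR it cannot be removed without an assumed model for the dropout process.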

MAD - Missing and Deferred (or Delayed): another new term

MAD - missing and deferred

at random

completely at random

completely uninterpretable

What is unique about missing data in clinical trials?

Monotonically missing data is potentially an outcome by itself

Why? It can be a surrogate for patient preference, acceptability of therapy, and can potentially be unpredictive of where the subject would be in the future (where no observations are taken or available)

With monotone missing data, the ‘dropout mechanism’ is very likely informative

Possible to plan to collect information during study prior to treatment withdrawal and prior to study completion but post treatment withdrawal (conditioning)
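Monotone missingness means that once a subject misses a visit, every later visit is missing as well. A minimal sketch of how one might check for this and record the dropout time is given below; the function names and the use of None for a missing visit are assumptions for illustration.

```python
from typing import Optional, Sequence


def is_monotone(row: Sequence[Optional[float]]) -> bool:
    """True if, once a visit is missing, every later visit is missing too."""
    seen_missing = False
    for v in row:
        if v is None:
            seen_missing = True
        elif seen_missing:
            return False  # observed value after a gap: intermittent missingness
    return True


def last_observed_visit(row: Sequence[Optional[float]]) -> int:
    """Index of the last observed visit, or -1 if nothing was observed."""
    idx = -1
    for i, v in enumerate(row):
        if v is not None:
            idx = i
    return idx
```

For a monotone row, the value returned by last_observed_visit is the dropout time, which is itself the potential outcome the slide refers to.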

[Figure: Example of longitudinal efficacy response/score by visit (baseline, visits 1-5; higher is bad), Control vs Test; missing for toxicity/safety reasons not considered]

[Figure: Are slope and baseline predictive of how long a patient stays in trial? No treatment effect. Response by visit (baseline, visits 1-5; higher is bad), Control vs Test]

[Figure: Is baseline predictive of how long a subject stays in trial? Treatment relationship is less clear. Response by visit (baseline, visits 1-5; higher is bad), Test vs Control]

Maximum likelihood approaches

Factoring the likelihood

Information is in the conditioning

The observed data and the missing data

Which do you wish to condition on and why?

Shared parameter models not considered here, e.g. same parameter in observed and missing data models

Selection Model

Factors the joint distribution of the observed data Y and the missing data M into the marginal distribution of the observed response, times the conditional distribution of missingness given Y = y

Heitjan, Ignorability and Bias in Clinical Trials; Stat Med 18, 2421-2434 (1999)

Selection Model

When the missing data are NOT IGNORABLE, one has to specify an explicit model for the missing-data mechanism M to make an appropriate inference for theta
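The factorization described in words above can be written out as follows. This is the standard textbook form of a selection model rather than a formula recovered from the slide; the symbol \psi for the parameters of the missingness mechanism and the split of y into observed and missing parts are notational assumptions.

```latex
% Selection model: outcome model times missingness mechanism
f(y, m \mid \theta, \psi) = f(y \mid \theta)\, f(m \mid y, \psi),
\qquad y = (y_{\mathrm{obs}}, y_{\mathrm{mis}})

% The mechanism is ignorable for likelihood inference on \theta when
% missingness depends only on the observed part (MAR) and
% \theta and \psi are distinct parameters:
f(m \mid y_{\mathrm{obs}}, y_{\mathrm{mis}}, \psi) = f(m \mid y_{\mathrm{obs}}, \psi)
```

When the second condition fails, the factor f(m | y, \psi) cannot be dropped from the likelihood, which is why an explicit model for M is needed.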

Selection Model - what to do if non-ignorable missingness occurs

Specify an explicit model for the missing data mechanism

Wu and Carroll for slope analysis in longitudinal studies

Do sensitivity and robustness analysis, under different plausible models for the missing data mechanism

Pattern-mixture models

Factors the joint distribution of Y and M as the marginal distribution of M (the missing pattern) times the conditional distribution of observed data Y given M = m (the response given missing pattern)

The data are stratified by missing data patterns and inference is on the conditional model parameter

Pattern-mixture models

The first factor says that the data are stratified by missing data patterns and the inference is given to the conditional model parameter
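Written out, the pattern-mixture factorization takes the following standard form. The parameter symbols \pi (pattern probabilities) and \phi (within-pattern outcome parameters) are notational assumptions, not taken from the slide.

```latex
% Pattern-mixture model: pattern distribution times outcome given pattern
f(y, m \mid \phi, \pi) = f(m \mid \pi)\, f(y \mid m, \phi)

% A marginal (unconditional) mean is then a mixture over patterns:
E[Y] = \sum_{m} \Pr(M = m)\, E[Y \mid M = m]
```

The mixture form shows why the number of subjects in each pattern matters: each within-pattern mean has to be estimated from the subjects who share that pattern.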

Choice of missingness models

Pattern-mixture model: patterns of response, distribution of effects within patterns - need for a lot of data in each pattern - information in number of subjects in each pattern

Selection model: response and dropout distribution - very dependent upon assumptions of models

Shared parameter models

How much missingness (% of total N) can be tolerated?

Overall

Between treatment groups

Early, middle, later in the study

Reasonable imputation strategies, including LOCF

Step 1: Model Selection?

MMRM

Selection modeling - the second factor corresponds to the self selection of individuals into “observed” and “missing” groups.

Pattern-mixture models - a mixture of different populations, characterized by the observed pattern of missingness

e.g. 4 times measured; subjects with 1, 2, 3, 4 measurements form the four patterns

estimate treatment effects within patterns and then combine in some way
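One simple way to "combine in some way" is to weight each within-pattern treatment difference by the share of subjects in that pattern. The sketch below assumes each subject contributes a single summary outcome (for example a slope), which is an illustrative simplification: a real pattern-mixture analysis also needs identifying restrictions for the visits that dropouts never reached. The function name and data layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def pattern_mixture_effect(records):
    """records: iterable of (arm, n_measurements, outcome), arm in {"test", "control"}.

    Returns the pattern-weighted treatment difference:
    sum over patterns of weight * (mean_test - mean_control),
    where weight is the share of all subjects falling in that pattern.
    """
    by_pattern = defaultdict(lambda: {"test": [], "control": []})
    for arm, pattern, outcome in records:
        by_pattern[pattern][arm].append(outcome)
    n_total = sum(len(g["test"]) + len(g["control"]) for g in by_pattern.values())
    effect = 0.0
    for pattern, g in by_pattern.items():
        if not g["test"] or not g["control"]:
            raise ValueError(f"pattern {pattern} has no subjects in one arm")
        weight = (len(g["test"]) + len(g["control"])) / n_total
        effect += weight * (mean(g["test"]) - mean(g["control"]))
    return effect


# Hypothetical data: completers (4 measurements) and early dropouts (2 measurements).
records = [
    ("test", 4, -2.0), ("test", 4, -3.0), ("control", 4, -1.0), ("control", 4, -1.0),
    ("test", 2, 0.0), ("test", 2, 0.0), ("control", 2, 1.0), ("control", 2, 1.0),
]
effect = pattern_mixture_effect(records)
```

The error raised for a pattern with an empty arm reflects the point made earlier about needing enough data in each pattern.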

Step 2: Sensitivity Analysis - which strategy, when, why

Determining evidence for MNAR

Model correctness (selection, pattern-mixture)

Local influence of patients on power and detection ability

Documenting the strategies and what was done

Supporting the range of possible conclusions consistent with the data
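One way to show a range of conclusions consistent with the data is a delta-adjustment (tipping-point) analysis: dropouts in the test arm are imputed progressively worse until the apparent benefit disappears. This technique is not named in the talk; it is offered here only as a sketch of what such a sensitivity analysis could look like. The function names and data are hypothetical, and sampling variability is ignored.

```python
from statistics import mean


def delta_adjusted_difference(test, control, delta):
    """Each arm is a list of (last_observed_value, completed) pairs.

    Test-arm dropouts are imputed as last value + delta (higher is worse);
    control-arm dropouts keep their last value unchanged.
    Returns mean(test) - mean(control) on the imputed final scores.
    """
    test_final = [v if done else v + delta for v, done in test]
    control_final = [v for v, _ in control]
    return mean(test_final) - mean(control_final)


def tipping_point(test, control, deltas):
    """First delta at which the test arm no longer looks better (difference >= 0)."""
    for d in deltas:
        if delta_adjusted_difference(test, control, d) >= 0:
            return d
    return None


# Hypothetical final scores; False marks a subject who dropped out.
test = [(10.0, True), (12.0, True), (11.0, False), (9.0, False)]
control = [(13.0, True), (14.0, True), (12.0, False), (13.0, True)]
tip = tipping_point(test, control, deltas=range(0, 7))
```

Reporting the tipping point lets a reader judge whether the penalty needed to overturn the result is clinically plausible.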

Some selected literature comparing different strategies

Three papers on comparing performance of competing analysis strategies

Analysis of Longitudinal Clinical Trial Incomplete Data; O. Siddiqui and J. Hung

Compares impact of fixed value imputation (FVI) like LOCF with m.l. general linear models of the observed data

Uses a pattern-mixture model to make inference on the unconditional, hypothetical complete-data mean

Non zero values to have dropouts have outcomes that are worse

Miller, Morgan, Espeland, Emerson: Message

Variability of the measurements needs to be addressed

The direction of bias, by not accounting for it, is not predictable

Under several non-ignorable non-response scenarios, m.l. based analyses can yield equivalent hypothesis tests to those obtained when analyzing only the observed data.

Consideration jointly of:

2 arm trial, with change in spine deformity index (SDI) over a 4 year duration with measurements at each of 4 years

Linear progression of disease

Disease progression mechanisms (early, middle, late)

Dropout mechanisms

Considers each subject’s last observation is dependent on prior repeated measurement

14 separate methodological strategies for dealing at the analysis with missing data

Large black dot is unacceptable power reduction

Large black square is inflation of type 1 error above 7.5%

The adequacy of a strategy for dealing with missing values strongly depends on whether the courses of disease are similar or very different in the study groups. Therefore knowledge about the courses of longitudinal data is important besides information on dropout rates for planning an adequate ITT analysis.

If the information about the courses of disease is not available at the planning stage of a clinical trial, the ICH E9 guideline suggests correcting the strategy for dealing with missing values in a blind review stage before analysis of the trial starts: “Pre-definition of methods [of dealing with missing values] may be facilitated by updating this aspect of the analysis plan during the blind review.”

Thus, the blind review is a possibility to get an idea about the patterns of courses of the endpoint, thus making the choice of an adequate strategy easier. However, a blind review including the main endpoint might induce problems if obvious treatment effects show up at this stage, giving away treatment groups. For judging the adequacy of an approach for dealing with missing values, information about rates and times of dropouts as well as courses of disease must be provided in the publication of the results.

Messages

For drop-out rates less than 20% AND similar courses of disease in the treatment groups, missing values might be replaced by mean of other groups

For larger drop-out rates OR less similar courses of disease, no adequate recommendations can be given

Type 1 error increases drastically for the different strategies, especially if the courses of disease vary between treatment groups

There is NO strategy which is adequate for all different combinations of dropout mechanisms, drop-out rates, or less similar courses of disease, and no adequate recommendations can be given.

Presuming informative missingness - what to do?

Computational burden is the issue

Adjusting for Non-ignorable Drop-out Using Semiparametric Nonresponse Models

Scharfstein, Rotnitzky and Robins, JASA, V 94; 1096-1120 (1999)

See Commentaries pages 1121-1146

Comparison of estimation methods

Wei, L and Shih, WJ

Partial imputation approach to analysis of repeated measurements with dependent dropouts

Statist. Med. 2001; 20: 1197-1214

3 Conditions

1. The drop-out rates are the same in both treatment groups

2. Dropout process depends on the outcome variable in the same manner in both treatment groups

3. Common variances for the outcome variable in both treatment groups

The Wei and Shih approach is to control condition 1 (C1), so that the dropout rates become the same (or nearly the same) after partially imputing those needed to make the rates the same.

One can condition only on what was observed and measured - its effectiveness depends on what you know in advance to condition on

Dropout rates in each treatment group

Same or different

Time pattern same or different

How many identifiable cause-specific reasons for dropouts, and are they the same or different in each treatment group

Example: the four (E, S) combinations (effective response or not, toxicity or not)
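The quantities listed above (dropout rate per arm, timing of dropout, and reasons) are simple tabulations. A minimal sketch is given below; the record layout, the function name and the example reasons are assumptions for illustration.

```python
from collections import Counter


def dropout_summary(subjects, n_visits):
    """subjects: list of (arm, last_visit_attended, reason_or_None).

    A subject whose last attended visit is before n_visits counts as a dropout.
    Returns (rate per arm, counts by (arm, last visit), counts by (arm, reason)).
    """
    totals, dropped = Counter(), Counter()
    by_visit, by_reason = Counter(), Counter()
    for arm, last_visit, reason in subjects:
        totals[arm] += 1
        if last_visit < n_visits:
            dropped[arm] += 1
            by_visit[(arm, last_visit)] += 1
            by_reason[(arm, reason)] += 1
    rates = {arm: dropped[arm] / totals[arm] for arm in totals}
    return rates, dict(by_visit), dict(by_reason)


# Hypothetical 5-visit trial with four subjects per arm.
subjects = [
    ("test", 5, None), ("test", 2, "toxicity"),
    ("test", 5, None), ("test", 3, "toxicity"),
    ("control", 5, None), ("control", 4, "lack of efficacy"),
    ("control", 5, None), ("control", 5, None),
]
rates, by_visit, by_reason = dropout_summary(subjects, n_visits=5)
```

In this made-up example the arms differ in rate, timing and reason for dropout, which is the situation in which the three conditions above are least likely to hold.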

Two papers on joint analysis of dropout as a response and observed repeated data

Worst-rank score analysis with informatively missing observations

Follow-up measurement is missing for some subjects because a disease-related event occurred prior to the time of the follow-up observation

Example: Study of congestive heart failure, patients undergo exercise testing, but the measurement is missing for those who die of heart disease during the study

Measurements are informatively missing because mortality from heart disease and a decline in exercise BOTH indicate progression of the underlying disease
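A worst-rank analysis can be sketched as follows: subjects who die are ranked below every survivor, and the two arms are then compared on the resulting ranks. The sketch assumes the variant in which deaths are ordered by time of death (earlier is worse) and a measurement where larger is better, such as exercise time. The function names and data are hypothetical, and only the rank sum is computed, not a p-value.

```python
def worst_rank_keys(subjects):
    """subjects: list of (arm, death_time_or_None, measurement_or_None).

    Deaths get sort keys (0, death_time); survivors get (1, measurement).
    Sorted ascending, all deaths come before (are worse than) all survivors.
    """
    keys = []
    for arm, death_time, measurement in subjects:
        if death_time is not None:
            keys.append((arm, (0, death_time)))
        else:
            keys.append((arm, (1, measurement)))
    return keys


def midranks(values):
    """Ranks 1..n in ascending order, with average ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum(subjects, arm="test"):
    """Wilcoxon-type rank sum for one arm on the worst-rank scores."""
    keyed = worst_rank_keys(subjects)
    ranks = midranks([key for _, key in keyed])
    return sum(r for (a, _), r in zip(keyed, ranks) if a == arm)


# Hypothetical trial: exercise time in seconds, death time in months.
subjects = [
    ("test", None, 400.0), ("test", None, 350.0), ("test", 9.0, None),
    ("control", None, 300.0), ("control", 3.0, None), ("control", 6.0, None),
]
test_sum = rank_sum(subjects, "test")
control_sum = rank_sum(subjects, "control")
```

A higher rank sum here means better outcomes, so the death and the measurement contribute to the comparison in the same direction.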

Considers two separate populations, completers and dropouts and tests a joint hypothesis (binomial for dropouts) regarding outcomes both in the same direction for the test treatment group

Siddiqui and Hung

What is the null hypothesis?

No difference between treatments at all time points

No difference between treatments at the last time point
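The two candidate null hypotheses can be stated formally as below. The notation is an assumption introduced for illustration: \mu_T(t_j) and \mu_C(t_j) are the mean responses in the test and control groups at visit t_j, and t_J is the last scheduled visit.

```latex
% No difference at any time point
H_0^{\mathrm{all}}:\ \mu_T(t_j) = \mu_C(t_j) \quad \text{for all } j = 1, \dots, J

% No difference at the last time point only
H_0^{\mathrm{last}}:\ \mu_T(t_J) = \mu_C(t_J)
```

The first hypothesis implies the second, so a rejection of the second is also a rejection of the first; the two differ in how much the visits after dropout matter to the test.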

Concluding Remarks

In protocol planning, assume that monotonic missing data, if it occurs, is likely informative

Non-ignorable nonresponse occurs when the probability of response depends on the unobserved outcome. In this situation, assumptions regarding the missing data process, which often are not directly verifiable, typically are necessary to provide valid estimates and inference

Decide what data will be collected that will allow for conditioning on factors that matter to address bias adjustments

Concluding Remarks

Choose the primary strategy to be used, including approaches, criteria for model selection, model fits, sensitivity analysis, robustness, etc.

Consider the justification for model selection and the sequence of analytical steps to assure that assumptions of the pre-specified analysis are met conditional on what is observed

Consider, in advance, the joint distribution of efficacy outcome, the side effect outcomes and other response variables needed to satisfy the MAR requirement - and measure what you can for analysis purposes

In the absence of outcome data, the only way to make inference regarding the entire population available for randomization is to make assumptions regarding the distribution of missing data or the missing data process

Concluding Remarks

Reporting and documentation advice

Plots, graphs, model fits, comparisons

How to convince others that your model choice and primary strategy was the most appropriate for the conclusions drawn

extent of sensitivity analyses

If possible, collect data on all subjects until the trial is completed, even if withdrawn from trial
