a workshop introducing doubly robust estimation of treatment effects michele jonsson funk, phd...
TRANSCRIPT
A workshop introducing doubly robust estimation of treatment effects
Michele Jonsson Funk, PhDUNC/GSK Center for Excellence in PharmacoepidemiologyUniversity of North Carolina at Chapel Hill
2
Conflict of Interest Statement
Macro development funded by the Agency for Healthcare Research and Quality via a supplemental award to the UNC CERTs (U18 HS10397-07S1)
Additional support from the UNC/GSK Center for Excellence in Pharmacoepidemiology and Public Health.
No potential conflicts of interest with respect to this work.
3
Regression models assume that… The parametric form is correct.
Should we use logistic regression, or log- binomial?
We have included correct predictors. Should we really include age in this model?
Those predictors have been specified correctly. Should age be coded continuously or in 10 year
categories? Is there an interaction with race? What about higher order terms? Etc…
4
What if the model is wrong?
Lunceford & Davidian, Stat Med, 2004 Omit a true confounder (extreme example) True relationships known (simulated data) Vary associations between
Risk factor – outcomeConfounder – exposure
5
-35-28
-21-18 -15-11
-40
-30
-20
-10
0
Str Mod None
ML outcome regression: false model%
bias
Lunceford & Davidian, Stat Med, 2004
Risk factor – outcome assn
6
-1 -1 -1
-1 -1 -1
-40
-30
-20
-10
0
Str Mod None
Doubly robust (DR) estimator: false model for outcome regression
%bi
as
Risk factor – outcome assn
Lunceford & Davidian, Stat Med, 2004
7
94.894.7 95 96.4
80
85
90
95
100
Str Mod None
ML outcome regression: true modelC
I C
over
age
Risk factor – outcome assn
Lunceford & Davidian, Stat Med, 2004
8
95.9 94.5 94.995.6 94.3 95.6
80
85
90
95
100
Str Mod None
DR: true models for propensity score & outcome regression
CI C
over
age
Risk factor – outcome assn
Lunceford & Davidian, Stat Med, 2004
9
0 0 00 0 080
85
90
95
100
Str Mod None
ML outcome regression: false modelC
I Cov
erag
e
Risk factor – outcome assn
Lunceford & Davidian, Stat Med, 2004
10
95.7 95 94.495.2 93.2 93.9
80
85
90
95
100
Str Mod None
CI C
over
age
DR: true model for propensity score & false model for outcome regression
Risk factor – outcome assn
Lunceford & Davidian, Stat Med, 2004
11
Doubly robust (DR) estimation from 30,000 feet
Robins & colleagues recognized the doubly robust property in mid-90’s
Combines standardization (or reweighting) with regression
Part of the family of methods that includes propensity scores and inverse probability weighting
12
Conceptual description
Doubly robust (DR) estimation uses two models: Propensity score model for the confounder - exposure
(or treatment) relationship Outcome regression model for the confounder –
outcome relationship, under each exposure condition These two stages can use:
different subsets of covariates, and different parametric forms.
If either model is correct, then the DR estimate of treatment effect is unbiased.
13
Two stages
Risk factors (potential confounders)
Exposure (Treatment)
Outcome
Prop
ensi
ty
Scor
e M
odel
(1) O
utcome
Regression (2)
14
Causal effect of interest
Comparing counterfactual scenarios E(Y1): Whole population treated (exposed) vs. E(Y0): Whole population untreated (unexposed)
Average causal effect of treatment E(Y1) – E(Y0) : difference E(Y1) / E(Y0) : ratio
In non-randomizes studies, the unexposed may not fairly reflect what would have happened to the exposed had they been unexposed (confounding)
15
Doubly robust estimator Y: outcome Z: binary treatment (exposure) X: baseline covariates (confounders plus other prognostic factors) e(X,β): model for the true propensity score m0(X,α0) and m1(X,α1): regression models for true relationship between covariates
and the outcome within each strata of treatment
Causal effect of interest (deltaDR): difference in mean response if everyone in the population received treatment versus everyone receiving no treatment; E(Y1)-E(Y0).
ΔDR = E(Y1) - E(Y0)
Adapted from Davidian M, DR Presentation, 2007
16
Doubly robust estimatorE(Y1): average popn response
with treatment / exposure
Adapted from Davidian M, DR Presentation, 2007
17
Average population response with treatment (μ1,DR)
IPTW Estimator “Augmentation”
Adapted from Davidian M, DR Presentation, 2007
18
True PS model; false regression model (I)
Propensity score model
Regression model
Adapted from Davidian M, DR Presentation, 2007
19
True PS model; false regression model (II)
!
Assuming nounmeasuredconfounders
Adapted from Davidian M, DR Presentation, 2007
20
False PS model; true regression model (I)
Propensity score model
Regression model
Adapted from Davidian M, DR Presentation, 2007
21
False PS model; true regression model (II)
!
Assuming nounmeasuredconfounders
Adapted from Davidian M, DR Presentation, 2007
22
ΔDR = [E(Y1) + junk] - [E(Y0) + junk]
Overly simplified statistics
Where junk = 0 if either the propensity score or the regression model is true…
ΔDR = E(Y1) - E(Y0)
Adapted from Davidian M, DR Presentation, 2007
23
Standard errors
Option 1: Sandwich estimator
Option 2: Bootstrap
Adapted from Davidian M, DR Presentation, 2007
24
Simulation findings
Bang & Robins 2005 N=500, 1000 iterations False propensity score model
1 of 4 true predictors of tx1 ‘noise’ variable, independent of tx
False outcome regression modelOmit one risk factor, an higher order term and
an interaction term
25
Bias under false models
Analysis
Method
True Model(s)
False Model
PS OR Both
PS -0.01 0.86
OR 0.00 -1.56
DR 0.00 0.00 -0.09 0.92
H Bang & JM Robins, Biometrics (2005).
26
Variance under false models
Analysis True Model
False Model
PS OR Both
PS 0.21 0.15
OR 0.07 0.07
DR 0.09 0.08 0.28 0.15
H Bang & JM Robins, Biometrics (2005).
27
Recapping L&D simulations
Compare performance of propensity score analysis, IPW, outcome regression (OR) and DR
Omit a true confounder (extreme example) True relationships known (simulated data) Vary associations between
Risk factor – outcome Confounder – exposure
Vary sample size
28
If all models are true…
Bias <3% for all methods except for PS analysis using strata (due to residual
confounding) Variance similar in general
VarOR < VarDR (slightly) if confounder-exp relationship is strong
VarDR < VarIPW If OR model is right, most efficient. But we have
no way of knowing whether or not it’s right.
Lunceford & Davidian, Stat Med, 2004
29
If outcome regression model is false…
Biaso DR always <1%; OR biased by 10-20% in most scenarios
Efficiencyo DR nearly as efficient as correct model except when conf-exp
relationship strongo DR always more efficient than IPW
Confidence interval coverageo DR coverage nominalo ML coverage poor
Adding risk factors to PS model improves precision If both are nearly right (only a little wrong), bias is small
Lunceford & Davidian, Stat Med, 2004
30
Discussion
If method offers some protection against model misspecification, why isn’t it being used by pharmacoepidemiologists?
31
SAS macro for DR estimation
ObjectivesFacilitate wider use of DR estimation Improve performance by implementing
sandwich estimator for SEsEnhance usability by following SAS
conventionsProvide user with relevant diagnostic details
32
SAS macro for doubly robust estimation including documentation
Dataset for sample analyses (1.7MB, optional)
http://www.harryguess.unc.edu
33
Running the DR macro
By design, the DR macro uses common SAS® syntax for specifying the source dataset, variables for modeling, and additional options:
%dr(%str(options data=SAS-data-set descending;
wtmodel exposure = x y z / method=dr dist=bin showcurves;
model outcome = x y z / dist=n ; ) );
34
Running the DR macro
%dr(%str(options data=SAS-data-set descending;
wtmodel exposure = x y z / method=dr dist=bin showcurves;
model outcome = x y z / dist=n; ) );
35
%dr(%str(options data=SAS-data-set descending;
wtmodel exposure = x y z / method=dr dist=bin showcurves;
model outcome = x y z / dist=n; ) );
Running the DR macro
36
%dr(%str(options data=SAS-data-set descending;
wtmodel exposure = x y z / method=dr dist=bin showcurves;
model outcome = x y z / dist=n; ) );
Running the DR macro
37
DR macro: output
Propensity score (wtmodel) results
Descriptive statistics for weights
Graph of propensity score curves by exposure status
Reweighted regression model among the unexposed (dr0)
Reweighted regression model among the exposed (dr1)
Doubly robust estimate and standard error
38
DR macro: output
Obs totalobs usedobs dr0 dr1 deltadr se 1 100000 79292 .005546853 0.034117 0.028570 .002026204
n in dataset
n used in the analysis. usedobs<totalobs due to missing data or use of common support option
average response had all been unexposed, adjusted for risk factors
average response had all been exposed, adjusted for risk factors
dr1 – dr0; difference in mean response for continuous outcome; risk difference for dichotomous outcome
SE of deltaDR
39
Example analysis CVD Outcomes
Continuous: CVD score (i.e. LDL) Binary: acute MI
Exposure (treatment): statin use (yes/no) 50% of population exposed
10 covariates (5 continuous, 5 binary) Data are simulated, so true relationships among
exposure, covariates & outcome are known
40
Example analysis
%dr(%str(options data=final descending;
wtmodel statin=hs smk hxcvd black age bmi exer chol income / method=dr dist=bin showcurves common_support=.99;
model cvdscore=hs female smk hxcvd age age2 bmi bmi2 exer chol / dist=n; ));
41
Propensity scores from ‘showcurves’ option
0
0. 5
1. 0
1. 5
2. 0
2. 5
3. 0
3. 5
4. 0
Percent
0
- 0. 09 - 0. 01 0. 07 0. 15 0. 23 0. 31 0. 39 0. 47 0. 55 0. 63 0. 71 0. 79 0. 87 0. 95 1. 03
0
0. 5
1. 0
1. 5
2. 0
2. 5
3. 0
3. 5
4. 0
Percent
1
Est i mat ed Pr obabi l i t y
Une
xpos
edE
xpos
ed
Propensity Score
42
Results from sample analysis
Effect Estimates Result %bias SE
True -1.099
Crude 1.869 270.0% 0.089
Maximum likelihood -1.089 0.9% 0.023
Doubly robustPS model Outcome model
Correct Correct -1.102 -0.3% 0.025Correct Incorrect -1.117 -1.7% 0.089Incorrect Correct -1.093 0.5% 0.022Incorrect Incorrect 0.397 136.1% 0.049
43
Validation: simulation methods
Draw random sample (n) from simulated populationVary n from 100 to 5000
Estimate doubly robust effect of treatment and standard error
Repeat 1000 times
44
Continuous outcomen DR Estimate %bias SD (DR) mean SE SD(dr)/meanSE CI coverage
True RD = 0rmi1a 5000 0.01 0.8% 0.036 0.036 1.00 94.6
1000 0.01 0.9% 0.083 0.081 1.03 94.5500 0.00 0.2% 0.119 0.113 1.05 93.7100 0.02 2.1% 0.312 0.245 1.27 87.3
True RD = -0.41rmi2a 5000 -0.40 0.3% 0.035 0.036 0.98 95.4
1000 -0.40 0.9% 0.078 0.079 0.98 95.9500 -0.41 0.0% 0.118 0.113 1.04 94.4100 -0.41 -0.6% 0.299 0.246 1.21 91.2
True RD = -1.10rmi3a 5000 -1.10 -0.3% 0.035 0.035 1.01 94.6
1000 -1.10 -0.1% 0.081 0.078 1.03 94500 -1.10 -0.3% 0.113 0.111 1.01 94.6100 -1.09 0.9% 0.297 0.239 1.24 89.5
True RD = -1.61rmi4a 5000 -1.61 0.0% 0.034 0.035 0.98 95.6
1000 -1.61 -0.1% 0.082 0.078 1.05 93.6500 -1.61 -0.1% 0.114 0.109 1.05 93.1100 -1.62 -0.5% 0.318 0.246 1.30 87
45
Continuous outcomen DR Estimate %bias SD (DR) mean SE SD(dr)/meanSE CI coverage
True RD = 0rmi1a 5000 0.01 0.8% 0.036 0.036 1.00 94.6
1000 0.01 0.9% 0.083 0.081 1.03 94.5500 0.00 0.2% 0.119 0.113 1.05 93.7100 0.02 2.1% 0.312 0.245 1.27 87.3
True RD = -0.41rmi2a 5000 -0.40 0.3% 0.035 0.036 0.98 95.4
1000 -0.40 0.9% 0.078 0.079 0.98 95.9500 -0.41 0.0% 0.118 0.113 1.04 94.4100 -0.41 -0.6% 0.299 0.246 1.21 91.2
True RD = -1.10rmi3a 5000 -1.10 -0.3% 0.035 0.035 1.01 94.6
1000 -1.10 -0.1% 0.081 0.078 1.03 94500 -1.10 -0.3% 0.113 0.111 1.01 94.6100 -1.09 0.9% 0.297 0.239 1.24 89.5
True RD = -1.61rmi4a 5000 -1.61 0.0% 0.034 0.035 0.98 95.6
1000 -1.61 -0.1% 0.082 0.078 1.05 93.6500 -1.61 -0.1% 0.114 0.109 1.05 93.1100 -1.62 -0.5% 0.318 0.246 1.30 87
46
Continuous outcomen DR Estimate %bias SD (DR) mean SE SD(dr)/meanSE CI coverage
True RD = 0rmi1a 5000 0.01 0.8% 0.036 0.036 1.00 94.6
1000 0.01 0.9% 0.083 0.081 1.03 94.5500 0.00 0.2% 0.119 0.113 1.05 93.7100 0.02 2.1% 0.312 0.245 1.27 87.3
True RD = -0.41rmi2a 5000 -0.40 0.3% 0.035 0.036 0.98 95.4
1000 -0.40 0.9% 0.078 0.079 0.98 95.9500 -0.41 0.0% 0.118 0.113 1.04 94.4100 -0.41 -0.6% 0.299 0.246 1.21 91.2
True RD = -1.10rmi3a 5000 -1.10 -0.3% 0.035 0.035 1.01 94.6
1000 -1.10 -0.1% 0.081 0.078 1.03 94500 -1.10 -0.3% 0.113 0.111 1.01 94.6100 -1.09 0.9% 0.297 0.239 1.24 89.5
True RD = -1.61rmi4a 5000 -1.61 0.0% 0.034 0.035 0.98 95.6
1000 -1.61 -0.1% 0.082 0.078 1.05 93.6500 -1.61 -0.1% 0.114 0.109 1.05 93.1100 -1.62 -0.5% 0.318 0.246 1.30 87
47
Dichotomous outcomen DR Estimate %bias SD (DR) mean SE SD(dr)/meanSE CI coverage
True RD = 0.0000mi1 5000 -0.002 -0.2% 0.008 0.008 0.98 96.0
1000 -0.001 -0.1% 0.019 0.018 1.10 92.4500 0.000 0.0% 0.028 0.024 1.16 91.2100 0.001 0.1% 0.078 0.040 1.93 68.5
True RD = -0.0228mi2 5000 -0.025 -9.3% 0.009 0.008 1.03 93.7
1000 -0.024 -4.2% 0.020 0.018 1.08 92.9500 -0.023 -2.6% 0.030 0.025 1.18 90.8100 -0.016 31.4% 0.075 0.041 1.85 69.5
True RD = -0.0670mi3 5000 -0.068 -2.0% 0.008 0.008 1.01 94.6
1000 -0.069 -2.7% 0.019 0.018 1.05 93.0500 -0.069 -2.9% 0.027 0.025 1.06 92.6100 -0.061 8.8% 0.074 0.043 1.74 72.1
True RD = -0.0970mi4 5000 -0.098 -1.0% 0.009 0.008 1.04 93.6
1000 -0.099 -2.3% 0.018 0.018 1.00 94.0500 -0.097 0.2% 0.029 0.025 1.13 91.6100 -0.088 9.0% 0.074 0.043 1.70 73.7
48
Dichotomous outcomen DR Estimate %bias SD (DR) mean SE SD(dr)/meanSE CI coverage
True RD = 0.0000mi1 5000 -0.002 -0.2% 0.008 0.008 0.98 96.0
1000 -0.001 -0.1% 0.019 0.018 1.10 92.4500 0.000 0.0% 0.028 0.024 1.16 91.2100 0.001 0.1% 0.078 0.040 1.93 68.5
True RD = -0.0228mi2 5000 -0.025 -9.3% 0.009 0.008 1.03 93.7
1000 -0.024 -4.2% 0.020 0.018 1.08 92.9500 -0.023 -2.6% 0.030 0.025 1.18 90.8100 -0.016 31.4% 0.075 0.041 1.85 69.5
True RD = -0.0670mi3 5000 -0.068 -2.0% 0.008 0.008 1.01 94.6
1000 -0.069 -2.7% 0.019 0.018 1.05 93.0500 -0.069 -2.9% 0.027 0.025 1.06 92.6100 -0.061 8.8% 0.074 0.043 1.74 72.1
True RD = -0.0970mi4 5000 -0.098 -1.0% 0.009 0.008 1.04 93.6
1000 -0.099 -2.3% 0.018 0.018 1.00 94.0500 -0.097 0.2% 0.029 0.025 1.13 91.6100 -0.088 9.0% 0.074 0.043 1.70 73.7
49
Dichotomous outcomen DR Estimate %bias SD (DR) mean SE SD(dr)/meanSE CI coverage
True RD = 0.0000mi1 5000 -0.002 -0.2% 0.008 0.008 0.98 96.0
1000 -0.001 -0.1% 0.019 0.018 1.10 92.4500 0.000 0.0% 0.028 0.024 1.16 91.2100 0.001 0.1% 0.078 0.040 1.93 68.5
True RD = -0.0228mi2 5000 -0.025 -9.3% 0.009 0.008 1.03 93.7
1000 -0.024 -4.2% 0.020 0.018 1.08 92.9500 -0.023 -2.6% 0.030 0.025 1.18 90.8100 -0.016 31.4% 0.075 0.041 1.85 69.5
True RD = -0.0670mi3 5000 -0.068 -2.0% 0.008 0.008 1.01 94.6
1000 -0.069 -2.7% 0.019 0.018 1.05 93.0500 -0.069 -2.9% 0.027 0.025 1.06 92.6100 -0.061 8.8% 0.074 0.043 1.74 72.1
True RD = -0.0970mi4 5000 -0.098 -1.0% 0.009 0.008 1.04 93.6
1000 -0.099 -2.3% 0.018 0.018 1.00 94.0500 -0.097 0.2% 0.029 0.025 1.13 91.6100 -0.088 9.0% 0.074 0.043 1.70 73.7
50
Caveats SEs conservative when sample size is small;
bootstrapping may be used in this case to get more appropriate SEs
Macro only provides difference estimates (not RR or OR) for now
Exposure must be dichotomous; outcome must be continuous or dichotomous (time-to-event analysis not supported)
Some SAS conventions not recognized within the macro code where and class statements not recognized interaction terms and higher order polynomials must be created
in a prior data step
51
Practical considerations
How to choose which covariates to include?Good question.Based on simulations from PS literature
Include all risk factors for outcome May omit predictors of tx that do not affect
outcome
52
Practical considerations
Effect Estimates Result %bias SE
Crude 1.90 ? 0.089
Maximum likelihood -1.09 ? 0.023
Propensity score -1.50 ? 0.050
Doubly robust -1.12 ? 0.024
III. SAS Macro
What to do with estimates from various models that differ?
53
Practical considerations
What sort of diagnostics should be checked?Potentially influential obs with extreme PS
values ‘common_support’ option in SAS macro
Distribution of PS scores stratified by treatment / exposure group
‘showcurves’ option in SAS macro
54
Checking PS distribution
0 0.5 1
Strata 1 2 3 4 5 6
Propensity score
Tx=0
Tx=1
55
Checking PS distribution
0 0.5 1
Strata 1 2 3 4 5 6
Propensity score
Tx=0
Tx=1
56
Checking PS distribution
0 0.5 1
Strata 1 2 3 4 5 6
Propensity score
Tx=0
Tx=1
57
Limitations
DR estimation is not a panacea for unmeasured confounding. Recall- ‘junk’ only reduces to 0 with assumption of no
unmeasured confounders
One of the models must be correct for the estimator to be unbiased Bang & Robins suggest that it will be minimally biased if both
models are nearly right…
Standard errors tend to be slightly larger compared to a single correctly specified regression model
Explaining DR estimation in your methods section could be interesting…
58
Applications DR estimation potentially valuable for
comparative effectiveness studies, and in particular for head-to-head comparisons of treatment effectiveness or adverse events from observational data when RCTs can’t or won’t be done...
for ethical reasons, for economic reasons, for reasons of rare or late-effect outcomes, or for reasons of the need to conduct faster analyses of
possible sentinel events
59
Extensions
Missing data Incomplete follow-up in RCTs
Longitudinal marginal structural models Goodness of fit test?
60
Summary Observational studies of treatment effects depend on
statistical models to disentangle causal effects from confounding
We can never be certain that the statistical model we have chosen is correct
DR estimate unbiased if at least one of the two component models is right and therefore provides some protection against model misspecification
The ‘price’ of double robustness is slightly larger standard errors than a single correctly specified regression model
Assumption of no unmeasured confounders required
61
References Bang, H. & J.M. Robins: Doubly-robust estimation in missing data and causal inference
models. Biometrics 2005, 61, 962–973. Lunceford, J. K. and Davidian, M. (2004). Stratification and weighting via the propensity
score in estimation of causal treatment effects: A comparative study. Statistics in Medicine 23, 2937–2960.
Robins, J. M. (2000). Robust estimation in sequentially ignorable missing data and causal inference models. Proceedings of the American Statistical Association Section on Bayesian Statistical Science, 6–10.
Robins, J. M., Rotnitzky, A., and Zhao L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association 89, 846–866.
Rotnitzky, A., Robins, J. M., and Scharfstein, D. O. (1998). Semiparametric regression for repeated outcomes with nonignorable nonresponse. Journal of the American Statistical Association 93, 1321–1339.
Scharfstein, D. O., Rotnitzky, A., and Robins, J. M. (1999). Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association 94, 1096–1120 (with Rejoinder, 1135–1146).
Van der Laan, M. J. and Robins, J. M. (2003). Unified Methods for Censored Longitudinal Data and Causality. New York: Springer-Verlag.
62
Acknowledgements
Chris Wiesen, PhD, Odum Institute for Research in Social Science, University of North Carolina, Chapel Hill, NC
Daniel Westreich, MSPH, Department of Epidemiology, University of North Carolina, Chapel Hill, NC
Marie Davidian, PhD, Department of Statistics, North Carolina State University, Raleigh, NC
Collaborators on the development of the SAS macro:
63
Acknowledgements (II) Agency for Healthcare Research and Quality Supplemental
Award to the UNC CERTs (U18 HS10397-07S1) UNC/GSK Center for Excellence in Pharmacoepidemiology
and Public Health Kevin Anstrom, Lesley Curtis, Brad Hammill, and Rex Edwards
from the Duke CERTs team for valuable feedback on the alpha version.
Thanks to students from UNC’s EPID 369/730, a causal modeling course, for valuable feedback on the beta version.
Presented in memory of Harry Guess, MD, PhD, 1940-2006, who co-authored the initial proposal to develop a SAS macro for doubly robust estimation.
64
Contact Information
Michele Jonsson Funk, PhDResearch Assistant ProfessorDepartment of EpidemiologyUniversity of North CarolinaChapel Hill NC 27599-7521
[email protected] 919-966-8431 (ph)919-843-3120 (fax)
http://www.harryguess.unc.edu
65
Questions & Discussion