Subclassification STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University

Upload: vanessa-williams

Post on 24-Dec-2015

Page 1: Subclassification STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University

Subclassification

STA 320 Design and Analysis of Causal Studies

Dr. Kari Lock Morgan and Dr. Fan Li

Department of Statistical Science

Duke University

Page 2

Quiz 2

> summary(Quiz2)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   11.0    14.0    15.0    15.5    17.0    20.0

Page 3

Quiz 2

• One-sided or two-sided p-value? (Depends on the question being asked.)

• Imputation: use observed control outcomes to impute missing treatment outcomes, and vice versa.
  o Class year: use observed outcomes from control sophomores to impute missing outcomes for treatment sophomores

• Biased or unbiased

Page 4

Covariate Balance

• In randomized experiments, the randomization creates covariate balance between treatment groups

• In observational studies, treatment groups will be naturally unbalanced regarding covariates

• Solution? Compare similar units

• (How? Propensity score methods.)

Page 5

Shadish Covariate Balance

[Figure: covariate balance plot for the Shadish data]

GOAL THIS WEEK: Try to fix this!

Page 6

Select Facts about Classical Randomized Experiments

Timing of treatment assignment clear

Design and Analysis separate by definition: design automatically “prospective,” without outcome data

Unconfoundedness, probabilisticness by definition

Assignment mechanism – and so propensity scores – known

Randomization of treatment assignment leads to expected balance on covariates

(“Expected Balance” means that the joint distribution of covariates is the same in the active treatment and control groups, on average)

Analysis defined by protocol rather than exploration

Slide by Cassandra Pattanayak

Page 7

Select Facts about Observational Studies

Timing of treatment assignment may not be specified

Separation between design and analysis may become obscured, if covariates and outcomes arrive in one data set

Unconfoundedness, probabilisticness not guaranteed

Assignment mechanism – and therefore propensity scores – unknown

Lack of randomization of treatment assignment leads to imbalances on covariates

Analysis often exploratory rather than defined by protocol

Slide by Cassandra Pattanayak

Page 16

Best Practices for Observational Studies

1. Determine timing of treatment assignment relative to measured variables

2. Hide outcome data until design phase is complete

3. Identify key covariates likely related to outcomes and/or treatment assignment. If key covariates not observed or very noisy, usually better to give up and find a better data source.

4. Remove units not similar to any units in opposite treatment group

5. Estimate propensity scores, as a way to…

6. Find subgroups (subclasses or pairs) in which the active treatment and control groups are balanced on covariates (not always possible; inferences limited to subgroups where balance is achieved)

7. Analyze according to pre-specified protocol

Design Observational Study to Approximate Hypothetical, Parallel Randomized Experiment

Slide by Cassandra Pattanayak

Page 17

Propensity Scores

ps = predict(ps.model, type="response")
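The slide does not show how ps.model is fit. A minimal sketch, assuming a logistic regression of the treatment indicator W on the covariates; the data frame dat and the covariates x1 and x2 are simulated stand-ins, not the course data:

```r
# Simulated stand-in data; dat, x1 and x2 are hypothetical names
set.seed(320)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$W <- rbinom(n, 1, plogis(0.5 * dat$x1 - 0.5 * dat$x2))

# Assumed form of ps.model: logistic regression of treatment on covariates
ps.model <- glm(W ~ x1 + x2, family = binomial, data = dat)

# type = "response" returns fitted probabilities rather than log-odds
ps <- predict(ps.model, type = "response")
summary(ps)
```

Without type = "response", predict() on a glm returns values on the link (log-odds) scale, which are not propensity scores.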

Page 18

Trimming

• Eliminate cases without comparable units in the opposite group

• One option: set boundaries on the allowable propensity score and eliminate units with propensity scores close to 0 or 1

• Another option: eliminate all controls with propensity scores below the lowest treated unit, and eliminate all treated units with propensity scores above the highest control

Page 19

Trimming

[Figure: propensity scores by treatment group. Annotation: no comparable treated units, so eliminate these control units]

ps = predict(ps.model, type="response")

Page 20

Trimming

> overlap(ps, data$W)   # these units should be eliminated
[1] "8 controls below any treated"
[1] "5 treated above any controls"

> data = data[ps >= min(ps[data$W==1]) & ps <= max(ps[data$W==0]),]
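The overlap() helper is a course function whose source is not shown. A rough sketch of what such a check might compute, using made-up propensity scores; count_nonoverlap and keep_overlap are hypothetical names:

```r
# Hypothetical stand-in for the course's overlap() helper
count_nonoverlap <- function(ps, W) {
  c(controls_below = sum(W == 0 & ps < min(ps[W == 1])),
    treated_above  = sum(W == 1 & ps > max(ps[W == 0])))
}

# Logical index of units inside the region of common support
keep_overlap <- function(ps, W) {
  ps >= min(ps[W == 1]) & ps <= max(ps[W == 0])
}

# Made-up propensity scores: five controls followed by five treated
ps <- c(0.05, 0.10, 0.30, 0.50, 0.70, 0.20, 0.40, 0.60, 0.90, 0.95)
W  <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1)

count_nonoverlap(ps, W)   # 2 controls below any treated, 2 treated above any control
keep <- keep_overlap(ps, W)
sum(keep)                 # 6 units remain
```

The keep index is the same condition as the subsetting line on the slide, written as a function so it can be reused after each refit.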

Page 21

Estimating Propensity Scores

• In practice, estimating the propensity score is an iterative process:

1. Estimate propensity score

2. Eliminate units with no overlap (units with no comparable units in the other group)

3. Go back and refit the model; repeat until the propensity scores overlap everywhere for both groups
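The estimate, trim, refit cycle can be sketched as a loop. This is an illustration on simulated data under an assumed logistic model (dat, x1 and x2 are hypothetical), not the course's own code:

```r
# Simulated stand-in data; dat, x1 and x2 are hypothetical names
set.seed(320)
n <- 300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$W <- rbinom(n, 1, plogis(0.8 * dat$x1 - 0.5 * dat$x2))

repeat {
  # 1. Estimate propensity scores on the current data
  ps.model <- glm(W ~ x1 + x2, family = binomial, data = dat)
  ps <- predict(ps.model, type = "response")

  # 2. Flag units inside the overlap region
  keep <- ps >= min(ps[dat$W == 1]) & ps <= max(ps[dat$W == 0])

  # 3. Stop once nothing needs trimming; otherwise trim and refit
  if (all(keep)) break
  dat <- dat[keep, ]
}

nrow(dat)   # units remaining after trimming
```

Each pass either stops or removes at least one unit, so the loop ends. A real analysis would also check that both groups keep enough units before refitting.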

Page 22

New Propensity Scores

trim non-overlap…refit propensity score model…

Page 23

New Propensity Scores

Page 24

After Trimming: Love Plot

• Original n = 210; after trimming n = 187

[Figure: love plot; legend: before trimming, after trimming. The closer to 0, the better! (0 = perfect balance)]

Page 25

Love Plots

• The previous plot is called a love plot (Thomas Love): used to visualize improvement in balance

• Original statistics are t-statistics

• Statistics after balancing use the same denominators (SE) as the original t-statistics
  o Otherwise closer to 0 could be from increased SE
  o No longer t-statistics, just for comparison
  o Only numerators (difference in means) change
  o If smaller, must be better balance
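A small sketch of the two statistics a love plot compares for one covariate. It assumes the unpooled (Welch) standard error for the original t-statistic, which the slides do not specify. The data are simulated and the trimming rule is invented purely for illustration:

```r
set.seed(1)
n <- 200
x <- rnorm(n)                   # one simulated covariate
W <- rbinom(n, 1, plogis(x))    # treatment depends on x, so x is unbalanced

diff.means <- function(x, W) mean(x[W == 1]) - mean(x[W == 0])

# Original t-statistic: difference in means over its standard error
se.orig  <- sqrt(var(x[W == 1]) / sum(W == 1) + var(x[W == 0]) / sum(W == 0))
t.before <- diff.means(x, W) / se.orig

# After balancing (here a hypothetical trimming rule), only the numerator is
# recomputed; the denominator stays fixed at the original SE
keep <- abs(x) < 1.5
stat.after <- diff.means(x[keep], W[keep]) / se.orig

c(before = t.before, after = stat.after)
```

Holding se.orig fixed means a smaller statistic can only come from a smaller difference in means, not from a larger standard error in the reduced sample.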

Page 26

Trimming

• Trimming can improve covariate balance, improving internal validity (better causal effect estimates for remaining units)

• But hurts external validity (generalizability)

• Changes the estimand: estimate the causal effect for those units that are comparable

• How many units to trim is a tradeoff between decreasing sample size and better comparisons; Ch 16 gives optimal threshold

Page 27

Subclasses

• If we have enough covariates (unconfounded), within subclasses of people with identical covariates, observational studies look like randomized experiments

• Idea: subclassify people based on similar covariate values, and estimate treatment effect within each subclass

• (similar to stratified experiments)

Page 28

One Key Covariate

Smoking, Cochran (1968)

Population: Male smokers in U.S.
Active treatment: Cigar/pipe smoking
Control treatment: Cigarette smoking
Outcome: Death in a given year
Decision-Maker: Individual male smoker
Reason for smoking male to choose cigarettes versus cigar/pipe?
Age is a key covariate for selection of smoking type for males

Slide by Cassandra Pattanayak

Page 29

Subclassification to Balance Age

To achieve balance on age, compare:

- “young” cigar/pipe smokers with “young” cigarette smokers
- “old” cigar/pipe smokers with “old” cigarette smokers

Better: young, middle-aged, old, or more age subclasses

Objective of study design, without access to outcome data: approximate a completely randomized experiment within each subclass

Only after finalizing design, reveal outcome data

Rubin DB. The Design Versus the Analysis of Observational Studies for Causal Effects: Parallels with the Design of Randomized Trials. Statistics in Medicine, 2007.

Slide by Cassandra Pattanayak

Page 31

Comparison of Mortality Rates for Two Smoking Treatments in U.S.

                                        Cigarette Smokers   Cigar/Pipe Smokers
Mortality rate per 1000 person-years    13.5                17.4

Averaging over age subclasses:
  2 age subclasses                      16.4                14.9
  3 age subclasses                      17.7                14.2
  11 age subclasses                     21.2                13.7

Cochran WG. The Effectiveness of Adjustment by Subclassification in Removing Bias in Observational Studies. Biometrics 1968; 24: 295-313.

Slide by Cassandra Pattanayak

Page 32

What if we had 20 covariates, with 4 levels each?

Over a million million subclasses
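For scale, the number of distinct covariate cells with 20 covariates at 4 levels each works out to:

```latex
4^{20} = 2^{40} = 1{,}099{,}511{,}627{,}776 \approx 1.1 \times 10^{12}
```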

Slide by Cassandra Pattanayak

Page 33

Solution?

• How can we balance many covariates?

BALANCE THE PROPENSITY SCORE!

Page 34

Propensity Score

• Amazing fact: balancing on just the propensity score balances ALL COVARIATES included in the propensity score model!!!

Page 35

Toy Example

• One covariate, X, which takes levels A, B, C

            X = A   X = B   X = C
Treatment   90      2       5
Control     10      8       20
e(x)        0.9     0.2     0.2

• Within circled subclass (X = B and X = C, where e(x) = 0.2), are treatment and control balanced with regard to X?

• Yes! Each has 2/7 B and 5/7 C
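The toy example can be checked directly from the counts. A short sketch (object names are arbitrary):

```r
# Counts from the toy example: rows are treatment groups, columns are levels of X
counts <- rbind(Treatment = c(A = 90, B = 2, C = 5),
                Control   = c(A = 10, B = 8, C = 20))

# Propensity score at each level of X: share of units that are treated
e <- counts["Treatment", ] / colSums(counts)
e

# Subclass of levels sharing e(x) = 0.2, i.e. X = B and X = C
sub <- counts[, abs(e - 0.2) < 1e-8]

# Distribution of X within each treatment group, inside the subclass
prop.table(sub, margin = 1)
```

Both rows of the last table come out as 2/7 for B and 5/7 for C, which is the balance claimed on the slide.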

Page 36

Hypothetical Example

Population: 2000 patients whose medical information was reported to government database

Units: Patients

Active Treatment: New surgery (1000 patients)

Control Treatment: Old surgery (1000 patients)

Outcome: Survival at 3 years

Remove outcomes from data set

Slide by Cassandra Pattanayak

Page 37

Reasonable to assume propensity score = 0.5 for all?

Age Range   Total Number   Number New Surgery   Number Old Surgery   Estimated Probability New Surgery, given Age
0-19        137            94                   43                   94/137 = 0.69
20-39       455            276                  179                  276/455 = 0.61
40-59       790            393                  397                  393/790 = 0.50
60-79       479            193                  286                  193/479 = 0.40
80-99       118            31                   87                   31/118 = 0.26

Slide by Cassandra Pattanayak
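The last column of the age table is simply the share of each age group that received the new surgery, so it can be recomputed from the counts. A small sketch (the data frame and its column names are made up for illustration):

```r
# Counts by age range, taken from the age table
age <- data.frame(
  range = c("0-19", "20-39", "40-59", "60-79", "80-99"),
  new   = c(94, 276, 393, 193, 31),
  old   = c(43, 179, 397, 286, 87)
)

# Total per age range and estimated probability of new surgery given age
age$total <- age$new + age$old
age$p.new <- round(age$new / age$total, 2)
age
```

Because the estimated probabilities vary with age, a constant propensity score of 0.5 is not a reasonable assumption for these data.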

Page 38

Does propensity score depend on age only?

Cholesterol Range   Total Number   Number New Surgery   Number Old Surgery   Estimated Probability New Surgery, given Cholesterol
0-199               175            155                  20                   155/175 = 0.89
200-249             475            354                  121                  354/475 = 0.75
250-299             704            343                  361                  343/704 = 0.49
300-349             464            130                  334                  130/464 = 0.28
350-400             162            16                   146                  16/162 = 0.10

Slide by Cassandra Pattanayak

Page 39

              Age
Cholesterol   0-19           20-39            40-59            60-79           80-99
0-199         11/11 = 1.00   32/38 = 0.84     32/49 = 0.65     17/29 = 0.59    2/7 = 0.29
200-249       57/61 = 0.93   100/119 = 0.84   75/141 = 0.53    40/103 = 0.39   4/25 = 0.16
250-299       48/57 = 0.84   145/191 = 0.76   148/293 = 0.51   43/177 = 0.24   7/67 = 0.10
300-349       28/33 = 0.85   63/98 = 0.64     72/172 = 0.42    28/125 = 0.22   2/46 = 0.04
350-400       9/10 = 0.90    8/22 = 0.36      11/43 = 0.26     2/28 = 0.07     1/13 = 0.08

Proportion of units assigned to active treatment rather than control treatment

Slide by Cassandra Pattanayak

Page 41

              Age
Cholesterol   0-19           20-39            40-59   60-79   80-99
0-199         11/11 = 1.00   32/38 = 0.84
200-249       57/61 = 0.93   100/119 = 0.84
250-299       48/57 = 0.84   145/191 = 0.76
300-349       28/33 = 0.85
350-400       9/10 = 0.90

Subclassifying on estimated propensity score leads to active treatment and control groups, within each subclass, that have similar covariate distributions

Slide by Cassandra Pattanayak

Page 42

              Age
Cholesterol   0-19   20-39   40-59   60-79   80-99
0-199         11     32
200-249       57     100
250-299       48     145
300-349       28
350-400       9

Number of active treatment units, subclass 1

Slide by Cassandra Pattanayak

Page 43

              Age
Cholesterol   0-19            20-39            40-59   60-79   80-99
0-199         11/430 = 0.03   32/430 = 0.07
200-249       57/430 = 0.13   100/430 = 0.23
250-299       48/430 = 0.11   145/430 = 0.34
300-349       28/430 = 0.07
350-400       9/430 = 0.02

Covariate distribution among active treatment units, subclass 1

Slide by Cassandra Pattanayak

Page 45

              Age
Cholesterol   0-19   20-39   40-59   60-79   80-99
0-199         0      6
200-249       4      19
250-299       9      46
300-349       5
350-400       1

Number of control treatment units, subclass 1

Slide by Cassandra Pattanayak

Page 46

              Age
Cholesterol   0-19          20-39          40-59   60-79   80-99
0-199         0/90 = 0.00   6/90 = 0.07
200-249       4/90 = 0.04   19/90 = 0.21
250-299       9/90 = 0.10   46/90 = 0.51
300-349       5/90 = 0.06
350-400       1/90 = 0.01

Covariate distribution among control treatment units, subclass 1

Slide by Cassandra Pattanayak

Page 47

              Age
Cholesterol   0-19            20-39
0-199         11/430 = 0.03   32/430 = 0.07
200-249       57/430 = 0.13   100/430 = 0.23
250-299       48/430 = 0.11   145/430 = 0.34
300-349       28/430 = 0.07
350-400       9/430 = 0.02

Covariate distribution among active treatment units, subclass 1

              Age
Cholesterol   0-19          20-39
0-199         0/90 = 0.00   6/90 = 0.07
200-249       4/90 = 0.04   19/90 = 0.21
250-299       9/90 = 0.10   46/90 = 0.51
300-349       5/90 = 0.06
350-400       1/90 = 0.01

Covariate distribution among control treatment units, subclass 1

Slide by Cassandra Pattanayak
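The two covariate distributions for subclass 1 follow from the cell counts of active treatment and control units. A short sketch that recomputes them (the cell labels and object names are chosen here for illustration):

```r
# Cells in subclass 1, labeled cholesterol range / age range
cells <- c("0-199/0-19", "0-199/20-39", "200-249/0-19", "200-249/20-39",
           "250-299/0-19", "250-299/20-39", "300-349/0-19", "350-400/0-19")

# Unit counts per cell in subclass 1
treated <- c(11, 32, 57, 100, 48, 145, 28, 9)
control <- c(0, 6, 4, 19, 9, 46, 5, 1)

# Share of each group's units falling in each cell
dist <- data.frame(cell = cells,
                   treated = round(treated / sum(treated), 2),
                   control = round(control / sum(control), 2))
dist
```

The two columns are similar but not identical; the largest gap is in the 250-299 / 20-39 cell (0.34 versus 0.51), a reminder that balance within a subclass holds only approximately.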

Page 48

Stratified randomized experiment:
- Create strata based on covariates
- Assign different propensity score to each stratum
- Units with similar covariates are in same stratum and have same propensity scores

Observational study:
- Estimate propensity scores based on covariates
- Create subclasses based on estimated propensity scores
- Units within each subclass have similar propensity scores and, on average, similar covariates

Works if we have all the important covariates, i.e., if assignment mechanism unconfounded given observed covariates

Slide by Cassandra Pattanayak

Page 49

Subclassification

• Divide units into subclasses within which the propensity scores are relatively similar

• Estimate causal effects within each subclass

• Average these estimates across subclasses (weighted by subclass size)

• (analyze as a stratified experiment)

Page 50

Estimate within Subclass

• If propensity scores constant enough within subclass, often a simple difference in observed means is adequate as an estimate

• If covariate differences between treatment groups persist, even within subclasses, regression or model-based imputation may be used

Page 51

How many subclasses?

• It depends! (covariate balance, n, etc.)

• More subclasses: propensity scores will be closer to the same within each subclass

• Fewer subclasses: sample sizes will be larger within each subclass, so estimates will be less variable

• Larger sample size can support more subclasses

Page 52

How many subclasses?

• Start with 5 equally sized subclasses (usually 5-10 are sufficient)

• Check…
  o Propensity score balance within subclasses
  o Number of treated and controls within subclasses
  o Overall covariate balance

• If balance needs improving and subclasses have enough treated and controls, try more subclasses

Page 53

Subclass Breaks

• Starting with 5 equally sized subclasses

• Subclass breaks would be at the 20th, 40th, 60th, and 80th percentiles of the propensity score

• Subclasses do not have to be equally sized; that’s just a convenient starting point

Page 54

Shadish Data

> subclass.breaks = quantile(ps, c(.2, .4, .6, .8))
> subclass = subclasses(ps, subclass.breaks)
> plot.ps(ps, W, subclass.breaks)
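subclasses() and plot.ps() are course helpers whose source is not shown. One way the subclass assignment could be done in base R, sketched on simulated propensity scores, uses findInterval() as a stand-in for subclasses():

```r
# Simulated propensity scores standing in for the Shadish estimates
set.seed(320)
ps <- runif(100, 0.05, 0.95)

# Breaks at the 20th, 40th, 60th, and 80th percentiles
subclass.breaks <- quantile(ps, c(.2, .4, .6, .8))

# findInterval() counts how many breaks each score has passed (0 to 4);
# adding 1 gives subclass labels 1 to 5
subclass <- findInterval(ps, subclass.breaks) + 1

table(subclass)
```

With 100 distinct scores, each subclass holds 20 units. The split into treated and controls within each subclass still has to be checked separately, for example with table(W, subclass).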

Page 55

Shadish Data

> table(W, subclass)
   subclass
W    1  2  3  4  5
  0 19 18 12  7  7
  1 19 19 25 30 31

Page 56

Love Plot

> cov.balance(X, W)
> cov.balance(X, W, subclass)

[Figure: love plot; legend: Raw, Subclassified]

Page 57

Outcomes

• Once we are happy with the covariate balance, we can analyze the outcomes

• (Note: once you look at the outcomes, there is no turning back, so make sure you are happy with balance first!)

Page 58

Inference: Estimate

• Analyze as a stratified experiment

• General (where j indexes subclasses):

• One common option:
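Written out, a size-weighted subclassification estimator consistent with the surrounding slides would take the following form. The notation is an assumption rather than taken from the slides: N(j) is the number of units in subclass j, N the total number of units, and the bars denote observed outcome means among treated (T) and control (C) units in subclass j.

```latex
\hat{\tau} = \sum_{j=1}^{J} \frac{N(j)}{N}\, \hat{\tau}(j),
\qquad
\hat{\tau}(j) = \bar{Y}_T^{\mathrm{obs}}(j) - \bar{Y}_C^{\mathrm{obs}}(j)
```

The first expression is the general form, valid for any within-subclass estimate; the second, the simple difference in observed means, is presumably the common option the slide refers to.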

Page 59

Shadish Data: Outcomes!

> subclass.average(MathOutcome, W, subclass)
$`Difference in Means within Each Subclass`
[1] -1.263158 -4.681287 -4.016667 -6.633333 -3.658986

$`Weighted Average Difference in Means`
          [,1]
[1,] -4.033685

> subclass.average(VocabOutcome, W, subclass)
$`Difference in Means within Each Subclass`
[1] 9.000000 9.289474 7.856667 5.828571 8.626728

$`Weighted Average Difference in Means`
         [,1]
[1,] 8.127701

Taking the vocab training rather than the math training course causes a decrease of about 4 points on the math test and an increase of about 8 points on the vocab test, on average.
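The weighted averages can be reproduced from the printed within-subclass differences and the subclass sizes in the table(W, subclass) output. This sketch assumes each subclass is weighted by its share of all units; it does not use the course's subclass.average() helper:

```r
# Subclass sizes from table(W, subclass): controls plus treated in each subclass
n.control <- c(19, 18, 12, 7, 7)
n.treated <- c(19, 19, 25, 30, 31)
n.j <- n.control + n.treated

# Within-subclass differences in means as printed by subclass.average()
diff.math  <- c(-1.263158, -4.681287, -4.016667, -6.633333, -3.658986)
diff.vocab <- c( 9.000000,  9.289474,  7.856667,  5.828571,  8.626728)

# Weight each subclass estimate by its share of the units
w <- n.j / sum(n.j)
est.math  <- sum(w * diff.math)
est.vocab <- sum(w * diff.vocab)

round(c(math = est.math, vocab = est.vocab), 4)
```

The results agree with the printed values of -4.033685 and 8.127701 up to rounding of the inputs, which supports the subclass-size weighting.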

Page 60

Shadish Data

n = 445 undergraduate students, assigned by randomization to:

• Randomized Experiment (235); randomization assigns:
  o Vocab Training (116)
  o Math Training (119)

• Observational Study (210); students choose:
  o Vocab Training (131)
  o Math Training (79)

Shadish, W. R., Clark, M. H., Steiner, P. M. (2008). Can nonrandomized experiments yield accurate answers? A randomized experiment comparing random and nonrandom assignments. JASA. 103(484): 1334-1344.

Page 61

Shadish Data: Outcomes!

> subclass.average(MathOutcome, W, subclass)
$`Difference in Means within Each Subclass`
[1] -1.263158 -4.681287 -4.016667 -6.633333 -3.658986

$`Weighted Average Difference in Means`
          [,1]
[1,] -4.033685

Estimate from randomized experiment: -4.189

> subclass.average(VocabOutcome, W, subclass)
$`Difference in Means within Each Subclass`
[1] 9.000000 9.289474 7.856667 5.828571 8.626728

$`Weighted Average Difference in Means`
         [,1]
[1,] 8.127701

Estimate from randomized experiment: 8.114

Taking the vocab training rather than the math training course causes a decrease of about 4 points on the math test and an increase of about 8 points on the vocab test.

Page 62

Inference: Variance

• General (where j indexes subclasses):

• If using simple difference in means:
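For reference, the standard variance estimate for a stratified experiment has the form below. The notation is an assumption rather than taken from the slides: N(j) is the size of subclass j, N the total number of units, N_T(j) and N_C(j) the numbers of treated and control units in subclass j, and s_T^2(j), s_C^2(j) the sample variances of their observed outcomes.

```latex
\widehat{\operatorname{var}}(\hat{\tau})
  = \sum_{j=1}^{J} \left( \frac{N(j)}{N} \right)^{2}
    \widehat{\operatorname{var}}\big(\hat{\tau}(j)\big),
\qquad
\widehat{\operatorname{var}}\big(\hat{\tau}(j)\big)
  = \frac{s_T^2(j)}{N_T(j)} + \frac{s_C^2(j)}{N_C(j)}
```

The first expression is the general form; the second is the usual choice when the within-subclass estimate is a simple difference in means.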

Page 63

Shadish Data: Inference

> subclass.var(MathOutcome, W, subclass)
$`Variance within Subclasses`
[1] 1.859820 1.627395 1.617101 1.946617 4.144487

$`Variance of Estimate`
          [,1]
[1,] 0.2856831

$`SE of Estimate`
          [,1]
[1,] 0.5344934

> subclass.var(VocabOutcome, W, subclass)
$`Variance within Subclasses`
[1] 2.853668 3.198244 3.209063 7.074847 7.376476

$`Variance of Estimate`
          [,1]
[1,] 0.6017093

$`SE of Estimate`
          [,1]
[1,] 0.7756993

Page 64

Shadish Data: Math

• We are 95% confident that taking the math training course (as opposed to the vocab course) increases math scores by between about 3 and 5 points, on average. This is highly significant – taking the math course does improve your math test score.
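The interval quoted above can be approximated from the printed estimate and standard error. This sketch assumes a normal-based 95% interval; the slides do not say which reference distribution was used:

```r
# Printed values for the math outcome (difference is vocab course minus math course)
est <- -4.033685
se  <- 0.5344934

# Normal-based 95% confidence interval: estimate plus or minus about 1.96 SE
ci <- est + c(-1, 1) * qnorm(0.975) * se
round(ci, 2)
```

This gives roughly -5.08 to -2.99 for vocab versus math training. Flipping the sign gives the effect of the math course relative to the vocab course: an increase of about 3 to 5 points.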

Page 65

To Do

• Read Ch 16, 17

• Homework 4 (due Monday)

• Bring laptops to class on Wednesday