planning sample size for randomized evaluations ed... · sample size calculations for randomized...
TRANSCRIPT
![Page 1: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/1.jpg)
TRANSLATING RESEARCH INTO ACTION
Sample size calculations for randomized evaluations
Rebecca Thornton Assistant Professor of Economics
University of Michigan
povertyactionlab.org
Outline
1. Background: The basics
2. Getting more complicated: Clusters
3. How to do this in practice
Why care?
• Interviews are expensive and you have a budget
• You do not want to be disappointed that you didn’t have a large enough sample
• If you understand the basics of sample size, there are lots of things you can do to increase your power
• You are spending a lot of money and time on this evaluation
Today’s Question
• General question: how large does the sample need to be to credibly detect a given effect size (i.e., a certain effect of a program)?
• What does “credibly” mean here?
– It means we can be reasonably sure that the difference between the control and treatment group is due to the treatment and not just to chance
Sample Size
• Two important issues about sample size
1. A larger sample helps to ensure that the treatment and control groups are balanced (on observables and unobservables)
– Helps prevent a biased estimate
2. A larger sample makes it possible to detect a significant difference in outcomes between the treatment and control groups
– Helps to detect a significant estimate
To start (and to finish)
• Doing sample size calculations is a craft
• The values estimated depend on parameters whose values are unknown and will vary
– Power calculations involve some guesswork
– They vary across outcomes!
Basic set up
• At the end of an experiment, we compare the average outcome of interest in the treatment group with the average outcome of interest in the control group
• We are interested in the difference:
Mean (treatment) - Mean (control) = Effect (size)
• Example: We want to know the effect of giving out textbooks on test scores. You have the scores of treatment students (with books) and control students (without books)
Simple Example
[Figure: test scores (vertical axis from 60 to 90) for the No Books and Books groups]
Effect of the program
• Subtract the average of the Control from the average of the Treatment
• Run a regression of the outcome (Y) on an indicator (T) of being in the Treatment group:
Y = a + bT
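When T is a 0/1 indicator, the two routes listed here give the same number. A minimal Python sketch, assuming invented test scores (the data and the helper name `ols_slope_intercept` are illustrative, not from the slides):

```python
from statistics import mean

def ols_slope_intercept(t, y):
    """Least-squares fit of y = a + b*t; returns (a, b)."""
    t_bar, y_bar = mean(t), mean(y)
    numerator = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    denominator = sum((ti - t_bar) ** 2 for ti in t)
    b = numerator / denominator
    a = y_bar - b * t_bar
    return a, b

# Hypothetical scores, invented for illustration only.
control = [68, 72, 65, 75, 70]      # T = 0 (no books)
treatment = [78, 82, 77, 85, 78]    # T = 1 (books)

# Route 1: subtract the control average from the treatment average.
diff_in_means = mean(treatment) - mean(control)

# Route 2: regress Y on the treatment indicator T.
t = [0] * len(control) + [1] * len(treatment)
y = control + treatment
a, b = ols_slope_intercept(t, y)

print("difference in means:", diff_in_means)
print("regression intercept a:", a, "slope b:", b)
```

The intercept a is the control-group mean and the slope b is the difference in means, which is why the regression coefficient can be read as the program effect.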
Effect of the program
• Effect size: difference in means or, equivalently, the coefficient b in the regression
Y = a + bT, where b = effect size (slope of the line)
Y = 70 + 10*T
• Treatment effect = 10 points
– How confident am I that this difference is not just due to chance?
– * significant at the 10 percent level: if there were really no effect, a difference this large would arise less than 10 percent of the time
– ** 5 percent level
– *** 1 percent level
Back to the main questions…
• Is the estimate of b biased?
– Discussed in previous lectures
– Depends on the validity of the randomization and mitigation of other threats
• How precise is the estimate of b?
– Did this difference happen just by chance? How confident am I that there is a true effect of my program?
– Depends on the sample size, the variability of the outcome variable (Y), and the actual effect of the program
• Accuracy vs. Precision
Accuracy versus Precision
[Figure: diagram with axes labeled Accuracy and Precision]
Unbiased and sample size
[Figure: diagram with axes labeled Unbiased and Sample Size]
Estimation
• When we do survey research and estimate treatment effects…
– Randomization helps us to be accurate (unbiased)
– Sample size allows us to be precise (confident about our estimates)
• Both are independently important
– A larger sample alone may make the estimate precise, but not accurate
– Randomization without a large enough sample will allow us to estimate the unbiased effect (accuracy), but we might not be that confident about it
Scientific method
• Impact evaluation involves the scientific method
– 1) propose a hypothesis
– 2) design the experiment to test that hypothesis
• How do we test hypotheses?
– We start with a hypothesis (i.e., there will be an effect of the program)
– At the end of an experiment, we test our hypothesis
Hypothesis testing
• In criminal law, most institutions follow the rule: “innocent until proven guilty”
• The presumption is that the accused is innocent and the burden is on the prosecutor to show guilt
– The jury or judge starts with the “null hypothesis” that the accused person is innocent
– The prosecutor has a hypothesis that the accused person is guilty
Hypothesis testing
• In program evaluation, instead of “presumption of innocence,” the rule is: “presumption of insignificance”
• The “Null hypothesis” (H0) is that there was no (zero) impact of the program
• The burden of proof is on the evaluator to show a significant effect of the program
Hypothesis testing
• If our measurements show a difference between the treatment and control group we know:
– There is some difference between the treatment and the control…
– But, our presumption is that there is no impact of the program (our H0 is still true)
– It might be that the difference is solely due to chance (random sampling error)
• We need to use statistics to calculate how likely it is that a difference this large would arise from random chance alone
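One way to make "how likely is this difference under chance alone" concrete is a permutation test: shuffle the treatment labels many times and count how often the shuffled difference is at least as large as the observed one. The slides do not name a specific test, so this is a swapped-in illustration; the scores, the function name `permutation_p_value`, and the number of shuffles are all assumptions.

```python
import random
from statistics import fmean

def permutation_p_value(control, treatment, n_shuffles=5000, seed=0):
    """Share of random relabelings whose absolute difference in means
    is at least as large as the observed one (two-sided)."""
    rng = random.Random(seed)
    observed = abs(fmean(treatment) - fmean(control))
    pooled = list(control) + list(treatment)
    n_control = len(control)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the labels were assigned at random
        diff = abs(fmean(pooled[n_control:]) - fmean(pooled[:n_control]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    return hits / n_shuffles

# Hypothetical scores, invented for illustration only.
control = [68, 72, 65, 75, 70]
treatment = [78, 82, 77, 85, 78]

p_value = permutation_p_value(control, treatment)
print("share of shuffles at least as extreme:", p_value)
```

A small share means chance alone rarely produces a gap this large, which is the evidence used to reject H0.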
Extreme Example
• Let’s say the sample size is 2…
Less extreme: Is this difference due to random chance?
[Figure: outcomes plotted for the Control and Treatment groups]
Perhaps…
Is this difference due to random chance?
[Figure: outcomes plotted for the Control and Treatment groups]
Probably not….
Hypothesis testing: conclusions
• Using statistics, if we find that it is very unlikely (say less than a 5% probability) that the difference is solely due to chance:
– We “reject our null hypothesis”
– We may now say: “our program has a statistically significant impact”
• Are we now 100 percent certain there is an impact?
– No, we may be only 95% confident; and we accept that if we use this threshold, we may be wrong 5% of the time
Hypothesis testing: conclusions
• What if we can’t reject our null hypothesis?
– Does that mean we can be 100% certain there is no impact?
– No, it just didn’t meet the statistical threshold to conclude otherwise
Two possibilities
• Possibility #1: There is an impact
– Could detect it – have enough statistical power
– Could not detect it – do not have enough power
• Possibility #2: There is no impact
– Conclude there was no impact
– Conclude there was an impact
Hypothesis testing

| THE TRUTH | YOU CONCLUDE: Effective | YOU CONCLUDE: No Effect |
|---|---|---|
| Effective | | Type II Error |
| No Effect | Type I Error | |
Hypothesis testing

| THE TRUTH | YOU CONCLUDE: Effective | YOU CONCLUDE: No Effect |
|---|---|---|
| Effective | | Type II Error |
| No Effect | Type I Error (probability = sig level) | |

Significance level: the probability of a Type I error, set to a level that you are comfortable with. With a level of 5%, a program that truly has no effect will be wrongly declared effective only 5% of the time. For policy purposes, you want to be very confident in the answer you give, so the level is set fairly low.
Hypothesis testing

| THE TRUTH | YOU CONCLUDE: Effective | YOU CONCLUDE: No Effect |
|---|---|---|
| Effective | (probability = power) | Type II Error |
| No Effect | Type I Error | |

Power: how frequently we will detect effective programs. Power is one minus the probability of a Type II error, so Type II errors result from low power.
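Both error rates in the table can be checked by simulating many experiments. The sketch below is an illustration under stated assumptions, not the method used in the slides: outcomes are drawn from normal distributions, the standard deviation is treated as known (a z-test), and the function name `rejection_rate` and all parameter values are invented.

```python
import random
from statistics import NormalDist, fmean

def rejection_rate(true_effect, n_per_arm, sd=1.0, alpha=0.05, n_sims=2000, seed=1):
    """Fraction of simulated experiments in which a two-sided z-test
    rejects H0 (no effect). Assumes normal outcomes with known sd."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd * (2 / n_per_arm) ** 0.5  # standard error of the difference in means
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        treatment = [rng.gauss(true_effect, sd) for _ in range(n_per_arm)]
        z = (fmean(treatment) - fmean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims

# Truth: no effect. Rejections here are Type I errors.
type_i_rate = rejection_rate(true_effect=0.0, n_per_arm=50)
# Truth: effect of 0.5 standard deviations. Rejections here are correct detections.
power = rejection_rate(true_effect=0.5, n_per_arm=50)

print("share of false positives:", type_i_rate)
print("share of true effects detected:", power)
```

The first rate should sit near the chosen significance level; one minus the second rate is the Type II error rate.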
Power: main ingredients
1. Variance
– The more “noisy” it is to start with, the harder it is to measure effects
2. Effect Size to be detected
– The smaller the effect size we want to detect, the larger sample we need
3. Sample Size
– The more children we sample, the more likely we are to obtain the true difference
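The three ingredients can be seen moving power in a standard normal-approximation formula for comparing two means. This is a sketch under assumptions (equal group sizes, common standard deviation, two-sided 5% test); the function name `power_two_arm` and the numbers are illustrative only.

```python
from statistics import NormalDist

def power_two_arm(effect, sd, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test comparing two group means
    (normal approximation, equal group sizes, common sd)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    se = sd * (2 / n_per_arm) ** 0.5   # standard error of the difference in means
    shift = effect / se                # true effect measured in standard errors
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

# Hypothetical scenarios, varying one ingredient at a time.
base = power_two_arm(effect=5, sd=15, n_per_arm=100)
noisier = power_two_arm(effect=5, sd=30, n_per_arm=100)          # 1. more variance
smaller_effect = power_two_arm(effect=2.5, sd=15, n_per_arm=100)  # 2. smaller effect
bigger_sample = power_two_arm(effect=5, sd=15, n_per_arm=400)    # 3. larger sample

for label, value in [("base", base), ("noisier outcome", noisier),
                     ("smaller effect", smaller_effect), ("bigger sample", bigger_sample)]:
    print(f"{label}: power = {value:.2f}")
```

Halving the effect and doubling the standard deviation hurt power equally here, because only the ratio of effect to standard deviation enters the formula.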
Variance
[Figure: Low Standard Deviation. Frequency by value (33 to 89) for two groups, mean 50 and mean 60]
Less Precision
[Figure: Medium Standard Deviation. Frequency by value (33 to 89) for two groups, mean 50 and mean 60]
Even less precise
[Figure: High Standard Deviation. Frequency by value (33 to 89) for two groups, mean 50 and mean 60]
Variance
• Variance depends first on your outcome variable: which outcome you want to measure
• Must calculate separately for each outcome
• What can help increase power? Can “absorb” variance:
– using a baseline
– controlling for other variables
– Do a pilot and measure the outcome variables, field testing
Effect Size: 1 “standard deviation”
[Figure: control and treatment distributions (horizontal axis from -4 to 6), 1 standard deviation apart]
Effect Size: 3 standard deviations
[Figure: control and treatment distributions (horizontal axis from -4 to 6), 3 standard deviations apart]
The less overlap the better… (easier to detect a difference)
Effect Size
• What effect do you think that the program will have?
• What is the smallest effect that you would like to be able to detect with confidence?
“Choosing” an effect size
DO NOT USE: “Expected” effect size
• First start with the question: how big of an effect do I think the program will have?
– This is usually large… I like the program, why else implement?
– But if we overestimate the effect size, we overestimate the power that we will have, and our sample size may be too small
• Be conservative
– What is the smallest effect size that would justify implementing the program?
“Choosing” an effect size
• Different effect sizes for different outcome variables
• Also depends on how variable the outcome is
• How to standardize effect sizes across outcomes?
– Standardized effect size is the effect size divided by the standard deviation of the outcome
= (Mean (treatment) - Mean (control)) / SD
• Common standardized effect sizes
Standardized effect size

| An effect size of… | Is considered… | …and it means that… |
|---|---|---|
| 0.2 | Modest | The average member of the treatment group had a better outcome than the 58th percentile of the control group |
| 0.5 | Large | The average member of the treatment group had a better outcome than the 69th percentile of the control group |
| 0.8 | VERY Large | The average member of the treatment group had a better outcome than the 79th percentile of the control group |

Really? Common Danger: Picking an effect size that is too large! Calculate!
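The percentile column can be reproduced from the standard normal distribution. The sketch assumes outcomes are normally distributed with the same standard deviation in both groups, which the slide does not state; the function name `control_percentile` is illustrative.

```python
from statistics import NormalDist

def control_percentile(standardized_effect):
    """Percentile of the control distribution at which the average
    treatment-group member sits, assuming normal outcomes with equal sd."""
    return 100 * NormalDist().cdf(standardized_effect)

percentiles = {d: round(control_percentile(d)) for d in (0.2, 0.5, 0.8)}
for d, pct in percentiles.items():
    print(f"effect size {d}: percentile {pct} of the control group")
```

An effect of 0 would put the average treated member at the 50th percentile, so even the "VERY Large" row moves that person by less than 30 percentile points.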
We only observe a random sample of the students
[Figure: number of students by test score (0 to 100) for control and treatment, with the control and treatment means (μ) marked. Average difference: 6 points]
Sample size = 1
Say that we have a sample of 1 observation, that comes from the distribution of data…
[Figure: control and treatment test-score distributions (0 to 100) with their means (μ) marked, plus a curve labeled N=1 on a percentage axis]
Sample size = 4
[Figure: control and treatment test-score distributions (0 to 100) with their means (μ) marked, plus a curve labeled N=4 on a percentage axis]
Sample size = 9
[Figure: control and treatment test-score distributions (0 to 100) with their means (μ) marked, plus a curve labeled N=9 on a percentage axis]
Sample size = 100
[Figure: control and treatment test-score distributions (0 to 100) with their means (μ) marked, plus a curve labeled N=100 on a percentage axis]
Sample size = 6,000
[Figure: control and treatment test-score distributions (0 to 100) with their means (μ) marked, plus a curve labeled N=sqrt(6000) on a percentage axis]
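The sequence of sample sizes above illustrates that the average of a sample gets closer to the true mean as N grows: its standard error falls with the square root of N. A minimal sketch, assuming a hypothetical standard deviation of 15 test-score points (the slides do not give one):

```python
import math

def standard_error_of_mean(sd, n):
    """Standard deviation of a sample mean based on n independent draws."""
    return sd / math.sqrt(n)

SD = 15.0  # hypothetical spread of individual test scores, not from the slides

standard_errors = {n: standard_error_of_mean(SD, n) for n in (1, 4, 9, 100, 6000)}
for n, se in standard_errors.items():
    print(f"N = {n}: standard error of the mean = {se:.2f} points")
```

Because of the square root, going from N=1 to N=4 halves the standard error, while going from N=100 to N=6,000 shrinks it by a factor of less than eight.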
Power: What level do I want?
• What is a good level of power?
• A power of 80% tells us that, in 80% of the experiments of this sample size conducted in this population, if the null hypothesis is in fact false (i.e., there is a treatment effect), we will be able to reject it. In other words, 80% of the time we will be able to detect an effect.
• 20% of the time I will be disappointed
• Common power used: 80%, 90%
• But I don’t like to be disappointed 20% of the time
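Asking for 90% instead of 80% power has a price in sample size. The sketch below uses the usual normal-approximation formula for two equal-sized groups, n per group = 2 (z for the significance level + z for the power)² / d², where d is the standardized effect size. The function name `n_per_arm` and the effect size of 0.2 are assumptions for illustration.

```python
import math
from statistics import NormalDist

def n_per_arm(standardized_effect, power=0.80, alpha=0.05):
    """Approximate observations needed in EACH group to detect the given
    standardized effect with a two-sided test (normal approximation)."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    z_power = norm.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / standardized_effect ** 2)

n_80 = n_per_arm(0.2, power=0.80)
n_90 = n_per_arm(0.2, power=0.90)
print("per group for 80% power:", n_80)
print("per group for 90% power:", n_90)
```

Under these assumptions the move from 80% to 90% power raises the required sample by roughly a third.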
Outline
1. Background: The basics
2. Getting more complicated: Clusters
3. How to do this in practice
Individual vs. Group design
• Up to now, we have been assuming randomization at the individual level
• But often, we may want to randomize at a higher group level
– Village
– School
– District
• In that case, groups are randomized and individuals within each treatment or control group all get the same treatment
Reason for cluster randomization
• Minimize or remove contamination across individuals
– Example: Deworming, information campaigns
• More feasible
• Only natural choice
– Example: Any education intervention that affects an entire classroom (e.g. flipcharts, teacher training)
• Why not? Expense (linked with power)
Impact of Group-level randomization
• If the treatment is randomized at a group level you need more observations
• Why? The observations (i.e., individuals) are not independent of each other
– All villagers are exposed to the same weather
– All districts share a common history
– All students share a schoolmaster
• The more correlation between the outcomes within a group, the larger the sample you need
• A value called ρ (rho) measures this
Values of ρ (rho)
• Like percentages, ρ must be between 0 and 1
• Higher values mean that outcomes within your clusters are more correlated (bad for power); a lower ρ is more desirable
• It is sometimes low (0, 0.05, 0.08), but can be high: 0.62

| Location | Outcome | ρ |
|---|---|---|
| Madagascar | Math + Language | 0.5 |
| Busia, Kenya | Math + Language | 0.22 |
| Udaipur, India | Math + Language | 0.23 |
| Mumbai, India | Math + Language | 0.29 |
| Vadodara, India | Math + Language | 0.28 |
| Busia, Kenya | Math | 0.62 |
Values of ρ (rho)
• Where do I find *my* rho?
– Use data
– Ask other researchers
– Be conservative and use a high value
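If existing data grouped by cluster are available, rho can be estimated from them. The sketch below uses the one-way ANOVA estimator for equal-sized clusters, rho = (MSB - MSW) / (MSB + (n - 1) MSW), where MSB and MSW are the between- and within-cluster mean squares. This is one common estimator, not necessarily the one behind the values quoted in the slides; the function name `icc_anova`, the clipping of negative estimates at zero, and the data are assumptions.

```python
from statistics import fmean

def icc_anova(clusters):
    """One-way ANOVA estimate of the intracluster correlation (rho)
    for equal-sized clusters. Negative estimates are clipped to 0."""
    n = len(clusters[0])
    j = len(clusters)
    if j < 2 or n < 2 or any(len(c) != n for c in clusters):
        raise ValueError("need at least 2 clusters, all of the same size >= 2")
    grand_mean = fmean([x for c in clusters for x in c])
    cluster_means = [fmean(c) for c in clusters]
    # Between-cluster and within-cluster mean squares.
    msb = n * sum((m - grand_mean) ** 2 for m in cluster_means) / (j - 1)
    msw = sum((x - m) ** 2 for c, m in zip(clusters, cluster_means) for x in c) / (j * (n - 1))
    denominator = msb + (n - 1) * msw
    if denominator == 0:
        return 0.0  # no variation at all in the data
    return max(0.0, (msb - msw) / denominator)

# Hypothetical test scores for three schools of four students each.
schools = [
    [52, 55, 50, 53],
    [61, 64, 66, 63],
    [70, 68, 73, 71],
]
rho_hat = icc_anova(schools)
print("estimated rho:", round(rho_hat, 2))
```

With only a handful of clusters the estimate is noisy, which is one reason to follow the advice above and lean toward a conservative (high) value.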
Impact of ρ (rho) on sample size?
• Design effect = (sample size needed with clustering) / (sample size needed without clustering)
• Design effect = 1 + (n - 1) * rho, where n is the group size
– If only one respondent per cluster, rho doesn’t matter
– Larger rho, bigger design effect
– Larger group size, larger effect of rho

| rho | group size n = 10 | n = 50 | n = 100 | n = 200 |
|---|---|---|---|---|
| 0.02 | 1.18 | 1.98 | 2.98 | 4.98 |
| 0.05 | 1.45 | 3.45 | 5.95 | 10.95 |
| 0.10 | 1.9 | 5.9 | 10.9 | 20.9 |
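The table follows directly from the design effect formula. A short sketch that recomputes it (the function name `design_effect` is illustrative):

```python
def design_effect(group_size, rho):
    """Design effect 1 + (n - 1) * rho for clusters of size n."""
    return 1 + (group_size - 1) * rho

group_sizes = (10, 50, 100, 200)
table = {
    rho: [round(design_effect(n, rho), 2) for n in group_sizes]
    for rho in (0.02, 0.05, 0.10)
}

print("rho  ", *group_sizes)
for rho, row in table.items():
    print(rho, *row)
```

Reading one cell as an example: with rho = 0.05 and 100 respondents per group, the clustered design needs close to six times the sample of an individually randomized one for the same precision.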
![Page 57: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/57.jpg)
• If experimental design is clustered, we now need to consider rho when choosing a sample size
• It is extremely important to randomize an adequate number of groups
• Often the number of individuals within groups matters less than the total number of groups
Implications
![Page 58: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/58.jpg)
1. Background: The basics
2. Getting more complicated: Clusters
3. How to do this in practice
Outline
![Page 59: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/59.jpg)
• Two approaches:
• Approach one: Given budget constraints or logistics, you are given the maximum possible sample size. With your estimated effect size, will you have enough power such that it is worthwhile pursuing the project?
• Approach two: Set the power equal to some acceptable number. Given the estimated effect size, what is the sample required to obtain that power?
How to do “power calculations”?
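The two approaches can be sketched with a simple normal approximation. This is an illustration, not the method used by any particular software: it assumes individual-level randomization, two equal-sized arms, a two-sided test, and an effect size expressed in standard deviations of the outcome. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_given_n(n_per_arm, effect_size, alpha=0.05):
    """Approach one: the sample size is fixed, what power do we get?"""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Standard error of a difference in means with unit outcome variance
    se = sqrt(2 / n_per_arm)
    return _STD_NORMAL.cdf(effect_size / se - z_crit)


def n_given_power(power, effect_size, alpha=0.05):
    """Approach two: the power is fixed, how many units per arm?"""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * (z_crit + z_power) ** 2 / effect_size ** 2)


# Approach one: the budget allows 200 units per arm.
print(power_given_n(200, effect_size=0.20))
# Approach two: we want 80% power for the same effect size.
print(n_given_power(0.80, effect_size=0.20))
```

For a clustered design, the sample size from approach two would still need to be scaled up by the design effect.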
![Page 60: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/60.jpg)
• You plug in some numbers…
• The software will graph one of the following (matching the two approaches above):
– Approach 1: Power vs. effect size
– Approach 2: Power vs. observations
• Follow the graph to find the number of observations or the effect size that gives you ~0.90 power
Power calculations using the OD (Optimal Design) software
![Page 61: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/61.jpg)
Power Calculations using the OD software
• Choose “Power vs number of clusters” in the menu “clustered randomized trials”
![Page 62: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/62.jpg)
Cluster Size (If no clusters)
• Choose clusters with 1 unit each… this is a bit confusing
![Page 63: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/63.jpg)
Choose Significance Level and Standardized Effect Size
• Pick α (significance level)
– Normally you pick 0.05
• Pick δ (standardized effect size)
– Can experiment with 0.20
• You obtain the resulting graph showing power as a function of sample size.
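A rough version of the "power vs number of clusters" curve can be tabulated by hand. The sketch below is not the OD software's own computation: it uses a normal approximation, assumes half the clusters are treated, equal cluster sizes, and a standardized outcome, and the function name `cluster_power` and the example values of cluster size and rho are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def cluster_power(num_clusters, cluster_size, rho, delta=0.20, alpha=0.05):
    """Approximate power of a cluster-randomized trial.

    num_clusters is the total number of clusters (half treated, half control),
    delta is the standardized effect size, alpha the significance level.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Variance of the estimated standardized effect under clustering
    variance = 4 * (rho + (1 - rho) / cluster_size) / num_clusters
    return std_normal.cdf(delta / sqrt(variance) - z_crit)


# Power as a function of the number of clusters, for 50 units per cluster.
for j in (20, 40, 80, 160):
    print(j, round(cluster_power(j, cluster_size=50, rho=0.05), 2))
```

Setting `cluster_size=1` reproduces the unclustered case, which is the same trick as choosing clusters with 1 unit in the software.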
![Page 64: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/64.jpg)
Power and Sample Size
![Page 65: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/65.jpg)
Power and Sample Size
![Page 66: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/66.jpg)
Power and Sample Size
![Page 67: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/67.jpg)
Availability of a Baseline
• A baseline has three main uses:
– Check whether the control (C) and treatment (T) groups were the same before the treatment
– Reduce the sample size needed (use baseline variables as controls)
– Study interactions and subgroups
• To compute power with a baseline:
– Need to know the correlation between the two outcome measures (baseline and follow-up)
– The stronger the correlation, the bigger the gain
– Very big gains for very persistent outcomes such as test scores
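To get a feel for the size of the gain, a standard approximation is that controlling for a baseline measure whose correlation with the follow-up outcome is r leaves a share (1 − r²) of the outcome variance unexplained, so the required sample shrinks by about that factor. The sketch below applies this approximation; the function name `n_with_baseline` is hypothetical, and the correlation itself has to come from data or prior studies.

```python
from math import ceil


def n_with_baseline(n_without_baseline, correlation):
    """Approximate sample size needed when controlling for a baseline measure.

    Assumes the baseline outcome is used as a regression control and that
    it explains a share correlation**2 of the follow-up outcome variance.
    """
    if not -1 < correlation < 1:
        raise ValueError("correlation must be strictly between -1 and 1")
    return ceil(n_without_baseline * (1 - correlation ** 2))


# A persistent outcome (correlation 0.8) versus a weakly persistent one (0.2),
# starting from 400 units needed without a baseline.
print(n_with_baseline(400, 0.8))
print(n_with_baseline(400, 0.2))
```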
![Page 68: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/68.jpg)
Stratified Samples
• Stratification reduces the sample size needed to achieve a given power
• Why?
– It reduces the variance of the outcome of interest within each stratum
– It reduces the correlation of units within clusters
• Example: if you randomize within school and grade which class is treated and which class is control:
– Variance of test scores goes down
– The within-cluster correlation goes down
• Common stratification variables:
– Baseline values of the outcomes, when possible
– Variables along which we expect the treatment effect to vary (subgroups)
![Page 69: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/69.jpg)
Other considerations
• Are you interested in the difference between two treatments?
• Are you interested in testing whether the effect is different in different subpopulations?
• Will there be attrition?
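On attrition, a common rule of thumb is to enroll enough units that the sample remaining at follow-up still meets the target. The sketch below assumes attrition is unrelated to treatment status and that the expected attrition rate is a guess supplied by the researcher; the function name `inflate_for_attrition` is hypothetical.

```python
from math import ceil


def inflate_for_attrition(n_required, attrition_rate):
    """Units to enroll so that about n_required remain after attrition."""
    if not 0 <= attrition_rate < 1:
        raise ValueError("attrition_rate must be in [0, 1)")
    return ceil(n_required / (1 - attrition_rate))


# If 393 units per arm are needed at follow-up and 10% are expected to drop out:
print(inflate_for_attrition(393, 0.10))
```

This only protects power; if attrition differs between treatment and control, it can also bias the estimate, which a larger sample does not fix.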
![Page 70: Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the](https://reader034.vdocuments.site/reader034/viewer/2022042022/5e7a53afceb37e685627a7f7/html5/thumbnails/70.jpg)
Conclusions
• Sample size calculations are a craft
• Calculations depend on parameters whose values are unknown and will vary.
– Power calculations involve some guesswork
– They involve pilot testing
– They vary across outcomes!