Difference in Difference 1

Upload: belinda-mccarthy

Post on 04-Jan-2016

Page 1: Difference in Difference 1. Preliminaries Office Hours: Fridays 4-5pm 32Lif, 3.01 I will post slides from class on my website

Difference in Difference 1

Preliminaries

• Office Hours: Fridays 4-5pm, 32Lif, 3.01
• I will post slides from class on my website: http://samuelmarden.weebly.com/
• E-mail: [email protected]

Diff-in-Diff

• Principal tool in non-experimental applied micro over the last twenty years

• Takes an idea from the experimental literature (control groups) and applies it in non-experimental circumstances

• Suitability of the control group is key, as the control group provides the counterfactual

Counterfactuals

Describe, as if to a policymaker with no background in econometrics, what a counterfactual is and why it is important for establishing the impact of a particular program.

Counterfactual

Suppose you are interested in assessing the effect of reducing class sizes on children’s final exam grades. You have test scores from students in classes where the class size was reduced starting in the prior year and from students in classes where the size remained the same. Under what conditions would the difference in the average test scores across these two groups be a valid estimate for the effect of reducing class sizes? Explicitly state the counterfactual you need and how it relates to the comparison group you actually have (i.e., students in classes where the size remained the same).

Re-Cap Rubin Causal Model

We would like to know the effect of the ‘treatment’ on the treatment group

$E(Y_i^T \mid T) - E(Y_i^C \mid T)$

• What do these mean?
• Do we observe (the sample analogue of) both of these objects?

Re-Cap Rubin Causal Model

We would like to know the effect of the ‘treatment’ on the treatment group

$E(Y_i^T \mid T) - E(Y_i^C \mid T)$

• What do these mean?
• We don’t observe (the sample analogue of) $E(Y_i^C \mid T)$

Instead we often estimate $E(Y_i^T \mid T) - E(Y_i^C \mid C)$

• What is the problem?
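
To make the problem concrete, here is a minimal sketch with made-up potential outcomes for six hypothetical units (the numbers are illustrative assumptions, not data from the course). Because both potential outcomes are written down, which is never possible in real data, the effect of treatment on the treated and the naive treated-versus-control comparison can be computed side by side.

```python
# Hypothetical potential outcomes: (Y^C, Y^T, treated?). Illustrative numbers only.
units = [
    (10.0, 12.0, True),
    (11.0, 13.0, True),
    (12.0, 14.0, True),
    (6.0, 8.0, False),
    (7.0, 9.0, False),
    (8.0, 10.0, False),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


treated = [u for u in units if u[2]]
control = [u for u in units if not u[2]]

# E(Y^T | T) - E(Y^C | T): needs the unobservable Y^C of treated units.
att = mean(yt for _, yt, _ in treated) - mean(yc for yc, _, _ in treated)

# E(Y^T | T) - E(Y^C | C): what can actually be computed from observed data.
naive = mean(yt for _, yt, _ in treated) - mean(yc for yc, _, _ in control)

# E(Y^C | T) - E(Y^C | C): how different the groups would be without treatment.
selection_bias = mean(yc for yc, _, _ in treated) - mean(yc for yc, _, _ in control)

print(f"ATT = {att}, naive = {naive}, selection bias = {selection_bias}")
```

In this toy example the naive comparison equals the true effect plus the gap in untreated outcomes between the two groups, so it overstates the effect whenever the treated group would have done better anyway.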

Selection bias

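Occurs when $E(Y_i^C \mid T) \neq E(Y_i^C \mid C)$

Note: $\left(E(Y_i^T \mid T) - E(Y_i^C \mid T)\right) - \left(E(Y_i^T \mid T) - E(Y_i^C \mid C)\right) = E(Y_i^C \mid C) - E(Y_i^C \mid T) \neq 0$

• Examples

What do we do with Diff-in-Diff?

• Estimate $E(\Delta Y_i^T \mid T) - E(\Delta Y_i^C \mid C)$

• So biased if $E(\Delta Y_i^C \mid T) \neq E(\Delta Y_i^C \mid C)$

• What do we call this assumption?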
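
The bias condition above can be made explicit by adding and subtracting the unobserved counterfactual change for the treated group. The decomposition below is a sketch in the slide's notation, assuming the treated group is untreated in the first period, so that the first bracket is the effect of treatment on the treated. The second bracket is zero exactly when the two groups would have followed the same trend without treatment, which is usually called the parallel (or common) trends assumption.

```latex
\begin{align*}
E(\Delta Y_i^T \mid T) - E(\Delta Y_i^C \mid C)
  &= \underbrace{\left[ E(\Delta Y_i^T \mid T) - E(\Delta Y_i^C \mid T) \right]}_{\text{effect of treatment on the treated}} \\
  &\quad + \underbrace{\left[ E(\Delta Y_i^C \mid T) - E(\Delta Y_i^C \mid C) \right]}_{\text{difference in counterfactual trends}}
\end{align*}
```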

2. Productivity of Cocoa Farmers

                                    1995    2000    Difference
Technical Assistance Regions         290     400           110
Not Technical Assistance Regions     280     320
Difference                                    80

a) Time Series Estimate? Any good? Over- or under-estimate? What is the identification assumption?

b) Cross Section Estimate? Any good? Over- or under-estimate? What is the identification assumption?

c) DiD Estimate? Any good? What problems has it solved? What is the identification assumption? Do we believe it?
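
The three estimators in (a) to (c) can be computed directly from the four cell means in the table. This is a minimal sketch: the dictionary layout and variable names are hypothetical, and only the four productivity figures come from the slide.

```python
# Mean productivity by region type and year (figures from the table above).
productivity = {
    ("assistance", 1995): 290,
    ("assistance", 2000): 400,
    ("no_assistance", 1995): 280,
    ("no_assistance", 2000): 320,
}

# (a) Time series: before/after change in the assisted regions only.
time_series = productivity[("assistance", 2000)] - productivity[("assistance", 1995)]

# (b) Cross section: assisted vs non-assisted regions in 2000 only.
cross_section = (
    productivity[("assistance", 2000)] - productivity[("no_assistance", 2000)]
)

# (c) Diff-in-diff: change in assisted regions minus change in non-assisted regions.
control_change = (
    productivity[("no_assistance", 2000)] - productivity[("no_assistance", 1995)]
)
did = time_series - control_change

print(time_series, cross_section, did)  # 110 80 70
```

The diff-in-diff estimate is smaller than both alternatives here because it nets out the 40-point rise seen in non-assisted regions and, equivalently, the 10-point gap that already existed in 1995.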

Stata Part

We are trying to find the effect of the announcement of an incinerator on house prices.

• We have two years, 1978 and 1981
• The treated group are houses within three miles of the incinerator; the control group are houses further than three miles away

Treatment Effect in 1981

. reg lrprice nearinc if year==1981, robust

Linear regression                                      Number of obs =     142
                                                       F(  1,   140) =   31.87
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2172
                                                       Root MSE      =  .34621

------------------------------------------------------------------------------
             |               Robust
     lrprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     nearinc |   -.402572   .0713061    -5.65   0.000     -.543548    -.261596
       _cons |   11.47852   .0317684   361.32   0.000     11.41571    11.54133
------------------------------------------------------------------------------

What is the estimated treatment effect?

What is the identification assumption?

What do we learn?

What are the means of house price in 1981 for the treatment and control groups?
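
With a single dummy regressor, the constant is the mean of the outcome in the control group and the constant plus the slope is the mean in the treated group. A quick check using the coefficients reported in the 1981 output (the Python variable names are illustrative):

```python
# Coefficients from the 1981 regression of lrprice on nearinc.
cons_1981 = 11.47852      # _cons
nearinc_1981 = -0.402572  # coefficient on nearinc

# OLS on one dummy reproduces the two group means of lrprice.
mean_far_1981 = cons_1981                  # houses more than three miles away
mean_near_1981 = cons_1981 + nearinc_1981  # houses within three miles

print(f"far: {mean_far_1981:.5f}, near: {mean_near_1981:.5f}")
```

These are means of lrprice, the dependent variable in the regression, not of the raw price.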

Treatment Effect in 1978

. reg lrprice nearinc if year==1978, robust

Linear regression                                      Number of obs =     179
                                                       F(  1,   177) =   29.78
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1855
                                                       Root MSE      =  .33213

------------------------------------------------------------------------------
             |               Robust
     lrprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     nearinc |   -.339923    .062292    -5.46   0.000    -.4628536   -.2169924
       _cons |   11.28542   .0251352   448.99   0.000     11.23582    11.33503
------------------------------------------------------------------------------

What is the estimated ‘treatment’ effect?

What has this to do with our identification assumption?

What do we learn?

What are the means of house price in 1978 for the treatment and control groups?
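
The 1978 regression works as a check on the cross-section identification assumption. Assuming, as the set-up implies, that 1978 predates the announcement, any gap in 1978 cannot be a treatment effect. A short sketch comparing the two cross-section gaps, using only the nearinc coefficients reported on the slides:

```python
# nearinc coefficients from the two single-year regressions.
gap_1978 = -0.339923  # assumed to be before the announcement
gap_1981 = -0.402572  # after the announcement

# Share of the 1981 gap that was already there in 1978.
pre_existing_share = gap_1978 / gap_1981

# What is left once the pre-existing gap is differenced out.
change_in_gap = gap_1981 - gap_1978

print(f"pre-existing share: {pre_existing_share:.3f}")
print(f"change in gap: {change_in_gap:.6f}")
```

Most of the 1981 gap was already present in 1978, which suggests houses near the site were cheaper for reasons unrelated to the incinerator.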

Diff-in-Diff

Linear regression                                      Number of obs =     321
                                                       F(  3,   317) =   27.64
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2460
                                                       Root MSE      =  .33842

------------------------------------------------------------------------------
             |               Robust
     lrprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     nearinc |   -.339923   .0623326    -5.45   0.000    -.4625609   -.2172851
         y81 |   .1930937   .0404991     4.77   0.000     .1134127    .2727747
 y81_nearinc |   -.062649   .0946655    -0.66   0.509    -.2489011    .1236031
       _cons |   11.28542   .0251516   448.70   0.000     11.23594    11.33491
------------------------------------------------------------------------------

How would we write down the estimating equation?

Which is the variable of interest? What is the estimated ‘treatment’ effect?

What is our identification assumption?

What do we learn?

How do the estimated coefficients relate to the previous tables?
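
One way to write the estimating equation, using the variable names in the table, is lrprice = b0 + b1·nearinc + b2·y81 + b3·(y81 × nearinc) + u, where y81_nearinc is assumed to be the product of the two dummies. The sketch below rebuilds the four group-by-year means from these coefficients and checks them against the single-year regressions; the small discrepancies are rounding in the printed output.

```python
# Diff-in-diff coefficients: _cons, nearinc, y81, y81_nearinc.
b0, b1, b2, b3 = 11.28542, -0.339923, 0.1930937, -0.062649

# Cell means implied by the diff-in-diff regression.
did_means = {
    ("far", 1978): b0,
    ("near", 1978): b0 + b1,
    ("far", 1981): b0 + b2,
    ("near", 1981): b0 + b1 + b2 + b3,
}

# Cell means implied by the two single-year regressions (_cons and nearinc).
single_year_means = {
    ("far", 1978): 11.28542,
    ("near", 1978): 11.28542 - 0.339923,
    ("far", 1981): 11.47852,
    ("near", 1981): 11.47852 - 0.402572,
}

for cell in did_means:
    diff = did_means[cell] - single_year_means[cell]
    print(cell, round(did_means[cell], 5), f"diff={diff:.1e}")

# b3 is the 1981 gap minus the 1978 gap: the diff-in-diff estimate.
did_from_gaps = (-0.402572) - (-0.339923)
print(round(did_from_gaps, 6), b3)
```

Read this way, b1 is the pre-existing 1978 gap, b2 is the change over time for houses far from the site, and b3 is the extra change for houses near it.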

Diff-in-Diff plus Controls

Linear regression                                      Number of obs =     321
                                                       F(  3,   317) =   27.64
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2460
                                                       Root MSE      =  .33842

------------------------------------------------------------------------------
             |               Robust
     lrprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     nearinc |   .0153128   .0694791     0.22   0.826    -.1213991    .1520247
         y81 |   .1612425   .0272562     5.92   0.000     .1076113    .2148737
 y81_nearinc |  -.1310752   .0599971    -2.18   0.030    -.2491297   -.0130207
         age |  -.0085349   .0016673    -5.12   0.000    -.0118156   -.0052542
       agesq |   .0000385   .0000109     3.53   0.000     .0000171      .00006
      lintst |  -.0434139   .0384371    -1.13   0.260    -.1190454    .0322177
       lland |   .1043508   .0339814     3.07   0.002     .0374865    .1712151
       larea |   .3498512   .0622733     5.62   0.000     .2273178    .4723847
       rooms |   .0476201   .0173142     2.75   0.006     .0135514    .0816889
       baths |    .093426   .0283689     3.29   0.001     .0376053    .1492467
        lcbd |   -.033525    .043884    -0.76   0.445    -.1198742    .0528242
       _cons |   7.764986    .569591    13.63   0.000     6.644218    8.885753
------------------------------------------------------------------------------

Which is the variable of interest? What is the estimated ‘treatment’ effect? How does this change?

What is our identification assumption? How has it changed?
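
To see how adding controls changes the picture, the sketch below recomputes the t-statistics for y81_nearinc in both specifications and converts each coefficient into an approximate percentage effect. The conversion assumes lrprice is the log of the real house price, as the variable name suggests; the slides do not state this explicitly.

```python
import math

# (coefficient, robust standard error) on y81_nearinc from the two tables.
specs = {
    "no controls": (-0.062649, 0.0946655),
    "with controls": (-0.1310752, 0.0599971),
}

results = {}
for name, (coef, se) in specs.items():
    t_stat = coef / se
    # Exact percentage change implied by a coefficient measured in log points.
    pct_effect = math.exp(coef) - 1
    results[name] = (t_stat, pct_effect)
    print(f"{name}: t = {t_stat:.2f}, effect = {pct_effect:.1%}")
```

The point estimate roughly doubles and its standard error falls once house characteristics are included, which is why the interaction moves from insignificant to significant at the 5% level.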

What are the policy implications?