Econometrics
Session 1 – Introduction
Amine Ouazad, Asst. Prof. of Economics
PRELIMINARIES
Introduction
• Who I am
• Trade-offs
• Textbook
• Grading
• Homework
• Implementation
Session 1
• The two econometric problems
• Randomization as the Golden Benchmark
Outline of the Course
Who I am
• Applied empirical economist.
• Work on urban economics, economics of education, applied econometrics in accounting.
• Emphasis on the identification of causal effects.
• Careful empirical work: clean data work, correct identification of causal effects.
• Large datasets:
  – 100+ million observations, administrative datasets, geographic information software.
• Implementation of econometric procedures in Stata/Mata.
Trade-offs
• Classroom is heterogeneous.
  – In tastes, mathematics level, needs, prior knowledge.
• Different fields have different habits.
  – E.g. “endogeneity” is not an issue, or not the same issue, in OB, Finance, Strategy, or TOM.
• Conclusion:
  – Course provides a particular spin on econometrics, with mathematics when needed, and applications.
• This is a difficult course, even for students with a prior course in econometrics.
Textbooks
• *William H. Greene, Econometric Analysis, 6th edition.
• Jeffrey Wooldridge, Econometric Analysis of Cross Section and Panel Data.
• Joshua Angrist and Jörn-Steffen Pischke, Mostly Harmless Econometrics.
• A. Colin Cameron and Pravin K. Trivedi, Microeconometrics Using Stata.
Prerequisites
• I assume you know:
  – Statistics
    • Random variables.
    • Moments of random variables (mean, variance, skewness, kurtosis).
    • Probabilities.
  – Real analysis
    • Integral of functions, derivatives.
    • Convergence of a function at x or at infinity.
  – Matrix algebra
    • Inverse, multiplication, projections.
Grading
• Exam: 60%
• Participation: 10%
• Homework: 30%
– One problem set in-between Econometrics A and B.
Implementation
• STATA version 12.
  – License for PhD students. Ask IT. 5555 or Alina Jacquet.
  – Interactive mode, Do files, Mata programming.
  – Compulsory for this course.
• MATLAB, not for everybody.
  – Coding econometric procedures yourself, e.g. GMM.
Outline for Session 1: Introduction
1. Correlation and Causation
2. The Two Econometric Problems
3. Treatment Effects
1. CORRELATION AND CAUSATION
1. The perils of confounding correlation and causation
• How can we boost children’s reading scores?
  – Shoe size is correlated with IQ.
• Women earn less than men.
  – Sign of discrimination?
• Health is negatively correlated with the number of days spent in hospital.
  – Do hospitals kill patients?
Potential outcomes framework
• A.k.a. the “Rubin causal model”.
• Outcome with the treatment Y(1), outcome without the treatment Y(0).
• Treatment status D = 0, 1.
• FUNDAMENTAL PROBLEM OF ECONOMETRICS: for each subject, either Y(1) or Y(0) is observed, never both. Equivalently, only Y = Y(1) D + Y(0) (1 - D) is observed.
• What would have happened if a given subject had received a different treatment?
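The observation rule Y = Y(1) D + Y(0) (1 - D) can be illustrated with a minimal sketch. The five subjects and all numbers below are hypothetical, chosen only to show that each subject reveals exactly one of the two potential outcomes.

```python
import numpy as np

# Hypothetical potential outcomes for five subjects (illustrative numbers only).
y1 = np.array([5.0, 7.0, 6.0, 9.0, 4.0])   # outcome with the treatment, Y(1)
y0 = np.array([3.0, 6.0, 6.0, 5.0, 4.0])   # outcome without the treatment, Y(0)
d = np.array([1, 0, 1, 0, 1])              # treatment status D

# Observed outcome: Y = Y(1) D + Y(0) (1 - D).
y = y1 * d + y0 * (1 - d)

# What the econometrician actually sees: the counterfactual is missing (NaN).
y1_observed = np.where(d == 1, y1, np.nan)
y0_observed = np.where(d == 0, y0, np.nan)

for i in range(d.size):
    print(f"subject {i}: D={d[i]}  Y={y[i]}  Y(1)={y1_observed[i]}  Y(0)={y0_observed[i]}")
```

In real data only d and y exist; the full y1 and y0 arrays are available here because the example is simulated.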
Naïve estimator of the treatment effect
• Δ = E(Y|D=1) - E(Y|D=0).
• Does that identify any relevant parameter?
• Notice that:
  – Δ = E(Y|D=1) - E(Y|D=0) = E(Y(1)|D=1) - E(Y(0)|D=0)
• What are we looking for?
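One way to see what the naïve comparison picks up is to add and subtract E(Y(0)|D=1). The derivation below is a sketch using only the notation of this slide, writing Δ for the naïve estimator E(Y|D=1) - E(Y|D=0); the labels under the braces are descriptive names, not additional assumptions.

```latex
\[
\begin{aligned}
\Delta &= E(Y \mid D=1) - E(Y \mid D=0) \\
       &= E(Y(1) \mid D=1) - E(Y(0) \mid D=0) \\
       &= \underbrace{E(Y(1) - Y(0) \mid D=1)}_{\text{average effect on the treated}}
        + \underbrace{E(Y(0) \mid D=1) - E(Y(0) \mid D=0)}_{\text{selection bias}}
\end{aligned}
\]
```

The second term is zero only when treated and untreated subjects would have had the same average outcome without the treatment.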
Ignorable Treatment (Rubin 1983)
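• Assume (Y(1), Y(0)) ⊥ D, i.e. the potential outcomes are independent of treatment status.
• Then E(Y(0)|D=1) = E(Y(0)|D=0) = E(Y(0)).
• Similarly for Y(1).
• Then: E(Y|D=1) - E(Y|D=0) = E(Y(1)) - E(Y(0)) = E(Y(1) - Y(0)), so the naïve estimator identifies the average treatment effect.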
Another Interpretation
• Assume Y(D) = a + bD + e.
• e is the “unobservables”.
• The naïve estimator: Δ = E(Y|D=1) - E(Y|D=0) = b + E(e|D=1) - E(e|D=0).
• Selection bias: S = E(e|D=1) - E(e|D=0).
  – Overestimates the effect if S > 0.
  – Underestimates the effect if S < 0.
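The relation "naïve estimator = b + S" can be checked on simulated data. This is a minimal sketch under assumed values: a = 1, b = 2, normally distributed unobservables, and a made-up self-selection rule in which subjects with high e are more likely to take the treatment. None of these choices come from the course.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
a, b = 1.0, 2.0                      # hypothetical intercept and true effect

e = rng.normal(size=n)               # unobservables
# Assumed self-selection rule: high-e subjects are more likely to be treated.
d = (e + rng.normal(size=n) > 0).astype(int)
y = a + b * d + e                    # Y(D) = a + bD + e

naive = y[d == 1].mean() - y[d == 0].mean()
selection_bias = e[d == 1].mean() - e[d == 0].mean()   # sample analogue of S

print(f"true effect b:   {b:.3f}")
print(f"naive estimator: {naive:.3f}")
print(f"b + S:           {b + selection_bias:.3f}")
```

Because treated subjects have higher unobservables on average in this setup, S > 0 and the naïve comparison overstates b.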
Definitions
• Treatment Effect: Y(1) - Y(0)
• Average Treatment Effect (ATE): E(Y(1) - Y(0))
• Average Treatment Effect on the Treated (ATT): E(Y(1) - Y(0) | D=1)
• Average Treatment Effect on the Untreated (ATU): E(Y(1) - Y(0) | D=0)
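These definitions can be computed directly when both potential outcomes are known, which is only possible in a constructed example. The sketch below uses six hypothetical subjects with made-up numbers; with real data the individual effects Y(1) - Y(0) are never observed.

```python
import numpy as np

# Hypothetical potential outcomes (illustrative numbers only).
y1 = np.array([5.0, 7.0, 6.0, 9.0, 4.0, 8.0])   # Y(1)
y0 = np.array([3.0, 6.0, 6.0, 5.0, 4.0, 5.0])   # Y(0)
d = np.array([1, 0, 0, 1, 0, 1])                # treatment status D

effect = y1 - y0                  # individual treatment effects Y(1) - Y(0)
ate = effect.mean()               # E(Y(1) - Y(0))
att = effect[d == 1].mean()       # E(Y(1) - Y(0) | D = 1)
atu = effect[d == 0].mean()       # E(Y(1) - Y(0) | D = 0)

share_treated = d.mean()
print(f"ATE={ate:.3f}  ATT={att:.3f}  ATU={atu:.3f}")
print(f"P(D=1)*ATT + P(D=0)*ATU = {share_treated * att + (1 - share_treated) * atu:.3f}")
```

In this example the treated subjects are those with the largest effects, so the three averages differ; the average effect over everyone is the share-weighted mean of the other two.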
Randomization as the Golden Benchmark
• Effect of a medical treatment.
  – Treatment and control group.
  – Randomization of the assignment to the treatment and to the control.
• Why randomize?
• … effect of jumping without a parachute on the probability of death.
With ignorability…
• If the treatment is ignorable (e.g. if the treatment has been randomly assigned to subjects), then:
  – ATE = ATT = ATU
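A simulation can make this equality concrete. The sketch below assumes heterogeneous individual effects and a coin-flip assignment; the distributions and parameter values are hypothetical. It compares the average effect over everyone, over the treated, and over the untreated with the naïve difference in observed means.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

y0 = rng.normal(loc=10.0, scale=2.0, size=n)      # Y(0), hypothetical distribution
effect = rng.normal(loc=1.5, scale=1.0, size=n)   # heterogeneous Y(1) - Y(0)
y1 = y0 + effect

# Coin-flip assignment: D is independent of (Y(1), Y(0)) by construction.
d = rng.integers(0, 2, size=n)
y = y1 * d + y0 * (1 - d)                         # observed outcome

ate = effect.mean()                   # average effect over everyone
att = effect[d == 1].mean()           # average effect over the treated
atu = effect[d == 0].mean()           # average effect over the untreated
naive = y[d == 1].mean() - y[d == 0].mean()

print(f"ATE={ate:.3f}  ATT={att:.3f}  ATU={atu:.3f}  naive={naive:.3f}")
```

The four numbers agree only up to sampling noise, since the sample is finite.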
Selection bias
• Why is there a selection bias?
  – In medicine, in economics, in management?
1. Self-selection of subjects into the treatment.
2. Correlation between unobservables and observables, e.g. industry, gender, income.
2. THE TWO ECONOMETRIC PROBLEMS
2. The Two Econometric Problems
• Identification and Inference
  – “Studies of identification seek to characterize the conclusions that could be drawn if one could use the sampling process to obtain an unlimited number of observations.”
  – “Studies of statistical inference seek to characterize the generally weaker conclusions that can be drawn from a finite number of observations.”
Identification vs inference
• Consider a survey of a random subset of 1,302 French individuals.
• Identification:
  – Can you identify the average income in France?
• Inference:
  – How close to the true average income is the mean income in the sample?
  – i.e. what is the confidence interval around the estimate of the average income in France?
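The inference question can be sketched numerically. The example below draws 1,302 made-up incomes from an assumed log-normal distribution (not real survey data) and builds an approximate 95% confidence interval for the mean using the normal approximation, which is one common choice among several.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical incomes: made-up log-normal draws, not real survey data.
income = rng.lognormal(mean=10.0, sigma=0.6, size=1302)

mean = income.mean()
se = income.std(ddof=1) / np.sqrt(income.size)   # standard error of the sample mean

# Approximate 95% interval based on the normal approximation (critical value 1.96).
ci_low, ci_high = mean - 1.96 * se, mean + 1.96 * se

print(f"sample mean: {mean:,.0f}")
print(f"95% CI:      [{ci_low:,.0f}, {ci_high:,.0f}]")
```

With a random sample the mean is identified; the interval width reflects only that the sample is finite.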
Identification vs inference
• Consider a lab experiment with 9 rats, randomly assigned to a treatment group and a control group.
• Identification:
  – Can you identify the effect of the medication on the rats using the random assignment?
• Inference:
  – With 9 rats, can you say anything about the effectiveness of the medication?
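One possible way to approach the 9-rat question is an exact randomization (permutation) test. This technique is brought in here purely for illustration and is not stated on the slide. The outcomes and the 5/4 split between treated and control are hypothetical.

```python
from itertools import combinations

import numpy as np

# Hypothetical outcomes for 9 rats (made-up numbers); rats 0-4 were treated.
outcome = np.array([7.1, 6.4, 8.0, 6.9, 7.5, 6.0, 6.6, 5.8, 6.3])
treated = (0, 1, 2, 3, 4)


def diff_in_means(treated_idx):
    """Mean outcome of the rats in treated_idx minus mean outcome of the others."""
    mask = np.zeros(outcome.size, dtype=bool)
    mask[list(treated_idx)] = True
    return outcome[mask].mean() - outcome[~mask].mean()


observed = diff_in_means(treated)

# Under the sharp null of no effect for any rat, each way of picking 5 treated
# rats out of 9 was equally likely, so enumerate all of them.
draws = [diff_in_means(idx) for idx in combinations(range(outcome.size), 5)]

# Two-sided p-value: share of assignments at least as extreme as the observed one.
p_value = np.mean([abs(x) >= abs(observed) - 1e-12 for x in draws])

print(f"observed difference in means: {observed:.3f}")
print(f"assignments enumerated:       {len(draws)}")
print(f"exact two-sided p-value:      {p_value:.3f}")
```

With so few rats the set of possible assignments is small, which limits how small the p-value can ever be.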
This session
• This session has focused on identification.
  – i.e. I assume we have a potentially infinite dataset.
  – I focus on the conditions for the identification of the causal effect of a variable.
• Next session: what problems appear because we have a limited number of observations?
LOOKING FORWARD: OUTLINE OF THE COURSE
Outline of the course
1. Introduction: Identification
2. Introduction: Inference
3. Linear Regression
4. Identification Issues in Linear Regressions
5. Inference Issues in Linear Regressions
6. Identification in Simultaneous Equation Models
7. Instrumental variable (IV) estimation
8. Finding IVs: Identification strategies
9. Panel data analysis
10. Bootstrap
11. Generalized Method of Moments (GMM)
12. GMM: Dynamic Panel Data estimation
13. Maximum Likelihood (ML): Introduction
14. ML: Probit and Logit
15. ML: Heckman selection models
16. ML: Truncation and censoring
+ Exercise/Review session
+ Exam