econometrics session 1 – introduction amine ouazad, asst. prof. of economics

Econometrics

Session 1 – Introduction

Amine Ouazad, Asst. Prof. of Economics

PRELIMINARIES

Introduction

• Who I am
• Trade-offs
• Textbook
• Grading
• Homework
• Implementation

Session 1
• The two econometric problems
• Randomization as the Golden Benchmark

Outline of the Course

Who I am

• Applied empirical economist.

• Work on urban economics, economics of education, applied econometrics in accounting.

• Emphasis on the identification of causal effects.

• Careful empirical work: clean data work, correct identification of causal effects.

• Large datasets:
– 100+ million observations, administrative datasets, geographic information software.

• Implementation of econometric procedures in Stata/Mata.

Trade-offs

• Classroom is heterogeneous.
– In tastes, mathematics level, needs, prior knowledge.

• Different fields have different habits.
– E.g. “endogeneity” is not an issue, or not the same issue, in OB, Finance, Strategy, or TOM.

• Conclusion:
– Course provides a particular spin on econometrics, with mathematics when needed and applications.

• This is a difficult course, even for students with a prior course in econometrics.

Textbooks

• *William H. Greene, Econometric Analysis, 6th edition.

• Jeffrey Wooldridge, Econometric Analysis of Cross Section and Panel Data.

• Joshua Angrist and Jörn-Steffen Pischke, Mostly Harmless Econometrics.

• Applied Econometrics using Stata, Cameron et al.

Prerequisites

• I assume you know:
– Statistics
  • Random variables.
  • Moments of random variables (mean, variance, kurtosis, skewness).
  • Probabilities.
– Real analysis
  • Integral of functions, derivatives.
  • Convergence of a function at x or at infinity.
– Matrix algebra
  • Inverse, multiplication, projections.

Grading

• Exam: 60%

• Participation: 10%

• Homework: 30%

– One problem set in-between Econometrics A and B.

Implementation

• Stata version 12.
– License for PhD students. Ask IT. 5555 or Alina Jacquet.
– Interactive mode, Do files, Mata programming.
– Compulsory for this course.

• MATLAB, not for everybody.
– Coding econometric procedures yourself, e.g. GMM.

Outline for Session 1: Introduction

1. Correlation and Causation

2. The Two Econometric Problems

3. Treatment Effects

1. CORRELATION AND CAUSATION


1. The perils of confounding correlation and causation

• How can we boost children’s reading scores?
– Shoe size is correlated with IQ.

• Women earn less than men.
– Sign of discrimination?

• Health is negatively correlated with the number of days spent in hospital.
– Do hospitals kill patients?
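The shoe-size puzzle can be illustrated with a small simulation. This is only a sketch under an assumption the slide does not state: that a third variable (here, age) drives both shoe size and reading score, and that shoe size has no effect of its own. All coefficients are invented for the illustration.

```python
import random
import statistics

def simulate_children(n=20_000, seed=0):
    """Simulate children whose shoe size and reading score both depend on age only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(6, 12)                   # assumed common cause (hypothetical)
        shoe = 20 + 1.5 * age + rng.gauss(0, 1)    # shoe size grows with age
        score = 10 * age + rng.gauss(0, 10)        # reading score grows with age
        rows.append((age, shoe, score))
    return rows

def corr(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

rows = simulate_children()
raw_corr = corr([r[1] for r in rows], [r[2] for r in rows])

# Compare children of (almost) the same age: the correlation nearly disappears.
band = [r for r in rows if 8.9 <= r[0] < 9.1]
band_corr = corr([r[1] for r in band], [r[2] for r in band])

print(f"all children: {raw_corr:.2f}, same-age children: {band_corr:.2f}")
```

In this made-up world, buying bigger shoes would not raise reading scores, even though the raw correlation is strong.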

Potential outcomes framework

• A.k.a. the “Rubin causal model”.
• Outcome with the treatment Y(1), outcome without the treatment Y(0).
• Treatment status D=0,1.
• FUNDAMENTAL PROBLEM OF ECONOMETRICS: Either Y(1) or Y(0) is observed, never both; equivalently, only Y = Y(1) D + Y(0) (1-D) is observed.

• What would have happened if a given subject had received a different treatment?
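A minimal Python sketch of the observation rule Y = Y(1) D + Y(0) (1-D). The three subjects and their outcome values are made up for illustration; in real data the `counterfactual` values are never available.

```python
from typing import NamedTuple

class Subject(NamedTuple):
    y1: float  # potential outcome with the treatment, Y(1)
    y0: float  # potential outcome without the treatment, Y(0)
    d: int     # treatment status, D = 0 or 1

def observed_outcome(s: Subject) -> float:
    """Y = Y(1) D + Y(0) (1 - D): the only outcome the econometrician sees."""
    return s.y1 * s.d + s.y0 * (1 - s.d)

def counterfactual(s: Subject) -> float:
    """Outcome under the treatment status the subject did NOT receive."""
    return s.y1 * (1 - s.d) + s.y0 * s.d

# Hypothetical subjects, invented for this example.
subjects = [
    Subject(y1=5.0, y0=3.0, d=1),
    Subject(y1=4.0, y0=4.0, d=0),
    Subject(y1=2.0, y0=6.0, d=1),
]
observed = [observed_outcome(s) for s in subjects]
hidden = [counterfactual(s) for s in subjects]
print(observed)  # [5.0, 4.0, 2.0]
print(hidden)    # [3.0, 4.0, 6.0]
```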

Naïve estimator of the treatment effect

• Δ = E(Y|D=1) – E(Y|D=0).
• Does that identify any relevant parameter?

• Notice that:
– Δ = E(Y|D=1) – E(Y|D=0) = E(Y(1)|D=1) – E(Y(0)|D=0)

• What are we looking for?
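One way to approach the question, written out as a worked step: adding and subtracting the unobserved counterfactual mean E(Y(0)|D=1) splits the naïve estimator (denoted Δ here) into the average effect among the treated and a remainder. The labels under the braces are informal names chosen for this sketch.

```latex
\begin{aligned}
\Delta &= E(Y(1)\mid D=1) - E(Y(0)\mid D=0) \\
       &= \underbrace{E(Y(1)-Y(0)\mid D=1)}_{\text{effect on the treated}}
        + \underbrace{E(Y(0)\mid D=1) - E(Y(0)\mid D=0)}_{\text{selection term}}
\end{aligned}
```

The second term is zero only if treated and untreated subjects would have had the same average outcome without the treatment.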

Ignorable Treatment (Rubin 1983)

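• Assume (Y(1),Y(0)) ⊥ D, i.e. the potential outcomes are independent of treatment status.

• Then E(Y(0)|D=1)=E(Y(0)|D=0)=E(Y(0)).

• Similarly for Y(1).
• Then: E(Y|D=1) – E(Y|D=0) = E(Y(1)) – E(Y(0)) = E(Y(1)–Y(0)).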

Another Interpretation

• Assume Y(D)=a+bD+e.
• e is the “unobservables”.
• The naïve estimator Δ=b+E(e|D=1)-E(e|D=0).
• Selection bias: S=E(e|D=1)-E(e|D=0).
– Overestimates the effect if S>0.
– Underestimates the effect if S<0.
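The relation "naïve estimator = b + S" can be checked numerically. The sketch below assumes a particular selection rule (subjects with high unobservables e take the treatment more often) and made-up values a = 1, b = 2; neither comes from the slides.

```python
import random
import statistics

def simulate(n=50_000, a=1.0, b=2.0, seed=1):
    """Draw (D, Y, e) from Y = a + b*D + e, with self-selection on e."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        e = rng.gauss(0, 1)                        # unobservables
        # Assumed selection rule: high-e subjects are treated more often.
        d = 1 if e + rng.gauss(0, 1) > 0 else 0
        y = a + b * d + e
        data.append((d, y, e))
    return data

def mean_by_group(data, column, group):
    """Sample mean of one column among rows with D == group."""
    return statistics.fmean(row[column] for row in data if row[0] == group)

b = 2.0
data = simulate(b=b)
naive = mean_by_group(data, 1, 1) - mean_by_group(data, 1, 0)      # E(Y|D=1) - E(Y|D=0)
selection = mean_by_group(data, 2, 1) - mean_by_group(data, 2, 0)  # S = E(e|D=1) - E(e|D=0)
print(f"b = {b}, naive = {naive:.3f}, b + S = {b + selection:.3f}")
```

Because S > 0 under this selection rule, the naïve comparison overstates b; in real data e is unobserved, so S cannot be computed this way.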

Definitions

• Treatment Effect: Y(1)-Y(0)

• Average Treatment Effect (ATE): E(Y(1)-Y(0))

• Average Treatment Effect on the Treated (ATT): E(Y(1)-Y(0)|D=1)

• Average Treatment Effect on the Untreated (ATU): E(Y(1)-Y(0)|D=0)
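To see how the three averages can differ, here is a sketch on a simulated population where both potential outcomes are known (something only a simulation allows). It assumes, purely for illustration, that subjects whose individual gain exceeds 1 take the treatment.

```python
import random
import statistics

rng = random.Random(2)
population = []
for _ in range(50_000):
    y0 = rng.gauss(0, 1)
    gain = rng.gauss(1, 1)        # individual treatment effect Y(1) - Y(0)
    d = 1 if gain > 1 else 0      # assumed: above-average gainers select into treatment
    population.append((y0 + gain, y0, d))

def average_effect(rows):
    """Mean of Y(1) - Y(0) over the given rows."""
    return statistics.fmean(y1 - y0 for y1, y0, _ in rows)

ate = average_effect(population)
att = average_effect([r for r in population if r[2] == 1])
atu = average_effect([r for r in population if r[2] == 0])
share_treated = statistics.fmean(r[2] for r in population)

print(f"ATE = {ate:.2f}, ATT = {att:.2f}, ATU = {atu:.2f}")
```

The ATE is the weighted average of the other two: ATE = P(D=1) ATT + P(D=0) ATU.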

Randomization as the Golden Benchmark

• Effect of a medical treatment.
– Treatment and control group.
– Randomization of the assignment to the treatment and to the control.

• Why randomize?

• … effect of jumping without a parachute on the probability of death.

With ignorability…

• If the treatment is ignorable (e.g. if the treatment has been randomly assigned to subjects) then
– ATE = ATT = ATU
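A simulation sketch of this equality under coin-flip assignment. The outcome distributions are invented; the point is only that, when D is assigned independently of (Y(1), Y(0)), the three averages and the naïve difference in means all land close to each other in a large sample.

```python
import random
import statistics

rng = random.Random(3)
rows = []
for _ in range(100_000):
    y0 = rng.gauss(0, 1)
    y1 = y0 + rng.gauss(1, 1)             # heterogeneous effect with mean 1
    d = 1 if rng.random() < 0.5 else 0    # coin flip, unrelated to (y1, y0)
    rows.append((y1, y0, d))

treated = [r for r in rows if r[2] == 1]
control = [r for r in rows if r[2] == 0]

ate = statistics.fmean(y1 - y0 for y1, y0, _ in rows)
att = statistics.fmean(y1 - y0 for y1, y0, _ in treated)
atu = statistics.fmean(y1 - y0 for y1, y0, _ in control)

# The naive estimator uses only observed outcomes: Y(1) for treated, Y(0) for control.
naive = statistics.fmean(r[0] for r in treated) - statistics.fmean(r[1] for r in control)

print(f"ATE = {ate:.3f}, ATT = {att:.3f}, ATU = {atu:.3f}, naive = {naive:.3f}")
```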

Selection bias

• Why is there a selection bias?
– In medicine, in economics, in management?

1. Self-selection of subjects into the treatment.

2. Correlation between unobservables and observables, e.g. industry, gender, income.

2. THE TWO ECONOMETRIC PROBLEMS


2. The Two Econometric Problems

• Identification and Inference
– “Studies of identification seek to characterize the conclusions that could be drawn if one could use the sampling process to obtain an unlimited number of observations.”

– “Studies of statistical inference seek to characterize the generally weaker conclusions that can be drawn from a finite number of observations.”

Identification vs inference

• Consider a survey of a random subset of 1,302 French individuals.

• Identification:
– Can you identify the average income in France?

• Inference:
– How close to the true average income is the mean income in the sample?
– i.e. what is the confidence interval around the estimate of the average income in France?
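The inference question can be made concrete with a sketch that computes a 95% confidence interval from a sample of 1,302 incomes. The incomes below are simulated log-normal draws with arbitrary parameters, not real survey data, and the interval uses the usual normal approximation (mean ± 1.96 standard errors).

```python
import math
import random
import statistics

rng = random.Random(4)
n = 1302
# Hypothetical incomes: log-normal draws, NOT real French data.
incomes = [rng.lognormvariate(10, 0.6) for _ in range(n)]

mean = statistics.fmean(incomes)
std_error = statistics.stdev(incomes) / math.sqrt(n)   # standard error of the sample mean
ci_low, ci_high = mean - 1.96 * std_error, mean + 1.96 * std_error

print(f"mean = {mean:.0f}, 95% CI = [{ci_low:.0f}, {ci_high:.0f}]")
```

Random sampling is what makes the population mean identifiable; the width of the interval is the inference part, and it shrinks as the sample grows.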

Identification vs inference

• Consider a lab experiment with 9 rats, randomly assigned to a treatment group and a control group.

• Identification:
– Can you identify the effect of the medication on the rats using the random assignment?

• Inference:
– With 9 rats, can you say anything about the effectiveness of the medication?
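A sketch of why 9 rats is an inference problem rather than an identification problem. It repeats a randomized experiment many times with an assumed true effect of 1 and standard normal noise (both invented), once with 9 rats and once with 900.

```python
import random
import statistics

def run_experiment(n_rats, rng, effect=1.0):
    """One randomized experiment: return treated mean minus control mean."""
    n_treated = n_rats // 2
    assignment = [1] * n_treated + [0] * (n_rats - n_treated)
    rng.shuffle(assignment)                                   # random assignment
    outcomes = [rng.gauss(0, 1) + effect * d for d in assignment]
    treated = [y for y, d in zip(outcomes, assignment) if d == 1]
    control = [y for y, d in zip(outcomes, assignment) if d == 0]
    return statistics.fmean(treated) - statistics.fmean(control)

rng = random.Random(5)
small = [run_experiment(9, rng) for _ in range(1_000)]
large = [run_experiment(900, rng) for _ in range(1_000)]

print(f"9 rats:   mean estimate {statistics.fmean(small):.2f}, spread {statistics.stdev(small):.2f}")
print(f"900 rats: mean estimate {statistics.fmean(large):.2f}, spread {statistics.stdev(large):.2f}")
print(f"share of 9-rat experiments with a non-positive estimate: "
      f"{sum(est <= 0 for est in small) / len(small):.2f}")
```

In both designs the estimates centre on the true effect, so random assignment identifies it; with 9 rats a single experiment is simply too noisy to say much.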

This session

• This session has focused on identification.
– i.e. I assume we have a potentially infinite dataset.
– I focus on the conditions for the identification of the causal effect of a variable.

• Next session: what problems appear because we have a limited number of observations?

LOOKING FORWARD: OUTLINE OF THE COURSE


Outline of the course

1. Introduction: Identification

2. Introduction: Inference

3. Linear Regression

4. Identification Issues in Linear Regressions

5. Inference Issues in Linear Regressions

6. Identification in Simultaneous Equation Models

7. Instrumental variable (IV) estimation

8. Finding IVs: Identification strategies

9. Panel data analysis

10. Bootstrap

11. Generalized Method of Moments (GMM)

12. GMM: Dynamic Panel Data estimation

13. Maximum Likelihood (ML): Introduction

14. ML: Probit and Logit

15. ML: Heckman selection models

16. ML: Truncation and censoring

+ Exercise/Review session

+ Exam
