Econ 5410 (former 641) Econometrics I Prof. J. Huston McCulloch



Lecture 1

Introduction

Chapter 1

Elementary Statistics

Random Variables, distributions

Mean, variance

Appendix B.1 – B.3.

Levels of Econometrics

Gourmet – 2nd year grad sequence

Invent new recipes

Cookbook – 1st year grad sequence

Assemble ingredients from scratch, following recipe

Microwave – Econ 5410, 5420

Select nutritious entree

Punch in # of minutes

Junk Food – Unhealthy diet

Darrell Huff, How to Lie with Statistics, 1954

Data Mining

Cherry picking, Lemon dropping

Duplication fallacy

Newey-West HAC standard errors

Types of Data

Cross Sectional – several “individuals,” one point in time

Econ 5410

Continuous vs. discrete dependent variables – Logit, Probit

Econ 5410

Time Series – several points in time, one “individual”

Mostly Econ 5420

Pooled Cross Sections – several (different) individuals, several points in time

Panel Data – same individuals across time

Econ 5420

Systems of simultaneous equations, Instrumental Variables – e.g. Supply and Demand

Econ 5410, 5420

Concerns

Determining precision of estimates, forecasts

standard errors, confidence intervals

Holding other things constant – Ceteris Paribus

multiple regression

Joint tests of significance

F test, confidence ellipses

Serial Correlation

Reduces effective sample size

Problems

Nonlinearity – 5410

Data Mining – 5410

Heteroskedasticity (non-const. variance)

WLS, HCC – 5410

Non-Gaussian errors

Tests for normality – 5410

Robust regression – 5410

Censored dependent variable

Tobit – Econ 5410

Serial correlation – Econ 5410, 5420

Fixed, Random effects in Panel data – 5420

Endogenous regressors

Instrumental variables, 2SLS – 5410, 5420

Elem. Statistics I – Wooldridge App. B

Random Variables (RVs)

Appendix B follows convention:

Upper case (e.g. X) indicates RV itself

Lower case indicates possible values RV may take

e.g. $x$ if continuous,

$x_1, x_2$, etc. if discrete

Tilde (~) also often used to indicate RV

e.g. $\tilde{X}$

Probability Distributions

Characterized by Cumulative Distribution Function (CDF):

$F(x) = P(X \le x)$

CDF must be non-decreasing, bounded by 0 and 1:

[Figure: CDF F(x) plotted against x, with 1 marked on the vertical axis]

Distributions may be discrete or continuous

Discrete Distributions

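$x_1, x_2, \ldots, x_n$ = possible values

Probability Mass Function (PMF) gives probability of each outcome:

$P(X = x_i) = p_i, \qquad \sum_{i=1}^{n} p_i = 1$

CDF is a step function:

$F(x) = \sum_{x_i \le x} p_i$

E.g. fair die:

$x_1 = 1, \ldots, x_6 = 6$

$p_i = 1/6, \quad i = 1, \ldots, 6$

[Figure: step-function CDF F(x) of the fair die for x = 1, ..., 6, stepping through 1/6, 1/3, 1/2, 2/3, 5/6, 1]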
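The fair-die example can be checked by direct enumeration. The sketch below is illustrative only; the names `values`, `pmf`, and `cdf` are made up for this example and do not come from the slides. It uses exact fractions so the steps of the CDF come out as 1/6, 1/3, and so on.

```python
from fractions import Fraction

# Fair die: possible values x_1..x_6, each with probability p_i = 1/6.
values = [1, 2, 3, 4, 5, 6]
pmf = {x: Fraction(1, 6) for x in values}


def cdf(x):
    """F(x) = P(X <= x): sum of p_i over all x_i <= x."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))


# The probabilities sum to 1, and F steps up by 1/6 at each face.
total = sum(pmf.values())
steps = [cdf(x) for x in values]
print(total)
print(steps)

# Between faces the CDF is flat; below 1 it is 0 and above 6 it is 1.
print(cdf(3.5), cdf(0), cdf(10))
```

Because `cdf` accepts any real number, it reproduces the step shape: it stays constant between faces and jumps only at 1, 2, ..., 6.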

Continuous Distributions

CDF is a continuous function

P(X = x) = 0, all x, so PMF uninformative

Probability Density Function (pdf)

$f(x) = F'(x) \approx \Delta F(x)/\Delta x$

= local prob. per unit of x

$f(x) \ge 0$

[Figure: pdf f(x) plotted against x]

Area under pdf between a and b gives

$P(a \le X \le b) = \int_a^b f(x)\,dx = F(b) - F(a)$

[Figure: pdf f(x) against x, with a and b marked on the horizontal axis]
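As a numerical illustration of the area rule, the sketch below compares a Riemann-sum approximation of the area under a pdf with the difference of CDF values. The standard normal distribution, the endpoints a = -1 and b = 2, and the helper `area_under_pdf` are assumptions chosen for this example; the slides do not fix a particular distribution.

```python
from statistics import NormalDist

# Assumption: X is standard normal; the slides do not fix a distribution.
X = NormalDist(mu=0.0, sigma=1.0)


def area_under_pdf(dist, a, b, n=10_000):
    """Midpoint Riemann sum approximating the integral of the pdf from a to b."""
    width = (b - a) / n
    return sum(dist.pdf(a + (k + 0.5) * width) for k in range(n)) * width


a, b = -1.0, 2.0
by_integration = area_under_pdf(X, a, b)   # area under f(x) between a and b
by_cdf = X.cdf(b) - X.cdf(a)               # F(b) - F(a)
print(by_integration, by_cdf)              # both about 0.8186
```

The two numbers agree up to the discretization error of the Riemann sum, which shrinks as `n` grows.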

Area under pdf to left of x gives CDF:

$P(X \le x) = F(x) = \int_{-\infty}^{x} f(\xi)\,d\xi$

($\xi$ = Greek xi)

[Figure: pdf $f(\xi)$ with x marked on the horizontal axis]

Area under pdf to right of x gives complemented CDF:

$P(X > x) = F^c(x) = 1 - F(x) = \int_{x}^{\infty} f(\xi)\,d\xi$

[Figure: pdf $f(\xi)$ with x marked on the horizontal axis]

Total area under pdf must be 1:

$\int_{-\infty}^{\infty} f(x)\,dx = F(\infty) = 1$

[Figure: pdf f(x) plotted against x]
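The left-area, right-area, and total-area statements can be checked together on a distribution with a closed-form CDF. The sketch below assumes an exponential distribution with a hypothetical rate of 2, which is not from the slides; `RATE`, `UPPER`, and `integrate` are names invented for this example, and `UPPER` merely stands in for infinity.

```python
import math

RATE = 2.0  # hypothetical rate parameter, chosen only for illustration


def pdf(x):
    """Exponential pdf: f(x) = RATE * exp(-RATE * x) for x >= 0, else 0."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def cdf(x):
    """Closed-form CDF: F(x) = 1 - exp(-RATE * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


def integrate(f, lo, hi, n=20_000):
    """Midpoint Riemann sum of f over [lo, hi]."""
    width = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * width) for k in range(n)) * width


x = 0.7
UPPER = 40.0  # stands in for +infinity; the tail beyond it is negligible here
left_area = integrate(pdf, 0.0, x)      # pdf is 0 below 0, so start at 0
right_area = integrate(pdf, x, UPPER)

print(left_area, cdf(x))                # area to the left of x vs F(x)
print(right_area, 1.0 - cdf(x))         # area to the right of x vs 1 - F(x)
print(left_area + right_area)           # total area, close to 1
```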

Expected Value or population mean

$E(X)$ or $\mu_x$ ($\mu$ = Greek mu)

Discrete Distribution:

$E(X) = \sum_{i=1}^{n} x_i p_i$

E.g. fair die:

$E(X) = (1/6)(1) + (1/6)(2) + \cdots + (1/6)(6) = 3.5$

Continuous Dist:

$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
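Both versions of the expected value can be computed in a few lines. The discrete part reproduces the fair-die value of 3.5 from the slide. The continuous part assumes a uniform distribution on [0, 2], picked only for illustration; the helper `expectation` is likewise a name made up here.

```python
from fractions import Fraction

# Discrete case: E(X) = sum of x_i * p_i for a fair die.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
mean_die = sum(x * p for x, p in zip(values, probs))
print(mean_die, float(mean_die))  # 7/2 3.5


# Continuous case: E(X) = integral of x * f(x) dx.
# Assumption: X is uniform on [0, 2], so f(x) = 1/2 there and 0 elsewhere.
def f(x):
    return 0.5 if 0.0 <= x <= 2.0 else 0.0


def expectation(pdf, lo, hi, n=10_000):
    """Midpoint Riemann sum of x * pdf(x) over [lo, hi]."""
    width = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * width
        total += x * pdf(x)
    return total * width


mean_uniform = expectation(f, 0.0, 2.0)
print(mean_uniform)  # close to 1.0, the midpoint of [0, 2]
```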

Linearity of Expectation Operator

for any constants a and b,

E(aX + b) = aE(X) + b,

for any 2 RVs X, Y,

E(X + Y) = E(X) + E(Y)
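Both linearity properties can be verified by enumeration on fair dice. In the sketch below, the constants a = 3 and b = 10 are arbitrary, and the helpers `expect` and `expect_joint` are invented for this example. The two dice are treated as independent only to build a joint distribution; the rule E(X + Y) = E(X) + E(Y) itself does not require independence.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)  # probability of each face of a fair die


def expect(g):
    """E[g(X)] for one fair die X."""
    return sum(g(x) * p for x in faces)


def expect_joint(g):
    """E[g(X, Y)] for two independent fair dice (an assumption of this example)."""
    return sum(g(x, y) * p * p for x, y in product(faces, faces))


a, b = 3, 10  # arbitrary constants
lhs_affine = expect(lambda x: a * x + b)      # E(aX + b)
rhs_affine = a * expect(lambda x: x) + b      # a E(X) + b

lhs_sum = expect_joint(lambda x, y: x + y)            # E(X + Y)
rhs_sum = expect(lambda x: x) + expect(lambda y: y)   # E(X) + E(Y)

print(lhs_affine, rhs_affine)  # 41/2 41/2
print(lhs_sum, rhs_sum)        # 7 7
```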

Variance

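$\mathrm{Var}(X)$ or $\sigma_x^2$ ($\sigma$ = Greek sigma)

$= E\left[(X - \mu_x)^2\right]$

$= \sum_{i=1}^{n} (x_i - \mu_x)^2 p_i$ or $\int_{-\infty}^{\infty} (x - \mu_x)^2 f(x)\,dx$

has units of $X^2$

e.g. $X \sim [\$]$, so $\mathrm{Var}(X) \sim [\$^2]$ (!)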

Standard deviation

$\mathrm{sd}(X) = \sqrt{\mathrm{Var}(X)} = \sigma_x$

• has same units as X

• $X \sim [\$]$, so $\mathrm{sd}(X) \sim [\$]$
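For a concrete number, the variance and standard deviation of the fair die can be computed straight from the definitions. This is a small illustrative sketch; the variable names are not from the slides.

```python
from fractions import Fraction
import math

values = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)  # each face of a fair die is equally likely

mu = sum(x * p for x in values)               # E(X) = 7/2
var = sum((x - mu) ** 2 * p for x in values)  # E[(X - mu)^2] = 35/12
sd = math.sqrt(var)                           # square root, same units as X

print(mu, var, sd)  # sd is about 1.708
```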

Nonlinearity of Variance Operator

$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$

but –

$\mathrm{s.d.}(aX + b) = |a|\,\mathrm{s.d.}(X)$
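One way to see where these two rules come from is to combine the definition of variance with the linearity of expectation. The derivation below is a sketch, writing $\mu_x$ for $E(X)$:

```latex
\begin{align*}
E(aX + b) &= a\mu_x + b \\
\mathrm{Var}(aX + b) &= E\left[\bigl(aX + b - (a\mu_x + b)\bigr)^2\right] \\
                     &= E\left[a^2 (X - \mu_x)^2\right] \\
                     &= a^2\,\mathrm{Var}(X) \\
\mathrm{s.d.}(aX + b) &= \sqrt{a^2\,\mathrm{Var}(X)} = |a|\,\mathrm{s.d.}(X)
\end{align*}
```

The additive constant b shifts the mean but drops out of the deviation from the mean, which is why it does not appear in either result.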

Standardized RVs “Z-scores”

Suppose that for some RV X,

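$E(X) = \mu$,

$\mathrm{Var}(X) = \sigma^2$

so $\mathrm{s.d.}(X) = \sigma$.

Then for $Z = (X - \mu)/\sigma$,

$E(Z) = 0$,

$\mathrm{Var}(Z) = \mathrm{s.d.}(Z) = 1$.

Also,

$F_X(x) = P(X \le x)$

$= P(Z \le (x - \mu)/\sigma)$

$= F_Z(z)$ for $z = (x - \mu)/\sigma$

so that $F_Z(z)$ tells us $F_X(x)$.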
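The last point can be illustrated numerically. The sketch below assumes X is normal with a hypothetical mean of 100 and s.d. of 15; these numbers and the normality assumption are for illustration only, since the slide's statement holds for any distribution with a finite mean and variance.

```python
from statistics import NormalDist

# Assumption: X is normal with hypothetical mean 100 and s.d. 15.
mu, sigma = 100.0, 15.0
X = NormalDist(mu=mu, sigma=sigma)
Z = NormalDist()  # standard normal: mean 0, s.d. 1

x = 120.0
z = (x - mu) / sigma  # the Z-score of x

F_X = X.cdf(x)  # P(X <= x), computed from X's own distribution
F_Z = Z.cdf(z)  # P(Z <= z), computed from the standardized distribution

print(z, F_X, F_Z)  # the two probabilities coincide
```

This is why a single table or function for the standardized distribution is enough: any probability for X can be looked up after converting x to its Z-score.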

Next Class

Covariance

Estimation of mean, variance

Appendices B, C

HW1

Due Friday 5 PM

Box of Shin-Wu Yu outside Arps 410