Econ 5410 (former 641) Econometrics I
Lecture 1
Introduction
Chapter 1
Elementary Statistics
Random Variables, distributions
Mean, variance
Appendix B.1 – B.3.
Levels of Econometrics
Gourmet – 2nd year grad sequence
Invent new recipes
Cookbook – 1st year grad sequence
Assemble ingredients from scratch, following recipe
Microwave – Econ 5410, 5420
Select nutritious entree
Punch in # of minutes
Junk Food – Unhealthy diet
Darrell Huff, How to Lie with Statistics, 1954
Data Mining
Cherry picking, Lemon dropping
Duplication fallacy
Newey-West HAC standard errors
Types of Data
Cross Sectional: several “individuals,” one point in time
Econ 5410
Continuous vs. discrete dependent variables: Logit, Probit
Econ 5410
Time Series: several points in time, one “individual”
Mostly Econ 5420
Pooled Cross Sections: several (different) individuals, several points in time
Panel Data – same individuals across time
Econ 5420
Systems of simultaneous equations, Instrumental Variables, e.g. Supply and Demand
Econ 5410, 5420
Concerns
Determining precision of estimates, forecasts
standard errors, confidence intervals
Holding other things constant – Ceteris Paribus
multiple regression
Joint tests of significance
F test, confidence ellipses
Serial Correlation
Reduces effective sample size
Problems
Nonlinearity – 5410
Data Mining – 5410
Heteroskedasticity (non-const. variance) WLS, HCC – 5410
Non-Gaussian errors
Tests for normality – 5410
Robust regression – 5410
Censored dependent variable
Tobit – Econ 5410
Serial correlation – Econ 5410, 5420
Fixed, Random effects in Panel data – 5420
Endogenous regressors
Instrumental variables, 2SLS – 5410, 5420
Elem. Statistics I Wooldridge App. B
Random Variables (RVs)
Appendix B follows convention:
Upper case (e.g. X) indicates RV itself
Lower case indicates possible values RV may take
e.g. $x$ if continuous,
$x_1, x_2$, etc. if discrete
Tilde (~) also often used to indicate RV
e.g. $\tilde{X}$
Probability Distributions
Characterized by Cumulative Distribution Function (CDF):
$F(x) = P(X \le x)$
CDF must be non-decreasing, bounded by 0 and 1:
[Figure: CDF F(x) plotted against x, with the upper bound 1 marked on the vertical axis]
Distributions may be discrete or continuous
Discrete Distributions
$x_1, x_2, \ldots, x_n$ = possible values
Probability Mass Function (PMF) gives probability of each outcome:
$P(X = x_i) = p_i$
CDF is a step function
E.g. fair die
$x_1 = 1, \ldots, x_6 = 6$
$p_i = 1/6$, $i = 1, \ldots, 6$
$F(x) = \sum_{x_i \le x} p_i$
$\sum_{i=1}^{n} p_i = 1$
[Figure: step-function CDF F(x) of a fair die; x-axis ticks at 1 to 6, F(x) levels at 1/6, 1/3, 1/2, 2/3, 5/6, 1]
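The fair-die PMF and its step-function CDF can be sketched in a few lines. This is only an illustration: the names `pmf` and `cdf` are made up here, and exact fractions are used so the 1/6 jumps are visible without rounding.

```python
from fractions import Fraction

# Fair die: possible values x_1..x_6, each with probability p_i = 1/6.
values = [1, 2, 3, 4, 5, 6]
pmf = {x: Fraction(1, 6) for x in values}

def cdf(x):
    """F(x) = P(X <= x): sum of p_i over all x_i <= x."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))

# The probabilities sum to one.
total = sum(pmf.values())
print("sum of p_i =", total)

# The CDF is flat between the possible values and jumps by 1/6 at each x_i.
for x in [0, 1, 2.5, 3, 6, 7]:
    print(f"F({x}) = {cdf(x)}")
```

Evaluating at a point between two faces, such as 2.5, returns the same value as at 2, which is what makes the CDF a step function.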
Continuous Distributions
CDF is a continuous function
P(X = x) = 0, all x, so PMF uninformative
Probability Density Function (pdf)
$f(x) = F'(x) \approx \Delta F(x) / \Delta x$
= local prob. per unit of x
$f(x) \ge 0$
[Figure: pdf f(x) plotted against x]
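A quick numerical check of "pdf = slope of the CDF" is sketched here. The standard normal distribution is an assumption made for illustration only (the slide does not name a distribution), and `local_prob_per_unit` is a hypothetical helper name.

```python
from statistics import NormalDist

# Assumption for illustration: X is standard normal (not specified in the slides).
dist = NormalDist(mu=0.0, sigma=1.0)

def local_prob_per_unit(x, dx=1e-5):
    """Approximate f(x) by Delta F(x) / Delta x using a central difference."""
    return (dist.cdf(x + dx) - dist.cdf(x - dx)) / (2 * dx)

for x in [-1.0, 0.0, 0.5, 2.0]:
    approx = local_prob_per_unit(x)
    exact = dist.pdf(x)
    print(f"x={x:5.2f}  dF/dx ~ {approx:.6f}  pdf = {exact:.6f}")
```

The two columns should agree to several decimal places, and both are non-negative because the CDF never decreases.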
Expected Value or population mean
E(X) or $\mu_x$ ($\mu$ = mu)
Discrete Distribution:
$E(X) = \sum_{i=1}^{n} x_i p_i$
E.g. fair die:
E(X) = (1/6)(1) + (1/6)(2) + ... + (1/6)(6)
= 3.5
Continuous Dist:
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
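Both versions of the expected value can be sketched numerically. The die calculation follows the slide directly; the continuous case uses a uniform density on [0, 2], which is a hypothetical example chosen here, and a simple midpoint rule in place of the exact integral.

```python
from fractions import Fraction

# Discrete case: E(X) = sum of x_i * p_i for a fair die.
die_mean = sum(x * Fraction(1, 6) for x in range(1, 7))
print("fair die E(X) =", die_mean, "=", float(die_mean))

def expectation(pdf, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of x * f(x) over [lo, hi]."""
    width = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * width
        total += x * pdf(x) * width
    return total

# Hypothetical example density: uniform on [0, 2], so f(x) = 1/2 on that interval.
uniform_mean = expectation(lambda x: 0.5, 0.0, 2.0)
print("uniform[0, 2] E(X) ~", uniform_mean)
```

The integral is taken only over [0, 2] because the assumed density is zero elsewhere.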
Linearity of Expectation Operator
for any constants a and b,
E(aX + b) = aE(X) + b,
for any 2 RVs X, Y,
E(X + Y) = E(X) + E(Y)
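Both linearity rules can be checked exactly on the fair die. The constants a = 3, b = 10 and the second variable Y = X² are arbitrary choices made for this sketch; Y is deliberately a function of X, since the sum rule does not require independence.

```python
from fractions import Fraction

p = Fraction(1, 6)
outcomes = range(1, 7)  # fair die

def E(g):
    """Expected value of g(X) for a fair die X."""
    return sum(g(x) * p for x in outcomes)

a, b = 3, 10  # arbitrary constants chosen for illustration

# E(aX + b) versus a E(X) + b
lhs = E(lambda x: a * x + b)
rhs = a * E(lambda x: x) + b
print(lhs, rhs)

# E(X + Y) versus E(X) + E(Y), with Y = X**2 depending on X
sum_lhs = E(lambda x: x + x**2)
sum_rhs = E(lambda x: x) + E(lambda x: x**2)
print(sum_lhs, sum_rhs)
```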
Variance
Var(X) or $\sigma_x^2$ ($\sigma$ = sigma)
$= E[(X - \mu_x)^2]$
$= \sum_{i=1}^{n} (x_i - \mu_x)^2 p_i$ or $\int_{-\infty}^{\infty} (x - \mu_x)^2 f(x)\,dx$
has units of $X^2$
e.g. X ~ [$], Var(X) ~ [$²] (!)
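The variance formula and the units point can be illustrated with the fair die. Treating the die faces as dollar amounts and converting them to cents is a made-up example; the helper names `mean` and `variance` are likewise introduced only for this sketch.

```python
from fractions import Fraction

p = Fraction(1, 6)            # fair die: p_i = 1/6
values = range(1, 7)

def mean(xs):
    """E(X) = sum of x_i * p_i."""
    return sum(x * p for x in xs)

def variance(xs):
    """Var(X) = sum of (x_i - mu)^2 * p_i."""
    m = mean(xs)
    return sum((x - m) ** 2 * p for x in xs)

mu = mean(values)
var = variance(values)
sd = float(var) ** 0.5
print(mu, var, round(sd, 4))

# Rescaling X from dollars to cents multiplies the variance by 100**2,
# because variance carries the units of X squared.
cents = [100 * x for x in values]
var_cents = variance(cents)
print(var_cents / var)
```

For the fair die the variance works out to 35/12. The standard deviation is its square root and is back in the original units of X.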
Standardized RVs “Z-scores”
Suppose that for some RV X,
$E(X) = \mu$,
$Var(X) = \sigma^2$
so $s.d.(X) = \sigma$.
Then for $Z = (X - \mu) / \sigma$,
E(Z) = 0,
Var(Z) = s.d.(Z) = 1.
Also,
$F_X(x) = P(X \le x)$
$= P(Z \le (x - \mu)/\sigma)$
$= F_Z(z)$ for $z = (x - \mu)/\sigma$
so that $F_Z(z)$ tells us $F_X(x)$.
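A small sketch of both claims follows. The first part checks E(Z) = 0 and Var(Z) = 1 exactly for the fair die. The second part assumes a normal X with hypothetical parameters (mean 50, s.d. 10) to show that looking up z in the standard CDF reproduces F_X(x).

```python
from fractions import Fraction
from statistics import NormalDist

# Part 1: E(Z) = 0 and Var(Z) = 1, checked exactly for a fair die.
p = Fraction(1, 6)
values = range(1, 7)
mu = sum(x * p for x in values)
var = sum((x - mu) ** 2 * p for x in values)
# E(Z) = E(X - mu) / sigma, so it is zero whenever E(X - mu) is zero.
mean_dev = sum((x - mu) * p for x in values)
# Var(Z) = E[(X - mu)^2] / sigma^2, computed without taking a square root.
var_z = sum((x - mu) ** 2 / var * p for x in values)
print(mean_dev, var_z)

# Part 2: F_X(x) = F_Z(z), assuming a normal X with hypothetical parameters.
m, s = 50.0, 10.0
X = NormalDist(m, s)
Z = NormalDist(0.0, 1.0)
for x in [35.0, 50.0, 62.5]:
    z = (x - m) / s
    print(f"x={x}  z={z}  F_X(x)={X.cdf(x):.6f}  F_Z(z)={Z.cdf(z):.6f}")
```

The last two columns should match, which is why a single table of the standardized CDF is enough for any member of the same family.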