
Copyright 1996 Lawrence C. Marsh

PowerPoint Slides for

Undergraduate Econometrics

by Lawrence C. Marsh

To accompany: Undergraduate Econometrics by R. Carter Hill, William E. Griffiths and George G. Judge

Publisher: John Wiley & Sons, 1997

Copyright 1996 Lawrence C. Marsh

The Role of Econometrics

in Economic Analysis

Chapter 1

Copyright © 1997 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond that permitted in Section 117 of the 1976 United States Copyright Act without the express written permission of the copyright owner is unlawful. Request for further information should be addressed to the Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution or resale. The Publisher assumes no responsibility for errors, omissions, or damages, caused by the use of these programs or from the use of the information contained herein.

1.1

Copyright 1996 Lawrence C. Marsh

Using Information:

1. Information from economic theory.

2. Information from economic data.

The Role of Econometrics 1.2

Copyright 1996 Lawrence C. Marsh

Understanding Economic Relationships:

federal budget
Dow-Jones Stock Index
trade deficit
Federal Reserve Discount Rate
capital gains tax
rent control laws
short term treasury bills
power of labor unions
crime rate
inflation
unemployment
money supply

1.3

Copyright 1996 Lawrence C. Marsh

economic theory

economic data } economic decisions

To use information effectively:

*Econometrics* helps us combine economic theory and economic data.

Economic Decisions 1.4

Copyright 1996 Lawrence C. Marsh

Consumption, c, is some function of income, i :

c = f(i)

For applied econometric analysis this consumption function must be specified more precisely.

The Consumption Function 1.5

Copyright 1996 Lawrence C. Marsh

demand, qd, for an individual commodity:

qd = f( p, pc, ps, i )

supply, qs, of an individual commodity:

qs = f( p, pc, pf )

demand: p = own price; pc = price of complements; ps = price of substitutes; i = income

supply: p = own price; pc = price of competitive products; pf = price of factor inputs

1.6

Copyright 1996 Lawrence C. Marsh

Listing the variables in an economic relationship is not enough.

For effective policy we must know the amount of change needed for a policy instrument to bring about the desired effect:

How much?

• By how much should the Federal Reserve raise interest rates to prevent inflation?

• By how much can the price of football tickets be increased and still fill the stadium?

1.7

Copyright 1996 Lawrence C. Marsh

Answering the How Much? question

Need to estimate parameters that are both:

1. unknown and

2. unobservable

1.8

Copyright 1996 Lawrence C. Marsh

Average or systematic behavior over many individuals or many firms.

Not a single individual or single firm.

Economists are concerned with the unemployment rate and not whether a particular individual gets a job.

The Statistical Model 1.9

Copyright 1996 Lawrence C. Marsh

The Statistical Model

Actual vs. Predicted Consumption:

Actual = systematic part + random error

Systematic part provides prediction, f(i), but actual will miss by random error, e.

Consumption, c, is function, f, of income, i, with error, e:

c = f(i) + e

1.10

Copyright 1996 Lawrence C. Marsh

c = f(i) + e

Need to define f(i) in some way.

To make consumption, c, a linear function of income, i:

f(i) = β1 + β2 i

The statistical model then becomes:

c = β1 + β2 i + e

The Consumption Function 1.11

Copyright 1996 Lawrence C. Marsh

• Dependent variable, y, is the focus of study (predict or explain changes in the dependent variable).

• Explanatory variables, X2 and X3, help us explain observed changes in the dependent variable.

y = β1 + β2 X2 + β3 X3 + e

The Econometric Model 1.12

Copyright 1996 Lawrence C. Marsh

Statistical Models

Controlled (experimental) vs.

Uncontrolled (observational)

Uncontrolled experiment (econometrics) explaining consumption, y: price, X2, and income, X3, vary at the same time.

Controlled experiment ("pure" science) explaining mass, y: pressure, X2, held constant when varying temperature, X3, and vice versa.

1.13

Copyright 1996 Lawrence C. Marsh

Econometric model

• economic model: economic variables and parameters.

• statistical model: sampling process with its parameters.

• data: observed values of the variables.

1.14

Copyright 1996 Lawrence C. Marsh

• Uncertainty regarding an outcome.
• Relationships suggested by economic theory.
• Assumptions and hypotheses to be specified.
• Sampling process including functional form.
• Obtaining data for the analysis.
• Estimation rule with good statistical properties.
• Fit and test model using software package.
• Analyze and evaluate implications of the results.
• Problems suggest approaches for further research.

The Practice of Econometrics 1.15

Copyright 1996 Lawrence C. Marsh

Note: the textbook uses the following symbol to mark sections with advanced material:

“Skippy”

1.16

Copyright 1996 Lawrence C. Marsh

Some Basic Probability Concepts

Chapter 2

Copyright © 1997 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond that permitted in Section 117 of the 1976 United States Copyright Act without the express written permission of the copyright owner is unlawful. Request for further information should be addressed to the Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution or resale. The Publisher assumes no responsibility for errors, omissions, or damages, caused by the use of these programs or from the use of the information contained herein.

2.1

Copyright 1996 Lawrence C. Marsh

random variable: A variable whose value is unknown until it is observed. The value of a random variable results from an experiment.

The term random variable implies the existence of some known or unknown probability distribution defined over the set of all possible values of that variable.

In contrast, an arbitrary variable does not have a probability distribution associated with its values.

Random Variable 2.2

Copyright 1996 Lawrence C. Marsh

Controlled experiment: values of explanatory variables are chosen with great care in accordance with an appropriate experimental design.

Uncontrolled experiment: values of explanatory variables consist of nonexperimental observations over which the analyst has no control.

2.3

Copyright 1996 Lawrence C. Marsh

discrete random variable: A discrete random variable can take only a finite number of values, which can be counted by using the positive integers.

Example: Prize money from the following lottery is a discrete random variable:

first prize: $1,000; second prize: $50; third prize: $5.75

since it has only four (a finite number; count: 1, 2, 3, 4) possible outcomes:

$0.00; $5.75; $50.00; $1,000.00

Discrete Random Variable 2.4

Copyright 1996 Lawrence C. Marsh

continuous random variable: A continuous random variable can take any real value (not just whole numbers) in at least one interval on the real line.

Examples: gross national product (GNP), money supply, interest rates, price of eggs, household income, expenditure on clothing

Continuous Random Variable 2.5

Copyright 1996 Lawrence C. Marsh

A discrete random variable that is restricted to two possible values (usually 0 and 1) is called a dummy variable (also, binary or indicator variable).

Dummy variables account for qualitative differences: gender (0=male, 1=female), race (0=white, 1=nonwhite), citizenship (0=U.S., 1=not U.S.), income class (0=poor, 1=rich).

Dummy Variable 2.6

Copyright 1996 Lawrence C. Marsh

A list of all of the possible values taken by a discrete random variable along with their chances of occurring is called a probability function or probability density function (pdf).

die         x    f(x)
one dot     1    1/6
two dots    2    1/6
three dots  3    1/6
four dots   4    1/6
five dots   5    1/6
six dots    6    1/6

2.7

Copyright 1996 Lawrence C. Marsh

A discrete random variable X has pdf, f(x), which is the probability that X takes on the value x.

f(x) = P(X = x)

Therefore, 0 ≤ f(x) ≤ 1.

If X takes on the n values x1, x2, . . . , xn, then f(x1) + f(x2) + . . . + f(xn) = 1.

2.8

Copyright 1996 Lawrence C. Marsh

Probability, f(x), for a discrete random variable, X, can be represented by height:

[Bar chart: f(x) plotted as bar heights for X = 0, 1, 2, 3, where X is the number, among three roommates, on the Dean's List; the bar heights shown are 0.2, 0.4, 0.1 and 0.3.]

Copyright 1996 Lawrence C. Marsh

A continuous random variable uses area under a curve rather than the height, f(x), to represent probability:

[Density curve for per capita income, X, in the United States, with reference points at $34,000 and $55,000; the shaded areas under the curve, 0.8676 and 0.1324, represent probabilities.]

Copyright 1996 Lawrence C. Marsh

Since a continuous random variable has an uncountably infinite number of values, the probability of one occurring is zero.

P [ X = a ] = P [ a < X < a ] = 0

Probability is represented by area.

Height alone has no area.

An interval for X is needed to get an area under the curve.

2.11

Copyright 1996 Lawrence C. Marsh

The area under a curve is the integral of the equation that generates the curve:

P[ a < X < b ] = ∫_a^b f(x) dx

For continuous random variables it is the integral of f(x), and not f(x) itself, which defines the area and, therefore, the probability.

2.12

Copyright 1996 Lawrence C. Marsh

Rules of Summation (all sums run over i = 1, …, n)

Rule 1: Σ xi = x1 + x2 + . . . + xn

Rule 2: Σ a·xi = a Σ xi

Rule 3: Σ (xi + yi) = Σ xi + Σ yi

Note that summation is a linear operator, which means it operates term by term.

Rules of Summation 2.13

Copyright 1996 Lawrence C. Marsh

Rules of Summation (continued)

Rule 4: Σ (a·xi + b·yi) = a Σ xi + b Σ yi

Rule 5: x̄ = (1/n) Σ xi = (x1 + x2 + . . . + xn)/n

The definition of x̄ as given in Rule 5 implies the following important fact:

Σ (xi − x̄) = 0

2.14

Copyright 1996 Lawrence C. Marsh

Rule 6: Σ f(xi) = f(x1) + f(x2) + . . . + f(xn)

Notation: Σx f(xi) = Σi f(xi) = Σ (i = 1, …, n) f(xi)

Rule 7: Σi Σj f(xi, yj) = Σi [ f(xi, y1) + f(xi, y2) + . . . + f(xi, ym) ]   (i = 1, …, n; j = 1, …, m)

The order of summation does not matter:

Σi Σj f(xi, yj) = Σj Σi f(xi, yj)

Rules of Summation (continued) 2.15

Copyright 1996 Lawrence C. Marsh

The mean or arithmetic average of a random variable is its mathematical expectation or expected value, E[X].

The Mean of a Random Variable

2.16

Copyright 1996 Lawrence C. Marsh

Expected Value

There are two entirely different, but mathematicallyequivalent, ways of determining the expected value:

1. Empirically: The expected value of a random variable, X, is the average value of the random variable in an infinite number of repetitions of the experiment.

In other words, draw an infinite number of samples, and average the values of X that you get.

2.17

Copyright 1996 Lawrence C. Marsh

Expected Value

2. Analytically: The expected value of a discrete random variable, X, is determined by weighting all the possible values of X by the corresponding probability density function values, f(x), and summing them up.

In other words:

E[X] = x1 f(x1) + x2 f(x2) + . . . + xn f(xn)

2.18

Copyright 1996 Lawrence C. Marsh

In the empirical case when the sample goes to infinity the values of X occur with a frequency equal to the corresponding f(x) in the analytical expression.

As sample size goes to infinity, the empirical and analytical methods will produce the same value.

Empirical vs. Analytical 2.19

Copyright 1996 Lawrence C. Marsh

Empirical (sample) mean:

x̄ = (1/n) Σ xi    (sum over i = 1, …, n)

where n is the number of sample observations.

Analytical mean:

E[X] = Σ xi f(xi)    (sum over i = 1, …, n)

where n is the number of possible values of xi.

Notice how the meaning of n changes.

2.20

Copyright 1996 Lawrence C. Marsh

The expected value of X:

E[X] = Σ xi f(xi)

The expected value of X-squared:

E[X²] = Σ xi² f(xi)

The expected value of X-cubed:

E[X³] = Σ xi³ f(xi)

It is important to notice that f(xi) does not change!

2.21

Copyright 1996 Lawrence C. Marsh

E[X] = 0(.1) + 1(.3) + 2(.3) + 3(.2) + 4(.1) = 1.9

E[X²] = 0²(.1) + 1²(.3) + 2²(.3) + 3²(.2) + 4²(.1) = 0 + .3 + 1.2 + 1.8 + 1.6 = 4.9

E[X³] = 0³(.1) + 1³(.3) + 2³(.3) + 3³(.2) + 4³(.1) = 0 + .3 + 2.4 + 5.4 + 6.4 = 14.5

2.22
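A minimal sketch of the calculation on the slide above, using the same pdf (X = 0,…,4 with probabilities .1, .3, .3, .2, .1):

```python
# Discrete expected values E[X], E[X^2], E[X^3] for the slide's example pdf.
x = [0, 1, 2, 3, 4]
f = [0.1, 0.3, 0.3, 0.2, 0.1]

EX  = sum(xi * fi for xi, fi in zip(x, f))      # E[X]   = 1.9
EX2 = sum(xi**2 * fi for xi, fi in zip(x, f))   # E[X^2] = 4.9
EX3 = sum(xi**3 * fi for xi, fi in zip(x, f))   # E[X^3] = 14.5
print(EX, EX2, EX3)
```

Note that only the values of X are powered; the weights f(xi) never change.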

Copyright 1996 Lawrence C. Marsh

E[g(X)] = Σ g(xi) f(xi)    (sum over i = 1, …, n)

If g(X) = g1(X) + g2(X), then

E[g(X)] = Σ [ g1(xi) + g2(xi) ] f(xi)

E[g(X)] = Σ g1(xi) f(xi) + Σ g2(xi) f(xi)

E[g(X)] = E[g1(X)] + E[g2(X)]

2.23

Copyright 1996 Lawrence C. Marsh

Adding and Subtracting Random Variables

E(X-Y) = E(X) - E(Y)

E(X+Y) = E(X) + E(Y)

2.24

Copyright 1996 Lawrence C. Marsh

E(X+a) = E(X) + a

Adding a constant to a variable willadd a constant to its expected value:

Multiplying by constant will multiply its expected value by that constant:

E(bX) = b E(X)

2.25

Copyright 1996 Lawrence C. Marsh

var(X) = average squared deviations around the mean of X.

var(X) = expected value of the squared deviations around the expected value of X.

var(X) = E[(X − EX)²]

Variance 2.26

Copyright 1996 Lawrence C. Marsh

var(X) = E[(X − EX)²]

= E[X² − 2X·EX + (EX)²]

= E(X²) − 2 EX·EX + (EX)²

= E(X²) − 2(EX)² + (EX)²

= E(X²) − (EX)²

var(X) = E[(X − EX)²]

var(X) = E(X²) − (EX)²

2.27

Copyright 1996 Lawrence C. Marsh

variance of a discrete random variable, X:

var(X) = Σ (xi − EX)² f(xi)    (sum over i = 1, …, n)

standard deviation is the square root of the variance

2.28

Copyright 1996 Lawrence C. Marsh

calculate the variance for a discrete random variable, X:

xi   f(xi)   (xi − EX)         (xi − EX)² f(xi)
2    .1      2 − 4.3 = −2.3    5.29 (.1) = .529
3    .3      3 − 4.3 = −1.3    1.69 (.3) = .507
4    .1      4 − 4.3 =  −.3     .09 (.1) = .009
5    .2      5 − 4.3 =   .7     .49 (.2) = .098
6    .3      6 − 4.3 =  1.7    2.89 (.3) = .867

EX = Σ xi f(xi) = .2 + .9 + .4 + 1.0 + 1.8 = 4.3

var(X) = Σ (xi − EX)² f(xi) = .529 + .507 + .009 + .098 + .867 = 2.01

2.29
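A short sketch that reproduces the table above (values taken from the slide):

```python
# Variance of a discrete random variable: var(X) = sum of (xi - EX)^2 * f(xi).
x = [2, 3, 4, 5, 6]
f = [0.1, 0.3, 0.1, 0.2, 0.3]

EX  = sum(xi * fi for xi, fi in zip(x, f))               # 4.3
var = sum((xi - EX)**2 * fi for xi, fi in zip(x, f))     # 2.01
sd  = var ** 0.5                                         # standard deviation
print(EX, var, sd)
```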

Copyright 1996 Lawrence C. Marsh

Z = a + cX

var(Z) = var(a + cX)

= E[ ((a + cX) − E(a + cX))² ]

= c² var(X)

var(a + cX) = c² var(X)

2.30

Copyright 1996 Lawrence C. Marsh

A joint probability density function, f(x,y), provides the probabilities associated with the joint occurrence of all of the possible pairs of X and Y.

Joint pdf 2.31

Copyright 1996 Lawrence C. Marsh

college gradsin household

.15

.05

.45

.35

joint pdff(x,y)

Y = 1 Y = 2

vacationhomesowned

X = 0

X = 1

Survey of College City, NY

f(0,1) f(0,2)

f(1,1) f(1,2)

2.32

Copyright 1996 Lawrence C. Marsh

Calculating the expected value of functions of two random variables:

E[g(X,Y)] = Σi Σj g(xi, yj) f(xi, yj)

E(XY) = Σi Σj xi yj f(xi, yj)

E(XY) = (0)(1)(.45) + (0)(2)(.15) + (1)(1)(.05) + (1)(2)(.35) = .75

2.33

Copyright 1996 Lawrence C. Marsh

The marginal probability density functions, f(x) and f(y), for discrete random variables can be obtained by summing f(x,y) over the values of Y to obtain f(x), and over the values of X to obtain f(y).

f(xi) = Σj f(xi, yj)        f(yj) = Σi f(xi, yj)

Marginal pdf 2.34

Copyright 1996 Lawrence C. Marsh

              Y = 1   Y = 2   marginal pdf for X:
X = 0          .45     .15    f(X = 0) = .60
X = 1          .05     .35    f(X = 1) = .40

marginal pdf for Y:   f(Y = 1) = .50    f(Y = 2) = .50

2.35

Copyright 1996 Lawrence C. Marsh

The conditional probability density functions of X given Y = y, f(x|y), and of Y given X = x, f(y|x), are obtained by dividing f(x,y) by f(y) to get f(x|y) and by f(x) to get f(y|x).

f(x|y) = f(x,y) / f(y)        f(y|x) = f(x,y) / f(x)

Conditional pdf 2.36

Copyright 1996 Lawrence C. Marsh

conditional pdfs computed from the joint and marginal pdfs:

f(Y=1|X=0) = .45/.60 = .75      f(Y=2|X=0) = .15/.60 = .25
f(Y=1|X=1) = .05/.40 = .125     f(Y=2|X=1) = .35/.40 = .875
f(X=0|Y=1) = .45/.50 = .90      f(X=1|Y=1) = .05/.50 = .10
f(X=0|Y=2) = .15/.50 = .30      f(X=1|Y=2) = .35/.50 = .70

2.37

Copyright 1996 Lawrence C. Marsh

X and Y are independent random variables if their joint pdf, f(x,y), is the product of their respective marginal pdfs, f(x) and f(y).

f(xi, yj) = f(xi) f(yj); for independence this must hold for all pairs of i and j

Independence 2.38

Copyright 1996 Lawrence C. Marsh

not independent:

              Y = 1   Y = 2   marginal pdf for X:
X = 0          .45     .15    f(X = 0) = .60
X = 1          .05     .35    f(X = 1) = .40

marginal pdf for Y:   f(Y = 1) = .50    f(Y = 2) = .50

The products of the marginals show the numbers required for independence:

.50 × .60 = .30     .50 × .60 = .30
.50 × .40 = .20     .50 × .40 = .20

Since these do not equal the joint probabilities, X and Y are not independent.

2.39

Copyright 1996 Lawrence C. Marsh

The covariance between two random variables, X and Y, measures the linear association between them.

cov(X,Y) = E[(X − EX)(Y − EY)]

Note that variance is a special case of covariance:

cov(X,X) = var(X) = E[(X − EX)²]

Covariance2.40

Copyright 1996 Lawrence C. Marsh

cov(X,Y) = E[(X − EX)(Y − EY)]

= E[XY − X·EY − Y·EX + EX·EY]

= E(XY) − EX·EY − EY·EX + EX·EY

= E(XY) − 2 EX·EY + EX·EY

= E(XY) − EX·EY

cov(X,Y) = E(XY) − EX·EY

2.41

Copyright 1996 Lawrence C. Marsh

covariance (using the joint pdf above):

              Y = 1   Y = 2
X = 0          .45     .15    f(X = 0) = .60
X = 1          .05     .35    f(X = 1) = .40
               .50     .50

EX = 0(.60) + 1(.40) = .40

EY = 1(.50) + 2(.50) = 1.50

E(XY) = (0)(1)(.45) + (0)(2)(.15) + (1)(1)(.05) + (1)(2)(.35) = .75

EX·EY = (.40)(1.50) = .60

cov(X,Y) = E(XY) − EX·EY = .75 − (.40)(1.50) = .75 − .60 = .15

2.42

Copyright 1996 Lawrence C. Marsh

The correlation between two random variables X and Y is their covariance divided by the square roots of their respective variances.

Correlation is a pure number falling between −1 and 1.

ρ(X,Y) = cov(X,Y) / √[ var(X) var(Y) ]

Correlation2.43

Copyright 1996 Lawrence C. Marsh

correlation (using the joint pdf above):

EX = .40      EY = 1.50      cov(X,Y) = .15

E(X²) = 0²(.60) + 1²(.40) = .40

var(X) = E(X²) − (EX)² = .40 − (.40)² = .24

E(Y²) = 1²(.50) + 2²(.50) = .50 + 2.0 = 2.50

var(Y) = E(Y²) − (EY)² = 2.50 − (1.50)² = .25

ρ(X,Y) = cov(X,Y) / √[ var(X) var(Y) ] = .15 / √[(.24)(.25)] = .61

2.44
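A compact sketch of the covariance and correlation calculations above, using the same joint pdf from the College City survey example:

```python
import numpy as np

# Joint pdf: rows X = 0, 1; columns Y = 1, 2 (values from the slides).
f = np.array([[0.45, 0.15],
              [0.05, 0.35]])
x_vals = np.array([0, 1])
y_vals = np.array([1, 2])

fx = f.sum(axis=1)                      # marginal pdf of X: [.60, .40]
fy = f.sum(axis=0)                      # marginal pdf of Y: [.50, .50]

EX, EY = x_vals @ fx, y_vals @ fy       # .40 and 1.50
EXY = sum(x * y * f[i, j]
          for i, x in enumerate(x_vals)
          for j, y in enumerate(y_vals))          # .75

cov = EXY - EX * EY                                # .15
var_x = (x_vals**2) @ fx - EX**2                   # .24
var_y = (y_vals**2) @ fy - EY**2                   # .25
rho = cov / np.sqrt(var_x * var_y)                 # about .61
print(cov, rho)
```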

Copyright 1996 Lawrence C. Marsh

Independent random variables have zero covariance and, therefore, zero correlation.

The converse is not true.

Zero Covariance & Correlation2.45

Copyright 1996 Lawrence C. Marsh

The expected value of the weighted sumof random variables is the sum of the expectations of the individual terms.

Since expectation is a linear operator,it can be applied term by term.

E[c1X + c2Y] = c1EX + c2EY

E[c1X1+...+ cnXn] = c1EX1+...+ cnEXn

In general, for random variables X1, . . . , Xn :

2.46

Copyright 1996 Lawrence C. Marsh

The variance of a weighted sum of random variables is the sum of the variances, each times the square of the weight, plus twice the covariances of all the random variables times the products of their weights.

Weighted sum of random variables:

var(c1X + c2Y) = c1² var(X) + c2² var(Y) + 2 c1 c2 cov(X,Y)

Weighted difference of random variables:

var(c1X − c2Y) = c1² var(X) + c2² var(Y) − 2 c1 c2 cov(X,Y)

2.47

Copyright 1996 Lawrence C. Marsh

The Normal Distribution

Y ~ N(μ, σ²)

f(y) = (1 / √(2πσ²)) exp[ −(y − μ)² / (2σ²) ]

2.48

Copyright 1996 Lawrence C. Marsh

The Standardized Normal

Z = (y − μ) / σ

Z ~ N(0, 1)

f(z) = (1 / √(2π)) exp[ −z² / 2 ]

2.49

Copyright 1996 Lawrence C. Marsh

For Y ~ N(μ, σ²):

P[ Y > a ] = P[ (Y − μ)/σ > (a − μ)/σ ] = P[ Z > (a − μ)/σ ]

2.50

Copyright 1996 Lawrence C. Marsh

For Y ~ N(μ, σ²):

P[ a < Y < b ] = P[ (a − μ)/σ < (Y − μ)/σ < (b − μ)/σ ]

= P[ (a − μ)/σ < Z < (b − μ)/σ ]

2.51

Copyright 1996 Lawrence C. Marsh

Linear combinations of jointly normally distributed random variables are themselves normally distributed:

Y1 ~ N(μ1, σ1²), Y2 ~ N(μ2, σ2²), . . . , Yn ~ N(μn, σn²)

W = c1Y1 + c2Y2 + . . . + cnYn

W ~ N[ E(W), var(W) ]

2.52

Copyright 1996 Lawrence C. Marsh

Chi-Square

If Z1, Z2, . . . , Zm denote m independent N(0,1) random variables, and V = Z1² + Z2² + . . . + Zm², then V ~ χ²(m).

V is chi-square with m degrees of freedom.

mean: E[V] = E[χ²(m)] = m

variance: var[V] = var[χ²(m)] = 2m

2.53

Copyright 1996 Lawrence C. Marsh

If Z ~ N(0,1) and V ~ χ²(m), and if Z and V are independent, then

t = Z / √(V/m) ~ t(m)

t is Student-t with m degrees of freedom.

mean: E[t(m)] = 0 (symmetric about zero)

variance: var[t(m)] = m / (m − 2)

Student - t 2.54

Copyright 1996 Lawrence C. Marsh

If V1 ~ χ²(m1) and V2 ~ χ²(m2) and if V1 and V2 are independent, then

F = (V1/m1) / (V2/m2) ~ F(m1, m2)

F is an F statistic with m1 numerator degrees of freedom and m2 denominator degrees of freedom.

F Statistic 2.55

Copyright 1996 Lawrence C. Marsh

The Simple Linear Regression

Model

Chapter 3


3.1

Copyright 1996 Lawrence C. Marsh

1. Estimate a relationship among economic   variables, such as y = f(x).

2. Forecast or predict the value of one    variable, y, based on the value of   another variable, x.

Purpose of Regression Analysis

3.2

Copyright 1996 Lawrence C. Marsh

Weekly Food Expenditures

y = dollars spent each week on food items.

x = consumer’s weekly income.

The relationship between x and the expected value of y , given x, might be linear:

E(y|x) = β1 + β2 x

3.3

Copyright 1996 Lawrence C. Marsh

Figure 3.1a Probability distribution f(y|x=480) of food expenditures given income x = $480.

3.4

Copyright 1996 Lawrence C. Marsh

Figure 3.1b Probability distributions of food expenditures given income x = $480 and x = $800.

3.5

Copyright 1996 Lawrence C. Marsh

Figure 3.2 The Economic Model: a linear relationship between average expenditure on food and income.

E(y|x) = β1 + β2 x, where β1 is the intercept and the slope is β2 = ΔE(y|x)/Δx.

3.6

Copyright 1996 Lawrence C. Marsh

Figure 3.3 The probability density function for yt at two levels of household income, x1 = 480 and x2 = 800 (expenditure plotted against income).

Homoskedastic Case

3.7

Copyright 1996 Lawrence C. Marsh

Figure 3.3+ The variance of yt increases as household income, xt, increases (expenditure plotted against income).

Heteroskedastic Case

3.8

Copyright 1996 Lawrence C. Marsh

Assumptions of the Simple Linear Regression Model - I

1. The average value of y, given x, is given by the linear regression: E(y) = β1 + β2 x

2. For each value of x, the values of y are distributed around their mean with variance: var(y) = σ²

3. The values of y are uncorrelated, having zero covariance and thus no linear relationship: cov(yi, yj) = 0

4. The variable x must take at least two different values, so that x ≠ c, where c is a constant.

3.9

Copyright 1996 Lawrence C. Marsh

One more assumption that is often used in practice but is not required for least squares:

5. (optional) The values of y are normally distributed about their mean for each value of x:

y ~ N[ (β1 + β2 x), σ² ]

3.10

Copyright 1996 Lawrence C. Marsh

The Error Term

y is a random variable composed of two parts:

I. Systematic component: E(y) = β1 + β2 x. This is the mean of y.

II. Random component: e = y − E(y) = y − β1 − β2 x. This is called the random error.

Together E(y) and e form the model: y = β1 + β2 x + e

3.11

Copyright 1996 Lawrence C. Marsh

Figure 3.5 The relationship among y, e and the true regression line, E(y) = β1 + β2 x. At each xt the observation yt deviates from the line by the error et.

3.12

Copyright 1996 Lawrence C. Marsh

Figure 3.7a The relationship among y, ê and the fitted regression line, ŷ = b1 + b2 x. At each xt the observation yt deviates from the fitted value ŷt by the residual êt.

3.13

Copyright 1996 Lawrence C. Marsh

Figure 3.7b The sum of squared residuals from any other line, y = b1* + b2* x, will be larger than that from the fitted line ŷ = b1 + b2 x.

3.14

Copyright 1996 Lawrence C. Marsh

Figure 3.4 Probability density functions for e (centered at 0) and y (centered at β1 + β2 x).

3.15

Copyright 1996 Lawrence C. Marsh

The Error Term Assumptions

1. The value of y, for each value of x, is y = β1 + β2 x + e

2. The average value of the random error e is: E(e) = 0

3. The variance of the random error e is: var(e) = σ² = var(y)

4. The covariance between any pair of e's is: cov(ei, ej) = cov(yi, yj) = 0

5. x must take at least two different values so that x ≠ c, where c is a constant.

6. (optional) e is normally distributed with mean 0 and variance σ²: e ~ N(0, σ²)

3.16

Copyright 1996 Lawrence C. Marsh

Unobservable Nature of the Error Term

1. Unspecified factors / explanatory variables, not in the model, may be in the error term.

2. Approximation error is in the error term if relationship between y and x is not exactly  a perfectly linear relationship.

3. Strictly unpredictable random behavior that may be unique to that observation is in error.

3.17

Copyright 1996 Lawrence C. Marsh

Population regression values: yt = β1 + β2 xt + et

Population regression line: E(yt|xt) = β1 + β2 xt

Sample regression values: yt = b1 + b2 xt + êt

Sample regression line: ŷt = b1 + b2 xt

3.18

Copyright 1996 Lawrence C. Marsh

yt = β1 + β2 xt + et

et = yt − β1 − β2 xt

Minimize the error sum of squared deviations:

S(β1, β2) = Σ (yt − β1 − β2 xt)²    (sum over t = 1, …, T)    (3.3.4)

3.19

Copyright 1996 Lawrence C. Marsh

Minimize w.r.t. β1 and β2:

S(β1, β2) = Σ (yt − β1 − β2 xt)²    (3.3.4)

∂S/∂β1 = −2 Σ (yt − β1 − β2 xt)

∂S/∂β2 = −2 Σ xt (yt − β1 − β2 xt)

Set each of these two derivatives equal to zero and solve these two equations for the two unknowns, β1 and β2.

3.20

Copyright 1996 Lawrence C. Marsh

Minimize w.r.t. β1 and β2:

S(β) = Σ (yt − β1 − β2 xt)²    (sum over t = 1, …, T)

At the minimizing values bi, the derivative of S with respect to each βi passes from negative (∂S/∂βi < 0) through zero (∂S/∂βi = 0) to positive (∂S/∂βi > 0).

3.21

Copyright 1996 Lawrence C. Marsh

To minimize S(.), you set the two derivatives equal to zero to get:

∂S/∂b1 = −2 Σ (yt − b1 − b2 xt) = 0

∂S/∂b2 = −2 Σ xt (yt − b1 − b2 xt) = 0

When these two terms are set to zero, β1 and β2 become b1 and b2 because they no longer represent just any values of β1 and β2 but the special values that correspond to the minimum of S(.).

3.22

Copyright 1996 Lawrence C. Marsh

−2 Σ (yt − b1 − b2 xt) = 0

−2 Σ xt (yt − b1 − b2 xt) = 0

Σ yt − T b1 − b2 Σ xt = 0

Σ xt yt − b1 Σ xt − b2 Σ xt² = 0

T b1 + b2 Σ xt = Σ yt

b1 Σ xt + b2 Σ xt² = Σ xt yt

3.23

Copyright 1996 Lawrence C. Marsh

Solve for b1 and b2 using the definitions of x̄ and ȳ:

T b1 + b2 Σ xt = Σ yt

b1 Σ xt + b2 Σ xt² = Σ xt yt

b2 = [ T Σ xt yt − Σ xt Σ yt ] / [ T Σ xt² − (Σ xt)² ]

b1 = ȳ − b2 x̄

3.24
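A minimal sketch of these two formulas applied to made-up (x, y) data (the data values are illustrative only, not from the textbook):

```python
import numpy as np

# Simple least squares slope and intercept from the normal-equation solution.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
T = len(y)

b2 = (T * np.sum(x * y) - np.sum(x) * np.sum(y)) / (T * np.sum(x**2) - np.sum(x)**2)
b1 = y.mean() - b2 * x.mean()
print(b1, b2)   # intercept and slope estimates
```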

Copyright 1996 Lawrence C. Marsh

elasticities

η = (percentage change in y) / (percentage change in x) = (Δy/y) / (Δx/x) = (Δy/Δx)(x/y)

Using calculus, we can get the elasticity at a point:

η = lim (Δy/Δx)(x/y) as Δx → 0 = (∂y/∂x)(x/y)

3.25

Copyright 1996 Lawrence C. Marsh

applying elasticities

E(y) = β1 + β2 x

ΔE(y)/Δx = β2

η = [ΔE(y)/Δx] · [x / E(y)] = β2 x / E(y)

3.26

Copyright 1996 Lawrence C. Marsh

estimating elasticities

η̂ = (Δŷ/Δx)(x̄/ȳ) = b2 x̄/ȳ

ŷt = b1 + b2 xt = 4 + 1.5 xt

x̄ = 8 = average number of years of experience
ȳ = $10 = average wage rate

η̂ = b2 x̄/ȳ = 1.5 × 8/10 = 1.2

3.27

Copyright 1996 Lawrence C. Marsh

Prediction

Estimated regression equation:

ŷt = 4 + 1.5 xt

xt = years of experience
ŷt = predicted wage rate

If xt = 2 years, then ŷt = $7.00 per hour.

If xt = 3 years, then ŷt = $8.50 per hour.

3.28
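A trivial sketch of prediction with the estimated equation from this slide (the function name is just illustrative):

```python
# Predicted wage rate from the fitted line y-hat = 4 + 1.5 * x.
def predict_wage(years_experience):
    return 4 + 1.5 * years_experience

print(predict_wage(2))   # 7.0  -> $7.00 per hour
print(predict_wage(3))   # 8.5  -> $8.50 per hour
```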

Copyright 1996 Lawrence C. Marsh

log-log models

ln(y) = β1 + β2 ln(x)

∂ln(y)/∂x = β2 ∂ln(x)/∂x

(1/y)(∂y/∂x) = β2 (1/x)

3.29

Copyright 1996 Lawrence C. Marsh

(1/y)(∂y/∂x) = β2 (1/x)

∂y/∂x = β2 (y/x)

elasticity of y with respect to x:

η = (∂y/∂x)(x/y) = β2

3.30

Copyright 1996 Lawrence C. Marsh

Properties of Least Squares

Estimators

Chapter 4


4.1

Copyright 1996 Lawrence C. Marsh

yt = household weekly food expenditures

Simple Linear Regression Model

yt = β1 + β2 xt + εt

x t = household weekly income

For a given level of x t, the expected

level of food expenditures will be:

E(yt|xt) = β1 + β2 xt

4.2

Copyright 1996 Lawrence C. Marsh

1. yt = β1 + β2 xt + εt

2. E(εt) = 0 <=> E(yt) = β1 + β2 xt

3. var(εt) = σ² = var(yt)

4. cov(εi, εj) = cov(yi, yj) = 0

5. xt ≠ c for every observation

6. εt ~ N(0, σ²) <=> yt ~ N(β1 + β2 xt, σ²)

Assumptions of the SimpleLinear Regression Model

4.3

Copyright 1996 Lawrence C. Marsh

The population parameters β1 and β2 are unknown population constants.

The formulas that produce the sample estimates b1 and b2 are called the estimators of β1 and β2.

When b1 and b2 are used to represent the formulas rather than specific values, they are called estimators of β1 and β2, which are random variables because they differ from sample to sample.

4.4

Copyright 1996 Lawrence C. Marsh

• If the least squares estimators b1 and b2 are random variables, then what are their means, variances, covariances and probability distributions?

• Compare the properties of alternative estimators to the properties of the least squares estimators.

Estimators are Random Variables ( estimates are not )

4.5

Copyright 1996 Lawrence C. Marsh

The Expected Values of b1 and b2

The least squares formulas (estimators) in the simple regression case:

b2 = [ T Σ xt yt − Σ xt Σ yt ] / [ T Σ xt² − (Σ xt)² ]    (3.3.8a)

b1 = ȳ − b2 x̄    (3.3.8b)

where ȳ = Σ yt / T and x̄ = Σ xt / T

4.6

Copyright 1996 Lawrence C. Marsh

Substitute in yt = β1 + β2 xt + εt to get:

b2 = β2 + [ T Σ xt εt − Σ xt Σ εt ] / [ T Σ xt² − (Σ xt)² ]

The mean of b2 is:

E(b2) = β2 + [ T Σ xt E(εt) − Σ xt Σ E(εt) ] / [ T Σ xt² − (Σ xt)² ]

Since E(εt) = 0, then E(b2) = β2.

4.7

Copyright 1996 Lawrence C. Marsh

The result E(b2) = β2 means that the distribution of b2 is centered at β2.

Since the distribution of b2 is centered at β2, we say that b2 is an unbiased estimator of β2.

An Unbiased Estimator 4.8

Copyright 1996 Lawrence C. Marsh

The unbiasedness result on the previous slide assumes that we are using the correct model.

If the model is of the wrong form or is missing important variables, then E(εt) ≠ 0 and E(b2) ≠ β2.

Wrong Model Specification 4.9

Copyright 1996 Lawrence C. Marsh

Unbiased Estimator of the Intercept

In a similar manner, the estimator b1 of the intercept or constant term can be shown to be an unbiased estimator of β1 when the model is correctly specified:

E(b1) = β1

4.10

Copyright 1996 Lawrence C. Marsh

Equivalent expressions for b2:

b2 = [ T Σ xt yt − Σ xt Σ yt ] / [ T Σ xt² − (Σ xt)² ]    (3.3.8a)

Expand and multiply top and bottom by T:

b2 = Σ (xt − x̄)(yt − ȳ) / Σ (xt − x̄)²    (4.2.6)

4.11

Copyright 1996 Lawrence C. Marsh

Variance of b2

Given that both yt and εt have variance σ², the variance of the estimator b2 is:

var(b2) = σ² / Σ (xt − x̄)²

b2 is a function of the yt values but var(b2) does not involve yt directly.

4.12

Copyright 1996 Lawrence C. Marsh

Variance of b1

Given b1 = ȳ − b2 x̄, the variance of the estimator b1 is:

var(b1) = σ² Σ xt² / [ T Σ (xt − x̄)² ]

4.13

Copyright 1996 Lawrence C. Marsh

Covariance of b1 and b2

cov(b1, b2) = −σ² x̄ / Σ (xt − x̄)²

If x̄ = 0, then cov(b1, b2) = 0: the slope estimate can change without affecting the intercept estimate.

4.14

Copyright 1996 Lawrence C. Marsh

What factors determine variance and covariance?

1. σ²: greater uncertainty about the yt values means greater uncertainty about b1, b2 and their relationship.

2. The more spread out the xt values are, the more confidence we have in b1, b2, etc.

3. The larger the sample size, T, the smaller the variances and covariances.

4. The variance of b1 is large when the (squared) xt values are far from zero (in either direction).

5. Changing the slope, b2, has no effect on the intercept, b1, when the sample mean is zero. But if the sample mean is positive, the covariance between b1 and b2 will be negative, and vice versa.

4.15

Copyright 1996 Lawrence C. Marsh

Gauss-Markov Theorem

Under the first five assumptions of the simple linear regression model, the ordinary least squares estimators b1 and b2 have the smallest variance of all linear and unbiased estimators of β1 and β2. This means that b1 and b2 are the Best Linear Unbiased Estimators (BLUE) of β1 and β2.

4.16

Copyright 1996 Lawrence C. Marsh

implications of Gauss-Markov

1. b1 and b2 are “best” within the class     of linear and unbiased estimators.

2. “Best” means smallest variance within the class of linear/unbiased.

3. All of the first five assumptions must hold to satisfy Gauss-Markov.

4. Gauss-Markov does not require assumption six: normality.

5. G-Markov is not based on the least    squares principle but on b1 and b2.

4.17

Copyright 1996 Lawrence C. Marsh

G-Markov implications (continued)

6. If we are not satisfied with restricting our estimation to the class of linear and unbiased estimators, we should ignore the Gauss-Markov Theorem and use some nonlinear and/or biased estimator instead. (Note: a biased or nonlinear estimator could have smaller variance than those satisfying Gauss-Markov.)

7. Gauss-Markov applies to the b1 and b2

estimators and not to particular sample values (estimates) of b1 and b2.

4.18

Copyright 1996 Lawrence C. Marsh

Probability Distribution of Least Squares Estimators

b1 ~ N( β1 , σ² Σ xt² / [ T Σ (xt − x̄)² ] )

b2 ~ N( β2 , σ² / Σ (xt − x̄)² )

4.19

Copyright 1996 Lawrence C. Marsh

When yt and εt are normally distributed, the least squares estimator of β2 can be expressed as a linear combination of the yt's:

b2 = Σ wt yt, where wt = (xt − x̄) / Σ (xt − x̄)²

b1 = ȳ − b2 x̄

This means that b1 and b2 are normal since linear combinations of normals are normal.

4.20

Copyright 1996 Lawrence C. Marsh

normally distributed under The Central Limit Theorem

If the first five Gauss-Markov assumptionshold, and sample size, T, is sufficiently large,then the least squares estimators, b1 and b2,have a distribution that approximates thenormal distribution with greater accuracythe larger the value of sample size, T.

4.21

Copyright 1996 Lawrence C. Marsh

Consistency

We would like our estimators, b1 and b2, to collapse onto the true population values, β1 and β2, as the sample size, T, goes to infinity.

One way to achieve this consistency property is for the variances of b1 and b2 to go to zero as T goes to infinity.

Since the formulas for the variances of the least squares estimators b1 and b2 show that their variances do, in fact, go to zero, b1 and b2 are consistent estimators of β1 and β2.

4.22

Copyright 1996 Lawrence C. Marsh

Estimating the variance of the error term, σ²

êt = yt − b1 − b2 xt

σ̂² = Σ êt² / (T − 2)    (sum over t = 1, …, T)

σ̂² is an unbiased estimator of σ².

4.23
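A short sketch of this estimator on made-up data (the numbers are illustrative; b1 and b2 are computed with the formulas from Chapter 3):

```python
import numpy as np

# Unbiased estimate of the error variance: sigma-hat^2 = sum of squared residuals / (T - 2).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.1, 5.8, 8.3, 9.9, 12.2])
T = len(y)

b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b1 = y.mean() - b2 * x.mean()
e_hat = y - b1 - b2 * x                      # least squares residuals
sigma2_hat = np.sum(e_hat**2) / (T - 2)      # divide by T - 2, not T
print(sigma2_hat)
```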

Copyright 1996 Lawrence C. Marsh

The Least Squares Predictor, ŷo

Given a value of the explanatory variable, xo, we would like to predict a value of the dependent variable, yo.

The least squares predictor is:

ŷo = b1 + b2 xo    (4.7.2)

4.24

Copyright 1996 Lawrence C. Marsh

Inference in the Simple

Regression Model

Chapter 5


5.1

Copyright 1996 Lawrence C. Marsh

1. yt = β1 + β2 xt + εt

2. E(εt) = 0 <=> E(yt) = β1 + β2 xt

3. var(εt) = σ² = var(yt)

4. cov(εi, εj) = cov(yi, yj) = 0

5. xt ≠ c for every observation

6. εt ~ N(0, σ²) <=> yt ~ N(β1 + β2 xt, σ²)

Assumptions of the Simple Linear Regression Model

5.2

Copyright 1996 Lawrence C. Marsh

Probability Distribution of Least Squares Estimators

b1 ~ N( β1 , σ² Σ xt² / [ T Σ (xt − x̄)² ] )

b2 ~ N( β2 , σ² / Σ (xt − x̄)² )

5.3

Copyright 1996 Lawrence C. Marsh

Error Variance Estimation

Unbiased estimator of the error variance:

σ̂² = Σ êt² / (T − 2)

Transform to a chi-square distribution:

(T − 2) σ̂² / σ² ~ χ²(T − 2)

Error Variance Estimation 5.4

Copyright 1996 Lawrence C. Marsh

We make a correct decision if:

• The null hypothesis is false and we decide to reject it.

• The null hypothesis is true and we decide not to reject it.

Our decision is incorrect if:

• The null hypothesis is true and we decide to reject it. This is a type I error.

• The null hypothesis is false and we decide not to reject it. This is a type II error.

5.5

Copyright 1996 Lawrence C. Marsh

b2 ~ N( β2 , σ² / Σ (xt − x̄)² )

Create a standardized normal random variable, Z, by subtracting the mean of b2 and dividing by its standard deviation:

Z = (b2 − β2) / √var(b2) ~ N(0,1)

5.6

Copyright 1996 Lawrence C. Marsh

Simple Linear Regression

yt = β1 + β2 xt + εt, where E(εt) = 0

Since E(yt) = β1 + β2 xt,

yt ~ N(β1 + β2 xt, σ²)

εt = yt − β1 − β2 xt

Therefore, εt ~ N(0, σ²).

5.7

Copyright 1996 Lawrence C. Marsh

Create a Chi-Square

εt ~ N(0, σ²), but we want N(0, 1):

εt / σ ~ N(0, 1), the standard normal.

(εt / σ)² ~ χ²(1), a chi-square.

5.8

Copyright 1996 Lawrence C. Marsh

Sum of Chi-Squares

Σ (εt / σ)² = (ε1 / σ)² + (ε2 / σ)² + . . . + (εT / σ)²    (sum over t = 1, …, T)

χ²(1) + χ²(1) + . . . + χ²(1) = χ²(T)

Therefore, Σ (εt / σ)² ~ χ²(T)

5.9

Copyright 1996 Lawrence C. Marsh

Since the errors εt = yt − β1 − β2 xt are not observable, we estimate them with the sample residuals êt = yt − b1 − b2 xt.

Unlike the errors, the sample residuals are not independent since they use up two degrees of freedom by using b1 and b2 to estimate β1 and β2.

We get only T − 2 degrees of freedom instead of T.

Chi-Square degrees of freedom5.10

Copyright 1996 Lawrence C. Marsh

Student-t Distribution

t = Z / √(V / m) ~ t(m)

where Z ~ N(0,1) and V ~ χ²(m).

5.11

Copyright 1996 Lawrence C. Marsh

t = Z / √[ V / (T − 2) ] ~ t(T−2)

where Z = (b2 − β2) / √var(b2)

and var(b2) = σ² / Σ (xi − x̄)²

5.12

Copyright 1996 Lawrence C. Marsh

t = Z / √[ V / (T − 2) ]

= [ (b2 − β2) / √var(b2) ] / √[ ( (T − 2) σ̂² / σ² ) / (T − 2) ]

where V = (T − 2) σ̂² / σ²

5.13

Copyright 1996 Lawrence C. Marsh

var(b2) = σ² / Σ (xi − x̄)²

t = [ (b2 − β2) / √( σ² / Σ (xi − x̄)² ) ] / √[ ( (T − 2) σ̂² / σ² ) / (T − 2) ]

= (b2 − β2) / √( σ̂² / Σ (xi − x̄)² )

Notice the cancellations: the unknown σ² drops out.

5.14

Copyright 1996 Lawrence C. Marsh

t = (b2 − β2) / √( σ̂² / Σ (xi − x̄)² ) = (b2 − β2) / √var̂(b2)

t = (b2 − β2) / se(b2)

5.15

Copyright 1996 Lawrence C. Marsh

Student's t-statistic

t = (b2 − β2) / se(b2) ~ t(T−2)

t has a Student-t distribution with T − 2 degrees of freedom.

5.16

Copyright 1996 Lawrence C. Marsh

Figure 5.1 Student-t distribution: for a two-sided test the rejection region consists of the two tails, each with area α/2, beyond the critical values −tc and tc.

5.17

Copyright 1996 Lawrence C. Marsh

probability statements

P(−tc ≤ t ≤ tc) = 1 − α

P(t < −tc) = P(t > tc) = α/2

P( −tc ≤ (b2 − β2)/se(b2) ≤ tc ) = 1 − α

5.18

Copyright 1996 Lawrence C. Marsh

Confidence Intervals

Two-sided (1 − α)×100% C.I. for β1:

[ b1 − tα/2 se(b1), b1 + tα/2 se(b1) ]

Two-sided (1 − α)×100% C.I. for β2:

[ b2 − tα/2 se(b2), b2 + tα/2 se(b2) ]

5.19
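A minimal sketch of the interval for β2 on made-up data (illustrative numbers only; the critical value comes from the Student-t distribution with T − 2 degrees of freedom):

```python
import numpy as np
from scipy import stats

# Two-sided (1 - alpha)x100% confidence interval for beta2.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.1, 5.8, 8.3, 9.9, 12.2])
T, alpha = len(y), 0.05

b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b1 = y.mean() - b2 * x.mean()
sigma2_hat = np.sum((y - b1 - b2 * x)**2) / (T - 2)
se_b2 = np.sqrt(sigma2_hat / np.sum((x - x.mean())**2))

t_c = stats.t.ppf(1 - alpha / 2, df=T - 2)     # critical value t_{alpha/2}
print(b2 - t_c * se_b2, b2 + t_c * se_b2)      # interval endpoints
```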

Copyright 1996 Lawrence C. Marsh

Student-t vs. Normal Distribution

1. Both are symmetric bell-shaped distributions.

2. Student-t distribution has fatter tails than the normal.

3. Student-t converges to the normal for infinite sample.

4. Student-t conditional on degrees of freedom (df).

5. Normal is a good approximation of Student-t for the first few decimal places when df > 30 or so.

5.20

Copyright 1996 Lawrence C. Marsh

Hypothesis Tests1. A null hypothesis, H0.

2. An alternative hypothesis, H1.

3. A test statistic.

4. A rejection region.

5.21

Copyright 1996 Lawrence C. Marsh

Rejection Rules1. Two-Sided Test:

If the value of the test statistic falls in the critical region in either tail of the t-distribution, then we reject the null hypothesis in favor of the alternative.

2. Left-Tail Test: If the value of the test statistic falls in the critical region which lies in the left tail of the t-distribution, then we reject the null hypothesis in favor of the alternative.

3. Right-Tail Test: If the value of the test statistic falls in the critical region which lies in the right tail of the t-distribution, then we reject the null hypothesis in favor of the alternative.

5.22

Copyright 1996 Lawrence C. Marsh

Format for Hypothesis Testing

1. Determine null and alternative hypotheses.

2. Specify the test statistic and its distribution as if the null hypothesis were true.

3. Select and determine the rejection region.

4. Calculate the sample value of test statistic.

5. State your conclusion.

5.23

Copyright 1996 Lawrence C. Marsh

practical vs. statistical significance in economics

Practically but not statistically significant:

When the sample size is very small, a large average gap between the salaries of men and women might not be statistically significant.

Statistically but not practically significant:

When the sample size is very large, a small correlation (say, ρ = 0.00000001) between the winning numbers in the PowerBall Lottery and the Dow-Jones Stock Market Index might be statistically significant.

5.24

Copyright 1996 Lawrence C. Marsh

Type I and Type II errors

Type I error: We make the mistake of rejecting the null hypothesis when it is true.

α = P(rejecting H0 when it is true).

Type II error: We make the mistake of failing to reject the null hypothesis when it is false.

β = P(failing to reject H0 when it is false).

5.25

Copyright 1996 Lawrence C. Marsh

Prediction Intervals

A (1 − α)×100% prediction interval for yo is:

ŷo ± tc se(f)

where f = ŷo − yo, se(f) = √var̂(f), and

var̂(f) = σ̂² [ 1 + 1/T + (xo − x̄)² / Σ (xt − x̄)² ]

5.26
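A sketch of this prediction-interval formula on the same kind of made-up data as before (x0 and all data values are illustrative):

```python
import numpy as np
from scipy import stats

# (1 - alpha)x100% prediction interval for y at a new value x0.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.1, 5.8, 8.3, 9.9, 12.2])
T, alpha, x0 = len(y), 0.05, 4.5

b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b1 = y.mean() - b2 * x.mean()
sigma2_hat = np.sum((y - b1 - b2 * x)**2) / (T - 2)

y0_hat = b1 + b2 * x0
var_f = sigma2_hat * (1 + 1/T + (x0 - x.mean())**2 / np.sum((x - x.mean())**2))
t_c = stats.t.ppf(1 - alpha / 2, df=T - 2)
print(y0_hat - t_c * np.sqrt(var_f), y0_hat + t_c * np.sqrt(var_f))
```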

Copyright 1996 Lawrence C. Marsh

The Simple Linear Regression Model

Chapter 6


6.1

Copyright 1996 Lawrence C. Marsh

Explaining Variation in yt

Predicting yt without any explanatory variables:

yt = β1 + et

Σ et² = Σ (yt − β1)²    (sums over t = 1, …, T)

Setting the derivative with respect to β1 to zero:

Σ (yt − b1) = 0

Σ yt − T b1 = 0

b1 = ȳ

Why not just use ȳ?

6.2

Copyright 1996 Lawrence C. Marsh

Explaining Variation in yt

yt = b1 + b2 xt + êt

Explained variation: ŷt = b1 + b2 xt

Unexplained variation: êt = yt − ŷt = yt − b1 − b2 xt

6.3

Copyright 1996 Lawrence C. Marsh

Explaining Variation in yt

yt = ŷt + êt

Using ȳ as a baseline:

yt − ȳ = (ŷt − ȳ) + êt

Σ (yt − ȳ)² = Σ (ŷt − ȳ)² + Σ êt²    (the cross-product term drops out)

SST = SSR + SSE

6.4

Copyright 1996 Lawrence C. Marsh

Total Variation in yt

SST = total sum of squares

SST measures variation of yt around ȳ:

SST = Σ (yt − ȳ)²

6.5

Copyright 1996 Lawrence C. Marsh

Explained Variation in yt

SSR = regression sum of squares

Fitted values: ŷt = b1 + b2 xt

SSR measures variation of ŷt around ȳ:

SSR = Σ (ŷt − ȳ)²

6.6

Copyright 1996 Lawrence C. Marsh

Unexplained Variation in yt

SSE = error sum of squares

êt = yt − ŷt = yt − b1 − b2 xt

SSE measures variation of yt around ŷt:

SSE = Σ (yt − ŷt)² = Σ êt²

6.7

Copyright 1996 Lawrence C. Marsh

Analysis of Variance Table

Table 6.1 Analysis of Variance Table

Source of Variation   DF     Sum of Squares   Mean Square
Explained             1      SSR              SSR/1
Unexplained           T−2    SSE              SSE/(T−2)  [= σ̂²]
Total                 T−1    SST

6.8

Copyright 1996 Lawrence C. Marsh

Coefficient of Determination

What proportion of the variation in yt is explained?

R² = SSR/SST,   0 ≤ R² ≤ 1

6.9

Copyright 1996 Lawrence C. Marsh

Coefficient of Determination

SST = SSR + SSE

Dividing by SST:

SST/SST = SSR/SST + SSE/SST

1 = SSR/SST + SSE/SST

R² = SSR/SST = 1 − SSE/SST

6.10
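A short sketch of the SST = SSR + SSE decomposition and the two equivalent R² formulas, on made-up data (illustrative only):

```python
import numpy as np

# R^2 from the sums-of-squares decomposition of a fitted simple regression.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.1, 5.8, 8.3, 9.9, 12.2])

b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b1 = y.mean() - b2 * x.mean()
y_hat = b1 + b2 * x

SST = np.sum((y - y.mean())**2)
SSR = np.sum((y_hat - y.mean())**2)
SSE = np.sum((y - y_hat)**2)
print(SST, SSR + SSE)              # equal up to rounding
print(SSR / SST, 1 - SSE / SST)    # the two equivalent R^2 expressions
```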

Copyright 1996 Lawrence C. Marsh

R2 is only a descriptive measure.

R2 does not measure the qualityof the regression model.

Focusing solely on maximizing R2 is not a good idea.

Coefficient of Determination 6.11

Copyright 1996 Lawrence C. Marsh

Correlation Analysis

Population: ρ = cov(X,Y) / √[ var(X) var(Y) ]

Sample: r = cov̂(X,Y) / √[ var̂(X) var̂(Y) ]

6.12

Copyright 1996 Lawrence C. Marsh

Correlation Analysis

var̂(X) = Σ (xt − x̄)² / (T − 1)

var̂(Y) = Σ (yt − ȳ)² / (T − 1)

cov̂(X,Y) = Σ (xt − x̄)(yt − ȳ) / (T − 1)    (sums over t = 1, …, T)

6.13

Copyright 1996 Lawrence C. Marsh

Correlation Analysis

Sample Correlation Coefficient

r = Σ (xt − x̄)(yt − ȳ) / √[ Σ (xt − x̄)² Σ (yt − ȳ)² ]

6.14

Copyright 1996 Lawrence C. Marsh

Correlation Analysis and R²

For simple linear regression analysis: r² = R²

R² also equals the squared correlation between yt and the fitted values ŷt, measuring "goodness of fit".

6.15

Copyright 1996 Lawrence C. Marsh

Regression Computer Output

Typical computer output of regression estimates:

Table 6.2 Computer Generated Least Squares Results

Variable    Parameter Estimate   Standard Error   T for H0: Parameter=0   Prob>|T|
INTERCEPT   40.7676              22.1387          1.841                   0.0734
X           0.1283               0.0305           4.201                   0.0002

6.16

Copyright 1996 Lawrence C. Marsh

Regression Computer Output

b1 = 40.7676    b2 = 0.1283

se(b1) = √var̂(b1) = √490.12 = 22.1387

se(b2) = √var̂(b2) = √0.0009326 = 0.0305

t = b1 / se(b1) = 40.7676 / 22.1387 = 1.84

t = b2 / se(b2) = 0.1283 / 0.0305 = 4.20

6.17

Copyright 1996 Lawrence C. Marsh

Regression Computer Output

Sources of variation in the dependent variable:

Table 6.3 Analysis of Variance Table

Source        DF    Sum of Squares   Mean Square
Explained     1     25221.2229       25221.2229
Unexplained   38    54311.3314       1429.2455
Total         39    79532.5544

R-square: 0.3171

6.18

Copyright 1996 Lawrence C. Marsh

Regression Computer Output

SST = Σ (yt − ȳ)² = 79532

SSR = Σ (ŷt − ȳ)² = 25221

SSE = Σ êt² = 54311

σ̂² = SSE/(T − 2) = 1429.2455

R² = SSR/SST = 1 − SSE/SST = 0.317

6.19

Copyright 1996 Lawrence C. Marsh

ŷt = 40.7676 + 0.1283 xt
(s.e.)  (22.1387)  (0.0305)

ŷt = 40.7676 + 0.1283 xt
(t)     (1.84)     (4.20)

Reporting Regression Results 6.20

Copyright 1996 Lawrence C. Marsh

R2 = 0.317

Reporting Regression Results

This R2 value may seem low but it istypical in studies involving cross-sectionaldata analyzed at the individual or micro level.

A considerably higher R2 value would beexpected in studies involving time-series dataanalyzed at an aggregate or macro level.

6.21

Copyright 1996 Lawrence C. Marsh

Effects of Scaling the Data

Changing the scale of x:

yt = β1 + β2 xt + et

yt = β1 + (c β2)(xt / c) + et

yt = β1 + β2* xt* + et,  where β2* = c β2 and xt* = xt / c

The estimated coefficient and standard error change but the other statistics are unchanged.

6.22

Copyright 1996 Lawrence C. Marsh

Effects of Scaling the Data

Changing the scale of y:

yt = β1 + β2 xt + et

yt / c = (β1 / c) + (β2 / c) xt + et / c

yt* = β1* + β2* xt + et*,  where yt* = yt / c, β1* = β1 / c, β2* = β2 / c and et* = et / c

All statistics are changed except for the t-statistics and R² value.

6.23

Copyright 1996 Lawrence C. Marsh

Effects of Scaling the Data

Changing the scale of x and y:

yt = β1 + β2 xt + et

yt / c = (β1 / c) + (c β2 / c)(xt / c) + et / c

yt* = β1* + β2 xt* + et*,  where yt* = yt / c, β1* = β1 / c, xt* = xt / c and et* = et / c

No change in the R², the t-statistics, or the regression results for β2, but all other stats change.

6.24

Copyright 1996 Lawrence C. Marsh

Functional Forms

The term linear in a simple regression model does not mean a linear relationship between variables, but a model in which the parameters enter the model in a linear way.

6.25

Copyright 1996 Lawrence C. Marsh

Linear Statistical Models (linear in the parameters):

yt = β1 + β2 xt + et

ln(yt) = β1 + β2 xt + et

yt = β1 + β2 ln(xt) + et

yt = β1 + β2 xt² + et

Nonlinear Statistical Models (nonlinear in the parameters):

yt = β1 + β2 xt^β3 + et

yt = β1 + β2 xt + exp(β3 xt) + et

Linear vs. Nonlinear 6.27

Copyright 1996 Lawrence C. Marsh

Linear vs. Nonlinear

[Figure: a nonlinear relationship between food expenditure and income, with food expenditure on the vertical axis and income on the horizontal axis.]

6.27

Copyright 1996 Lawrence C. Marsh

Useful Functional Forms

1. Linear2. Reciprocal3. Log-Log4. Log-Linear5. Linear-Log6. Log-Inverse

Look at each form and its slope and elasticity

6.28

Copyright 1996 Lawrence C. Marsh

Useful Functional Forms

Linear: yt = β1 + β2 xt + et

slope: β2    elasticity: β2 xt / yt

Copyright 1996 Lawrence C. Marsh

Useful Functional Forms

Reciprocal: yt = β1 + β2 (1/xt) + et

slope: −β2 / xt²    elasticity: −β2 / (xt yt)

6.30

Copyright 1996 Lawrence C. Marsh

Log-Log: ln(yt) = β1 + β2 ln(xt) + et

slope: β2 yt / xt    elasticity: β2

Useful Functional Forms 6.31

Copyright 1996 Lawrence C. Marsh

Log-Linear: ln(yt) = β1 + β2 xt + et

slope: β2 yt    elasticity: β2 xt

Useful Functional Forms 6.32

Copyright 1996 Lawrence C. Marsh

Linear-Log: yt = β1 + β2 ln(xt) + et

slope: β2 (1/xt)    elasticity: β2 (1/yt)

Useful Functional Forms 6.33

Copyright 1996 Lawrence C. Marsh

Useful Functional Forms

Log-Inverse: ln(yt) = β1 − β2 (1/xt) + et

slope: β2 yt / xt²    elasticity: β2 (1/xt)

6.34

Copyright 1996 Lawrence C. Marsh

1. E(et) = 0

2. var(et) = σ²

3. cov(ei, ej) = 0

4. et ~ N(0, σ²)

Error Term Properties 6.35

Copyright 1996 Lawrence C. Marsh

Economic Models

1. Demand Models
2. Supply Models
3. Production Functions
4. Cost Functions
5. Phillips Curve

6.36

Copyright 1996 Lawrence C. Marsh

1. Demand Models

* quantity demanded (y^d) and price (x)
* constant elasticity

ln(yt^d) = β1 + β2 ln(xt) + et

6.37

Copyright 1996 Lawrence C. Marsh

2. Supply Models

* quantity supplied (y^s) and price (x)
* constant elasticity

ln(yt^s) = β1 + β2 ln(xt) + et

6.38

Copyright 1996 Lawrence C. Marsh

3. Production Functions

* output (y) and input (x)
* constant elasticity

Cobb-Douglas Production Function:

ln(yt) = β1 + β2 ln(xt) + et

6.39

Copyright 1996 Lawrence C. Marsh

4a. Cost Functions

* total cost (y) and output (x)

yt = β1 + β2 xt² + et

6.40

Copyright 1996 Lawrence C. Marsh

4b. Cost Functions

* average cost (y/x) and output (x)

(yt / xt) = β1 (1/xt) + β2 xt + et / xt

6.41

Copyright 1996 Lawrence C. Marsh

5. Phillips Curve

* wage rate (wt) and time (t)

Economic Models

unemployment rate, ut

wt-1% wt =

wt wt-1= ut

1

nonlinear in both variables and parameters

6.42

Copyright 1996 Lawrence C. Marsh

The Multiple Regression Model

Chapter 7


7.1

Copyright 1996 Lawrence C. Marsh

Two Explanatory Variables

yt = β1 + β2 xt2 + β3 xt3 + et

∂yt/∂xt2 = β2        ∂yt/∂xt3 = β3

The xt's affect yt separately.

But least squares estimation of β2 now depends upon both xt2 and xt3.

Copyright 1996 Lawrence C. Marsh

Correlated Variables

yt = output    xt2 = capital    xt3 = labor

yt = β1 + β2 xt2 + β3 xt3 + et

Always 5 workers per machine.

If the number of workers per machine is never varied, it becomes impossible to tell whether the machines or the workers are responsible for changes in output.

7.3

Copyright 1996 Lawrence C. Marsh

The General Model

yt = β1 + β2 xt2 + β3 xt3 + . . . + βK xtK + et

The parameter β1 is the intercept (constant) term.

The "variable" attached to β1 is xt1 = 1.

Usually, the number of explanatory variables is said to be K − 1 (ignoring xt1 = 1), while the number of parameters is K (namely: β1 . . . βK).

7.4

Copyright 1996 Lawrence C. Marsh

1. E(et) = 0

2. var(et) = σ²

3. cov(et, es) = 0 for t ≠ s

4. et ~ N(0, σ²)

Statistical Properties of et 7.5

Copyright 1996 Lawrence C. Marsh

1. E(yt) = β1 + β2 xt2 + . . . + βK xtK

2. var(yt) = var(et) = σ²

3. cov(yt, ys) = cov(et, es) = 0 for t ≠ s

4. yt ~ N(β1 + β2 xt2 + . . . + βK xtK, σ²)

Statistical Properties of yt 7.6

Copyright 1996 Lawrence C. Marsh

Assumptions

1. yt = β1 + β2 xt2 + . . . + βK xtK + et

2. E(yt) = β1 + β2 xt2 + . . . + βK xtK

3. var(yt) = var(et) = σ²

4. cov(yt, ys) = cov(et, es) = 0 for t ≠ s

5. The values of xtk are not random

6. yt ~ N(β1 + β2 xt2 + . . . + βK xtK, σ²)

7.7

Copyright 1996 Lawrence C. Marsh

Least Squares Estimation

yt = β1 + β2 xt2 + β3 xt3 + et

S ≡ S(β1, β2, β3) = Σ (yt − β1 − β2 xt2 − β3 xt3)²    (sum over t = 1, …, T)

Define the deviations from the means:

yt* = yt − ȳ

xt2* = xt2 − x̄2

xt3* = xt3 − x̄3

7.8

Copyright 1996 Lawrence C. Marsh

b2 = [ Σ yt* xt2* Σ xt3*² − Σ yt* xt3* Σ xt2* xt3* ] / [ Σ xt2*² Σ xt3*² − (Σ xt2* xt3*)² ]

b3 = [ Σ yt* xt3* Σ xt2*² − Σ yt* xt2* Σ xt2* xt3* ] / [ Σ xt2*² Σ xt3*² − (Σ xt2* xt3*)² ]

b1 = ȳ − b2 x̄2 − b3 x̄3

Least Squares Estimators 7.9
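A sketch of these two-regressor formulas on simulated data (all data values are made up for illustration), cross-checked against a general least squares solver:

```python
import numpy as np

# Two-regressor least squares via the deviation-from-means formulas above.
T = 50
rng = np.random.default_rng(0)
x2 = rng.normal(size=T)
x3 = rng.normal(size=T)
y = 1.0 + 2.0 * x2 + 3.0 * x3 + rng.normal(scale=0.5, size=T)

y_s, x2_s, x3_s = y - y.mean(), x2 - x2.mean(), x3 - x3.mean()   # starred variables
den = np.sum(x2_s**2) * np.sum(x3_s**2) - np.sum(x2_s * x3_s)**2
b2 = (np.sum(y_s * x2_s) * np.sum(x3_s**2) - np.sum(y_s * x3_s) * np.sum(x2_s * x3_s)) / den
b3 = (np.sum(y_s * x3_s) * np.sum(x2_s**2) - np.sum(y_s * x2_s) * np.sum(x2_s * x3_s)) / den
b1 = y.mean() - b2 * x2.mean() - b3 * x3.mean()

X = np.column_stack([np.ones(T), x2, x3])
print(b1, b2, b3)
print(np.linalg.lstsq(X, y, rcond=None)[0])   # should match the formulas
```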

Copyright 1996 Lawrence C. Marsh

Dangers of Extrapolation

Statistical models generally are good only“within the relevant range”. This meansthat extending them to extreme data valuesoutside the range of the original data oftenleads to poor and sometimes ridiculous results.

If height is normally distributed and the normal ranges from minus infinity to plus infinity, pity the man minus three feet tall.

7.10

Copyright 1996 Lawrence C. Marsh

Error Variance Estimation

Unbiased estimator of the error variance:

σ̂² = Σ êt² / (T − K)

Transform to a chi-square distribution:

(T − K) σ̂² / σ² ~ χ²(T − K)

7.11

Copyright 1996 Lawrence C. Marsh

Gauss-Markov Theorem

Under the assumptions of the multiple regression model, the ordinary least squares estimators have the smallest variance of all linear and unbiased estimators. This means that the least squares estimators are the Best Linear Unbiased Estimators (BLUE).

7.12

Copyright 1996 Lawrence C. Marsh

Variances

yt = β1 + β2 xt2 + β3 xt3 + et

var(b2) = σ² / [ (1 − r23²) Σ (xt2 − x̄2)² ]

var(b3) = σ² / [ (1 − r23²) Σ (xt3 − x̄3)² ]

where r23 = Σ (xt2 − x̄2)(xt3 − x̄3) / √[ Σ (xt2 − x̄2)² Σ (xt3 − x̄3)² ]

When r23 = 0 these reduce to the simple regression formulas.

7.13

Copyright 1996 Lawrence C. Marsh

Variance Decomposition

The variance of an estimator is smaller when:

1. The error variance, σ², is smaller: σ² → 0.

2. The sample size, T, is larger: Σ (xt2 − x̄2)² grows with T.

3. The variable's values are more spread out: Σ (xt2 − x̄2)² is larger.

4. The correlation is close to zero: r23² → 0.

7.14

Copyright 1996 Lawrence C. Marsh

Covariances

yt = β1 + β2 xt2 + β3 xt3 + et

cov(b2, b3) = −r23 σ² / [ (1 − r23²) √( Σ (xt2 − x̄2)² Σ (xt3 − x̄3)² ) ]

where r23 = Σ (xt2 − x̄2)(xt3 − x̄3) / √[ Σ (xt2 − x̄2)² Σ (xt3 − x̄3)² ]

7.15

Copyright 1996 Lawrence C. Marsh

Covariance Decomposition

1. The error variance, 2, is larger.

2. The sample size, T, is smaller.

3. The values of the variables are less spread out.

4. The correlation, r23, is high.

The covariance between any two estimatorsis larger in absolute value when:

7.16

Copyright 1996 Lawrence C. Marsh

Var-Cov Matrix

yt = β1 + β2 xt2 + β3 xt3 + et

The least squares estimators b1, b2, and b3 have covariance matrix:

cov(b1, b2, b3) = | var(b1)      cov(b1,b2)   cov(b1,b3) |
                  | cov(b1,b2)   var(b2)      cov(b2,b3) |
                  | cov(b1,b3)   cov(b2,b3)   var(b3)    |

7.17

Copyright 1996 Lawrence C. Marsh

Normal

yt = β1 + β2 x2t + β3 x3t + . . . + βK xKt + et

et ~ N(0, σ²). This implies and is implied by:

yt ~ N(β1 + β2 x2t + β3 x3t + . . . + βK xKt, σ²)

Since bk is a linear function of the yt's:

bk ~ N( βk, var(bk) )

z = (bk − βk) / √var(bk) ~ N(0,1) for k = 1, 2, . . . , K

7.18

Copyright 1996 Lawrence C. Marsh

Student-t

Since the population variance of bk, var(bk), is generally unknown, we estimate it with var̂(bk), which uses σ̂² instead of σ².

t = (bk − βk) / √var̂(bk) = (bk − βk) / se(bk)

t has a Student-t distribution with df = (T − K).

7.19

Copyright 1996 Lawrence C. Marsh

Interval Estimation

bk k

se(bk)P tc ٹ tc = 1 ٹ

tc is critical value for (T-K) degrees of freedom

such that P(t چ tc) = /2.

P bk tc se(bk) ٹ k ٹ bk + tc se(bk) = 1

Interval endpoints: bk tc se(bk) , bk + tc se(bk)

7.20

Copyright 1996 Lawrence C. Marsh

Hypothesis Testing and

Nonsample Information

Chapter 8


8.1

Copyright 1996 Lawrence C. Marsh

1. Student-t Tests

2. Goodness-of-Fit

3. F-Tests

4. ANOVA Table

5. Nonsample Information

6. Collinearity

7. Prediction

Chapter 8: Overview 8.2

Copyright 1996 Lawrence C. Marsh

Student-t Test

yt = β1 + β2Xt2 + β3Xt3 + β4Xt4 + et

Student-t tests can be used to test any linear combination of the regression coefficients:

H0: β1 = 0            H0: β2 + β3 + β4 = 1

H0: 3β2 − 7β3 = 21    H0: β2 − β3 ≤ 5

Every such t-test has exactly T − K degrees of freedom, where K = number of coefficients estimated (including the intercept).

8.3

Copyright 1996 Lawrence C. Marsh

One Tail Test

yt = β1 + β2Xt2 + β3Xt3 + β4Xt4 + et

H0: β3 ≤ 0
H1: β3 > 0

t = b3 / se(b3)  ~  t(T − K)

Reject H0 if t exceeds the right-tail critical value tc.

df = T − K = T − 4

8.4

Copyright 1996 Lawrence C. Marsh

Two Tail Test

yt = β1 + β2Xt2 + β3Xt3 + β4Xt4 + et

H0: β2 = 0
H1: β2 ≠ 0

t = b2 / se(b2)  ~  t(T − K)

Reject H0 if t < −tc or t > tc.

df = T − K = T − 4

8.5

Copyright 1996 Lawrence C. Marsh

Goodness-of-Fit

Coefficient of Determination

R² = SSR / SST = Σ(ŷt − ȳ)² / Σ(yt − ȳ)²

0 ≤ R² ≤ 1

8.6

Copyright 1996 Lawrence C. Marsh

Adjusted R-Squared

Adjusted Coefficient of Determination

Original:   R² = SSR / SST = 1 − SSE / SST

Adjusted:   R̄² = 1 − [ SSE/(T − K) ] / [ SST/(T − 1) ]

8.7

Copyright 1996 Lawrence C. Marsh

Computer Output

Table 8.2  Summary of Least Squares Results

Variable      Coefficient   Std Error   t-value   p-value
constant        104.79        6.48       16.17     0.000
price            -6.642       3.191      -2.081    0.042
advertising       2.984       0.167      17.868    0.000

t = b2 / se(b2) = −6.642 / 3.191 = −2.081

8.8

Copyright 1996 Lawrence C. Marsh

Reporting Your Results

Reporting standard errors:

ŷt = 104.79 − 6.642 Xt2 + 2.984 Xt3
      (6.48)    (3.191)     (0.167)      (s.e.)

Reporting t-statistics:

ŷt = 104.79 − 6.642 Xt2 + 2.984 Xt3
     (16.17)   (−2.081)    (17.868)      (t)

8.9

Copyright 1996 Lawrence C. Marsh

Single Restriction F-Test

yt = β1 + β2Xt2 + β3Xt3 + β4Xt4 + et

H0: β2 = 0
H1: β2 ≠ 0

dfn = J = 1     dfd = T − K = 49

F = [ (SSER − SSEU)/J ] / [ SSEU/(T − K) ]

  = [ (1964.758 − 1805.168)/1 ] / [ 1805.168/(52 − 3) ]

  = 4.33

By definition this is the t-statistic squared:  t = −2.081,  F = t² = 4.33

8.10

Copyright 1996 Lawrence C. Marsh

Multiple Restriction F-Test

yt = β1 + β2Xt2 + β3Xt3 + β4Xt4 + et

H0: β2 = 0, β4 = 0
H1: H0 not true

dfn = J = 2     dfd = T − K = 49

F = [ (SSER − SSEU)/J ] / [ SSEU/(T − K) ]

First run the restricted regression by dropping Xt2 and Xt4 to get SSER. Next run the unrestricted regression to get SSEU.
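A minimal sketch of this restricted-versus-unrestricted comparison; the column indices in the usage comment are illustrative, not from the text’s data set:

```python
# Generic F test from two least squares fits (restricted and unrestricted).
import numpy as np

def ols_sse(y, X):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return e @ e

def f_test(y, X_unrestricted, X_restricted, J):
    T, K = X_unrestricted.shape
    sse_u = ols_sse(y, X_unrestricted)
    sse_r = ols_sse(y, X_restricted)
    return ((sse_r - sse_u) / J) / (sse_u / (T - K))

# e.g. drop columns 1 and 3 of X to impose beta2 = 0 and beta4 = 0:
# F = f_test(y, X, np.delete(X, [1, 3], axis=1), J=2)
```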

8.11

Copyright 1996 Lawrence C. Marsh

F-Tests

F = [ (SSER − SSEU)/J ] / [ SSEU/(T − K) ]

F-Tests of this type are always right-tailed, even for left-sided or two-sided hypotheses, because any deviation from the null will make the F value bigger (move rightward).

[Figure: density f(F) with the rejection region to the right of the critical value Fc]

8.12

Copyright 1996 Lawrence C. Marsh

F-Test of Entire Equation

yt = β1 + β2Xt2 + β3Xt3 + et

H0: β2 = β3 = 0
H1: H0 not true

dfn = J = 2     dfd = T − K = 49

F = [ (SSER − SSEU)/J ] / [ SSEU/(T − K) ]

  = [ (13581.35 − 1805.168)/2 ] / [ 1805.168/(52 − 3) ]

  = 159.828

We ignore β1. Why?

At α = 0.05, Fc = 3.187.   Since F > Fc, reject H0!

8.13

Copyright 1996 Lawrence C. Marsh

ANOVA Table

Table 8.3  Analysis of Variance Table

Source         DF    Sum of Squares    Mean Square    F-Value
Explained       2       11776.18         5888.09      159.828
Unexplained    49        1805.168          36.84
Total          51       13581.35                      p-value: 0.0001

R² = SSR / SST = 11776.18 / 13581.35 = 0.867

8.14

Copyright 1996 Lawrence C. Marsh

Nonsample Information

A certain production process is known to be Cobb-Douglas with constant returns to scale:

ln(yt) = β1 + β2 ln(Xt2) + β3 ln(Xt3) + β4 ln(Xt4) + et

β2 + β3 + β4 = 1 ,   so   β4 = (1 − β2 − β3)

ln(yt /Xt4) = β1 + β2 ln(Xt2 /Xt4) + β3 ln(Xt3 /Xt4) + et

yt* = β1 + β2 Xt2* + β3 Xt3* + et*

Run least squares on the transformed model. Interpret the coefficients the same as in the original model.
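A hedged sketch of that transformed regression; the array names are placeholders for whatever output, input, and factor data are at hand:

```python
# Impose constant returns to scale by regressing ln(y/X4) on ln(X2/X4), ln(X3/X4).
import numpy as np

def crs_restricted_ols(y, X2, X3, X4):
    ystar = np.log(y / X4)
    Z = np.column_stack([np.ones_like(ystar), np.log(X2 / X4), np.log(X3 / X4)])
    b, *_ = np.linalg.lstsq(Z, ystar, rcond=None)
    b1, b2, b3 = b
    b4 = 1.0 - b2 - b3          # recovered from the restriction b2 + b3 + b4 = 1
    return b1, b2, b3, b4
```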

8.15

Copyright 1996 Lawrence C. Marsh

Collinear Variables

The term “independent variable” means an explanatory variable is independent of the error term, but not necessarily independent of other explanatory variables.

Since economists typically have no control over the implicit “experimental design”, explanatory variables tend to move together, which often makes sorting out their separate influences rather problematic.

8.16

Copyright 1996 Lawrence C. Marsh

Effects of Collinearity

A high degree of collinearity will produce:

1. no least squares output when collinearity is exact.

2. large standard errors and wide confidence intervals.

3. insignificant t-values even with high R² and a significant F-value.

4. estimates sensitive to deletion or addition of a few observations or “insignificant” variables.

5. good “within-sample” (same proportions) but poor “out-of-sample” (different proportions) prediction.

8.17

Copyright 1996 Lawrence C. Marsh

Identifying Collinearity

Evidence of high collinearity includes:

1. a high pairwise correlation between two explanatory variables.

2. a high R-squared when regressing one explanatory variable at a time on each of the remaining explanatory variables (see the sketch after this list).

3. a statistically significant F-value when the t-values are statistically insignificant.

4. an R-squared that doesn’t fall by much when dropping any of the explanatory variables.
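A minimal sketch of the auxiliary-regression check in item 2, assuming the explanatory variables are stacked in a numeric matrix:

```python
# Regress each explanatory variable on the others; R^2 near 1 signals collinearity
# (equivalently, the variance inflation factor 1/(1 - R^2) is large).
import numpy as np

def auxiliary_r2(X):
    """X: T x K matrix of explanatory variables (no constant column)."""
    T, K = X.shape
    r2_values = []
    for k in range(K):
        y = X[:, k]
        Z = np.column_stack([np.ones(T), np.delete(X, k, axis=1)])
        b, *_ = np.linalg.lstsq(Z, y, rcond=None)
        e = y - Z @ b
        r2_values.append(1.0 - (e @ e) / np.sum((y - y.mean()) ** 2))
    return r2_values
```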

8.18

Copyright 1996 Lawrence C. Marsh

Mitigating Collinearity

Since high collinearity is not a violation of any least squares assumption, but rather a lack of adequate information in the sample:

1. collect more data with better information.

2. impose economic restrictions as appropriate.

3. impose statistical restrictions when justified.

4. if all else fails at least point out that the poor model performance might be due to the collinearity problem (or it might not).

8.19

Copyright 1996 Lawrence C. Marsh

Prediction

yt = β1 + β2Xt2 + β3Xt3 + et

Given a set of values for the explanatory variables, (1, X02, X03), the best linear unbiased predictor of y is:

ŷ0 = b1 + b2X02 + b3X03

This predictor is unbiased in the sense that the average value of the forecast error is zero.

8.20

Copyright 1996 Lawrence C. Marsh

Extensions of the Multiple

Regression Model

Chapter 9


9.1

Copyright 1996 Lawrence C. Marsh

Topics for This Chapter

1. Intercept Dummy Variables

2. Slope Dummy Variables

3. Different Intercepts & Slopes

4. Testing Qualitative Effects

5. Are Two Regressions Equal?

6. Interaction Effects

7. Dummy Dependent Variables

9.2

Copyright 1996 Lawrence C. Marsh

Intercept Dummy Variables

Dummy variables are binary (0,1):

Dt = 1 if red car, Dt = 0 otherwise.

yt = β1 + β2Xt + β3Dt + et

yt = speed of car in miles per hour
Xt = age of car in years

Police claim: red cars travel faster.
H0: β3 = 0    H1: β3 > 0

9.3

Copyright 1996 Lawrence C. Marsh

yt = β1 + β2Xt + β3Dt + et

red cars:    yt = (β1 + β3) + β2Xt + et
other cars:  yt = β1 + β2Xt + et

[Figure: miles per hour against age in years; two parallel lines with common slope β2, intercept β1 + β3 for red cars and β1 for other cars]

9.4

Copyright 1996 Lawrence C. Marsh

Slope Dummy Variables

yt = β1 + β2Xt + β3DtXt + et

Stock portfolio: Dt = 1    Bond portfolio: Dt = 0
β1 = initial investment

stocks:  yt = β1 + (β2 + β3)Xt + et
bonds:   yt = β1 + β2Xt + et

[Figure: value of portfolio against years; both lines start at β1, the stock line has slope β2 + β3 and the bond line slope β2]

9.5

Copyright 1996 Lawrence C. Marsh

Different Intercepts & Slopes

yt = β1 + β2Xt + β3Dt + β4DtXt + et

“miracle” seed: Dt = 1    regular seed: Dt = 0

“miracle”:  yt = (β1 + β3) + (β2 + β4)Xt + et
regular:    yt = β1 + β2Xt + et

[Figure: harvest weight of corn against rainfall; the “miracle” line has intercept β1 + β3 and slope β2 + β4, the regular line has intercept β1 and slope β2]

9.6

Copyright 1996 Lawrence C. Marsh

yt = β1 + β2Xt + β3Dt + et

For men Dt = 1.  For women Dt = 0.

men:    yt = (β1 + β3) + β2Xt + et
women:  yt = β1 + β2Xt + et

Testing for discrimination in starting wage:
H0: β3 = 0    H1: β3 > 0

[Figure: wage rate against years of experience; two parallel lines with slope β2, intercept β1 + β3 for men and β1 for women]

9.7

Copyright 1996 Lawrence C. Marsh

yt = β1 + β5Xt + β6DtXt + et

For men Dt = 1.  For women Dt = 0.

men:    yt = β1 + (β5 + β6)Xt + et
women:  yt = β1 + β5Xt + et

Men and women have the same starting wage, β1, but their wage rates increase at different rates (difference = β6).

β6 > 0 means that men’s wage rates are increasing faster than women’s wage rates.

[Figure: wage rate against years of experience; both lines start at β1, with slope β5 + β6 for men and β5 for women]

9.8

Copyright 1996 Lawrence C. Marsh

yt = β1 + β2Xt + β3Dt + β4DtXt + et

An Ineffective Affirmative Action Plan

men:    yt = (β1 + β3) + (β2 + β4)Xt + et
women:  yt = β1 + β2Xt + et

Women are given a higher starting wage, β1, while men get the lower starting wage, β1 + β3 (note: β3 < 0). But men get a faster rate of increase in their wages, β2 + β4, which is higher than the rate of increase for women, β2 (since β4 > 0).

[Figure: wage rate against years of experience; women start higher at β1 with slope β2, men start at β1 + β3 with the steeper slope β2 + β4]

9.9

Copyright 1996 Lawrence C. Marsh

Testing Qualitative Effects

1. Test for differences in intercept.

2. Test for differences in slope.

3. Test for differences in both intercept and slope.

9.10

Copyright 1996 Lawrence C. Marsh

H0: vs1:

H0: vs1:

Yt 12Xt3Dt

4Dt Xt

b3

Est. Var b3کt n 4

b4

Est. Var b4کt n 4

men: Dt = 1 ; women: Dt = 0

Testing fordiscrimination instarting wage.

Testing fordiscrimination inwage increases.

intercept

slope

et

9.11

Copyright 1996 Lawrence C. Marsh

Testing for differences in both intercept and slope

H0: β3 = 0 and β4 = 0     vs.     H1: otherwise

SSER = Σ (t = 1 to T) ( yt − b1 − b2Xt )²                         from the restricted regression

SSEU = Σ (t = 1 to T) ( yt − b1 − b2Xt − b3Dt − b4DtXt )²         from the unrestricted regression

F = [ (SSER − SSEU)/2 ] / [ SSEU/(T − 4) ]   ~   F(2, T − 4)

9.12

Copyright 1996 Lawrence C. Marsh

Are Two Regressions Equal?

yt = 1 + 2 Xt + 3 Dt + 4 Dt Xt + et

variations of “The Chow Test”

I. Assuming equal variances (pooling):

men: Dt = 1 ; women: Dt = 0

Ho: 3 = 4 = 0 vs. H1: otherwise

yt = wage rate

This model assumes equal wage rate variance.

Xt = years of experience

9.13

Copyright 1996 Lawrence C. Marsh

II. Allowing for unequal variances (running three regressions):

Everyone:     yt = β1 + β2Xt + et        →  SSER
              (forcing men and women to have the same β1, β2)

Men only:     ytm = β1 + β2Xtm + etm     →  SSEm
Women only:   ytw = β1 + β2Xtw + etw     →  SSEw
              (allowing men and women to be different; each equation estimated separately)

where SSEU = SSEm + SSEw

F = [ (SSER − SSEU)/J ] / [ SSEU/(T − K) ]

J = number of restrictions = 2
K = number of unrestricted coefficients = 4
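A hedged sketch of this version of the test, with illustrative array names for the two groups:

```python
# Chow-type F test: pooled SSE versus the sum of the two group SSEs.
import numpy as np

def sse(y, X):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return e @ e

def chow_f(y_m, x_m, y_w, x_w):
    design = lambda x: np.column_stack([np.ones(len(x)), x])
    sse_r = sse(np.concatenate([y_m, y_w]), design(np.concatenate([x_m, x_w])))
    sse_u = sse(y_m, design(x_m)) + sse(y_w, design(x_w))
    T, J, K = len(y_m) + len(y_w), 2, 4
    return ((sse_r - sse_u) / J) / (sse_u / (T - K))
```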

9.14

Copyright 1996 Lawrence C. Marsh

Interaction Variables

1. Interaction Dummies

2. Polynomial Terms (special case of continuous interaction)

3. Interaction Among Continuous Variables

9.15

Copyright 1996 Lawrence C. Marsh

1. Interaction Dummies

yt = 1 + 2 Xt + 3 Mt + 4 Bt + et

For men Mt = 1. For women Mt = 0. For black Bt = 1. For nonblack Bt = 0.

No Interaction: wage gap assumed the same:

yt = 1 + 2 Xt + 3 Mt + 4 Bt + 5 Mt Bt + et

Interaction: wage gap depends on race:

Wage Gap between Men and Women

yt = wage rate; Xt = experience

9.16

Copyright 1996 Lawrence C. Marsh

2. Polynomial Terms

yt = 1 + 2 X t + 3 X2

t + 4 X3

t + et

Linear in parameters but nonlinear in variables:

yt = income; Xt = agePolynomial Regression

yt

X tPeople retire at different ages or not at all.

9020 30 40 50 60 8070

9.17

Copyright 1996 Lawrence C. Marsh

yt = 1 + 2 X t + 3 X2

t + 4 X3

t + et

yt = income; Xt = age

Polynomial Regression

Rate income is changing as we age:yt

Xt

= 2 + 2 3 X t + 3 4 X

2t

Slope changes as X t changes.

9.18

Copyright 1996 Lawrence C. Marsh

3. Continuous Interaction

yt = 1 + 2 Zt + 3 Bt + 4 Zt Bt + et

Exam grade = f(sleep:Zt , study time:Bt)

Sleep and study time do not act independently.

More study time will be more effective when combined with more sleep and less effective when combined with less sleep.

9.19

Copyright 1996 Lawrence C. Marsh

yt = β1 + β2Zt + β3Bt + β4ZtBt + et

Exam grade = f(sleep: Zt , study time: Bt)

Your studying is more effective with more sleep:

∂yt/∂Bt = β3 + β4Zt

Your mind sorts things out while you sleep (when you have things to sort out):

∂yt/∂Zt = β2 + β4Bt

continuous interaction

Copyright 1996 Lawrence C. Marsh

yt = β1 + β2Zt + β3Bt + β4ZtBt + et

Exam grade = f(sleep: Zt , study time: Bt)

If Zt + Bt = 24 hours, then Bt = (24 − Zt):

yt = β1 + β2Zt + β3(24 − Zt) + β4Zt(24 − Zt) + et

yt = (β1 + 24β3) + (β2 − β3 + 24β4)Zt − β4Zt² + et

yt = γ1 + γ2Zt + γ3Zt² + et

Sleep needed to maximize your exam grade:

∂yt/∂Zt = γ2 + 2γ3Zt = 0 ,    where γ2 > 0 and γ3 < 0

Zt = −γ2 / (2γ3)

9.21

Copyright 1996 Lawrence C. Marsh

1. Linear Probability Model

2. Probit Model

3. Logit Model

Dummy Dependent Variables9.22

Copyright 1996 Lawrence C. Marsh

Linear Probability Model

yi = 1 + 2 Xi2 + 3 Xi3 + 4 Xi4 + ei

Xi2 = total hours of work each week

1 quits job 0 does not quit

yi =

Xi3 = weekly paycheck

Xi4 = hourly pay (Xi3 divided by Xi2)

9.23

Copyright 1996 Lawrence C. Marsh

Xi2

yi = 1 + 2 Xi2 + 3 Xi3 + 4 Xi4 + ei

yt = 1

0yt =total hours of work each week

yi = b1 + b2 Xi2 + b3 Xi3 + b4 Xi4^

yi^

Read predicted values of yi off the regression line

Linear Probability Model 9.24

Copyright 1996 Lawrence C. Marsh

1. Probability estimates are sometimesless than zero or greater than one.

2. Heteroskedasticity is present in that the model generates a nonconstant error variance.

Linear Probability Model

Problems with Linear Probability Model:

9.25

Copyright 1996 Lawrence C. Marsh

Probit Model

latent variable, zi :    zi = β1 + β2Xi2

Normal probability density function:

f(zi) = (1/√(2π)) e^(−0.5 zi²)

Normal cumulative probability function:

F(zi) = P[ Z ≤ zi ] = ∫ from −∞ to zi of (1/√(2π)) e^(−0.5 u²) du

9.26

Copyright 1996 Lawrence C. Marsh

pi = P[ Z 1 + 2Xi2 ] = F(1 + 2Xi2)

Since zi = 1 + 2 Xi2 + , we cansubstitute in to get

Probit Model

Xi2total hours of work each week

yt = 1

0yt =

9.27

Copyright 1996 Lawrence C. Marsh

Logit Model

Define pi , the probability of quitting the job:

pi = 1 / ( 1 + e^( −(β1 + β2Xi2) ) )

For β2 > 0, pi will approach 1 as Xi2 → +∞.

For β2 > 0, pi will approach 0 as Xi2 → −∞.

9.28

Copyright 1996 Lawrence C. Marsh

Logit Model

Xi2total hours of work each week

yt = 1

0yt =

pi =1

1 + e (1 +

2 X

i2 +

pi is the probability of quitting the job.

9.29

Copyright 1996 Lawrence C. Marsh

Maximum Likelihood

Maximum likelihood estimation (MLE)is used to estimate Probit and Logit functions.

The small sample properties of MLE are not known, but in large samples MLE is normally distributed, and it is consistent and asymptotically efficient.
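A minimal sketch of maximum likelihood for the logit model using a generic numerical optimizer; this is an illustration under simplifying assumptions, not the text’s software:

```python
# Logit MLE: minimize the negative log-likelihood over beta.
import numpy as np
from scipy.optimize import minimize

def neg_loglik(beta, y, X):
    """y in {0,1}; X includes a column of ones for the intercept."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    p = np.clip(p, 1e-10, 1 - 1e-10)        # guard against log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def logit_mle(y, X):
    beta0 = np.zeros(X.shape[1])
    return minimize(neg_loglik, beta0, args=(y, X), method="BFGS").x
```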

9.30

Copyright 1996 Lawrence C. Marsh

Heteroskedasticity

Chapter 10


10.1

Copyright 1996 Lawrence C. Marsh

The Nature of Heteroskedasticity

Heteroskedasticity is a systematic pattern in the errors where the variances of the errors are not constant.

Ordinary least squares assumes that all observations are equally reliable.

For efficiency (accurate estimation/prediction) reweight observations to ensure equal error variance.

10.2

Copyright 1996 Lawrence C. Marsh

yt = β1 + β2xt + et

Regression Model

zero mean:            E(et) = 0

homoskedasticity:     var(et) = σ²

nonautocorrelation:   cov(et, es) = 0   for t ≠ s

heteroskedasticity:   var(et) = σt²

10.3

Copyright 1996 Lawrence C. Marsh

Homoskedastic pattern of errors

[Figure: scatter of consumption (yt) against income (xt); the spread of the points around the line is the same at every income level]

10.4

Copyright 1996 Lawrence C. Marsh

..

xtx1 x2

y tf(yt)

The Homoskedastic Case

..

x3 x4 income

cons

umpti

on

10.5

Copyright 1996 Lawrence C. Marsh

Heteroskedastic pattern of errors

[Figure: scatter of consumption (yt) against income (xt); the spread of the points around the line increases as income increases]

10.6

Copyright 1996 Lawrence C. Marsh

.

x tx1 x2

y tf(yt)

cons

umpti

on

x3

..

The Heteroskedastic Case

income

rich people

poor people

10.7

Copyright 1996 Lawrence C. Marsh

Properties of Least Squares

1. Least squares still linear and unbiased.

2. Least squares not efficient.

3. Usual formulas give incorrect standard errors for least squares.

4. Confidence intervals and hypothesis tests based on usual standard errors are wrong.

10.8

Copyright 1996 Lawrence C. Marsh

yt = β1 + β2xt + et

heteroskedasticity:  var(et) = σt²

Incorrect formula for the least squares variance:

var(b2) = σ² / Σ(xt − x̄)²

Correct formula for the least squares variance:

var(b2) = Σ [ (xt − x̄)² σt² ] / [ Σ(xt − x̄)² ]²

10.9

Copyright 1996 Lawrence C. Marsh

Hal White’s Standard Errors

White’s estimator of the least squares variance:

est.var(b2) = Σ [ (xt − x̄)² êt² ] / [ Σ(xt − x̄)² ]²

In large samples, White’s standard error (the square root of the estimated variance) is a correct / accurate / consistent measure.

10.10

Copyright 1996 Lawrence C. Marsh

Two Types of Heteroskedasticity

1. Proportional Heteroskedasticity. (continuous function(of xt, for example))

2. Partitioned Heteroskedasticity. (discrete categories/groups)

10.11

Copyright 1996 Lawrence C. Marsh

Proportional Heteroskedasticity

yt = β1 + β2xt + et

where   E(et) = 0 ,   var(et) = σt² ,   cov(et, es) = 0 for t ≠ s

σt² = σ² xt

The variance is assumed to be proportional to the value of xt.

10.12

Copyright 1996 Lawrence C. Marsh

variance:             σt² = σ² xt

standard deviation:   σt = σ √xt     (std. dev. proportional to √xt)

yt = β1 + β2xt + et ,    var(et) = σt²

To correct for heteroskedasticity, divide the model by √xt :

yt/√xt = β1 (1/√xt) + β2 (xt/√xt) + et/√xt

10.13

Copyright 1996 Lawrence C. Marsh

yt/√xt = β1 (1/√xt) + β2 (xt/√xt) + et/√xt

yt* = β1 xt1* + β2 xt2* + et*

var(et*) = var( et/√xt ) = (1/xt) var(et) = (1/xt) σ² xt

var(et*) = σ²

et is heteroskedastic, but et* is homoskedastic.

10.14

Copyright 1996 Lawrence C. Marsh

Generalized Least Squares

These steps describe weighted least squares:

1. Decide which variable is proportional to the heteroskedasticity (xt in the previous example).

2. Divide all terms in the original model by the square root of that variable (divide by √xt ).

3. Run least squares on the transformed model, which has new yt*, xt1* and xt2* variables but no intercept.
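A minimal sketch of those three steps for the simple model, assuming xt is the variable proportional to the error variance:

```python
# Weighted least squares for proportional heteroskedasticity: divide by sqrt(x_t).
import numpy as np

def wls_proportional(y, x):
    w = np.sqrt(x)
    ystar = y / w
    Xstar = np.column_stack([1.0 / w, x / w])   # transformed intercept and slope columns
    b, *_ = np.linalg.lstsq(Xstar, ystar, rcond=None)
    return b            # b[0] estimates beta1, b[1] estimates beta2
```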

10.15

Copyright 1996 Lawrence C. Marsh

Partitioned Heteroskedasticity

yt = β1 + β2xt + et        t = 1, . . . , 100

yt = bushels per acre of corn
xt = gallons of water per acre (rain or other)

error variance of “field” corn:   var(et) = σ1²     t = 1, . . . , 80

error variance of “sweet” corn:   var(et) = σ2²     t = 81, . . . , 100

10.16

Copyright 1996 Lawrence C. Marsh

“field” corn:   yt = β1 + β2xt + et     var(et) = σ1²     t = 1, . . . , 80

“sweet” corn:   yt = β1 + β2xt + et     var(et) = σ2²     t = 81, . . . , 100

Reweighting Each Group’s Observations:

yt/σ1 = β1 (1/σ1) + β2 (xt/σ1) + et/σ1      t = 1, . . . , 80

yt/σ2 = β1 (1/σ2) + β2 (xt/σ2) + et/σ2      t = 81, . . . , 100

10.17

Copyright 1996 Lawrence C. Marsh

Apply Generalized Least Squares

Run least squares separately on the data for each group.

σ̂1² provides an estimator of σ1² using the 80 observations on “field” corn.

σ̂2² provides an estimator of σ2² using the 20 observations on “sweet” corn.

10.18

Copyright 1996 Lawrence C. Marsh

1. Residual Plots provide information on the exact nature of heteroskedasticity (partitioned or proportional) to aid in correcting for it.

2. Goldfeld-Quandt Test checks for presence of heteroskedasticity.

Detecting Heteroskedasticity

Determine existence and nature of heteroskedasticity:

10.19

Copyright 1996 Lawrence C. Marsh

Residual Plots

Plot the residuals against one variable at a time after sorting the data by that variable to try to find a heteroskedastic pattern in the data.

[Figure: residuals êt plotted around zero against xt; the spread of the residuals fans out as xt increases]

10.20

Copyright 1996 Lawrence C. Marsh

Goldfeld-Quandt Test

The Goldfeld-Quandt test can be used to detect heteroskedasticity in either the proportional case or for comparing two groups in the discrete case.

For proportional heteroskedasticity, it is first necessaryto determine which variable, such as xt, is proportionalto the error variance. Then sort the data from the largest to smallest values of that variable.

10.21

Copyright 1996 Lawrence C. Marsh

Goldfeld-Quandt Test Statistic

H0: σ1² = σ2²        H1: σ1² > σ2²

GQ = σ̂1² / σ̂2²   ~   F[T1 − K1, T2 − K2]

In the proportional case, drop the middle r observations where r ≈ T/6, then run separate least squares regressions on the first T1 observations and the last T2 observations.

Small values of GQ support H0, while large values support H1. Use the F table.
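A hedged sketch of the statistic for the proportional case, assuming a simple regression of y on x within each subsample:

```python
# Goldfeld-Quandt: sort by the suspect variable, drop the middle r observations,
# and take the ratio of the two subsample error-variance estimates.
import numpy as np

def goldfeld_quandt(y, x, r=None):
    order = np.argsort(-x)                    # largest to smallest values of x
    y, x = y[order], x[order]
    T = len(y)
    r = T // 6 if r is None else r
    n1 = (T - r) // 2

    def sigma2_hat(ys, xs):
        X = np.column_stack([np.ones(len(xs)), xs])
        b, *_ = np.linalg.lstsq(X, ys, rcond=None)
        e = ys - X @ b
        return (e @ e) / (len(ys) - X.shape[1])

    return sigma2_hat(y[:n1], x[:n1]) / sigma2_hat(y[n1 + r:], x[n1 + r:])
```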

10.22

Copyright 1996 Lawrence C. Marsh

t2 = 2 exp{1 zt1 + 2 zt2}

More General Model

Structure of heteroskedasticity could be more complicated:

zt1 and zt2 are any observable variables upon

which we believe the variance could depend.

Note: The function exp{.} ensures that t2 is positive.

10.23

Copyright 1996 Lawrence C. Marsh

More General Model

σt² = σ² exp{ α1 zt1 + α2 zt2 }

ln σt² = ln σ² + α1 zt1 + α2 zt2

ln σt² = α + α1 zt1 + α2 zt2      where α = ln σ²

Using the least squares residuals, êt , estimate:

ln êt² = α + α1 zt1 + α2 zt2 + νt

Then test  H0: α1 = 0, α2 = 0  against  H1: α1 ≠ 0 and/or α2 ≠ 0  with the usual F test.

10.24

Copyright 1996 Lawrence C. Marsh

Autocorrelation

Chapter 11


11.1

Copyright 1996 Lawrence C. Marsh

The Nature of Autocorrelation

For efficiency (accurate estimation/prediction) all systematic information needs to be incor-porated into the regression model.

Autocorrelation is a systematic pattern in the errors that can be either attracting (positive) or repelling (negative) autocorrelation.

11.2

Copyright 1996 Lawrence C. Marsh

Positive Auto.        No Auto.        Negative Auto.

[Figure: three plots of residuals et against time t]

Positive autocorrelation: the residuals cross the zero line not enough (attracting).

No autocorrelation: the residuals cross the zero line randomly.

Negative autocorrelation: the residuals cross the zero line too much (repelling).

11.3

Copyright 1996 Lawrence C. Marsh

yt = 1 + 2xt + et

Regression Model

E(et) = 0

var(et) = 2

zero mean:

homoskedasticity:

nonautocorrelation: cov(et, es) = t ° s

autocorrelation: cov(et, es) ° t ° s

11.4

Copyright 1996 Lawrence C. Marsh

Order of Autocorrelation

yt = β1 + β2xt + et

1st order:   et = ρ et−1 + νt

2nd order:   et = ρ1 et−1 + ρ2 et−2 + νt

3rd order:   et = ρ1 et−1 + ρ2 et−2 + ρ3 et−3 + νt

We will assume first order autocorrelation:

AR(1):   et = ρ et−1 + νt

11.5

Copyright 1996 Lawrence C. Marsh

First Order Autocorrelation

yt = β1 + β2xt + et

et = ρ et−1 + νt      where  −1 < ρ < 1

E(νt) = 0 ,   var(νt) = σν² ,   cov(νt, νs) = 0 for t ≠ s

These assumptions about νt imply the following about et :

E(et) = 0

var(et) = σe² = σν² / (1 − ρ²)

cov(et, et−k) = σe² ρᵏ    for k > 0

corr(et, et−k) = ρᵏ    for k > 0

11.6

Copyright 1996 Lawrence C. Marsh

Autocorrelation creates someProblems for Least Squares:

1. The least squares estimator is still linear and unbiased but it is not efficient.

2. The formulas normally used to compute the least squares standard errors are no longer correct and confidence intervals and hypothesis tests using them will be wrong.

11.7

Copyright 1996 Lawrence C. Marsh

Generalized Least Squares

AR(1):   yt = β1 + β2xt + et ,    et = ρ et−1 + νt

Substitute in for et :

yt = β1 + β2xt + ρ et−1 + νt

Now we need to get rid of et−1.   (continued)

(continued)

AR(1) :

11.8

Copyright 1996 Lawrence C. Marsh

yt = β1 + β2xt + et

yt = β1 + β2xt + ρ et−1 + νt

Lag the errors once:

et = yt − β1 − β2xt

et−1 = yt−1 − β1 − β2xt−1

yt = β1 + β2xt + ρ( yt−1 − β1 − β2xt−1 ) + νt     (continued)

11.9

Copyright 1996 Lawrence C. Marsh

yt = β1 + β2xt + ρ yt−1 − ρβ1 − ρβ2xt−1 + νt

yt − ρ yt−1 = β1(1 − ρ) + β2( xt − ρ xt−1 ) + νt

yt* = β1* + β2 xt2* + νt

where   yt* = yt − ρ yt−1 ,    β1* = β1(1 − ρ) ,    xt2* = xt − ρ xt−1

11.10

Copyright 1996 Lawrence C. Marsh

yt* = β1* + β2 xt2* + νt

where   yt* = yt − ρ yt−1 ,    β1* = β1(1 − ρ) ,    xt2* = xt − ρ xt−1

Problems estimating this model with least squares:

1. One observation is used up in creating the transformed (lagged) variables, leaving only (T − 1) observations for estimating the model.

2. The value of ρ is not known. We must find some way to estimate it.

11.11

Copyright 1996 Lawrence C. Marsh

Recovering the 1st Observation

Dropping the 1st observation and applying least squares is not the best linear unbiased estimation method.

Efficiency is lost because the varianceof the error associated with the 1st observation is not equal to that of the other errors.

This is a special case of the heteroskedasticityproblem except that here all errors are assumedto have equal variance except the 1st error.

11.12

Copyright 1996 Lawrence C. Marsh

Recovering the 1st Observation

The 1st observation should fit the original model as:

y1 = β1 + β2x1 + e1      with error variance   var(e1) = σe² = σν² / (1 − ρ²).

We could include this as the 1st observation for our estimation procedure, but we must first transform it so that it has the same error variance as the other observations.

Note: the other (transformed) observations all have error variance σν².

11.13

Copyright 1996 Lawrence C. Marsh

y1 = β1 + β2x1 + e1      with error variance   var(e1) = σe² = σν² / (1 − ρ²).

The other (transformed) observations all have error variance σν².

Given any constant c:   var(c e1) = c² var(e1).

If c = √(1 − ρ²), then   var( √(1 − ρ²) e1 ) = (1 − ρ²) var(e1)
                                             = (1 − ρ²) σe²
                                             = (1 − ρ²) σν² / (1 − ρ²)
                                             = σν²

The transformed error  ν1 = √(1 − ρ²) e1  has variance σν².

11.14

Copyright 1996 Lawrence C. Marsh

y1 = β1 + β2x1 + e1

The transformed error  ν1 = √(1 − ρ²) e1  has variance σν².

Multiply through by √(1 − ρ²) to get:

√(1 − ρ²) y1 = √(1 − ρ²) β1 + √(1 − ρ²) β2x1 + √(1 − ρ²) e1

This transformed first observation may now be added to the other (T − 1) observations to obtain the fully restored set of T observations.
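A minimal sketch of the full AR(1) transformation with the first observation restored, assuming ρ is already known or estimated:

```python
# Transform y and x for GLS under AR(1) errors, keeping the first observation.
import numpy as np

def ar1_transform(y, x, rho):
    T = len(y)
    ystar = np.empty(T)
    x1star = np.empty(T)      # transformed intercept column
    x2star = np.empty(T)      # transformed slope column
    s = np.sqrt(1.0 - rho ** 2)
    ystar[0], x1star[0], x2star[0] = s * y[0], s, s * x[0]
    ystar[1:] = y[1:] - rho * y[:-1]
    x1star[1:] = 1.0 - rho
    x2star[1:] = x[1:] - rho * x[:-1]
    return ystar, np.column_stack([x1star, x2star])   # run least squares on these
```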

11.15

Copyright 1996 Lawrence C. Marsh

Estimating the Unknown ρ Value

If we had values for the et’s, we could estimate:

et = ρ et−1 + νt

First, use least squares to estimate the model:

yt = β1 + β2xt + et

The residuals from this estimation are:

êt = yt − b1 − b2xt

11.16

Copyright 1996 Lawrence C. Marsh

êt = yt − b1 − b2xt

Next, estimate the following by least squares:

êt = ρ êt−1 + ν̂t

The least squares solution is:

ρ̂ = Σ (t = 2 to T) êt êt−1  /  Σ (t = 2 to T) êt−1²

11.17

Copyright 1996 Lawrence C. Marsh

Durbin-Watson Test

H0: ρ = 0    vs.    H1: ρ ≠ 0 ,  ρ > 0, or  ρ < 0

The Durbin-Watson test statistic, d, is:

d = Σ (t = 2 to T) ( êt − êt−1 )²  /  Σ (t = 1 to T) êt²

11.18

Copyright 1996 Lawrence C. Marsh

Testing for Autocorrelation

The test statistic, d, is approximately related to ρ̂ as:

d ≈ 2(1 − ρ̂)

When ρ̂ = 0, the Durbin-Watson statistic is d ≈ 2.

When ρ̂ = 1, the Durbin-Watson statistic is d ≈ 0.

Tables of critical values for d are not always readily available, so it is easier to use the p-value that most computer programs provide for d.

Reject H0 if p-value < α, the significance level.
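A minimal sketch of computing d and ρ̂ from least squares residuals:

```python
# Durbin-Watson statistic and the estimated first-order autocorrelation.
import numpy as np

def durbin_watson(e):
    d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
    rho_hat = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)
    return d, rho_hat        # d is near 2 when rho_hat is near 0
```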

11.19

Copyright 1996 Lawrence C. Marsh

Prediction with AR(1) Errors

When errors are autocorrelated, the previous period’s error may help us predict next period’s error.

The best predictor, ŷT+1 , for next period is:

ŷT+1 = β̂1 + β̂2 xT+1 + ρ̂ ẽT

where β̂1 and β̂2 are generalized least squares estimates and ẽT is given by:

ẽT = yT − β̂1 − β̂2 xT

11.20

Copyright 1996 Lawrence C. Marsh

For h periods ahead, the best predictor is:

ŷT+h = β̂1 + β̂2 xT+h + ρ̂ʰ ẽT

Assuming |ρ̂| < 1, the influence of ρ̂ʰ ẽT diminishes the further we go into the future (the larger h becomes).

11.21

Copyright 1996 Lawrence C. Marsh

Pooling

Time-Series and

Cross-Sectional Data

Chapter 12


12.1

Copyright 1996 Lawrence C. Marsh

Pooling Time and Cross Sections

yit = 1it + 2itx2it + 3itx3it + eit

If left unrestricted,this model requires different equationsfor each firm in each time period.

for the ith firm in the tth time period

12.2

Copyright 1996 Lawrence C. Marsh

Seemingly Unrelated Regressions

yit = β1i + β2i x2it + β3i x3it + eit

SUR models impose the restrictions:

β1it = β1i      β2it = β2i      β3it = β3i

Each firm gets its own coefficients, β1i , β2i and β3i , but those coefficients are constant over time.

12.3

Copyright 1996 Lawrence C. Marsh

The investment expenditures (INV) of General Electric (G) and Westinghouse(W) may be related to their stock marketvalue (V) and actual capital stock (K) as follows:

INVGt = 1G + 2GVGt + 3GKGt + eGt

INVWt = 1W + 2WVWt + 3WKWt + eWt

i = G, W t = 1, . . . , 20

Two-Equation SUR Model12.4

Copyright 1996 Lawrence C. Marsh

Estimating Separate Equations

We make the usual error term assumptions:

E(eGt) = 0                 E(eWt) = 0

var(eGt) = σG²             var(eWt) = σW²

cov(eGt, eGs) = 0          cov(eWt, eWs) = 0

For now make the assumption of no correlation between the error terms across equations:

cov(eGt, eWt) = 0          cov(eGt, eWs) = 0

12.5

Copyright 1996 Lawrence C. Marshhomoskedasticity assumption:

G = W

2 2

INVt = 1G + 1Dt + 2GVt + 2DtVt + 3GKt + 3DtKt + et

Dummy variable model assumes that :G = W

2 2

For Westinghouse observations Dt = 1; otherwise Dt = 0.

1W = 1G + 1

2W = 2G + 2

3W = 3G + 3

12.6

Copyright 1996 Lawrence C. Marsh

Problem with OLS on Each Equation

The first assumption of the Gauss-Markov Theorem concerns the model specification.

If the model is not fully and correctly specified the Gauss-Markov properties might not hold.

Any correlation of error terms across equations must be part of model specification.

12.7

Copyright 1996 Lawrence C. Marsh

Any correlation between the dependent variables of two or more equations that is not due to their explanatory variables is by default due to correlated error terms.

Correlated Error Terms12.8

Copyright 1996 Lawrence C. Marsh

1. Sales of Pepsi vs. sales of Coke.(uncontrolled factor: outdoor temperature)

2. Investments in bonds vs. investments in stocks. (uncontrolled factor: computer/appliance sales)

3. Movie admissions vs. Golf Course admissions. (uncontrolled factor: weather conditions)

4. Sales of butter vs. sales of bread. (uncontrolled factor: bagels and cream cheese)

Which of the following models would be likely to produce positively correlated errors and which would produce negatively correlated errors?

12.9

Copyright 1996 Lawrence C. Marsh

Joint Estimation of the Equations

INVGt = β1G + β2G VGt + β3G KGt + eGt

INVWt = β1W + β2W VWt + β3W KWt + eWt

cov(eGt, eWt) = σGW

12.10

Copyright 1996 Lawrence C. Marsh

Seemingly Unrelated Regressions

When the error terms of two or more equationsare correlated, efficient estimation requires the useof a Seemingly Unrelated Regressions (SUR) type estimator to take the correlation into account.

Be sure to use the Seemingly Unrelated Regressions (SUR)procedure in your regression software program to estimateany equations that you believe might have correlated errors.

12.11

Copyright 1996 Lawrence C. Marsh

Separate vs. Joint Estimation

SUR will give exactly the same results as estimating each equation separately with OLS if either or both of the following two conditions are true:

1. Every equation has exactly the same set of explanatory variables with exactly the same values.

2. There is no correlation between the error terms of any of the equations.

12.12

Copyright 1996 Lawrence C. Marsh

Test for Correlation

Test the null hypothesis of zero correlation, σGW = 0:

r²GW = σ̂GW² / ( σ̂G² σ̂W² )

χ² = T r²GW   ~   χ²(1)   asymptotically

12.13

Copyright 1996 Lawrence C. Marsh

Start with the residuals êGt and êWt from each equation estimated separately.

σ̂GW = (1/T) Σ êGt êWt

σ̂G² = (1/T) Σ êGt²

σ̂W² = (1/T) Σ êWt²

r²GW = σ̂GW² / ( σ̂G² σ̂W² )

χ² = T r²GW   ~   χ²(1)   asymptotically

12.14

Copyright 1996 Lawrence C. Marsh

Fixed Effects Model

For each ith cross section in the tth time period:

yit = β1it + β2it x2it + β3it x3it + eit

Fixed effects models impose the restrictions:

β1it = β1i      β2it = β2      β3it = β3

so that

yit = β1i + β2 x2it + β3 x3it + eit

Each ith cross-section has its own constant intercept β1i .

12.15

Copyright 1996 Lawrence C. Marsh

The Fixed Effects Model is conveniently represented using dummy variables:

yit = β11D1i + β12D2i + β13D3i + β14D4i + β2 x2it + β3 x3it + eit

D1i=1 if NorthD1i=0 if not N

D2i=1 if EastD2i=0 if not E

D3i=1 if SouthD3i=0 if not S

D4i=1 if WestD4i=0 if not W

yit = millions of bushels of corn producedx2it = price of corn in dollars per bushelx3it = price of soybeans in dollars per bushel

Each cross-sectional unit gets its own intercept,but each cross-sectional intercept is constant over time.

12.16

Copyright 1996 Lawrence C. Marsh

Test for Equality of Fixed Effects

H0 : β11 = β12 = β13 = β14
H1 : H0 not true

The joint null hypothesis may be tested with the F-statistic:

F = [ (SSER − SSEU)/J ] / [ SSEU/(NT − K) ]   ~   F(J, NT − K)

SSER is the restricted error sum of squares (one intercept)
SSEU is the unrestricted error sum of squares (four intercepts)
N is the number of cross-sectional units (N = 4)
K is the number of parameters in the model (K = 6)
J is the number of restrictions being tested (J = N − 1 = 3)
T is the number of time periods
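A hedged sketch of this comparison for generic region labels (the dummy construction and array names are illustrative, not tied to the corn data set):

```python
# Fixed effects F test: one common intercept versus one intercept per region.
import numpy as np

def fixed_effects_f(y, X, region_ids):
    """y: NT vector; X: NT x (K - N) slope columns; region_ids: labels per row."""
    region_ids = np.asarray(region_ids)
    D = (region_ids[:, None] == np.unique(region_ids)[None, :]).astype(float)
    N, NT = D.shape[1], len(y)

    def sse(Z):
        b, *_ = np.linalg.lstsq(Z, y, rcond=None)
        e = y - Z @ b
        return e @ e

    sse_u = sse(np.column_stack([D, X]))               # N separate intercepts
    sse_r = sse(np.column_stack([np.ones(NT), X]))     # one common intercept
    J, K = N - 1, N + X.shape[1]
    return ((sse_r - sse_u) / J) / (sse_u / (NT - K))
```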

12.17

Copyright 1996 Lawrence C. Marsh

Random Effects Model

yit = β1i + β2 x2it + β3 x3it + eit

β1i = β̄1 + μi

β̄1 is the population mean intercept.

μi is an unobservable random error that accounts for the cross-sectional differences.

12.18

Copyright 1996 Lawrence C. Marsh

β1i = β̄1 + μi      where i = 1, ... , N

The μi are independent of one another and of eit :

E(μi) = 0      var(μi) = σμ²

Consequently,  E(β1i) = β̄1   and   var(β1i) = σμ²

Random Intercept Term 12.19

Copyright 1996 Lawrence C. Marsh

yit = β1i + β2 x2it + β3 x3it + eit

yit = (β̄1 + μi) + β2 x2it + β3 x3it + eit

yit = β̄1 + β2 x2it + β3 x3it + (μi + eit)

yit = β̄1 + β2 x2it + β3 x3it + νit

Random Effects Model 12.20

Copyright 1996 Lawrence C. Marsh

νit = (μi + eit)

yit = β̄1 + β2 x2it + β3 x3it + νit

νit has zero mean:        E(νit) = 0

νit is homoskedastic:     var(νit) = σμ² + σe²

The errors from the same firm in different time periods are correlated:
cov(νit, νis) = σμ²     for t ≠ s

The errors from different firms are always uncorrelated:
cov(νit, νjs) = 0       for i ≠ j

12.21

Copyright 1996 Lawrence C. Marsh

Simultaneous

Equations

Models

Chapter 13


13.1

Copyright 1996 Lawrence C. Marsh

Keynesian Macro Model

Assumptions of Simple Keynesian Model

1. Consumption, c, is function of income, y.

2. Total expenditures = consumption + investment.

3. Investment assumed independent of income.

13.2

Copyright 1996 Lawrence C. Marsh

consumption is a function of income:

income is either consumed or invested:

c = 1 + 2 y

y = c + i

The Structural Equations 13.3

Copyright 1996 Lawrence C. Marsh

The Statistical Model

The consumption equation:   ct = β1 + β2 yt + et

The income identity:        yt = ct + it

13.4

Copyright 1996 Lawrence C. MarshThe Simultaneous Natureof Simultaneous Equations

ct = 1 + 2 yt + et

yt = ct + it

Since yt

contains et

they arecorrelated

2. 1.

3.

4.

5.

13.5

Copyright 1996 Lawrence C. Marsh

The Failure of Least Squares

The least squares estimators of the parameters in a structural simultaneous equation are biased and inconsistent because of the correlation between the random error and the endogenous variables on the right-hand side of the equation.

13.6

Copyright 1996 Lawrence C. Marsh

Single Equation: Simultaneous Equations:

Single vs. Simultaneous Equations

ct

yt

et

ct

yt i t

et

13.7

Copyright 1996 Lawrence C. Marsh

Deriving the Reduced Form

ct = β1 + β2 yt + et

yt = ct + it

ct = β1 + β2 (ct + it) + et

(1 − β2) ct = β1 + β2 it + et

13.8

Copyright 1996 Lawrence C. Marsh

Deriving the Reduced Form

(1 − β2) ct = β1 + β2 it + et

ct = β1/(1 − β2) + [ β2/(1 − β2) ] it + et/(1 − β2)

ct = π11 + π21 it + νt

The Reduced Form Equation

13.9

Copyright 1996 Lawrence C. Marsh

Reduced Form Equation

ct = π11 + π21 it + νt

π11 = β1/(1 − β2)

π21 = β2/(1 − β2)

νt = et/(1 − β2)

13.10

Copyright 1996 Lawrence C. Marsh

yt = ct + it     where     ct = π11 + π21 it + νt

yt = π11 + (1 + π21) it + νt

It is sometimes useful to give this equation its own reduced form parameters as follows:

yt = π12 + π22 it + νt

13.11

Copyright 1996 Lawrence C. Marsh

ct = π11 + π21 it + νt

yt = π12 + π22 it + νt

Since ct and yt are related through the identity yt = ct + it , the error term, νt , of these two equations is the same, and it is easy to show that:

π12 = π11 = β1/(1 − β2)

π22 = (1 + π21) = 1/(1 − β2)

13.12

Copyright 1996 Lawrence C. Marsh

Identification

The structural parameters are β1 and β2. The reduced form parameters are π11 and π21.

Once the reduced form parameters are estimated, the identification problem is to determine if the original structural parameters can be expressed uniquely in terms of the reduced form parameters.

β̂2 = π̂21 / (1 + π̂21)

β̂1 = π̂11 / (1 + π̂21)

13.13

Copyright 1996 Lawrence C. Marsh

Identification

An equation is exactly identified if its structural (behavioral) parameters can be uniquely expressed in terms of the reduced form parameters.

An equation is over-identified if there is more than one solution for expressing its structural (behavioral) parameters in terms of the reduced form parameters.

An equation is under-identified if its structural (behavioral) parameters cannot be expressed in terms of the reduced form parameters.

13.14

Copyright 1996 Lawrence C. MarshThe Identification Problem

A system of M equations containing M endogenous variables must exclude at least M − 1 variables from a given equation in order for the parameters of that equation to be identified and to be able to be consistently estimated.

13.15

Copyright 1996 Lawrence C. Marsh

Two Stage Least Squares

Problem: right-hand endogenous variablesyt2 and yt1 are correlated with the error terms.

yt1 = 1 + 2 yt2 + 3 xt1 + et1

yt2 = 1 + 2 yt1 + 3 xt2 + et2

13.16

Copyright 1996 Lawrence C. Marsh

Problem: the right-hand endogenous variables yt2 and yt1 are correlated with the error terms.

Solution: first, derive the reduced form equations.

yt1 = β1 + β2 yt2 + β3 xt1 + et1

yt2 = α1 + α2 yt1 + α3 xt2 + et2

Solve the two equations for the two unknowns, yt1 and yt2 :

yt1 = π11 + π21 xt1 + π31 xt2 + νt1

yt2 = π12 + π22 xt1 + π32 xt2 + νt2

13.17

Copyright 1996 Lawrence C. Marsh

2SLS: Stage I

yt1 = π11 + π21 xt1 + π31 xt2 + νt1

yt2 = π12 + π22 xt1 + π32 xt2 + νt2

Use least squares to get the fitted values:

ŷt1 = π̂11 + π̂21 xt1 + π̂31 xt2        yt1 = ŷt1 + ν̂t1

ŷt2 = π̂12 + π̂22 xt1 + π̂32 xt2        yt2 = ŷt2 + ν̂t2

13.18

Copyright 1996 Lawrence C. Marsh

2SLS: Stage II

yt1 = ŷt1 + ν̂t1     and     yt2 = ŷt2 + ν̂t2

Substitute these in for yt1 and yt2 :

yt1 = β1 + β2 (ŷt2 + ν̂t2) + β3 xt1 + et1

yt2 = α1 + α2 (ŷt1 + ν̂t1) + α3 xt2 + et2

13.19

Copyright 1996 Lawrence C. Marsh

2SLS: Stage II (continued)

yt1 = β1 + β2 ŷt2 + β3 xt1 + ut1

yt2 = α1 + α2 ŷt1 + α3 xt2 + ut2

where   ut1 = β2 ν̂t2 + et1   and   ut2 = α2 ν̂t1 + et2

Run least squares on each of the above equations to get the 2SLS estimates:

β̃1 , β̃2 , β̃3 , α̃1 , α̃2 and α̃3
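A minimal sketch of the two stages for the first equation, assuming xt1 and xt2 are the only exogenous variables in the system:

```python
# Two stage least squares for one equation of a two-equation system.
import numpy as np

def ols(y, X):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

def two_sls_eq1(y1, y2, x1, x2):
    T = len(y1)
    Z = np.column_stack([np.ones(T), x1, x2])        # all exogenous variables
    y2_hat = Z @ ols(y2, Z)                          # Stage I fitted values
    X = np.column_stack([np.ones(T), y2_hat, x1])    # Stage II design for equation 1
    return ols(y1, X)                                # 2SLS estimates of equation 1
```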

13.20

Copyright 1996 Lawrence C. Marsh

Nonlinear

Least

Squares

Chapter 14


14.1

Copyright 1996 Lawrence C. Marsh

Review of the Least Squares Principle

(A.) “Regression” model with only an intercept term:

yt = α + et

et = yt − α

SSE = Σ et² = Σ (yt − α)²      (minimize the sum of squared errors)

∂SSE/∂α = −2 Σ (yt − α̂) = 0

Σ yt − T α̂ = 0

Yields an exact analytical solution:   α̂ = (1/T) Σ yt = ȳ

14.2

Copyright 1996 Lawrence C. Marsh

Review of Least Squares

(B.) Regression model without an intercept term:

yt = β xt + et

et = yt − β xt

SSE = Σ et² = Σ (yt − β xt)²

∂SSE/∂β = −2 Σ xt (yt − β̂ xt) = 0

Σ xt yt − β̂ Σ xt² = 0

This yields an exact analytical solution:   β̂ = Σ xt yt / Σ xt²

14.3

Copyright 1996 Lawrence C. Marsh

Review of Least Squares

(C.) Regression model with both an intercept and a slope:

yt = α + β xt + et        SSE = Σ (yt − α − β xt)²

∂SSE/∂α = −2 Σ (yt − α̂ − β̂ xt) = 0

∂SSE/∂β = −2 Σ xt (yt − α̂ − β̂ xt) = 0

This yields an exact analytical solution:

β̂ = Σ (xt − x̄)(yt − ȳ) / Σ (xt − x̄)²

α̂ = ȳ − β̂ x̄

14.4

Copyright 1996 Lawrence C. Marsh

Nonlinear Least Squares

(D.) Nonlinear regression model:

yt = xt^β + et

SSE = Σ (yt − xt^β)²

∂SSE/∂β = −2 Σ xt^β̂ ln(xt) (yt − xt^β̂) = 0

Σ [ xt^β̂ ln(xt) yt ] − Σ [ xt^(2β̂) ln(xt) ] = 0

PROBLEM: an exact analytical solution to this does not exist. We must use a numerical search algorithm to try to find the value of β̂ that satisfies this.

Copyright 1996 Lawrence C. Marsh

Find the Minimum of the Nonlinear SSE

SSE = Σ (yt − xt^β)²

[Figure: SSE plotted against β; the nonlinear least squares estimate β̂ is the value of β at the minimum of the curve]

Copyright 1996 Lawrence C. Marsh

The least squares principle is still appropriate when themodel is nonlinear, but it is harder to find the solution.

Conclusion14.7

Copyright 1996 Lawrence C. Marsh

Nonlinear least squares optimization methods:

The Gauss-Newton Method

Optional Appendix 14.8

Copyright 1996 Lawrence C. Marsh

The Gauss-Newton Algorithm

1. Apply the Taylor Series Expansion to the nonlinear model around some initial b(o).

2. Run Ordinary Least Squares (OLS) on the linear part of the Taylor Series to get b(m).

3. Perform a Taylor Series around the new b(m) to get b(m+1) .

4. Relabel b(m+1) as b(m) and rerun steps 2.-4.

5. Stop when (b(m+1) b(m) ) becomes very small.
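A minimal sketch of these iterations for the nonlinear model yt = xt^β + et introduced earlier in the chapter, using the updating form of the Gauss-Newton step (starting value and tolerances are illustrative; xt must be positive):

```python
# Gauss-Newton for y_t = x_t**b + e_t; f'(x, b) = x**b * ln(x).
import numpy as np

def gauss_newton(y, x, b0=1.0, tol=1e-8, max_iter=100):
    b = b0
    for _ in range(max_iter):
        f = x ** b
        fprime = f * np.log(x)                                    # derivative df/db
        step = np.sum(fprime * (y - f)) / np.sum(fprime ** 2)     # OLS on the linearized model
        b_new = b + step
        if abs(b_new - b) < tol:
            return b_new
        b = b_new
    return b
```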

14.9

Copyright 1996 Lawrence C. MarshThe Gauss-Newton Method

yt = f(Xt,b) + t for t = 1, . . . , n.

Do a Taylor Series Expansion around the vector b = b(o) as follows:

yt = f(Xt,b) + f’(Xt,b)(b - b) + t

where t؛ (b - b(o))

Tf’’(Xt,b)(b - b) + Rt + t

f(Xt,b) = f(Xt,b) + f’(Xt,b)(b - b)

+ (b - b)Tf’’(Xt,b)(b - b) + Rt

14.10

Copyright 1996 Lawrence C. Marshyt = f(Xt,b) + f’(Xt,b)(b - b) + t

yt - f(Xt,b) = f’(Xt,b)b - f’(Xt,b) b + t

yt - f(Xt,b) + f’(Xt,b) b = f’(Xt,b)b + t

yt = f’(Xt,b)b + t

where yt

؛yt - f(Xt,b) + f’(Xt,b) b

This is linear in b .

Gauss-Newton just runs OLS on thistransformed truncated Taylor series.

14.11

Copyright 1996 Lawrence C. Marsh

yt = f’(Xt,b)b + t

Gauss-Newton just runs OLS on thistransformed truncated Taylor series.

or y = f’(X,b)b + خ

for t = 1, . . . , n in matrix terms

b [ f’(X,b)T f’(X,b)]-1 f’(X,b)

T y

^

This is analogous to linear OLS where y = Xb + خ led to the solution: b XTX)XTy^except that X is replaced with the matrix of firstpartial derivatives: f’(Xt,b) and y is replaced by y

(i.e. “y” = y* and “X” = f’(X,b) )

14.12

Copyright 1996 Lawrence C. Marsh

Recall that: y*(o)

yf(X,b(o)) + f’(X,b) b؛

Now define: y

؛ y f(X,b(o))

Therefore: y

= y + f’(X,b) b

b [ f’(X,b)T f’(X,b)]-1 f’(X,b)

T y

^

Now substitute in for yin Gauss-Newton solution:

to get:

b = b(o) + [ f’(X,b)T f’(X,b)]-1 f’(X,b)

T y

^

14.13

Copyright 1996 Lawrence C. Marsh

b = b(o) + [ f’(X,b)T f’(X,b)]-1 f’(X,b)

T y

^

b(1) = b+ [ f’(X,b)T f’(X,b)]-1 f’(X,b)

T y

Now call this b value bas follows:^

More generally, in going from interation m toiteration (m+1) we obtain the general expression:

b(m+1) = b(m) + [ f’(X,b(m)T f’(X,b(m)]-1 f’(X,b(m)

T y(m)

14.14

Copyright 1996 Lawrence C. Marsh

b(m+1) = [ f’(X,b(m)T f’(X,b(m)]-1 f’(X,b(m))

T y*(m)

b(m+1) = b(m) + [ f’(X,b(m))T f’(X,b(m))]-1 f’(X,b(m))

T y(m)

Thus, the Gauss-Newton (nonlinear OLS) solutioncan be expressed in two alternative, but equivalent,forms:

1. replacement form:

2. updating form:

14.15

Copyright 1996 Lawrence C. Marsh

For example, consider Durbin’s Method of estimatingthe autocorrelation coefficient under a first-order autoregression regime:

y t = b1 + b2Xt 2 + . . . + bK Xt K + t for t = 1, . . . , n.

t = t - 1 + ut where u t satisfies the conditions

E u t = 0 , E u 2t = su2, E u t u s = 0 for s ° t.

Therefore, u t is nonautocorrelated and homoskedastic.

Durbin’s Method is to set aside a copy of the equation,lag it once, multiply by and subtract the new equation

from the original equation, then move the yt-1 term to

the right side and estimate along with the bs by OLS.

14.16

Copyright 1996 Lawrence C. Marsh

Durbin’s Method is to set aside a copy of the equation,lag it once, multiply by and subtract the new equation

from the original equation, then move the yt-1 term to

the right side and estimate along with the b’s by OLS.

y t = b1 + b2Xt 2 + b3 X t 3 + t for t = 1, . . . , n.

where t = t - 1 + ut

y t-1 = b1 + b2Xt -1, 2 + b3 Xt -1, 3 + t -1

Lag once and multiply by

Subtract from the original and move y t-1 to right side:

yt = b1-+ b2Xt 2 - Xt-1, 2 + b3(Xt 3 Xt-1, 3)+ y t-1+ ut

14.17

Copyright 1996 Lawrence C. Marsh

yt = b1-+ b2Xt 2 - Xt-1, 2 + b3(Xt 3 -Xt-1, 3) + y t-1+ ut

Now Durbin separates out the terms as follows:

yt = b1-+ b2Xt 2 - b2Xt-1 2 + b3Xt 3 - b3Xt-1 3+ y t-1+ ut

The structural (restricted,behavorial) equation is:

The corresponding reduced form (unrestricted) equation is:

yt = 1+ 2Xt, 2 + 3Xt-1, 2 + 4Xt, 3 + 5Xt-1, 3 + 6yt-1+ u t

1 = b1- 2 = b23= - b2 4 = b3 5= - b3 6=

14.18

Copyright 1996 Lawrence C. Marsh

Given OLS estimates: 1 2 3 4 5 6

^^^^^ ^

we can get three separate and distinct estimates for

3

2

^ ^

^

5

4^

^^ 6^^

These three separate estimates of are in conflict !!!It is difficult to know which one to use as “the”legitimate estimate of Durbin used the last one.

1 = b1- 2 = b23= - b2 4 = b3 5= - b3 6=

14.19

Copyright 1996 Lawrence C. Marsh

The problem with Durbin’s Method is that it ignoresthe inherent nonlinear restrictions implied by this structural model. To get a single (i.e. unique) estimatefor the implied nonlinear restrictions must be incorporated directly into the estimation process.

Consequently, the above structural equation should beestimated using a nonlinear method such as theGauss-Newton algorithm for nonlinear least squares.

yt = b1-+ b2Xt 2 - b2Xt -1, 2 + b3Xt 3 - b3Xt -1, 3+ yt-1+ ut

14.20

Copyright 1996 Lawrence C. Marsh

yt = b1-+ b2Xt 2 - b2Xt-1, 2 + b3Xt 3 - b3Xt-1, 3+ yt-1+ ut

f’(Xt,b) [ ]yt

= =X t, 2 X t-1,2)

=X t, 3 X t-1,3)

yt

= ( - b1-b2Xt-1,2 - b3Xt-1,3+ y t-1 )

yt

b1

yt

b2

yt

b3

yt

b1

yt

b2

yt

b3

14.21

Copyright 1996 Lawrence C. Marsh

where yt(m)

yt - f(Xt,bm) + f’(Xt,b(m) b(m؛

(m+1) = [ f’(X,bm)T f’(X,bm)]-1 f’(X,b(m)

T y(m

^

f(Xt,b) = b1-+ b2Xt 2 - b2Xt-1 2 + b3Xt 3 - b3Xt-1 3+ y t-1

b(m) =

b

1(m)

(m)

b2(m)

b3(m)

Iterate until convergence.

f’(Xt,bm [ ]yt

(m)

yt

b1(m)

yt

b2(m)

yt

b3(m)

14.22

Copyright 1996 Lawrence C. Marsh

Distributed

Lag Models

Chapter 15


15.1

Copyright 1996 Lawrence C. Marsh

The Distributed Lag Effect

Economic actionat time t

Effect at time t

Effect at time t+1

Effect at time t+2

15.2

Copyright 1996 Lawrence C. Marsh

Unstructured Lags

yt = + 0 xt + 1 xt-1 + 2 xt-2 + . . . +n xt-n + et

“n” unstructured lags

no systematic structure imposed on the ’s

the ’s are unrestricted

15.3

Copyright 1996 Lawrence C. Marsh

Problems with Unstructured Lags

1. n observations are lost with n-lag setup.

2. high degree of multicollinearity among xt-j’s.

3. many degrees of freedom used for large n.

4. could get greater precision using structure.

15.4

Copyright 1996 Lawrence C. Marsh

The Arithmetic Lag Structure

proposed by Irving Fisher (1937)

the lag weights decline linearly

Imposing the relationship:   βi = (n − i + 1) γ

β0 = (n+1)γ ,  β1 = nγ ,  β2 = (n−1)γ ,  β3 = (n−2)γ ,  . . . ,  βn−2 = 3γ ,  βn−1 = 2γ ,  βn = γ

We only need to estimate one coefficient, γ, instead of n+1 coefficients, β0 , ... , βn .

15.5

Copyright 1996 Lawrence C. Marsh

Arithmetic Lag Structure

yt = + 0 xt + 1 xt-1 + 2 xt-2 + . . . +n xt-n + et

yt = + (n+1) xt + n xt-1 + (n-1) xt-2 + . . . + xt-n + et

Step 1: impose the restriction: # = (n - # + 1)

Step 2: factor out the unknown coefficient, .

yt = + [(n+1)xt + nxt-1 + (n-1)xt-2 + . . . + xt-n] + et

15.6

Copyright 1996 Lawrence C. Marsh

Arithmetic Lag Structure

Step 3: define zt :

zt = [ (n+1)xt + n xt−1 + (n−1)xt−2 + . . . + xt−n ]

Step 4: decide the number of lags, n.

For n = 4:   zt = [ 5xt + 4xt−1 + 3xt−2 + 2xt−3 + xt−4 ]

Step 5: run the least squares regression on:

yt = α + γ zt + et
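A minimal sketch of building zt for this structure (the weighting matches the n = 4 example above; indices assume a plain numpy array):

```python
# Construct z_t = (n+1)x_t + n x_{t-1} + ... + 1 * x_{t-n} for each usable t.
import numpy as np

def arithmetic_lag_z(x, n=4):
    T = len(x)
    weights = np.arange(n + 1, 0, -1)                 # (n+1), n, ..., 1
    return np.array([weights @ x[t - n:t + 1][::-1] for t in range(n, T)])
# Then regress y[n:] on a constant and this z to estimate alpha and gamma.
```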

15.7

Copyright 1996 Lawrence C. Marsh

Arithmetic Lag Structure

i

i

0 = (n+1)

1 = n

2 = (n-1)

n =

.

.

.

0 1 2 . . . . . n n+1

..

.

.

linear lag structure

15.8

Copyright 1996 Lawrence C. Marsh

Polynomial Lag Structure

proposed by Shirley Almon (1965)

the lag weights fit a polynomial:

βi = γ0 + γ1 i + γ2 i² + . . . + γp i^p

n = the length of the lag;  p = the degree of the polynomial

For example, a quadratic polynomial with p = 2 and n = 4:

βi = γ0 + γ1 i + γ2 i²

β0 = γ0
β1 = γ0 + γ1 + γ2
β2 = γ0 + 2γ1 + 4γ2
β3 = γ0 + 3γ1 + 9γ2
β4 = γ0 + 4γ1 + 16γ2

15.9

Copyright 1996 Lawrence C. Marsh

Polynomial Lag Structureyt = + 0 xt + 1 xt-1 + 2 xt-2 + 3 xt-3 +4 xt-4 + et

yt = + 0xt + 0 + 1 + 2xt-1 + (0 + 21 + 42)xt-2

+ (0 + 31 + 92)xt-3+ (0 + 41 + 162)xt-4 + et

Step 2: factor out the unknown coefficients: 0, 1, 2.

yt = + 0 [xt + xt-1 + xt-2 + xt-3 + xt-4] + 1 [xt + xt-1 + 2xt-2 + 3xt-3 + 4xt-4] + 2 [xt + xt-1 + 4xt-2 + 9xt-3 + 16xt-4] + et

Step 1: impose the restriction: i = 0 + 1i + 2i 2

15.10

Copyright 1996 Lawrence C. Marsh

Polynomial Lag Structure

Step 3: define zt0 , zt1 and zt2 for γ0 , γ1 , and γ2 :

zt0 = [ xt + xt−1 + xt−2 + xt−3 + xt−4 ]

zt1 = [ xt−1 + 2xt−2 + 3xt−3 + 4xt−4 ]

zt2 = [ xt−1 + 4xt−2 + 9xt−3 + 16xt−4 ]

15.11

Copyright 1996 Lawrence C. Marsh

Polynomial Lag Structure

Step 4: Regress yt on zt0 , zt1 and zt2 .

yt = + 0 z t0 + 1 z t1 + 2 z t2 + et

Step 5: Express i‘s in terms of 0 , 1 , and 2.

^ ^ ^ ^

0 = 0

1 = 0 + 1 + 2

2 = 0 + 21 + 42

3 = 0 + 31 + 92

4 = 0 + 41 + 162

^^

^

^

^

^

^ ^ ^

^ ^ ^

^ ^ ^

^ ^ ^

15.12

Copyright 1996 Lawrence C. Marsh

Polynomial Lag Structure

.. . .

.

0 1 2 3 4 i

i

Figure 15.3

0

1

23

4

15.13

Copyright 1996 Lawrence C. Marsh

Geometric Lag Structure

infinite distributed lag model:

yt = α + β0 xt + β1 xt−1 + β2 xt−2 + . . . + et

yt = α + Σ (i = 0 to ∞) βi xt−i + et      (15.3.1)

geometric lag structure:

βi = β φⁱ      where |φ| < 1 and i = 0, 1, 2, . . .

15.14

Copyright 1996 Lawrence C. Marsh

Geometric Lag Structure

yt = + 0 xt + 1 xt-1 + 2 xt-2 + 3 xt-3 + . . . + et

yt = + xt + xt-1 + xt-2 + xt-3 + . . .) + et

infinite unstructured lag:

infinite geometric lag:

Substitute i = i

0 = 1 = 2 = 2

3 = 3

. ..

15.15

Copyright 1996 Lawrence C. Marsh

Geometric Lag Structure

yt = α + β( xt + φ xt−1 + φ² xt−2 + φ³ xt−3 + . . . ) + et

impact multiplier:               β

interim multiplier (3-period):   β + βφ + βφ²

long-run multiplier:             β( 1 + φ + φ² + φ³ + . . . )

15.16

Copyright 1996 Lawrence C. Marsh

Geometric Lag Structure

i

Figure 15.5

.

..

. .0 1 2 3 4 i

1 =

2 = 2

3 = 3

4 = 4

0 =

geometrically declining weights

15.17

Copyright 1996 Lawrence C. Marsh

Geometric Lag Structure

yt = α + β( xt + φ xt−1 + φ² xt−2 + φ³ xt−3 + . . . ) + et

Problem: how do we estimate the infinite number of geometric lag coefficients?

Answer: use the Koyck transformation.

15.18

Copyright 1996 Lawrence C. Marsh

The Koyck Transformation

yt = α + β( xt + φ xt−1 + φ² xt−2 + φ³ xt−3 + . . . ) + et

Lag everything once, multiply by φ and subtract from the original:

φ yt−1 = φα + β( φ xt−1 + φ² xt−2 + φ³ xt−3 + . . . ) + φ et−1

yt − φ yt−1 = α(1 − φ) + β xt + (et − φ et−1)

15.19

Copyright 1996 Lawrence C. Marsh

The Koyck Transformation

yt − φ yt−1 = α(1 − φ) + β xt + (et − φ et−1)

Solve for yt by adding φ yt−1 to both sides:

yt = α(1 − φ) + φ yt−1 + β xt + (et − φ et−1)

yt = δ1 + δ2 yt−1 + δ3 xt + νt

15.20

Copyright 1996 Lawrence C. Marsh

The Koyck Transformation

yt = α(1 − φ) + φ yt−1 + β xt + (et − φ et−1)

yt = δ1 + δ2 yt−1 + δ3 xt + νt

Defining δ1 = α(1 − φ), δ2 = φ, and δ3 = β, use ordinary least squares and then recover:

φ̂ = δ̂2        β̂ = δ̂3        α̂ = δ̂1 / (1 − δ̂2)

The original structural parameters can now be estimated in terms of these reduced form parameter estimates.
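A minimal sketch of that recovery from the Koyck regression (variable names follow the reduced-form notation above):

```python
# Estimate y_t = delta1 + delta2*y_{t-1} + delta3*x_t by OLS and recover alpha, phi, beta.
import numpy as np

def koyck(y, x):
    Y = y[1:]
    X = np.column_stack([np.ones(len(y) - 1), y[:-1], x[1:]])
    d1, d2, d3 = np.linalg.lstsq(X, Y, rcond=None)[0]
    phi_hat, beta_hat = d2, d3
    alpha_hat = d1 / (1.0 - d2)
    return alpha_hat, phi_hat, beta_hat      # lag weights: beta_hat * phi_hat**i
```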

15.21

Copyright 1996 Lawrence C. Marsh

Geometric Lag Structure

0 =

1 =

2 = 2

3 = 3

. ..

^ ^

^ ^ ^

^ ^ ^

^ ^ ^

yt = + xt + xt-1 + xt-2 + xt-3 + . . .) + et^ ^ ^ ^ ^ ^

yt = + 0 xt + 1 xt-1 + 2 xt-2 + 3 xt-3 + . . . + et ^^ ^ ^ ^ ^

15.22

Copyright 1996 Lawrence C. Marsh

Durbin’s h-test for autocorrelation

Estimates are inconsistent if the geometric lag model is autocorrelated, but the Durbin-Watson test is biased in favor of no autocorrelation, so use:

h = (1 − d/2) √[ (T − 1) / ( 1 − (T − 1) [se(b2)]² ) ]

h = Durbin’s h-test statistic
d = Durbin-Watson test statistic
se(b2) = standard error of the estimate b2 on the lagged dependent variable
T = sample size
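A minimal sketch of the h calculation from quantities that are usually already in hand:

```python
# Durbin's h from the DW statistic d, sample size T, and se of the y_{t-1} coefficient.
import numpy as np

def durbin_h(d, T, se_b2):
    rho_hat = 1.0 - d / 2.0
    return rho_hat * np.sqrt((T - 1) / (1.0 - (T - 1) * se_b2 ** 2))
```

Note that the statistic is undefined when (T − 1)[se(b2)]² ≥ 1.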

15.23

Copyright 1996 Lawrence C. Marsh

yt = + x*t + et

Adaptive Expectations

yt = credit card debt

x*t = expected (anticipated) income

(x*t is not observable)

15.24

Copyright 1996 Lawrence C. MarshAdaptive Expectations

x*t - x*t-1 = (xt-1 - x*t-1)

adjust expectations based on past realization:

15.25

Copyright 1996 Lawrence C. MarshAdaptive Expectations

x*t - x*t-1 = (xt-1 - x*t-1)

x*t = xt-1 + (1- ) x*t-1

rearrange to get:

xt-1 = [x*t - (1- ) x*t-1]

or

15.26

Copyright 1996 Lawrence C. MarshAdaptive Expectations

yt = + x*t + et

Lag this model once and multiply by (1):

yt = - (1)yt-1+ [x*t - (1)x*t-1]

+ et - (1)et-1

subtract this from the original to get:

(1)yt-1 = (1) + (1) x*t-1 + (1)et-1

15.27

Copyright 1996 Lawrence C. MarshAdaptive Expectations

yt = - (1)yt-1+ [x*t - (1)x*t-1]

+ et - (1)et-1

Since xt-1 = [x*t - (1- ) x*t-1] we get:

yt = - (1)yt-1+ xt-1 + ut

where ut = et - (1)et-1

15.28

Copyright 1996 Lawrence C. MarshAdaptive Expectations

yt = αλ + (1 − λ) yt−1 + βλ xt−1 + ut

Use ordinary least squares regression on:

yt = δ1 + δ2 yt−1 + δ3 xt−1 + ut

and we get:

λ̂ = (1 − δ̂2)        α̂ = δ̂1 / (1 − δ̂2)        β̂ = δ̂3 / (1 − δ̂2)

15.29

Copyright 1996 Lawrence C. Marsh

Partial Adjustment

Inventories partially adjust, 0 < γ < 1, towards the optimal or desired level, y*t :

yt − yt−1 = γ ( y*t − yt−1 )

y*t = α + β xt + et

15.30

Copyright 1996 Lawrence C. Marsh

Partial Adjustment

Solving for yt :

yt − yt−1 = γ ( y*t − yt−1 )

          = γ ( α + β xt + et − yt−1 )

          = γα + γβ xt − γ yt−1 + γ et

yt = γα + (1 − γ) yt−1 + γβ xt + γ et

15.31

Copyright 1996 Lawrence C. Marsh

Partial Adjustment

yt = γα + (1 − γ) yt−1 + γβ xt + γ et

Use ordinary least squares regression on:

yt = δ1 + δ2 yt−1 + δ3 xt + νt

to get:

γ̂ = (1 − δ̂2)        α̂ = δ̂1 / (1 − δ̂2)        β̂ = δ̂3 / (1 − δ̂2)

15.32

Copyright 1996 Lawrence C. Marsh

Time Series

Analysis

Chapter 16


16.1

Copyright 1996 Lawrence C. Marsh

Previous Chapters used Economic Models

1. economic model for dependent variable of interest.

2. statistical model consistent with the data.

3. estimation procedure for parameters using the data.

4. forecast variable of interest using estimated model.

Time Series Analysis does not use this approach.

16.2

Copyright 1996 Lawrence C. Marsh

Time Series Analysis is useful for short-term forecasting only.

Time Series Analysis does not generally incorporate all of the economic relationships found in economic models.

Time Series Analysis uses more statistics and less economics.

Long-term forecasting requires incorporating more involved behavioral economic relationships into the analysis.

16.3

Copyright 1996 Lawrence C. Marsh

Univariate Time Series Analysis can be used to relate the current values of a single economic variable to:

1. its past values

2. the values of current and past random errors

Other variables are not used in univariate time series analysis.

16.4

Copyright 1996 Lawrence C. Marsh

1. autoregressive (AR)

2. moving average (MA)

3. autoregressive moving average (ARMA)

Three types of Univariate Time Series Analysis processes will be discussed in this chapter:

16.5

Copyright 1996 Lawrence C. Marsh

1. its past values.

2. the past values of the other forecasted variables.

3. the values of current and past random errors.

Multivariate Time Series Analysis can be used to relate the current value of each of several economic variables to:

Vector autoregressive models, discussed later in this chapter, are multivariate time series models.

16.6

Copyright 1996 Lawrence C. Marsh

First-Order Autoregressive Processes, AR(1):

yt = δ + θ1yt-1 + et ,  t = 1, 2,...,T. (16.1.1)

δ is the intercept.

θ1 is a parameter, generally between -1 and +1.

et is an uncorrelated random error with mean zero and variance σe².

16.7

Copyright 1996 Lawrence C. Marsh

Autoregressive Process of order p, AR(p) :

yt = δ + θ1yt-1 + θ2yt-2 +...+ θpyt-p + et (16.1.2)

δ is the intercept.

The θi's are parameters, generally between -1 and +1.

et is an uncorrelated random error with mean zero and variance σe².

16.8

Copyright 1996 Lawrence C. Marsh

AR models always have one or more lagged dependent variables on the right-hand side.

Consequently, least squares is no longer a best linear unbiased estimator (BLUE), but it does have some good asymptotic properties, including consistency.

Properties of least squares estimator:

16.9

Copyright 1996 Lawrence C. Marsh

AR(2) model of U.S. unemployment rates

yt = 0.5051 + 1.5537 yt-1 - 0.6515 yt-2

(0.1267)  (0.0707)  (0.0708)    (standard errors in parentheses)

Note: Q1-1948 through Q1-1978 from J.D. Cryer (1986); see unempl.dat

The estimated coefficient on yt-1 is positive; the coefficient on yt-2 is negative.

16.10
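Since the AR(2) is estimated by least squares on lagged values of the series, a short sketch of that regression may help. The function name and the array `y` are illustrative; with the quarterly unemployment series it should roughly reproduce the coefficients above.

```python
# Sketch (not the textbook's code): estimate an AR(2) by least squares on
# lagged values of the series; `y` is assumed to be a 1-D numpy array.
import numpy as np

def fit_ar2(y):
    Y = y[2:]
    X = np.column_stack([np.ones(len(Y)), y[1:-1], y[:-2]])   # const, y_{t-1}, y_{t-2}
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    sigma2 = resid @ resid / (len(Y) - 3)
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))    # conventional OLS standard errors
    return coef, se   # coef ~ (delta_hat, theta1_hat, theta2_hat)
```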

Copyright 1996 Lawrence C. Marsh

Choosing the lag length, p, for AR(p):

The Partial Autocorrelation Function (PAF)

The PAF is the sequence of correlations between (yt and yt-1), (yt and yt-2), (yt and yt-3), and so on, given that the effects of earlier lags on yt are held constant.

16.11

Copyright 1996 Lawrence C. Marsh

Partial Autocorrelation Function

Data simulated from this model:

yt = 0.5 yt-1 + 0.3 yt-2 + et

[Figure: sample partial autocorrelations θ̂kk plotted against lag k, with significance bounds at ±2/√T]

θkk is the last (kth) coefficient in a kth order AR process.

This sample PAF suggests a second order process, AR(2), which is correct.

16.12
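A sketch of how the sample PAF might be computed and compared with the ±2/√T bounds, assuming the statsmodels package is available; the function name and lag limit are illustrative.

```python
# Sketch: compute the sample partial autocorrelations and flag the lags whose
# partial autocorrelation lies outside the +/- 2/sqrt(T) bounds.
import numpy as np
from statsmodels.tsa.stattools import pacf

def suggest_ar_order(y, max_lag=12):
    T = len(y)
    bound = 2.0 / np.sqrt(T)
    paf = pacf(y, nlags=max_lag)          # paf[0] is lag 0 (always 1)
    significant = [k for k in range(1, max_lag + 1) if abs(paf[k]) > bound]
    return paf, significant               # an AR(p) should show spikes only at lags 1..p
```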

Copyright 1996 Lawrence C. Marsh

Using the AR Model for Forecasting:

unemployment rate: yT-1 = 6.63 and yT = 6.20

ŷT+1 = δ̂ + θ̂1 yT + θ̂2 yT-1
      = 0.5051 + (1.5537)(6.2) - (0.6515)(6.63)
      = 5.8186

ŷT+2 = δ̂ + θ̂1 ŷT+1 + θ̂2 yT
      = 0.5051 + (1.5537)(5.8186) - (0.6515)(6.2)
      = 5.5062

ŷT+3 = δ̂ + θ̂1 ŷT+2 + θ̂2 ŷT+1
      = 0.5051 + (1.5537)(5.5062) - (0.6515)(5.8186)
      = 5.2693

16.13
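The recursive forecasts above are easy to reproduce; this sketch (illustrative function name) feeds each forecast back in as data for the next step.

```python
# Sketch: recursive forecasts from an estimated AR(2); forecasts beyond one
# step ahead use earlier forecasts in place of unobserved future values.
def ar2_forecast(delta, theta1, theta2, y_T, y_Tm1, steps=3):
    history = [y_Tm1, y_T]
    forecasts = []
    for _ in range(steps):
        f = delta + theta1 * history[-1] + theta2 * history[-2]
        forecasts.append(f)
        history.append(f)
    return forecasts

print(ar2_forecast(0.5051, 1.5537, -0.6515, y_T=6.20, y_Tm1=6.63))
# -> approximately [5.8186, 5.5062, 5.2693]
```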

Copyright 1996 Lawrence C. Marsh

Moving Average Process of order q, MA(q):

yt = μ + et + α1et-1 + α2et-2 +...+ αqet-q (16.2.1)

μ is the intercept.

The αi's are unknown parameters.

et is an uncorrelated random error with mean zero and variance σe².

16.14

Copyright 1996 Lawrence C. Marsh

An MA(1) process:

yt = μ + et + α1et-1 (16.2.2)

Minimize the sum of squared errors:

S(μ, α1) = Σt=1..T et² = Σt=1..T (yt - μ - α1et-1)²   (16.2.3)

16.15
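Because the errors must be recovered recursively, S(μ, α1) is a nonlinear least squares problem. Here is a sketch that conditions on e0 = 0 and minimizes S numerically, assuming scipy is available; the function names are illustrative.

```python
# Sketch: nonlinear least squares for an MA(1), conditioning on e_0 = 0.
import numpy as np
from scipy.optimize import minimize

def ma1_sse(params, y):
    mu, alpha1 = params
    e = np.zeros(len(y))
    for t in range(len(y)):
        e_prev = e[t - 1] if t > 0 else 0.0   # condition on e_0 = 0
        e[t] = y[t] - mu - alpha1 * e_prev    # recursively recover the errors
    return np.sum(e**2)

def fit_ma1(y):
    result = minimize(ma1_sse, x0=[np.mean(y), 0.0], args=(y,), method="Nelder-Mead")
    return result.x   # (mu_hat, alpha1_hat)
```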

Copyright 1996 Lawrence C. Marsh

stationary: A stationary time series is one whose mean, variance, and autocorrelation function do not change over time.

nonstationary: A nonstationary time series is one whose mean, variance, or autocorrelation function change over time.

Stationary vs. Nonstationary

16.16

Copyright 1996 Lawrence C. Marsh

First Differencing is often used to transform a nonstationary series into a stationary series:

yt = zt - zt-1

where zt is the original nonstationary series

and yt is the new stationary series.

16.17

Copyright 1996 Lawrence C. Marsh

Choosing the lag length, q, for MA(q):

The Autocorrelation Function (AF)

The AF is the sequence of correlations between (yt and yt-1), (yt and yt-2), (yt and yt-3), and so on, without holding the effects of earlier lags on yt constant.

The PAF controls for the effects of previous lags, but the AF does not control for such effects.

16.18

Copyright 1996 Lawrence C. Marsh

Autocorrelation Function

Data simulated from this model:

yt = et - 0.9 et-1

[Figure: sample autocorrelations rk plotted against lag k, with significance bounds at ±2/√T]

rk is the sample autocorrelation at lag k; for an MA(q) process the autocorrelations cut off after lag q.

This sample AF suggests a first order process, MA(1), which is correct.

16.19

Copyright 1996 Lawrence C. Marsh

Autoregressive Moving Average ARMA(p,q)

An ARMA(1,2) has one autoregressive lag and two moving average lags:

yt = δ + θ1yt-1 + et + α1et-1 + α2et-2

16.20

Copyright 1996 Lawrence C. Marsh

Integrated Processes

A time series with an upward or downward trend over time is nonstationary.

Many nonstationary time series can be made stationary by differencing them one or more times.

Such time series are called integrated processes.

16.21

Copyright 1996 Lawrence C. Marsh

The number of times a series must be differenced to make it stationary is the order of the integrated process, d.

An autocorrelation function, AF, with large, significant autocorrelations for many lags may require more than one differencing to become stationary.

Check the new AF after each differencing to determine if further differencing is needed.

16.22

Copyright 1996 Lawrence C. Marsh

Unit Root

zt = θ1zt-1 + μ + et + α1et-1 (16.3.2)

-1 < θ1 < 1    stationary ARMA(1,1)

θ1 = 1    nonstationary process

θ1 = 1 is called a unit root

16.23

Copyright 1996 Lawrence C. Marsh

Unit Root Tests

zt = θ1zt-1 + μ + et + α1et-1 (16.3.3)

zt - zt-1 = (θ1 - 1)zt-1 + μ + et + α1et-1

Δzt = θ1*zt-1 + μ + et + α1et-1

where Δzt = zt - zt-1 and θ1* = θ1 - 1

Testing θ1* = 0 is equivalent to testing θ1 = 1

16.24

Copyright 1996 Lawrence C. Marsh

Unit Root Tests

H0: θ1* = 0  vs.  H1: θ1* < 0 (16.3.4)

Computer programs typically use one of the following tests for unit roots:

Dickey-Fuller Test

Phillips-Perron Test

16.25
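A sketch of a Dickey-Fuller style unit root check using statsmodels' adfuller (assumed available); the helper name and significance level are illustrative.

```python
# Sketch: augmented Dickey-Fuller test; the null hypothesis is a unit root
# (nonstationarity), so a small p-value rejects the unit root.
from statsmodels.tsa.stattools import adfuller

def has_unit_root(z, alpha=0.05):
    stat, pvalue, *_ = adfuller(z)
    return pvalue > alpha        # True if we fail to reject the unit-root null
```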

Copyright 1996 Lawrence C. Marsh

Autoregressive Integrated Moving Average ARIMA(p,d,q)

An ARIMA(p,d,q) model represents an AR(p) - MA(q) process that has been differenced (integrated, I(d)) d times.

yt = δ + θ1yt-1 +...+ θpyt-p + et + α1et-1 +...+ αqet-q

16.26

Copyright 1996 Lawrence C. Marsh

The Box-Jenkins approach:

1. Identification: determining the values of p, d, and q.

2. Estimation: linear or nonlinear least squares.

3. Diagnostic Checking: does the model fit well, with no residual autocorrelation?

4. Forecasting: short-term forecasts of future yt values.

16.27
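One way to run the Box-Jenkins cycle in practice is with statsmodels' ARIMA class (assumed available); the order and step count here are placeholders chosen for illustration.

```python
# Sketch of the Box-Jenkins cycle: identify (p, d, q) from the AF/PAF, estimate,
# check the residuals, then produce short-term forecasts.
from statsmodels.tsa.arima.model import ARIMA

def box_jenkins_forecast(y, order=(1, 1, 1), steps=4):
    model = ARIMA(y, order=order)        # order = (p, d, q)
    fitted = model.fit()
    resid = fitted.resid                 # diagnostic checking: inspect residual autocorrelation
    return fitted.forecast(steps=steps), resid
```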

Copyright 1996 Lawrence C. Marsh

Vector Autoregressive (VAR) Models

Use a VAR for two or more interrelated time series:

yt = θ0 + θ1yt-1 +...+ θpyt-p + α1xt-1 +...+ αpxt-p + et

xt = δ0 + δ1yt-1 +...+ δpyt-p + φ1xt-1 +...+ φpxt-p + ut

16.28

Copyright 1996 Lawrence C. Marsh

1. extension of AR model.

2. all variables endogenous.

3. no structural (behavioral) economic model.

4. all variables jointly determined (over time).

5. no simultaneous equations (same time).

Vector Autoregressive (VAR) Models

16.29

Copyright 1996 Lawrence C. Marsh

The random error terms in a VAR model may be correlated if they are affected by relevant factors that are not in the model, such as government actions or national/international events.

Since the VAR equations all have exactly the same set of explanatory variables, the usual seemingly unrelated regression (SUR) estimation produces exactly the same estimates as least squares on each equation separately.

16.30

Copyright 1996 Lawrence C. Marsh

Consequently, regardless of whether the VAR random error terms are correlated or not, least squares estimation of each equation separately will provide consistent regression coefficient estimates.

Least Squares is Consistent

16.31

Copyright 1996 Lawrence C. Marsh

VAR Model Specification

To determine the lag length, p, use:

1. Akaike's AIC criterion

2. Schwarz's SIC criterion

These methods were discussed in Chapter 15.

16.32
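A sketch of lag-length selection with statsmodels' VAR (assumed available), letting the AIC, or the SIC via "bic", pick p; the data layout and function name are illustrative.

```python
# Sketch: fit a VAR and let an information criterion choose the lag length;
# `data` is a (T x 2) array whose columns are the y and x series.
from statsmodels.tsa.api import VAR

def fit_var(data, maxlags=8, criterion="aic"):     # criterion may also be "bic" (Schwarz's SIC)
    results = VAR(data).fit(maxlags=maxlags, ic=criterion)
    return results.k_ar, results                   # chosen lag length and the fitted model
```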

Copyright 1996 Lawrence C. Marsh

Spurious Regressions

yt = β1 + β2xt + νt

where νt = ρ1νt-1 + μt

-1 < ρ1 < 1    νt is I(0) (i.e. d = 0)

ρ1 = 1    νt is I(1) (i.e. d = 1)

If ρ1 = 1, least squares estimates of β2 may appear highly significant even when the true β2 = 0.

16.33
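A small simulation (illustrative only) makes the point concrete: two independent random walks regressed on each other routinely produce a large t-statistic on β2 even though the true β2 is zero.

```python
# Sketch: spurious regression of one random walk on another, independent one.
import numpy as np

rng = np.random.default_rng(1)
T = 200
x = np.cumsum(rng.normal(size=T))        # independent I(1) series
y = np.cumsum(rng.normal(size=T))

X = np.column_stack([np.ones(T), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
s2 = resid @ resid / (T - 2)
se_b2 = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
print("t-statistic on beta2:", b[1] / se_b2)    # frequently far outside +/- 2
```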

Copyright 1996 Lawrence C. Marsh

Cointegration

yt = β1 + β2xt + νt

If xt and yt are nonstationary I(1)

we might expect that νt is also I(1).

However, if xt and yt are nonstationary I(1)

but νt is stationary I(0), then xt and yt are

said to be cointegrated.

16.34

Copyright 1996 Lawrence C. Marsh

Cointegrated VAR(1) Model

VAR(1) model:

yt = θ0 + θ1yt-1 + α1xt-1 + et

xt = δ0 + δ1yt-1 + φ1xt-1 + ut

If xt and yt are both I(1) and are cointegrated,

use an Error Correction Model instead of the VAR(1).

16.35

Copyright 1996 Lawrence C. Marsh

Error Correction Model

Δyt = θ0 + (θ1 - 1)yt-1 + α1xt-1 + et

Δxt = δ0 + δ1yt-1 + (φ1 - 1)xt-1 + ut

where Δyt = yt - yt-1 and Δxt = xt - xt-1

(continued)

16.36

Copyright 1996 Lawrence C. Marsh

Error Correction Model

Δyt = θ0* + γ1(yt-1 - β1 - β2xt-1) + et

Δxt = δ0* + γ2(yt-1 - β1 - β2xt-1) + ut

where

θ0* = θ0 + γ1β1        δ0* = δ0 + γ2β1

γ1 = θ1 - 1        γ2 = δ1

β2 = α1 / (1 - θ1) = (1 - φ1) / δ1

16.37

Copyright 1996 Lawrence C. Marsh

Estimating an Error Correction Model

Step 1:

Estimate by least squares:

yt-1 = β1 + β2xt-1 + νt-1

to get the residuals:

ν̂t-1 = yt-1 - β̂1 - β̂2xt-1

16.38

Copyright 1996 Lawrence C. Marsh

Estimating an Error Correction Model

Step 2:

Estimate by least squares:

Δyt = θ0* + γ1ν̂t-1 + et

Δxt = δ0* + γ2ν̂t-1 + ut

16.39
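A sketch of the two-step estimation above using plain OLS; `y` and `x` are assumed to be cointegrated I(1) arrays, and the helper names are illustrative.

```python
# Sketch: two-step error correction model estimation.
import numpy as np

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def error_correction_model(y, x):
    T = len(y)
    # Step 1: cointegrating regression y_t = beta1 + beta2*x_t + nu_t, keep residuals
    b1, b2 = ols(np.column_stack([np.ones(T), x]), y)
    nu_hat = y - b1 - b2 * x
    # Step 2: regress the first differences on the lagged residual
    dy, dx, nu_lag = np.diff(y), np.diff(x), nu_hat[:-1]
    Z = np.column_stack([np.ones(T - 1), nu_lag])
    gamma1 = ols(Z, dy)[1]      # adjustment coefficient in the y equation
    gamma2 = ols(Z, dx)[1]      # adjustment coefficient in the x equation
    return (b1, b2), (gamma1, gamma2)
```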

Copyright 1996 Lawrence C. Marsh

Using cointegrated I(1) variables in a VAR model expressed solely in terms of first differences and lags of first differences is a misspecification.

The correct specification is to use an

Error Correction Model

16.40

Copyright 1996 Lawrence C. Marsh

Chapter 17

Copyright © 1997 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond that permitted in Section 117 of the 1976 United States Copyright Act without the express written permission of the copyright owner is unlawful. Request for further information should be addressed to the Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution or resale. The Publisher assumes no responsibility for errors, omissions, or damages, caused by the use of these programs or from the use of the information contained herein.

Guidelines for a Research Project

17.1

Copyright 1996 Lawrence C. Marsh


Formulation: economic model ====> econometric model.

Estimation: selecting an appropriate method.

Interpretation: how the xt's impact the yt.

Inference: testing, intervals, prediction.

What the Book Has Covered

17.2

Copyright 1996 Lawrence C. Marsh

Topics for This Chapter

1. Types of Data by Source

2. Nonexperimental Data

3. Text Data vs. Electronic Data

4. Selecting a Topic

5. Writing an Abstract

6. Research Report Format

17.3

Copyright 1996 Lawrence C. Marsh

Types of Data by Source

i) Experimental Data: data from controlled experiments.

ii) Observational Data: data passively generated by society.

iii) Survey Data: data collected through interviews.

17.4

Copyright 1996 Lawrence C. Marsh

Time vs. Cross-Section

Time Series Data: data collected at distinct points in time (e.g. weekly sales, daily stock price, annual budget deficit, monthly unemployment).

Cross Section Data: data collected over samples of units, individuals, households, or firms at a particular point in time (e.g. salary, race, gender, unemployment by state).

17.5

Copyright 1996 Lawrence C. Marsh

Micro vs. Macro

Micro Data: data collected on individual economic decision-making units such as individuals, households, or firms.

Macro Data: data resulting from pooling or aggregating over individuals, households, or firms at the local, state, or national level.

17.6

Copyright 1996 Lawrence C. Marsh

Flow vs. Stock

Flow Data: an outcome measured over a period of time, such as the consumption of gasoline during the last quarter of 1997.

Stock Data: an outcome measured at a particular point in time, such as crude oil held by Chevron in US storage tanks on April 1, 1997.

17.7

Copyright 1996 Lawrence C. Marsh

Quantitative vs. Qualitative

Quantitative Data: outcomes such as prices or income that may be expressed as numbers or some transformation of them (e.g. wages, trade deficit).

Qualitative Data: outcomes that are of an "either-or" nature (e.g. male, home owner, Methodist, bought a car last year, voted in the last election).

17.8

Copyright 1996 Lawrence C. Marsh

International Data

International Financial Statistics (IMF monthly).

Basic Statistics of the Community (OECD annual).

Consumer Price Indices in the European Community (OECD annual).

World Statistics (UN annual).

Yearbook of National Accounts Statistics (UN).

FAO Trade Yearbook (annual).

17.9

Copyright 1996 Lawrence C. Marsh

United States Data

Survey of Current Business (BEA monthly).

Handbook of Basic Economic Statistics (BES).

Monthly Labor Review (BLS monthly).

Federal Reserve Bulletin (FRB monthly).

Statistical Abstract of the US (BC annual).

Economic Report of the President (CEA annual).

Economic Indicators (CEA monthly).

Agricultural Statistics (USDA annual).

Agricultural Situation Reports (USDA monthly).

17.10

Copyright 1996 Lawrence C. Marsh

State and Local Data

State and Metropolitan Area Data Book (Commerce and BC, annual).

CPI Detailed Report (BLS, annual).

Census of Population and Housing (Commerce, BC, annual).

County and City Data Book (Commerce, BC, annual).

17.11

Copyright 1996 Lawrence C. Marsh

Citibase on CD-ROM

• Financial series: interest rates, stock market, etc.

• Business formation, investment and consumers.

• Construction of housing.

• Manufacturing, business cycles, foreign trade.

• Prices: producer and consumer price indexes.

• Industrial production.

• Capacity and productivity.

• Population.

17.12

Copyright 1996 Lawrence C. Marsh

Citibase on CD-ROM(continued)

• Labor statistics: unemployment, households.

• National income and product accounts in detail.

• Forecasts and projections.

• Business cycle indicators.

• Energy consumption, petroleum production, etc.

• International data series including trade statistics.

17.13

Copyright 1996 Lawrence C. Marsh

Resources for Economists

Resources for Economists by Bill Goffe

http://econwpa.wustl.edu/EconFAQ/EconFAQ.html

Bill Goffe provides a vast database of information about the economics profession including economic organizations, working papers and reports, and economic data series.

17.14

Copyright 1996 Lawrence C. Marsh

Internet Data Sources

• Shortcut to All Resources.

• Macro and Regional Data.

• Other U.S. Data.

• World and Non-U.S. Data.

• Finance and Financial Markets.

• Data Archives.

• Journal Data and Program Archives.

A few of the items on Bill Goffe’s Table of Contents:

17.15

Copyright 1996 Lawrence C. Marsh

Useful Internet Addresses

http://seamonkey.ed.asu.edu/~behrens/teach/WWW_data.html

http://www.sims.berkeley.edu/~hal/pages/interesting.html

http://www.stls.frb.org FED RESERVE BK - ST. LOUIS

http://www.bls.gov BUREAU OF LABOR STATISTICS

http://nber.harvard.edu NAT’L BUR. ECON. RESEARCH

http://www.inform.umd.edu:8080/EdRes/Topic/EconData/.www/econdata.html UNIVERSITY OF MARYLAND

http://www.bog.frb.fed.us FED BOARD OF GOVERNORS

http://www.webcom.com/~yardeni/economic.html

17.16

Copyright 1996 Lawrence C. Marsh

Data from Surveys

i) identifying the population of interest.

ii) designing and selecting the sample.

iii) collecting the information.

iv) data reduction, estimation and inference.

The survey process has four distinct aspects:

17.17

Copyright 1996 Lawrence C. Marsh

Controlled Experiments

1. Labor force participation: negative income tax: guaranteed minimum income experiment.

2. National cash housing allowance experiment: impact on demand and supply of housing.

3. Health insurance: medical cost reduction: sensitivity of income groups to price change.

4. Peak-load pricing and electricity use: daily use pattern of residential customers.

Controlled experiments were done on these topics:

17.18

Copyright 1996 Lawrence C. Marsh

Economic Data Problems

I. poor implicit experimental design

(i) collinear explanatory variables.

(ii) measurement errors.

II. data inconsistent with the theoretical specification

(i) wrong level of aggregation.

(ii) missing observations or variables.

(iii) unobserved heterogeneity.

17.19

Copyright 1996 Lawrence C. Marsh

Selecting a Topic

• “What am I interested in?”

• Well-defined, relatively simple topic.

• Ask prof for ideas and references.

• Journal of Economic Literature (ECONLIT)

• Make sure appropriate data are available.

• Avoid extremely difficult econometrics.

• Plan your work and work your plan.

General tips for selecting a research topic:


17.20

Copyright 1996 Lawrence C. Marsh

Writing an Abstract

(i) concise statement of the problem.

(ii) key references to available information.

(iii) description of the research design, including: (a) economic model, (b) statistical model, (c) data sources, (d) estimation, testing and prediction

(iv) contribution of the work

An abstract of less than 500 words should include:

17.21

Copyright 1996 Lawrence C. Marsh

Research Report Format

1. Statement of the Problem.
2. Review of the Literature.
3. The Economic Model.
4. The Statistical Model.
5. The Data.
6. Estimation and Inference Procedures.
7. Empirical Results and Conclusions.
8. Possible Extensions and Limitations.
9. Acknowledgments.
10. References.

17.22
