Tests of Static Asset Pricing Models

• In general, asset pricing models quantify the tradeoff between risk and expected return.

– Need to both measure risk and relate it to the expected return on a risky asset.

• The most commonly used models are:

– CAPM

– APT

– FF three-factor model

Testable Implications

• These models have testable implications. For the CAPM, for example:

– Expected excess return of a risky asset is proportional to the covariance of its return and that of the market portfolio.

• Note, this tells us the measure of risk used and its relation to expected return.

– There are other restrictions that depend upon whether there exists a riskless asset.

Testable Implications

• For the APT,

– The expected excess return on a risky asset is linearly related to the covariance of its return with various risk factors.

– These risk factors are left unspecified by the theory and have been:

• Derived from the data (CR (1983), CK)

• Exogenously imposed (CRR (1985))

Plan

• Review the basic econometric methodology we will use to test these models.

• Review the CAPM.

• Test the CAPM.

– Traditional tests (FM (1972), BJS (1972), Ferson and Harvey)

– ML tests (Gibbons (1982), GRS (1989))

– GMM tests

• Factor models: APT and FF

– Curve fitting vs. ad-hoc theorizing

Econometric Methodology Review

• Maximum Likelihood Estimation

• The Wald Test

• The F Test

• The LM Test

• A specialization to linear models and linear restrictions

– A comparison of test statistics

Review of Maximum Likelihood Estimation

• Let $\{x_1, \dots, x_T\}$ be a sample of T i.i.d. random variables.

– Call that vector x.

– Let x be continuously distributed with density $f(x|\theta)$.

– Where $\theta$ is the unknown parameter vector that determines the distribution.

The Likelihood Function

• The joint density for the independent random variables is given by:

$$f(x_1|\theta)\, f(x_2|\theta)\, f(x_3|\theta) \cdots f(x_T|\theta)$$

• This joint density is known as the likelihood function, $L(x|\theta)$:

$$L(x|\theta) = f(x_1|\theta)\, f(x_2|\theta)\, f(x_3|\theta) \cdots f(x_T|\theta)$$

• Can you write the joint density and $L(x|\theta)$ this way when dealing with time-dependent observations?

Independence

• You can’t.

– The reason you can write the product

$$f(x_1|\theta)\, f(x_2|\theta)\, f(x_3|\theta) \cdots f(x_T|\theta)$$

is because of the independence.

• If you have dependence, writing the joint density can be extremely complicated.

• See, e.g. Hamilton (1994) for a good discussion of switching regression models and the EM algorithm.
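For reference, a dependent sample can still be factored, but into conditional rather than marginal densities. The expression below is the standard chain-rule factorization, offered here as a supplementary sketch rather than as part of the original slide:

```latex
f(x_1, \dots, x_T \mid \theta)
  = f(x_1 \mid \theta) \prod_{t=2}^{T} f(x_t \mid x_{t-1}, \dots, x_1; \theta)
```

Under independence each conditional density collapses to $f(x_t|\theta)$, which recovers the simple product form.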

Idea Behind Maximum Likelihood Estimation

• Pick the parameter vector estimate, $\hat\theta$, that maximizes the likelihood, $L(x|\theta)$, of observing the particular vector of realizations, x.

MLE Plusses and Minuses

• Plusses: Efficient estimation in terms of picking the estimator with the smallest covariance matrix.

– Question: are ML estimators necessarily unbiased?

• Minuses: Strong distributional assumptions make robustness a problem.
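On the unbiasedness question raised above, a small simulation can be suggestive. The sketch below assumes an i.i.d. normal sample and looks at the ML variance estimator, which divides by T rather than T - 1; the sample size, seed, and replication count are illustrative choices, not values from the lecture.

```python
import random

def ml_variance(sample):
    """ML estimate of the variance of an i.i.d. normal sample (divides by T)."""
    T = len(sample)
    mean = sum(sample) / T
    return sum((x - mean) ** 2 for x in sample) / T

random.seed(0)      # fixed seed so the run is reproducible
T = 5               # deliberately small sample (assumed for illustration)
true_var = 1.0
n_reps = 20000

estimates = [
    ml_variance([random.gauss(0.0, true_var ** 0.5) for _ in range(T)])
    for _ in range(n_reps)
]
avg_estimate = sum(estimates) / n_reps

# Theory: E[sigma2_hat] = (T - 1) / T * sigma2, which is 0.8 here, not 1.0.
print(f"average ML variance estimate: {avg_estimate:.3f}")
print(f"theoretical expectation:      {(T - 1) / T * true_var:.3f}")
```

The average sits near (T - 1)/T times the true variance, so this ML estimator is biased in small samples even though the bias vanishes as T grows.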

MLE Example: Normal Distributions where OLS assumptions are satisfied

• Sample y of size T is normally distributed with mean $X\beta$, where

– X is a T x K matrix of explanatory variables

– $\beta$ is a K x 1 vector of parameters

– The variance-covariance matrix of the errors from the true regression is $\sigma^2 I$, where

– I is a T x T identity matrix

The Likelihood Function

• The likelihood function for the linear model with independent normally distributed errors is:

$$L = (2\pi\sigma^2)^{-T/2} \exp\left[-\frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta)\right]$$

The Log-Likelihood Function

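• With independent draws, it is easier to maximize the log-likelihood function, because products are replaced by sums. The log-likelihood is given by:

$$\ln(L) = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta)$$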

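First-order Conditions:

$$\frac{\partial \ln(L)}{\partial \beta} = \frac{1}{\sigma^2} X'(y - X\beta) = 0$$

$$\frac{\partial \ln(L)}{\partial \sigma^2} = -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4}(y - X\beta)'(y - X\beta) = 0$$

First-order Conditions: 2

• Solving these two conditions gives the ML estimators:

$$\hat\beta = (X'X)^{-1}X'y, \qquad \hat\sigma^2 = \frac{1}{T}(y - X\hat\beta)'(y - X\hat\beta)$$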
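The first-order conditions for the normal linear model can be checked numerically. The sketch below assumes simulated data (the coefficients, noise scale, and seed are hypothetical) and uses NumPy; it computes the closed-form ML estimates and evaluates the score at them.

```python
import numpy as np

def normal_loglik(beta, sigma2, y, X):
    """Log-likelihood of y ~ N(X beta, sigma2 * I)."""
    T = y.shape[0]
    resid = y - X @ beta
    return (-0.5 * T * np.log(2 * np.pi)
            - 0.5 * T * np.log(sigma2)
            - resid @ resid / (2 * sigma2))

def normal_mle(y, X):
    """Closed-form ML estimates: OLS for beta, SSR / T for sigma^2."""
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / y.shape[0]
    return beta_hat, sigma2_hat

# Simulated data; all numbers here are illustrative assumptions.
rng = np.random.default_rng(0)
T = 200
X = np.column_stack([np.ones(T), rng.normal(size=T)])
y = X @ np.array([0.5, 1.2]) + rng.normal(scale=0.3, size=T)

beta_hat, sigma2_hat = normal_mle(y, X)
resid = y - X @ beta_hat

# Score (first derivatives of the log-likelihood) at the estimates.
score_beta = X.T @ resid / sigma2_hat
score_sigma2 = -T / (2 * sigma2_hat) + resid @ resid / (2 * sigma2_hat ** 2)

print("beta_hat:", beta_hat, "sigma2_hat:", sigma2_hat)
print("score at the MLE:", score_beta, score_sigma2)
```

Both pieces of the score are zero up to rounding error at the estimates, and perturbing `beta_hat` lowers `normal_loglik`.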

The Information Matrix

• If $\theta$ is our parameter vector,

– $I(\theta)$ is the information matrix,

– which is minus the expectation of the matrix of second partial derivatives of the log-likelihood with respect to the parameters.

The Information Matrix – Cont…

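• The MLE achieves the Cramér-Rao lower bound (asymptotically), which means that the variance of the estimators equals the inverse of the information matrix:

$$\mathrm{Var}[\hat\beta, \hat\sigma^2] = I^{-1}(\beta, \sigma^2).$$

• Now, with $\varepsilon = y - X\beta$, the matrix of second derivatives of the log-likelihood is

$$\begin{bmatrix} -\dfrac{X'X}{\sigma^2} & -\dfrac{X'\varepsilon}{\sigma^4} \\[2ex] -\dfrac{\varepsilon'X}{\sigma^4} & \dfrac{T}{2\sigma^4} - \dfrac{\varepsilon'\varepsilon}{\sigma^6} \end{bmatrix}$$

• Note that the off-diagonal elements are zero in expectation, because $E[X'\varepsilon] = 0$.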

The Information Matrix – Cont…

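• The negative of the expectation is:

$$I(\beta, \sigma^2) = \begin{bmatrix} \dfrac{X'X}{\sigma^2} & 0 \\[2ex] 0' & \dfrac{T}{2\sigma^4} \end{bmatrix}$$

• The inverse of this is:

$$I^{-1}(\beta, \sigma^2) = \begin{bmatrix} \sigma^2 (X'X)^{-1} & 0 \\[1ex] 0' & \dfrac{2\sigma^4}{T} \end{bmatrix}$$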

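Another Way of Writing $I(\beta, \sigma^2)$

• For a vector, $\theta$, of parameters, $I(\theta)$, the information matrix, can be written in a second way:

$$I(\theta) = E\left[\left(\frac{\partial \ln L}{\partial\theta}\right)\left(\frac{\partial \ln L}{\partial\theta}\right)'\right]$$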

• This second form is more convenient for estimation, because it does not require estimating second derivatives.
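As a rough illustration of the two forms, the sketch below estimates the $\beta$ block of the information matrix for the normal linear model both ways. The data are simulated, and the sample analog of the second form (a sum of outer products of per-observation scores) is an assumed estimator choice rather than something specified in the slides.

```python
import numpy as np

# Simulated data; coefficients and seed are hypothetical.
rng = np.random.default_rng(1)
T = 500
X = np.column_stack([np.ones(T), rng.normal(size=T)])
y = X @ np.array([0.2, 0.9]) + rng.normal(size=T)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / T          # ML variance estimate

# Form 1: minus the second derivatives with respect to beta, X'X / sigma^2.
info_hessian = X.T @ X / sigma2_hat

# Form 2: sum over t of the outer product of the score x_t * e_t / sigma^2.
scores = X * (resid / sigma2_hat)[:, None]   # T x K matrix of scores
info_opg = scores.T @ scores

print("second-derivative form:\n", info_hessian)
print("outer-product form:\n", info_opg)
```

The two matrices agree only approximately in a finite sample. The second form needs first derivatives alone, which is the convenience the slide points to.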

Estimation

• The Likelihood Ratio Test

– Let $\theta$ be a vector of parameters to be estimated.

– Let H0 be a set of restrictions on these parameters.

– These restrictions could be linear or non-linear.

– Let $\hat\theta_U$ be the MLE of $\theta$ estimated without regard to constraints (the unrestricted model).

– Let $\hat\theta_R$ be the constrained MLE.

The Likelihood Ratio Test Statistic

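• If $\hat L_U(\hat\theta_U)$ and $\hat L_R(\hat\theta_R)$ are the likelihood functions evaluated at these two estimates, the likelihood ratio is given by:

$$\lambda = \frac{\hat L_R(\hat\theta_R)}{\hat L_U(\hat\theta_U)}$$

• Then, under H0 in large samples, $-2\ln(\lambda) = -2[\ln \hat L_R(\hat\theta_R) - \ln \hat L_U(\hat\theta_U)] \sim \chi^2$ with degrees of freedom equal to the number of restrictions imposed.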

Another Look at the LR Test

• Concentrated Log-Likelihood: Many problems can be formulated in terms of partitioning a parameter vector, $\theta$, into $\{\theta_1, \theta_2\}$ such that the solution to the optimization problem, $\hat\theta_2$, can be written as a function of $\hat\theta_1$, e.g.:

$$\hat\theta_2 = t(\hat\theta_1).$$

• Then, we can concentrate the log-likelihood function as: $F^*(\theta_1, \theta_2) = F(\theta_1, t(\theta_1)) \equiv F_c(\theta_1)$.

Why Do This?

• The unrestricted solution to

$$\max_{\theta_1} F_c(\theta_1)$$

• then provides the full solution to the optimization problem, since t is known.

• We now use this technique to find estimates for the classical linear regression model.

Example

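• The log-likelihood function (from CLM) with normal disturbances is given by:

$$\ln(L) = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta)$$

• The solution to the likelihood equation for $\sigma^2$ implies that however we estimate $\beta$, the estimator for $\sigma^2$ will be:

$$\hat\sigma^2 = \frac{1}{T}(y - X\hat\beta)'(y - X\hat\beta)$$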

Ex: Concentrating the Likelihood Function

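• Inserting this back into the log-likelihood yields:

$$\ln(L_c) = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln\left[\frac{1}{T}(y - X\beta)'(y - X\beta)\right] - \frac{T}{2}$$

• Because $(y - X\beta)'(y - X\beta)$ is just the sum of squared residuals from the regression ($e'e$), we can rewrite $\ln(L_c)$ as:

$$\ln(L_c) = -\frac{T}{2}\left[1 + \ln(2\pi) + \ln\left(\frac{1}{T}e'e\right)\right]$$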

Ex: Concentrating the Likelihood Function

• For the restricted model we obtain the restricted concentrated log-likelihood:

$$\ln(L_{cR}) = -\frac{T}{2}\left[1 + \ln(2\pi) + \ln\left(\frac{1}{T}e_R'e_R\right)\right]$$

• So, plugging these concentrated log-likelihoods into our definition of the LR test, we obtain:

$$LR = T\ln\left(\frac{e_R'e_R}{e'e}\right)$$

• Or, T times the log of the ratio of the restricted SSR and the unrestricted SSR, a nice intuition.
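The SSR form of the LR statistic is easy to compute directly. The sketch below is a minimal illustration under the normal linear model; the function names and the numerical inputs are hypothetical. It checks that the SSR form matches minus twice the difference of the concentrated log-likelihoods.

```python
import math

def concentrated_loglik(ssr, T):
    """ln(L_c) = -(T/2) * [1 + ln(2*pi) + ln(SSR / T)]."""
    return -0.5 * T * (1.0 + math.log(2.0 * math.pi) + math.log(ssr / T))

def lr_from_ssr(ssr_restricted, ssr_unrestricted, T):
    """LR = T * ln(e_R'e_R / e'e)."""
    if ssr_unrestricted <= 0 or ssr_restricted < ssr_unrestricted:
        raise ValueError("need 0 < unrestricted SSR <= restricted SSR")
    return T * math.log(ssr_restricted / ssr_unrestricted)

# Hypothetical inputs, chosen only to illustrate the calculation.
T = 120
ssr_u, ssr_r = 48.0, 50.4

lr = lr_from_ssr(ssr_r, ssr_u, T)
lr_check = -2.0 * (concentrated_loglik(ssr_r, T) - concentrated_loglik(ssr_u, T))

print(f"LR from SSR ratio:             {lr:.4f}")
print(f"LR from concentrated log-liks: {lr_check:.4f}")
```

The constants 1 and ln(2π) cancel in the difference, which is why only the SSR ratio survives.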

Ex: OLS with Normal Errors

• True regression model:

$$y_t = \alpha + \beta x_t + \varepsilon_t$$

• The $\varepsilon_t$ are iid normal.

• Sample size is T.

• Restriction: $\beta = 1$.

Example – Cont…

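• The first-order conditions for the estimates $\hat\alpha$ and $\hat\beta$ simply reduce to the OLS normal equations:

$$\sum_{t=1}^{T}(y_t - \hat\alpha - \hat\beta x_t) = 0, \qquad \sum_{t=1}^{T}(y_t - \hat\alpha - \hat\beta x_t)\,x_t = 0$$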

Example – Cont…

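• Solving for $\hat\alpha$:

$$\hat\alpha = \bar y - \hat\beta \bar x$$

• Substituting into the FOC for $\hat\beta$ yields:

$$\hat\beta = \frac{\sum_{t=1}^{T}(x_t - \bar x)(y_t - \bar y)}{\sum_{t=1}^{T}(x_t - \bar x)^2}$$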

Example – Cont…

• Solve for $\hat\sigma^2$ as before:

$$\hat\sigma^2 = \frac{1}{T}\sum_{t=1}^{T}(y_t - \hat\alpha - \hat\beta x_t)^2$$

Example – Cont…

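• The restricted model is exactly the same, except that $\beta$ is constrained to be one, so that the normal equation reduces to:

$$\sum_{t=1}^{T}(y_t - \hat\alpha_R - x_t) = 0$$

and

$$\hat\alpha_R = \bar y - \bar x$$

• One can then plug in to obtain $\hat\sigma_R^2$ and form the likelihood ratio, which is asymptotically distributed $\chi^2(1)$.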
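The whole example can be traced in a few lines. The sketch below assumes the restriction is that the slope equals one and uses a tiny made-up data set; the function name and the data are hypothetical. With only eight observations the asymptotic $\chi^2(1)$ comparison is purely illustrative.

```python
import math

def lr_test_unit_slope(x, y):
    """LR test of H0: beta = 1 in y_t = alpha + beta * x_t + eps_t (normal errors)."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T

    # Unrestricted ML (= OLS) estimates.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    sigma2_u = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y)) / T

    # Restricted estimates with the slope fixed at one.
    alpha_r = y_bar - x_bar
    sigma2_r = sum((yi - alpha_r - xi) ** 2 for xi, yi in zip(x, y)) / T

    # LR = T * ln(restricted SSR / unrestricted SSR); the 1/T factors cancel.
    lr = T * math.log(sigma2_r / sigma2_u)
    return {"alpha_hat": alpha_hat, "beta_hat": beta_hat, "sigma2_u": sigma2_u,
            "alpha_r": alpha_r, "sigma2_r": sigma2_r, "lr": lr}

# Made-up data for illustration only.
x = [0.5, -1.2, 2.0, 0.3, -0.7, 1.5, -0.4, 0.9]
y = [0.9, -1.0, 2.6, 0.1, -1.1, 2.2, -0.2, 1.0]

res = lr_test_unit_slope(x, y)
CHI2_1_CRIT_5PCT = 3.841   # 5% critical value of a chi-square with 1 d.f.
print(res)
print("reject H0 at 5%:", res["lr"] > CHI2_1_CRIT_5PCT)
```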

The Wald Test

• The problem with LR test: Need both restricted and unrestricted model estimates.

• One or the other could be hard to compute.

• The Wald test is an alternative that requires estimating the unrestricted model only.

• Suppose $y \sim N(X\beta, \Sigma)$, with a sample size of T, then:

$$(y - X\beta)'\Sigma^{-1}(y - X\beta) \sim \chi^2_T$$

The Wald Test – Cont…

• Under the null hypothesis that $E(y) = X\beta$, the quadratic form above has a $\chi^2$ distribution. If the hypothesis is false, the quadratic form will be larger, on average, than it would be if the null were true.

• In particular, it will be a non-central $\chi^2$ with the same degrees of freedom, which looks like a central $\chi^2$, but lies to the right.

• This is the basis for the test.

The Restricted Model

• Now, step back from the normal and let $\hat\theta$ be the parameter estimates from the unrestricted model.

• Let restrictions be given by

H0: $f(\theta) = 0$.

• If the restrictions are valid, then $\hat\theta$ should satisfy them.

• If not, $f(\hat\theta)$ should be farther from zero than would be explained by sampling error alone.

Formalism

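• The Wald statistic is

$$W = f(\hat\theta)'\left(\mathrm{Var}[f(\hat\theta)]\right)^{-1} f(\hat\theta)$$

• Under H0 in large samples, $W \sim \chi^2$ with d.f. equal to the number of restrictions. See Greene ch.9 for details.

• Lastly, to use the Wald test, we need to compute the variance term:

$$\mathrm{Var}[f(\hat\theta)] = G\,\mathrm{Var}[\hat\theta]\,G', \qquad G = \frac{\partial f(\hat\theta)}{\partial\hat\theta'}$$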

Restrictions on Slope Coefficients

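• If the restrictions are on slope coefficients of a linear regression, then:

$$\mathrm{Var}[\hat\theta] = \mathrm{Var}[\hat\beta] = s^2(X'X)^{-1}$$

where

$$s^2 = \frac{e'e}{T-K} = \frac{T}{T-K}\hat\sigma^2$$

and K is the number of regressors.

• Then, we can write the Wald Statistic:

$$W = f(\hat\beta)'\left(G[s^2(X'X)^{-1}]G'\right)^{-1} f(\hat\beta) \sim \chi^2[J]$$

where J is the number of restrictions and $G = \partial f(\hat\beta)/\partial\hat\beta'$ is the J x K matrix of derivatives of the restrictions.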
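A non-linear restriction shows where the derivative matrix G enters. The sketch below assumes simulated data and a hypothetical restriction, $\beta_1\beta_2 - 1 = 0$; the function names are illustrative and not part of the lecture.

```python
import numpy as np

def wald_statistic(f, jacobian, beta_hat, X, resid):
    """Wald statistic for H0: f(beta) = 0 in a linear regression.

    Uses Var[beta_hat] = s^2 (X'X)^{-1} and Var[f(beta_hat)] = G Var[beta_hat] G'.
    """
    T, K = X.shape
    s2 = resid @ resid / (T - K)
    var_beta = s2 * np.linalg.inv(X.T @ X)
    f_val = np.atleast_1d(f(beta_hat))          # J-vector of restriction values
    G = np.atleast_2d(jacobian(beta_hat))       # J x K matrix of derivatives
    var_f = G @ var_beta @ G.T
    W = float(f_val @ np.linalg.solve(var_f, f_val))
    return W, f_val.size

def restriction(b):
    """Hypothetical restriction: beta_1 * beta_2 - 1 = 0."""
    return b[1] * b[2] - 1.0

def restriction_jacobian(b):
    """Derivatives of the restriction with respect to (beta_0, beta_1, beta_2)."""
    return np.array([0.0, b[2], b[1]])

# Simulated data in which the restriction holds (2.0 * 0.5 = 1).
rng = np.random.default_rng(7)
T = 300
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
y = X @ np.array([0.1, 2.0, 0.5]) + rng.normal(size=T)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat

W, J = wald_statistic(restriction, restriction_jacobian, beta_hat, X, resid)
print(f"W = {W:.3f} with {J} restriction(s)")
```

Only the unrestricted estimates are needed, which is the practical appeal of the Wald test noted earlier.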

Linear Restrictions

H0: $R\beta - q = 0$

• For example, suppose there were three betas, $\beta_1$, $\beta_2$, and $\beta_3$. Let’s look at three tests.

(1) $\beta_1 = 0$,

(2) $\beta_1 = \beta_2$,

(3) $\beta_1 = 0$ and $\beta_2 = 2$.

• Each row of R is a single linear restriction on the coefficient vector.

Writing R

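• Case 1: $R = [\,1 \;\; 0 \;\; 0\,]$, $q = 0$

• Case 2: $R = [\,1 \;\; -1 \;\; 0\,]$, $q = 0$

• Case 3: $R = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$, $q = \begin{bmatrix} 0 \\ 2 \end{bmatrix}$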
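The three cases can be written out and checked against a coefficient vector. In the sketch below the vector `beta` is a made-up example, chosen so that cases 1 and 3 hold and case 2 fails.

```python
import numpy as np

# Each row of R is one linear restriction; q holds the right-hand sides.
cases = {
    "beta1 = 0": (np.array([[1.0, 0.0, 0.0]]), np.array([0.0])),
    "beta1 = beta2": (np.array([[1.0, -1.0, 0.0]]), np.array([0.0])),
    "beta1 = 0 and beta2 = 2": (
        np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]]),
        np.array([0.0, 2.0]),
    ),
}

beta = np.array([0.0, 2.0, 5.0])   # hypothetical coefficient vector

discrepancies = {}
for label, (R, q) in cases.items():
    discrepancies[label] = R @ beta - q     # zero vector when the restriction holds
    print(f"{label}: R beta - q = {discrepancies[label]}")
```

In a test, `beta` is replaced by the unrestricted estimate and the discrepancy is scaled by its estimated variance.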

The Wald Statistic

• In general, the Wald statistic with J linear restrictions reduces to:

$$W = [R\hat\beta - q]'[s^2 R(X'X)^{-1}R']^{-1}[R\hat\beta - q]$$

with J d.f.

• We will use these tests extensively in our discussion of Chapters 5 and 6 of CLM.

The F Test

• A related way to test the validity of the J restrictions

$R\beta - q = 0$

• Recall that the F test can be written in terms of a comparison of the sum of squared residuals for the restricted and unrestricted models:

$$F(J, T-K) = \frac{(e_R'e_R - e'e)/J}{e'e/(T-K)}$$

• or

$$F(J, T-K) = \frac{[R\hat\beta - q]'[s^2 R(X'X)^{-1}R']^{-1}[R\hat\beta - q]}{J}$$
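The two expressions for F can be compared numerically. The sketch below assumes simulated data and a hypothetical pair of restrictions; the restricted estimates come from the standard restricted least squares formula, which is an addition here and not shown in the slides.

```python
import numpy as np

# Simulated data; coefficients, seed, and restrictions are hypothetical.
rng = np.random.default_rng(42)
T, K = 100, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
y = X @ np.array([0.1, 0.8, 1.1]) + rng.normal(size=T)

# H0: both slope coefficients equal one.
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
q = np.array([1.0, 1.0])
J = R.shape[0]

# Unrestricted OLS.
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = e @ e / (T - K)

# Quadratic-form version: Wald statistic divided by J.
d = R @ b - q
middle = R @ XtX_inv @ R.T
F_quadratic = d @ np.linalg.solve(s2 * middle, d) / J

# SSR version, using restricted least squares for the restricted residuals.
b_r = b - XtX_inv @ R.T @ np.linalg.solve(middle, d)
e_r = y - X @ b_r
F_ssr = ((e_r @ e_r - e @ e) / J) / (e @ e / (T - K))

print(f"F (quadratic form): {F_quadratic:.6f}")
print(f"F (SSR comparison): {F_ssr:.6f}")
```

The two numbers coincide up to rounding, so in the linear model the F statistic is the Wald statistic (computed with $s^2$) divided by J.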

Why Do We Care?

• We care because in a linear model with normally distributed disturbances, the F statistic derived above has an exact finite-sample distribution under the null.

– This will be important later because under normality, some of our cross-sectional CAPM tests will be of this form and,

– A sufficient condition for the (static) CAPM to be “correct” is for asset returns to be normally distributed.

The LM Test

• This is a test that involves computing only the restricted estimator.

– If the hypothesis is valid, at the value of the restricted estimator, the derivative of the log-likelihood function should be close to zero.

– We will next form the LM test with the J restrictions $f(\beta) = 0$.

The LM Test – Cont…

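$$\ln(L_{LM}) = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2}\left[(y - X\beta)'(y - X\beta)\right] + \lambda' f(\beta)$$

This is maximized by choice of $\beta$, $\lambda$, and $\sigma^2$.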

First-order Conditions

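• Differentiating the Lagrangian (the log-likelihood plus $\lambda' f(\beta)$) gives:

$$\frac{\partial \ln(L_{LM})}{\partial\beta} = \frac{1}{\sigma^2}X'(y - X\beta) + G'\lambda = 0, \qquad \frac{\partial \ln(L_{LM})}{\partial\sigma^2} = -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4}(y - X\beta)'(y - X\beta) = 0$$

• and

$$\frac{\partial \ln(L_{LM})}{\partial\lambda} = f(\beta) = 0,$$

where $G = \partial f(\beta)/\partial\beta'$.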

The LM Test – Cont…

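• The test, then, is whether the Lagrange multipliers equal zero. When the restrictions are linear, the test statistic becomes (see Greene, chapter 7):

$$LM = [R\hat\beta - q]'[s_R^2 R(X'X)^{-1}R']^{-1}[R\hat\beta - q] \sim \chi^2[J]$$

where J is the number of restrictions and $s_R^2$ is the variance estimate from the restricted model.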

W, LR, LM, and F

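• We compare them for J linear restrictions in the linear model with K regressors. It can be shown (with W computed using the ML variance estimate $\hat\sigma^2 = e'e/T$ in place of $s^2$) that:

– $W = \dfrac{T}{T-K}\,JF,$

– $LR = T\ln\left[1 + \dfrac{1}{T-K}\,JF\right],$

– $LM = \dfrac{T}{(T-K)\left[1 + (1/(T-K))\,JF\right]}\,JF,$

– and that W ≥ LR ≥ LM, with equality only when F = 0.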
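The relations among the four statistics can be verified from the two sums of squared residuals alone. The sketch below assumes that W and LM are both computed with ML variance estimates (SSR divided by T, unrestricted for W and restricted for LM); the function name and the input numbers are hypothetical.

```python
import math

def compare_statistics(ssr_r, ssr_u, T, K, J):
    """F, W, LR, and LM for J linear restrictions, computed from the SSRs.

    Assumption: W uses e'e / T and LM uses e_R'e_R / T as variance estimates.
    """
    F = ((ssr_r - ssr_u) / J) / (ssr_u / (T - K))
    W = T * (ssr_r - ssr_u) / ssr_u
    LR = T * math.log(ssr_r / ssr_u)
    LM = T * (ssr_r - ssr_u) / ssr_r
    return F, W, LR, LM

# Hypothetical inputs for illustration.
T, K, J = 60, 4, 2
ssr_u, ssr_r = 30.0, 36.0

F, W, LR, LM = compare_statistics(ssr_r, ssr_u, T, K, J)

# The same three statistics rebuilt from F using the relations on the slide.
a = J * F / (T - K)
W_from_F = T / (T - K) * J * F
LR_from_F = T * math.log(1.0 + a)
LM_from_F = T / ((T - K) * (1.0 + a)) * J * F

print(f"F = {F:.4f}")
print(f"W = {W:.4f}, LR = {LR:.4f}, LM = {LM:.4f}")
```

The ordering follows from the inequalities a ≥ ln(1 + a) ≥ a/(1 + a) for a ≥ 0, where a = JF/(T - K).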
