ECONOMETRIC MODELS

The concept of Data Generating Process (DGP)

and its relationships with the analysis of

specification.

Luca Fanelli

University of Bologna

[email protected]

The concept of Data Generating Process (DGP)

A convenient way to understand the concept of a DGP is to imagine performing a simulation experiment, i.e. a Monte Carlo experiment.

Exercise: Simulate data from a (scalar) stationary

AR(1) model.

First, set the model:

$$y_t = \beta_0 + \beta_1 y_{t-1} + u_t, \qquad u_t \sim \mathrm{WN}(0, \sigma_u^2), \qquad t = 1, 2, \ldots, T, \qquad y_0 = 0.$$

Second, set the parameter values:

$$\beta_0 = 0.5, \qquad \beta_1 = 0.7, \qquad \sigma_u^2 = 0.5.$$

Third, specify a stochastic distribution for the variables:

$$u_t \sim \mathrm{WN\ Gaussian}(0, \sigma_u^2) \equiv \mathrm{WNN}(0, \sigma_u^2).$$

We know exactly the stochastic distribution of $y_t$ because we are using it to generate the data!

We know exactly the mechanism through which the sequence of $T$ observations (given $y_0$), $y_1, y_2, \ldots, y_T$, is generated.
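
The three steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `simulate_ar1`, its arguments and the seed are hypothetical choices made here, not part of the original exercise. Note that `random.gauss` takes a standard deviation, so the variance $\sigma_u^2 = 0.5$ is converted with a square root.

```python
import math
import random


def simulate_ar1(T, beta0=0.5, beta1=0.7, sigma2_u=0.5, y0=0.0, seed=None):
    """Return [y_0, y_1, ..., y_T] from y_t = beta0 + beta1*y_{t-1} + u_t."""
    rng = random.Random(seed)
    sigma_u = math.sqrt(sigma2_u)  # gauss() expects a standard deviation
    y = [y0]
    for _ in range(T):
        u_t = rng.gauss(0.0, sigma_u)  # Gaussian white noise draw
        y.append(beta0 + beta1 * y[-1] + u_t)
    return y


# One possible realization of T = 300 observations (the seed is arbitrary).
y = simulate_ar1(T=300, seed=12345)
print(len(y) - 1, "observations generated after y_0")
```

Changing the seed gives a different realization of the same DGP; the mechanism that generates the data stays fixed.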

[Figure: a possible realization of T = 300 observations from the DGP y(t) = 0.5 + 0.7*y(t-1) + u(t), with u(t) drawn from N(0, 0.5). Horizontal axis: t from 0 to 300; vertical axis: values from -1 to 4.]

DGP and statistical model

In the Monte Carlo experiment above we have simulated a simple economy.

In real cases, the investigator does not know how the sequence $y_0, y_1, y_2, \ldots, y_T$ has been generated. He/she specifies a statistical model which attempts to approximate the ‘true’ DGP (at least its salient features) as best as possible.

What is a statistical model?

$$\text{Statistical model} := \left\{ \begin{array}{l} \text{stochastic distribution} \\ +\ \text{sampling scheme} \end{array} \right.$$

The statistical model is called parametric when the only unknown quantities are the parameters that characterize the stochastic distribution.

Given a parametric statistical model, one can always write down the joint distribution of the observations (data) by using the sequential factorization:

$$f(y_1, y_2, \ldots, y_T \mid y_0; \delta_0, \delta_1, \sigma_u^2) := f(y_T \mid y_{T-1}, \ldots, y_0) \times f(y_{T-1} \mid y_{T-2}, \ldots, y_0) \times \cdots \times f(y_1 \mid y_0)$$

$$:= \prod_{t=1}^{T} f(y_t \mid y_0, \mathcal{F}_{t-1}), \qquad \mathcal{F}_{t-1} := \{y_{t-1}, \ldots, y_1\}.$$

The joint distribution that summarizes the two crucial ingredients of a statistical model is known as the likelihood function (apart from a constant).

Recall: when you write a likelihood function you have an underlying statistical model!

As an example, assume that the statistician/econometrician deems that, given $y_0$, the sequence $y_1, y_2, \ldots, y_T$ is generated by the following statistical (parametric) model:

$$y_t = \delta_0 + \delta_1 y_{t-1} + u_t, \qquad u_t \sim \mathrm{WNN}(0, \sigma_u^2), \qquad t = 1, \ldots, T,$$

whose unknown parameters are $\delta_0$, $\delta_1$, $\sigma_u^2$.
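
For this Gaussian AR(1) model the sequential factorization can be written out explicitly. The following is a sketch under the assumption that Gaussian white noise makes $u_t$ independent of the past, so that each conditional density depends on the past only through $y_{t-1}$:

$$f(y_t \mid y_{t-1}; \delta_0, \delta_1, \sigma_u^2) = \frac{1}{\sqrt{2\pi\sigma_u^2}} \exp\left\{ -\frac{(y_t - \delta_0 - \delta_1 y_{t-1})^2}{2\sigma_u^2} \right\},$$

$$\log L(\delta_0, \delta_1, \sigma_u^2) = -\frac{T}{2} \log(2\pi\sigma_u^2) - \frac{1}{2\sigma_u^2} \sum_{t=1}^{T} (y_t - \delta_0 - \delta_1 y_{t-1})^2.$$

The first term reflects the postulated distribution, the sum over $t = 1, \ldots, T$ (given $y_0$) reflects the sampling scheme.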

The unknown parameters $\delta_0$, $\delta_1$, $\sigma_u^2$ can be inferred from the data $y_0, y_1, y_2, \ldots, y_T$ by estimating the specified statistical model under the maintained assumption that the DGP belongs to the specified statistical model (i.e. under the postulated stochastic distribution and sampling scheme).

The likelihood function allows the statistician to recover ML estimates of the unknown parameters.
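
As an illustration, for the Gaussian AR(1) model conditional on $y_0$, maximizing the likelihood with respect to $(\delta_0, \delta_1)$ is equivalent to running OLS of $y_t$ on a constant and $y_{t-1}$, and the ML estimate of $\sigma_u^2$ is the residual sum of squares divided by $T$. The sketch below uses only the standard library; the names `simulate_ar1` and `ml_ar1` are hypothetical helpers introduced here.

```python
import math
import random


def simulate_ar1(T, beta0, beta1, sigma2_u, y0=0.0, seed=None):
    """Return [y_0, ..., y_T] from y_t = beta0 + beta1*y_{t-1} + u_t."""
    rng = random.Random(seed)
    sigma_u = math.sqrt(sigma2_u)
    y = [y0]
    for _ in range(T):
        y.append(beta0 + beta1 * y[-1] + rng.gauss(0.0, sigma_u))
    return y


def ml_ar1(y):
    """Conditional ML (= OLS) estimates of (delta0, delta1, sigma2_u)."""
    x = y[:-1]  # regressor y_{t-1}, t = 1, ..., T
    z = y[1:]   # dependent variable y_t
    T = len(z)
    x_bar = sum(x) / T
    z_bar = sum(z) / T
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxz = sum((a - x_bar) * (b - z_bar) for a, b in zip(x, z))
    delta1 = sxz / sxx
    delta0 = z_bar - delta1 * x_bar
    rss = sum((b - delta0 - delta1 * a) ** 2 for a, b in zip(x, z))
    sigma2 = rss / T  # the ML estimator divides by T, not by T - 2
    return delta0, delta1, sigma2


y = simulate_ar1(T=300, beta0=0.5, beta1=0.7, sigma2_u=0.5, seed=12345)
print(ml_ar1(y))
```

With data generated by the DGP of the exercise, the estimates should lie close to (0.5, 0.7, 0.5) when $T$ is large, because here the DGP belongs to the specified statistical model.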

In general, we like ML estimation because of its ‘nice’ properties when the model is correctly specified!

We say that a statistical model is correctly specified if it captures salient aspects of the DGP:

Extremely good case: the DGP belongs to the specified statistical model (it means that the DGP is obtained from the statistical model by fixing the unknown parameters to their ‘true’ values). In this case the statistician/econometrician recovers consistent estimates of the unknown parameters and, possibly, efficient ones, i.e. with minimum variance;

Reasonably good case: the estimation of the statistical model allows the statistician/econometrician to recover consistent estimates of the unknown parameters (difficult to say something about efficiency). ⇒ Correct inference on the unknown parameters.

In this case, the distribution specified in the statistical model and/or the sampling scheme may differ from the ‘true’ distribution and sampling scheme in the DGP, but the extent of such difference does not affect the possibility of estimating the parameters consistently.

Recall Section

Deterministic sequences

Let $\{h_T, T = 1, 2, \ldots\} \equiv \{h_T\}$ be a sequence of real numbers.

If the sequence has a limit, $h$, then this is denoted by

$$\lim_{T \to \infty} h_T = h.$$

This implies that for every $\varepsilon > 0$ there exists a positive, finite integer $T_\varepsilon$ such that

$$|h_T - h| < \varepsilon \quad \text{for } T > T_\varepsilon.$$

If $h_T$ is a $p \times 1$ vector, $\lim_{T \to \infty} h_T = h$ means that for every $\varepsilon > 0$ there exists a positive, finite integer $T_\varepsilon$ such that

$$\|h_T - h\|_2 < \varepsilon \quad \text{for } T > T_\varepsilon.$$

Note that $\|v\|_2 := (v'v)^{1/2}$ is the Euclidean norm of the vector $v$.

This can be interpreted as a measure of the length of $v$ in the space $\mathbb{R}^p$, i.e. a measure of the distance of the vector $v$ from the vector $0_{p \times 1}$.

One can generalize this measure by defining the norm

$$\|v\|_A := (v'Av)^{1/2},$$

where $A$ is a symmetric positive definite matrix; this norm measures the distance of $v$ from $0_{p \times 1}$ ‘weighted’ by the elements of the matrix $A$.
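
The two norms just defined can be computed directly from their definitions. The snippet below is a small sketch with hypothetical function names; the matrix $A$ is passed as a list of rows and is assumed (not checked) to be symmetric positive definite.

```python
import math


def euclidean_norm(v):
    """(v'v)^{1/2}: the length of v in R^p."""
    return math.sqrt(sum(x * x for x in v))


def weighted_norm(v, A):
    """(v'Av)^{1/2} for a symmetric positive definite matrix A (list of rows)."""
    p = len(v)
    quad = sum(v[i] * A[i][j] * v[j] for i in range(p) for j in range(p))
    return math.sqrt(quad)


v = [3.0, 4.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(euclidean_norm(v), weighted_norm(v, identity))
```

With $A$ equal to the identity matrix the weighted norm coincides with the Euclidean norm.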

Stochastic sequences

Henceforth $h_T$ will be considered a $p \times 1$ vector, except where stated otherwise.

Suppose now that each $h_T$ is a (continuous) random vector.

We are interested in the concepts of convergence in probability and convergence in distribution.

The sequence of random vectors $\{h_T, T = 1, 2, \ldots\}$ converges in probability to the non-stochastic vector $h$ if for all $\varepsilon > 0$:

$$\lim_{T \to \infty} P(\|h_T - h\|_2 < \varepsilon) = 1;$$

we conventionally write $h_T \to_p h$.

The concept of convergence in probability leads us to the concept of consistency of an estimator.

Consistency of an estimator

Let $\hat{\theta}_T$ be the estimator of the unknown parameter $\theta_0$ obtained from a sample of length $T$, and consider the sequence $\{\hat{\theta}_T, T = 1, 2, \ldots\}$ (hence random vectors); then $\hat{\theta}_T$ is said to be a consistent estimator of $\theta_0$ if $\hat{\theta}_T \to_p \theta_0$.

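Convergence in probability implies that, for any fixed $\varepsilon > 0$, the probability that $\hat{\theta}_T$ differs from $\theta_0$ by more than $\varepsilon$ goes to zero as $T \to \infty$.

In the limit $\hat{\theta}_T$ and $\theta_0$ are essentially identical.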
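
Consistency can be made tangible with a small Monte Carlo experiment on the AR(1) DGP of the exercise. The sketch below estimates $P(|\hat{\beta}_{1,T} - 0.7| < \varepsilon)$ by simulation for increasing $T$. The function names, the tolerance $\varepsilon = 0.1$, the number of replications and the seed are illustrative assumptions, not part of the original slides.

```python
import math
import random


def ols_slope_ar1(y):
    """OLS estimate of beta1 in y_t = beta0 + beta1*y_{t-1} + u_t."""
    x, z = y[:-1], y[1:]
    T = len(z)
    x_bar, z_bar = sum(x) / T, sum(z) / T
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxz = sum((a - x_bar) * (b - z_bar) for a, b in zip(x, z))
    return sxz / sxx


def prob_within_eps(T, eps=0.1, reps=200, seed=0):
    """Monte Carlo estimate of P(|beta1_hat_T - 0.7| < eps) under the DGP."""
    rng = random.Random(seed)
    sd = math.sqrt(0.5)  # sigma_u^2 = 0.5 in the DGP
    hits = 0
    for _ in range(reps):
        y = [0.0]
        for _ in range(T):
            y.append(0.5 + 0.7 * y[-1] + rng.gauss(0.0, sd))
        if abs(ols_slope_ar1(y) - 0.7) < eps:
            hits += 1
    return hits / reps


for T in (25, 100, 400):
    print(T, prob_within_eps(T))
```

The estimated probabilities should move towards one as $T$ grows, which is what $\hat{\beta}_{1,T} \to_p \beta_1$ requires for this particular $\varepsilon$.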

End of Recall Section

Example 1.

The DGP is as above and the statistician/econometrician specifies

$$y_t = \beta_0 + \beta_1 y_{t-1} + \beta_2 z_t + u_t, \qquad u_t \sim \mathrm{WNN}(0, \sigma_u^2),$$

where $z_t$ is iid and is irrelevant with respect to the DGP.

He/she can still get consistent estimates of $\beta_0$, $\beta_1$ and $\sigma_u^2$ based on the observations

$$y_0, y_1, y_2, \ldots, y_T, \qquad z_1, z_2, \ldots, z_T.$$
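
Example 1 can be checked numerically. The sketch below generates $y_t$ from the AR(1) DGP, draws an irrelevant regressor $z_t$ (assumed here to be iid $N(0,1)$, which the slides do not specify), and estimates the over-specified model by OLS. The helper names `ols` and `example1` are hypothetical; the OLS solver works on the normal equations with a small Gauss-Jordan elimination so that no external library is needed.

```python
import math
import random


def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def example1(T, seed=0):
    """Estimate y_t = b0 + b1*y_{t-1} + b2*z_t + u_t when z_t is irrelevant."""
    rng = random.Random(seed)
    sd = math.sqrt(0.5)
    y_prev, X, y = 0.0, [], []
    for _ in range(T):
        z_t = rng.gauss(0.0, 1.0)  # assumption: z_t iid N(0, 1)
        y_t = 0.5 + 0.7 * y_prev + rng.gauss(0.0, sd)
        X.append([1.0, y_prev, z_t])
        y.append(y_t)
        y_prev = y_t
    return ols(X, y)


print(example1(T=2000, seed=12345))
```

For large $T$ the first two coefficients should be close to 0.5 and 0.7, and the coefficient on $z_t$ close to zero: including an irrelevant variable does not destroy consistency.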

In turn, we say that a statistical model is not correctly specified, i.e. is misspecified, if it provides inconsistent estimates of the unknown parameters.

Example 2.

DGP as above but the statistician/econometrician specifies:

$$y_t = \beta_0 + u_t, \qquad u_t \sim \mathrm{WNN}(0, \sigma_u^2).$$

The OLS (ML) estimators of $\beta_0$ and $\sigma_u^2$ based on $y_0, y_1, y_2, \ldots, y_T$ are not consistent!
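
To see why, note that in the model with a constant only, the OLS (ML) estimator of $\beta_0$ is the sample mean and the estimator of $\sigma_u^2$ is the sample variance. Under the stationary AR(1) DGP these converge to the unconditional mean $\beta_0 / (1 - \beta_1) = 0.5 / 0.3 \approx 1.67$ and the unconditional variance $\sigma_u^2 / (1 - \beta_1^2) = 0.5 / 0.51 \approx 0.98$, not to 0.5 and 0.5. A minimal simulation sketch (the function name `example2` and the seed are hypothetical):

```python
import math
import random


def example2(T, seed=0):
    """Estimate y_t = beta0 + u_t on data generated by the AR(1) DGP."""
    rng = random.Random(seed)
    sd = math.sqrt(0.5)
    ys = [0.0]  # y_0 = 0 is part of the sample
    for _ in range(T):
        ys.append(0.5 + 0.7 * ys[-1] + rng.gauss(0.0, sd))
    n = len(ys)
    beta0_hat = sum(ys) / n  # OLS/ML estimate of the constant
    sigma2_hat = sum((v - beta0_hat) ** 2 for v in ys) / n  # ML variance
    return beta0_hat, sigma2_hat


print(example2(T=5000, seed=12345))
```

Increasing $T$ does not bring the estimates closer to the DGP values; they settle around the unconditional moments instead.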

Example 3 (structural break)

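DGP:

$$y_t = 0.5 + 0.7 y_{t-1}(1 - D_t) - 0.3 y_{t-1} D_t + u_t, \qquad u_t \sim \mathrm{WNN}(0, 1),$$

dummy:

$$D_t = \begin{cases} 1 & \text{if } t \ge T_1 \\ 0 & \text{otherwise} \end{cases}, \qquad 1 \le T_1 \le T.$$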

The econometrician/statistician specifies the statistical model:

$$y_t = \beta_0 + \beta_1 y_{t-1} + u_t, \qquad u_t \sim \mathrm{WNN}(0, \sigma_u^2).$$

Here the OLS (or ML) estimator of $\beta_1$ is not consistent!
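
The structural break case can also be simulated. In the sketch below the break date is set, as an illustrative assumption, at the middle of the sample ($T_1 = T/2$); the function names and the seed are hypothetical. The constant-parameter AR(1) is then estimated by OLS on the whole sample.

```python
import random


def simulate_break(T, T1, seed=None):
    """Return [y_0, ..., y_T] from the DGP with a break in the slope at T1."""
    rng = random.Random(seed)
    y = [0.0]
    for t in range(1, T + 1):
        D_t = 1 if t >= T1 else 0
        slope = 0.7 * (1 - D_t) - 0.3 * D_t  # 0.7 before the break, -0.3 after
        y.append(0.5 + slope * y[-1] + rng.gauss(0.0, 1.0))
    return y


def ols_ar1(y):
    """OLS estimates (beta0_hat, beta1_hat) of y_t = beta0 + beta1*y_{t-1} + u_t."""
    x, z = y[:-1], y[1:]
    T = len(z)
    x_bar, z_bar = sum(x) / T, sum(z) / T
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxz = sum((a - x_bar) * (b - z_bar) for a, b in zip(x, z))
    beta1 = sxz / sxx
    return z_bar - beta1 * x_bar, beta1


T = 4000
print(ols_ar1(simulate_break(T, T1=T // 2, seed=12345)))
```

The estimated slope ends up between the two regime values and matches neither 0.7 nor -0.3, however large the sample is, as long as the break occurs at a fixed fraction of it.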

DGP, statistical model and econometric model

Which relationship exists between the statistical model and the (dynamic) econometric model?

Econometricians usually call ‘statistical model’ what in their jargon is an econometric model in ‘reduced form’.

An econometric model can usually be expressed in two forms, reduced form and structural form:

$$\text{Econometric model} = \left\{ \begin{array}{l} \text{reduced form representation} \\ \text{structural form representation.} \end{array} \right.$$

An econometric model in reduced form is a model in which the endogenous variable(s) at time $t$ depend only on a set of variables, called predetermined variables, such that in order to know this set of variables at time $t$ one does not need to know the value of the endogenous variable at time $t$.

Example 4.

We want to explain the consumption behaviour of an economic agent. Let $c_t$ be the log of real per-capita consumption of the agent at time $t$, and let $w_t$ be the log of real per-capita financial wealth of the agent at time $t$. Imagine that according to the chosen theory:

$$c_t = \beta_0 + \beta_1 c_{t-1} + \beta_2 w_{t-1} + u_t, \qquad u_t \sim \mathrm{WN}(0, \sigma_u^2), \qquad t = 1, \ldots, T.$$

In this example, $c_t$ is the endogenous variable and $x_t := (1, c_{t-1}, w_{t-1})'$ is the vector of predetermined variables. According to this model, the consumption level of the agent at time $t$ depends on a constant, the consumption level in the previous period (habit persistence) and the level of financial wealth in the previous period; the knowledge of each element of $x_t$ does not require the knowledge of $c_t$!

Example 4 (continued).

Imagine now that the theory instead predicts that

$$c_t = \beta_0 + \beta_1 c_{t-1} + \beta_2 w_t + u_t, \qquad u_t \sim \mathrm{WN}(0, \sigma_u^2), \qquad t = 1, \ldots, T.$$

Is the vector $x_t := (1, c_{t-1}, w_t)'$ still predetermined?

We have the following doubt. Consumption and portfolio decisions (the allocation of non-consumed disposable income among different financial assets) might be simultaneous. Since portfolio decisions at time $t$ affect $w_t$, it follows that the knowledge of $w_t$ might require the contemporaneous knowledge of $c_t$!

The predetermined variables, by definition, do not include endogenous variables, i.e. variables that the model attempts to explain or that are directly influenced at time $t$ by the variable the model attempts to explain. A correctly specified econometric model in reduced form should not be affected by the so-called endogeneity bias issue.

Thus the econometric model coincides with the statistical model when it is expressed in reduced form.

Example 5.

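Structural Form:

$$R_t = \rho R_{t-1} + (1 - \rho)[\phi_b b_t + \phi_y \pi_t] + u_t$$

$$b_t = \alpha_1 b_{t-1} - \alpha_2 b_{t-2} - \delta (R_t - \pi_t) + \eta_t$$

$$\begin{pmatrix} u_t \\ \eta_t \end{pmatrix} \sim \mathrm{WNN}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{bmatrix} \sigma_u^2 & \sigma_{u,\eta} \\ \sigma_{u,\eta} & \sigma_\eta^2 \end{bmatrix} \right)$$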

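Reduced Form:

$$R_t = \pi_{11} R_{t-1} + \pi_{12} b_{t-1} + \pi_{13} b_{t-2} + \pi_{14} \pi_t + \varepsilon_{Rt}$$

$$b_t = \pi_{21} R_{t-1} + \pi_{22} b_{t-1} + \pi_{23} b_{t-2} + \pi_{24} \pi_t + \varepsilon_{bt}$$

$$\begin{pmatrix} \varepsilon_{Rt} \\ \varepsilon_{bt} \end{pmatrix} \sim \mathrm{WNN}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{bmatrix} \sigma_1^2 & \sigma_{1,2} \\ \sigma_{1,2} & \sigma_2^2 \end{bmatrix} \right).$$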

Obviously, the parameters of the two systems are strictly (linearly) connected.
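
The connection between the two systems can be sketched numerically. Collecting the endogenous variables on the left, the structural form reads $B_0 (R_t, b_t)' = \Gamma (R_{t-1}, b_{t-1}, b_{t-2}, \pi_t)' + (u_t, \eta_t)'$, so the reduced-form coefficient matrix is $\Pi = B_0^{-1} \Gamma$. The matrices $B_0$ and $\Gamma$, the function name `reduced_form` and the parameter values below are illustrative assumptions derived from the two structural equations, not values given in the slides.

```python
def reduced_form(rho, phi_b, phi_y, alpha1, alpha2, delta):
    """Return (B0, Gamma, Pi) with Pi = B0^{-1} * Gamma.

    Rows refer to the R_t and b_t equations; the columns of Gamma and Pi
    refer to R_{t-1}, b_{t-1}, b_{t-2} and pi_t, in this order.
    """
    B0 = [[1.0, -(1.0 - rho) * phi_b],
          [delta, 1.0]]
    Gamma = [[rho, 0.0, 0.0, (1.0 - rho) * phi_y],
             [0.0, alpha1, -alpha2, delta]]
    det = B0[0][0] * B0[1][1] - B0[0][1] * B0[1][0]
    B0_inv = [[B0[1][1] / det, -B0[0][1] / det],
              [-B0[1][0] / det, B0[0][0] / det]]
    Pi = [[sum(B0_inv[i][k] * Gamma[k][j] for k in range(2))
           for j in range(4)] for i in range(2)]
    return B0, Gamma, Pi


# Hypothetical structural parameter values, chosen only for illustration.
B0, Gamma, Pi = reduced_form(rho=0.8, phi_b=0.5, phi_y=1.5,
                             alpha1=1.2, alpha2=0.3, delta=0.1)
for row in Pi:
    print([round(x, 4) for x in row])
```

The reduced-form disturbances are the same transformation of the structural ones, $(\varepsilon_{Rt}, \varepsilon_{bt})' = B_0^{-1} (u_t, \eta_t)'$, so their covariance matrix is $B_0^{-1} \Sigma (B_0^{-1})'$, where $\Sigma$ is the covariance matrix of $(u_t, \eta_t)'$.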

When we specify an econometric model, our ambition is that its reduced form (statistical model) approximates as closely as possible the features of the underlying DGP.

Of course, an investigator will never know whether his/her statistical model is correctly specified (because the DGP is unknown by definition).

Imagine that the economy (or the market) has done a Monte Carlo simulation and generated some observations.

The econometrician/statistician does not know the actual features of the experiment.

However, he/she uses his/her theoretical knowledge about the phenomenon of interest and the available data to infer the salient features of that experiment.
