Advanced Quantitative Methods: Autocorrelation

Jos Elkink

University College Dublin

February 23, 2011





1 Consequences

2 Typical processes

3 Stationarity

4 Diagnostics

Plots

Tests: Autocorrelation

Tests: Stationarity

5 Spatial autocorrelation





Notation: lagged variables

Instead of $y_i$ to indicate each of $n$ observations, we will use $y_t$ to refer to each of $T$ observations in a time series.

$y_{t-1}$ refers to the lagged value, i.e. the value of variable $y$ at time $t-1$, the observation just one time period before time $t$.

A lag can have any length $k$ ($k > 0$), $y_{t-k}$.





Notation: first differences

The difference between $y_t$ and $y_{t-1}$, or the change in variable $y$ at time $t$, is called the first difference, $\Delta y_t = y_t - y_{t-1}$.

Again, first differences can be lagged by any length $k$: $\Delta y_{t-k} = y_{t-k} - y_{t-k-1}$.
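The notation above can be tried out on a small vector. The following is a minimal sketch in base R; the values of `y` and the helper `lag_k()` are made up for illustration and are not part of the original slides.

```r
# Toy series; the values are made up for illustration.
y <- c(3, 5, 4, 8, 7, 10)

# Lagged values y_{t-k}: shift the series k periods and pad with NA,
# because the first k observations have no predecessor.
lag_k <- function(x, k = 1) {
  stopifnot(k > 0, k < length(x))
  c(rep(NA, k), x[seq_len(length(x) - k)])
}

y_lag1 <- lag_k(y, 1)   # y_{t-1}
y_lag2 <- lag_k(y, 2)   # y_{t-2}

# First difference: Delta y_t = y_t - y_{t-1}.
d_y <- y - y_lag1

# Base R's diff() returns the same values without the leading NA.
d_y_builtin <- diff(y)

# Lagged first difference: Delta y_{t-1} = y_{t-1} - y_{t-2}.
d_y_lag1 <- lag_k(d_y, 1)

print(cbind(y, y_lag1, y_lag2, d_y, d_y_lag1))
```

Padding with `NA` keeps all columns the same length, which is convenient when the lagged variables are later used in `lm()`.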





1 Consequences






The problem

A key assumption of (linear) regression is that observations are independent.

Generally, in time series or in observations in space, the observations depend on each other. If GDP is high in 1999, it is likely to be high in 2000. If GDP is high in Germany, it is likely to be high in the Netherlands.

Treating them as independent observations suggests that you have far more information than you do.





The problem

Ignoring this autocorrelation leads to:

$\hat\beta_{OLS}$ unbiased but inefficient (as long as $E(\varepsilon|X) = 0$)

The estimated $V(\hat\beta_{OLS})$ may be an under- or overestimate, so the $F$- and $t$-tests cannot be trusted. If the autocorrelation is positive, $V(\hat\beta_{OLS})$ will typically be an underestimate.

The residual variance is likely to be underestimated and $R^2$ overestimated.

Risk of spurious regressions





Spurious regressions

When two variables are uncorrelated, but nonstationary, they often lead to highly significant estimates of their correlation in "naive" linear regression. Assume:

$$y_t = y_{t-1} + \varepsilon_{1,t}$$

$$x_t = x_{t-1} + \varepsilon_{2,t}.$$

Then OLS estimation of:

$$y_t = \alpha + \beta x_t + \varepsilon_t$$

will often lead to a significant $t$-test on $\beta$.
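How often the naive regression comes out "significant" can be checked by simulation. The sketch below uses base R only; the seed, the series length and the number of replications are arbitrary choices for illustration, not values from the slides.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

n_obs  <- 100   # length of each series
n_sims <- 200   # number of simulated pairs

# One replication: two independent random walks, then a naive OLS
# regression of one on the other. Returns the p-value of the slope.
spurious_p <- function(n) {
  y <- cumsum(rnorm(n))  # y_t = y_{t-1} + e_{1,t}
  x <- cumsum(rnorm(n))  # x_t = x_{t-1} + e_{2,t}, independent of y
  summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
}

p_values <- replicate(n_sims, spurious_p(n_obs))

# Share of replications in which the slope is "significant" at 5%.
# For unrelated stationary series this would be close to 0.05.
rejection_rate <- mean(p_values < 0.05)
print(rejection_rate)
```

A rejection rate far above 0.05 for series that are unrelated by construction is what makes the regression spurious.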






[Figure: time-series plot of the sample data; x-axis 0 to 100, y-axis "Sample data".]





Spurious regression

lm(formula = y ~ x)

            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.9646     0.3626  -2.660  0.00911 **
x            -0.9207     0.1002  -9.185 6.54e-15 ***

Residual standard error: 3.021 on 99 degrees of freedom
Multiple R-Squared: 0.4601, Adjusted R-squared: 0.4547
F-statistic: 84.37 on 1 and 99 DF, p-value: 6.544e-15





2 Typical processes






Time-series processes

A time series can have been generated by various different types of processes.

Which process generated the data of course affects which econometric model is more appropriate to estimate its parameters.





Linear model

The linear regression model looks like:

$$y = \mu + \varepsilon,$$

where $\mu = X\beta$, or, if we have no explanatory variables, $\mu$ is a constant.

For now, we will look at the latter case, $\mu_t = \mu$.

In the linear model, we assume $\varepsilon$ to be an IID variable, $\varepsilon \sim N(0, \sigma^2)$.





Moving average process

In the moving average model, we replace the assumption of entirely independent errors by assuming that the error at time $t$ is a weighted sum of the current shock $\varepsilon_t$ and the shock one period earlier, $\varepsilon_{t-1}$.

$$y_t = \mu + (\varepsilon_t + \phi\varepsilon_{t-1}), \qquad -1 < \phi < 1$$






The above is a so-called MA(1) process, a moving average process with one lag.

This model can be generalised to more lags, e.g. two lags,

$$y_t = \mu + (\varepsilon_t + \phi_1\varepsilon_{t-1} + \phi_2\varepsilon_{t-2}),$$

and in general the MA($q$) process:

$$y_t = \mu + \Big(\varepsilon_t + \sum_{l=1}^{q} \phi_l\varepsilon_{t-l}\Big)$$






Theoretically this model can be generalised to infinitely many lags:

$$y_t = \mu + \Big(\varepsilon_t + \sum_{l=1}^{\infty} \phi_l\varepsilon_{t-l}\Big)$$

Now, we could assume that $\phi_l = \alpha^l$, for some $|\alpha| < 1$, thus an exponentially decreasing function of the lag.





Autoregressive process

$$y_t = \mu + \sum_{l=0}^{\infty} \alpha^l\varepsilon_{t-l}$$

This can be shown to be equivalent to:

$$y_t = (1 - \alpha)\mu + \alpha y_{t-1} + \varepsilon_t = \delta + \alpha y_{t-1} + \varepsilon_t, \qquad \delta = (1 - \alpha)\mu,$$

which is called the autoregressive process.
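The equivalence stated above can be written out in three steps. This is a sketch of the argument, assuming $|\alpha| < 1$ so that the infinite sum converges; the idea is to split off the $l = 0$ term and recognise the remaining sum as $y_{t-1} - \mu$.

```latex
\begin{align*}
y_t &= \mu + \sum_{l=0}^{\infty} \alpha^{l} \varepsilon_{t-l}
     = \mu + \varepsilon_t + \alpha \sum_{l=0}^{\infty} \alpha^{l} \varepsilon_{t-1-l} \\
y_{t-1} &= \mu + \sum_{l=0}^{\infty} \alpha^{l} \varepsilon_{t-1-l}
  \quad\Rightarrow\quad
  \sum_{l=0}^{\infty} \alpha^{l} \varepsilon_{t-1-l} = y_{t-1} - \mu \\
y_t &= \mu + \varepsilon_t + \alpha (y_{t-1} - \mu)
     = (1 - \alpha)\mu + \alpha y_{t-1} + \varepsilon_t
\end{align*}
```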






The AR(1) process can also be extended to the AR($p$) process:

$$y_t = \delta + \sum_{l=1}^{p} \alpha_l y_{t-l} + \varepsilon_t$$

Whereby

$$y_t = \delta + \sum_{l=1}^{\infty} \alpha_l y_{t-l} + \varepsilon_t$$

would, for suitable coefficients $\alpha_l$, be equivalent to an MA(1) process.






[Figure: simulated data, MA(1), $\phi = .5$; sample data against $t = 0, \ldots, 100$.]

[Figure: simulated data, MA(1), $\phi = .9$; sample data against $t = 0, \ldots, 100$.]

[Figure: simulated data, AR(1), $\alpha = .5$; sample data against $t = 0, \ldots, 100$.]

[Figure: simulated data, AR(1), $\alpha = .9$; sample data against $t = 0, \ldots, 100$.]
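Series like the four simulated ones above can be generated with `arima.sim()` from base R's stats package. This is a sketch under assumptions: the seed is arbitrary, $\mu$ and $\delta$ are set to zero, and the innovations are standard normal, so the output will not reproduce the original figures.

```r
set.seed(2011)  # arbitrary seed, only for reproducibility
n_obs <- 100

# MA(1): y_t = e_t + phi * e_{t-1}
ma_05 <- arima.sim(model = list(ma = 0.5), n = n_obs)
ma_09 <- arima.sim(model = list(ma = 0.9), n = n_obs)

# AR(1): y_t = alpha * y_{t-1} + e_t
ar_05 <- arima.sim(model = list(ar = 0.5), n = n_obs)
ar_09 <- arima.sim(model = list(ar = 0.9), n = n_obs)

# Four panels, one per process.
old_par <- par(mfrow = c(2, 2))
plot(ma_05, main = "MA(1), phi = .5", ylab = "Sample data")
plot(ma_09, main = "MA(1), phi = .9", ylab = "Sample data")
plot(ar_05, main = "AR(1), alpha = .5", ylab = "Sample data")
plot(ar_09, main = "AR(1), alpha = .9", ylab = "Sample data")
par(old_par)
```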





ARMA(p,q)

The moving average process, MA(q), and the autoregressiveprocess, AR(p), can be combined in the ARMA(p,q) process.





ARMA(p,q)

The moving average process, MA(q), and the autoregressiveprocess, AR(p), can be combined in the ARMA(p,q) process.

yt = µ+

p∑

l=1

yt−lαl +

q∑

l=1

εt−lφl + εt





3 Stationarity






Stationarity

A process is strictly stationary if the underlying probability distribution is constant over time.

A process is weakly stationary if the following conditions hold:

$$E(y_t) = \mu \quad \forall\, t$$

$$\mathrm{Var}(y_t) = \sigma^2 \quad \forall\, t$$

$$\mathrm{Cov}(y_t, y_{t-k}) = \mathrm{Cov}(y_{t+j}, y_{t+j-k}) \quad \forall\, t, k, j$$

and it follows that the autocorrelations will depend on the lag length only:

$$\mathrm{Cor}(y_t, y_{t-k}) = \frac{\mathrm{Cov}(y_t, y_{t-k})}{\sqrt{\mathrm{Var}(y_t)\mathrm{Var}(y_{t-k})}} = \rho_k.$$

(Harrison 2009: 3)





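Stationarity: example

$$y_t = \varepsilon_t + 0.5\varepsilon_{t-1}$$

[Figure: simulated series from this process, $t = 0, \ldots, 100$.]

Nonstationarity: example

$$y_t = y_{t-1} + \varepsilon_t$$

[Figure: simulated series from this process, $t = 0, \ldots, 100$.]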





Integrated

If $E(y_t)$, $\mathrm{Var}(y_t)$ and $\mathrm{Cov}(y_t, y_{t-k})$ converge to limits $\mu^*$, $\sigma^{*2}$ and $\rho^*_k$, respectively, as $t \to \infty$, then the process is called asymptotically stationary, or integrated of order zero, or $I(0)$.

A stationary process is thus $I(0)$, but an $I(0)$ process is not necessarily stationary.

(Harrison 2009: 40)





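MA(1) properties

$$y_t = \mu + \varepsilon_t + \phi\varepsilon_{t-1}$$

$$\begin{aligned}
E(y_t) &= E(\mu + \varepsilon_t + \phi\varepsilon_{t-1}) \\
       &= E(\mu) + E(\varepsilon_t) + \phi E(\varepsilon_{t-1}) \\
       &= \mu + 0 + 0 = \mu
\end{aligned}$$

(Harrison 2009: 4-5)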





MA(1) properties


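$$\begin{aligned}
\mathrm{Var}(y_t) &= \mathrm{Var}(\mu + \varepsilon_t + \phi\varepsilon_{t-1}) \\
                  &= \mathrm{Var}(\varepsilon_t) + \phi^2\mathrm{Var}(\varepsilon_{t-1}) + 2\mathrm{Cov}(\varepsilon_t, \phi\varepsilon_{t-1}) \\
                  &= \sigma^2 + \phi^2\sigma^2 \\
                  &= \sigma^2(1 + \phi^2)
\end{aligned}$$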






MA(1) properties


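$$\begin{aligned}
\mathrm{Cov}(y_t, y_{t-1}) &= E\big((\varepsilon_t + \phi\varepsilon_{t-1})(\varepsilon_{t-1} + \phi\varepsilon_{t-2})\big) \\
  &= E(\varepsilon_t\varepsilon_{t-1} + \phi\varepsilon_{t-1}^2 + \phi\varepsilon_t\varepsilon_{t-2} + \phi^2\varepsilon_{t-1}\varepsilon_{t-2}) \\
  &= E(\varepsilon_t\varepsilon_{t-1}) + \phi E(\varepsilon_{t-1}^2) + \phi E(\varepsilon_t\varepsilon_{t-2}) + \phi^2 E(\varepsilon_{t-1}\varepsilon_{t-2}) \\
  &= 0 + \phi\sigma^2 + 0 + 0 = \phi\sigma^2
\end{aligned}$$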






MA(1) properties


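$$\rho_1 = \mathrm{Cor}(y_t, y_{t-1}) = \frac{\mathrm{Cov}(y_t, y_{t-1})}{\sqrt{\mathrm{Var}(y_t)\mathrm{Var}(y_{t-1})}} = \frac{\phi\sigma^2}{\sigma^2(1 + \phi^2)} = \frac{\phi}{1 + \phi^2}$$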






MA(1) properties


$\mathrm{Cov}(y_t, y_{t-k}) = 0$ and $\rho_k = 0$ for all $k > 1$, thus an MA(1) process has a "memory" of one lag.

$E(y_t)$ and $\mathrm{Var}(y_t)$ do not depend on $t$, and $\mathrm{Cov}(y_t, y_{t-k})$ depends only on lag length $k$, so MA(1) is stationary.

$|\rho_1| \leq \frac{1}{2}$, thus MA(1) is not an appropriate model if the correlation is higher.
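The bound on $\rho_1$ can be checked numerically. The sketch below evaluates $\phi / (1 + \phi^2)$ on a grid and cross-checks one value against `ARMAacf()` from base R's stats package; the grid step and the value $\phi = 0.5$ are arbitrary choices for illustration.

```r
# Theoretical lag-1 autocorrelation of an MA(1) process.
rho1_ma1 <- function(phi) phi / (1 + phi^2)

# Evaluate over the range -1 <= phi <= 1.
phi_grid <- seq(-1, 1, by = 0.01)
rho1     <- rho1_ma1(phi_grid)

# Largest absolute value on the grid, attained at |phi| = 1.
max_abs_rho1 <- max(abs(rho1))
print(max_abs_rho1)

# Cross-check against stats::ARMAacf for phi = 0.5:
# lag 0 is 1, lag 1 is phi / (1 + phi^2), higher lags are 0.
acf_theory <- ARMAacf(ma = 0.5, lag.max = 4)
print(acf_theory)
```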






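AR(1) properties

$$y_t = \delta + \alpha y_{t-1} + \varepsilon_t$$

$$\begin{aligned}
E(y_t) &= E(\delta + \alpha y_{t-1} + \varepsilon_t) \\
       &= E(\delta) + \alpha E(y_{t-1}) + E(\varepsilon_t) \\
       &= \delta + \alpha E(y_t) + 0 \\
(1 - \alpha)E(y_t) &= \delta \\
E(y_t) &= \frac{\delta}{1 - \alpha}
\end{aligned}$$

Note that stating that $E(y_t) = E(y_{t-1})$ assumes a stationary process!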





AR(1) properties


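$$\begin{aligned}
\mathrm{Var}(y_t) &= \mathrm{Var}(\delta + \alpha y_{t-1} + \varepsilon_t) \\
                  &= \alpha^2\mathrm{Var}(y_t) + \sigma^2 + 2\mathrm{Cov}(\alpha y_{t-1}, \varepsilon_t) \\
(1 - \alpha^2)\mathrm{Var}(y_t) &= \sigma^2 + 0 \\
\mathrm{Var}(y_t) &= \frac{\sigma^2}{1 - \alpha^2}
\end{aligned}$$

Note that stating that $\mathrm{Var}(y_t) = \mathrm{Var}(y_{t-1})$ assumes a stationary process!

(Harrison 2009: 6, 39-40)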




AR(1) properties


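$$\mathrm{Cov}(y_t, y_{t-1}) = \frac{\alpha\sigma^2}{1 - \alpha^2}$$

$$\mathrm{Cov}(y_t, y_{t-k}) = \frac{\alpha^k\sigma^2}{1 - \alpha^2}$$

$$\rho_k = \mathrm{Cor}(y_t, y_{t-k}) = \alpha^k$$

(Harrison 2009: 6, 39-40)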





AR(1) properties


If $\alpha = 1$, neither $E(y_t)$ nor $\mathrm{Var}(y_t)$ exists, so the process is nonstationary.

If $\alpha = -1$, $\mathrm{Var}(y_t)$ does not exist, so the process is nonstationary.

If $|\alpha| > 1$, the formula would imply $\mathrm{Var}(y_t) < 0$, which is impossible, so the process is nonstationary.

An AR(1) process has a much longer memory than an MA(1) process, but if $|\alpha| < 1$, $\rho_k$ decreases exponentially with $k$.

(Harrison 2009: 6, 39-40)





AR(1) properties


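If we do not assume a stationary process:

$$y_t = \delta + \alpha(\delta + \alpha y_{t-2} + \varepsilon_{t-1}) + \varepsilon_t$$

$$\vdots$$

$$y_t = \delta(1 + \alpha + \cdots + \alpha^{t-1}) + \alpha^t y_0 + \varepsilon_t + \alpha\varepsilon_{t-1} + \cdots + \alpha^{t-1}\varepsilon_1,$$

with $y_0$ being some starting value of $y$.

(Harrison 2009: 6, 39-40)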





AR(1) properties


Then it follows:

$$E(y_t) = \delta(1 + \alpha + \cdots + \alpha^{t-1}) + \alpha^t y_0, \qquad t \geq 1$$

$$\mathrm{Var}(y_t) = \sigma^2(1 + \alpha^2 + \cdots + \alpha^{2(t-1)}), \qquad t \geq 1$$

$$\mathrm{Cov}(y_t, y_{t-k}) = \alpha^k\mathrm{Var}(y_{t-k}), \qquad 1 \leq k \leq t - 1$$

thus all depend on $t$ and AR(1) is not stationary. However, if $|\alpha| < 1$ and as $t \to \infty$, the previous results obtain. AR(1) is thus asymptotically stationary or $I(0)$.

(Harrison 2009: 6, 39-40)
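The convergence of the finite-$t$ moments to the stationary values can be illustrated numerically. The sketch below evaluates the formulas above directly, with no simulation; the parameter values ($\delta = 1$, $\alpha = 0.8$, $\sigma^2 = 1$, $y_0 = 10$) are arbitrary choices for illustration.

```r
delta  <- 1      # intercept (illustrative value)
alpha  <- 0.8    # AR coefficient, |alpha| < 1
sigma2 <- 1      # innovation variance
y0     <- 10     # fixed starting value

t_max <- 200
tt    <- seq_len(t_max)

# Finite-t moments:
# E(y_t)   = delta * (1 + alpha + ... + alpha^(t-1)) + alpha^t * y0
# Var(y_t) = sigma2 * (1 + alpha^2 + ... + alpha^(2(t-1)))
mean_t <- delta * cumsum(alpha^(tt - 1)) + alpha^tt * y0
var_t  <- sigma2 * cumsum(alpha^(2 * (tt - 1)))

# Stationary limits.
mean_limit <- delta / (1 - alpha)
var_limit  <- sigma2 / (1 - alpha^2)

# Moments at a few time points next to their limits.
show_t <- c(1, 5, 20, 200)
print(rbind(t = show_t, mean = mean_t[show_t], var = var_t[show_t]))
print(c(mean_limit = mean_limit, var_limit = var_limit))
```

With $|\alpha| < 1$ both sequences settle on their limits within a few dozen periods; setting `alpha <- 1` instead makes `var_t` grow linearly in $t$.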





Unit root

An AR(1) process where $\alpha = 1$ (i.e., $y_t = \delta + y_{t-1} + \varepsilon_t$) is said to have a unit root.

(Harrison 2009: 42-46)





Unit root


Unit roots can be much harder to detect. E.g. $y_t = \delta + 0.8y_{t-1} + 0.2y_{t-2} + \varepsilon_t$ also has a unit root.

(Harrison 2009: 42-46)
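The AR(2) example can be checked by computing the roots of its characteristic polynomial $1 - 0.8z - 0.2z^2$ with base R's `polyroot()`. This is a minimal sketch; the tolerance used to decide that a modulus equals 1 is an arbitrary choice.

```r
# AR(2) example: y_t = delta + 0.8 y_{t-1} + 0.2 y_{t-2} + e_t.
# Characteristic polynomial in z: 1 - 0.8 z - 0.2 z^2.
# polyroot() takes coefficients in increasing order of the power of z.
ar_coefs <- c(0.8, 0.2)
roots    <- polyroot(c(1, -ar_coefs))

print(roots)
print(Mod(roots))

# A root with modulus 1 is a unit root; stationarity requires all
# roots to lie strictly outside the unit circle.
has_unit_root <- any(abs(Mod(roots) - 1) < 1e-8)
print(has_unit_root)
```

The polynomial factors as $-0.2(z - 1)(z + 5)$, so one root is exactly $z = 1$.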





Unit root

Consequences:

Consistency and asymptotic normality proofs of OLS, GLS, ML, IV are invalid.

Regressing two variables with unit roots on each other leads to spurious regression.

(Harrison 2009: 42-46)





4 Diagnostics

Plots







Residual plots: no autocorrelation

[Figure: scatter plot of y against x.]

[Figure: residuals(m)[-1] plotted against residuals(m)[-T], i.e. each residual against the previous one.]

Residual plots: autocorrelation

[Figure: scatter plot of y against x.]

[Figure: residuals(m)[-1] plotted against residuals(m)[-T], i.e. each residual against the previous one.]






Autocorrelation function

The autocorrelation function (ACF), or correlogram, is the correlation between $y_t$ and $y_{t-k}$, as a function of $k$.

If we define

$$\mathrm{Var}(y_t) = \mathrm{Var}(y_{t-k}) = \gamma_0$$

$$\mathrm{Cov}(y_t, y_{t-k}) = \gamma_k,$$

then

$$\rho_k = \frac{\gamma_k}{\sqrt{\gamma_0\gamma_0}} = \frac{\gamma_k}{\gamma_0}$$







For the moving average model, MA(1):

$$\rho_1 = \frac{\phi}{1 + \phi^2}, \qquad \rho_k = 0 \quad \forall\, k > 1$$

For the autoregressive model, AR(1):

$$\rho_k = \alpha^k$$







[Figure: theoretical ACF, AR(1) process, $\alpha = .5$; $\rho_k$ against lag $k$.]

[Figure: theoretical ACF, AR(1) process, $\alpha = .9$; $\rho_k$ against lag $k$.]

[Figure: theoretical ACF, MA(1) process, $\phi = .5$; $\rho_k$ against lag $k$.]

[Figure: theoretical ACF, MA(1) process, $\phi = .9$; $\rho_k$ against lag $k$.]






Example

[Figure: empirical data, change in GDP per capita, Netherlands; x-axis: year, 1960 to 1990.]

[Figure: empirical ACF, change in GDP per capita, Netherlands; autocorrelation against lag.]






Partial autocorrelation

Instead of looking at the autocorrelation function, one can look at the partial autocorrelation function (PACF). This describes the correlation between $y_t$ and $y_{t-k}$, given all values of $y$ in between.

This can quite simply be calculated by looking at $\alpha_k$, the coefficient on the $k$th lag in the AR($k$) model.

An AR($p$) has an exponentially decreasing ACF and a sharp cut-off point in the PACF. The cut-off point suggests the proper value for $p$. A very slow (linear) decline in the ACF suggests a unit root.

An MA($q$) has a sharp cut-off point in the ACF. The cut-off point suggests the proper value for $q$.
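The cut-off patterns described above can be seen in simulated data. The sketch below uses `arima.sim()`, `acf()` and `pacf()` from base R's stats package; the seed, the series length and the coefficients are arbitrary choices for illustration.

```r
set.seed(42)   # arbitrary seed
n_obs <- 5000  # long series so sample values sit close to theory

y_ar <- arima.sim(model = list(ar = c(0.5, 0.3)), n = n_obs)  # AR(2)
y_ma <- arima.sim(model = list(ma = 0.8), n = n_obs)          # MA(1)

# Sample ACF and PACF without plotting. Note that acf() reports lag 0
# as its first element, while pacf() starts at lag 1.
acf_ar  <- drop(acf(y_ar, lag.max = 10, plot = FALSE)$acf)
pacf_ar <- drop(pacf(y_ar, lag.max = 10, plot = FALSE)$acf)
acf_ma  <- drop(acf(y_ma, lag.max = 10, plot = FALSE)$acf)

# Rough 95% band under the null of no autocorrelation.
band <- 1.96 / sqrt(n_obs)

print(round(pacf_ar, 2))     # expected to cut off after lag 2
print(round(acf_ma[-1], 2))  # expected to cut off after lag 1
print(round(band, 3))
```

Dropping `plot = FALSE` draws the usual correlograms with the band added as dashed lines.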






4 Diagnostics

Tests: Autocorrelation







Durbin-Watson

$$d = \frac{\sum_{t=2}^{T}(e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}$$

If $\rho = \mathrm{cor}(\varepsilon_t, \varepsilon_{t-1})$ and $\hat\rho = \mathrm{cor}(e_t, e_{t-1})$, then $d \approx 2(1 - \hat\rho)$. Thus, if $d$ is close to 0 or 4, there is high first-order serial autocorrelation (positive near 0, negative near 4).

Note that $E(d) \approx 2 + \frac{2(k-1)}{n-k}$, thus biased.
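The statistic and its approximation can be computed by hand from OLS residuals. The sketch below uses base R only; the simulated data (AR(1) errors with coefficient 0.6, a single regressor) are an assumption made for illustration.

```r
set.seed(7)  # arbitrary seed
n_obs <- 200

# Regression with AR(1) errors, rho = 0.6 (illustrative values).
x <- rnorm(n_obs)
u <- as.numeric(arima.sim(model = list(ar = 0.6), n = n_obs))
y <- 1 + 2 * x + u

m <- lm(y ~ x)
e <- residuals(m)

# Durbin-Watson statistic from its definition.
d <- sum(diff(e)^2) / sum(e^2)

# Lag-1 correlation of the residuals and the implied approximation.
rho_hat  <- cor(e[-1], e[-n_obs])
d_approx <- 2 * (1 - rho_hat)

print(c(d = d, d_approx = d_approx, rho_hat = rho_hat))
```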






Durbin-Watson

In matrix algebra, it could be written as:

$$d = \frac{\varepsilon' M A M \varepsilon}{\varepsilon' M \varepsilon}, \qquad M = I - X(X'X)^{-1}X',$$

whereby

$$A = \begin{pmatrix}
1 & -1 & 0 & 0 & \cdots & 0 & 0 \\
-1 & 2 & -1 & 0 & \cdots & 0 & 0 \\
0 & -1 & 2 & -1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 2 & -1 \\
0 & 0 & 0 & 0 & \cdots & -1 & 1
\end{pmatrix}$$

The sampling distribution thus depends on $X$.
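The matrix expression can be verified against the sum-based definition. The sketch below builds $A$ and $M$ in base R for a small simulated data set (arbitrary seed and coefficients) and uses $My = M\varepsilon = e$.

```r
set.seed(11)  # arbitrary seed
n_obs <- 30
x <- rnorm(n_obs)
y <- 1 + 0.5 * x + rnorm(n_obs)

X <- cbind(1, x)               # design matrix with intercept
e <- residuals(lm(y ~ x))

# Tridiagonal A: 2 on the diagonal (1 in the two corners), -1 next to it.
A <- diag(2, n_obs)
A[1, 1] <- 1
A[n_obs, n_obs] <- 1
A[abs(row(A) - col(A)) == 1] <- -1

# Residual maker M = I - X (X'X)^{-1} X'.
M <- diag(n_obs) - X %*% solve(crossprod(X)) %*% t(X)

# Since M y = M eps = e, d can be computed from y directly.
d_sum    <- sum(diff(e)^2) / sum(e^2)
d_matrix <- drop(t(y) %*% M %*% A %*% M %*% y) / drop(t(y) %*% M %*% y)

print(c(d_sum = d_sum, d_matrix = d_matrix))
```

Because $M$ is built from $X$, two data sets with the same errors but different regressors give different distributions for $d$.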





Durbin-Watson

Since the probability distribution of $d$ is not exactly known (it depends on $X$), we can use threshold values. Given $T$ and $k$, boundary values $d_L$ and $d_U$ have been tabulated.

E.g., if $T = 50$, $k = 6$ and the significance level is .05, then $d_L = 1.335$ and $d_U = 1.771$, so we reject $H_0: \rho = 0$ in favour of $H_1: \rho > 0$ if $d < d_L$ and we do not reject if $d > d_U$, but in between we are undecided.

These threshold values are approximations and, depending on the speed at which regressors change, can be more or less appropriate.






Durbin-Watson

library(lmtest)

dwtest(model)

Somewhat "old-fashioned" test, requiring a special table.

Assumes normally distributed errors.

Model must include intercept.

Requires $X$ to be non-stochastic.

Only tests for presence of AR(1) process.






Durbin’s h test

The Durbin-Watson statistic cannot be used when there is a lagged dependent variable in the model. With such a variable you should, however, always test for remaining autocorrelation. One possible test is Durbin’s h-test.

$$h = \left(1 - \tfrac{1}{2}d\right)\sqrt{\frac{T}{1 - T \cdot \hat V(\hat\beta_{y_{t-1}})}} \overset{a}{\sim} N(0, 1).$$
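The $h$ statistic can be computed by hand from a fitted model with a lagged dependent variable. The sketch below is one possible implementation in base R; the simulated process, the seed and the variable names (`y_lag`, `n_eff`) are assumptions made for illustration.

```r
set.seed(5)  # arbitrary seed
n_obs <- 200

# Simulate y_t = 0.5 y_{t-1} + x_t + e_t (illustrative values).
x <- rnorm(n_obs)
y <- numeric(n_obs)
for (t in 2:n_obs) y[t] <- 0.5 * y[t - 1] + x[t] + rnorm(1)

# Drop the first observation so y and its lag line up.
dat <- data.frame(y = y[-1], y_lag = y[-n_obs], x = x[-1])
m   <- lm(y ~ y_lag + x, data = dat)

e     <- residuals(m)
n_eff <- length(e)                   # T in the formula
d     <- sum(diff(e)^2) / sum(e^2)   # Durbin-Watson statistic
v_lag <- vcov(m)["y_lag", "y_lag"]   # estimated variance of the lag coefficient

# h is only defined when T * V < 1.
h <- if (n_eff * v_lag < 1) {
  (1 - d / 2) * sqrt(n_eff / (1 - n_eff * v_lag))
} else {
  NA_real_
}

# Two-sided p-value from the asymptotic N(0, 1) distribution.
p_value <- 2 * pnorm(-abs(h))
print(c(h = h, p_value = p_value))
```

The guard on `n_eff * v_lag` matters in practice: when the product is 1 or larger the square root is undefined and another test has to be used.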






Breusch-Godfrey LM test

A more powerful test, which can handle higher order autoregressions, is the Breusch-Godfrey LM test.

1 Estimate OLS

2 Regress $e$ on $X$ and lagged values of $e$ ($e_{t-1}, e_{t-2}, \cdots, e_{t-k}$)

3 $(T - k)R^2 \overset{a}{\sim} \chi^2(k)$
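The three steps can be carried out by hand in base R. The sketch below simulates a regression with AR(1) errors (arbitrary seed and coefficients) and drops the first $k$ observations in the auxiliary regression; packaged implementations may treat those initial observations differently, so their numbers need not match exactly.

```r
set.seed(3)  # arbitrary seed
n_obs <- 200
k     <- 3     # number of residual lags to test

x <- rnorm(n_obs)
u <- as.numeric(arima.sim(model = list(ar = 0.7), n = n_obs))
y <- 1 + 2 * x + u

# Step 1: OLS and residuals.
e <- residuals(lm(y ~ x))

# Step 2: regress e on X and e_{t-1}, ..., e_{t-k}.
# embed() puts e_t in column 1 and its lags in columns 2..k+1,
# dropping the first k observations.
lagged <- embed(e, k + 1)
aux    <- lm(lagged[, 1] ~ x[-seq_len(k)] + lagged[, -1])

# Step 3: (T - k) * R^2 is asymptotically chi-squared with k df.
r2      <- summary(aux)$r.squared
lm_stat <- (n_obs - k) * r2
p_value <- pchisq(lm_stat, df = k, lower.tail = FALSE)

print(c(lm_stat = lm_stat, p_value = p_value))
```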









library(lmtest)

bgtest(model, order = 3)

This assumes normally distributed errors. A slightly more general Gauss-Newton regression would not make this assumption.






Gauss-Newton regression

Assume AR(1) errors: $y_t = x_t'\beta + u_t$, $u_t = \rho u_{t-1} + \varepsilon_t$.

In this case, we can simply first regress $y$ on $X$, and then use the residuals from this regression ($\hat u$) to regress $\hat u$ on $X$ and $\tilde u$, whereby $\tilde u_1 = 0$ and $\tilde u_t = \hat u_{t-1}$ $\forall\, t > 1$:

$$\hat u = X\beta + \tilde u\rho + \varepsilon$$

The test can easily be extended by including multiple lags and performing an $F$-test on all $\rho$'s.

The test is also valid for testing MA($q$) or ARMA($p$,$q$) processes.

(Davidson & MacKinnon 1993: 357-360)







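# y, x1 and x2 are assumed to exist already, ordered in time
m <- lm(y ~ x1 + x2)

n_obs <- nobs(m)              # number of observations T
u <- residuals(m)             # OLS residuals
u.tilde <- c(0, u[-n_obs])    # lagged residuals, first value set to 0

summary(lm(u ~ x1 + x2 + u.tilde))

and then check the t-test for the u.tilde variable.

(Davidson & MacKinnon 1993: 357-360)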






4 Diagnostics

Tests: Stationarity







Dickey-Fuller test

Subtracting $y_{t-1}$ from both sides of $y_t = \alpha y_{t-1} + \varepsilon_t$ gives:

$$\Delta y_t = (\alpha - 1)y_{t-1} + \varepsilon_t = \beta y_{t-1} + \varepsilon_t$$

so we can regress $\Delta y_t = y_t - y_{t-1}$ on $y_{t-1}$ and test $H_0: \beta = 0$ to test whether there is a unit root.

However, under the $H_0$ of a unit root, $\Delta y_t \sim I(0)$ and $y_t \sim I(1)$, so the usual $t$-test is invalid. Critical values $\tau_{nc}$, $\tau_c$ and $\tau_{ct}$ have been published for processes without constant, with constant, and with constant and trend, respectively.

The test assumes no autocorrelation in $\varepsilon$.

(Harrison 2009: 46-47)
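The Dickey-Fuller regression without constant can be run by hand in base R. The sketch below is an illustration under assumptions: the seed and series are simulated, the helper `df_stat()` is hypothetical, and the 5% critical value of about -1.95 for the no-constant case is taken from standard Dickey-Fuller tables and should be checked against a table for the sample size at hand.

```r
set.seed(99)  # arbitrary seed
n_obs <- 500

# Dickey-Fuller regression without constant:
# Delta y_t = beta * y_{t-1} + e_t.
# Returns the t-statistic on beta, to be compared with tau_nc.
df_stat <- function(y) {
  d_y   <- diff(y)            # Delta y_t, t = 2..T
  y_lag <- y[-length(y)]      # y_{t-1}
  fit   <- lm(d_y ~ 0 + y_lag)
  summary(fit)$coefficients["y_lag", "t value"]
}

random_walk <- cumsum(rnorm(n_obs))                                      # alpha = 1
stationary  <- as.numeric(arima.sim(model = list(ar = 0.5), n = n_obs))  # alpha = 0.5

tau_rw <- df_stat(random_walk)
tau_st <- df_stat(stationary)

# Approximate 5% critical value, no-constant case (assumed, see lead-in).
crit_nc <- -1.95

# Reject the unit-root null when the statistic is below the critical value.
print(c(tau_rw = tau_rw, tau_st = tau_st, crit_nc = crit_nc))
```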






Augmented Dickey-Fuller test

The DF test only works for AR(1) processes without autocorrelation in $\varepsilon$. For AR($p$) processes or AR(1) processes with autocorrelated errors, we can use the ADF test:

$$\Delta y_t = \beta^* y_{t-1} + \sum_{k=1}^{p} \delta_k \Delta y_{t-k} + \varepsilon_t$$

and the same $\tau$'s can be used as critical values for $t_{\beta^*}$.

The power of this test is low (i.e. it too often fails to reject a unit root when the process is in fact stationary). The power depends on the type and strength of the autocorrelation.

(Davidson & MacKinnon 1999: 610-613; Harrison 2009: 48)







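Dickey-Fuller test

Dickey-Fuller test:

library(tseries)

# k = 0: no lagged differences, i.e. the plain DF regression
# (adf.test includes a constant and a linear trend)
adf.test(x, k = 0)

Augmented Dickey-Fuller test:

# default k is trunc((length(x) - 1)^(1/3))
adf.test(x)

# two lagged differences
adf.test(x, k = 2)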





5 Spatial autocorrelation






No spatial autocorrelation

[Figure: values on a 20 × 20 grid, no spatial autocorrelation.]

Negative spatial autocorrelation

[Figure: values on a 20 × 20 grid, negative spatial autocorrelation.]

Positive spatial autocorrelation

[Figure: values on a 20 × 20 grid, positive spatial autocorrelation.]





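Connection matrix

$$W = \begin{pmatrix}
0 & 1 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 0
\end{pmatrix}$$

Row-standardised:

$$W = \begin{pmatrix}
0 & \tfrac{1}{3} & \tfrac{1}{3} & 0 & 0 & \tfrac{1}{3} \\
\tfrac{1}{3} & 0 & \tfrac{1}{3} & \tfrac{1}{3} & 0 & 0 \\
\tfrac{1}{3} & \tfrac{1}{3} & 0 & 0 & \tfrac{1}{3} & 0 \\
0 & \tfrac{1}{3} & 0 & 0 & \tfrac{1}{3} & \tfrac{1}{3} \\
0 & 0 & \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\
\tfrac{1}{2} & 0 & 0 & \tfrac{1}{2} & 0 & 0
\end{pmatrix}$$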
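The second matrix is the first with each row divided by its row sum. A minimal sketch in base R follows; the values of `y` are made up for illustration.

```r
# Binary connection matrix: w_ij = 1 if units i and j are neighbours.
W <- matrix(c(0, 1, 1, 0, 0, 1,
              1, 0, 1, 1, 0, 0,
              1, 1, 0, 0, 1, 0,
              0, 1, 0, 0, 1, 1,
              0, 0, 1, 1, 0, 0,
              1, 0, 0, 1, 0, 0),
            nrow = 6, byrow = TRUE)

# Row-standardise: divide each row by its number of neighbours.
W_std <- W / rowSums(W)
print(round(W_std, 2))

# Spatial lag of a variable: the average over each unit's neighbours.
y  <- c(10, 12, 9, 15, 20, 11)   # made-up values
Wy <- drop(W_std %*% y)
print(Wy)
```

After row-standardisation every row sums to 1, so $Wy$ is a neighbourhood average rather than a neighbourhood total.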





Spatial processes

Spatial autocorrelation has processes somewhat analogous to serial autocorrelation.

Spatial error process: $y = X\beta + u$, $u = \lambda Wu + \varepsilon$.

Spatial lag process: $y = \rho Wy + X\beta + \varepsilon$.

(Anselin 1988)





Moran’s I

$$I = \frac{n}{\sum_i \sum_j w_{ij}} \cdot \frac{\sum_i \sum_j w_{ij}(x_i - \bar x)(x_j - \bar x)}{\sum_i (x_i - \bar x)^2} \sim N(\mu_I, \sigma_I^2)$$

$$\mu_I = E(I) = \frac{-1}{n - 1}$$

$$\sigma_I^2 = \mathrm{Var}(I) = \frac{n^2 S_1 - n S_2 + 3 S_0^2}{S_0^2 (n^2 - 1)} - \mu_I^2,$$

where

$$S_0 = \sum_i \sum_j w_{ij}, \qquad S_1 = \frac{1}{2} \sum_i \sum_j (w_{ij} + w_{ji})^2, \qquad S_2 = \sum_i \Big(\sum_j w_{ij} + \sum_j w_{ji}\Big)^2$$
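Moran's I can be computed by hand from its definition. The sketch below uses base R and the 6 × 6 binary matrix from the "Connection matrix" slide; the values of `x` are made up for illustration, and only the statistic and its expected value are computed, not the variance.

```r
# Binary connection matrix from the "Connection matrix" slide.
W <- matrix(c(0, 1, 1, 0, 0, 1,
              1, 0, 1, 1, 0, 0,
              1, 1, 0, 0, 1, 0,
              0, 1, 0, 0, 1, 1,
              0, 0, 1, 1, 0, 0,
              1, 0, 0, 1, 0, 0),
            nrow = 6, byrow = TRUE)

x <- c(2, 3, 1, 6, 7, 4)   # made-up values
n <- length(x)
z <- x - mean(x)           # deviations from the mean

# Double-sum definition of Moran's I.
S0      <- sum(W)
moran_I <- (n / S0) * sum(W * outer(z, z)) / sum(z^2)

# Equivalent matrix form z'Wz / z'z, scaled by n / S0.
moran_I_matrix <- (n / S0) * drop(t(z) %*% W %*% z) / sum(z^2)

# Expected value under the null of no spatial autocorrelation.
expected_I <- -1 / (n - 1)

print(c(I = moran_I, I_matrix = moran_I_matrix, expected = expected_I))
```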





Moran’s I

library(ape)

Moran.I(y, W)

Moran.I(residuals(lm(y ~ x1 + x2)), W)

Moran’s I can only be calculated with a known $W$ matrix. Higher order lags are also possible, e.g. $W^2$.





Example: democracy

[Figure: two panels, Moran’s I against year, 1800 to 2000.]

Figure: Spatial clustering, Polity IV dichotomized, $I(y_t)$ and $I(\Delta y_t)$, 1800-2003





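Checking residuals

$$I = \frac{n}{\sum_i \sum_j w_{ij}} \cdot \frac{e'We}{e'e} \sim N(\mu_I, \sigma_I^2)$$

$$LM_{err} = \frac{n^2 \left(\frac{e'We}{e'e}\right)^2}{\mathrm{tr}(W'W + W^2)} \sim \chi^2(1)$$

$$LM_{lag} = \frac{n^2 \left(\frac{e'Wy}{e'e}\right)^2}{(WX\hat\beta_{OLS})' M WX\hat\beta_{OLS} / \hat\sigma^2 + \mathrm{tr}(W'W + W^2)} \sim \chi^2(1),$$

with $LM_{err}$ and $LM_{lag}$ referring to tests for spatial error and spatial lag processes, respectively.

(Anselin & Hudak 1992: 520)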

