
Spatial Dependence in Regressors and its Effect on Estimator Performance

R. Kelley Pace
LREC Endowed Chair of Real Estate
Department of Finance
E.J. Ourso College of Business Administration
Louisiana State University
Baton Rouge, LA 70803-6308
[email protected]

James P. LeSage
Fields Endowed Chair
Department of Finance and Economics
Texas State University - San Marcos
San Marcos, TX 78666
[email protected]

Shuang Zhu
Department of Finance
E.J. Ourso College of Business Administration
Louisiana State University
Baton Rouge, LA 70803-6308

June 2, 2010

Abstract

In econometrics most work focuses on spatial dependence in the regressand or disturbances. However, LeSage and Pace (2009); Pace and LeSage (2009) showed that the bias in β from applying OLS to a regressand generated from a spatial autoregressive process was exacerbated by spatial dependence in the regressor. Also, the marginal likelihood function or restricted maximum likelihood (REML) function includes a determinant of a function of the spatial parameter and the regressors. Therefore, high dependence in the regressor may affect the likelihood through this term. Finally, the notion of effective sample size for dependent data suggests that the loss of information from dependence may have implications for the information content of various instruments when using instrumental variables.

Empirically, many common economic regressors such as income, race, and employment show high levels of spatial autocorrelation. Based on these empirical results, we conduct a Monte Carlo study using maximum likelihood, restricted maximum likelihood, and two instrumental variable specifications for the lag y model (SAR) and spatial Durbin model (SDM) in the presence of correlated regressors while varying signal-to-noise, spatial dependence, and weight matrix specifications. We find that REML outperforms ML in the presence of correlated regressors and that instrumental variable performance is affected by such dependence. The combination of correlated regressors and the SDM provides a challenging environment for instrumental variable techniques.

In addition, we examine the estimation of marginal effects and show that this can behave better than estimation of the component parameters. We also make suggestions for improving Monte Carlo experiments.

KEYWORDS: regressor autocorrelation, spatial Durbin model, REML, spatial autoregression, maximum likelihood, spatial econometrics.

1 Introduction

In spatial settings, such as real estate prices in a city, the dependence of each observation on potentially every other observation impedes theoretical derivation of finite sample results. Consequently, almost all finite sample investigations of spatial econometric methods rely upon Monte Carlo experiments. Such Monte Carlo experiments usually employ iid random variables for the regressors, yet empirically regressors often display substantial spatial dependence. In this note, we document the large spatial dependence exhibited by common variables such as income, education, and race, and show that dependence in the regressors can make a material difference in the performance of two-stage least squares, a common spatial method (Anselin, 1988; Kelejian and Prucha, 1998; Lee, 2007). Essentially, spatial dependence reduces the information content of variables, which exacerbates the weak instrument problem (Bound, Jaeger, and Baker, 1995; Staiger and Stock, 1997). The spatial dependence interacts with other well-known weak instrument characteristics such as goodness-of-fit and the number of instruments. In addition, we show that estimation of the common spatial Durbin model (Anselin, 1988) becomes particularly sensitive to traditional weak instrument considerations as well as to the spatial dependence in the regressors.

In section 2 we lay out the data generating processes, present empirical evidence suggesting that high levels of spatial dependence in the regressors may be common, and describe two stage least squares estimation. In section 3 we discuss the design of the Monte Carlo experiments and the results, and section 4 examines the sensitivity of these results to the choice of spatial weight matrix. Based on these findings, section 5 offers suggestions for the design of Monte Carlo experiments in spatial settings and discusses the implications of this work for future spatial research.


2 Spatial Dependence in y and Regressors

In this section we set forth the data generating processes for the spatial autoregressive and spatial Durbin models in section 2.1. In section 2.2 we empirically measure spatial dependence in common regressors. This informs our choice of a spatial autoregressive process for the regressors. We discuss likelihood approaches (maximum likelihood and residual maximum likelihood) to estimation in section 2.3. In section 2.4 we lay out the two stage least squares and best instruments approaches to estimating the parameters of the spatial autoregressive and spatial Durbin models. In particular, we discuss how these model choices affect the instrument set. Finally, we discuss marginal effects and possible bias problems in section 2.5.

2.1 Spatial Models

Assume that the n by 1 dependent variable y follows a spatial autoregressive DGP (often abbreviated as SAR).¹

y = (In − ρW )−1Xβ + (In − ρW )−1ε (1)

ε ∼ N(0, σ2εIn) (2)

where X contains n observations on k exogenous regressors, ε is an n by 1 vector of normal iid disturbances with variance σ2ε, and β is a k by 1 vector of regression parameters. The spatial character of (1) is determined by the n by n spatial weight matrix W and the scalar spatial parameter ρ. When observation or region j is a neighbor to observation or region i, Wij > 0, and otherwise Wij = 0. These values are exogenous. By convention, observations cannot serve as neighbors to themselves and therefore Wii = 0. For simplicity, assume that W has real eigenvalues, such as occur when a matrix is similar to a real, symmetric matrix. Since scaling a matrix by its principal eigenvalue can always yield a new matrix with a principal eigenvalue of 1, assume W has a principal eigenvalue of 1 and a minimum eigenvalue of λmin. Since the diagonal of W contains zeros, tr(W) = 0 and therefore the sum of the eigenvalues equals 0. Given a positive principal eigenvalue, λmin < 0. Consequently, ρ ∈ (1/λmin, 1) yields a symmetric positive definite (In − ρW).

¹ In spatial econometrics the acronym SAR usually refers to an autoregression in the dependent variable, while in spatial statistics it usually refers to an autoregression in the disturbances (Ripley, 1981).
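As a numerical illustration of these eigenvalue conditions, the following sketch (our own minimal example, not from the paper; it assumes NumPy and a small symmetric, doubly stochastic W built by hand) checks that tr(W) = 0, that the principal eigenvalue is 1, that λmin < 0, and that In − ρW is positive definite for ρ inside (1/λmin, 1).

import numpy as np

# Toy symmetric contiguity-style matrix on a ring of six regions:
# zero diagonal, each region linked to its two neighbors with weight 0.5.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
# W is symmetric with row and column sums of 1, hence doubly stochastic.

eigs = np.linalg.eigvalsh(W)
lam_min, lam_max = eigs.min(), eigs.max()
print("trace:", np.trace(W))             # 0, since the diagonal is zero
print("principal eigenvalue:", lam_max)  # 1 for a doubly stochastic W
print("lambda_min:", lam_min)            # negative, so 1/lambda_min < 0

# I_n - rho*W is positive definite for rho in (1/lambda_min, 1):
for rho in (1.0 / lam_min + 0.01, 0.0, 0.99):
    np.linalg.cholesky(np.eye(n) - rho * W)  # succeeds only if positive definite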

Given the DGP in (1), the corresponding estimation equation is (3).

y = Xβ + ρWy + ε (3)

Since functions of the dependent variable appear on both sides of (3), estimation by OLS is biased (LeSage and Pace, 2009; Pace and LeSage, 2009), requiring alternative means of estimation such as those based on the likelihood or, in the case under examination, instrumental variables.

As with any model, the composition of the explanatory variable matrix X becomes important. The SAR model has two variants in terms of the regressors, where ι is the n by 1 vector of ones and U is an n by p matrix of non-constant regressors. The first variant, XSAR in (4), contains the regressors by themselves, with an associated p by 1 parameter vector γ for the non-constant variables and a scalar parameter α associated with the constant vector. The second variant, XSDM in (6), contains the regressors with associated p by 1 parameter vector γ along with their spatial lags with associated p by 1 parameter vector θ, as well as a scalar parameter α associated with the constant vector. This latter variant leads to the spatial Durbin model (SDM), which nests the SAR (when θ = 0), the spatial error model or SEM (when θ = −ργ), the spatial lag of X model or SLX (when ρ = 0), and the conventional iid disturbance model (when ρ = 0 and θ = 0). LeSage and Pace (2009); Pace and LeSage (2009) show the SDM model arises naturally as a byproduct of spatial disturbances and omitted variables that are correlated with the included variables.


XSAR = [ι U]   (4)

βSAR = [α γ′]′   (5)

XSDM = [ι U WU]   (6)

βSDM = [α γ′ θ′]′   (7)
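A minimal sketch of this setup (our own illustration, not code from the paper; it assumes NumPy, a toy ring-shaped W like the one in the previous sketch, and arbitrary parameter values) that forms XSAR and XSDM and draws y from the SAR DGP in (1):

import numpy as np

def simulate_sar(W, U, gamma, rho, sigma_eps, alpha=0.0, seed=None):
    """Draw y from the SAR DGP (1): y = (I - rho W)^(-1) (X beta + eps)."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    X_sar = np.column_stack([np.ones(n), U])          # (4): [iota U]
    beta_sar = np.concatenate([[alpha], gamma])       # (5)
    X_sdm = np.column_stack([np.ones(n), U, W @ U])   # (6): [iota U WU]
    eps = rng.normal(scale=sigma_eps, size=n)
    y = np.linalg.solve(np.eye(n) - rho * W, X_sar @ beta_sar + eps)
    return y, X_sar, X_sdm

# Toy example: a 400-region ring W and, for now, iid regressors.
n, p = 400, 3
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
U = np.random.default_rng(0).normal(size=(n, p))
y, X_sar, X_sdm = simulate_sar(W, U, gamma=np.ones(p), rho=0.8, sigma_eps=1.0, seed=1)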

2.2 Spatial Dependence in the Regressors

Many regressors display substantial spatial dependence. To obtain an idea of such dependence, we took some common variables and estimated a univariate SAR model for each of these via maximum likelihood. The county data is from the contiguous US states, the census tract data is from New York state, and the block group data is from the Bronx. All variables are logged. Missing values were taken out of sample. The initial weight matrix used 30 (t = 1 . . . 30) nearest neighbors, where the (t+1)th nearest neighbor received five percent less weight than the tth order neighbor. The final weight matrix was this initial matrix plus its transpose, reweighted to make it symmetric and doubly stochastic with a principal eigenvalue of 1.
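A sketch of this weight matrix construction (our reading of the description above, not the authors' code; it assumes planar coordinates, NumPy/SciPy, and an iterative Sinkhorn-style rescaling to reach double stochasticity):

import numpy as np
from scipy.spatial import cKDTree

def build_weight_matrix(coords, m=30, decay=0.95, sinkhorn_iter=200):
    """30-nearest-neighbor weights with 5 percent geometric decay, then
    symmetrized and rescaled toward a doubly stochastic matrix."""
    n = coords.shape[0]
    _, idx = cKDTree(coords).query(coords, k=m + 1)    # column 0 is the point itself
    A = np.zeros((n, n))
    for t in range(1, m + 1):
        A[np.arange(n), idx[:, t]] = decay ** (t - 1)  # (t+1)th neighbor gets 5% less weight
    A = A + A.T                                        # initial matrix plus its transpose
    for _ in range(sinkhorn_iter):                     # Sinkhorn-Knopp style rescaling
        A /= A.sum(axis=1, keepdims=True)
        A /= A.sum(axis=0, keepdims=True)
    return 0.5 * (A + A.T)                             # restore exact symmetry

coords = np.random.default_rng(0).uniform(size=(500, 2))   # hypothetical locations
W_ds = build_weight_matrix(coords)
print(np.allclose(W_ds.sum(axis=0), 1.0, atol=1e-6),
      np.allclose(W_ds.sum(axis=1), 1.0, atol=1e-6))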

Table 1 documents that many common variables display a high degree of spatial dependence. These range from a high of 0.962 for the autocorrelation in county house prices to 0.751 for the autocorrelation in female employment across block groups in the Bronx.

To allow for the observed spatial dependence in the explanatory variables, we assume that these follow a univariate SAR model as specified in (8) with a DGP in (10).

Uj = φWUj + Rj,   j = 1 . . . p   (8)

Rj ∼ N(0, σ2RjIn) (9)

Uj = (In − φW )−1Rj (10)
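A minimal sketch of this regressor DGP (illustrative only; it assumes NumPy and a weight matrix W such as the toy ring matrix from the earlier sketches):

import numpy as np

def simulate_regressors(W, p, phi, sigma_r=1.0, seed=None):
    """Draw U with columns U_j = (I - phi W)^(-1) R_j, R_j ~ N(0, sigma_r^2 I), as in (10)."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    R = rng.normal(scale=sigma_r, size=(n, p))
    return np.linalg.solve(np.eye(n) - phi * W, R)   # solves all p columns at once

# phi chosen to mimic iid, moderate, and high spatial dependence in the regressors.
U_iid = simulate_regressors(W, p=15, phi=0.0, seed=2)
U_mod = simulate_regressors(W, p=15, phi=0.5, seed=2)
U_hi  = simulate_regressors(W, p=15, phi=0.95, seed=2)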


Variables                 County   Census Tract   Block Group
Black                     0.930    0.940          0.938
Age 60+                   0.871    0.805          0.812
Age 16+                   0.853    0.859          0.858
Bachelor's Degree         0.869    0.928          0.909
Civilian Employment       0.839    0.782          0.752
Female Employment         0.837    0.796          0.751
Median HH Income          0.953    0.937          0.918
Per Capita Income         0.916    0.940          0.943
Median Rent               0.950    0.938          0.886
Median House Price        0.962    0.938          0.823
Number of Observations    3,109    4,907          5,733

Table 1: Estimates of spatial dependence for selected census variables at different levels of geography, estimated by a spatial autoregressive model.

2.3 Likelihood-based Estimators

Concentrated maximum likelihood finds the value of ρ that maximizes the concentrated log-likelihood function (11); the optimal value ρ∗ equals the maximum likelihood estimate (ρ∗ = ρ̃ML).

LML(ρ) = κ + ln|In − ρW| − (n/2) ln(e(ρ)′e(ρ))   (11)

e(ρ) = (In − ρW)y − Xβ̃   (12)

β̃ = (X′X)⁻¹X′(In − ρW)y   (13)

It is well known that maximum likelihood often has a downward bias in the estimation of ρ in small samples.

The restricted or residual maximum likelihood (REML) introduced by Patterson and Thompson (1971) results in a concentrated log-likelihood function that augments the usual log-likelihood with another determinant term and multiplies the log of the sum-of-squared-error term by 0.5(n − k) rather than by 0.5n. REML can produce unbiased estimates of variances and covariances, whereas maximum likelihood only produces consistent estimates.


LRE(ρ) = κ − (1/2) ln|X′(In − ρW)′(In − ρW)X| + ln|In − ρW| − ((n − k)/2) ln(e(ρ)′e(ρ))   (14)

Note, the REML concentrated log-likelihood has a determinant term that can be affected by the dependence in X. Suppose X is univariate and equals U (a SAR process) and W is symmetric. Further, suppose U follows an autoregressive process as described by (10). The expectation of this term equals (15), and since the determinant of a scalar is the scalar itself, (15) also gives the expected determinant of that term.

E(U ′(In − ρW )2U) = tr[(In − φW )−2(In − ρW )2] (15)

By inspection, high values of φ will tend to inflate the magnitude of this term. Therefore, spatial dependence in a regressor could cause REML to differ from ML in such cases.
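A sketch of the two concentrated log-likelihoods (11) and (14), written directly from the formulas above for a dense W (our own illustration; a production implementation would use sparse matrices and log-determinant approximations such as the one discussed in section 3.2). It assumes NumPy/SciPy and the toy y, X_sdm, and W from the earlier sketches.

import numpy as np
from scipy.optimize import minimize_scalar

def concentrated_loglik(rho, y, X, W, reml=False):
    """Concentrated log-likelihood (11); with reml=True, the REML version (14)."""
    n, k = X.shape
    A = np.eye(n) - rho * W
    Ay = A @ y
    beta = np.linalg.lstsq(X, Ay, rcond=None)[0]              # (13)
    e = Ay - X @ beta                                         # (12)
    ll = np.linalg.slogdet(A)[1] - 0.5 * n * np.log(e @ e)    # (11), constant kappa dropped
    if reml:
        AX = A @ X
        ll += 0.5 * k * np.log(e @ e)                         # -(n-k)/2 instead of -n/2
        ll -= 0.5 * np.linalg.slogdet(AX.T @ AX)[1]           # extra determinant in (14)
    return ll

def estimate_rho(y, X, W, reml=False, bounds=(-0.99, 0.99)):
    # Simple bounded search over rho; the admissible interval is (1/lambda_min, 1).
    res = minimize_scalar(lambda r: -concentrated_loglik(r, y, X, W, reml),
                          bounds=bounds, method="bounded")
    return res.x

rho_ml = estimate_rho(y, X_sdm, W)               # maximum likelihood
rho_reml = estimate_rho(y, X_sdm, W, reml=True)  # restricted maximum likelihood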

2.4 Instrumental Variables

Instrumental variable techniques have been widely used in spatial econometrics (Anselin, 1988; Kelejian and Prucha, 1998; Lee, 2007). In practice, due to the widespread availability of software and intuitive appeal, two stage least squares (2SLS) has often been used in applications (Anselin, 1988; Land and Deane, 1992; Byrne, 2005; Millimet and Rangaprasad, 2007; Richards and Padilla, 2009).

Given X, two stage least squares uses a set of instruments Z that includes X and some additional variables V, as shown in (16). The problematic element in estimating (3) is the simultaneity between y and Wy. Two stage least squares replaces Wy with its predicted value based on the instruments, PZWy, as in (18).


Z = [X V]   (16)

PZ = Z(Z′Z)⁻¹Z′   (17)

y = Xβ + ρ(PZWy) + ε   (18)

It remains to specify V. Kelejian and Prucha (1998) motivated possible instruments through the combination of the Taylor series expansion of (In − ρW)⁻¹ in (19) and the expression for the expectation of Wy in the SAR model as in (20).

(In − ρW )−1 = In + ρW + ρ2W 2 + ρ3W 3 + . . . (19)

E(Wy) = WXβ + ρW 2Xβ + ρ2W 3Xβ + . . . (20)

Their suggested instrument set for the SAR model was (21), which involves the first two terms of the expectation of Wy. By analogy, the first two terms of the expectation of Wy in the SDM model that are not contained in the regressors appear in (22).

VSAR = [WU W²U]   (21)

VSDM = [W²U W³U]   (22)
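A sketch of this spatial 2SLS estimator with the instrument sets (21) and (22) (our own illustration, assuming NumPy and the toy y, U, and W from the earlier sketches):

import numpy as np

def spatial_2sls(y, U, W, sdm=False):
    """2SLS for (3) with Z = [X V], V from (21) for the SAR or (22) for the SDM."""
    n = len(y)
    WU, W2U = W @ U, W @ (W @ U)
    if sdm:
        X = np.column_stack([np.ones(n), U, WU])      # XSDM
        V = np.column_stack([W2U, W @ W2U])           # (22): [W^2 U  W^3 U]
    else:
        X = np.column_stack([np.ones(n), U])          # XSAR
        V = np.column_stack([WU, W2U])                # (21): [W U  W^2 U]
    Z = np.column_stack([X, V])                       # (16)
    Wy = W @ y
    Wy_hat = Z @ np.linalg.lstsq(Z, Wy, rcond=None)[0]   # first stage: P_Z W y
    D = np.column_stack([X, Wy_hat])                     # second stage, as in (18)
    coefs = np.linalg.lstsq(D, y, rcond=None)[0]
    return coefs[-1], coefs[:-1]                         # (rho_hat, beta_hat)

rho_sar, beta_sar_hat = spatial_2sls(y, U, W)            # SAR specification
rho_sdm, beta_sdm_hat = spatial_2sls(y, U, W, sdm=True)  # SDM specification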

Lee (2003) adopted a different instrumental variable approach, termed the best instruments, which also used (20) to suggest replacing Wy with W(In − ρW)⁻¹Xβ. In this theoretical form (which uses the true parameters) the best instruments provide a bound on the performance of feasible instrumental variable approaches (Kelejian, Prucha, and Yuzefovich, 2004).

Examination of the finite sample properties of these techniques has largely been conducted through Monte Carlo experiments (Kelejian, Prucha, and Yuzefovich, 2004; Klotz, 2004; Lee, 2007), and most of these examine cases with moderate to high levels of signal-to-noise. For example, Kelejian, Prucha, and Yuzefovich (2004) excluded low signal-to-noise scenarios. In contrast, Lee (2007) examined a case where the R2 was 0.04, a scenario in which 2SLS performed poorly.


This suggests that the weak instrument problems of 2SLS in other contexts (Bound, Jaeger, and Baker, 1995; Staiger and Stock, 1997) may carry over into spatial estimation. Factors affecting the weak instrument problem include goodness-of-fit and the number of instruments. These problems can reduce performance even in relatively large sample sizes.

In addition to these well-known problems, the spatial dependence of regressors may affect the weak instrument problem. In fact, Bowden and Turkington (1984, p. 77-81) found in a time series context that increasing the correlation in the regressors had a non-monotonic effect. On the one hand, such correlation can increase the correlation between the instrumented-out variable and its instruments; on the other hand, it increases the correlation between the instruments and the exogenous variables (lowering their marginal information).

The SDM has the potential to further affect the performance of instrumental variables, since including the spatial lags of already included regressors in the model precludes using these to aid the identification of Wy. Specifically, the spatial Durbin model by construction reduces the possible number of identifying instruments for Wy relative to the SAR model.

To further examine this, we note that the SAR model in (3) along with the regressors given by (6) defines the SDM model as shown in (23). This has the associated DGP in (24). Recognizing that in (24) the term (In − ρW)⁻¹Uγ also equals Uγ + (In − ρW)⁻¹WUργ and combining terms yields (25).

y = α + Uγ +WUθ + ρWy + ε (23)

y = α∗ + (In − ρW )−1Uγ + (In − ρW )−1WUθ + (In − ρW )−1ε (24)

y = α∗ + Uγ + (In − ρW )−1WU(ργ + θ) + (In − ρW )−1ε (25)

An important part of the two stage least squares method is the regression of the variable to be instrumented out (Wy) in (26) versus the instruments Z as in (27), with its expanded counterpart (28). Note, matrices commute with functions of themselves, so that W(In − ρW)⁻¹ equals (In − ρW)⁻¹W.

Wy = α∗∗ + WUγ + (In − ρW)⁻¹W²U(ργ + θ) + (In − ρW)⁻¹Wε   (26)

Wy = Zδ + ε   (27)

Wy = XSDMδ(1) + VSDMδ(2) + ε   (28)

An important aspect of the weak instrument problem is the magnitude of δ(2), the parameter vector associated with the identifying instruments VSDM. To address this more directly we employ the Frisch-Waugh-Lovell technique and examine the regression in which both sides are multiplied by, in this case, MXSDM. This does not affect the parameter estimates of δ(2), but does remove δ(1) from the model, as done in (30).

MXSDM = In − XSDM(X′SDM XSDM)⁻¹X′SDM   (29)

MXSDM Wy = MXSDM VSDMδ(2) + MXSDM ε   (30)

Consider the expectation of the regressand in (30) (using (26)) as shown in (31). This is the systematic part to be explained (the signal). If this has a small magnitude relative to the random part (the noise), this sets up the conditions leading to the weak instrument problem.

E(MXSDM Wy) = MXSDM (In − ρW)⁻¹W²U(ργ + θ)   (31)

Note, the signal has zero magnitude when the true DGP is an error model (θ = −ργ). In this case the two stage least squares technique will not work, as this violates the conditions underlying the proof of consistency (the instruments must have explanatory power). In practice, θ rarely exactly equals −ργ. However, it often has the opposite sign of γ, and this suggests that weak instruments will often be a problem when estimating the SDM using two stage least squares.


2.5 Marginal Effects and Bias Correction

A number of authors have pointed out that the derivative of the regressand with respect to a regressor in these models does not equal β as in the case of iid models fitted by OLS. In fact, the partial derivatives of the expected value of the regressand with respect to regressor j (marginal effects) for the SAR DGP equal (32).

∂E(y)/∂X′j = (In − ρW)⁻¹βj   (32)
           = Sj(ρ)   (33)

Since this is an n by n matrix and since there are k − 1 non-constant variables, this results in (k − 1)n² partial derivatives, which provide an overwhelming amount of information. To deal with this volume of information, LeSage and Pace (2009) suggested summarizing these partial derivatives. Specifically, they suggested averaging all the column or row sums to arrive at the average total effect or impact, averaging the diagonal of Sj(ρ) to arrive at the average direct effect or impact, and averaging the off-diagonal elements of Sj(ρ) to arrive at the average indirect effect or impact.

As shown in LeSage and Pace (2009), the simplest case arises for a row-stochastic or doubly stochastic W. In this case the average total effect is a simple non-linear function of ρ, as shown in (34).

Tj(ρ) = (1 − ρ)⁻¹βj   (34)

Due to the non-linearity and the sampling variability of estimates of ρ, the expected total effect does not equal the total effect evaluated at the estimate of ρ, as shown in (36).

E(Tj(ρ)) ≠ (1 − ρ̂)⁻¹β̂j   (36)


Fortunately, the quantiles (q) of the total effects are easier to use. Since (1 − ρ)⁻¹ is a monotonic transformation, q(f(x)) equals f(q(x)), as in (37).

q(Tj(ρ)) = (1 − q(ρ))⁻¹q(βj)   (37)
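A sketch of these effects summaries (our own illustration; it forms Sj(ρ) densely, which is only feasible for moderate n, assumes NumPy, and computes the average indirect effect as the average total minus the average direct effect, following LeSage and Pace (2009)):

import numpy as np

def effects_summary(W, rho, beta_j):
    """Average direct, indirect, and total effects for regressor j (LeSage and Pace, 2009)."""
    n = W.shape[0]
    S_j = np.linalg.inv(np.eye(n) - rho * W) * beta_j   # (32)-(33)
    direct = np.trace(S_j) / n                          # average of the diagonal
    total = S_j.sum() / n                               # average of the row (or column) sums
    indirect = total - direct                           # cumulated off-diagonal effects per row
    return direct, indirect, total

direct, indirect, total = effects_summary(W, rho=0.8, beta_j=1.0)
# For a row- or doubly stochastic W the average total effect has the closed form (34):
print(total, (1.0 - 0.8) ** -1 * 1.0)   # both approximately 5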

3 Monte Carlo Experiments

In this section we describe the design of the Monte Carlo experiments and discuss the experimental results. Specifically, in section 3.1 we examine a Monte Carlo experiment with different levels of spatial dependence in y, goodness-of-fit, and dependence in the regressors. In section 3.2 we take the most difficult case from the previous experiments and see how the estimators perform as a function of sample size. Finally, in section 3.3 we examine the performance of the various estimators in estimating the marginal effects.

In all of the experiments that follow, two stage least squares uses W²U and W³U as additional instruments for the SDM and WU and W²U for the SAR, as in (22) and (21). There are 15 non-constant explanatory variables and therefore the number of additional instruments equals 30. We do not focus on the problem of a large number of instruments. However, one would expect the findings of the experiments to be mitigated with fewer variables (hence fewer additional instruments) and exacerbated with more variables (more additional instruments).

Many researchers have an idea of typical levels of spatial dependence in y and typical levels of signal-to-noise for data in their field of interest. In the experiments we set σ2ε to the level required to yield a given signal-to-noise level as defined in (39). To maintain succinct notation, we call this measure R2.


R2 = 1 − E(ε′(In − ρW)⁻²ε) / [β′X′(In − ρW)⁻²Xβ]   (38)

   = 1 − σ2ε tr((In − ρW)⁻²) / [β′X′(In − ρW)⁻²Xβ]   (39)

In generating Uj, which provides the components for XSAR and XSDM, we set the variance σ2Rj to 1 in (10). In all the experiments the intercept was set to 0, the regression parameters associated with U in XSAR and XSDM were set to 1, and the regression parameters associated with WU in XSDM were set to 0. Therefore, the DGP was always SAR. We did this to facilitate comparisons across the SAR and SDM results (especially for the marginal effects).

The best instrumental variable estimator uses the true values of β and ρ in computing (In − ρW)⁻¹WXβ. This obviously is the best possible case for an instrumental variable technique.
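A sketch of this calibration, solving (39) for σ2ε given a target R2 (our own illustration; it assumes NumPy and the toy X, W, and β of the earlier sketches):

import numpy as np

def sigma_eps_for_r2(target_r2, X, beta, W, rho):
    """Solve (39) for sigma_eps^2 given a target signal-to-noise level R^2."""
    n = W.shape[0]
    A_inv = np.linalg.inv(np.eye(n) - rho * W)
    M = A_inv @ A_inv                          # (I - rho W)^(-2)
    xb = X @ beta
    signal = xb @ M @ xb                       # beta' X' (I - rho W)^(-2) X beta
    return (1.0 - target_r2) * signal / np.trace(M)

beta = np.concatenate([[0.0], np.ones(X_sar.shape[1] - 1)])   # alpha = 0, gamma = 1
sigma2_eps = sigma_eps_for_r2(0.1, X_sar, beta, W, rho=0.8)   # low signal-to-noise case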

3.1 Results by Signal-to-Noise, Spatial Dependence in Regressors and Regressand

In the following experiments, we only examine a sample size of 3,000 observations, normally considered a large sample size. We designed the experiments to facilitate matching the cases to researchers' knowledge of their data. Specifically, we simulated using (1) and (10) with levels of ρ equal to 0.4 and 0.8, which reflect moderate and strong spatial dependence in y. Given the empirical levels of spatial dependence in the regressors shown in Table 1, we set φ equal to 0, 0.5, and 0.95 to represent the iid case, the case of moderate spatial dependence in the regressors, and the case of high spatial dependence in the regressors. The σ2ε was set to a level that assured that the R2 as defined by (39) equaled 0.1 (low signal-to-noise), 0.5 (moderate signal-to-noise), or 0.9 (high signal-to-noise). Therefore, the experiment examined 18 cases that vary by spatial dependence in the regressors, signal-to-noise, and spatial dependence in y. Each case involves 1,000 trials.


In the tables we recorded the median estimate of ρ as well as the median absolute deviation (MAD) of the estimated values of ρ for all the estimators. In the tables the subscripts RE, B, IV, and ML stand for REML, the Best IV, two stage least squares, and maximum likelihood.

We used the medians and median absolute deviations to control the problems which occurred due to occasionally wild estimates from the best and two stage least squares instrumental variable estimators.

To simplify the presentation, we discuss all SAR results in section 3.1.1 and all SDM results in section 3.1.2.

3.1.1 SAR Results

Table 2 shows the usual instrumental variable results in the spatial literature in the presence of high or moderate signal-to-noise (R2 = 0.9, 0.5) and no spatial dependence in the regressors (φ = 0). Namely, both the best instrumental variable specification and two-stage least squares show very low levels of bias in those cases. Also, the results confirm those of Lee (2007) for the low signal-to-noise cases, where two stage least squares shows greater bias. Note, the bias of maximum likelihood monotonically increases with φ. Although REML displays some bias, it is lower than that of ML and the lowest among the estimators.

However, Tables 2 and 3 show the great importance of spatial dependence in the regressors. All of the worst cases in terms of bias and RMSE happen in the cases with high spatial dependence in the regressors (φ = 0.95) for the Best IV, two stage least squares, and maximum likelihood estimators (relative to REML). The performance of the IV estimators is non-monotonic, with the highest relative RMSEs occurring for both the lowest and the highest levels of φ.

In terms of producing estimates outside of the interval (0, 1), the Best IV estimator with true parameters yielded estimates of ρ of less than 0 in 3 out of 1,000 trials, and greater than 1 in 4 out of 1,000 trials when R2 = 0.1, φ = 0.95, and ρ = 0.4 (case 18). Two stage least squares produced estimates greater than 1 in 9.6 percent of the trials for this case.


Case  ρ       r²      φ       ρ̃ML     ρ̃RE     ρ̂B      ρ̂IV     ρ̂OLS
1     0.4000  0.9000  0.0000  0.4004  0.4003  0.4004  0.4022  0.4317
                              0.0104  0.0104  0.0113  0.0112  0.0110
2     0.4000  0.9000  0.5000  0.3996  0.3999  0.4005  0.4017  0.4270
                              0.0097  0.0097  0.0102  0.0102  0.0101
3     0.4000  0.9000  0.9500  0.3962  0.3995  0.4006  0.4035  0.4659
                              0.0149  0.0149  0.0170  0.0172  0.0172
4     0.8000  0.9000  0.0000  0.8001  0.8001  0.8003  0.8015  0.8296
                              0.0055  0.0055  0.0060  0.0062  0.0051
5     0.8000  0.9000  0.5000  0.7997  0.7997  0.8003  0.8016  0.8300
                              0.0055  0.0055  0.0061  0.0060  0.0054
6     0.8000  0.9000  0.9500  0.7968  0.7979  0.8003  0.8032  0.8755
                              0.0082  0.0082  0.0109  0.0108  0.0082
7     0.4000  0.5000  0.0000  0.4005  0.4001  0.4012  0.4164  0.5562
                              0.0191  0.0191  0.0340  0.0337  0.0237
8     0.4000  0.5000  0.5000  0.3988  0.3997  0.4016  0.4133  0.5422
                              0.0195  0.0194  0.0305  0.0299  0.0246
9     0.4000  0.5000  0.9500  0.3916  0.3992  0.4018  0.4264  0.6146
                              0.0226  0.0225  0.0514  0.0516  0.0337
10    0.8000  0.5000  0.0000  0.7996  0.7994  0.8009  0.8117  0.9446
                              0.0103  0.0103  0.0179  0.0178  0.0088
11    0.8000  0.5000  0.5000  0.7986  0.7986  0.8008  0.8115  0.9453
                              0.0102  0.0102  0.0182  0.0171  0.0096
12    0.8000  0.5000  0.9500  0.7928  0.7952  0.8011  0.8255  1.0164
                              0.0127  0.0124  0.0327  0.0300  0.0115
13    0.4000  0.1000  0.0000  0.4006  0.4001  0.4035  0.5122  0.6791
                              0.0234  0.0233  0.1028  0.0850  0.0339
14    0.4000  0.1000  0.5000  0.3981  0.3994  0.4049  0.4944  0.6711
                              0.0236  0.0235  0.0918  0.0809  0.0346
15    0.4000  0.1000  0.9500  0.3901  0.3991  0.4053  0.5642  0.6850
                              0.0249  0.0246  0.1615  0.1289  0.0371
16    0.8000  0.1000  0.0000  0.7996  0.7993  0.8027  0.8762  1.0539
                              0.0126  0.0126  0.0545  0.0387  0.0104
17    0.8000  0.1000  0.5000  0.7982  0.7982  0.8025  0.8756  1.0536
                              0.0125  0.0125  0.0546  0.0393  0.0108
18    0.8000  0.1000  0.9500  0.7920  0.7948  0.8030  0.9278  1.0727
                              0.0136  0.0134  0.1031  0.0573  0.0109

Table 2: Estimation of the SAR Model via Instrumental Variables and Maximum Likelihood (for each case, the first row reports the median estimate of ρ and the second row the MAD)


Case  ρ       r²      φ       R̃ML     R̃RE     R̂B      R̂IV     R̂OLS
1     0.4000  0.9000  0.0000  0.0070  0.0071  0.0076  0.0079  0.0317
2     0.4000  0.9000  0.5000  0.0066  0.0065  0.0070  0.0072  0.0270
3     0.4000  0.9000  0.9500  0.0107  0.0101  0.0115  0.0119  0.0659
4     0.8000  0.9000  0.0000  0.0037  0.0037  0.0040  0.0044  0.0296
5     0.8000  0.9000  0.5000  0.0038  0.0038  0.0041  0.0042  0.0300
6     0.8000  0.9000  0.9500  0.0065  0.0060  0.0073  0.0079  0.0755
7     0.4000  0.5000  0.0000  0.0129  0.0128  0.0229  0.0266  0.1562
8     0.4000  0.5000  0.5000  0.0130  0.0130  0.0210  0.0239  0.1422
9     0.4000  0.5000  0.9500  0.0159  0.0152  0.0348  0.0413  0.2146
10    0.8000  0.5000  0.0000  0.0067  0.0068  0.0120  0.0152  0.1446
11    0.8000  0.5000  0.5000  0.0070  0.0069  0.0124  0.0152  0.1453
12    0.8000  0.5000  0.9500  0.0097  0.0087  0.0222  0.0290  0.2164
13    0.4000  0.1000  0.0000  0.0157  0.0154  0.0696  0.1138  0.2791
14    0.4000  0.1000  0.5000  0.0156  0.0152  0.0620  0.0985  0.2711
15    0.4000  0.1000  0.9500  0.0178  0.0166  0.1086  0.1690  0.2850
16    0.8000  0.1000  0.0000  0.0084  0.0085  0.0354  0.0762  0.2539
17    0.8000  0.1000  0.5000  0.0087  0.0086  0.0363  0.0756  0.2536
18    0.8000  0.1000  0.9500  0.0108  0.0095  0.0670  0.1279  0.2727

Table 3: SAR Model RMSE from Instrumental Variables and Maximum Likelihood


3.1.2 SDM Results

Table 4 demonstrates the challenges posed by the SDM model for the ML, Best IV, and two stage least squares estimators. Only REML displays very low bias across all the cases. Even ML shows increasing bias as dependence in the regressors (φ) rises. In the worst case (15), where φ = 0.95, ρ = 0.4, and R2 = 0.1, ML gave an average estimate of 0.389 for a bias of 0.011. Relative to REML, the RMSE for ML was 12.1 percent higher. In contrast, REML gave an average estimate of 0.399 for a bias of 0.001. More dramatically, the best IV (with true starting values) gave estimates of ρ of 0.46 or a bias of 0.16, and two stage least squares produced average estimates of 0.818 for a bias of 0.418. The RMSE for the best IV (with true starting values) and two stage least squares increased by factors of over 11 and 19 over REML.

Table 5 shows that only 53.8 and 59.8 percent of the trials resulted in estimates in the (0, 1) interval for the best IV and two stage least squares, respectively. For ML and REML all of the ρ estimates fell in this interval. For two stage least squares in the worst case (18), when ρ = 0.8, φ = 0.95, and R2 = 0.1, only 29.4 percent of the trials had ρ estimates in (0, 1). Even when R2 equaled 0.5, some of the cases resulted in ρ estimates outside of (0, 1) for the best IV and two stage least squares estimators.


Case  ρ       r²      φ       ρ̃ML     ρ̃RE     ρ̂B      ρ̂IV     ρ̂OLS
1     0.4000  0.9000  0.0000  0.3960  0.3986  0.3949  0.4333  0.6329
                              0.0211  0.0210  0.0598  0.0567  0.0302
2     0.4000  0.9000  0.5000  0.3940  0.3988  0.3942  0.4317  0.6262
                              0.0218  0.0217  0.0586  0.0558  0.0305
3     0.4000  0.9000  0.9500  0.3885  0.3986  0.3983  0.5086  0.6690
                              0.0228  0.0227  0.1037  0.0911  0.0330
4     0.8000  0.9000  0.0000  0.7980  0.7981  0.7989  0.8035  0.8925
                              0.0087  0.0087  0.0123  0.0122  0.0077
5     0.8000  0.9000  0.5000  0.7972  0.7980  0.7990  0.8041  0.8971
                              0.0093  0.0092  0.0126  0.0130  0.0082
6     0.8000  0.9000  0.9500  0.7926  0.7976  0.8010  0.8184  0.9882
                              0.0111  0.0109  0.0242  0.0237  0.0103
7     0.4000  0.5000  0.0000  0.3954  0.3984  0.3847  0.6185  0.6936
                              0.0226  0.0224  0.1852  0.1308  0.0339
8     0.4000  0.5000  0.5000  0.3931  0.3985  0.3827  0.6109  0.6905
                              0.0224  0.0223  0.1797  0.1262  0.0344
9     0.4000  0.5000  0.9500  0.3890  0.3995  0.3953  0.7906  0.6936
                              0.0232  0.0229  0.3130  0.1505  0.0354
10    0.8000  0.5000  0.0000  0.7966  0.7969  0.7966  0.8362  1.0289
                              0.0119  0.0119  0.0371  0.0314  0.0111
11    0.8000  0.5000  0.5000  0.7952  0.7966  0.7969  0.8385  1.0318
                              0.0118  0.0117  0.0383  0.0338  0.0109
12    0.8000  0.5000  0.9500  0.7907  0.7969  0.8027  0.9072  1.0678
                              0.0126  0.0124  0.0732  0.0512  0.0107
13    0.4000  0.1000  0.0000  0.3960  0.3990  0.4098  0.8546  0.7023
                              0.0230  0.0229  0.5872  0.1711  0.0348
14    0.4000  0.1000  0.5000  0.3934  0.3989  0.3783  0.8588  0.6994
                              0.0227  0.0225  0.5758  0.1617  0.0345
15    0.4000  0.1000  0.9500  0.3895  0.4001  0.5664  0.9423  0.6965
                              0.0231  0.0229  0.9076  0.1504  0.0349
16    0.8000  0.1000  0.0000  0.7968  0.7971  0.7897  0.9537  1.0725
                              0.0123  0.0123  0.1177  0.0587  0.0106
17    0.8000  0.1000  0.5000  0.7952  0.7968  0.7918  0.9580  1.0731
                              0.0126  0.0124  0.1206  0.0550  0.0105
18    0.8000  0.1000  0.9500  0.7907  0.7970  0.8217  1.0354  1.0809
                              0.0128  0.0125  0.2174  0.0603  0.0110

Table 4: Estimation of the SDM Model via Instrumental Variables and Maximum Likelihood (for each case, the first row reports the median estimate of ρ and the second row the MAD)


Case  ρ       φ       r²      ρ̂B < 0  ρ̂B > 1  ρ̂IV > 1  ρ̂OLS > 1
1     0.4000  0.0000  0.9000  0.0000  0.0000  0.0000   0.0000
2     0.4000  0.5000  0.9000  0.0000  0.0000  0.0000   0.0000
3     0.4000  0.9500  0.9000  0.0010  0.0000  0.0000   0.0000
4     0.8000  0.0000  0.9000  0.0000  0.0000  0.0000   0.0000
5     0.8000  0.5000  0.9000  0.0000  0.0000  0.0000   0.0000
6     0.8000  0.9500  0.9000  0.0000  0.0000  0.0000   0.1110
7     0.4000  0.0000  0.5000  0.0490  0.0000  0.0000   0.0000
8     0.4000  0.5000  0.5000  0.0430  0.0000  0.0000   0.0000
9     0.4000  0.9500  0.5000  0.1500  0.0060  0.0510   0.0000
10    0.8000  0.0000  0.5000  0.0000  0.0000  0.0000   0.9930
11    0.8000  0.5000  0.5000  0.0000  0.0000  0.0000   0.9960
12    0.8000  0.9500  0.5000  0.0020  0.0000  0.0200   1.0000
13    0.4000  0.0000  0.1000  0.2740  0.1010  0.1720   0.0000
14    0.4000  0.5000  0.1000  0.2780  0.0870  0.1710   0.0000
15    0.4000  0.9500  0.1000  0.2840  0.2810  0.3400   0.0000
16    0.8000  0.0000  0.1000  0.0200  0.0040  0.2200   1.0000
17    0.8000  0.5000  0.1000  0.0160  0.0060  0.2210   1.0000
18    0.8000  0.9500  0.1000  0.0660  0.1450  0.7120   1.0000

Table 5: Proportion of SDM model estimates of ρ falling outside the (0, 1) interval for the instrumental variables and OLS estimators


Case  ρ       r²      φ       R̃ML     R̃RE     R̂B      R̂IV     R̂OLS
1     0.4000  0.9000  0.0000  0.0146  0.0142  0.0404  0.0469  0.2329
2     0.4000  0.9000  0.5000  0.0151  0.0146  0.0402  0.0455  0.2262
3     0.4000  0.9000  0.9500  0.0167  0.0151  0.0683  0.1115  0.2690
4     0.8000  0.9000  0.0000  0.0059  0.0059  0.0081  0.0086  0.0925
5     0.8000  0.9000  0.5000  0.0066  0.0064  0.0086  0.0091  0.0971
6     0.8000  0.9000  0.9500  0.0096  0.0075  0.0160  0.0216  0.1882
7     0.4000  0.5000  0.0000  0.0154  0.0152  0.1263  0.2197  0.2936
8     0.4000  0.5000  0.5000  0.0158  0.0155  0.1193  0.2113  0.2905
9     0.4000  0.5000  0.9500  0.0172  0.0156  0.2084  0.3906  0.2936
10    0.8000  0.5000  0.0000  0.0078  0.0078  0.0244  0.0375  0.2289
11    0.8000  0.5000  0.5000  0.0082  0.0080  0.0258  0.0400  0.2318
12    0.8000  0.5000  0.9500  0.0107  0.0084  0.0486  0.1072  0.2678
13    0.4000  0.1000  0.0000  0.0158  0.0152  0.3672  0.4546  0.3023
14    0.4000  0.1000  0.5000  0.0161  0.0155  0.3597  0.4588  0.2994
15    0.4000  0.1000  0.9500  0.0177  0.0156  0.6264  0.5423  0.2965
16    0.8000  0.1000  0.0000  0.0086  0.0084  0.0743  0.1537  0.2725
17    0.8000  0.1000  0.5000  0.0086  0.0085  0.0774  0.1580  0.2731
18    0.8000  0.1000  0.9500  0.0107  0.0086  0.1428  0.2354  0.2809

Table 6: SDM Model RMSE from Instrumental Variables and Maximum Likelihood


3.2 Performance by Sample Size

In section 3.1.2 the most difficult case (15) for the best instruments in estimating the SDM was for ρ = 0.4, R2 = 0.1, and φ = 0.95. Both the best instruments and two stage least squares produced biased estimates of ρ when n = 3,000. However, the theory establishes the consistency of these techniques. In this section we examine how the bias and performance of the IV techniques vary with sample size. Table 7 shows that the best instruments show low bias after 500,000 observations, while the two stage least squares estimates take 2,500,000 observations to achieve low bias. In contrast, maximum likelihood displayed low bias across all sample sizes (despite using an approximation to the log-determinant), while OLS displayed high (but relatively constant) bias across all sample sizes.

In terms of computation, we used a Delaunay triangulation routine to determine W based on contiguity. This took slightly over one minute on a 2005 vintage computer. We used the Barry and Pace (1999) log-determinant approximation in calculating maximum likelihood (with some modifications described in LeSage and Pace (2009)). The reported times for maximum likelihood combine the times for computing the log-determinant and estimating the model once. Naturally, in the simulation the log-determinant approximation was only run once, and the marginal time required for estimating the model for another y approximately equaled that of OLS. The best IV estimator, due to computing (In − ρW)⁻¹Xβ, actually required more time than maximum likelihood when n = 2,500,000. Nonetheless, all the techniques were computationally feasible, even for large sample sizes.
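For intuition, here is a sketch of the Monte Carlo log-determinant idea of Barry and Pace (1999): ln|In − ρW| = −Σ ρ^t tr(W^t)/t over t ≥ 1, with the traces estimated by random quadratic forms. This is our own minimal rendering (dense W, truncated series), not the authors' implementation; it assumes NumPy and the toy W from the earlier sketches.

import numpy as np

def mc_logdet(W, rho, n_draws=30, max_power=50, seed=None):
    """Monte Carlo estimate of ln|I - rho W| from the series -sum_t rho^t tr(W^t)/t."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    total = 0.0
    for _ in range(n_draws):
        x = rng.normal(size=n)
        v = x.copy()
        acc = 0.0
        for t in range(1, max_power + 1):
            v = W @ v                             # v = W^t x, built up iteratively
            acc -= (rho ** t) * (x @ v) / t       # -rho^t (x' W^t x) / t
        total += n * acc / (x @ x)                # tr(W^t) estimated by n x'W^t x / x'x
    return total / n_draws

# Compare against the exact dense computation for the toy W:
print(mc_logdet(W, 0.8, seed=0),
      np.linalg.slogdet(np.eye(W.shape[0]) - 0.8 * W)[1])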


n                20,000   100,000   500,000   2,500,000

mean ρ̂ML        0.3984   0.3999    0.3996    0.4002
s.d. ρ̂ML         0.0090   0.0039    0.0019    0.0008

mean ρ̂B         0.3792   0.3209    0.4087    0.3954
s.d. ρ̂B          0.8791   0.1695    0.0723    0.0322

mean ρ̂IV        0.8149   0.5815    0.4588    0.4073
s.d. ρ̂IV         0.1613   0.1097    0.0585    0.0314

mean ρ̂OLS       0.7060   0.7081    0.7074    0.7080
s.d. ρ̂OLS        0.0132   0.0057    0.0028    0.0012

mean R²          0.0972   0.0983    0.0985    0.0984

Delaunay secs    0.3451   1.7414    12.1831   62.5387

ML secs          4.9501   9.2216    44.5124   227.5284

Best IV secs     0.3811   2.5664    24.9302   295.3684

2SLS IV secs     0.4712   3.0837    14.4596   71.0653

OLS secs         0.1014   0.6483    3.1605    15.5437

Table 7: Estimation of the SDM Model for large n


3.3 Marginal Effects

We begin with the total marginal effects for the SAR model, as shown in Table 8. First, the ML, REML, and Best IV estimators usually display low biases across the cases. However, two stage least squares displays a positive bias which becomes serious for some of the cases when R2 = 0.5 and all of the cases when R2 = 0.1. In almost every situation the variability of the estimates of the total effects from all the estimators rises with φ. In some situations the variability more than doubles.

For the SDM, Table 9 displays the total effect results across the estimators and cases. In this case REML displays the lowest unadjusted total effect biases, although it often displays slightly more variability than ML. In the cases where R2 = 0.9 the best IV slightly underestimates the total effects. This underestimation becomes somewhat more prominent as the R2 falls. However, two stage least squares displays very serious biases when R2 equals 0.5 and especially 0.1. In case 15, due to having estimates of ρ > 1, two stage least squares reverses the sign of the total effect.


Case  ρ       r²      φ       T̃ML     T̃RE     T̂B      T̂IV     T̂OLS
1     0.4000  0.9000  0.0000  1.6681  1.6679  1.6686  1.6733  1.7566
                              0.0489  0.0489  0.0503  0.0501  0.0538
2     0.4000  0.9000  0.5000  1.6672  1.6677  1.6705  1.6731  1.7270
                              0.0494  0.0494  0.0519  0.0519  0.0520
3     0.4000  0.9000  0.9500  1.6671  1.6689  1.6692  1.6709  1.7078
                              0.0641  0.0642  0.0619  0.0623  0.0673
4     0.8000  0.9000  0.0000  5.0043  5.0029  5.0057  5.0414  5.8271
                              0.1794  0.1793  0.1812  0.1902  0.2214
5     0.8000  0.9000  0.5000  5.0021  5.0021  5.0145  5.0424  5.7265
                              0.2007  0.2010  0.2085  0.2129  0.2378
6     0.8000  0.9000  0.9500  4.9878  4.9959  5.0173  5.0371  5.9029
                              0.3458  0.3471  0.3402  0.3389  0.4629
7     0.4000  0.5000  0.0000  1.6715  1.6706  1.6723  1.7135  2.2256
                              0.1324  0.1323  0.1510  0.1507  0.2022
8     0.4000  0.5000  0.5000  1.6743  1.6763  1.6776  1.7014  2.0627
                              0.1419  0.1421  0.1550  0.1563  0.1917
9     0.4000  0.5000  0.9500  1.6705  1.6748  1.6741  1.6881  1.8484
                              0.1901  0.1906  0.1869  0.1916  0.2333
10    0.8000  0.5000  0.0000  5.0073  5.0019  5.0098  5.3103  17.3840
                              0.4533  0.4529  0.5465  0.5937  2.9525
11    0.8000  0.5000  0.5000  5.0047  5.0053  5.0368  5.2735  15.8394
                              0.5542  0.5541  0.6199  0.6736  2.9666
12    0.8000  0.5000  0.9500  4.9912  5.0058  5.0601  5.2461  −11.7811
                              1.0044  1.0109  1.0211  1.0778  13.1279
13    0.4000  0.1000  0.0000  1.6853  1.6840  1.6740  2.0528  3.0537
                              0.3648  0.3642  0.4548  0.5255  0.7371
14    0.4000  0.1000  0.5000  1.7011  1.7041  1.6927  1.9238  2.7350
                              0.4218  0.4226  0.4661  0.5236  0.7255
15    0.4000  0.1000  0.9500  1.6849  1.6905  1.6975  1.8340  1.9817
                              0.5557  0.5583  0.5687  0.6500  0.7606
16    0.8000  0.1000  0.0000  5.0357  5.0290  4.9641  7.9957  −16.9445
                              1.1136  1.1121  1.6774  2.8672  5.3322
17    0.8000  0.1000  0.5000  5.1088  5.1097  5.0499  7.5818  −14.0975
                              1.5626  1.5629  1.9185  3.0750  6.0601
18    0.8000  0.1000  0.9500  5.0829  5.1015  5.2099  7.1609  −0.6701
                              2.9592  2.9814  3.2568  6.7314  3.9017

Table 8: Estimated Total Effects from the SAR Model via Instrumental Variables and Maximum Likelihood (for each case, the first row reports the median estimate and the second row the MAD)


Case  ρ       r²      φ       T̃ML     T̃RE     T̂B      T̂IV     T̂OLS
1     0.4000  0.9000  0.0000  1.6507  1.6574  1.6499  1.7651  2.6580
                              0.0681  0.0681  0.1669  0.1768  0.2171
2     0.4000  0.9000  0.5000  1.6443  1.6564  1.6453  1.7552  2.6058
                              0.0730  0.0736  0.1584  0.1751  0.2147
3     0.4000  0.9000  0.9500  1.6329  1.6581  1.6537  2.0091  2.9685
                              0.1783  0.1820  0.3085  0.3970  0.4300
4     0.8000  0.9000  0.0000  4.9459  4.9490  4.9654  5.0811  9.0384
                              0.2441  0.2439  0.3131  0.3278  0.6777
5     0.8000  0.9000  0.5000  4.9195  4.9380  4.9683  5.0943  9.4189
                              0.2814  0.2826  0.3455  0.3578  0.8043
6     0.8000  0.9000  0.9500  4.8010  4.9138  4.9465  5.4273  66.8883
                              0.9072  0.9334  1.0107  1.0857  59.7624
7     0.4000  0.5000  0.0000  1.6483  1.6560  1.6226  2.5829  3.1540
                              0.1360  0.1365  0.5040  0.9034  0.4073
8     0.4000  0.5000  0.5000  1.6425  1.6567  1.6009  2.5267  3.1150
                              0.1576  0.1589  0.4745  0.7849  0.4553
9     0.4000  0.5000  0.9500  1.6192  1.6466  1.5711  4.0734  3.1670
                              0.5031  0.5132  0.9757  3.3751  1.0860
10    0.8000  0.5000  0.0000  4.9039  4.9105  4.8895  5.9812  −32.2970
                              0.4460  0.4459  0.9282  1.1946  13.6989
11    0.8000  0.5000  0.5000  4.8703  4.9035  4.9010  6.0701  −29.0026
                              0.6060  0.6090  1.0187  1.3284  11.8103
12    0.8000  0.5000  0.9500  4.6716  4.8153  4.7548  10.1375 −14.4369
                              2.6279  2.7193  2.9596  7.5898  8.0974
13    0.4000  0.1000  0.0000  1.6379  1.6454  1.2985  4.2001  3.2351
                              0.3594  0.3609  1.3675  4.6443  0.7975
14    0.4000  0.1000  0.5000  1.6272  1.6424  1.2711  4.2402  3.2010
                              0.4500  0.4540  1.2751  5.1848  0.9702
15    0.4000  0.1000  0.9500  1.6103  1.6463  0.6335  2.8419  3.1254
                              1.4948  1.5232  1.7459  12.0521 3.0639
16    0.8000  0.1000  0.0000  4.8442  4.8508  4.6500  12.2291 −12.7589
                              1.1063  1.1079  2.7628  12.9961 3.3146
17    0.8000  0.1000  0.5000  4.7894  4.8211  4.5625  12.0907 −12.5116
                              1.6429  1.6566  3.0707  15.9700 4.8945
18    0.8000  0.1000  0.9500  4.6336  4.7955  2.6696  −7.4605 −11.5743
                              7.8888  8.1310  8.0018  39.7988 19.6848

Table 9: Estimated Total Effects from the SDM Model via Instrumental Variables and Maximum Likelihood (for each case, the first row reports the median estimate and the second row the MAD)


4 Sensitivity to W

The above analysis has always used a contiguity-based weight matrix. A common finding in spatial econometrics is that the structure of W matters in some cases. To investigate this, we examined the case with an SDM DGP where n = 3,000, φ = 0.95, ρ = 0.4, and R2 = 0.9. This is similar to the case used in analyzing models with large n in section 3.2. However, the sample size is smaller and the R2 was raised from 0.1 to 0.9 to allow better performance of the IV estimators for this sample size of 3,000 observations.

We examined the aforementioned contiguity weight matrix; weight matrices based on 6, 20, and 30 nearest neighbors (NN-6, NN-20, NN-30); weight matrices where the 6, 20, and 30 diagonals closest to the main diagonal are positive (D-6, D-20, and D-30); a random matrix where approximately a 6/n proportion of the off-diagonal elements are positive (Random); and a block diagonal matrix comprised of a 50 by 50 submatrix S with the 6 diagonals nearest the main diagonal positive (and 0 otherwise), so that the candidate weight matrix equals I60 ⊗ S. All candidate weight matrices were symmetrized and scaled to be doubly stochastic.
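A sketch of two of these candidate structures, the banded D-k matrices and the block diagonal I60 ⊗ S matrix (our own illustration of the description above, assuming NumPy; the doubly stochastic rescaling can reuse the Sinkhorn-style step from section 2.2):

import numpy as np

def banded_candidate(n, k):
    """D-k candidate: ones on the k/2 diagonals on each side of the main diagonal."""
    A = np.zeros((n, n))
    for d in range(1, k // 2 + 1):
        A += np.eye(n, k=d) + np.eye(n, k=-d)
    return A

def block_diag_candidate(n_blocks=60, block_size=50, k=6):
    """Block diagonal candidate I_60 kron S, with S a 50 by 50 banded submatrix."""
    S = banded_candidate(block_size, k)
    return np.kron(np.eye(n_blocks), S)

D30 = banded_candidate(3000, 30)   # 15 positive diagonals on each side of the main diagonal
BD = block_diag_candidate()        # 3,000 by 3,000 block diagonal candidate
# Each candidate would then be symmetrized (already symmetric here) and rescaled
# to be doubly stochastic, e.g. with the Sinkhorn-style step shown in section 2.2.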

Table 10 shows that the weight matrices made a difference in the average estimates of ρ across the 1,000 trials. For all the weight matrices, REML gave answers close to the true parameter value of 0.4. The Best IV estimator also came close to this value, with the most bias appearing in the case of the matrix with 20 positive diagonals (D-20), where the average estimate of ρ equaled 0.3705. Interestingly, maximum likelihood showed considerable bias in the case of the D-30 weight matrix, with an average estimate of 0.3318. Also, 2SLS showed its highest bias in this case with an estimate of 0.745. OLS showed its lowest bias in the case of NN-30. Typically, the determinant plays a smaller role the denser the weight matrix and the less clustered neighbors are to the main diagonal.

W                 ML      REML    BEST    2SLS    OLS
Contiguity        0.3881  0.3980  0.3923  0.5065  0.6672
NN-6              0.3870  0.3989  0.3847  0.5570  0.6617
D-6               0.3871  0.3984  0.3760  0.6143  0.6563
NN-20             0.3565  0.3961  0.3880  0.6080  0.5910
D-20              0.3560  0.3959  0.3705  0.7291  0.5963
NN-30             0.3357  0.3949  0.3883  0.6374  0.5525
D-30              0.3316  0.3927  0.3740  0.7450  0.5546
Random            0.3318  0.3928  0.3755  0.7379  0.5539
Block Diagonal    0.3881  0.3985  0.3728  0.6315  0.6590

Table 10: Average Estimates of ρ for the SDM Model Across W

Table 11 shows the RMSE of the ρ estimates for the different estimators across the different weight matrices, normalized by the REML RMSE. Interestingly, maximum likelihood performed much worse than REML for some of the weight matrices. In the case of a weight matrix with positive entries in the 15 closest off-diagonals on either side of the main diagonal (D-30), ML had 67.2 percent more RMSE than REML. Almost all of that, however, was bias. The Best IV estimator had a worse RMSE than OLS for the D-20, D-30, and Random weight matrices. Two stage least squares had a worse RMSE than OLS for the NN-20, D-20, NN-30, D-30, and Random weight matrices.

W                 ML      REML    BEST    2SLS     OLS
Contiguity        1.1184  1.0000  4.6860  5.8506   11.2825
NN-6              1.1356  1.0000  5.7661  7.6381   10.7066
D-6               1.1506  1.0000  7.2979  10.3208  11.1566
NN-20             1.3963  1.0000  3.3209  5.0693   4.4431
D-20              1.4788  1.0000  5.4840  8.3331   4.9435
NN-30             1.5477  1.0000  2.9276  4.6719   3.1370
D-30              1.6719  1.0000  4.1952  6.9101   3.3118
Random            1.6705  1.0000  4.0901  6.7866   3.3025
Block Diagonal    1.1384  1.0000  8.4173  11.5782  11.6931

Table 11: Relative RMSE of ρ for the SDM Model Across W

5 Conclusion

Much of the research conducted through Monte Carlo experiments in spatial econometrics assumes that the regressors are iid. This is not a realistic assumption given the very high level of measured spatial dependence found in common economic variables. Dependence in the regressors may not be very important in all situations. For example, estimation of regression parameters in error models is unbiased in the presence of spatially dependent disturbances, and this will not change for dependent regressors (although it might affect estimation precision). However, as this manuscript documents, the performance of instrumental variable techniques when estimating SAR and especially SDM models can be sensitive to spatial dependence in the regressors, even when using thousands of observations. In previous research, the bias of OLS when estimating SAR models also displayed sensitivity to dependence in the regressors (LeSage and Pace, 2009; Pace and LeSage, 2009). Therefore, we recommend that investigations into the performance of various spatial techniques examine the sensitivity of these techniques to spatial dependence in explanatory variables in conjunction with other spatial features of interest.


References

Anselin, L. (1988). Spatial Econometrics: Methods and Models, Dordrecht: Kluwer Academic.

Barry, R. P. and R. K. Pace (1999). "A Monte Carlo Estimator of the Log Determinant of Large Sparse Matrices," Linear Algebra and its Applications, Vol. 289, pp. 41-54.

Bound, J., D. A. Jaeger and R. Baker (1995). "Problems with Instrumental Variables Estimation When the Correlation Between the Instruments and the Endogenous Explanatory Variable is Weak," Journal of the American Statistical Association, Vol. 90, pp. 443-450.

Bowden, Roger J. and Darrell A. Turkington (1984). Instrumental Variables, Cambridge: Cambridge University Press.

Byrne, P. F. (2005). "Strategic interaction and the adoption of tax increment financing," Regional Science and Urban Economics, Vol. 35, Issue 3, pp. 279-303.

Cressie, N. (1993). Statistics for Spatial Data, Revised edition, New York: John Wiley.

Kelejian, H. and I. Prucha (1998). "A Generalized Spatial Two Stage Least Squares Procedure for Estimating a Spatial Autoregressive Model with Autoregressive Disturbances," Journal of Real Estate Finance and Economics, Vol. 17, pp. 99-121.

Kelejian, H., I. Prucha, and Y. Yuzefovich (2004). "Instrumental Variable Estimation of a Spatial Autoregressive Model with Autoregressive Disturbances: Large and Small Sample Results," Advances in Econometrics, Volume 18: Spatial and Spatiotemporal Econometrics, J. P. LeSage and R. K. Pace (eds.), Oxford: Elsevier Ltd, pp. 163-198.

Klotz, Stefan (2004). Cross Sectional Dependence in Spatial Econometrics: With an Application to German Start-Up Activity Data, Munster: LIT Verlag Munster.

Land, K. and G. Deane (1992). "On the large-sample estimation of regression models with spatial or network-effects terms: a two stage least squares approach," in P. Marsden (ed.), Sociological Methodology, San Francisco: Jossey-Bass, pp. 279-303.

Lee, L. F. (2003). "Best Spatial Two-Stage Least Squares Estimators for a Spatial Autoregressive Model with Autoregressive Disturbances," Econometric Reviews, Vol. 22, pp. 307-335.

Lee, L. F. (2007). "GMM and 2SLS Estimation of Mixed Regressive, Spatial Autoregressive Models," Journal of Econometrics, Vol. 137, pp. 489-514.

LeSage, J. P. and R. K. Pace (2009). Introduction to Spatial Econometrics, Boca Raton: CRC Press/Taylor & Francis.

Millimet, D. L. and V. Rangaprasad (2007). "Strategic competition amongst public schools," Regional Science and Urban Economics, Vol. 37, Issue 2, pp. 199-219.

Pace, R. Kelley and James P. LeSage (2009). "Biases of OLS and Spatial Lag Models in the Presence of an Omitted Variable and Spatially Dependent Variables," in Antonio Páez, Julie Le Gallo, Ron N. Buliung, and Sandy Dall'erba (eds.), Progress in Spatial Analysis: Methods and Applications, Berlin: Springer-Verlag.

Patterson, H. D. and R. Thompson (1971). "Recovery of inter-block information when block sizes are unequal," Biometrika, Vol. 58, pp. 545-554.

Richards, T. J. and L. Padilla (2009). "Promotion and Fast Food Demand," American Journal of Agricultural Economics, Vol. 91, Issue 1, pp. 168-183.

Ripley, B. (1981). Spatial Statistics, New York: Wiley.

Staiger, D. and J. H. Stock (1997). "Instrumental Variables Regression with Weak Instruments," Econometrica, Vol. 65, Issue 3, pp. 557-586.
