Spatial Correlation Robust Inference
Ulrich K. Müller and Mark W. Watson
Princeton University
Presentation at University of Virginia
Motivation
• More and more empirical work using spatial data in development, trade, macro, etc.
• How to appropriately correct standard errors?
— Conley (1999): Spatial analog of HAC standard errors
— Don’t work well under moderate or high spatial dependence
— Many applications characterized by strong spatial correlations (Kelly (2019, 2020))
Canonical Problem
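Observe
$$y_l = \mu + u_l, \quad l = 1, \dots, n$$
• $y_l$ (and $u_l$) associated with observed location $s_l \in S \subset \mathbb{R}^d$
• $s_l$ drawn i.i.d. with density $g$
• With $s = (s_1, \dots, s_n)$: $E[u_l \mid s] = 0$, $E[u_l u_\ell \mid s] = \sigma_u(s_l - s_\ell)$, so covariance stationary
• How to test $H_0: \mu = \mu_0$ and construct confidence interval for $\mu$?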
• Extensions to regression, GMM etc. follow from standard arguments (see paper)
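To make the canonical problem concrete, here is a minimal simulation sketch. The uniform density on the unit square, the exponential covariance function, and the names `simulate_spatial_sample`, `mu`, and `c` are illustrative assumptions, not part of the talk.

```python
import numpy as np

def simulate_spatial_sample(n, mu, c, rng):
    """Draw locations and outcomes for y_l = mu + u_l.

    Assumptions (illustrative only): locations are uniform on the unit
    square and Cov(u_l, u_k | s) = exp(-c * ||s_l - s_k||).
    """
    s = rng.uniform(size=(n, 2))                        # locations s_l
    dist = np.linalg.norm(s[:, None, :] - s[None, :, :], axis=2)
    sigma = np.exp(-c * dist)                           # covariance matrix of u given s
    # Tiny diagonal jitter keeps the Cholesky factorization numerically stable.
    chol = np.linalg.cholesky(sigma + 1e-10 * np.eye(n))
    u = chol @ rng.standard_normal(n)                   # u | s ~ N(0, sigma)
    return s, mu + u, sigma

rng = np.random.default_rng(0)
s, y, sigma = simulate_spatial_sample(n=200, mu=1.0, c=5.0, rng=rng)
print(y.mean())
```

Smaller values of `c` give stronger spatial dependence, so the sample mean becomes a noisier estimate of `mu` even though `n` is unchanged.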
Three One-Dimensional Designs
Two Geographic Designs
Light data from Henderson, Squires, Storeygard and Weil (2018)
Spatial Inference
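• Usual t-statistic
$$\tau = \frac{\sqrt{n}\,(\bar y - \mu_0)}{\hat\sigma}$$
with associated critical value cv, and confidence interval with endpoints $\bar y \pm \mathrm{cv}\,\hat\sigma/\sqrt{n}$
• Conley (1999): kernel-type “consistent” estimator
$$\hat\sigma^2 = n^{-1}\sum_{l,\ell} k\!\left(\frac{\|s_l - s_\ell\|}{b_n}\right)\hat u_l \hat u_\ell, \quad \hat u_l = y_l - \bar y,$$
with bandwidth $b_n \to \infty$ and standard normal cv
• Bester, Conley, Hansen, and Vogelsang (2016): ‘fixed-b’ version
$$\hat\sigma^2 = n^{-1}\sum_{l,\ell} k\!\left(\frac{\|s_l - s_\ell\|}{b}\right)\hat u_l \hat u_\ell$$
and nonstandard cv obtained by simulating from $y_l \sim \text{i.i.d. } N(0, 1)$
• Sun and Kim (2012):
$$\hat\sigma^2 = q^{-1}\sum_{j=1}^{q}\Big(\sum_{l=1}^{n} n^{-1/2} w_{jl}\,\hat u_l\Big)^2$$
with Fourier weights $w_{jl}$ and Student-t cv (analogous to Müller (2004, 2007), Phillips (2005), etc.)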
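A kernel-type variance estimator of the kind listed above can be sketched as follows. The Bartlett kernel, the function name `kernel_t_stat`, and the one-dimensional example design are assumptions made for illustration; the talk does not specify a kernel.

```python
import numpy as np

def kernel_t_stat(y, s, bandwidth, mu0=0.0):
    """t-statistic with a kernel variance estimator (sketch).

    Assumption: Bartlett kernel k(x) = max(1 - x, 0). With one-dimensional
    locations this yields a nonnegative variance estimate; in higher
    dimensions that is not guaranteed for this kernel.
    """
    y = np.asarray(y, dtype=float)
    s = np.asarray(s, dtype=float).reshape(len(y), -1)   # locations as (n, d)
    n = len(y)
    u_hat = y - y.mean()                                  # residuals
    dist = np.linalg.norm(s[:, None, :] - s[None, :, :], axis=2)
    k = np.clip(1.0 - dist / bandwidth, 0.0, None)        # kernel weights
    sigma2 = u_hat @ k @ u_hat / n
    tau = np.sqrt(n) * (y.mean() - mu0) / np.sqrt(sigma2)
    return tau, sigma2

rng = np.random.default_rng(0)
s = rng.uniform(size=300)
y = rng.standard_normal(300)
tau, sigma2 = kernel_t_stat(y, s, bandwidth=0.05)
print(tau, sigma2)
```

The methods differ mainly in how `bandwidth` is chosen and in which critical value `tau` is compared with (standard normal versus a simulated fixed-b critical value).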
This Paper
• Measure of strength of spatial dependence: average pairwise correlation
$$\bar\rho = \frac{1}{n(n-1)}\sum_{l=1}^{n}\sum_{\ell \neq l}\mathrm{Cor}(u_l, u_\ell \mid s)$$
• Objective: method to construct $\hat\sigma$ and cv that “work” for
— some moderate and all small values of $\bar\rho$
— also for non-uniform spatial density
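The average pairwise correlation is simple to compute from any candidate covariance matrix. A small sketch, where the function name and the equicorrelated example are hypothetical:

```python
import numpy as np

def average_pairwise_correlation(sigma):
    """Average of Cor(u_l, u_k | s) over all ordered pairs with l != k."""
    n = sigma.shape[0]
    sd = np.sqrt(np.diag(sigma))
    corr = sigma / np.outer(sd, sd)          # correlation matrix, unit diagonal
    return (corr.sum() - n) / (n * (n - 1))  # drop the n diagonal entries

# Hypothetical check: an equicorrelated matrix with correlation 0.3.
n = 5
equi = 0.3 * np.ones((n, n)) + 0.7 * np.eye(n)
rho_bar = average_pairwise_correlation(equi)
print(rho_bar)
```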
Outline of Talk
1. Definition of new “SCPC” method
2. Size control under generic weak correlation
3. Size control under some forms of strong correlation
4. Small sample performance
Motivation for SCPC Method
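• Consider OLS variance estimator $\hat\sigma^2_{OLS} = (n-1)^{-1}\sum_{l=1}^{n}(y_l - \bar y)^2$. Using familiar notation, with $M = I_n - n^{-1}ll'$,
$$\hat\sigma^2_{OLS} = (n-1)^{-1}y'My = (n-1)^{-1}y'\Big(n^{-1}\sum_{i=1}^{n-1} w_i w_i'\Big)y = (n-1)^{-1}\,n^{-1}\sum_{i=1}^{n-1}(w_i'u)^2$$
for $w_i$ such that $n^{-1}w_i'w_j = \mathbf{1}[i = j]$ and $l'w_i = 0$ for $l = (1, \dots, 1)'$
So $|\tau| > \mathrm{cv}$ iff
$$(l'u)^2 > \mathrm{cv}^2\,(n-1)^{-1}\sum_{i=1}^{n-1}(w_i'u)^2$$
• Challenge with (positive) spatial correlation: Most weighted averages $w_i'u$ are less variable than $l'u$, leading to downward bias of $\hat\sigma^2$
• Solution: Select (few) weighted averages $w_j'u$ that are as variable as possible under plausible spatial covariance matrix, and use $\hat\sigma^2 = q^{-1}\sum_{j=1}^{q}(n^{-1/2}w_j'u)^2$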
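The decomposition and the downward bias can be checked numerically. The sketch below assumes one-dimensional uniform locations and an exponential covariance with decay 2; both are illustrative choices, and the weights are scaled so that $n^{-1}w_i'w_j = \mathbf{1}[i=j]$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 150
s = rng.uniform(size=n)                                  # 1-D locations (assumed design)
sigma = np.exp(-2.0 * np.abs(s[:, None] - s[None, :]))   # assumed exponential covariance
l = np.ones(n)
M = np.eye(n) - np.outer(l, l) / n

eigvals, eigvecs = np.linalg.eigh(M @ sigma @ M)
order = np.argsort(eigvals)[::-1][: n - 1]   # drop the zero eigenvalue belonging to l
W = np.sqrt(n) * eigvecs[:, order]           # columns w_i: w_i'w_j / n = 1[i = j], l'w_i = 0

# Identity: y'My = n^{-1} * sum_i (w_i'u)^2
u = np.linalg.cholesky(sigma) @ rng.standard_normal(n)
y = 3.0 + u
lhs = y @ M @ y
rhs = np.sum((W.T @ u) ** 2) / n

var_sqrt_n_mean = l @ sigma @ l / n           # Var(n^{-1/2} l'u | s), the estimation target
var_contrasts = eigvals[order]                # Var(n^{-1/2} w_i'u | s) for each contrast
expected_ols = var_contrasts.sum() / (n - 1)  # E[sigma_hat^2_OLS | s]
share_below = np.mean(var_contrasts < var_sqrt_n_mean)
print(lhs, rhs, var_sqrt_n_mean, expected_ols, share_below)
```

In this design the expected OLS variance estimate falls well short of the variance of the scaled sample mean, because almost every contrast is less variable than $l'u$.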
SCPC Method
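• Benchmark model $\sigma_u(s_l - s_\ell) = \exp(-c\,\|s_l - s_\ell\|)$ with associated covariance matrix $\Sigma(c)$ of $u$
— worst-case correlation has $c = c_0$, weaker correlation for $c > c_0$
• Use eigenvectors $r_1, \dots, r_q$ of $M\Sigma(c_0)M$ corresponding to largest eigenvalues (normalized so that $n^{-1}r_j'r_j = 1$) as weights and
$$\hat\sigma^2_{SCPC}(q) = q^{-1}\sum_{j=1}^{q}(n^{-1/2}r_j'u)^2$$
“Spatial Correlation Principal Components” (SCPC) of $\hat u = Mu \sim N(0, M\Sigma(c_0)M)$
• For given $q$, compute critical value $\mathrm{cv}_{SCPC}(q)$ such that $\sup_{c \ge c_0} P^0_{\Sigma(c)}(|\tau_{SCPC}(q)| > \mathrm{cv}_{SCPC}(q) \mid s) = \alpha$
• $q_{SCPC}$ minimizes CI length $E[2\,\hat\sigma_{SCPC}(q)\,\mathrm{cv}_{SCPC}(q) \mid s]$ in i.i.d. model $u \mid s \sim N(0, \sigma^2 I)$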
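The steps on this slide can be sketched end to end as below. This is a rough illustration, not the authors' Stata or MATLAB code: the grid of $c$ values, the Monte Carlo critical value, the bisection bounds, and all function names are assumptions.

```python
import math
import numpy as np

def exp_cov(s, c):
    dist = np.linalg.norm(s[:, None, :] - s[None, :, :], axis=2)
    return np.exp(-c * dist)                       # benchmark Sigma(c)

def avg_corr(sigma):
    n = sigma.shape[0]
    return (sigma.sum() - n) / (n * (n - 1))       # valid because diag(sigma) = 1

def calibrate_c0(s, rho_bar0, lo=1e-3, hi=1e4, iters=60):
    # Average correlation decreases in c, so bisect on log(c).
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if avg_corr(exp_cov(s, mid)) > rho_bar0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def scpc_weights(s, c0, q_max):
    n = s.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n
    _, vecs = np.linalg.eigh(M @ exp_cov(s, c0) @ M)
    return vecs[:, ::-1][:, :q_max]                # unit-norm, largest eigenvalue first

def scpc_cv(s, E, c0, alpha, z, c_multipliers):
    # cv = max over the c grid of the (1 - alpha) quantile of |tau| under N(0, Sigma(c)).
    n, q = E.shape
    W0 = np.column_stack([np.ones(n) / math.sqrt(n), E])
    cv = 0.0
    for mult in c_multipliers:
        omega = W0.T @ exp_cov(s, c0 * mult) @ W0
        X = z[:, : q + 1] @ np.linalg.cholesky(omega).T
        tau = np.abs(X[:, 0]) / np.sqrt(np.mean(X[:, 1:] ** 2, axis=1))
        cv = max(cv, float(np.quantile(tau, 1 - alpha)))
    return cv

def scpc(y, s, rho_bar0=0.02, alpha=0.05, q_max=10, mu0=0.0, n_sim=20000, seed=0):
    n = len(y)
    c0 = calibrate_c0(s, rho_bar0)
    E_all = scpc_weights(s, c0, q_max)
    z = np.random.default_rng(seed).standard_normal((n_sim, q_max + 1))
    grid = np.geomspace(1.0, 1000.0, 15)           # finite stand-in for all c >= c0
    best = None
    for q in range(1, q_max + 1):
        cv = scpc_cv(s, E_all[:, :q], c0, alpha, z, grid)
        # E[sqrt(chi2_q / q)]: expected sigma_hat in the i.i.d. model with sigma = 1.
        e_sigma = math.sqrt(2 / q) * math.exp(math.lgamma((q + 1) / 2) - math.lgamma(q / 2))
        if best is None or e_sigma * cv < best[0]:
            best = (e_sigma * cv, q, cv)
    _, q, cv = best
    sigma_hat = math.sqrt(np.mean((E_all[:, :q].T @ y) ** 2))
    tau = math.sqrt(n) * (y.mean() - mu0) / sigma_hat
    half = cv * sigma_hat / math.sqrt(n)
    return {"tau": tau, "cv": cv, "q": q, "c0": c0, "ci": (y.mean() - half, y.mean() + half)}

rng = np.random.default_rng(0)
s = rng.uniform(size=(120, 2))
y = 1.0 + np.linalg.cholesky(exp_cov(s, 10.0)) @ rng.standard_normal(120)
result = scpc(y, s)
print(result)
```

The code works with unit-norm eigenvectors $e_j = n^{-1/2}r_j$, so `E.T @ y` equals $n^{-1/2}r_j'u$ because each $e_j$ is orthogonal to $l$. A production implementation would need a finer treatment of the supremum over $c$ and the computational short-cuts for large $n$ mentioned later in the talk.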
Eigenvectors in One-Dimensional Designs
Eigenvectors in Geographic Designs
Properties of SCPC Inference I
• Select $c_0$ as function of implied average correlation $\bar\rho = \bar\rho_0$
⇒ SCPC inference invariant to scale of locations, as well as any distance preserving transformations,
such as rotations
• For fixed $\bar\rho_0$ (such as $\bar\rho_0 = 0.02$), $q_{SCPC} = O(1)$
— $\hat\sigma_{SCPC}$ not consistent, $\mathrm{cv}_{SCPC}$ does not converge to normal critical value
— (appropriately) reflects difficulty of correctly estimating variance of $l'u$ under ‘strong’ correlation
• SCPC easy to apply, even for (very) large $n$, using computational short-cuts for determination of eigenvectors if $n$ is large (Stata and MATLAB code available)
• By construction, SCPC controls size in benchmark model $u \mid s \sim N(0, \Sigma(c))$, $c \ge c_0$
Properties of SCPC Inference II
• Theorem: Under regularity conditions, SCPC controls asymptotic size under generic ‘weak correlation’ with $\bar\rho \to 0$, also in non-Gaussian models
(whereas suggestions by Sun and Kim (2012) and Bester, Conley, Hansen, and Vogelsang (2016) only do so for uniform $g$)
• Theorem: In Gaussian model with parametric covariance matrices $u \mid s \sim N(0, \Sigma(\theta))$, $\theta \in \Theta$, easy-to-compute conditions such that SCPC controls size in nonparametric class of models
$$u \mid s \sim N\Big(0, \int \Sigma(\theta)\,dF(\theta)\Big) \text{ for all cdfs } F$$
⇒ Numerical result: in many spatial designs, SCPC controls size for mixtures of Matérn covariance functions with implied $\bar\rho \le \bar\rho_0$
• Numerical result: SCPC close to minimizing expected confidence interval length in particular class of spatial correlations among all functions of $y$
Weak Correlation: Set-up of Lahiri (2003)
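• Locations $s_l$ are sampled i.i.d. on $S \subset \mathbb{R}^d$
• Conditional on locations $s = (s_1, \dots, s_n)$, for some sequence $c_n > 0$,
$$u_l = B(c_n s_l)$$
with $B$ a mean-zero stationary random field on $\mathbb{R}^d$ with $E[B(s)B(r)] = \sigma_B(s - r)$
• Calculation: $c_n^d\,\bar\rho = O(1)$, so weak dependence when $c_n \to \infty$
⇒ nature of weak dependence characterized by $a_n = c_n^d/n \to a \in [0, \infty)$
⇒ $a_n \to \infty$ corresponds to asymptotically negligible spatial correlation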
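A back-of-the-envelope version of the calculation, writing $c_n$ for the scaling sequence, $\sigma_B$ for the covariance function of the field, and $g$ for the location density. It assumes $\sigma_B$ is integrable and $g$ is bounded and continuous, and it ignores boundary effects, so it is a heuristic sketch rather than a proof:

```latex
\begin{align*}
E\big[\mathrm{Cor}(u_1, u_2 \mid s)\big]
  &= \frac{1}{\sigma_B(0)} \int_S \int_S \sigma_B\big(c_n (s - r)\big)\, g(s)\, g(r)\, ds\, dr \\
  &= \frac{c_n^{-d}}{\sigma_B(0)} \int_S \int \sigma_B(v)\, g(r + v/c_n)\, g(r)\, dv\, dr
     \qquad (v = c_n (s - r)) \\
  &\approx c_n^{-d}\, \frac{\int \sigma_B(v)\, dv}{\sigma_B(0)} \int_S g(r)^2\, dr .
\end{align*}
```

The average pairwise correlation is therefore of order $c_n^{-d}$, and the integral of $g^2$ that appears here is the same feature of the density that later enters the limiting covariance of weighted averages.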
Weak Correlation: Weighted Averages
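• Lemma: Let the $n \times (q+1)$ matrix $W^0$ have $l$th row $w^0(s_l)'$ where $w^0(s) = (1, w(s)')'$, $w(s) = (\phi_1(s), \dots, \phi_q(s))'$ and $\phi_j: S \mapsto \mathbb{R}$ are continuous weight functions. Under mixing and moment assumptions on $B$ of Lahiri (2003)
$$a_n^{1/2}\,n^{-1/2}\,W^{0\prime}u \mid s \Rightarrow N(0, \Omega) \quad \text{with} \quad \Omega = a\,\sigma_B(0)\,V_1 + \Big(\int \sigma_B(s)\,ds\Big)V_2$$
where
$$V_1 = \int_S w^0(s)\,w^0(s)'\,g(s)\,ds \quad \text{and} \quad V_2 = \int_S w^0(s)\,w^0(s)'\,g(s)^2\,ds$$
⇒ $V_1$ is what we would expect from i.i.d. data, large $a$ corresponds to very weak correlation
⇒ $V_2$ proportional to $V_1$ only under constant $g$
• Convergence holds conditional on $s$: Randomness in locations doesn’t drive variability of weighted averages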
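The proportionality claim can be checked numerically. The sketch below takes $S = [0, 1]$, cosine weight functions, and a linear density as hypothetical examples, and approximates both integrals with a midpoint rule.

```python
import numpy as np

def weight_moments(density, q=2, grid_size=20000):
    """Midpoint-rule approximations of V1 and V2 on S = [0, 1] (sketch)."""
    s = (np.arange(grid_size) + 0.5) / grid_size
    g = density(s)
    g = g / g.mean()                                   # make g integrate to one
    cols = [np.ones_like(s)]
    cols += [np.sqrt(2) * np.cos(j * np.pi * s) for j in range(1, q + 1)]
    w0 = np.column_stack(cols)                         # rows are w0(s)'
    V1 = (w0 * g[:, None]).T @ w0 / grid_size          # integral of w0 w0' g
    V2 = (w0 * (g ** 2)[:, None]).T @ w0 / grid_size   # integral of w0 w0' g^2
    return V1, V2

def distance_from_proportional(V1, V2):
    """Frobenius distance of V2 V1^{-1} from the nearest multiple of the identity."""
    A = V2 @ np.linalg.inv(V1)
    k = len(A)
    return np.linalg.norm(A - np.trace(A) / k * np.eye(k))

V1_unif, V2_unif = weight_moments(lambda s: np.ones_like(s))
V1_lin, V2_lin = weight_moments(lambda s: 0.5 + s)     # hypothetical non-uniform density
print(distance_from_proportional(V1_unif, V2_unif))
print(distance_from_proportional(V1_lin, V2_lin))
```

With the uniform density the two matrices coincide; with the linear density they are not proportional, which is what breaks critical values derived under i.i.d. sampling.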
Weak Correlation: t-statistic
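• Recall: $a_n^{1/2}\,n^{-1/2}\,W^{0\prime}u \mid s \Rightarrow N(0, \Omega)$ with $\Omega = a\,\sigma_B(0)\,V_1 + \big(\int \sigma_B(s)\,ds\big)V_2$
• Theorem: Let $\tau$ be t-statistic with $\hat\sigma^2 = q^{-1}\sum_{j=1}^{q}(n^{-1/2}w_j'y)^2$. With $X = (X_0, X_{1:q}')' \sim N(0, \Omega)$, under assumptions of Lemma,
$$P(|\tau| > \mathrm{cv} \mid s) \to P\left(\frac{|X_0|}{\sqrt{q^{-1}X_{1:q}'X_{1:q}}} > \mathrm{cv}\right)$$
• If $\Omega \propto V_1 = \int_S w^0(s)\,w^0(s)'\,g(s)\,ds = I_{q+1}$, then critical value from Student-t
⇒ but $\Omega \propto V_1$ only if $g$ uniform, so Sun and Kim (2012) approach not generically valid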
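The limiting rejection probability is easy to approximate by simulation for any given $\Omega$. In this sketch the function name and the alternative matrix `omega_alt` are hypothetical; the identity case reproduces the Student-t benchmark with $q = 4$.

```python
import numpy as np

def limit_rejection_prob(omega, cv, n_sim=400_000, seed=0):
    """Monte Carlo estimate of P(|X_0| / sqrt(mean(X_j^2)) > cv) for X ~ N(0, omega)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_sim, omega.shape[0])) @ np.linalg.cholesky(omega).T
    tau = np.abs(X[:, 0]) / np.sqrt(np.mean(X[:, 1:] ** 2, axis=1))
    return np.mean(tau > cv)

q = 4
cv_t = 2.776                       # 97.5% quantile of Student-t with 4 degrees of freedom
p_identity = limit_rejection_prob(np.eye(q + 1), cv_t)

omega_alt = np.eye(q + 1)
omega_alt[0, 0] = 2.0              # hypothetical: X_0 twice as variable as each X_j
p_alt = limit_rejection_prob(omega_alt, cv_t)
print(p_identity, p_alt)
```

When $\Omega$ is proportional to the identity the Student-t critical value gives the nominal level; once the first element is relatively more variable, the same critical value over-rejects.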
Weak Correlation: Size Control
• Since t-statistic is scale invariant, no loss of generality to normalize
$$\Omega = \kappa V_1 + (1 - \kappa)V_2, \quad \kappa \in [0, 1)$$
where
$$V_1 = \int_S w^0(s)\,w^0(s)'\,g(s)\,ds \quad \text{and} \quad V_2 = \int_S w^0(s)\,w^0(s)'\,g(s)^2\,ds$$
⇒ scalar parameter $\kappa \in [0, 1)$ fully characterizes all possible limits under weak correlation
• SCPC benchmark model $\sigma_u(s_l - s_\ell) = \exp(-c\,\|s_l - s_\ell\|)$ replicates all such limits with appropriate choices of $c = c_n \to \infty$
Theorem: SCPC critical value construction $\sup_{c \ge c_0} P^0_{\Sigma(c)}(|\tau_{SCPC}(q)| > \mathrm{cv}_{SCPC}(q) \mid s) = \alpha$ implies size control also under arbitrary sequences $c_n \ge c_0$, and, therefore, under generic weak correlation.
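A short worked version of the normalization may help. Here $a$ is the limit of $c_n^d/n$, $\sigma_B$ is the covariance function of the underlying field, and $\kappa$ is the label used for the scalar mixing weight; the two-dimensional case is an illustrative choice, and the sketch assumes $\int \sigma_B(s)\,ds > 0$.

```latex
\begin{align*}
\Omega &= a\,\sigma_B(0)\,V_1 + \Big(\int \sigma_B(s)\,ds\Big) V_2
  \;\propto\; \kappa V_1 + (1-\kappa) V_2,
  \qquad
  \kappa = \frac{a\,\sigma_B(0)}{a\,\sigma_B(0) + \int \sigma_B(s)\,ds} \in [0,1), \\[4pt]
\sigma_B(s) = e^{-\|s\|},\ d = 2: \quad
  & \sigma_B(0) = 1, \quad \int_{\mathbb{R}^2} e^{-\|s\|}\,ds = 2\pi
  \;\Longrightarrow\;
  \kappa = \frac{a}{a + 2\pi}, \qquad a = \frac{2\pi\,\kappa}{1-\kappa}.
\end{align*}
```

Every $\kappa \in [0, 1)$ is reached by some $a \in [0, \infty)$, which is the sense in which the exponential benchmark with $c = c_n$ replicates all weak-correlation limits.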
Weak Correlation: Additional Results
• Convergence of first $q$ eigenvectors of $M\Sigma(c_0)M$ to limiting eigenfunctions $\phi_j: S \mapsto \mathbb{R}$ in appropriate sense
• Analogous results to those above for t-statistics with ‘fixed-b’ kernel variance estimators
$$\hat\sigma^2 = n^{-1}\sum_{l,\ell} k\!\left(\frac{\|s_l - s_\ell\|}{b}\right)\hat u_l \hat u_\ell$$
with cv obtained from distribution in i.i.d. model, as in Bester, Conley, Hansen, and Vogelsang (2016)
⇒ generic size control under weak correlation again only with uniform $g$
Size Control Under Mixtures I
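• Consider Gaussian model and parametric covariance matrices $u \mid s \sim N(0, \Sigma(\theta))$, $\theta \in \Theta$.
Seek easy-to-compute conditions such that t-statistic is valid in nonparametric class of models
$$u \mid s \sim N\Big(0, \int \Sigma(\theta)\,dF(\theta)\Big) \text{ for all cdfs } F$$
• Suppose $\hat\sigma^2 = u'WW'u$, cv and $\Sigma_0$ are such that
$$P(|\tau| > \mathrm{cv}) \le \alpha \text{ under } y \sim N(0, \Sigma_0)$$
Can one find inequalities relating $\Sigma(\theta)$, $\theta \in \Theta$ to $\Sigma_0$ such that also
$$P(|\tau| > \mathrm{cv}) \le \alpha \text{ under } y \sim N\Big(0, \int \Sigma(\theta)\,dF(\theta)\Big)$$
for any cdf $F$ on $\Theta$?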
Size Control Under Mixtures II
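Theorem: With $W^0 = (l, W)$, let $\Omega_0 = W^{0\prime}\Sigma_0 W^0$ and $\Omega(\theta) = W^{0\prime}\Sigma(\theta)W^0$. Suppose $A_0 = D(\mathrm{cv})\Omega_0$ with $D(\mathrm{cv}) = \mathrm{diag}(1, -\mathrm{cv}^2 I_q)$ is diagonalizable, and let $P$ be its eigenvectors. Define $A(\theta) = P^{-1}D(\mathrm{cv})\Omega(\theta)P$ and $\bar A(\theta) = \tfrac{1}{2}(A(\theta) + A(\theta)')$, and suppose $A_0$ and $A(\theta)$, $\theta \in \Theta$ are scale normalized such that $\lambda_1(A_0) = \lambda_1(A(\theta)) = 1$, where $\lambda_j(\cdot)$ is the $j$th largest eigenvalue. Let
$$\nu_1(\theta) = \lambda_q(-\bar A(\theta)) - \lambda_1(\bar A(\theta))\,\lambda_q(-A_0) - (\lambda_1(\bar A(\theta)) - 1)$$
$$\nu_i(\theta) = \lambda_{q+1-i}(-\bar A(\theta)) - \lambda_1(\bar A(\theta))\,\lambda_{q+1-i}(-A_0) \quad \text{for } i = 2, \dots, q$$
If $\inf_{\theta \in \Theta}\sum_{i=1}^{j}\nu_i(\theta) \ge 0$ for all $1 \le j \le q$, then for any cdf $F$ on $\Theta$, $P(|\tau| > \mathrm{cv}) \le \alpha$ under $y \sim N(0, \Sigma_0)$ implies
$$P(|\tau| > \mathrm{cv}) \le \alpha \text{ under } y \sim N\Big(0, \int \Sigma(\theta)\,dF(\theta)\Big)$$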
Implications of Mixture Theorem
1. Spatial context
• Apply SCPC in U.S. states “light” and uniform designs
• Use theorem to numerically establish size control in arbitrary mixtures of Matérn-class covariance matrices with $\nu \in \{1/2, 3/2, 5/2, \infty\}$ and $c > 0$ such that implied $\bar\rho \le \bar\rho_0$
2. Regular time series
• Consider equal-weighted cosine (EWC) t-statistic of Müller (2004, 2007), Lazarus et al. (2018) with critical value computed from benchmark AR(1) model with coefficient $1 - c_0$
• Class of spectral densities $f$ where $f(\lambda)/f_0(\lambda)$ is (weakly) monotonically increasing in $|\lambda|$
⇒ Can be written as mixture $f(\lambda \mid F) \propto \int \mathbf{1}[|\lambda| > \theta]\,f_0(\lambda)\,dF(\theta)$
⇒ Theorem implies small sample and asymptotic size control (cf. results in Dou (2019))
Intuition for Theorem
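• With $\Omega(\theta) = W^{0\prime}\Sigma(\theta)W^0$ and $X = (X_0, X_{1:q}')' \sim \Omega(\theta)^{1/2}Z \sim N(0, \Omega(\theta))$
$$P(\tau^2 > \mathrm{cv}^2) \to P\left(\frac{X_0^2}{X_{1:q}'X_{1:q}} > \mathrm{cv}^2\right) = P\big(X_0^2 - \mathrm{cv}^2 X_{1:q}'X_{1:q} > 0\big) = P\big(X'D(\mathrm{cv})X > 0\big)$$
$$= P\big(Z'\Omega(\theta)^{1/2}D(\mathrm{cv})\Omega(\theta)^{1/2}Z > 0\big) = P\Big(\sum_{i=0}^{q}\lambda_i(\theta)Z_i^2 > 0\Big) = P\Big(Z_0^2 > -\sum_{i=1}^{q}\frac{\lambda_i(\theta)}{\lambda_0(\theta)}Z_i^2\Big)$$
where $D(\mathrm{cv}) = \mathrm{diag}(1, -\mathrm{cv}^2 I_q)$ and $\lambda_i(\theta)$ are the eigenvalues of $\Omega(\theta)^{1/2}D(\mathrm{cv})\Omega(\theta)^{1/2}$, or, equivalently, of $D(\mathrm{cv})\Omega(\theta)$, with $\lambda_0(\theta) > 0$ the only positive eigenvalue
• Can show: $P\big(Z_0^2 > \sum_{i=1}^{q}\omega_i Z_i^2\big)$ is Schur convex in $\{\omega_i\}_{i=1}^{q}$
• Use majorization theory on eigenvalues of linear combinations of matrices (plus additional calculations) to obtain result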
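The eigenvalue representation in the chain of equalities above can be verified by simulation. In this sketch the matrix `omega` is an arbitrary positive definite example and `cv = 2.0` is a hypothetical critical value; neither comes from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
q, cv, n_sim = 4, 2.0, 400_000

B = rng.standard_normal((q + 1, q + 1))
omega = B @ B.T + np.eye(q + 1)                  # arbitrary positive definite Omega
D = np.diag([1.0] + [-cv ** 2] * q)              # D(cv) = diag(1, -cv^2 I_q)

# Eigenvalues of D(cv) Omega are real: one positive, q negative.
lam = np.sort(np.linalg.eigvals(D @ omega).real)[::-1]

# Direct simulation of P(X_0^2 - cv^2 X_{1:q}'X_{1:q} > 0) with X ~ N(0, Omega).
X = rng.standard_normal((n_sim, q + 1)) @ np.linalg.cholesky(omega).T
p_direct = np.mean(X[:, 0] ** 2 > cv ** 2 * np.sum(X[:, 1:] ** 2, axis=1))

# Same probability via P(Z_0^2 > -sum_i (lam_i / lam_0) Z_i^2) with Z ~ N(0, I).
Z = rng.standard_normal((n_sim, q + 1))
p_eigen = np.mean(Z[:, 0] ** 2 > -(Z[:, 1:] ** 2) @ (lam[1:] / lam[0]))
print(lam, p_direct, p_eigen)
```

The two estimates agree up to simulation error, so the rejection probability depends on $\Omega(\theta)$ only through the ratios $-\lambda_i(\theta)/\lambda_0(\theta)$, which is what makes a majorization argument applicable.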
Monte Carlo Comparison with Existing Methods
For each of the 48 contiguous U.S. states, draw 5 independent samples of 500 locations from density $g \in \{\text{uniform}, \text{light}\}$
• Gaussian benchmark model with $\sigma_u(s_l - s_\ell) = \exp(-c_0\,\|s_l - s_\ell\|)$, $c_0$ calibrated from $\bar\rho_0$
• Alternative methods
1. Conley (1999) with choice of bandwidth $b_n$ that minimizes size distortion
2. Fixed-b kernel variance estimator with $b = \max_{l,\ell}\|s_l - s_\ell\|$ (analog to Kiefer, Vogelsang and Bunzel (2000))
3. Sun and Kim (2012) with $q$ chosen according to their rule using true covariance function
4. Ibragimov and Müller (2010) cluster inference with $q = 4$ and $q = 9$ clusters
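Of the alternatives listed, the cluster approach is the simplest to sketch: compute the mean within each of $q$ clusters and apply a Student-t test with $q - 1$ degrees of freedom to the cluster means. How the clusters are formed is not stated in the talk; the regular grid over the bounding box used here, and the function name, are assumptions.

```python
import numpy as np

# Student-t 97.5% quantiles for q - 1 = 3 and q - 1 = 8 degrees of freedom.
T_CRIT_975 = {3: 3.182, 8: 2.306}

def cluster_t_stat(y, s, cells_per_side, mu0=0.0):
    """Cluster-mean t-test with q = cells_per_side**2 grid clusters (sketch).

    Assumes two-dimensional locations and that no grid cell is empty;
    an empty cell changes q and raises a KeyError in the lookup below.
    """
    y = np.asarray(y, dtype=float)
    s = np.asarray(s, dtype=float)
    lo, hi = s.min(axis=0), s.max(axis=0)
    cell = ((s - lo) / (hi - lo) * cells_per_side).astype(int)
    cell = np.minimum(cell, cells_per_side - 1)        # put the max point in the last cell
    labels = cell[:, 0] * cells_per_side + cell[:, 1]
    means = np.array([y[labels == k].mean() for k in np.unique(labels)])
    q = len(means)
    center = means.mean()
    se = means.std(ddof=1) / np.sqrt(q)
    tau = (center - mu0) / se
    cv = T_CRIT_975[q - 1]
    return tau, cv, (center - cv * se, center + cv * se)

rng = np.random.default_rng(0)
s = rng.uniform(size=(500, 2))
y = 1.0 + rng.standard_normal(500)
tau9, cv9, ci9 = cluster_t_stat(y, s, cells_per_side=3)   # q = 9
tau4, cv4, ci4 = cluster_t_stat(y, s, cells_per_side=2)   # q = 4
print(ci9, ci4)
```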
Monte Carlo Results
Conclusions
• New method to conduct spatial correlation robust inference that accounts for
— non-uniform distribution of locations
— (some) strong spatial correlations
• Potentially applicable also in network econometrics or more generally under given plausible form of
‘worst-case’ correlation structure