3.2.3 Geary's c

Geary's c is based on paired comparisons of the values of a geo-referenced variable X in neighbouring regions to measure spatial autocorrelation.

Geary's c with unstandardized weights $w^*_{ij}$:

$$C = \frac{(n-1)\sum_{i=1}^{n}\sum_{j=1}^{n} w^*_{ij}\,(x_i - x_j)^2}{2\,S_0 \sum_{i=1}^{n}(x_i - \bar{x})^2} \qquad (3.12)$$

with

$$S_0 = \sum_{i=1}^{n}\sum_{j=1}^{n} w^*_{ij}$$

Geary's c with standardized weights $w_{ij}$:

$$C = \frac{(n-1)\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i - x_j)^2}{2\,n \sum_{i=1}^{n}(x_i - \bar{x})^2} \qquad (3.13)$$

Range of Geary's c: [0, 2]. Expectation of C under independence (spatial randomness): E(C) = 1

Positive spatial autocorrelation: 0 ≤ C < 1

Negative spatial autocorrelation: 1 < C ≤ 2

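Example:

We show the calculation of Geary's c in the five-region example with the standardized weight matrix.

Standardized weight matrix and observation vector:

$$W = \begin{pmatrix} 0 & 1/2 & 1/2 & 0 & 0 \\ 1/3 & 0 & 1/3 & 1/3 & 0 \\ 1/3 & 1/3 & 0 & 1/3 & 0 \\ 0 & 1/3 & 1/3 & 0 & 1/3 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}, \qquad x = \begin{pmatrix} 8 \\ 6 \\ 6 \\ 3 \\ 2 \end{pmatrix}$$

Table 1: Weighted squared differences $w_{ij}(x_i - x_j)^2$

| Region | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 0 | (1/2)(8-6)² = 2 | (1/2)(8-6)² = 2 | 0 | 0 |
| 2 | (1/3)(6-8)² = 1 1/3 | 0 | (1/3)(6-6)² = 0 | (1/3)(6-3)² = 3 | 0 |
| 3 | (1/3)(6-8)² = 1 1/3 | (1/3)(6-6)² = 0 | 0 | (1/3)(6-3)² = 3 | 0 |
| 4 | 0 | (1/3)(3-6)² = 3 | (1/3)(3-6)² = 3 | 0 | (1/3)(3-2)² = 1/3 |
| 5 | 0 | 0 | 0 | 1·(2-3)² = 1 | 0 |

Sum of weighted squared differences: 20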

The sum of squared deviations from the mean has already been calculated with Moran's I (section 3.2.1):

$$\sum_{i=1}^{5}(x_i - \bar{x})^2 = 24$$

Geary's c with standardized weights $w_{ij}$ (n = 5):

$$C = \frac{(5-1)\cdot 20}{2\cdot 5\cdot 24} = \frac{80}{240} = 0.3333$$

0 ≤ C = 0.3333 < 1: positive spatial autocorrelation
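The calculation above can be retraced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `gearys_c` is an illustrative choice, not part of the lecture material. It implements the general form (3.12), which reduces to (3.13) for row-standardized weights because then S0 = n.

```python
def gearys_c(x, w):
    """Geary's c for observations x and a weight matrix w (list of rows)."""
    n = len(x)
    mean = sum(x) / n
    s0 = sum(sum(row) for row in w)  # sum of all weights
    # weighted squared differences between all pairs of regions
    num = sum(w[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n))
    # sum of squared deviations from the mean
    den = sum((xi - mean) ** 2 for xi in x)
    return (n - 1) * num / (2 * s0 * den)


# Five-region example: row-standardized weights and observation vector
W = [
    [0, 1/2, 1/2, 0, 0],
    [1/3, 0, 1/3, 1/3, 0],
    [1/3, 1/3, 0, 1/3, 0],
    [0, 1/3, 1/3, 0, 1/3],
    [0, 0, 0, 1, 0],
]
x = [8, 6, 6, 3, 2]

c = gearys_c(x, W)
print(round(c, 4))
```

The printed value should agree with the hand calculation of C up to rounding.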

Comparison between Moran's I and Geary's c:

Evaluating spatial autocorrelation with Moran's I and Geary's c leads to similar but not identical results.

Griffith (1987) notes that simulation experiments suggest that the inverse relationship between Moran's I and Geary's C is basically linear in nature. Departures from linearity are ascribed to differences in what each of the two indices measures: Geary's C deals with paired comparisons and Moran's I with covariations.

The relation between Moran's I and Geary's C can be compared by randomization experiments.

Figure: Relation between Moran's I and Geary's C for 20000 statistics generated using rook contiguity
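A randomization experiment of this kind can be sketched as follows. The setup is an assumption made for illustration and is not the one behind the figure: a 6 x 6 grid with binary rook contiguity, the values 1 to 36 permuted 500 times, and a fixed seed. For each permutation both statistics are computed, and the Pearson correlation of the resulting pairs summarizes how close the inverse relationship is to linear.

```python
import random


def rook_neighbours(rows, cols):
    """Binary rook-contiguity neighbour lists for a rows x cols grid."""
    nb = {}
    for r in range(rows):
        for c in range(cols):
            candidates = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            nb[r * cols + c] = [rr * cols + cc for rr, cc in candidates
                                if 0 <= rr < rows and 0 <= cc < cols]
    return nb


def moran_and_geary(x, nb):
    """Moran's I and Geary's C with unstandardized binary weights."""
    n = len(x)
    mean = sum(x) / n
    ss = sum((v - mean) ** 2 for v in x)
    s0 = sum(len(js) for js in nb.values())
    cross = sum((x[i] - mean) * (x[j] - mean) for i, js in nb.items() for j in js)
    sqdiff = sum((x[i] - x[j]) ** 2 for i, js in nb.items() for j in js)
    return n * cross / (s0 * ss), (n - 1) * sqdiff / (2 * s0 * ss)


def pearson(a, b):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


random.seed(42)                      # assumed seed, for reproducibility only
nb = rook_neighbours(6, 6)
values = list(range(1, 37))
pairs = []
for _ in range(500):
    random.shuffle(values)           # random reallocation of values to cells
    pairs.append(moran_and_geary(values, nb))

r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
print(round(r, 3))                   # a negative value indicates the inverse relation
```

Plotting the pairs in `pairs` as a scatter diagram would give a picture comparable in spirit to the figure.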

3.2.4 Getis-Ord G statistic

Getis and Ord (1992) suggested a somewhat different approach to measuring spatial association that uses a distance-based contiguity matrix. Neighbourhoods are defined by a critical distance d: all regions within the critical distance d from a spatial unit i are neighbours of that unit.

The Getis-Ord G statistic is designed for assessing overall spatial concentration. Its application is restricted to geo-referenced variables with positive values and a natural origin.

G statistic:

$$G(d) = \frac{\sum_{i=1}^{n}\sum_{j\neq i} w_{ij}(d)\,x_i x_j}{\sum_{i=1}^{n}\sum_{j\neq i} x_i x_j} \qquad (3.14)$$

The G statistic measures the sum of the products of each $x_i$ with the $x_j$ values within a distance d from i as a proportion of the total sum of all products $x_i x_j$, j≠i. A high G(d) provides evidence of global spatial clustering of high values ("hot spots"). A low value of G(d) occurs when low values cluster but may also emerge in the case of negative spatial autocorrelation.

high G(d): overall concentration of high attribute values

low G(d): lack of overall concentration of high attribute values

Range: 0 ≤ G ≤ 1

Weights of the binary matrix W(d):

$$w_{ij}(d) = \begin{cases} 1, & \text{if } d_{ij} \le d \\ 0, & \text{otherwise} \end{cases}$$

To allow a uniform use of the global and local Getis-Ord statistics (section 3.3.2), we set the elements of the main diagonal $w_{ii}$ equal to 1.

Note that the G statistic is not affected by this definition, because its sums run over j≠i only.

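Example:

[Figure: map of the five regions, labelled 1 to 5]

Distances between regions are measured by distances between their centres. In our five-region example, we impute the following distances between centres (in km):

| Region | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 0 | 6 | 5 | 11 | 14 |
| 2 | 6 | 0 | 4 | 5 | 8 |
| 3 | 5 | 4 | 0 | 7 | 10 |
| 4 | 11 | 5 | 7 | 0 | 3 |
| 5 | 14 | 8 | 10 | 3 | 0 |

The above table contains the entries of the distance matrix D:

$$D = \begin{pmatrix} 0 & 6 & 5 & 11 & 14 \\ 6 & 0 & 4 & 5 & 8 \\ 5 & 4 & 0 & 7 & 10 \\ 11 & 5 & 7 & 0 & 3 \\ 14 & 8 & 10 & 3 & 0 \end{pmatrix}$$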

We set the critical distance d equal to 7.5 kilometres. The spatial weight matrix W(d) corresponding to d = 7.5 reads:

$$W(d=7.5) = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix}$$

Because of the particular choice of the critical distance, W(d=7.5) is, apart from the main diagonal elements, identical to the "ordinary" first-order contiguity matrix W*.
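Deriving W(d) from the distance matrix is a simple thresholding step. The sketch below assumes the distance matrix D of the example; the function name `distance_band_weights` and its `diagonal` parameter are illustrative choices.

```python
def distance_band_weights(D, d, diagonal=1):
    """Binary weights: 1 if d_ij <= d, else 0; diagonal set to a fixed value."""
    n = len(D)
    return [[diagonal if i == j else int(D[i][j] <= d) for j in range(n)]
            for i in range(n)]


# Distances between region centres (in km) from the five-region example
D = [
    [0, 6, 5, 11, 14],
    [6, 0, 4, 5, 8],
    [5, 4, 0, 7, 10],
    [11, 5, 7, 0, 3],
    [14, 8, 10, 3, 0],
]

W_d = distance_band_weights(D, 7.5)
for row in W_d:
    print(row)
```

Calling the function with `diagonal=0` would give the first-order contiguity matrix W* of the example instead.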

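Observation vector x: $x' = (8 \;\; 6 \;\; 6 \;\; 3 \;\; 2)$

Calculation of the denominator of (3.14):

| Region | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | - | 8·6 = 48 | 8·6 = 48 | 8·3 = 24 | 8·2 = 16 |
| 2 | 6·8 = 48 | - | 6·6 = 36 | 6·3 = 18 | 6·2 = 12 |
| 3 | 6·8 = 48 | 6·6 = 36 | - | 6·3 = 18 | 6·2 = 12 |
| 4 | 3·8 = 24 | 3·6 = 18 | 3·6 = 18 | - | 3·2 = 6 |
| 5 | 2·8 = 16 | 2·6 = 12 | 2·6 = 12 | 2·3 = 6 | - |

Sum of products $x_i x_j$, j≠i: 476

Calculation of the numerator of (3.14):

| Region | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | - | 8·6 = 48 | 8·6 = 48 | 0 | 0 |
| 2 | 6·8 = 48 | - | 6·6 = 36 | 6·3 = 18 | 0 |
| 3 | 6·8 = 48 | 6·6 = 36 | - | 6·3 = 18 | 0 |
| 4 | 0 | 3·6 = 18 | 3·6 = 18 | - | 3·2 = 6 |
| 5 | 0 | 0 | 0 | 2·3 = 6 | - |

Sum of weighted products $w_{ij}(d)\,x_i x_j$, j≠i: 348

G statistic:

$$G = \frac{348}{476} = 0.7311$$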
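The two tables and the resulting ratio can be reproduced with the following minimal sketch. The function name `getis_ord_g` is an illustrative choice; the weight matrix and observation vector are those of the example.

```python
def getis_ord_g(x, W):
    """Global G(d): returns numerator, denominator and their ratio (j != i)."""
    n = len(x)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    num = sum(W[i][j] * x[i] * x[j] for i, j in pairs)  # weighted products
    den = sum(x[i] * x[j] for i, j in pairs)            # all products
    return num, den, num / den


# W(d = 7.5) and observation vector from the five-region example
W_d = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
x = [8, 6, 6, 3, 2]

num, den, G = getis_ord_g(x, W_d)
print(num, den, round(G, 4))
```

Because the diagonal is skipped explicitly, the result does not depend on whether the elements w_ii are set to 0 or 1.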

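Test for global spatial clustering

Null hypothesis H0: Lack of spatial concentration of attribute values

Alternative hypothesis H1: Spatial concentration of high attribute values

Expected value of G(d):

$$E[G(d)] = \frac{W}{n(n-1)} \qquad (3.16)$$

with

$$W = \sum_{i=1}^{n}\sum_{j\neq i} w_{ij}(d)$$

Test statistic:

$$Z(G) = \frac{G - E(G)}{\sqrt{Var(G)}} \;\overset{a}{\sim}\; N(0,1) \qquad (3.15)$$

Variance of G(d):

$$Var(G) = E(G^2) - [E(G)]^2 \qquad (3.17)$$

with

$$E(G^2) = \frac{1}{(m_1^2 - m_2)^2\, n(n-1)(n-2)(n-3)} \left( B_0 m_2^2 + B_1 m_4 + B_2 m_1^2 m_2 + B_3 m_1 m_3 + B_4 m_1^4 \right)$$

$$m_r = \sum_{i=1}^{n} x_i^r, \quad r = 1, 2, 3, 4$$

(rth non-centred moment of X multiplied by n)

$$B_0 = (n^2 - 3n + 3)\,S_1 - n S_2 + 3W^2$$

$$B_1 = -\left[(n^2 - n)\,S_1 - 2n S_2 + 3W^2\right]$$

$$B_2 = -\left[2n S_1 - (n+3)\,S_2 + 6W^2\right]$$

$$B_3 = 4(n-1)\,S_1 - 2(n+1)\,S_2 + 8W^2$$

$$B_4 = S_1 - S_2 + W^2$$

$$S_1 = \frac{1}{2}\sum_{i=1}^{n}\sum_{j\neq i} \left[w_{ij}(d) + w_{ji}(d)\right]^2$$

$$S_2 = \sum_{i=1}^{n} (w_{i\cdot} + w_{\cdot i})^2$$

with

$$w_{i\cdot} = \sum_{j\neq i} w_{ij}(d) \quad \text{and} \quad w_{\cdot i} = \sum_{j\neq i} w_{ji}(d)$$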

Example:

In order to test for global spatial clustering on the basis of the G statistic, we have to compute its expected value and variance.

Expected value of G(d):

Calculation of W:

| Region | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | - | 1 | 1 | 0 | 0 |
| 2 | 1 | - | 1 | 1 | 0 |
| 3 | 1 | 1 | - | 1 | 0 |
| 4 | 0 | 1 | 1 | - | 1 |
| 5 | 0 | 0 | 0 | 1 | - |

Sum of $w_{ij}$, j≠i: 12

$$E[G(d)] = \frac{W}{n(n-1)} = \frac{12}{5\,(5-1)} = \frac{12}{20} = 0.6$$

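$$S_1 = \frac{1}{2}\sum_{i=1}^{5}\sum_{j\neq i} \left[w_{ij}(d) + w_{ji}(d)\right]^2 = \frac{1}{2}\cdot 12 \cdot 2^2 = 24$$

Row sums of W(d):

$$w_{1\cdot} = \sum_{j\neq 1} w_{1j}(d) = 2, \quad w_{2\cdot} = 3, \quad w_{3\cdot} = 3, \quad w_{4\cdot} = 3, \quad w_{5\cdot} = 1$$

Column sums of W(d):

$$w_{\cdot 1} = \sum_{j\neq 1} w_{j1}(d) = 2, \quad w_{\cdot 2} = 3, \quad w_{\cdot 3} = 3, \quad w_{\cdot 4} = 3, \quad w_{\cdot 5} = 1$$

$$S_2 = \sum_{i=1}^{5} (w_{i\cdot} + w_{\cdot i})^2 = (2+2)^2 + (3+3)^2 + (3+3)^2 + (3+3)^2 + (1+1)^2 = 16 + 36 + 36 + 36 + 4 = 128$$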
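The quantities W, S1 and S2 depend only on the weight matrix, so they can be computed once and reused. A minimal sketch, assuming the W(d = 7.5) matrix of the example (the function name `weight_sums` is illustrative):

```python
def weight_sums(W):
    """Return W (sum of off-diagonal weights), S1 and S2 as defined for Var(G)."""
    n = len(W)
    off = [(i, j) for i in range(n) for j in range(n) if i != j]
    w_total = sum(W[i][j] for i, j in off)
    s1 = 0.5 * sum((W[i][j] + W[j][i]) ** 2 for i, j in off)
    row = [sum(W[i][j] for j in range(n) if j != i) for i in range(n)]  # w_i.
    col = [sum(W[j][i] for j in range(n) if j != i) for i in range(n)]  # w_.i
    s2 = sum((row[i] + col[i]) ** 2 for i in range(n))
    return w_total, s1, s2


W_d = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
n = len(W_d)

w_total, s1, s2 = weight_sums(W_d)
expected_g = w_total / (n * (n - 1))
print(w_total, s1, s2, expected_g)
```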

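Variance of G(d):

$$B_0 = (n^2 - 3n + 3)\,S_1 - n S_2 + 3W^2 = (5^2 - 3\cdot 5 + 3)\cdot 24 - 5\cdot 128 + 3\cdot 12^2 = 312 - 640 + 432 = 104$$

$$B_1 = -\left[(n^2 - n)\,S_1 - 2n S_2 + 3W^2\right] = -\left[(5^2 - 5)\cdot 24 - 2\cdot 5\cdot 128 + 3\cdot 12^2\right] = -(480 - 1280 + 432) = 368$$

$$B_2 = -\left[2n S_1 - (n+3)\,S_2 + 6W^2\right] = -\left(2\cdot 5\cdot 24 - (5+3)\cdot 128 + 6\cdot 12^2\right) = -(240 - 1024 + 864) = -80$$

$$B_3 = 4(n-1)\,S_1 - 2(n+1)\,S_2 + 8W^2 = 4(5-1)\cdot 24 - 2(5+1)\cdot 128 + 8\cdot 12^2 = 384 - 1536 + 1152 = 0$$

$$B_4 = S_1 - S_2 + W^2 = 24 - 128 + 12^2 = 40$$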

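Observation vector of the attribute variable X: $x' = (8 \;\; 6 \;\; 6 \;\; 3 \;\; 2)$

Moments of X multiplied by n:

$$m_1 = \sum_{i=1}^{n} x_i = 8 + 6 + 6 + 3 + 2 = 25$$

$$m_2 = \sum_{i=1}^{n} x_i^2 = 8^2 + 6^2 + 6^2 + 3^2 + 2^2 = 64 + 36 + 36 + 9 + 4 = 149$$

$$m_3 = \sum_{i=1}^{n} x_i^3 = 8^3 + 6^3 + 6^3 + 3^3 + 2^3 = 512 + 216 + 216 + 27 + 8 = 979$$

$$m_4 = \sum_{i=1}^{n} x_i^4 = 8^4 + 6^4 + 6^4 + 3^4 + 2^4 = 4096 + 1296 + 1296 + 81 + 16 = 6785$$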

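$$E(G^2) = \frac{1}{(m_1^2 - m_2)^2\, n(n-1)(n-2)(n-3)} \left( B_0 m_2^2 + B_1 m_4 + B_2 m_1^2 m_2 + B_3 m_1 m_3 + B_4 m_1^4 \right)$$

$$= \frac{1}{(25^2 - 149)^2 \cdot 5\,(5-1)(5-2)(5-3)} \left[ 104\cdot 149^2 + 368\cdot 6785 - 80\cdot 25^2\cdot 149 + 0\cdot 25\cdot 979 + 40\cdot 25^4 \right]$$

$$= \frac{1}{27189120}\,(2308904 + 2496880 - 7450000 + 15625000) = \frac{12980784}{27189120} = 0.477426$$

$$Var(G) = E(G^2) - [E(G)]^2 = 0.477426 - 0.6^2 = 0.117426$$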
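The chain of B coefficients, moments and E(G²) is easy to get wrong by hand, so a small function helps to check the arithmetic. The sketch below takes W, S1 and S2 as given inputs (the example values 12, 24 and 128); the function name `g_moments` is an illustrative choice.

```python
def g_moments(x, w_total, s1, s2):
    """Return E(G), E(G^2) and Var(G) from the formulas for the global G."""
    n = len(x)
    # m_r: r-th non-centred moment of X multiplied by n
    m1, m2, m3, m4 = (sum(v ** r for v in x) for r in (1, 2, 3, 4))
    w2 = w_total ** 2
    b0 = (n * n - 3 * n + 3) * s1 - n * s2 + 3 * w2
    b1 = -((n * n - n) * s1 - 2 * n * s2 + 3 * w2)
    b2 = -(2 * n * s1 - (n + 3) * s2 + 6 * w2)
    b3 = 4 * (n - 1) * s1 - 2 * (n + 1) * s2 + 8 * w2
    b4 = s1 - s2 + w2
    num = (b0 * m2 ** 2 + b1 * m4 + b2 * m1 ** 2 * m2
           + b3 * m1 * m3 + b4 * m1 ** 4)
    den = (m1 ** 2 - m2) ** 2 * n * (n - 1) * (n - 2) * (n - 3)
    e_g = w_total / (n * (n - 1))
    e_g2 = num / den
    return e_g, e_g2, e_g2 - e_g ** 2


x = [8, 6, 6, 3, 2]
e_g, e_g2, var_g = g_moments(x, w_total=12, s1=24, s2=128)
print(round(e_g, 4), round(e_g2, 6), round(var_g, 6))
```

Note that the denominator contains the factor (n-3), so the formula requires at least four regions.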

Test statistic:

$$z(G) = \frac{G - E(G)}{\sqrt{Var(G)}} = \frac{0.7311 - 0.6}{\sqrt{0.117426}} = \frac{0.1311}{0.342675} = 0.3826$$

Critical value (α = 0.05, one-sided test):

$z_{1-\alpha} = z_{0.95} = 1.6449$

Test decision:

z(G) = 0.3826 < z0.95 = 1.6449 => H0 is not rejected

Interpretation:

No global evidence for substantive spatial clustering of high-unemployment regions

Hint:

As the normal approximation requires a large sample size, the test on the Getis-Ord G statistic has only been performed here for illustrative purposes.
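The test decision can be checked numerically. The sketch below computes the z value and, as an addition not shown in the example, the one-sided p-value under the standard normal distribution via `math.erfc`. The input values are taken from the worked example; the function name `upper_tail_z_test` is illustrative.

```python
import math


def upper_tail_z_test(stat, expected, variance):
    """z value and upper-tail p-value under the N(0,1) approximation."""
    z = (stat - expected) / math.sqrt(variance)
    p = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z) for Z ~ N(0,1)
    return z, p


G = 348 / 476
z, p = upper_tail_z_test(G, expected=0.6, variance=0.117426)
reject_h0 = z > 1.6449  # critical value for alpha = 0.05, one-sided

print(round(z, 4), reject_h0)
```

As the hint above points out, with n = 5 the normal approximation is not reliable, so the p-value is illustrative only.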

3.3 Local indicators of spatial association (LISA)

While global spatial autocorrelation analysis aims at summarizing the strength of spatial dependencies by a single statistic, local spatial autocorrelation analysis focuses on the heterogeneity of spatial association over space. Instead of a single global statistic, location-specific statistics are provided.

Local indicators of spatial association (LISA) provide detailed information on spatial clustering (Anselin, 1995). The LISA for each observation gives an indication of the extent of substantial spatial clustering of similar values around that observation. Some LISA also have the property that their sum or average is proportional to the global counterpart.

LISA aim at identifying local clusters and spatial outliers. Local clusters are characterized by a concentration of high or low values of an attribute variable X. A spatial clustering of contiguous high-value regions is called a "hot spot", whereas a concentration of low-value regions defines a "cold spot". Both cases are associated with positive local autocorrelation. Spatial outliers are regions with a local orientation that is reversed compared to the predominant global one. When positive global spatial autocorrelation has been established, regions with negative local autocorrelation coefficients represent spatial outliers.

We deal with three well-known local indicators of spatial association,

- the local Moran statistic (Anselin, 1995),

- the Getis-Ord Gi statistic (Getis and Ord, 1992),

- the Getis-Ord Gi* statistic (Getis and Ord, 1992),

which complement one another with regard to the identification of spatial clusters and spatial outliers. The local Moran coefficient is suited to identifying spatial outliers and clustering in general, but not the specific type of cluster. For the latter purpose the Getis-Ord Gi and Gi* statistics have to be applied. They can distinguish between "hot spots" and "cold spots", both of which are characterized by high positive spatial autocorrelation.

3.3.1 Local Moran statistic

The local Moran statistic $I_i$ detects local spatial autocorrelation. The $I_i$'s are indicators of local instability. They decompose Moran's I into contributions for each location.

According to this property, local Moran statistics can be used for two purposes:

- Indicators of local spatial clusters,

- Diagnostics for outliers in global spatial patterns.

Local Moran statistic:

$$I_i = \frac{(x_i - \bar{x})\sum_{j=1}^{n} w_{ij}\,(x_j - \bar{x})}{\sum_{j=1}^{n} (x_j - \bar{x})^2 / n} \qquad (3.15)$$

Numerator

Determines the sign of $I_i$:

+, if both the ith region and its neighbouring regions have above-average or below-average values of the geo-referenced variable X

-, if the ith region has an above-average (below-average) value and the neighbouring regions have below-average (above-average) values of X

Denominator

Standardization of the cross-product by the variance $s_x^2$ of the geo-referenced variable X

Relationship between global and local Moran statistics:

The average of the $I_i$'s coincides with Moran's I:

$$I = \frac{1}{n}\sum_{i=1}^{n} I_i$$

Expected value (under independence):

$$E(I_i) = -\frac{w_{i\cdot}}{n-1} = -\frac{1}{n-1} \qquad (3.16)$$

with

$$w_{i\cdot} = \sum_{j=1}^{n} w_{ij} = 1$$

Note: Random permutation tests on local Moran's I statistics are available in programs like GeoDa and R (package spdep). Because of the high computational expense, the testing approach is introduced in the computer exercise. In the following example local Moran's I statistics are interpreted descriptively.
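To give an idea of what such a permutation test does, the following sketch holds the value of region i fixed, reallocates the remaining values to the other regions, and reports the share of reallocations whose local Moran statistic is at least as large as the observed one. It is not the GeoDa or spdep implementation: with only five regions all 4! = 24 reallocations can be enumerated exactly instead of being sampled, and the one-sided share used here is an assumption about the test direction. The data are those of the five-region example.

```python
from itertools import permutations


def local_moran(x, W, i):
    """Local Moran statistic I_i with row-standardized weights W."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((v - mean) ** 2 for v in x) / n
    lag = sum(W[i][j] * (x[j] - mean) for j in range(n))
    return (x[i] - mean) * lag / s2


def conditional_permutation_share(x, W, i):
    """Share of reallocations (x_i fixed) with I_i >= the observed I_i."""
    observed = local_moran(x, W, i)
    others = [v for k, v in enumerate(x) if k != i]
    count = total = 0
    for perm in permutations(others):
        trial = list(perm[:i]) + [x[i]] + list(perm[i:])
        total += 1
        if local_moran(trial, W, i) >= observed - 1e-12:
            count += 1
    return observed, count / total


W = [
    [0, 1/2, 1/2, 0, 0],
    [1/3, 0, 1/3, 1/3, 0],
    [1/3, 1/3, 0, 1/3, 0],
    [0, 1/3, 1/3, 0, 1/3],
    [0, 0, 0, 1, 0],
]
x = [8, 6, 6, 3, 2]

observed_5, share_5 = conditional_permutation_share(x, W, 4)  # region 5
print(round(observed_5, 4), share_5)
```

For realistic sample sizes full enumeration is infeasible, which is why the programs mentioned above draw random permutations.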

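Example:

We calculate local Moran statistics with the standardized weights $w_{ij}$.

Standardized weight matrix and observation vector ($\bar{x} = 5$):

$$W = \begin{pmatrix} 0 & 1/2 & 1/2 & 0 & 0 \\ 1/3 & 0 & 1/3 & 1/3 & 0 \\ 1/3 & 1/3 & 0 & 1/3 & 0 \\ 0 & 1/3 & 1/3 & 0 & 1/3 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}, \qquad x = \begin{pmatrix} 8 \\ 6 \\ 6 \\ 3 \\ 2 \end{pmatrix}$$

Expected value:

$$E(I_i) = -\frac{w_{i\cdot}}{n-1} = -\frac{1}{n-1} = -\frac{1}{5-1} = -0.25$$

The sum of squared deviations from the mean has already been calculated with Moran's I (section 3.2.1):

$$s_x^2 = \frac{1}{n}\sum_{j=1}^{5}(x_j - \bar{x})^2 = \frac{1}{5}\cdot 24 = 4.8$$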

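● Region 1:

Weighted sum of deviations from the mean:

$$\sum_{j=1}^{5} w_{1j}(x_j - \bar{x}) = \frac{1}{2}(6-5) + \frac{1}{2}(6-5) = 1$$

Local Moran statistic:

$$I_1 = \frac{(8-5)\cdot 1}{4.8} = \frac{3}{4.8} = 0.6250$$

● Region 2:

$$\sum_{j=1}^{5} w_{2j}(x_j - \bar{x}) = \frac{1}{3}(8-5) + \frac{1}{3}(6-5) + \frac{1}{3}(3-5) = \frac{2}{3}$$

Local Moran statistic:

$$I_2 = \frac{(6-5)\cdot(2/3)}{4.8} = \frac{2}{14.4} = 0.1389$$

● Region 3:

$$\sum_{j=1}^{5} w_{3j}(x_j - \bar{x}) = \frac{1}{3}(8-5) + \frac{1}{3}(6-5) + \frac{1}{3}(3-5) = \frac{2}{3}$$

$$I_3 = \frac{(6-5)\cdot(2/3)}{4.8} = \frac{2}{14.4} = 0.1389$$

● Region 4:

$$\sum_{j=1}^{5} w_{4j}(x_j - \bar{x}) = \frac{1}{3}(6-5) + \frac{1}{3}(6-5) + \frac{1}{3}(2-5) = -\frac{1}{3}$$

$$I_4 = \frac{(3-5)\cdot(-1/3)}{4.8} = \frac{2}{14.4} = 0.1389$$

● Region 5:

$$\sum_{j=1}^{5} w_{5j}(x_j - \bar{x}) = 1\cdot(3-5) = -2$$

$$I_5 = \frac{(2-5)\cdot(-2)}{4.8} = \frac{6}{4.8} = 1.2500$$

Moran's I = average of the local Moran statistics:

$$I = \frac{1}{5}\sum_{i=1}^{5} I_i = (0.6250 + 0.1389 + 0.1389 + 0.1389 + 1.2500)/5 = 2.2917/5 = 0.4583$$

[Section 3.2.1: I = 0.4583 (with standardized weights)]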

Interpretation:

- Local spatial clustering is identified around region 5 and, to a somewhat lesser extent, around region 1, as both I5 and I1 noticeably exceed the global Moran's I value.

- Since all Ii values exceed their expected value, no outlying region with respect to orientation is identified.
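The five local statistics, their average and the comparison with the expected value can be reproduced with the following minimal sketch. It assumes the row-standardized weight matrix and observation vector of the example; the function name `local_moran_all` is illustrative.

```python
def local_moran_all(x, W):
    """Local Moran statistics I_i for all regions (row-standardized W)."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((v - mean) ** 2 for v in x) / n  # variance of X
    result = []
    for i in range(n):
        lag = sum(W[i][j] * (x[j] - mean) for j in range(n))
        result.append((x[i] - mean) * lag / s2)
    return result


W = [
    [0, 1/2, 1/2, 0, 0],
    [1/3, 0, 1/3, 1/3, 0],
    [1/3, 1/3, 0, 1/3, 0],
    [0, 1/3, 1/3, 0, 1/3],
    [0, 0, 0, 1, 0],
]
x = [8, 6, 6, 3, 2]
n = len(x)

I_local = local_moran_all(x, W)
I_global = sum(I_local) / n          # average of the local statistics
expected = -1 / (n - 1)              # E(I_i) for row-standardized weights
# regions whose I_i falls below the expected value (candidate outliers)
below_expected = [i + 1 for i, v in enumerate(I_local) if v < expected]

print([round(v, 4) for v in I_local], round(I_global, 4), below_expected)
```

An empty `below_expected` list corresponds to the second point of the interpretation.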

3.3.2 Getis-Ord G statistics

The Getis-Ord Gi and Gi* statistics are local measures of spatial concentration. They indicate the extent of spatial clustering of high values ("hot spots") or low values ("cold spots") of an attribute variable X around region i.

As with the global G statistic, contiguity is defined by distance bands.

The Gi and Gi* statistics differ in excluding or including observation i in the summation. While observation i is excluded in Gi, it is included in the computation of Gi* (Getis and Ord, 1992).

Gi statistic:

$$G_i = \frac{\sum_{j\neq i} w_{ij}(d)\,x_j}{\sum_{j\neq i} x_j} \qquad (3.16)$$

Gi* statistic:

$$G_i^* = \frac{\sum_{j=1}^{n} w_{ij}(d)\,x_j}{\sum_{j=1}^{n} x_j} \qquad (3.17)$$

Expected values of Gi and Gi*:

$$E(G_i) = \frac{W_i}{n-1} \qquad (3.18) \qquad \text{with} \qquad W_i = \sum_{j\neq i} w_{ij}(d) \qquad (3.19)$$

$$E(G_i^*) = \frac{W_i^*}{n} \qquad (3.20) \qquad \text{with} \qquad W_i^* = \sum_{j=1}^{n} w_{ij}(d) \qquad (3.21)$$

Local spatial concentration of high values ("hot spots"):

Values of Gi and Gi* above their expected values

Local spatial concentration of low values ("cold spots"):

Values of Gi and Gi* below their expected values

Note: Getis and Ord (1995) also provide standardized Gi and Gi* statistics that are asymptotically normally distributed. The normal test is even valid for sample sizes as low as eight when the underlying distribution is not too skewed. For small samples, however, the random permutation test is preferable. The testing approaches are available in GeoDa and R (package spdep). In the following example the Gi and Gi* statistics are interpreted descriptively.

Example:

We calculate the local Getis-Ord statistics Gi and Gi* for the five regions by using the spatial weights matrix

$$W(d=7.5) = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix}$$

which is defined in section 3.2.4 (global G statistic) for a distance band of 7.5 kilometres.

As the denominator of (3.17) does not vary across regions, it has to be calculated only once using the entries of the observation vector $x' = (8 \;\; 6 \;\; 6 \;\; 3 \;\; 2)$:

Denominator of (3.17):

$$\sum_{j=1}^{5} x_j = 8 + 6 + 6 + 3 + 2 = 25$$

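Region 1:

Gi statistic:

$$G_1 = \frac{\sum_{j\neq 1} w_{1j}(d=7.5)\,x_j}{\sum_{j\neq 1} x_j} = \frac{1\cdot 6 + 1\cdot 6}{6 + 6 + 3 + 2} = \frac{12}{17} = 0.7059$$

$$E(G_1) = \frac{\sum_{j\neq 1} w_{1j}(d=7.5)}{n-1} = \frac{1 + 1 + 0 + 0}{5-1} = \frac{2}{4} = 0.5$$

G1 > E(G1): Tendency of spatial concentration of high values around region 1 (hot spot)

Gi* statistic:

$$G_1^* = \frac{\sum_{j=1}^{n} w_{1j}(d=7.5)\,x_j}{\sum_{j=1}^{n} x_j} = \frac{1\cdot 8 + 1\cdot 6 + 1\cdot 6}{8 + 6 + 6 + 3 + 2} = \frac{20}{25} = 0.8$$

$$E(G_1^*) = \frac{\sum_{j=1}^{n} w_{1j}(d=7.5)}{n} = \frac{1 + 1 + 1 + 0 + 0}{5} = \frac{3}{5} = 0.6$$

G1* > E(G1*): Tendency of spatial concentration of high values (hot spot): region 1 and surrounding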

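Region 2:

Gi statistic:

$$G_2 = \frac{\sum_{j\neq 2} w_{2j}(d=7.5)\,x_j}{\sum_{j\neq 2} x_j} = \frac{1\cdot 8 + 1\cdot 6 + 1\cdot 3}{8 + 6 + 3 + 2} = \frac{17}{19} = 0.8947$$

$$E(G_2) = \frac{\sum_{j\neq 2} w_{2j}(d=7.5)}{n-1} = \frac{1 + 1 + 1 + 0}{5-1} = \frac{3}{4} = 0.75$$

G2 > E(G2): Tendency of spatial concentration of high values around region 2 (hot spot)

Gi* statistic:

$$G_2^* = \frac{\sum_{j=1}^{n} w_{2j}(d=7.5)\,x_j}{\sum_{j=1}^{n} x_j} = \frac{1\cdot 8 + 1\cdot 6 + 1\cdot 6 + 1\cdot 3}{8 + 6 + 6 + 3 + 2} = \frac{23}{25} = 0.92$$

$$E(G_2^*) = \frac{\sum_{j=1}^{n} w_{2j}(d=7.5)}{n} = \frac{1 + 1 + 1 + 1 + 0}{5} = \frac{4}{5} = 0.8$$

G2* > E(G2*): Tendency of spatial concentration of high values (hot spot): region 2 and surrounding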

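Region 3:

Gi statistic:

$$G_3 = \frac{\sum_{j\neq 3} w_{3j}(d=7.5)\,x_j}{\sum_{j\neq 3} x_j} = \frac{1\cdot 8 + 1\cdot 6 + 1\cdot 3}{8 + 6 + 3 + 2} = \frac{17}{19} = 0.8947$$

$$E(G_3) = \frac{\sum_{j\neq 3} w_{3j}(d=7.5)}{n-1} = \frac{1 + 1 + 1 + 0}{5-1} = \frac{3}{4} = 0.75$$

G3 > E(G3): Tendency of spatial concentration of high values around region 3 (hot spot)

Gi* statistic:

$$G_3^* = \frac{\sum_{j=1}^{n} w_{3j}(d=7.5)\,x_j}{\sum_{j=1}^{n} x_j} = \frac{1\cdot 8 + 1\cdot 6 + 1\cdot 6 + 1\cdot 3}{8 + 6 + 6 + 3 + 2} = \frac{23}{25} = 0.92$$

$$E(G_3^*) = \frac{\sum_{j=1}^{n} w_{3j}(d=7.5)}{n} = \frac{1 + 1 + 1 + 1 + 0}{5} = \frac{4}{5} = 0.8$$

G3* > E(G3*): Tendency of spatial concentration of high values (hot spot): region 3 and surrounding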

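Region 4:

Gi statistic:

$$G_4 = \frac{\sum_{j\neq 4} w_{4j}(d=7.5)\,x_j}{\sum_{j\neq 4} x_j} = \frac{1\cdot 6 + 1\cdot 6 + 1\cdot 2}{8 + 6 + 6 + 2} = \frac{14}{22} = 0.6364$$

$$E(G_4) = \frac{\sum_{j\neq 4} w_{4j}(d=7.5)}{n-1} = \frac{0 + 1 + 1 + 1}{5-1} = \frac{3}{4} = 0.75$$

G4 < E(G4): Tendency of spatial concentration of low values around region 4 (cold spot)

Gi* statistic:

$$G_4^* = \frac{\sum_{j=1}^{n} w_{4j}(d=7.5)\,x_j}{\sum_{j=1}^{n} x_j} = \frac{1\cdot 6 + 1\cdot 6 + 1\cdot 3 + 1\cdot 2}{8 + 6 + 6 + 3 + 2} = \frac{17}{25} = 0.68$$

$$E(G_4^*) = \frac{\sum_{j=1}^{n} w_{4j}(d=7.5)}{n} = \frac{0 + 1 + 1 + 1 + 1}{5} = \frac{4}{5} = 0.8$$

G4* < E(G4*): Tendency of spatial concentration of low values (cold spot): region 4 and surrounding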

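Region 5:

Gi statistic:

$$G_5 = \frac{\sum_{j\neq 5} w_{5j}(d=7.5)\,x_j}{\sum_{j\neq 5} x_j} = \frac{1\cdot 3}{8 + 6 + 6 + 3} = \frac{3}{23} = 0.1304$$

$$E(G_5) = \frac{\sum_{j\neq 5} w_{5j}(d=7.5)}{n-1} = \frac{0 + 0 + 0 + 1}{5-1} = \frac{1}{4} = 0.25$$

G5 < E(G5): Tendency of spatial concentration of low values around region 5 (cold spot)

Gi* statistic:

$$G_5^* = \frac{\sum_{j=1}^{n} w_{5j}(d=7.5)\,x_j}{\sum_{j=1}^{n} x_j} = \frac{1\cdot 3 + 1\cdot 2}{8 + 6 + 6 + 3 + 2} = \frac{5}{25} = 0.2$$

$$E(G_5^*) = \frac{\sum_{j=1}^{n} w_{5j}(d=7.5)}{n} = \frac{0 + 0 + 0 + 1 + 1}{5} = \frac{2}{5} = 0.4$$

G5* < E(G5*): Tendency of spatial concentration of low values (cold spot): region 5 and surrounding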
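The region-by-region calculations follow one pattern, so they can be condensed into a single function. The sketch below assumes W(d = 7.5) with unit diagonal and the observation vector of the example; the function name `local_g` and the `include_self` switch (False for Gi, True for Gi*) are illustrative choices.

```python
def local_g(x, W, i, include_self=False):
    """Return (statistic, expected value) for G_i or, with include_self, G_i*."""
    idx = [j for j in range(len(x)) if include_self or j != i]
    stat = sum(W[i][j] * x[j] for j in idx) / sum(x[j] for j in idx)
    expected = sum(W[i][j] for j in idx) / len(idx)  # W_i/(n-1) or W_i*/n
    return stat, expected


W_d = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
x = [8, 6, 6, 3, 2]

gi = [local_g(x, W_d, i) for i in range(len(x))]
gi_star = [local_g(x, W_d, i, include_self=True) for i in range(len(x))]
# descriptive classification: above expectation = hot spot, below = cold spot
labels = ["hot" if g > e else "cold" for g, e in gi]

for region, ((g, e), (gs, es), label) in enumerate(zip(gi, gi_star, labels), 1):
    print(region, round(g, 4), e, round(gs, 4), es, label)
```

The classification is purely descriptive, as in the example; it does not replace the standardized tests mentioned in the note.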
