the special nature of spatial data - sage publications inc · the special nature of spatial data 7...

[14:18 14/8/2008 5187-Fotheringham-Ch02.tex] Paper: a4 Job No: 5187 Fotheringham: Spatial Analysis (Handbook) Page: 4 4–21

2The Special Nature of

Spatial Data

R o b e r t H a i n i n g

This chapter describes some of the specialor distinguishing features of spatial dataopening the way to methodological issuesthat will be treated in more depth in laterchapters. The use of the term ‘special’should not be taken to imply that no othertypes of data possess these features. Spatialdata analysis is a sub-branch of the moregeneral field of quantitative data analysisand has sometimes suffered from not payingsufficient attention to that fact. Many of thedata properties that will be encountered arefound in other types of (non-spatial) data butwhen found in spatial data, may possess aparticular structure or properties may arise inparticular combinations.

The chapter will first define what is meantby spatial data and then identify properties.It will be helpful, in order to put structure onthis discussion, to distinguish ‘fundamental’properties of spatial data from propertiesthat are due to the chosen representation of

geographical space and from properties thatare a consequence of measurement processesby which data are collected for the purposeof storage in the spatial data matrix (SDM).The SDM is what the analyst works with.We conclude by considering the implicationsof these properties for the methodology ofspatial data analysis.

Geographic Information Science (GISc)is the generic label that is frequently used,particularly by geographers, to define thearea of science that involves the analysisof spatially referenced data – that is datawhere each case has some form of locationalco-ordinate attached to it. Data is the lynchpin in the process of “doing science” andit is essential that methodologies for spatialdata analysis are tuned to the properties ofspatial data.

The science undertaken with spatial datais usually ‘observational’ rather than ‘experi-mental’. This is important. Much spatial data


THE SPECIAL NATURE OF SPATIAL DATA 5

are not collected under controlled situations.We often cannot choose the values ofindependent variables in order to generate asatisfactory experimental design. There is noreplication (in order, for example, to assessthe effects of measurement error) and theanalyst must take the world as he or shefinds it. There may be further problems inspecifying what the appropriate locationalco-ordinate is when studying certain typesof processes and outcomes. All this hasimplications for the quality of spatial data andfor the methodologies that can be employed.We worry not only about the quality of ourdata but exactly what it is we are observingin any given situation. A consequence of thisis that much of the data collected may beused to build a model of the situation understudy which can then be used to estimateparameters and test hypotheses. We shallsee that some of the fundamental propertiesof spatial data raise major problems inthis regard.

2.1. SPATIAL DATA AND THEIRPROPERTIES

A spatial datum comprises a triple ofmeasurements. One or more attributes (X)are measured at a set of locations (i) at time t,where t may be a point or interval of time.So, if k attributes are measured at n locationsat time t, we can present the spatial data inthe form:

{xj (i; t) ; j = 1, . . ., k; i = 1, . . ., n}. (2.1)

Equation (2.1) expresses in shorthand muchof the content of the SDM. The record ofwhen the observation was taken (t) may besuppressed if analysis is concerned with onlya single time period but may be retainedif there are to be a series of comparativestudies through time or if different attributeswere recorded at different times and theanalyst needs to be aware of this. Such

data may come from a variety of differentsources including national censuses; publicor private agency records (e.g., nationalhealth service, police force areas, consumersurveys); and satellite imagery; environmen-tal surveys; and primary surveys. The datamay be collected from a census or froma sampling process. For the purposes ofanalysis data from different sources may berequired. Studies in environmental epidemi-ology utilise health, demographic, socio-economic and environmental data. Thesedata may come with differing degrees ofquality and may not all be collected onthe same areal framework (Brindley et al.,2005).

To understand the properties of spatialdata we need to understand the relationshipbetween equation (2.1) and the ‘real world’from which the data are taken. In order toundertake data analysis the complexity of thereal world must be captured in finite formthrough the processes of conceptualizationand representation (Goodchild, 1989; Guptilland Morrison, 1995; Longley et al., 2001).We shall focus here only on the issuesassociated with capturing spatial variation,but the reader should note that there areconceptualization and representation issuesassociated with the way attributes and timeare captured as well.

The first step in this process, whichultimately leads to the construction of theSDM, involves conceptualizing the geogra-phy of the real world. There are two viewsof the geographical world in GISc – thefield and the object views. The field viewconceptualizes space as covered by surfaceswith the attribute varying continuously acrossthe space. This is particularly appropriate formany types of environmental and physicalattributes. The object view conceptualizesspace as populated by well-defined indivisi-ble objects, a view that is particularly appro-priate for many types of social, economic andother types of data that refer to populations.Objects are conceptualized as points, lines orpolygons.

These two views constitute models ofthe real world. In order to reduce a field


6 THE SAGE HANDBOOK OF SPATIAL ANALYSIS

to a finite number of bits of data thenthe surface may be represented using afinite number of sample points at whichthe attribute is recorded or it may berepresented using a raster grid. Pixels are laiddown independently of the underlying fieldand its surface variation. Alternatively, thesurface may be represented by polygons thatpartition the space into areas with uniformcharacteristics (e.g., vegetation zones). Howwell any field is captured by these differentrepresentations will depend on the density ofthe points or the size of the raster in relation tosurface variability. There is a large theoreticaland empirical literature on the efficienciesof different spatial sampling designs – forexample the properties of random, systematicand stratified random sampling given thenature of variation in the surface to besampled (see, e.g., Cressie, 1991; Ripley,1981). The process of discretizing in thisway involves a loss of information on surfacevariability.

This loss of detail on variability also ariseswhen selecting a representation based onthe object view. A city may comprise manyhouseholds (points) but for confidentialityreasons information about households isaggregated into spatially defined groups(polygons) – output areas in the case ofthe 2001 UK census, enumeration districtsprior to 2001 (Martin, 1998). Again aggre-gation into polygons involves a loss ofinformation. There may be a further loss ofinformation in capturing the polygon itselfin the database. It may be captured usinga representative point (such as its centroid)and its spatial relationship to other polygonscaptured using a neighbourhood weightsmatrix.

The conceptualization of a geographicspace as a field or as an object islargely dictated by the attribute. However,representation – the process by whichinformation about the geography of thereal world is made finite using geomet-ric constructs – involves making choices(Martin, 1999). These choices include thesize and configuration of polygons, thelocation and density of sample points.

2.1.1. Fundamental properties

Fundamental properties are inherent to thenature of attributes as they are distributedacross the earth’s surface. There is a fun-damental continuity (structure) to attributesin space that derives from the underlyingprocesses that shape the human and phys-ical geographical world. We shall discussexamples of these processes in section 2.2.2.The geographical world would be a strangeplace if levels of attributes changed suddenlyand randomly as we moved from one pointin space to another close by. Continuity isalso a fundamental property of attributesobserved in time. If we know the levelof an attribute at one position in space(time) we can make an informed estimateof its level at adjacent locations (pointsin time). The information that is carriedin a piece of data about an attribute ata given location provides information onwhat the level of the attribute is at nearbylocations. However as distance increases thenthe similarity of attribute values weakens andin the GISc literature this is often referredto as Tobler’s First Law of Geography(‘…near things are more related than distantthings’). Although Tobler’s First Law isclearly an oversimplification, and in relationto some types of spatial variation justplain wrong, it is nonetheless a usefulaphorism.

Testing for spatial autocorrelation wasone of the high-profile research agendas ingeography during the quantitative revolution.Geographers adapted spatial autocorrelationstatistics based on the join-count statistic,the cross product statistic and the squareddifference statistic that had been developedfor quantifying spatial structure on regularareal frameworks (grids). These statisticswere developed to test for statisticallysignificant spatial autocorrelation on irregularareal frameworks (Cliff and Ord, 1973). Thenull hypothesis (no spatial autocorrelation)was assessed against a non-specific alter-native hypothesis (spatial autocorrelation ispresent). We shall see how this argument wasdeveloped in later years with the introduction



and use by geographers of models for spatialvariation.

In the earth sciences, dealing principallywith point data from surfaces, the quan-tification of structure was based on theuse of the empirical semi-variogram whichuses a squared difference statistic (Isaaksand Srivastava, 1989). The advantage ofthe latter route was that it led naturally tomodel specification and model fitting usingtheoretical semi-variograms. Of course thesequantitative measures and tests of hypothesisdepend on the scale of analysis. That is, theydepend on the size of the polygons in termsof which data are reported, the inter-pointdistance between samples on a continuoussurface. Thus the chosen representation hasan important influence on the quantificationof this fundamental property and henceits presence within any spatial dataset. Ifsamples are taken at sufficient distances apartthe level of spatial autocorrelation is likely tobe much reduced relative to the case wheresamples are taken close together.

Autocorrelation statistics are also usedto capture temporal structure in attributevalues but there are important differenceswith the spatial situation. Time has a naturaluni-directional flow (from past to present)whereas space has no such order. The twodimensional nature of space means thatdependency structures might vary not justwith distance but direction too giving riseto anisotropic dependency structures withstructure along the north–south axis differingfrom the east–west axis. The presence ofspatial autocorrelation, that attribute valuesare not statistically independent, has funda-mental implications for the conduct of spatialanalysis.

Spatial autocorrelation, in statistical terms,is a second order property of an attributedistributed in geographic space. In additionthere may be a mean or first-order componentof variation represented by a linear, quadratic,cubic (etc.) trend. We can think of theseas two different scales of spatial variationalthough the distinction may be hard to makeand quantify in practice. As Cressie (1991)remarks: ‘What is one person’s (spatial)

covariance may be another persons meanstructure’ (p. 25). It has often been remarkedthat spatial variation is heterogeneous. Thistype of decomposition (plus a white noiseelement to capture highly localized hetero-geneity) is one way of formally capturing thatheterogeneity using what are termed ‘global’models. Another approach is to only analyzespatial subsets, that is allow model structureto vary locally.

2.1.2. Properties due to thechosen representation

We have already noted that the extent towhich our data retains fundamental propertiesdepends on the chosen representation. Wenow turn to look at other properties thatstem directly or indirectly from the chosenrepresentation.

Representing spatial variation using poly-gons is employed in many branches ofscience that handle spatially referenceddata. Two of the generic consequences ofworking with data aggregates are: intra-areal unit heterogeneity and inter-areal unitheteroscedasticity.

Whether the data refer to a continu-ously varying phenomenon (field view) oraggregations of individuals like households(object view) the effect of bundling data intospatial aggregates has the effect of smoothingvariation. In the case of environmental dataand the use of pixels then the degree ofsmoothing will clearly depend on the size ofthe pixels. The larger the pixels the greaterthe degree of smoothing. A non-intrinsicpartition, where the polygons are defined interms of attribute variability with the aimof maximizing within unit homogeneity andmaximizing between-unit heterogeneity willnot produce this effect to the same extent.This second process shares common groundwith the process of regionalization – to whichit is sometimes compared.

Intra-unit heterogeneity is a particularproblem for many types of social sciencedata particularly in those cases where area



boundaries are chosen arbitrarily as was thecase with the UK census for example priorto 2001. Attributes reported for an area mayrepresent percentages or means of attributevalues associated with the individuals (peopleor households) that have been aggregated andthe analyst may have no information on thevariability around the mean. If an ecologicalor contextual attribute is calculated for anarea (social capital say, or area deprivation)again the calculation is conditional on thechosen representation and the scale of thepartition.

One of the conclusions that might be drawnfrom this is that it is better to have small arealaggregates rather than large ones. Assumingspatial structure, a reasonable suppositiongiven the discussion in section 2.1.1, thensmaller areas should be more homogeneousthan larger areas and their mean valuesshould be more representative of their area’spopulation. But such spatial precision comesat the cost of statistical precision. Data errorsor small random fluctuations in numbersof events (household burglaries; diseaseoutcomes) will have a big effect on thecalculation of rates when populations aresmall. Take the case of a standardizedmortality ratio. If the expected count issmall, for example 2.0, then the ratio itself(observed count divided by the expectedcount) rises or falls by 0.5 with eachaddition or subtraction of a single case. Thiswill have implications for determining thestatistical significance of counts – whetherthere are significantly more cases than wouldbe expected on the basis of chance alone. Itwill also have implications for determiningthe statistical significance of differences incounts between areas which in turn raisesproblems for the detection of significantcrime hotspots or disease clusters.

In summary, there is a trade-off that islinked to the number of individual elementsin a polygon. A polygon containing fewindividuals will tend to be more homo-geneous but statistical quantities, such asrates and ratios, tend to be unreliable inthe sense that small errors and random

fluctuations can impact severely on thecalculated values. Polygons containing manyindividuals will generate robust rates andratios but often conceal much higher levelsof internal heterogeneity.

In practice an area is sometimes partitionedinto polygons of varying size and this canyield a secondary effect on data properties.A rate calculated for a polygon wherethe denominator attribute is small has alarger variance than a rate computed for apolygon where the denominator attribute islarge. Moreover there is a mean-variancedependence in the rate statistics. Take thecase where the denominator is the number ofhouseholds (n(i)). Rates are observed countsof some attribute (number of burglaries) inpolygon i(O(i)) divided by the number ofhouseholds. It follows from the binomialmodel for O(i) that:

E [O(i)/n(i)] = (1/n (i)) E [O(i)] = p (i) ;

Var [O(i)/n (i)] = (1/n (i))2 Var [O(i)]

= p(i)(1 − p(i))/n(i)(2.2)

where E[…] and Var[…] denote mean andvariance and p(i) is the probability thatany individual in area i (e.g., number ofhouseholds) has the characteristic (e.g., beenburgled) that is being counted. The meanand the variance in equation (2.2) areclearly not independent. It also follows fromequation (2.2) that the standard error of theestimate of the rate p(i) which is:

[ p(i) (1 − p(i)) /n(i)]1/2

is inversely related to the number ofhouseholds. It follows that any real spatialvariation in rates could be confounded byvariation in n(i) (the number of households)or alternatively spatial variation in rates couldbe an artifact of any spatial structure in



n(i) (see Gelman and Price, 1999 who giveexamples from disease mapping in the USA).

Standardized ratios provide an estimate ofthe true but unknown area-specific relativerisk of the selected disease under theassumption of an independent Poisson modelfor the observed counts. It follows from theproperties of the Poisson distribution thatthe standard error of the standardized ratiois O(i)1/2/E(i). Using a normal approxima-tion for the sampling distribution of thestandardized ratio, SR(i), approximate 95%confidence intervals can be computed:

SR (i) ± 1.96[O(i)1/2/E(i)

].

However there are problems here when mak-ing comparisons. The standard error tendsto be large for areas with small populationsand small for areas with large populationsbecause of the effect of population size onE(i). So extreme ratios tend to be associatedwith small populations but ratios that aresignificantly different from 1.0 tend to beassociated with areas with large populations(Mollie, 1996).

These examples are intended to illustratethe way in which data properties canbe induced by the chosen representation.In certain circumstances the geographicalstructure of the representation (for examplethe geography of which areas have largeand which have small denominator values)could induce a geographical structure on thestatistics which when mapped could then giverise to a misleading impression about trendsor patterns in the data.

2.1.3. Properties due tomeasurement processes

The final step in the creation of theSDM involves obtaining measurements onthe attributes of interest given the chosenrepresentation.

Data quality can be assessed in terms offour characteristics: accuracy, completeness,consistency and resolution. As noted above,a spatial datum comprises a triple ofmeasurements: the attributes, location andtime. Thus the quality of each of thesethree measurements needs to be assessedagainst the four characteristics. What is ofinterest here, however, is how measurementproblems might introduce certain proper-ties into the data (Guptill and Morrison,1995).

A common assumption in error analysisis that attribute errors are independent. Thisis likely to hold less often in the caseof spatial data. Location error may leadto overcounts in one area and undercountsin adjacent areas because the source ofthe overcount is the set of nearby areasthat have lost cases as a result of thelocation error. So, count errors in adjacentareas may be negatively correlated (Haining,2003, pp. 67–70). Location error can beintroduced into a spatial data set as a resultof having to put data, collected on differentspatial frameworks, onto a common spatialframework. Areal interpolation methods areused but these are based on assumptionsabout how attributes are distributed withinareal units and these assumptions oftencannot be tested. The consequence is thatfurther levels and patterns of error areintroduced into the database (Cockings et al.,1997).

In the case of remotely sensed data,the values recorded for any pixel are notin one-to-one relationship with an area ofland on the ground because of the effectsof light scattering. The form of this errordepends on the type and age of the hardwareand natural conditions such as sun angle,geographic location and season. The pointspread function quantifies how adjacent pixelvalues record overlapping segments of theground so that the errors in adjacent pixelvalues will be positively correlated (Forster,1980). The form of the error is analogousto a weak spatial filter passed over thesurface so that the structure of surface



variation, in relation to the size of the pixelunit, will influence the spatial structure oferror correlation. Linear error structures alsoarise in remotely sensed data (Short, 1999).Finally, we note that the effects of errorpropagation may further complicate errorproperties when arithmetic or cartographicoperations are carried out on the dataand source errors are compounded andtransformed via these operations (Hainingand Arbia, 1993).

Data incompleteness may induce falsepatterns in spatial data. Data incompletenessrefers to the situation where there are missingdata points or values or where there are underor overcounts arising from the reportingprocess. ‘Spatially uniform’ data incomplete-ness raises problems for analysis but spatialvariation in the level of data incompletenesswith, for example, undercounting 6 moreserious in some parts of the study area thanothers can seriously affect comparative workand the interpretation of spatial variation.Missing or inaccurately located cases in apoint pattern of events may result in failureto detect a local cluster of cases (Kulldorff,1998).

Incompleteness in cancer data leads toforms of under or overcounting which giverise to spatial variation that is an artifactof how the data were collected. In thecase of official crime statistics geographicaldifferences between large counties in Eng-land may be due to differences in policeinvestigative and reporting practices. Onthe intra-urban scale, burglaries in suburbanareas will, on the whole, be well reportedfor insurance purposes, but in some innercity areas there may be under reportingeither because there is no ‘incentive’ orbecause of fear of reprisals. The Censusprovides essential denominator data forcomputing small area rates. However refusalsto cooperate can lead to undercounting andthe 1991 Census in the UK was thoughtto have undercounted the population by asmuch as 2% because of fears that its datawould be used to enforce the new local‘poll tax’. Inner city areas show higher levelsof undercounting than suburban areas where

populations are easier to track. Finally, sincethere are 10 year gaps between successivecensuses, population in- and out-flows inmany areas may be such as to preserve theessential socio-economic and demographiccharacteristics of the areas. On the other handsome areas of a city, especially inner-cityareas, may experience population mobilityand redevelopment which result in markedshifts that have implications for the reliabilityof the data in the years following theCensus.

Finally, in the case of some imagery, someareas of the image may be obscured becauseof cloud cover. A distinction should be drawnbetween data that are ‘missing at random’from data that are missing because of somereason linked to the nature of the populationor the area. Weather stations temporarilyout of action because of equipment failureproduce data missing at random. On the otherhand mountainous areas will tend to sufferfrom cloud cover more than adjacent plainsand there will be systematic differences inland use between such areas. This distinctionhas implications for how successfully miss-ing values can be estimated and whether theresults of data analysis will be biased becausesome component of spatial variation isunobservable.

Figure 2.1 provides a summary of thepoints raised in this section.

2.2. IMPLICATIONS OF DATAPROPERTIES FOR THE ANALYSISOF SPATIAL DATA

In this section we turn to a consideration ofthe implications of the properties of spatialdata for the conduct of spatial analysis. Againwe shall simply introduce ideas which willbe taken up in more detail in later chapters.We divide this section into situations wherespatial properties can be exploited to helpsolve problems and situations where spatialproperties introduce complications for theconduct of data analysis.

texreader

Text Box

AQ : Short 1999 not listed in refernce. Please check.

texreader

Highlight



Figure 2.1 Processes involved in constructing the spatial data matrix and the dataproperties that are present or introduced at each stage.

2.2.1. Taking advantage of spatialdata properties to tackle problems

Consider the following problems:

• Samples of attribute values have been takenacross an area. The analyst would like toconstruct a map to describe surface variationusing the information contained in the sample.Perhaps instead the analyst just wishes toestimate the surface at a point, or set of points,where no sample has been taken and estimatethe prediction error.

• A spatial database has been assembled butthe database contains data that are ‘missing atrandom’ in the sense that there are no underlyingreasons (such as suppression or confidentiality)why the particular values are missing. The analystwants to estimate these missing values.

In both these cases we might expect toexploit some formalized version of the notionthat data points near together in space carryinformation about each other. Both of theseexamples constitute a form of the spatialinterpolation problem and solutions such askriging exploit the spatial structure inherentin the surface as well as the configurationof the sample points to provide an estimateof surface values together with an estimateof the prediction error (Isaak and Srivastava,

1989). It is intuitive that any solution thatdid not use the information contained in thelocation co-ordinates of sample data valueswould be considered an inefficient solution.

Consider another group of problems:

• Aggregated data are obtained on race(black/white) and voting behaviour (did vote/didnot vote). Counts in the 2 × 2 table are knownbut the real interest lies in the voting behaviourat the constituency level.

• Unemployment estimates have been obtainedfrom a survey for each of a number of smallareas in a region. The small area estimatorsare unbiased but, because of small samplesizes have low precision. Conversely the regionwide estimator has high precision, but as anestimate for any of the small area levels ofunemployment is biased. A similar situationarises when estimating relative risk levels acrossthe small areas of a larger region using thestandardized mortality ratio.

In both these cases there is again an oppor-tunity to exploit some formalized version ofthe notion that data points nearby in spacecarry information about each other. Onesolution is to ‘borrow information’or ‘borrowstrength’ so that the low precision of smallarea estimates are raised by using data fromnearby areas (Mollie, 1996; King, 1997).



These nearby areas provide additional data(helping to improve precision) and becausethey are nearby should reflect an underlyingsituation that is close to the small area inquestion so will not introduce a serious levelof bias.

2.2.2. Where spatial dataproperties introduce complicationsfor data analysis

Spatial analysis is often called upon toaddress scientific questions relating to out-comes (numbers of cases of a disease, dis-tribution of house prices, regional economicgrowth rates) that are a consequence ofprocesses that by their nature are spatial.Haining (2003) identifies four generic groupsof spatial processes. A diffusion process isone where some attribute is taken up bya population so that at any point in timesome individuals have the attribute (e.g., aninfectious disease) and some do not. If thediffusion process operates in ways that areconstrained by distance then there is likelyto be spatial structure in the geography ofthose who do and those who do not havethe attribute in question. An exchange andtransfer or mixing process is one whereplaces become similar in attribute values(per capita income; employment) as a resultof flows of goods or services that bindtheir economic fortunes together or wherepatterns of movement and mixing perhaps atdifferent scales introduce a measure of spatialhomogeneity into structures. A third type ofspatial process is an interaction process inwhich outcomes at one location (e.g., theprice of a commodity) are observed and asa result of the competition effect influenceoutcomes (prices) at another location. Finally,there is a dispersal process in whichindividuals spread across space (such as thedispersal of seeds around a parent plant)so that counts reflect the geography of thedispersal mechanism.

These generic spatial processes – processesthat operate in geographic space – generate

data where spatial structure emerges as a fun-damental property of the data. Process shapesor at least influences attribute variation andthe resulting data that are collected possessdependency structures that reflect the way theprocess plays out across geographic space.

Not all processes of interest are ‘spatial’in the sense described above. Many of theprocesses of interest to geographers playout across geographic space in response tothe place-based characteristics of areas (theparticular mix of attributes they possess)and the spatial relationships between thoseareas. Outcomes in places (whether forexample economic, social, epidemiologicalor criminological) are not necessarily merelythe consequence of the properties of thoseplaces – as places – but may also be theconsequence of relational and contextualinfluences. The distance between places;the difference between adjacent places interms of relevant attributes; the overallconfiguration of places across a region, areall facets of relation and context that mayimpact on outcomes and modify the role of‘place’ in influencing outcomes. Two placesmay be identical in terms of their place-based characteristics but differ significantlyin terms of their relational and contextualattributes with neighbouring areas and thesedifferences may explain why (for exam-ple) two similarly affluent neighbourhoodsexperience quite different levels of assaultand robbery; why two similarly deprivedneighbourhoods experience quite differentlevels of health outcomes.

We now examine briefly how these fea-tures of how attribute values are generatedimpact on the choice of methodology forthe purpose of data analysis. We distinguishbetween exploratory spatial data analysis andmodel based forms of analysis that allowhypothesis testing and parameter estimation.

Exploratory spatial data analysisExploratory data analysis (EDA) comprises acollection of visual and numerically resistanttechniques for summarizing data properties,detecting patterns in data, identifying unusual



or interesting features in data including pos-sible data errors and formulating hypotheses.Exploratory spatial data analysis (ESDA)undertakes these activities with respect tospatial data so that cases can be located ona map and the spatial relationships betweencases assumes importance because they carryinformation that is likely to be relevant tothe analysis (Cressie, 1984; Haining et al.,1998; Fotheringham and Charlton, 1994). Itis important to be able to answer questionssuch as: ‘where does that subset of cases onthe scatterplot or that subset of cases on theboxplot, occur on the map?’ ‘What are thespatial patterns and spatial associations in thisgeographically defined subset of the map?’ Inthe case of regression modelling do the largepositive residuals, for example, cluster in onearea of the map?

ESDA and the software that supportsESDA needs to be able to handle the spatialindex and be able to handle the specialqueries that arise because of the spatial refer-encing of the data. Thus the map becomes anessential visualization tool (Dorling, 1992).The linkage between a map window and othergraphics windows, so that cases can be simul-taneously highlighted in more than one win-dow, becomes an essential part of the conductof ESDA (Andrienko and Andrienko, 1999;Monmonier, 1989).

Visualizing spatial data raises particularproblems, in part because of some of theproperties discussed in earlier parts of thischapter. We highlight two here. First, it hasbeen noted that data values, particularly ratesand ratios, may not be strictly comparablebecause standard errors are population sizedependent. So if areas vary substantiallyin terms of population counts (used asthe denominators for a rate) then extremevalues and even patterns detected by visualinspection might be associated with thateffect rather than real differences betweenareas. Second, areas that partition a regionmight be very different in physical size.This may mean that the viewer of a maphas their attention drawn to certain areas ofthe cartographic display (those areas withphysically large spatial units) whilst other

areas are ignored. This may be particularlyimportant if in fact it is the small areasthat have the larger populations so that itis their rates and ratios (rather than therates and ratios associated with the physicallylarger but less densely populated areas) thatare the more robust. One solution to thisproblem is to use cartograms so that areasare transformed in physical extent to reflectsome underlying attribute such as populationsize (Dorling, 1994). This comes at a costbecause the individual areas in the resultingcartogram may be hard for the analyst toplace. There may be a need for a second,conventional, map linked to the cartogram,so the analyst can highlight areas on thecartogram and see where they are on theconventional map.

Conventional visualization technology isoften based on the assumption that alldata values are of equal status so thatthe viewer can extract information fromvisual displays without worrying about thestatistical comparability of the data valuesthat are displayed. This assumption maybreak down when dealing with spatiallyaggregated data (Haining, 2003).

Model fitting and hypothesis testingIf n data values are spatially autocorrelatedthen one of the consequences of this for theapplication of standard statistical inferenceprocedures is that the information contentof the data set is less than would be thecase if the n values were independent. Thismeans that the degrees of freedom availablefor testing hypotheses is not a simple functionof n. We shall take the example of testing forsignificant bivariate correlation between twovariables to illustrate this point.

Suppose n pairs of observations,{(x(i), y(i))}i are drawn from a bivariatenormal distribution. Pearson’s productmoment correlation coefficient (r) is thestatistic used to measure the associationbetween X and Y . If the observations on thetwo variables are independent (there is nospatial autocorrelation in either X or Y ), thenif the null hypothesis is of no association



between X and Y then a test statisticis given by:

(n − 2)1/2 |r|(

1 − r2)−1/2

(2.3)

which is t distributed with (n − 2) degrees offreedom.

These distributional results do not hold ifX and Y are spatially correlated. The problemis that when spatial autocorrelation is presentthe variance of the sampling distribution of r,which is a function of the number of pairsof observations n, is underestimated by theconventional formula which treats the pairsof observations as if they were independent.The effect of spatial autocorrelation on testsof significance have been extensively studied(for reviews see Haining, 1990, 2003) andshown to be very severe when both X and Yhave high levels of spatial autocorrelation.

Clifford and Richardson (1985) obtain anadjusted value for n(n′) which they call the‘effective sample size’. This value, n′, canbe interpreted as measuring the equivalentnumber of independent observations so thatthe solution to the problem lies in choosingthe conventional null distribution based on n′rather than n. An approximate expression forthis quantity is:

n′ = 1 + n2 (trace

(RxRy

))−1 (2.4)

where Rx and Ry are the estimated spatialcorrelation matrices for X and Y respectively.(For a discussion of estimators see Haining,1990, pp.118–120.) The null hypothesis of noassociation between X and Y is rejected if:

(n′ − 2

)1/2 |r|(

1 − r2)−1/2

(2.5)

exceeds the critical value of the t distributionwith (n′ − 2) degrees of freedom.

This illustrates a general problem. Sincethe n observations are positively spatially

autocorrelated, the information content of thesample is over-estimated if n is used – itneeds to be deflated. The sampling varianceof statistics are underestimated leading theanalyst to reject the null hypothesis whenno such conclusion is warranted at thechosen significance level. For the effectsof spatial dependency on the analysisof contingency tables see, for example,Upton and Fingleton (1989) and Cerioli(1997).

To make further progress in understandingthe importance of spatial data properties andthe complications they introduce we needto introduce models for spatial variation –or data generators for spatial variation.Such models are important. By specifying amodel to represent the variation in the data(including the spatial variation), the analystis able to construct tests of hypothesis withgreater statistical power than is possible iftesting is against a non-specific alternative.There are a number of possible formalmodels for spatial variation of which thesimultaneous spatial autoregressive (SAR),the conditional spatial autoregressive (CAR)and the moving average (MA) models areprobably the best known. We will brieflylook at the first two but the interestedreader will need to follow up the liter-ature to gain a fuller understanding ofthese models and their properties (Whittle,1954; Besag, 1974, 1975, 1978; Ripley,1981; Cressie, 1991; Haining, 1978, 1990,2003).

A multivariate normal CAR model whichsatisfies the first order (spatial) Markovproperty and thus might be thought of as thesimplest departure from spatial independencecan be written as follows (Besag, 1974;Cressie, 1991, p. 407):

E[X(i) = x(i)

∣∣ {X( j) = x( j)}

j∈N(i)

]

= µ +∑

j∈N(i)

τ w(i, j) [X( j) − µ] ,

i = 1, . . ., n (2.6)

texreader

Highlight

texreader

Text Box

AQ : Upton and Fingleton 1989 not listed in refernce. Please check.



and:

Var[X(i) = x(i)

∣∣ {X( j) = x( j)}

j∈N(i)

]= σ 2,

i = 1, . . ., n

where E[… | .] and Var[… | .] denote con-ditional expectation and variance respec-tively, µ is a first-order parameter and τ

is the spatial interaction parameter. TheMarkov property means observations areconditionally independent given the valuesat neighbouring sites. {w(i, j)} denotes theneighbourhood structure of the system ofareas and w(i, j) = 1 if i and j are neigh-bours ( j ∈ N(i)) and w(i, i) = 0 for all i.W is the n × n matrix of {w(i, j)} and issometimes called the connectivity matrix. Itis a requirement that τ lies between (1/ωmin)and (1/ωmax) where ωmin and ωmax are thesmallest and largest eigenvalues of W. Fora fuller introduction to the Markov propertyfor spatial data including how to constructhigher-order spatial Markov models see, forexample, Haining (2003, pp. 297–299). Thisapproach allows the construction of a hier-archy of models of increasing complexity.As noted in Haining (2003), however, theMarkov property does not have the naturalappeal it has in the case of time series,because space has no natural ordering. Sothe neighbourhood structure can often seemrather arbitrary especially in the case of thenon-regular areal frameworks used to reportCensus and other social and economic data.

If the analyst of regional data does notattach importance to satisfying a Markovproperty another option is available calledthe SAR model specification. A form of thismodel was first introduced into statistics byWhittle (1954). Let e be independent normalIN(0, σ 2I) where I is the identity matrixand e(i) is the variable associated with sitei(i = 1, . . ., n). Define the expression:

X (i) = µ +∑

j∈N(i)

ρw (i, j) [X( j) − µ]

+ e(i), i = 1, . . ., n. (2.7)

where ρ is a parameter. The bounds on ρ areset by the largest and smallest eigenvaluesof W just as in the case of the CAR model.This is the model most often seen in thespatial analysis and regional science literaturealthough the reason for its hegemony is farfrom clear and seems to be largely basedon a combination of historical accident (inthe sense that time series modelling precededspatial data modelling and methods weretransferred across) and subsequent ‘lock-in’.

These models can be embedded into,for example, regression models either asadditional covariates (as in the case of equa-tion (2.7)) or as models for the error structurewhere the errors (in practice the residuals)are tested and found to show evidence ofspatial autocorrelation (Anselin, 1988; Ord,1975). It is well known that fitting regressionmodels by ordinary least squares when errorsare spatially (positively) autocorrelated givesrise to some damaging consequences. First,although we shall obtain consistent estimatesof the regression parameters (there may besome small sample bias), the sampling vari-ance of these estimates may be inflated com-pared with methods that take account of thespatial autocorrelation in the errors. Second,if the usual least squares formula for the sam-pling variances of these regression estimatesis applied, the variances will be seriouslyunderestimated. The formulae are no longervalid and conventional F and t tests ofhypothesis are also not valid. We shall take avery simple example to illustrate these points,where the parameter to be estimated and testsof hypothesis relate to a constant mean µ.

Suppose n independent observations {x(i)}are drawn from a N (µ, σ 2) distribution. Thesample mean, x̄, is an unbiased estimator forµ, and the variance of the sample mean is:

Var (x̄) = σ 2/n. (2.8)

If σ 2 is unknown then it is estimated by:

s2 = (1/ (n − 1))∑

i=1, ..., n

(x(i) − x̄)2 (2.9)

texreader

Highlight

texreader

Highlight

texreader

Text Box

AQ :Ord 1975 not listed in refernce. Please check.



so that:

Var (x̄) = (1/n (n − 1))∑

i=1, ..., n

(x(i) − x̄)2 .

(2.10)

If the n observations are not independentthen although the sample mean is stillunbiased as an estimator of µ, assuming eachx(i) has the same variance (σ 2), the varianceof the sample mean is (see for exampleHaining, 1988, p. 575):

Var (x̄) = σ 2/n +(

2/n2)

×∑

i

∑j(i<j)

Cov (x(i), x( j))

(2.11)

where Cov(x(i), x( j)) denotes the spatialautocovariance between x(i) and x( j). So, ifthere is positive spatial dependence and σ 2

is known then σ 2/n underestimates the truesampling variance of the sample mean. If σ 2

is unknown and is estimated by equation (2.9)then if there is positive spatial dependencethe expected value of s2 is (see, for example,Haining, 1988, p. 579):

E[s2

]= σ 2 − [(2/n(n − 1))

×∑

i

∑j(i< j)

Cov (x(i), x( j))]

(2.12)

so that equation (2.9) is a downward biasedestimate of σ 2. This further compounds theunderestimation of the sampling variance.

Modified methods to take accountof spatial dependence are often basedon the following argument (see, forexample, Haining, 1988). Assume the dataxT = (x(1), . . ., x(n)), where T denotes thetranspose, are drawn from a multivariatenormal spatial model with mean vector

given by µ1 and n by n variance–covariancematrix � = σ 2V given, say, by one of themodels described above. (In the case of theCAR model (2.6), V = (I − τW)−1.) Thelog likelihood for the data is:

− (n/2) ln 2πσ 2 − (1/2) ln |V| −(

1/2σ 2)

× (x − µ1)T V−1 (x − µ1) (2.13)

where 1 is a column vector of 1’s and |V|denotes the determinant of V. For simplicitywe assume V is known. The maximumlikelihood estimator of µ is:

µ̃ =(

1TV−11)−1 (

1TV−1x). (2.14)

The estimator (2.14) is the best linearunbiased estimator (BLUE) of µ. Note thatin the case of independence V = I (theidentity matrix with 1’s down the diagonaland zeros elsewhere) and equation (2.14)reduces to the sample mean. In the caseV �= I two modifications to the samplemean are occurring. First, the denominatorfor positive spatial dependence will be lessthan n. Second, the presence of V−1 in thenumerator of equation (2.14) downweightsthe contribution of any attribute x(i) which ishighly correlated with other attribute values{x( j)} – that is, where x(i) is part of a clusterof observations.

The variance of µ̃ is:

Var[µ̃] = σ 2(1TV−11)−1 (2.15)

which reduces to σ 2/n if V = I.Since the sample mean is an unbiased

estimator of µ, one modification is to replaceequation (2.8) with equation (2.15). The term(1T V−1 1) is proportional to Fisher’s infor-mation measure (Haining, 1988, p. 586). Itidentifies the information about µ containedin an observation. Now equation (2.9) is not



the maximum likelihood estimator for σ 2.This is given by:

σ̃ 2 = n−1(x − µ1)T V−1(x − µ1). (2.16)

A further refinement is to replace equa-tion (2.9) with equation (2.16) substitutingthe sample mean for µ in equation (2.16)where V−1 plays a role equivalent to thesecond term in the right-hand side ofequation (2.11).

The general results given byequations (2.11) and (2.12) are whyadjustments to conventional methods areneeded. The evidence suggests that it is theeffect of the second term on the right-handside of equation (2.11) that is the moreserious, at least in the usual situationof positive spatial dependence, and thatone way to deal with this is to adjust nin equation (2.8) thereby increasing thesampling variance of the sample mean. Thesize of the adjustment to n will be sensitiveto the estimates of the spatial autocorrelationin the data or, if a spatial model is fitted tothe data, the choice of model. The problem isfurther complicated if, as is usually the case,V is not known and so must be estimatedfrom the data.

Before leaving the normal model it isimportant to note that aggregated spatialdata may violate another of the statisticalassumptions of least squares regression. Itwas remarked in section 2.1 how rates andratios based on areas with very differentpopulation counts will have different stan-dard errors. It follows that the assumption ofhomoscedasticity (or constant error variance)is likely to be violated when developingmodels to explain how rates or ratiosvary over a region. Data transformations orweighted least squares estimators are usedto address these problems (Haining, 1990,pp. 49–50) but such adjustments may needto be implemented whilst also addressingthe problems created by residual spatialautocorrelation (Haining, 1991). In additionto the problems created by failure to satisfy

statistical assumptions, spatial data oftencreate ‘data-related’ problems in regressionmodelling (Haining, 1990, pp. 332–333). Forexample, the fit of a trend surface modelcan be influenced by the configuration ofthe sample data points on the surface where,as a result of the particular distribution,certain values have high leverage (Unwin andWrigley, 1987); the particular shape of thestudy region may also influence the trendsurface model fit (Haining, 1990, p. 372).These and other issues are reviewed inHaining (1990, pp. 40–50).

We conclude this section by remarkingon the implications of intra-area and inter-area spatial dependency and intra-area het-erogeneity when modelling a discrete valuedresponse variable such as the count of thenumber of cases of a disease across a regionusing the Poisson model. Spatial dependencyand heterogeneity are important causes ofoverdispersion. For example consider a localdiffusion process in which individuals aremore likely to be infected if they are closeto someone already infected. The result isthat counts of the number of cases willreveal Poisson overdispersion because therewill be areas with large counts (due to thelocal infection process) and areas with zerocounts where the process has not yet started.These considerations require the analyst bothto carry out tests for overdispersion andwhere necessary take appropriate action.The effects of overdispersion in generalizedlinear modelling are rather similar to thosedescribed for the normal model when spatialautocorrelation is detected. If overdispersionis present, ignoring it tends to have littleimpact on point estimates of the regressionparameters (the maximum likelihood estima-tor is consistent, although some small samplebias might be present). However, standarderror estimates for regression parameters areunderestimated. Type I errors associated withthe model are underestimated which is par-ticularly problematic in relation to predictorsthat are close to the significance threshold.If the objective is to build a parsimoniousmodel, the presence of overdispersion mayresult in an analyst constructing a model



more complicated than necessary, and thatoverestimates the variance explained.

Ways of tackling this problem may dependon the reasons for the overdispersion.A conventional approach is through theuse of a variance inflation factor (Dobson,1999). Where the cause is inter-area spa-tial autocorrelation then a discrete valued‘auto-model’ may be used which is analogousto equation (2.6) (see Besag, 1974). Morerecently attention has focused on the useof spatial random effects models usingCAR models fitted using WinBUGS (Lawet al., 2006). These models allow foroverdispersion through the random effectsterm. This is an area of current researchin spatial modelling since the developmentof good modelling tools for discrete valuedresponse variables has rather taken a backseat whilst attention for many years hasfocused – perhaps disproportionately – on thenormal model (Law and Haining, 2004).

2.3. DRAWING INFERENCES

One of the main purposes of undertaking spa-tial statistical analysis is to make populationinferences on the basis of the data collected.In concluding this chapter we consider someof the inference pitfalls associated with theanalysis of spatial data.

What is the population about whichinferences are made in an observationalscience? If data are point samples from acontinuous surface then the population mightbe the surface itself. Of course the realizedsurface may be thought of as only oneof many possible realizations (the rest nothaving been observed). However, with orwithout the concept of a ‘superpopulation’of surfaces, making inferences from pointsamples to the (realized) surface populationdoes represent a legitimate target. Thisargument is less convincing when the datarepresent a complete census – for example thedata refer to areas and a complete (or nearlycomplete) enumeration has been carriedout. What is the population about which

inferences are being made now? A frequentanswer to this is that the underlying processis stochastic (chance is an inherent part of theprocess) so that inferences are directed at theprocess (its parameters and covariates) ratherthan the map. The problem with this is thatwe have access to only one realization of theprocess and in order to give our inferencessome broader validity other assumptions needto be invoked such as that this realizationis representative of the underlying process.There may be no way to test such anassumption.

The modifiable areal units problem(MAUP) reminds us that results obtainedfrom analyzing aggregate data are dependenton the particular scale of the partition, and,at the given scale, the particular boundariesused. In general statistical relationshipsbetween attributes are stronger the largerthe spatial aggregates because variancesare reduced. Boundary shifts can influencewhether or not disease clusters or crime hotspots are detected at any scale because ifboundaries happen to cut through the middleof a cluster this may dilute the effect overtwo or more areas.

The analysis of aggregated data is par-ticularly problematic and not just becauseof the MAUP. It is important to rememberthat conclusions drawn from aggregate datacan only be transferred to the individuallevel under certain conditions. The ecologicalfallacy is the uncritical transfer of findingsat the group level to the individual level. Asthe famous example cites, the suicide ratein Germany in the 17th century may havebeen larger in areas with higher percentagesof Catholics but that does not mean Catholicswere more prone to commit suicide thanProtestants. Quite the reverse as individuallevel data revealed. Aggregation bias raisesserious problems for epidemiological studiesbased on aggregate data and is one reasonwhy it is considered the weakest of thedifferent methodologies for assessing dose–response relationships – even though thismay be the only realistic way of obtainingreasonably sound measures of exposure to anenvironmental risk factor. The problem is that



it is not difficult to construct examples wherethere are complete sign reversals when goingfrom the ecological to the individual levelstudy (Richardson, 1992).

The converse of the ecological fallacyis the atomistic (or individualistic) fallacywhich assumes relationships identified at theindividual level apply at the group level.There may be group level or contextualeffects that need to be taken into account –as for example in the study of youthoffending, where the risk of becoming anoffender may not depend only on personaland household level risk factors but alsoneighbourhood and peer group effects. Thisthen raises the problem of defining what the‘neighbourhood’ is.

Figure 2.2 provides a summary of thepoints raised in sections 2.2 and 2.3.

2.4. CONCLUSIONS

Spatial data possess a number of distinctiveproperties that derive from the fundamentalnature of geographic space and the way pro-cesses unfold in geographic space, the waythat spatial variation is represented for thepurpose of storage in a finite digital database

and the way spatial data are collected andattributes measured. Many of these propertieswere recognized early in geography’s ‘quan-titative revolution’ most notably the lack ofindependence in data values collected closetogether in space. Geographers then and sincehave made important contributions to thedevelopment of relevant statistical theory andpractice.

Geographers continue to develop newmethods for describing spatial variation andnew methods for modelling processes thatoperate across geographical space. At presentthere are two strong traditions which providefocuses for research. On the one hand thereare methodologies based on ‘whole map’ orglobal statistics that seek to capture dataproperties through models that are fitted toall the data. On the other hand there aremethodologies based on ‘local’ statistics thatprocess geographically defined subsets of thedata and do not seek to impose a singlestatistic or model on the whole data set(Anselin, 1995, 1996; Getis and Ord, 1996;Fotheringham and Brunsdon, 2000). Theyrepresent different ways of responding to theneed to develop methodologies to meet theanalytical challenges posed by the specialnature of spatial data.

Figure 2.2 Spatial data properties and how they impact at different stages of analysis.



REFERENCES

Andrienko, G.L. and Andrienko, N.V. (1999). Interactivemaps for visual data exploration. InternationalJournal of Geographical Information Science, 13:355–374.

Anselin, L. (1988). Spatial Econometrics: Methods andModels. Dordrecht: Kluwer Academic.

Anselin, L. (1995). Local indicators of spatialassociation – LISA. Geographical Analysis, 27:93–115.

Anselin, L. (1996). The Moran scatterplot as an ESDAtool to assess local instability in spatial association.In: Fischer, M, Scholten, H.J. and Unwin, D., (eds),Spatial Analytical Perspectives on GIS, pp. 111–125.London: Taylor & Francis.

Besag, J.E. (1974). Spatial interaction and the statisticalanalysis of lattice systems. Journal, Royal StatisticalSociety, B, 36: 192–225.

Besag, J.E. (1975). Statistical analysis of non-latticedata. The Statistician, 24: 179–195.

Besag, J.E. (1978). Some methods of statisticalanalysis for spatial data. Bulletin of the InternationalStatistical Institute, 47: 77–92.

Brindley, P., Wise, S.M., Maheswaran, R. and Haining,R.P. (2005) The effect of alternative representationsof population location on the areal interpolation ofair pollution exposure. Computers, Environment andUrban Systems, 29: 455–469.

Cerioli, A. (1997). Modified tests of independence in2 × 2 tables with spatial data. Biometrics, 53:619–628.

Cliff, A.D. and Ord, J.K. (1973). Spatial Autocorrelation.London: Pion.

Clifford, P. and Richardson, S. (1985). Testing theassociation between two spatial processes. Statisticsand Decisions, Suppl. No. 2: 155–160.

Cockings, S., Fisher, P.F. and Langford, M. (1997).Parametrization and visualization of the errorsin areal interpolation. Geographical Analysis, 29:314–328.

Cressie, N. (1984). Towards resistant geostatistics.In: Verly, G., David, M., Journel, A.G. andMarechal, A., (eds), Geostatistics for NaturalResources Characterization, pp. 21–44. Dordrecht:Reidel.

Cressie, N. (1991). Statistics for Spatial Data. New York:Wiley.

Dobson, A.J. (1999). An Introduction to GeneralizedLinear Models. Boca Raton: Chapman & Hall.

Dorling, D. (1992). Stretching space and splicingtime: from cartographic animation to interactivevisualization. Cartography and Geographic Informa-tion Systems, 19: 215–227.

Dorling, D. (1994). Cartograms for visualizing humangeography. Hearnshaw, H.M. and Unwin, D.J., (eds),Visualization in Geographic Information Systems,pp. 85–102. New York: J. Wiley & Sons.

Fisher, R. (1935). The Design of Experiments.Edinburgh: Oliver & Boyd.

Forster, B.C. (1980). Urban residential ground coverusing LANDSAT digital data. PhotogrammetricEngineering and Remote Sensing, 46: 547–558.

Fotheringham, A.S., Brunsdon, C. and Charlton, M.(2000). Quantitative Geography: Perspectives onSpatial Data Analysis. London: SAGE.

Fotheringham. A.S. and Charlton, M. (1994). GIS andexploratory spatial data analysis: an overview ofsome research issues. Geographical Systems, 1:315–327.

Gelman, A. and Price, P.N. (1999). All maps ofparameter estimates are misleading. Statistics inMedicine, 18: 3221–3234.

Getis, A. and Ord, J.K. (1996). Local spatial statistics:an overview. In: Longley, P. and Batty, M., (eds),Spatial Analysis: Modelling in a GIS environment,pp. 261–277.

Goodchild, M.F. (1989). Modelling error in objectsand fields. In: Goodchild, M. and Gopal, S.(eds), Accuracy of Spatial Databases, pp. 107–113.London: Taylor & Francis.

Guptill, S.C. and Morrison, J.L. (1995). Elements ofSpatial Data Quality. Oxford: Elsevier Science.

Haining, R.P. (1978). The moving average model forspatial interaction. Transactions of the Institute forBritish Geographers, NS3: 202–225.

Haining, R.P. (1988). Estimating spatial means withan application to remotely sensed data. Commu-nications in Statistics, Theory and Methods, 17:573–597.

Haining, R.P. (1990). Spatial Data Analysis in the Socialand Environmental Sciences. Cambridge: CambridgeUniversity Press.

Haining, R.P. (1991). Estimation with heteroscedasticand correlated errors: a spatial analysis of

texreader

Highlight

texreader

Highlight

texreader

Text Box

AQ : 45 Fisher (1935) not found in text.Please check.

texreader

Highlight

texreader

Text Box

AQ : Getis and Ord (1996) publisher required



intra-urban mortality data. Papers in RegionalScience, 70: 223–241.

Haining, R.P. (2003) Spatial Data Analysis: Theory andPractice. Cambridge: Cambridge University Press.

Haining, R.P. and Arbia, G. (1993). Error propaga-tion through map operations. Technometrics, 35:293–305.

Haining, R.P., Wise, S.M. and Ma, J. (1998). ExploratorySpatial Data Analysis in a geographic informationsystem environment. The Statistician, 47: 457–469.

Isaaks, E.H. and Srivastava, R.M. (1989). An Intro-duction to Applied Geostatistics. Oxford: OxfordUniversity Press.

King, G. (1997). A Solution to the Ecological InferenceProblem. Princeton, New Jersey: Princeton UniversityPress.

Kulldorff, M. (1998) Statistical methods for spatialepidemiology: tests for randomness. GIS and Health,eds Gatrell, A. and Löytönen, M. pp. 49–62. London:Taylor & Francis.

Law, J. and Haining, R.P. (2004) A Bayesian approachto modelling binary data: the case of high intensitycrime areas. Geographical Analysis, 36: 197–216.

Law, J., Haining R., Maheswaran, R. and Pearson, T.(2006) Analysing the relationship between smokingand coronary heart disease at the small area level.Geographical Analysis, 38: 140–159.

Longley, P.A., Goodchild, M.F., Maguire, D.J. andRhind, D.W. (2001). Geographical InformationSystems and Science. Chichester: Wiley.

Martin, D.J. (1998) Optimizing Census Geography: theseparation of collection and output geographies.International Journal of Geographical InformationScience, 12: 673–685.

Martin, D.J. (1999). Spatial representation: thesocial scientists’ perspective. In: Longley, P.A.,Goodchild, M.F., Maguire, D.J. and Rhind, D.W.(eds), Geographical Information Systems: Volume 1.Principles and Technical Issues, 2nd edition.pp. 71–89. New York: Wiley.

Mollie, A. (1996). Bayesian mapping of disease. MarkovChain Monte Carlo in Practice: InterdisciplinaryStatistics, pp. 359–379. London: Chapman & Hall.

Monmonier, M.S. (1989). Geographic brushing:exhancing exploratory analysis of thescatterplot matrix. Geographical Analysis, 21:81–84.

Richardson, S. (1992). Statistical methods forgeographical correlation studies. In: Elliot, P.,Cuzich, J., English, D. and Stern, R., (eds),Geographical and Environmental Epidemiology:Methods for Small Area Studies, pp. 181–204.Oxford: Oxford University Press.

Ripley, B.D. (1981). Spatial Statistics. New York: Wiley.

Unwin, D.J. and Wrigley, N. (1987). Towards a generaltheory of control point distribution effects in trendsurface models. Computers and Geosciences, 13:351–355.

Whittle, P. (1954) On stationary processes in the plane.Biometrika, 41: 434–449.

the special nature of spatial data - sage publications inc · the special nature of spatial data 7...

Documents