

Page 1: Research Article - 東京大学ua.t.u-tokyo.ac.jp/okabelab/sada/IJGIS (2003).pdfKeller 1997, Hunter and Goodchild 1997, Heuvelink 1998). Jacquez (1996) proposed statistical tests for

International Journal of Geographical Information Science, 2003, vol. 17, no. 2, 139–156

Research Article

Stability of the surface generated from distributed points of uncertain location

YUKIO SADAHIRO

Department of Urban Engineering, University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo 113-8656, Japan; e-mail: [email protected]

(Received 18 June 2001; accepted 29 April 2002)

Abstract. Conversion of a point distribution into a surface is one of the spatial operations used in GIS. This supports the visual analysis of point patterns, which is usually followed by more sophisticated statistical and mathematical analysis. If the location of points is uncertain, however, the surface obtained becomes unstable and consequently the results of analysis may be unreliable. Though unavoidable in spatial data, locational uncertainty has been rather neglected in the context of spatial analysis. To fill this gap, this paper proposes a method for representing and analysing the stability of the surface generated from an uncertain point distribution. The surface stability is represented by a scalar function called the slope stability function. Its definition, calculation procedure, visualization method, and summary indices are proposed. The method is evaluated through an empirical study, and some findings are shown which help us understand the effect of the smoothing parameter, locational uncertainty, and spatial pattern of points on the stability of a surface.

International Journal of Geographical Information Science ISSN 1365-8816 print/ISSN 1362-3087 online © 2003 Taylor & Francis Ltd

http://www.tandf.co.uk/journals DOI: 10.1080/1365881022000015921

1. Introduction

In point pattern analysis, the first step of exploration is to visualize the distribution of points (Slocum 1999). The map visually reveals the spatial structure of a point distribution, and may suggest underlying factors to an investigator. Visualization of a point distribution greatly helps its further analysis such as statistical analysis and mathematical modelling.

If we have only a small number of points, they are represented and visualized as point objects in a GIS; it is easy to understand their spatial structure by looking at their spatial distribution directly. However, if a large number of points are visualized, say, a thousand or a million points, it becomes quite difficult to find spatial patterns in the point distribution. The points are then converted into a surface by the kernel method (Silverman 1986, Scott 1992), quadrat method (Upton and Fingleton 1985, Cressie 1993), or other methods (Bailey 1994, Bonham-Carter 1994, Bailey and Gatrell 1995). The kernel method, for instance, has been applied to the distributions of crimes (Goldsmith et al. 2000), volcanic craters (Bailey and Gatrell 1995), redwoods (Fotheringham et al. 2000), population (Bracken and Martin 1989, Martin and Bracken 1991, Bracken 1993) and retail stores (Sadahiro 2001). The quadrat method has also been frequently used in ecology (Pielou 1977), criminology (Brantingham and Brantingham 1981) and epidemiology (Alexander and Boyle 1996). Both methods are used not only for the visualization of point distributions but also in statistical analysis and modelling. In visualization, however, the kernel method is preferred because, unlike the quadrat method, it generates a ‘smooth’ surface from a point distribution.

One important issue that has been rather neglected in spatial analysis, including the conversion of a point distribution into a surface, is locational uncertainty in spatial data. If the location of points is uncertain, the surface generated from the points is inevitably uncertain and unstable. Spatial data are usually inaccurate to some extent (Goodchild and Gopal 1989, Burrough and Frank 1996, Heuvelink 1998), and thus uncertainty is unavoidable in spatial analysis. Though this fact is widely recognized in the GIS community, it is often implicitly assumed that spatial data are accurate and the result of analysis is reliable. This is usually not the case.

Recently, effort has been made to analyse the effect of uncertainty in spatial operations. Leung and Yan (1997), for instance, discussed how locational uncertainty affects the point-in-polygon operation. Error propagation in spatial operations such as spatial interpolation, spatial smoothing, overlay and buffer operations has been studied by statistical models and empirical studies (Veregin 1994, Davis and Keller 1997, Hunter and Goodchild 1997, Heuvelink 1998). Jacquez (1996) proposed statistical tests for uncertain spatio-temporal point distributions. Sadahiro (2002) analysed cluster detection in uncertain point distributions. However, conversion of a point distribution into a surface has not yet been discussed in terms of locational uncertainty in the point data.

To fill the gap in the research, this paper proposes a method for representing and analysing the stability of the surface generated from a distribution of uncertainly located points. We focus on the kernel smoothing method because it is suitable for visual analysis of point distributions, the first step of exploratory spatial analysis. In §2, we outline kernel smoothing and propose a scalar function called the slope function that describes the spatial structure of the surface generated from the distribution of certain points. In §3, extending the method proposed in §2, we propose a method for representing and analysing the stability of the surface generated from the distribution of uncertain points. To represent the surface stability, we introduce a scalar function called the slope stability function. The section describes its calculation and visualization, and summary indices. To test the validity of the method, we perform an empirical study in §4. Section 5 provides a concluding discussion.

2. Surface under certainty

2.1. Kernel smoothing

Suppose there are n points in a region S. The ith point and its locational vector are denoted by P_i and z_i, respectively. The surface value at a location x given by the kernel smoothing is

$$f(x) = \frac{\sum_i k(|x - z_i|)}{\sum_j \int_{y \in S} k(|y - z_j|)\,dy} \tag{1}$$


where k( ) is the kernel function. A typical choice for k( ) is the bivariate normal distribution

$$k(u) = \frac{a}{\pi} \exp(-a|u|^2) \tag{2}$$

where a is the smoothing parameter (Silverman 1986). A small value of a generates a smooth surface, while a large a yields a rough surface which emphasizes large local variations in point density. In the following we use the bivariate normal distribution as the kernel function. Figure 1 shows an example of kernel smoothing using the bivariate normal distribution.
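Equations (1) and (2) can be illustrated with a minimal sketch. It assumes that S is a rectangle and that the integral in the denominator of equation (1) is approximated by a midpoint rule on a regular grid; the function names, the grid resolution and the sample points are hypothetical and are not taken from the paper.

```python
import math


def kernel(u, a):
    """Bivariate normal kernel of equation (2); u is a distance, a the smoothing parameter."""
    return (a / math.pi) * math.exp(-a * u * u)


def kernel_surface(points, a, xmin, xmax, ymin, ymax, n=50):
    """Evaluate the surface of equation (1) at the centres of an n-by-n grid over S.

    Returns the grid of surface values (rows indexed by y, columns by x)
    and the area of one grid cell.
    """
    dx = (xmax - xmin) / n
    dy = (ymax - ymin) / n
    cell = dx * dy
    raw = []
    total = 0.0  # midpoint-rule estimate of the denominator of equation (1)
    for r in range(n):
        y = ymin + (r + 0.5) * dy
        row = []
        for c in range(n):
            x = xmin + (c + 0.5) * dx
            v = sum(kernel(math.hypot(x - px, y - py), a) for px, py in points)
            row.append(v)
            total += v * cell
        raw.append(row)
    # Divide by the estimated integral so that the surface integrates to one over S.
    return [[v / total for v in row] for row in raw], cell


points = [(30.0, 30.0), (32.0, 31.0), (70.0, 60.0)]  # hypothetical point locations
surface, cell = kernel_surface(points, 0.01, 0.0, 100.0, 0.0, 100.0, n=40)
```

Repeating the call with a smaller value of a spreads each kernel over a wider area, which is the smoothing behaviour described above.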

2.2. Representation of surface under certainty

In point pattern analysis, we are interested in the spatial structure of points characterized by point clusters, spatial variation of point density and so forth. When a point distribution is converted into a surface, the structure of the original distribution is transferred to the resultant surface. Therefore, it is useful to examine the surface generated from a point distribution as well as to analyse the point distribution directly, in order to understand its spatial structure. To this end, we propose a binary function representing a surface with emphasis on its spatial structure.

Let v(h) be the unit vector of angle h measured counterclockwise from the x-axis (0 ≤ h < 2π). The gradient of the surface function f(x) at x in direction h is given by the directional derivative of f(x):

$$f'(x, h) = \nabla f(x) \cdot v(h) \tag{3}$$

Using f′(x, h), we define a binary function s(x, h; f) by

$$s(x, h; f) = \begin{cases} 1 & \text{if } f'(x, h) \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

and call it the slope function. This function indicates the surface gradient as a binary variable: if the surface ascends at a location x in direction h, the slope function becomes one; otherwise, it becomes zero. Figure 2 shows the slope function of the surface in figure 1(b).

Figure 1. Kernel smoothing. (a) A point distribution, (b) a surface generated by the kernel smoothing (a=0.001).

The slope function has two values of h where it changes discretely between one and zero, which agree with the direction of the contour lines of the surface. Therefore, the slope function also shows the distribution of the surface aspect, and consequently, gives us the outline of the spatial structure of points. For instance, figure 2 roughly tells us the location of point clusters and spatial variation of point density. The slope function is a summary of the original surface function, emphasizing its spatial structure.
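The slope function of equations (3) and (4) can be sketched as follows. The sketch differentiates the unnormalised sum of kernels exp(−a|x − z_i|²); the positive constants dropped in doing so (a/π and the denominator of equation (1)) do not change the sign of the directional derivative. The function names are hypothetical.

```python
import math


def surface_gradient(x, y, points, a):
    """Gradient at (x, y) of the unnormalised surface sum_i exp(-a |x - z_i|^2)."""
    gx = gy = 0.0
    for px, py in points:
        w = math.exp(-a * ((x - px) ** 2 + (y - py) ** 2))
        gx += -2.0 * a * (x - px) * w
        gy += -2.0 * a * (y - py) * w
    return gx, gy


def slope_function(x, y, h, points, a):
    """Equation (4): 1 if the surface does not descend at (x, y) in direction h, else 0."""
    gx, gy = surface_gradient(x, y, points, a)
    directional = gx * math.cos(h) + gy * math.sin(h)  # equation (3)
    return 1 if directional >= 0.0 else 0
```

For a single point at the origin and a location one unit to its east, the function returns one for directions pointing back towards the point and zero for directions pointing away from it.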

3. Surface under uncertainty

If the location of points is certain, the surface generated by the kernel smoothing can be either visualized as it is or outlined by the slope function proposed in the previous section. If their location is uncertain, however, the surface obtained becomes unstable and can be represented by neither the simple surface function nor the slope function. To solve this problem, this section extends the slope function so that it represents the stability of a surface instead of the surface itself.

3.1. Representation of locational uncertainty of points

Representation of locational uncertainty in spatial data is a controversial research topic in GIS (Goodchild 1989, Burrough and Frank 1996, Burrough and McDonnell 1998); at least two methods exist for representing locational uncertainty in terms of spatial analysis. One uses a continuous function such as the fuzzy membership function (Burrough and McDonnell 1998) and the probability density function (Shi 1998). The other defines the region of a deterministic boundary within which a spatial object is possibly located (Perkal 1966, Chrisman 1982a, 1982b, Dunn et al. 1990, Sadahiro 2002). This paper follows the latter representation of locational uncertainty; we are certain that a spatial object is located in a bounded region called the error zone but we do not know its exact location. This is a key to calculating the surface stability; it gives us high tractability in computation of the slope function and its extension as shown later. The deterministic boundary approach is reasonable when the point location is represented by a higher level of address system such as blocks and cities, and it gives a good approximation to continuous functions: we can determine a threshold of uncertainty or probability to define the boundary of error zones.

Figure 2. The slope function of the surface shown in figure 1(b).

In this paper we assume that error zones are represented as polygons, which also approximate spatial objects with curved boundaries. Let Q_i be the error zone of point P_i in S. The set of error zones {Q_1, Q_2, ..., Q_n} is denoted by V.

3.2. Representation of surface under uncertainty

Uncertainty in the location of points yields instability of the surface generated by the kernel smoothing; the surface fluctuates according to the location of points. Its global structure, however, is usually robust against uncertainty. Remember the point distribution and its surface representation shown in figure 1. We intuitively believe that the global structure of the surface would not drastically change even if a little uncertainty exists in the location of points; there are two peaks in the region, one around the centre and the other in the south of the region. Though it depends on the degree of smoothing determined by the parameter a (equation (2)), there often exists a stable structure in an unstable surface and we may detect it by visual analysis.

This intuitive discussion can be described by the behaviour of the slope function under uncertainty. If there is only a small locational uncertainty the slope function is likely to be stable almost everywhere. On the other hand, if the location of points is quite uncertain, the slope function may become unstable.

To treat the surface stability formally, we introduce the slope stability function defined by

$$l(x, h; V) = \begin{cases} 1 & \text{if } \forall P_i \in Q_i,\ s(x, h; f) = 1 \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

If the surface ascends at x in direction h for any possible location of points, the slope stability function l(x, h; V) becomes one. Otherwise, if the surface can descend according to the location of points, the function becomes zero. The slope stability function shows the possible surface gradient as a binary variable, and consequently, the stability of a surface.

Figure 3 shows an example of the slope stability function generated on the assumption that a certain amount of uncertainty exists in the locations of points shown in figure 1. The fan-shaped figures indicate the direction in which the surface ascends for any possible location of points. Around the largest point cluster (see also figure 1) the figures face in the direction to the centre of the cluster. This indicates that the surface tends to ascend to the cluster centre, even if the location of points is uncertain. The size of fan-shaped figures indicates the stability of a surface. Therefore, figure 3 shows that the surface is not stable around the centre of the cluster, while it is stable on the periphery, facing in the direction to the cluster centre.

Figure 3. An example of the slope stability function.

Comparing figures 2 and 3, we notice that the slope function and the slope stability function are quite similar. This is reasonable because the latter is an extension of the former. For instance, if the point location is certain, the slope stability function is equivalent to the slope function; the range of h where the slope stability function is equal to one reduces with an increase of uncertainty. In addition, both functions outline the spatial structure of points. Figures 2 and 3 roughly indicate the location of point clusters, the spatial variation of point density, and so forth. We can understand the spatial structure of points from the slope stability function even if the point location is uncertain. From this we can say that the slope stability function shows the structural stability of a surface, and consequently, that of its original point distribution. This is one advantage of the slope stability function.

The slope stability function indicates the aspects that the surface can take according to the location of points. Therefore, we can evaluate the surface stability at x by integrating the slope stability function over h. For instance, if the point location is certain, then the aspect of the surface is fixed, which results in

$$\int_0^{2\pi} l(x, h; V)\,dh = \pi \tag{6}$$

If the point location is so uncertain that the surface can take any aspect at x according to the location of points, then we have

$$\int_0^{2\pi} l(x, h; V)\,dh = 0 \tag{7}$$
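As a brief check of equation (6), assume that the gradient of f is non-zero at x and write h_0 for its direction (a symbol introduced here only for illustration). Equation (3) then gives

$$f'(x, h) = \nabla f(x) \cdot v(h) = |\nabla f(x)| \cos(h - h_0)$$

so that s(x, h; f) = 1 exactly when cos(h − h_0) ≥ 0, that is, on an interval of h of length π. When the point locations are certain the slope stability function coincides with the slope function, and hence

$$\int_0^{2\pi} l(x, h; V)\,dh = \int_{h_0 - \pi/2}^{h_0 + \pi/2} dh = \pi$$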

3.3. Computational procedure for calculating the slope stability function

Evaluation of the slope stability function is based on the calculation of the minimum gradient of the surface at every location. To this end, we consider the kernel functions of individual points separately. The kernel function for the point P_i is

$$k_i(x) = \exp(-a|x - z_i|^2) \tag{8}$$

where z_i ∈ Q_i. For the sake of simplicity, we omit the coefficient a/π in equation (2) without loss of generality. The gradient of the kernel function at x in direction h is given by

$$k'_i(x, h) = \nabla k_i(x) \cdot v(h) \tag{9}$$

(recall equation (3)). Let k′_{i,min}(x, h) be the minimum of k′_i(x, h):

$$k'_{i,\min}(x, h) = \min_{P_i \in Q_i} k'_i(x, h) \tag{10}$$


Since the surface function is the sum of the individual kernel functions (equation (1)), the minimum gradient of the surface is given by the sum of the minimum gradients of the kernel functions, that is, Σ_i k′_{i,min}(x, h).

The slope stability function becomes one if the surface ascends at a location x in direction h for any possible location of points. Therefore, equation (5) becomes

$$l(x, h; V) = \begin{cases} 1 & \text{if } \sum_i k'_{i,\min}(x, h) \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{11}$$

To calculate equation (11), we consider two points at x and x+dx denoted by A and A′, respectively. Let ds be the distance between the points, that is, ds=|dx|. Then we have

$$dx = (ds \cos h,\ ds \sin h) \tag{12}$$

Substituting equation (12) into equation (10), we obtain

$$k'_{i,\min}(x, h) = \min_{P_i \in Q_i} \lim_{ds \to 0} \frac{k_i(x + dx) - k_i(x)}{ds} \tag{13}$$

For the sake of simplicity, we translate and rotate A, A′, and Q_i so that A is at the origin of the coordinate system and A′ is on the positive side of the x-axis (figure 4). New coordinates of the points A and A′ are denoted by (0, 0) and (ds, 0), respectively. The coordinate of P_i is given by (l cos Q, l sin Q), where the angle Q is measured counterclockwise from the x-axis (0 ≤ Q < 2π). Equation (13) then becomes

$$\begin{aligned} k'_{i,\min}(x, h) &= \min_{l,Q} \lim_{ds \to 0} \frac{\exp(-a\{(l \cos Q - ds)^2 + l^2 \sin^2 Q\}) - \exp(-a l^2)}{ds} \\ &= \min_{l,Q} \exp(-a l^2) \lim_{ds \to 0} \frac{\exp(-a\,ds^2) \exp(2 a l\,ds \cos Q) - 1}{ds} \end{aligned} \tag{14}$$
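The limit on the right side of equation (14) can be evaluated explicitly. The following step is a worked sketch added for illustration and uses only the symbols of equation (14). The numerator is a function of ds that vanishes at ds = 0, so the limit is its derivative at ds = 0:

$$\lim_{ds \to 0} \frac{\exp(-a\,ds^2 + 2 a l\,ds \cos Q) - 1}{ds} = \left. \frac{d}{d(ds)} \exp(-a\,ds^2 + 2 a l\,ds \cos Q) \right|_{ds=0} = 2 a l \cos Q$$

Under this reading, the quantity to be minimized is

$$k'_i(x, h) = 2 a l \cos Q \exp(-a l^2)$$

which, for a fixed l, is smallest when cos Q is smallest.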

To solve the right side of equation (14), we first fix the variable l and minimize k′_i(x, h) with respect to Q, which is equivalent to minimizing k′_i(x, h) with respect to cos Q. Let D(Q; l) be the domain of the variable Q for a given l. We can obtain it by overlaying on Q_i a regular polygon that approximates the circle of radius l centred at the origin and calculating their intersections (figure 5). If π ∈ D(Q; l), then k′_i(x, h) is minimized when Q=π. Otherwise, k′_i(x, h) is minimized when Q is located exactly on one of the boundaries of D(Q; l).

Figure 4. The coordinate system for calculation of the slope stability function.

We then minimize k′_i(x, h) with respect to l. Let D(l) be the domain of l given by the minimum and maximum distances from the origin to Q_i. Gradually increasing l from its minimum in D(l), we obtain k′_{i,min}(x, h) and finally the slope stability function l(x, h; V) using equation (11).
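Equations (10) and (11) can be approximated by brute force, as in the sketch below. It is not the polygon-overlay procedure described above: it assumes circular error zones given as hypothetical (cx, cy, r) tuples and searches a polar grid of candidate locations inside each zone. Because the sampled minimum can only be greater than or equal to the true minimum, the sketch may slightly overstate stability near the threshold.

```python
import math


def min_kernel_gradient(x, y, h, zone, a, n_r=20, n_t=72):
    """Approximate k'_{i,min}(x, h) of equation (10) for one circular error zone.

    zone is (cx, cy, r); candidate point locations are sampled on a polar grid.
    """
    cx, cy, r = zone
    vx, vy = math.cos(h), math.sin(h)
    best = math.inf
    for i in range(n_r + 1):
        rho = r * i / n_r
        for j in range(n_t):
            t = 2.0 * math.pi * j / n_t
            ux = cx + rho * math.cos(t) - x  # vector from x to the candidate location
            uy = cy + rho * math.sin(t) - y
            # Directional derivative of exp(-a |x - z|^2) at x in direction h.
            g = 2.0 * a * (ux * vx + uy * vy) * math.exp(-a * (ux * ux + uy * uy))
            best = min(best, g)
    return best


def slope_stability(x, y, h, zones, a):
    """Equation (11): 1 if the sum of the minimum kernel gradients is non-negative."""
    total = sum(min_kernel_gradient(x, y, h, zone, a) for zone in zones)
    return 1 if total >= 0.0 else 0
```

With a single zone of radius 10 centred 100 units east of a location, the sketch reports a stable ascent towards the east and none towards the west or the north; at the zone centre itself no direction is stable.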

3.4. Unstable regions

In analysis of surfaces, critical points, that is, peaks, bottoms, and cols (saddle points) play a crucial role (Warntz 1966, Warntz and Waters 1975, Okabe and Masuda 1984, Iri et al. 2000, Sadahiro 2001). They are defined by the locational vector x satisfying

$$\frac{d f(x)}{dx} = 0 \tag{15}$$

and describe the spatial structure of a surface in a simple but efficient way.

Similar concepts can be defined for uncertain surfaces. As mentioned earlier, there may be regions in S where the surface can take any aspect according to the location of points (equation (7)). We call them unstable regions. Stable regions, on the other hand, are the regions in which every location has at least one aspect not taken by the surface.

Figure 5. Calculation of the domain D(Q; l).

Unstable regions are further classified into three types: peak regions, bottom regions, and neutral regions. If the surface always ascends in a certain direction into an unstable region everywhere on its boundary, the region contains at least one peak independently of the location of points. We thus call it a peak region. If the surface always ascends out of an unstable region, the region contains at least one bottom and we call it a bottom region. Unstable regions classified into neither of them are called neutral regions. Stable regions do not contain either peaks or bottoms. Figure 6 shows an example of unstable regions.

Unstable regions, as well as the slope stability function, show the structure of an uncertain surface and its stability against the locational uncertainty of points. They are in a sense substitutes for critical points in surface analysis.

3.5. Visualization and evaluation of surface stability

Having obtained the slope stability function and unstable regions, we visualize them to understand the structural stability of a surface, and consequently, that of its original point distribution. A simple but useful visualizing method is to use the fan-shaped figures shown in figure 6. Using the centroid of error zones, we can also calculate an ‘average’ surface function which helps our interpretation of the slope stability function. Figure 7 shows an example of the visualization of the stability of the surface generated from an uncertain point distribution.

In figure 7(c) we notice that the peak at the centre of the region (figure 7(b)) is stable against data uncertainty; there exists at least one peak in that region independently of the location of points in error zones. The peak in the south, on the other hand, is in the neutral region, which implies that the peak may disappear according to the location of points.

To evaluate the overall stability of a surface quantitatively, we propose two indices: the stable region ratio (SRR) and the stability ratio (SR). Let d(x; V) be a binary function representing stable regions:

$$d(x; V) = \begin{cases} 1 & \text{if } \int_0^{2\pi} l(x, h; V)\,dh > 0 \\ 0 & \text{otherwise} \end{cases} \tag{16}$$

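The SRR is the ratio of the area of stable regions to that of the region S:

$$\mathrm{SRR} = \frac{\int_{x \in S} d(x; V)\,dx}{\int_{x \in S} dx} \tag{17}$$

The SR is an extension of the SRR weighted by the slope stability function:

$$\mathrm{SR} = \frac{\int_{x \in S} \int_0^{2\pi} l(x, h; V)\,dh\,dx}{\pi \int_{x \in S} dx} \tag{18}$$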
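A discretized version of equations (16) to (18) can be sketched as follows. It assumes that the slope stability function has already been evaluated at lattice nodes that each represent an equal share of S, and at equally spaced directions h; the input layout and the function name are hypothetical.

```python
import math


def stability_indices(lam):
    """Return (SRR, SR) from lam[j][k], the slope stability function (0 or 1)
    at lattice node j and direction h_k = 2*pi*k / n_h."""
    n_loc = len(lam)
    n_h = len(lam[0])
    dh = 2.0 * math.pi / n_h
    # Integral of the slope stability function over h at each node.
    integrals = [dh * sum(row) for row in lam]
    # Equations (16) and (17): share of nodes with at least one stable direction.
    srr = sum(1 for v in integrals if v > 0.0) / n_loc
    # Equation (18): mean integral over h, divided by pi.
    sr = sum(integrals) / (math.pi * n_loc)
    return srr, sr


lam = [[1, 1, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0]]  # hypothetical values
srr, sr = stability_indices(lam)
```

In this small example three of the four nodes have at least one stable direction, so the SRR is 0.75, while the SR is 0.625 because one of those nodes is stable over only a quarter of the directions.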

Either index shows a large value if the surface is stable against locational uncertainty of points. A small value indicates that the surface is unstable and attention should be paid to its visual analysis. From the definitions they satisfy

$$0 \le \mathrm{SRR},\ \mathrm{SR} \le 1 \tag{19}$$

Figure 6. Unstable regions.

Figure 7. Visualization of the stability of the surface generated from an uncertain point distribution. (a) Error zones, (b) a surface generated by the kernel smoothing (a=0.001, the points are located at the centroid of error zones), (c) the slope stability function and unstable regions.

The SR and SRR can be used to evaluate not only the stability of the surface but also the smoothing parameter a in terms of the surface stability. If the surface is extremely unstable, the slope stability function gives us little information about the spatial structure of points. Therefore, to perform a meaningful analysis of a point distribution, we should apply values of a that keep the surface stability above a certain level. Whether a value of a is acceptable can be judged by calculating the SR or SRR value and comparing it with the desirable level determined by analysts.
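The selection rule can be sketched in a few lines. The SR values below are those reported in table 1 for coffee shops with error zones of radius 100 m; the threshold of 30% and the function name are hypothetical choices made only for illustration.

```python
def acceptable_parameters(sr_by_a, min_sr):
    """Return the smoothing parameters whose SR (percentage) is at least min_sr."""
    return sorted(a for a, sr in sr_by_a.items() if sr >= min_sr)


# SR (percentage) for coffee shops, r = 100 m, keyed by the smoothing parameter a.
coffee_r100 = {0.001: 75.49, 0.01: 34.91, 0.1: 15.75}
chosen = acceptable_parameters(coffee_r100, 30.0)
```

With this threshold the two smaller values of a are retained and a=0.1 is rejected; a stricter threshold would leave only the smoothest surface.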

4. Empirical study

This section empirically analyses the stability of the surface generated from a point distribution, in order to test the validity of the method proposed in the previous section. We also discuss the effect of the smoothing parameter, locational uncertainty of points, and spatial pattern of points on the surface stability in a practical setting.

Point data used in the analysis are based on the list of retail stores and restaurants in the NTT telephone directory published monthly. Among 1670 categories we chose coffee shops, clothing shops, and convenience stores in the Shinjuku area in Tokyo, Japan (figure 8). The numbers of shops are 689, 555, and 145, respectively. We extracted their addresses in September 1999 in ASCII format, and converted them into point data by geocoding.

Figure 8. Shinjuku area in Tokyo, Japan.

The locational data obtained by geocoding are highly accurate; their locational error is usually kept below 10 m, especially in the central area of Tokyo. We thus considered hypothetical error zones of points represented by the circle of radius r, which was set to 50, 100, and 150 m, successively. Figure 9 shows the error zones of radius 100 m assumed for the distribution of coffee shops.

Figure 9. Error zones of radius 100 m assumed for the distribution of coffee shops.

We then calculated the slope stability function and unstable regions, following the procedure illustrated schematically in figure 10. We overlaid a lattice of side 10 m over the study region, and calculated the slope stability function at each lattice node and the middle point of each edge by the procedure shown in the previous section. The slope stability function calculated on edges was used for detection of unstable regions. The smoothing parameter a (equation (2)), which reflects the scale of spatial analysis, was set to 0.001, 0.01, and 0.1, successively.

Figure 10. Calculation of the slope stability function and unstable regions. (a) The slope stability function at lattice points, (b) the slope stability function calculated on edges (an arrow indicates that the surface always ascends in its direction), (c) unstable regions detected by the result shown in (b).

The results are shown in table 1 and figure 11. In table 1 we notice that the surface stability decreases as uncertainty of point location increases; both the SRR and SR monotonically decrease with an increase of r. Similarly, the stability decreases with an increase of the smoothing parameter a. When a=0.001, the surface is quite smooth and the SRR shows large values over 98%. As a increases, however, the stability monotonically decreases in all cases. This implies that enlargement of the scale of analysis, say, transition from global to local analysis, reduces the reliability of the result. From this we obtain one obvious solution to uncertainty in spatial analysis; we should analyse spatial phenomena at a scale small enough to keep the reliability of the result at a certain level.

Table 1. The stable region ratio (SRR) and stability ratio (SR) (percentage). The variable r is the radius of the circles that represent the hypothetical error zones, and a is the smoothing parameter used in equation (2).

                         r: 50 m           r: 100 m          r: 150 m
                       SRR      SR       SRR      SR       SRR      SR
(a) Coffee shop
    a: 0.001          99.99   79.01     99.86   75.49     99.67   71.06
    a: 0.01           96.39   58.42     78.32   34.91     50.53   17.75
    a: 0.1            68.96   35.02     36.31   15.75     22.69    8.68
(b) Clothing shop
    a: 0.001          99.81   79.29     99.49   74.88     98.85   71.05
    a: 0.01           96.93   62.38     84.55   42.30     63.78   26.28
    a: 0.1            80.20   44.72     50.77   22.28     33.29   13.14
(c) Convenience store
    a: 0.001          99.90   78.15     99.58   72.09     98.99   61.14
    a: 0.01           89.03   51.57     66.13   31.06     46.67   18.60
    a: 0.1            81.18   46.44     51.06   24.09     34.91   14.01

The SRR and SR seem to be closely related to each other. In fact, Pearson’s correlation coefficient between these indices is 0.954, indicating that they are highly correlated.

Figure 11 shows that the unstable region increases mainly at the expense of the neutral region. When a=0.1, for instance, the neutral region accounts for at least four-fifths of the unstable region in all cases. This implies that an increase of a and r reduces the relative amount of information that we obtain from the unstable region. Among three types of unstable regions, the peak and bottom regions assure the existence of at least one peak and bottom, respectively. On the other hand, the neutral region does not provide any information about the existence of critical points, and thus is less informative than the peak and bottom regions.

Let us then discuss the effect of point pattern on the surface stability, comparing the results of three categories of shops. They have different spatial patterns. Coffee shops are strongly clustered around Shinjuku station. Clothing shops also show a clustered pattern, but they have two clusters: one around Shinjuku station, and the other near Harajuku station. Unlike these categories, convenience stores are widely dispersed in the study region as shown in figure 11(c). This is because convenience stores are quite similar to each other so that they receive little additional profit by agglomeration and travel distances are generally low for non-luxury items.

Table 1 shows that the results are quite similar among the three categories when a=0.001. A difference arises with an increase of a. When a=0.01, the clothing shops have the largest SRR and SR values among the three categories, which implies that the surface generated from the distribution of clothing shops is the most robust against locational uncertainty. On the other hand, the convenience stores yield the worst result, that is, the smallest SRR and SR values. When a=0.1, however, the convenience stores become the best and the coffee shops the worst. The order of the three categories in terms of the surface stability is independent of the radius of error zones.

The order of these categories is closely related to the spatial pattern of shops, and it can be explained by the separability of points as follows. Consider two uncertain points in a one-dimensional region represented by error zones (figure 12(a)). In this case there are three unstable regions: two exactly coincide with the error zones, and one is in the middle of the error zones (figure 12(b)). If the points are closer as shown in figure 12(c), the middle unstable region expands (figure 12(d)).


Figure 11. Results of analysis. (a) The coffee shop, (b) clothing shop, (c) convenience store. The first column shows the centroid of error zones. The second column is the surface generated from the points located at the centroid of error zones. The third, fourth, and fifth columns show the slope stability function (the fan-shaped figure) and unstable regions (the grey area indicates the neutral region, and the areas enclosed by the solid and broken lines indicate the peak and bottom regions, respectively).
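The correlation reported between the SRR and SR in the discussion of Table 1 can be cross-checked from the tabulated values. The sketch below assumes that the coefficient was computed over all 27 (SRR, SR) pairs of Table 1 (three shop categories, three values of a, three radii); this pooling, and the names `TABLE_1` and `pearson`, are illustrative choices rather than details given in the paper. On these values the coefficient comes out at roughly 0.95.

```python
import math

# (SRR, SR) pairs in per cent, transcribed from Table 1.
# Within each category the rows are a = 0.001, 0.01, 0.1,
# and within each row the pairs are r = 50 m, 100 m, 150 m.
TABLE_1 = {
    "coffee shop": [
        (99.99, 79.01), (99.86, 75.49), (99.67, 71.06),
        (96.39, 58.42), (78.32, 34.91), (50.53, 17.75),
        (68.96, 35.02), (36.31, 15.75), (22.69, 8.68),
    ],
    "clothing shop": [
        (99.81, 79.29), (99.49, 74.88), (98.85, 71.05),
        (96.93, 62.38), (84.55, 42.30), (63.78, 26.28),
        (80.20, 44.72), (50.77, 22.28), (33.29, 13.14),
    ],
    "convenience store": [
        (99.90, 78.15), (99.58, 72.09), (98.99, 61.14),
        (89.03, 51.57), (66.13, 31.06), (46.67, 18.60),
        (81.18, 46.44), (51.06, 24.09), (34.91, 14.01),
    ],
}


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Pool every category, smoothing parameter and radius (an assumption).
all_pairs = [pair for rows in TABLE_1.values() for pair in rows]
r_all = pearson(all_pairs)
print(f"Pearson r over {len(all_pairs)} pairs: {r_all:.3f}")
```

Pooling all cells treats each combination of category, a and r as one observation; computing the coefficient per category instead only requires calling `pearson` on one entry of `TABLE_1`.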


However, if the points get much closer (figure 12(e)), the unstable regions are merged into a smaller one (figure 12(f)). To sum up, when two uncertain points get closer, the surface stability first decreases and then increases.

The above observation indicates that the surface stability is greatly affected by the separability of points, which is determined by the distance between error zones and the shape of the kernel function. The unstable region is small when either the error zones are almost separable with respect to the kernel function (figure 12(b)) or they are too close to distinguish (figure 12(f)). Instability arises when error zones are neither separable nor indistinguishable. In our example, it seems that coffee shops become indistinguishable when a decreases from 0.1 to 0.01. Convenience stores, on the other hand, become indistinguishable when a decreases from 0.01 to 0.001. This is understandable because the convenience stores are more dispersed than the coffee shops, as mentioned earlier.

The above discussion suggests that, given a point distribution, a parameter value of a exists that minimizes SRR and SR, and consequently generates the most unstable surface from the distribution. In our empirical study, however, we did not see such a case where the stability reaches its minimum in terms of SRR and SR. It seems to appear when a > 0.1, but such a steep kernel function is not meaningful in point pattern analysis because the surface obtained is almost equivalent to the distribution of error zones itself.
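The non-monotonic behaviour described for two uncertain points can be reproduced with a toy one-dimensional calculation. The sketch below assumes a Gaussian kernel of bandwidth `sigma`, interval error zones of half-width `r`, and treats a location as unstable when the slope of the summed kernels can take either sign over the admissible point positions. This sign criterion is a simplified stand-in for the slope stability function defined earlier in the paper, not a reproduction of it; `sigma`, `r`, the spacing `d` and the function name are hypothetical, and no mapping between `sigma` and the smoothing parameter a is implied.

```python
import numpy as np


def unstable_length(d, r=0.2, sigma=1.0):
    """Length of the 1D region where the sign of the surface slope is not fixed.

    Two uncertain points have error zones [-d/2 - r, -d/2 + r] and
    [d/2 - r, d/2 + r]; each contributes a Gaussian kernel of bandwidth
    sigma to the surface.
    """
    x = np.linspace(-10.0, 10.0, 4001)  # evaluation grid, spacing 0.005
    lo = np.zeros_like(x)               # smallest attainable slope at each x
    hi = np.zeros_like(x)               # largest attainable slope at each x
    for centre in (-d / 2.0, d / 2.0):
        # Candidate true positions of this point inside its error zone.
        p = np.linspace(centre - r, centre + r, 201)
        u = x[:, None] - p[None, :]
        # Derivative of exp(-u^2 / (2 sigma^2)) with respect to x.
        slope = -u / sigma**2 * np.exp(-u**2 / (2.0 * sigma**2))
        # The points move independently, so the extremes of the summed
        # slope are the sums of the per-point extremes.
        lo += slope.min(axis=1)
        hi += slope.max(axis=1)
    unstable = (lo < 0.0) & (hi > 0.0)  # both slope signs are attainable
    return float(unstable.sum() * (x[1] - x[0]))


# Well separated, intermediate, and coincident error zones.
far, mid, near = (unstable_length(d) for d in (10.0, 2.0, 0.0))
print(f"far: {far:.2f}, intermediate: {mid:.2f}, coincident: {near:.2f}")
```

With these settings the unstable length is largest at the intermediate spacing and smallest when the two error zones coincide, which mirrors the sequence in figure 12(b), (d) and (f).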

Figure 12. Relationship between the point distribution and the surface stability. (a), (c), (e) Two points in a one-dimensional region (solid and broken lines represent the error zones of the points); (b), (d), (f) kernel functions (solid and broken lines) and unstable regions (black rectangles).

5. Concluding discussion

In this paper we have developed a method for representing and analysing the surface generated from an uncertain point distribution. We proposed the slope stability function to evaluate the surface stability quantitatively. It also enables us to understand the spatial structure of a surface, and consequently that of its original point distribution, even if the point location is uncertain. Visualization of the slope stability function and unstable regions outlines the structural stability of an uncertain point distribution. The overall stability of a surface is quantitatively evaluated by two indices, the SRR and SR. They also serve as criteria for evaluating the smoothing parameter a in terms of the surface stability. To test the validity of the method, we analysed the stability of surfaces generated from the point data of retail stores and restaurants in Tokyo, Japan. The empirical study yielded some interesting findings that help us understand the nature of surface stability, such as the effect of the smoothing parameter, locational uncertainty of points, and spatial pattern of points.

We finally discuss some limitations of our method for further research. First, we have assumed discrete representation of locational uncertainty, which may be called the ‘epsilon band approach’ (Perkal 1966, Chrisman 1982a, 1982b). One of the advantages of this approach is high tractability in computation of the slope stability function. However, this representation is not the only way to treat uncertainty; representation of locational uncertainty in spatial data is a research topic in progress, as mentioned in §3. If the locational uncertainty of points is represented by a continuous function, one practical solution is to determine a threshold of uncertainty or probability and use it to define the boundary of error zones. This enables us to calculate the slope stability function and to evaluate the stability of the surface generated from the points. Otherwise, we may have to calculate the joint probability distribution of the surface function directly from the probability distributions of points. Though this is an important and challenging research topic, we leave it for future research. Second, we have used the bivariate normal distribution as the kernel function throughout this paper. This is because the bivariate normal distribution is most widely used in GIS and spatial analysis. However, there are many other kernel functions such as the Epanechnikov kernel, triangular kernel, and so forth (Silverman 1986). When a kernel other than the bivariate normal distribution is used, calculation of the slope stability function requires a different computational procedure because equation (14) takes a different form. A new computational procedure with wider applicability should be developed to treat a variety of kernel functions. Third, we have discussed the effect of locational uncertainty in a rather limited context: conversion of a point distribution into a surface. As discussed in the introductory section, however, uncertainty affects the results of a wide range of spatial operations and analysis including spatial statistics, spatial optimization, and spatial decision making. It is necessary to discuss the effect of uncertainty in a broader context, and to establish a common methodology of handling uncertainty in these spatial procedures.
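The thresholding idea raised under the first limitation can be illustrated for one simple case. The sketch below assumes that the positional error of a point follows an isotropic bivariate normal distribution with standard deviation `sigma` along each axis, which is an illustrative model and not one specified in the paper; `error_zone_radius`, `coverage` and the value of 40 m are hypothetical. Under that assumption the distance from the true location is Rayleigh distributed, so the radius of the circular error zone holding a chosen probability has a closed form.

```python
import math


def error_zone_radius(sigma, coverage):
    """Radius of the circle that holds `coverage` of an isotropic 2D normal error.

    sigma    -- standard deviation of the positional error along each axis
    coverage -- probability mass the error zone should contain (0 < coverage < 1)
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    # Radial distance R is Rayleigh distributed:
    # P(R <= r) = 1 - exp(-r^2 / (2 sigma^2)), solved here for r.
    return sigma * math.sqrt(-2.0 * math.log(1.0 - coverage))


# Hypothetical example: positional error with sigma = 40 m, 95% threshold.
radius_95 = error_zone_radius(sigma=40.0, coverage=0.95)
print(f"95% error-zone radius for sigma = 40 m: {radius_95:.1f} m")
```

The resulting radius could then play the role of r in the discrete error-zone representation; the choice of coverage level remains a judgment that such a threshold approach leaves to the analyst.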

References

Alexander, F. E., and Boyle, P., 1996, Methods for Investigating Localized Clustering of Disease (Lyon, France: International Agency for Research on Cancer).
Bailey, T. C., 1994, A review of statistical spatial analysis in geographical information systems. In Spatial Analysis and GIS, edited by S. Fotheringham and P. Rogerson (London: Taylor & Francis), pp. 13–44.
Bailey, T. C., and Gatrell, A. C., 1995, Interactive Spatial Data Analysis (London: Taylor & Francis).
Bonham-Carter, G. F., 1994, Geographic Information Systems for Geoscientists (Oxford: Pergamon).
Bracken, I., 1993, An extensive surface model database for population-related information: concept and application. Environment and Planning B, 20, 13–27.
Bracken, I., and Martin, D., 1989, The generation of spatial population distributions from Census centroid data. Environment and Planning A, 21, 537–543.
Brantingham, P. J., and Brantingham, P. L., 1981, Environmental Criminology (Prospect Heights, Illinois: Waveland Press).
Burrough, P. A., and Frank, A. U., 1996, Geographic Objects with Indeterminate Boundaries (London: Taylor & Francis).
Burrough, P. A., and McDonnell, R. A., 1998, Principles of Geographical Information Systems (New York: Oxford University Press).
Chrisman, N. R., 1982a, Methods of Spatial Analysis Based on Error in Categorical Maps. PhD thesis, University of Bristol.
Chrisman, N. R., 1982b, A theory of cartographic error and its measurement in digital databases. In Proceedings of AUTO-CARTO (Falls Church, VA: American Congress on Surveying and Mapping), 5, pp. 159–168.
Cressie, N., 1993, Statistics for Spatial Data (New York: John Wiley & Sons).
Davis, T. J., and Keller, C. P., 1997, Modelling and visualizing multiple spatial uncertainties. Computers & Geosciences, 23, 397–408.
Dunn, R., Harrison, A. R., and White, J. C., 1990, Positional accuracy and measurement error in digital databases of land use: an empirical study. International Journal of Geographical Information Systems, 4, 385–398.
Fotheringham, A. S., Brunsdon, C., and Charlton, M., 2000, Quantitative Geography (London: Sage).
Goldsmith, V., McGuire, P. G., Mollenkopf, J. H., and Ross, T. A., 2000, Analyzing Crime Patterns (Thousand Oaks: Sage).
Goodchild, M. F., 1989, Modeling error in objects and fields. In Accuracy of Spatial Databases, edited by M. Goodchild and S. Gopal (London: Taylor & Francis), pp. 107–113.
Goodchild, M., and Gopal, S., 1989, The Accuracy of Spatial Databases (London: Taylor & Francis).
Heuvelink, G. B. M., 1998, Error Propagation in Environmental Modelling (London: Taylor & Francis).
Hunter, G. J., and Goodchild, M. F., 1997, Modelling the uncertainty of slope gradient and aspect estimates in spatial databases. Geographical Analysis, 29, 35–49.
Iri, M., Shimakawa, Y., and Nagai, T., 2000, Extraction of invariants from digital elevation data with application to terrain topography. Proceedings of the Symposium on Integrated Geographical Information Systems, 5, 33–46 (in Japanese).
Jacquez, G. M., 1996, Disease cluster statistics for imprecise space-time locations. Statistics in Medicine, 15, 873–885.
Leung, Y., and Yan, J., 1997, Point-in-polygon analysis under certainty and uncertainty. GeoInformatica, 1, 93–114.
Martin, D., and Bracken, I., 1991, Techniques for modelling population-related raster databases. Environment and Planning A, 23, 1069–1075.
Okabe, A., and Masuda, S., 1984, Qualitative analysis of two-dimensional urban population distribution in Japan. Geographical Analysis, 16, 301–312.
Perkal, J., 1966, On the length of empirical curves. Discussion Paper, 10, Ann Arbor, Michigan Inter-University Community of Mathematical Geographers.
Pielou, E. C., 1977, Mathematical Ecology (New York: John Wiley & Sons).
Sadahiro, Y., 2001, Analysis of surface changes using primitive events. International Journal of Geographical Information Science, 15, 523–538.
Sadahiro, Y., 2002, Cluster detection in uncertain point distributions: a comparison of four methods. Computers, Environment and Urban Systems, to appear.
Scott, D. W., 1992, Multivariate Density Estimation (New York: John Wiley & Sons).
Shi, W., 1998, A generic statistical approach for modelling error of geometric features in GIS. International Journal of Geographical Information Science, 12, 131–143.
Silverman, B. W., 1986, Density Estimation (London: Chapman & Hall).
Slocum, T. A., 1999, Thematic Cartography and Visualization (Upper Saddle River, New Jersey: Prentice Hall).
Upton, G., and Fingleton, B., 1985, Spatial Data Analysis by Example (Chichester: John Wiley & Sons).
Veregin, H., 1994, Integration of simulation modeling and error propagation for the buffer operation in GIS. Photogrammetric Engineering and Remote Sensing, 60, 427–435.
Warntz, W., 1966, The topology of a socioeconomic terrain and spatial flows. Papers of the Regional Science Association, 17, 47–61.
Warntz, W., and Waters, N., 1975, Network representations of critical elements of pressure surface. Geographical Review, 65, 476–492.