
Biometrical Journal 0 (0) 0, zzz–zzz / DOI: 10.1002/

On measures of dissimilarity between point patterns and their applications ∗

Jorge Mateu ∗∗,1, Frederic P Schoenberg 2, David M. Diez 3, Jonatan A. Gonzalez 1, and Weipeng Lu 2

1 Department of Mathematics, University Jaume I, E-12071, Castellon, Spain
2 Department of Statistics, UCLA, 8125 Math-Sciences, Los Angeles, CA 90095, USA
3 Department of Biostatistics, Harvard SPH, MA 02115, Boston, USA

Various measures of dissimilarity between point patterns are outlined, and their uses and pros and cons are reviewed. Examples include spike-time distance and its variants, as well as cluster-based distances and dissimilarity measures based on classical statistical summaries of point patterns, such as the K-function. Applications to the summary and description of collections of repeated realizations of a point pattern via prototypes or multidimensional scaling are explored. A simulation study to evaluate the performance of multidimensional scaling with two types of selected distances is considered. Finally, a real dataset on a plant community is analyzed.

Key words: K-function; Multidimensional scaling; Point patterns; Prototypes; Spike-time distance.

1 Introduction

Point processes, which are random collections of points falling in some measurable space, have found use in describing an increasing number of naturally arising phenomena in a wide variety of applications, including epidemiology, ecology, forestry, mining, hydrology, astronomy, and meteorology (Cox and Isham; 1980; Ripley; 1981; Karr; 1991; Daley and Vere-Jones; 2003; Møller and Waagepetersen; 2004; Schoenberg and Tranbarger; 2008; Tranbarger and Schoenberg; 2010; Schoenberg; 2011).

Point processes evolved naturally from renewal theory and the statistical analysis of life tables dating back to the 17th century, and in the earliest applications, each point represented the occurrence time of an event, such as death or incidence of disease (see e.g. chapter 1 of Daley and Vere-Jones (2003) for a review). In the mid-20th century, interest expanded to spatial point processes, where each point represented the location of some object or event, such as a tree or sighting of a species (Ripley; 1981; Cressie; 1993; Diggle; 2003). The classical model for temporal or spatial point processes is the Poisson process, where the numbers of points in any disjoint sets are independent random variables; the name comes from the fact that for point processes with this independence property, the number of points in any measurable set is a Poisson random variable (see Daley and Vere-Jones (2003), chapter 2.4). The Poisson process is a natural null model in the absence of clustering or inhibition, and it also arises commonly as a limiting distribution in many circumstances (Karr; 1991; Daley and Vere-Jones; 2003; Møller and Waagepetersen; 2004; Tranbarger and Schoenberg; 2010).

∗ Work partially funded by grant MTM2010-14961 from the Spanish Ministry of Science and Education

∗∗ Corresponding author: e-mail: [email protected], Phone: +34-964-728-391, Fax: +34-964-728-429

© 0 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.biometrical-journal.com


Alternative models for spatial point processes (Cressie (1993), chapter 8, or Møller and Waagepetersen (2004)) grew quite intricate over the course of the 20th century, and among the names associated with these models are some of the key names in the history of statistics, including Jerzy Neyman and David Cox. Today, much attention is paid to space-time point processes, where each point represents the time and location of an event, such as the origination of an earthquake or wildfire, a lightning strike, or an incidence of a particular disease (Vere-Jones; 2009; Tranbarger and Schoenberg; 2010).

The intimate relationship between point processes and time series is worth noting. Indeed, many data sets that are traditionally viewed as realizations of point processes could in principle also be regarded as time series, and vice versa (Cox and Isham; 1980; Tranbarger and Schoenberg; 2010). For instance, a sequence of earthquake origin times is typically viewed as a temporal point process, though one could also store such a sequence as a time series consisting of zeros and ones, with the ones representing earthquakes. The main difference is that for a point process, the points can occur at any times in a continuum, whereas for time series, the time intervals are discretized. In addition, if the points are sufficiently sparse, one can see that it may be far more practical to store and analyze the data as a point process, rather than deal with a long list containing mostly zeros.

By the mid 1990s, models for spatial-temporal point processes had become plentiful and often quite intricate. Around this time, a wealth of large multiple neuronal spike train datasets began to become available to neuroscientists, and the analysis of this type of data required some slightly different methods (Brown et al.; 2004; Kulkarni and Paninski; 2007). Unlike data from seismology or epidemiology, for instance, where the events were naturally seen as one long instance of a point process, the neuronal data typically consisted of many repeated observations of a point process. For instance, one might observe the times of firings of neurons in a particular part of the brain in the instant immediately following a stimulus, for many different subjects, some of whom have a disease and some of whom do not.

In order to classify neuronal spike trains into clusters or to differentiate between diseased and healthy patients based on their neuronal firing patterns, methods were sought which would require the definition of a distance between two point patterns. The seminal work of Victor and Purpura (1997) proposed several distance metrics, including spike-time distance, which the authors used for describing neuronal spike trains. The list of distances in Victor and Purpura (1997) is far from exhaustive, however, and certain alternative types of distances and non-metric measures of dissimilarity are more useful for clustered or inhomogeneous point processes or for point processes in high-dimensional spaces. Literature on spatial point patterns has mostly focused on modeling the (spatial) distribution of the locations, and only distances among such locations play a role. Little attention has been paid to measuring distances between point patterns to solve, for example, clustering or classification problems or prototype determination.

The purpose of this article is to outline and review several types of distances (and non-metric measures of dissimilarity) between two point patterns, X and Y, each observed on the same metric space. We also discuss some examples and applications of the proposed distance metrics to cluster analysis and prototype determination. The paper is outlined as follows. In Section 2, we discuss the spike-time distance and its variants, which involve matching individual points of X to corresponding points in Y. Dissimilarity measures based on functional or scalar descriptors of point processes, including estimators of first and second moments of the processes, or classical test statistics based on these moments, are described in Section 3. Applications to summaries of collections of independent realizations of a point process via prototypes or multidimensional scaling are given in Section 4. This Section also includes a simulation study and a real data analysis. The paper ends with a Section containing some discussion and further remarks. Finally, a brief introduction to the stDist function is provided in the Appendix.



2 Pointwise distances between point patterns through transformations

Following some standard treatments such as Cressie (1993), Daley and Vere-Jones (2003), and Møller and Waagepetersen (2004), we refer to a locally finite collection of points falling in some space as a point pattern or point configuration, and a random point pattern, i.e., a mapping from a probability space to the set of point patterns, is called a point process.

One family of distances between point patterns is characterized by a raw transformation of one point pattern X into another point pattern Y by actions on individual points. Individual points in X are moved so that the resulting layout resembles Y.

The spike-time distance developed by Victor and Purpura (1997) is one popular distance metric that has been applied to neuronal, earthquake, and wildfire data (Victor and Purpura; 1997; Schoenberg and Tranbarger; 2008; Tranbarger and Schoenberg; 2010; Nichols et al.; 2011; Diez et al.; 2012). Techniques for computing this distance metric were only recently extended to multiple dimensions (Diez et al.; 2010, 2012). The distance is defined as the minimum cost associated with the transformation of one point pattern X into a pattern Y by deleting, adding, and moving points. The left panel of Figure 1 represents a transformation of X into Y via actions on individual points of X. Those points in X not moved are deleted from X, and those points in Y to which no point in X corresponds are added to X. If we describe such a transformation as T, then the cost associated with T is

$$C(T \mid X, Y) = p_d |X_{\mathrm{delete}}| + p_a |Y_{\mathrm{add}}| + \sum_{x \in X_{\mathrm{move}}} p_m d_x \qquad (1)$$

where $p_d$, $p_a$, and $p_m$ are penalty parameters, $X_{\mathrm{delete}}$, $Y_{\mathrm{add}}$, and $X_{\mathrm{move}}$ are the subsets of X and Y that are deleted, added, and moved, respectively, $d_x$ is the distance a point x in $X_{\mathrm{move}}$ is moved, and |S| represents the number of points in a set S. The spike-time distance, $d_{st}(X,Y)$, is defined as the infimum of equation (1) over all such possible transformations T. The transformation shown in the left panel of Figure 1 is the transformation minimizing (1) for a particular set of parameters. Victor and Purpura (1997) showed that spike-time distance is a well-defined distance metric, and they also considered other closely related distances. Spike-time distance in single or multiple dimensions may be computed using the free R statistical software via the stDist function in the ppMeasures package (Team; 2010; Diez et al.; 2010). The C code forming the base of the package is available through the package page on the R Project website, and a brief introduction to the stDist function is provided in the Appendix. A spike analysis toolkit for single-dimensional analyses was also made available for MATLAB and Octave (Goldberg et al.; 2009).
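For one-dimensional patterns, the infimum of equation (1) can be found with a standard edit-distance style dynamic program. The following Python sketch is only an illustration under that one-dimensional assumption; it is not the stDist implementation from ppMeasures, and the function name `spike_time_distance` and its default penalties are hypothetical.

```python
def spike_time_distance(x, y, p_d=1.0, p_a=1.0, p_m=1.0):
    """Minimum cost of turning 1-D pattern x into y by deleting, adding, and moving points."""
    x, y = sorted(x), sorted(y)
    n, m = len(x), len(y)
    # cost[i][j]: cheapest way to turn the first i points of x into the first j points of y
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * p_d  # delete every remaining point of x
    for j in range(1, m + 1):
        cost[0][j] = j * p_a  # add every remaining point of y
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j] + p_d,                                 # delete x[i-1]
                cost[i][j - 1] + p_a,                                 # add y[j-1]
                cost[i - 1][j - 1] + p_m * abs(x[i - 1] - y[j - 1]),  # move x[i-1] onto y[j-1]
            )
    return cost[n][m]
```

With unit penalties, `spike_time_distance([1.0, 2.0, 6.0], [1.2, 5.9])` is about 1.3: two points are moved a short distance and one is deleted. Moving a point further than $(p_d + p_a)/p_m$ is never optimal, because deleting and re-adding it is cheaper.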

Another useful distance function that is also defined in terms of such transformations of X into Y is the nearest-point distance, where each point x in X is moved to its nearest neighbor in Y; call this nearest point $y_x$. Unlike spike-time distance, nearest-point distances are computed without allowing the addition or deletion of points. The nearest-point distance is defined simply as

$$d_n(X,Y) = \sum_{x \in X} \|x - y_x\|$$

where $\|\cdot\|$ may represent Euclidean distance for point patterns in $\mathbb{R}^k$, or some other distance metric for point patterns in a more general metric space. The measure $d_n$ is useful in facility placement. For instance, if points in X represent consumers and points in Y represent facilities, then $d_n$ characterizes the total cost for all consumers to visit their closest facility. Note that for two distinct points $x, x'$ in X, we may have $y_x = y_{x'}$, and that in general, $d_n(X,Y) \neq d_n(Y,X)$, so that $d_n$ is not formally a distance metric. However, the sum of these two distances forms a symmetric nearest-point distance metric

$$d_N(X,Y) = d_n(X,Y) + d_n(Y,X).$$
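The asymmetry of the nearest-point distance and its symmetrized form can be illustrated with a few lines of Python. This is a minimal sketch assuming Euclidean distance and patterns stored as lists of coordinate tuples; the function names are hypothetical and not part of any package mentioned here.

```python
import math

def nearest_point_distance(X, Y):
    """Sum over x in X of the Euclidean distance to the nearest point of Y (Y must be non-empty)."""
    return sum(min(math.dist(x, y) for y in Y) for x in X)

def symmetric_nearest_point_distance(X, Y):
    """Sum of the two directed nearest-point distances."""
    return nearest_point_distance(X, Y) + nearest_point_distance(Y, X)
```

For X = [(0, 0), (1, 0)] and Y = [(0, 0)], the directed distance from X to Y is 1 while the distance from Y to X is 0, which shows why the directed version is not symmetric.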

Figure 1 Spike-time distance (left) and nearest-point distance for moving X to Y (right).

When a point process is highly clustered, metrics such as spike-time distance tend to yield prototypes that do not reflect a typical sample point pattern. Clustered processes thus require a separate family of distances where movements of collections of points are permitted. For example, consider the following metric, called cluster distance.

Let T represent a transformation of X into Y that involves sequentially moving collections of points in X. Thus, T is itself a sequence of transformations, $t_i$, where each $t_i$ moves a subset $X_i$ of points by a vector $z_i$ (see the first two panels of Figure 2). Then the cost associated with T is defined as

$$p_d |X_{\mathrm{delete}}| + p_a |Y_{\mathrm{add}}| + \sum_i p_m |X_i|^q \, \|z_i\| \qquad (2)$$

where q is a parameter in [0, 1). The cluster distance, $d_c(X,Y)$, is the infimum cost over all such transformations, T. The further each cluster must be moved (the larger $\|z_i\|$ is), and the more points that must be moved in order to match X up with Y (the larger $|X_i|$ is), the larger the cluster distance between X and Y.
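The effect of the exponent q is easiest to see by evaluating the cost in equation (2) for one fixed transformation. The Python sketch below does only that; it does not search for the infimum, and the function name, argument layout, and default parameter values are hypothetical choices made for illustration.

```python
import math

def cluster_transformation_cost(moves, n_deleted, n_added,
                                p_d=1.0, p_a=1.0, p_m=1.0, q=0.5):
    """Cost of one transformation under equation (2).

    moves is a list of (subset_size, translation_vector) pairs, one per step t_i.
    """
    if not 0 <= q < 1:
        raise ValueError("q must lie in [0, 1)")
    move_cost = sum(p_m * (size ** q) * math.hypot(*z) for size, z in moves)
    return p_d * n_deleted + p_a * n_added + move_cost
```

For example, translating a cluster of four points by the vector (3, 4) with q = 0.5 contributes $4^{0.5} \times 5 = 10$ to the cost, whereas moving the same four points individually under equation (1) with $p_m = 1$ would cost 20.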

A similar type of distance for clustered point processes is defined by first aligning clusters in X with clusters in Y and subsequently applying simple spike-time distance. For instance, let $\{R_j\}$ represent a set of disjoint and convex regions of the space that contain all the points of X. One may translate all of the points in a given region by some fixed vector and repeat for each region, assigning a cost of $p_c$ per unit distance to each translated region, and then the spike-time distance may be applied after these regions are moved. This declustered spike-time distance is defined as the minimum cost over all such transformations and choices of $\{R_j\}$. An illustration is given in the right panel of Figure 2.

A drawback of both the cluster distance and the declustered spike-time distance is that methods for their efficient computation appear to be unavailable presently, and searching over all possible translations of all possible subsets of points in X is extremely computationally burdensome.



Figure 2 The first two steps in the cluster distance are shown in the left and center panels. The right panel represents the process of removing the clustering characteristics prior to applying spike-time distance.

An advantage of the cluster distance is that the clusters $X_i$ and their associated translation vectors, $z_i$, may have physical meaning and be of direct interest. For instance, when the points are observed in time, a subset of point pattern X may be nearly equivalent to a subset of point pattern Y apart from a temporal delay in response to stimuli. In the purely spatial case, the points in X or Y may be translated due to physical deformation or may require spatial translation due to mis-calibration of the recording instruments.

3 Dissimilarity measures based on numerical and functional summary statistics

In this section, we describe further techniques for quantifying the difference between point patterns. Some of these differences satisfy the formal definitions of a metric, while others described below, such as integrals of the squared differences between K-functions or empty-space functions, may be useful for comparisons of important pattern characteristics despite violating the triangle inequality or other technical properties formally required of distance metrics (Searcoid; 2006; Deza and Deza; 2009).



3.1 Model-based distances

Probability densities for point processes behave much like probability densities in more familiar contexts. Spatial point process models that are constructed by writing down their probability densities are of interest to the statistical community. To construct spatial point processes which exhibit interpoint interaction (stochastic dependence between points), we need to introduce terms in the density that depend on more than one point.

A spatial point process model is typically specified not in terms of its likelihood but rather in terms of its conditional intensity, i.e. the rate at which points occur at the location in question, given information on all previous points. This turns out to be an intuitively appealing way to formulate point process models, as well as being necessary for many models. For spatial point processes, the lack of a natural ordering in a two-dimensional space implies that there is no natural generalization of the conditional intensity of a temporal process given the past or history up to time t. Instead, the appropriate counterpart for a spatial point process is the Papangelou conditional intensity (Daley and Vere-Jones; 2003), say $\lambda(u, X)$, which conditions on the outcome of the process at all spatial locations other than u. Informally, $\lambda(u, X)\,du$ gives the conditional probability of finding a point of the process inside an infinitesimal neighborhood of the location u, given complete information on the point pattern at all other locations. The Papangelou conditional intensity of a finite point process uniquely determines its probability density and vice-versa.

Given specific models for the point processes giving rise to the point patterns X and Y, one may define the distance between X and Y in terms of the differences between characteristics of these models. For instance, if the point processes X and Y are characterized by their conditional intensities $\lambda_X(\eta)$ and $\lambda_Y(\eta)$, respectively, then one metric summarizing the difference between these point process models is the integral of the squared difference of the two conditional intensities over the observation region W

$$d_{mb}(X,Y) = \int_W \left(\lambda_X(\eta) - \lambda_Y(\eta)\right)^2 d\eta. \qquad (3)$$

This is illustrated in the left panel of Figure 3, where the intensity functions have been estimated based on the data shown at the top of the panel. These methods readily extend to a variety of other summaries, such as the integrated squared difference between the overall mean, or second moment measure, or higher moments or cumulants of the processes. Of course, the conditional intensities in (3) may be replaced by their expected values, their overall intensities, or in the spatial point process setting, by the Papangelou intensities (see e.g. Daley and Vere-Jones (2003)).
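As a rough numerical illustration of equation (3), the sketch below estimates each intensity with a Gaussian kernel on the real line and approximates the integral with the trapezoidal rule. The one-dimensional setting, the Gaussian kernel, the bandwidth argument, and the function names are all assumptions made here for illustration; they are not prescribed by the text.

```python
import math

def kernel_intensity(points, bandwidth):
    """Return a function t -> Gaussian-kernel estimate of the intensity at t."""
    norm = bandwidth * math.sqrt(2 * math.pi)

    def lam(t):
        return sum(math.exp(-0.5 * ((t - p) / bandwidth) ** 2) for p in points) / norm

    return lam

def model_based_distance(lam_x, lam_y, lower, upper, n_steps=1000):
    """Trapezoidal approximation of the integral of (lam_x - lam_y)^2 over [lower, upper]."""
    h = (upper - lower) / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        t = lower + k * h
        weight = 0.5 if k in (0, n_steps) else 1.0  # end points get half weight
        total += weight * (lam_x(t) - lam_y(t)) ** 2
    return total * h
```

A typical call would be `model_based_distance(kernel_intensity(X, 0.5), kernel_intensity(Y, 0.5), 0.0, 5.0)`; any other pair of fitted intensity functions can be passed in the same way.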

3.2 Distances based on functional statistical summaries

Differences in point process behavior can also be characterized by comparing classical summary measures of the first or second moments for each of the point patterns. For instance, given two point patterns X and Y, one could imagine looking at an estimate of the overall intensity of X, e.g. a kernel estimate, and a similar estimate of the intensity of Y. Alternatively, if each of the point patterns is one-dimensional, then one could look at the empirical cumulative distribution function for each point pattern as a statistical summary of the realization.

In his influential paper, Ripley (1977) developed an exploratory analysis of interpoint interaction, assuming that the data are spatially homogeneous. A useful summary statistic is the nonparametric estimator of Ripley's K-function, essentially a renormalized empirical distribution of the pairwise distances between observed points. More pragmatically, for a stationary point process with intensity (mean number of points per unit area) $\lambda$, $\lambda K(r)$ gives the expected number of other points of the process within a distance r of a typical point of the process.



Thus, a further alternative to define distances between point patterns would be to take the estimated K-function or its derivative, the estimated reduced second moment measure, for each pattern, and examine the difference. These functional summary statistics can also be adapted and used in non-stationary contexts, where the intensity of the point pattern is no longer a constant $\lambda$ but a function of the spatial locations and/or covariates. For example, a general estimator for the K-function (see Møller and Waagepetersen (2004)) is given by

$$\hat{K}(r) = \sum_{\xi, \eta \in W}^{\neq} \frac{1[0 < \|\xi - \eta\| \le r]}{\lambda(\xi)\lambda(\eta)} \, e(\xi, \eta)$$

where 1[·] is the indicator function, and $e(\xi, \eta)$ is an edge correction factor (Ripley (1988)). As an illustration, one could compare the integrated squared difference between estimates of Ripley's K-function for two point patterns

$$d_K(X,Y) = \int_0^{r_0} \left(\hat{K}_X(r) - \hat{K}_Y(r)\right)^2 dr \qquad (4)$$

where the upper limit $r_0$ is a user-specified value taken to ensure the stabilization of the variance of the K-function (in the unit square this value is usually taken as 0.25; see Illian et al. (2008)). This difference is illustrated in the right panel of Figure 3. While this integrated squared difference does not meet the technical requirements of a distance metric between two point patterns, it is nevertheless useful when considering differences in clustering behavior among the point patterns.
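To make equation (4) concrete, the following Python sketch computes a simplified K-function estimate and the integrated squared difference between two such estimates. It assumes a homogeneous pattern with the intensity estimated as n/|W| and a trivial edge factor of 1/|W| (that is, no real edge correction), so it is only a rough stand-in for the estimators used in practice; the function names are hypothetical.

```python
import math

def k_estimate(points, r, area):
    """Simplified K estimate: constant intensity n/area and no edge correction."""
    n = len(points)
    lam = n / area
    # Count ordered pairs of distinct points at distance in (0, r]
    pairs = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and 0 < math.dist(p, q) <= r
    )
    return pairs / (lam * lam * area)

def k_distance(X, Y, area, r0=0.25, n_steps=50):
    """Trapezoidal approximation of equation (4) on [0, r0]."""
    h = r0 / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        r = k * h
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * (k_estimate(X, r, area) - k_estimate(Y, r, area)) ** 2
    return total * h
```

The default `r0=0.25` follows the unit-square convention mentioned above; for other windows the upper limit and the `area` argument would need to be chosen accordingly.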

Figure 3 Left: A comparison of two intensity models using Equation (3). In this illustration, the intensities $\lambda_X$ and $\lambda_Y$ are estimated by kernel smoothing the points in X and Y, respectively. Right: an example using Equation (4) to characterize the difference in clustering behavior of two patterns using Ripley's K-function.

First and second order measures represent a valuable description of a spatial point process but do not generally uniquely characterize a point process. Additional important summary descriptions are given by distance methods based on measures of some distances between points. The nearest-neighbor distance D may be defined as the distance from a point of the pattern to the closest of the other points in the same pattern. The empty space distance F is the distance from an arbitrary fixed location to the nearest point of the pattern. For a given point pattern, a set of distances (nearest-neighbor or empty space distance) can be summarized by means of their corresponding distribution functions. Let G(r) denote the nearest-neighbor distribution function, and F(r) the corresponding first contact distribution function or empty space function. These distribution functions may be estimated from the observed point patterns X and Y using conventional methods. For example, if $d_i$ defines the distance from each of m points in a regular lattice arrangement to the nearest event, then $\hat{F}(r) = m^{-1} \sum 1(d_i \le r)$. Similarly, if $h_i$ is the distance from each of n events of the point pattern to its nearest neighbor, then $\hat{G}(r) = n^{-1} \sum 1(h_i \le r)$.

In practice, each point pattern is typically observed in a bounded region W, and without precautions, these boundaries can lead to biased estimates. This problem is known in the context of spatial statistics as edge-effects. Different edge-corrected estimators of the functions F and G have been proposed (see e.g. Cressie; 1993). There is a clear analogy with censoring in the context of survival analysis (Baddeley and Gill; 1997). The different distances observed within a single point pattern are really censored distances. Let $d_i$ denote the observed distance from the ith point of the point pattern to its nearest neighbor within the sampling window, W, and $c_i$ its distance to the complement of the sampling window W. If $c_i < d_i$, the real nearest neighbor could be outside the window and, in this case, the real and unknown nearest-neighbor distance $d_i$ fulfills $d_i > c_i$, and the observation is censored. A similar comment applies to empty space distances. In both cases the censoring distance is the distance from the sampling point to the frame window.
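The uncorrected estimators of F and G and the censoring rule described above can be sketched in a few lines of Python. The unit-square window, the lattice of cell centres, and the function names are assumptions made for this illustration; no edge correction is applied, so the sketch only flags which nearest-neighbor distances would be treated as censored.

```python
import math

def nearest_neighbor_distances(points):
    """Distance from each event to the closest other event (needs at least two events)."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]

def g_estimate(points, r):
    """Uncorrected estimate of G(r): share of events with nearest neighbor within r."""
    h = nearest_neighbor_distances(points)
    return sum(1 for v in h if v <= r) / len(h)

def f_estimate(points, r, grid_size=10):
    """Uncorrected estimate of F(r) from a regular lattice on the unit square."""
    lattice = [
        ((a + 0.5) / grid_size, (b + 0.5) / grid_size)
        for a in range(grid_size)
        for b in range(grid_size)
    ]
    d = [min(math.dist(u, p) for p in points) for u in lattice]
    return sum(1 for v in d if v <= r) / len(d)

def censored_flags(points):
    """True where the distance to the unit-square boundary is below the observed distance."""
    h = nearest_neighbor_distances(points)
    c = [min(x, 1 - x, y, 1 - y) for x, y in points]
    return [ci < hi for ci, hi in zip(c, h)]
```

Events flagged by `censored_flags` are those whose true nearest neighbor might lie outside the window, which is the situation the edge-corrected estimators are designed to handle.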

Given a statistical summary such as the estimated F- or G-function of the point processes, the dissimilarity between point processes X and Y can be defined as the distance between the corresponding estimated functional summaries. For instance, let $\hat{F}_X$ and $\hat{F}_Y$ be the corresponding estimated empty-space distribution functions for X and Y. The dissimilarity between X and Y can thus be defined as

$$D(X,Y) = d(\hat{F}_X, \hat{F}_Y) \qquad (5)$$

where d stands for a metric between the functions. For instance, one may use the L2 metric

$$D^2_F(X,Y) = \int_0^{r_0} \left(\hat{F}_X(r) - \hat{F}_Y(r)\right)^2 dr \qquad (6)$$

or the L∞ metric

$$D^\infty_F(X,Y) = \sup_r \left\| \hat{F}_X(r) - \hat{F}_Y(r) \right\| \qquad (7)$$

In these examples the differences between patterns X and Y are not distance metrics, but these dissimilarity measures are nevertheless useful for comparing characteristics of point patterns. By replacing the empty-space distribution function with the nearest-neighbor distribution function G or the K-function K, one can similarly define $D^2_G(X,Y)$, $D^\infty_G(X,Y)$, $D^2_K(X,Y)$ (which is equation (4)), and $D^\infty_K(X,Y)$, respectively.

Note that the sampling variability in the estimates of the K-function tends to increase with the distance r, and this can have a great influence on the value of the dissimilarity measure. This suggests the use of the L-function proposed by Besag (1977), a variance-stabilized version of the K-function, in defining $D^2_K(X,Y)$ in equation (4).
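Equations (6) and (7) apply to any pair of estimated summary functions, so they can be sketched generically. The Python functions below take two callables and approximate the integral and the supremum on a regular grid over $[0, r_0]$; the grid approximation and the function names are assumptions of this sketch.

```python
def l2_dissimilarity(f_x, f_y, r0, n_steps=100):
    """Trapezoidal approximation of the integrated squared difference on [0, r0]."""
    h = r0 / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        r = k * h
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * (f_x(r) - f_y(r)) ** 2
    return total * h

def sup_dissimilarity(f_x, f_y, r0, n_steps=100):
    """Largest absolute difference over the grid, approximating the supremum on [0, r0]."""
    h = r0 / n_steps
    return max(abs(f_x(k * h) - f_y(k * h)) for k in range(n_steps + 1))
```

Passing estimates of F, G, K, or the L-function as `f_x` and `f_y` gives grid approximations of the corresponding dissimilarities; because the supremum is taken over a finite grid, a finer grid may be needed for step functions with closely spaced jumps.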

4 Applying measures to a collection of patterns

In this section, we present and develop two main uses of distances between point patterns. In particular, distance measures can be used in the identification of prototypical patterns or with multidimensional scaling for classification. Each technique is considered in the context of a collection of patterns that come from either one or many point processes, and an introduction to referenced R packages is included in the Appendix.



4.1 Prototypical patterns

One application of distance measures is in the construction of a point pattern prototype, which is the typical pattern of a collection. Prototypes are useful for identifying the typical pattern in a collection, much like the median is used to describe the typical observation in a univariate data set. The prototype has generally been used to construct robust representations of a collection of patterns using spike-time distance as the distance metric in applications including neuronal spike behavior, earthquake aftershocks, and wildfire activity in California (Schoenberg and Tranbarger; 2008; Nichols et al.; 2011; Diez et al.; 2012).

The prototype $P_d(C)$ of the collection of patterns C based on the distance measure d is defined as the pattern that minimizes a weighted sum of distances to the patterns in the collection,

$$P_d(C) = \arg\min_P \sum_{X_i \in C} \alpha_i \, d(P, X_i)$$

where $\alpha_i$ is the weight corresponding to point pattern $X_i$. The weights $\alpha_i$ are typically chosen to be unity unless certain point patterns are deemed more important than others, or if the point patterns are measured with differential error. Recall that the median and mean can be defined in a similar way: each is the minimizer of a loss function, the sum of absolute deviations ($L_1$) for the median and the sum of squared deviations ($L_2$) for the mean.
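A simple way to see the definition at work is to restrict the search for P to the patterns already in the collection, which gives a medoid rather than the full prototype. The Python sketch below makes that simplifying assumption explicitly; it is not the algorithm behind ppPrototype, and the function name and the `dist` argument are hypothetical.

```python
def medoid_prototype(patterns, dist, weights=None):
    """Pattern in the collection minimizing the weighted sum of distances to all patterns.

    This restricts the minimization to members of the collection (a medoid),
    which is only an approximation to the prototype defined in the text.
    """
    if weights is None:
        weights = [1.0] * len(patterns)

    def loss(candidate):
        return sum(w * dist(candidate, x) for w, x in zip(weights, patterns))

    return min(patterns, key=loss)
```

With the toy "patterns" 1, 2, 3, 4, 100 and absolute difference as the distance, the medoid is 3, the median, which is unaffected by the outlying value 100. Any dissimilarity between point patterns, such as a spike-time distance function, can be passed as `dist` in the same way.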

As an illustrative example, kernel density and prototype summaries for ten simulated patterns are shown in Figure 4. The kernel density is a pooled estimate based on all patterns and is heavily influenced by pattern 10. The prototype, using the spike-time distance, is shown to be resistant to the tenth pattern and characterizes the typical pattern in the collection.

Figure 4 Kernel density and prototype summaries for a collection of ten patterns. Each pattern is represented by a sequence of events at times denoted by circles.

A two-dimensional example of a prototype is shown in Figure 5. Here the sample collection contains 10 point patterns, nine of which are realizations of an inhomogeneous Poisson process with intensity shown in the plot in the first row and first column. Each pattern is expected to have about 15 points. The point pattern in row 3 and column 1 is a realization of a homogeneous Poisson process with an overall intensity of three times that of the original process. Just as in the one-dimensional case, the prototype, shown in the bottom-right panel, is resistant to a single atypical pattern.

Figure 5 The top-left panel shows an intensity function over the space [0, 1] × [0, 1]. The second panel in the first row represents a legend of the intensity, and the next 9 panels show simulations of an inhomogeneous Poisson process based on the intensity, except the bottom-left panel, which is a realization of a homogeneous Poisson process with λ = 42. The prototype of these 9 point patterns, using the spike-time distance with parameters chosen by default according to the recommendations in Diez et al. (2012), is shown in the bottom-right panel.

Tranbarger and Schoenberg (2010) developed techniques for estimating the prototype in one dimension. This work was extended to multiple dimensions in Diez (2010); Diez et al. (2012). Prototype methods are freely available through the ppPrototype function in the ppMeasures package on CRAN (Team; 2010; Diez et al.; 2010). Many additional examples are included in the package help files.

4.2 Multidimensional scaling

Classical multidimensional scaling (MDS) is useful for pattern classification based on a distance metric. Given a collection of point patterns, $C = \{X_1, \ldots, X_n\}$, a distance matrix D is computed where $D_{i,j} = d(X_i, X_j)$ for some distance measure d. MDS uses this distance matrix to estimate relative locations of the patterns in $\mathbb{R}^k$, where k is generally selected by the user. Each pattern is itself represented by a point, and MDS embeds these points in locations of $\mathbb{R}^k$. The goal of this embedding is to place points representing similar (dissimilar) patterns close to (far from) each other. To achieve this aim, MDS selects a location for each pattern such that the resulting Euclidean distances between the patterns, described by $\hat{D}$, approximate the pattern distances D by minimizing the following loss function

$$L(D, \hat{D}) = \sum_{i \neq j} \left[ D_{ij} - \hat{D}_{ij} \right]^2.$$

See e.g. Venables and Ripley (2002) for additional details on classical MDS and its variants, such as Kruskal's MDS and Sammon mapping.

MDS can be useful both in identifying groupings of points (in this case, groups of patterns) and in classification. If classifications are provided on a training set, then a distance metric may be applied to compute the distance matrix D. Applying MDS places the patterns into a new space $\mathbb{R}^k$, where traditional classification techniques may be applied. The MDS methods are implemented in the R package smacof using the smacofSym function (de Leeuw and Mair; 2009).
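The squared-error loss between D and the embedded Euclidean distances can be reduced iteratively with the unweighted SMACOF (majorization) update. The Python sketch below is a bare-bones illustration of that update starting from a random configuration; it is not the smacofSym implementation, and the function names, iteration count, and seed handling are assumptions made here.

```python
import math
import random

def pairwise(conf):
    """Matrix of Euclidean distances between the rows of a configuration."""
    n = len(conf)
    return [[math.dist(conf[i], conf[j]) for j in range(n)] for i in range(n)]

def raw_stress(D, conf):
    """Sum over i != j of the squared difference between D and the embedded distances."""
    E = pairwise(conf)
    n = len(D)
    return sum((D[i][j] - E[i][j]) ** 2 for i in range(n) for j in range(n) if i != j)

def smacof(D, k=2, n_iter=300, seed=0):
    """Unweighted SMACOF iterations (Guttman transform) from a random start in R^k."""
    rng = random.Random(seed)
    n = len(D)
    conf = [[rng.uniform(-1, 1) for _ in range(k)] for _ in range(n)]
    for _ in range(n_iter):
        E = pairwise(conf)
        new = [[0.0] * k for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i != j and E[i][j] > 0:
                    ratio = D[i][j] / E[i][j]
                    for a in range(k):
                        new[i][a] += ratio * (conf[i][a] - conf[j][a])
        conf = [[v / n for v in row] for row in new]
    return conf
```

Here D would be the matrix of pairwise pattern dissimilarities, for example spike-time distances, and the returned rows are the embedded coordinates. The majorization update does not increase the loss from one iteration to the next, but it may stop at a local minimum, so several random starts are advisable in practice.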

Figure 6 shows 15 patterns in a one-dimensional space. Each pattern arose from a Poisson process with an overall mean of 12 points per pattern, as described by the figure's intensities in the left panel. The right panel of Figure 6 shows MDS applied to these fifteen patterns using the spike-time distance. One sees a grouping of the patterns based on their original intensity functions.

Figure 6 Left: Two Poisson intensities, each with a total intensity of 12. Middle: 15 patterns, five from the first Poisson process and ten from a second Poisson process depicted in the left panel. Right: The spike-time distance was used as the metric to construct the matrix D, and multidimensional scaling applied to all 15 patterns. The black circles denote the first five patterns and the red crosses mark those of the second process.

Multidimensional scaling may be applied to any of the distance metrics introduced in the previous section. Figure 7 shows a second example of 15 two-dimensional patterns and their representation as points in [0, 1] × [0, 1], where the distance $D_K^2$ was used to construct the dissimilarity matrix. Each point pattern is a realization of one of two Neyman-Scott processes with different parameters. Again, one sees that MDS is quite successful at classifying the point patterns into two groups.

A measure of goodness-of-fit associated with the MDS technique is the stress measure. The stress is intended to measure how well the configuration matches the data. It is supposed that the true dissimilarities result from some unknown monotone distortion of the inter-point distances of some true configuration, and that the observed dissimilarities differ from the true dissimilarities only because of random fluctuations. The stress is essentially the root-mean-square residual departure from this hypothesis.

Following Kruskal (1964), we regard an MDS configuration whose stress is above 20% as of no interest. A configuration with a stress above 15% must still be treated with caution, and a stress from 10% to 15% indicates that the MDS result can be improved. Finally, a stress from 5% to 10% indicates that the MDS configuration is satisfactory, and a stress below 5% corresponds to an almost perfect MDS configuration.
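To make the stress measure and these bands concrete, the following base-R sketch computes a Kruskal-type stress-1 and maps it to the verbal categories used above. It is a simplified metric version: the monotone regression step of nonmetric MDS is omitted, cmdscale stands in for smacofSym, and the helper names stress1 and stress_label are hypothetical.

```r
# Stress-1 between a dissimilarity matrix `delta` and a configuration `conf`.
# Assumption: the raw dissimilarities are used directly as disparities.
stress1 <- function(delta, conf) {
  d_conf <- as.matrix(dist(conf))      # inter-point distances of the configuration
  low <- lower.tri(delta)              # use each pair once
  sqrt(sum((delta[low] - d_conf[low])^2) / sum(d_conf[low]^2))
}

# Verbal bands as stated in the text (stress given as a proportion, not a percent).
stress_label <- function(s) {
  if (s > 0.20) "of no interest"
  else if (s > 0.15) "treat with caution"
  else if (s > 0.10) "can be improved"
  else if (s > 0.05) "satisfactory"
  else "almost perfect"
}

set.seed(1)
x <- matrix(rnorm(40), ncol = 2)       # 20 points that truly live in the plane
delta <- as.matrix(dist(x))
conf <- cmdscale(delta, k = 2)         # classical MDS in two dimensions
s <- stress1(delta, conf)
print(stress_label(s))
```

Because the example points really are two-dimensional, a two-dimensional configuration reproduces their distances up to rounding error and the stress is essentially zero; real pattern distances will generally give a larger value.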



[Figure 7 panels: 15 point patterns arranged in four rows, and an MDS configuration plot with axes dim 1 and dim 2.]

Figure 7 The first two rows are realizations of a Neyman-Scott process with parent intensity of 8 and 6 children per parent uniformly distributed within a distance of 0.1. The point patterns in rows 3 and 4 are realizations of a Neyman-Scott process with parent intensity 24 and 2 children per parent within a radius of 0.1. The bottom-right panel shows MDS applied to these 15 patterns using the distance function $D_K^2$, where the black circles denote the eight patterns from the first Neyman-Scott process and the red crosses mark those of the second Neyman-Scott process.

4.3 A simulation study

Prototype point pattern building was analyzed in depth by Schoenberg and Tranbarger (2008) and Nichols et al. (2011). Here we additionally want to gain some insight into the performance of MDS with two different distance measures in problems of pattern classification. We selected the spike-time distance and the K-function-based distance (in practice we used L-functions). The reader is referred to the Appendix for the practical coding of the corresponding functions. The performance of MDS was evaluated through the stress measure.

We thus performed an intensive simulation study considering homogeneous spatial patterns in the unit square under the following scenarios: (a) three different values of the first-order intensity, namely λ = 50, λ = 100 and λ = 150; (b) three different spatial structures, namely Poisson processes, Strauss processes with γ = 0.1, and Strauss processes with γ = 0.2. For each combination of scenarios, we generated 20 point patterns, 10 coming from one scenario and the other 10 from a different one. We then applied MDS based on each of the two selected distances and obtained a stress value. Each case was repeated 100 times, yielding 100 stress values. In this way we were able to evaluate the performance of the MDS technique as a function of the number of points in the pattern (given by the intensity function) and of the spatial structure of the pattern. We report the results as histograms of the stress values in Figures 8 and 9.
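One replicate of such an experiment might look like the base-R sketch below, which covers only the Poisson versus Poisson case (λ = 50 against λ = 150). It is not the code used for the study: the K-function is estimated naively without edge correction, the L-function distance is approximated by a Riemann sum, classical MDS (cmdscale) replaces smacof, the stress is the simplified metric stress-1, and the helper names are hypothetical. Simulating Strauss patterns would require an additional package and is left out.

```r
# Homogeneous Poisson pattern on the unit square.
rpois_pattern <- function(lambda) {
  n <- rpois(1, lambda)
  cbind(runif(n), runif(n))
}

# Naive L-function estimate on a grid r (unit window, no edge correction).
L_naive <- function(p, r) {
  n <- nrow(p)
  d <- as.vector(dist(p))                        # unordered pairwise distances
  K <- sapply(r, function(ri) 2 * sum(d <= ri) / (n * (n - 1)))
  sqrt(K / pi)
}

set.seed(42)
dr <- 0.01
r <- seq(dr, 0.25, by = dr)

# 10 patterns from each of two scenarios.
patterns <- c(replicate(10, rpois_pattern(50), simplify = FALSE),
              replicate(10, rpois_pattern(150), simplify = FALSE))

# One row of L-function values per pattern, then an approximate L2 distance.
Lmat <- t(sapply(patterns, L_naive, r = r))
D <- as.matrix(dist(Lmat)) * sqrt(dr)

# Classical MDS and a metric stress-1 value for this replicate.
conf <- cmdscale(D, k = 2)
d_conf <- as.matrix(dist(conf))
low <- lower.tri(D)
stress <- sqrt(sum((D[low] - d_conf[low])^2) / sum(d_conf[low]^2))
print(stress)
```

Wrapping the last three steps in a loop over 100 replicates and collecting the stress values would give the raw material for a histogram of the kind reported here.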

[Figure 8 panels: histograms of count against stress for three cases, Poisson vs Poisson, Poisson vs Strauss with same intensity, and Poisson vs Strauss with different intensity. Legend (Combination): l=50 vs l=100; l=50 vs l=150; l=50 vs l=50, g=0.2; l=50 vs l=50, g=0.1; l=50 vs l=150, g=0.2; l=50 vs l=150, g=0.1.]

Figure 8 MDS with K-function-based distance: estimated stress values under 100 repetitions of the corresponding combination of intensity and spatial process. λ is represented by "l".

[Figure 9 panels: histograms of count against stress for three cases, Poisson vs Poisson, Poisson vs Strauss with same intensity, and Poisson vs Strauss with different intensity. Legend (Combination): l=50 vs l=100; l=50 vs l=150; l=50 vs l=50, g=0.2; l=50 vs l=50, g=0.1; l=50 vs l=150, g=0.2; l=50 vs l=150, g=0.1.]

Figure 9 MDS with spike-time distance: estimated stress values under 100 repetitions of the corresponding combination of intensity and spatial process. λ is represented by "l".

We note that the stress values obtained with the K-function-based distance are, under every scenario, less than 10%, indicating a satisfactory classification. The spike-time distance performs even better, showing an almost perfect classification with the MDS technique. In all cases, the stress values are lower (and hence the classification better) when the point patterns exhibit larger differences in intensity and when the spatial structures also differ. When the patterns are more similar, the results suggest using MDS in combination with the spike-time distance.



We also considered a small set of inhomogeneous Poisson cases based on an intensity function of the form $\lambda(x, y) = 100 \exp(a x^2 + b y^2)$ for some appropriate choices of $a \in \mathbb{R}^+$ and $b \in \mathbb{R}^+$. Table 1 shows the considered scenarios, where (C1, C3, C5) corresponded to MDS with K-function-based distance, and (C2, C4, C6) with the spike-time distance. For each case two groups of 10 patterns were simulated, and the whole procedure was repeated 100 times. Figure 10 shows the corresponding stress measure histograms.

     C1        C2        C3        C4        C5        C6
a    1.0 2.0   1.0 3.0   1.0 0.2   2.0 3.0   2.0 0.2   3.0 0.2
b    1.0 0.5   1.0 5.0   1.0 0.8   0.5 5.0   0.5 0.8   5.0 0.8

Table 1 Considered scenarios and values of a and b coefficients for the intensity function $\lambda(x, y) = 100 \exp(a x^2 + b y^2)$.

[Figure 10: stress (vertical axis, roughly 0.05 to 0.10) for each combination C1 to C6 (horizontal axis).]

Figure 10 Stress measures when using MDS with K-function-based distance (C1, C3, C5), and with the spike-time distance (C2, C4, C6) for inhomogeneous Poisson processes following the coefficients setup in Table 1.

The stress values shown in Figure 10 indicate a satisfactory performance, with a very small number of values over 10%, suggesting that MDS also does a good job in inhomogeneous cases. Again the spike-time distance seems to provide slightly better results.

4.4 Application: classification in a plant community

The data considered here were collected in Cataby, in the Mediterranean-type shrub- and heathland of the south-western area of Western Australia (Beard; 1984), where the locations of 6378 plants from 67 species of seeders and resprouters on a 22m by 22m plot have been recorded (see Armstrong (1991), Illian et al. (2009) and references therein). The sampled area, located in the southwest of Australia, is notable for its large number of species, its large number of rare flora and its high biodiversity. Lying within a mineral sand mining area, the study area was mined shortly after data collection in 1990 and will have to be re-naturated as soon as mining has ceased. Current efforts towards re-naturation in neighboring areas, however, have resulted in a very small survival rate for some species. The characteristic sandy soil in the area is extremely low in nutrients and water. Hence, individual plants may interact in various ways, inhibiting each other's growth while competing for the scarce resources. Attraction may also occur if conditions (shade, nutrients, water availability) in the micro-habitat are made more suitable for a plant by the presence of another plant. The vegetation in the area is Banksia low woodland, which is susceptible to regular natural fires occurring approximately every ten years. Plants have adapted to this through the development of different regeneration strategies, so the specific regeneration strategy will have an impact on the pattern formed by the individual plants and also on the structure of the interaction between species (see Illian et al. (2009) and references therein).

This dataset constitutes a complex multivariate spatial point pattern for a plant community with high biodiversity, and Illian et al. (2009) provided a hierarchical multivariate point process model. Here, we are more interested in classifying the point patterns with respect to their spatial structure, as an additional exercise to obtain further information on the general behavior of the spatial structures present in the community. Figure 11 shows the point patterns of the 12 most abundant and influential species (2 seeders and 10 resprouters). The patterns show different degrees of inhomogeneity and clustering. Following the approach of the previous section, we performed an MDS classification using the spike-time distance and the K-function-based distance (in practice we used L-functions).

The $D_K$ distance was computed over the support $r \in [0, 0.25]$. Noting a degree of inhomogeneity, we first estimated the first-order intensity function λ using a kernel-based method with a bandwidth of ε = 0.0001 (see Illian et al. (2008)). Then we estimated the corresponding inhomogeneous K-function using Ripley's edge correction through the function Kinhom included in the spatstat package (Baddeley and Turner; 2005). The penalties for the spike-time distance $d_{st}$ were specified as $p_a = p_d = 1$ and $p_m = (p_a + p_d)/2.10763 \approx 0.949$, where 2.10763 was the average distance separating any two points over all patterns. These two distances were organized into two dissimilarity matrices, and MDS was then applied using the smacof package.
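The rule used for the moving penalty, the sum of the addition and deletion penalties divided by an average inter-point distance, can be sketched in base R as follows. The helper name moving_penalty is hypothetical, the patterns are simulated stand-ins for the species patterns, and we assume here that the average is taken over pairs of points within each pattern and pooled across patterns; the text does not say how the average was formed.

```r
# Hypothetical helper: p_m = (p_a + p_d) / (mean inter-point distance).
# Assumption: distances are taken within each pattern and pooled over patterns.
moving_penalty <- function(patterns, p_a = 1, p_d = 1) {
  d <- unlist(lapply(patterns, function(p) as.vector(dist(p))))
  (p_a + p_d) / mean(d)
}

set.seed(3)
# Twelve simulated stand-in patterns of 30 points each on the unit square.
pats <- replicate(12, cbind(runif(30), runif(30)), simplify = FALSE)
p_m <- moving_penalty(pats)
print(p_m)
```

With this choice, moving a point over the average inter-point distance costs the same as deleting it and adding a new one, so shorter moves are preferred to a deletion followed by an addition.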

The configuration plots from MDS with $D_K$ and $d_{st}$ are shown in Figure 12, on the left and right, respectively. One clear outlier is evident when using the $D_K$ measure; it represents a pattern with very pronounced clustering. We note that MDS applied to the dissimilarities based on $D_K$ seems less robust to outliers than when using the spike-time distance. The stress values are, respectively, 0.082 and $3.1515 \times 10^{-5}$, implying that both distances perform the classification well, with the MDS based on the spike-time distance doing somewhat better. Moreover, the principal configurations show approximately identical groups of patterns: the resprouters appear in the same group (x-axis) in both cases, and the seeders fall into one group in both cases. Our classification results are in line with Illian et al. (2009), confirming that seeders show a clearly different spatial behavior from that shown by resprouters. From an ecological perspective, we also confirm existing knowledge on species' interactions.

5 Discussion and further remarks

Point pattern analysis is often based on pairwise or triplet-wise distance measures among the locations or events. This paper goes beyond this aspect and considers distances or, more generally, dissimilarity measures between any two point patterns treated as individual entities. Such measures are useful in a variety of practical problems, and we have focused here on two of them: the identification of prototypical patterns and multidimensional scaling for classification. We have outlined various measures of dissimilarity between point patterns, and have treated in detail the spike-time distance and a distance based on a squared difference of K-functions. Applications to the summary and description of collections of repeated realizations of a point pattern via prototypes or multidimensional scaling have been explored. We have included a simulation study to evaluate the performance of multidimensional scaling with these two types of selected distances, and the results confirm that MDS is useful to classify spatial patterns when using either distance, with the spike-time distance performing slightly better.

[Figure 11 panels, in order: Resprouter 1, Seeder 1, Resprouter 2, Resprouter 3; Resprouter 4, Resprouter 5, Resprouter 6, Seeder 2; Resprouter 7, Resprouter 8, Resprouter 9, Resprouter 10.]

Figure 11 Observed point patterns of the 12 most abundant and influential species (2 seeders and 10 resprouters).

We have discussed several common measures, but there are many alternatives. Some of them are briefly pointed out here.

[Figure 12 panels: two MDS configuration plots (MDS, dimension 1 against MDS, dimension 2), with points labeled by pattern number 1 to 12.]

Figure 12 MDS configurations for the seeder and resprouter patterns obtained from dissimilarity matrices based on the $D_K$ (left) and $d_{st}$ (right) distances. Each color represents a possible group. Red and black numbers represent seeder and resprouter patterns, respectively.

When estimated from point process data, the empirical product density function provides a description of the density of inter-event distances in an observed point pattern. Both the K-function and the product density function provide a global measure of the covariance structure by summing over the contributions from each event observed in the process. We might consider individual contributions to the estimated function that are analogous to the local statistics described by Anselin (1995), called local indicators of spatial association (LISA). An individual LISA product density function for a given point pattern $X$, $l_i^X(\cdot)$, should reveal the extent of the contribution of the $i$-th event to the global estimate of the product density $l^X(\cdot)$, and may provide a further description of structure in the data (e.g., determining events with similar local structure through dissimilarity measures of the individual LISA functions). A product density LISA function can be constructed in the same manner as the global estimate (Cressie and Collins; 2001). Given point patterns $X = \{x_1, x_2, \ldots, x_n\}$ and $Y = \{y_1, y_2, \ldots, y_m\}$, one may calculate the set of LISA functions for each pattern corresponding to any individual or spatial location. Denote both sets by $\{l_i^X(r), i = 1, \ldots, n\}$ and $\{l_j^Y(r), j = 1, \ldots, m\}$. Then one may derive several possibilities for distances between $X$ and $Y$, such as

$$A_L^X = \frac{1}{n(n-1)} \sum_i \sum_j d\left(l_i^X(r), l_j^X(r)\right),$$

the average of all pairwise distances between LISA functions of points in $X$. Define the same average for points in $Y$, say, $A_L^Y$. Then, a similarity or dissimilarity measure is given by $d_L(X, Y) = A_L^X - A_L^Y$. Other forms are also possible.
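The following base-R sketch shows one way such an average could be computed once the LISA functions have been evaluated on a common grid of distances. It is only illustrative: the matrices of function values are made up, the Euclidean distance between the sampled functions is assumed for d, and the helper name avg_lisa_distance is hypothetical.

```r
# Rows of `lisa` hold one LISA function per point, evaluated on a common grid of r.
# The mean over unordered pairs equals the double sum over i != j divided by n(n-1),
# because d is symmetric.
avg_lisa_distance <- function(lisa) {
  mean(as.vector(dist(lisa)))
}

# Made-up function values for two small patterns (3 and 4 points, grid of length 2).
lisa_x <- rbind(c(0, 0), c(3, 4), c(0, 0))
lisa_y <- rbind(c(1, 1), c(1, 1), c(1, 1), c(1, 1))

A_x <- avg_lisa_distance(lisa_x)   # pairwise distances are 5, 0, 5
A_y <- avg_lisa_distance(lisa_y)   # all functions identical, so the average is 0
d_L <- A_x - A_y
print(d_L)
```

Since the difference is signed, its absolute value may be more convenient when a non-negative dissimilarity is needed.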

One could alternatively compare point processes $X$ and $Y$ based on single numerical summaries of each point process. For instance, given the same parametric model for the two processes, one could obtain estimates $\theta_X$ and $\theta_Y$ of the parameter vector governing each of the two processes, given the realizations $X$ and $Y$, respectively. The dissimilarity between $X$ and $Y$ could then be defined simply as $\|\theta_X - \theta_Y\|$. A related idea is to describe the similarity between $X$ and $Y$ in terms of a numerical summary of the degree of clustering or inhomogeneity in each. An example would be a generalization to the point process context of the log rank test (Mantel and Haenszel; 1959) or Kolmogorov-Smirnov type statistic (Fleming and Harrington; 1981), where the time or duration usual in the context of survival analysis is replaced here by the distance observed.

Appendix: Spike-time distance, prototypes, and multidimensional scaling in R

We consider the packages ppMeasures and smacof for the R statistical software. These packages and the R statistical software are freely available online from the Comprehensive R Archive Network: r-project.org. The ppMeasures and smacof packages are useful in computing the spike-time distance, prototypes, and implementing multidimensional scaling. This introduction uses simulated data, and we assume the user is familiar with vectors, matrices, loops, subsetting techniques, and basic plotting in R. To install the packages using an Internet connection, use the install.packages function, and to load the packages, use the library function:

> install.packages(c('ppMeasures', 'smacof'))
> library(ppMeasures)
> library(smacof)

The spike-time distance is useful in assessing the similarity or dissimilarity of two point patterns. The stDist function in the ppMeasures package computes the spike-time distance for two patterns. Its required arguments are the two point patterns entered as matrices, where rows represent points, and a moving penalty that can be a single value or a vector where the ith entry corresponds to movement in the ith dimension.

[Figure 13 panels. Left: pattern 1 and pattern 2 plotted in dim 1 against dim 2. Center: patterns plotted by pattern index against dimension 1. Right: MDS configuration with axes D1 and D2.]

Figure 13 Left: A plot representing the optimal transformation of pattern 1 into pattern 2 for the chosen parameters. Center: A collection of 60 patterns and their prototype, shown as pattern 0. Right: Multidimensional scaling applied to the patterns shown in the center panel. These patterns were generated using three random processes, and the coloring and plotting characters denote those groups.

Additional arguments include addition and deletion penalties, and other arguments that may be modified to compute variants of the spike-time distance and other function options. Below two patterns are generated, p1 and p2, using the rgamma function to generate the location of the points. The distance is computed using a moving penalty of 10 and the default addition and deletion penalties, which are both 1. A graphic representing how the distance was computed is generated using the plot method applied to the output of stDist, which is shown in the left panel of Figure 13. Points connected by a line represent corresponding points in patterns 1 and 2, and isolated points represent added or deleted points.

> p1 <- cbind(rgamma(20, 2), rgamma(20, 1.8))
> p2 <- cbind(rgamma(20, 2), rgamma(20, 1.8))
> (hold <- stDist(p1, p2, 1))
[1] 19.35158
> plot(hold, col=c("#444444", "#BB0000"))
> legend("topright", legend=c("pattern 1", "pattern 2"),
+        pch=c(1,19), col=c("#444444", "#BB0000"))

The default colors in the plot method for the stDist output use an alpha level, which may require first opening a Cairo plotting device on a Linux or Windows operating system. The default colors may be changed using extra arguments in the plot method, as shown above.

We next build a point pattern collection and find the collection's prototype. A pattern collection is organized using the ppColl function, which has two required arguments. The first argument is a matrix where each row represents the spatial location of a point; a vector of values may be provided if there is only one spatial dimension. The second argument, as used in the code below, is a vector of keys identifying the pattern to which each point belongs. Below 60 patterns are constructed, where the first 15 come from one point process, the next 20 come from another process, and the last 25 are from a third process. We will use these patterns for both the prototype and the multidimensional scaling examples.

> groups <- rep(1:3, c(15, 20, 25))
> nPts <- rpois(60, c(25, 35, 14)[groups])
> pts <- rep(-1, sum(nPts))
> key <- rep(-1, sum(nPts))
> start <- 0
> for(i in 1:60){
+   shape <- c(1.2,1.5,1)[groups[i]]
+   idx <- start + seq_len(nPts[i])   # empty when a pattern has no points
+   pts[idx] <- rgamma(nPts[i], shape=shape)
+   key[idx] <- i
+   start <- start + nPts[i]
+ }
> patColl <- ppColl(pts, key, nMissing=sum(nPts==0))

The additional argument nMissing is provided here to ppColl in the event that one or more patterns have no points, which would otherwise be undocumented in the pattern collection.

The point pattern prototype is computed using the ppPrototype function. A pattern collection organized via ppColl is the first required argument; a moving penalty is the second. Below is code that computes the prototype of the point pattern collection, plots the collection, and plots the prototype.

> (proto <- ppPrototype(patColl, 10)) # '...' represents omitted output
    dim 1
1  0.2529
2  0.3709
...
13 1.7100
> plot(patColl, addLines=TRUE, col=groups)
> points(proto)

The resulting plot is shown in the center panel of Figure 13. The patterns from the collection are shown as rows 1-60, and the prototype is listed as pattern 0. Optional arguments in the ppPrototype function adjust the distance function used to compute the prototype (e.g. addition and deletion penalties) and include options for changing parameters in the algorithm used to compute the prototype. For a comparison of the available algorithms, see Diez (2010).

Multidimensional scaling is applied using the smacofSym function in the smacof package. This function requires a single argument: a symmetric matrix with entries representing the distances between the objects in the collection. In this instance, the objects are patterns, and the distances are computed below using the spike-time distance in a loop. Notice that only the lower triangle of the matrix is computed, and the upper triangle is filled in by adding the matrix to its own transpose.

> dists <- matrix(0, 60, 60)
> for(i in 2:60){
+   x <- pts[key == i]
+   for(j in 1:(i-1)){
+     y <- pts[key == j]
+     dists[i,j] <- stDist(x, y, 10)$dist
+   }
+ }
> dists <- dists + t(dists)
> hold <- smacofSym(dists)
> plot(hold$conf, col=groups, pch=c(1,3,19)[groups])

The last line of code plots the resulting configuration, giving each group its own color and plotting character. The right panel of Figure 13 shows the resulting plot with the locations suggested by smacof. The suggested object locations should only be analyzed relative to their neighbors; the coordinate values of a single object hold no meaning when not referenced to the other objects. Here we presented only one possible set of parameters in both the distance and multidimensional scaling operations. Changing the default parameters in stDist or smacofSym may result in important changes in the apparent clustering of the objects.
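The remark that coordinates are only meaningful relative to one another can be checked numerically. In the base-R sketch below, a hypothetical two-dimensional configuration (random numbers standing in for hold$conf) is rotated and shifted; all inter-object distances, which are the only quantities MDS tries to match, are unchanged.

```r
set.seed(7)
conf <- matrix(rnorm(20), ncol = 2)        # hypothetical configuration of 10 objects

# Rotate by 60 degrees, then translate by (5, -2).
theta <- pi / 3
rot <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
conf2 <- sweep(conf %*% rot, 2, c(5, -2), "+")

# The coordinates differ, but the inter-object distances agree.
same_dist <- isTRUE(all.equal(as.vector(dist(conf)), as.vector(dist(conf2))))
print(same_dist)
```

Two runs of an MDS routine may therefore return configurations that look different on the page yet describe the same arrangement of patterns.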

See Diez (2010) and de Leeuw and Mair (2009) for additional details on the functions and algorithms in the ppMeasures and smacof packages, respectively.

References

Anselin, L. (1995). Local indicators of spatial association-LISA, Geographical Analysis 27: 93–115.

Armstrong, P. (1991). Species Patterning in the Heath Vegetation of the Northern Sandplain, Honours thesis, University of Western Australia.

Baddeley, A. and Gill, R. D. (1997). Kaplan-Meier estimators of distance distributions for spatial point processes, Annals of Statistics 25(1): 263–292.

Baddeley, A. and Turner, R. (2005). spatstat: An R package for analyzing spatial point patterns, Journal of Statistical Software 12(6): 1–42.

Beard, J. (1984). Biogeography of the kwongan, in J. Pate and J. Beard (eds), Kwongan: plant life of the sandplains, University of Western Australia Press, pp. 1–26.

Besag, J. (1977). Contribution to the discussion of Dr Ripley’s paper, Journal of the Royal Statistical Society B 39: 193–195.

Brown, E. N., Kass, R. E. and Mitra, P. P. (2004). Multiple neural spike train data analysis: state-of-the-art and future challenges, Nature Neuroscience 7: 456–461.

Cox, D. R. and Isham, V. (1980). Point Processes, Chapman and Hall, London.

Cressie, N. A. C. (1993). Statistics for Spatial Data, revised edn, Wiley, New York.

Cressie, N. and Collins, L. B. (2001). Patterns in spatial point locations: local indicators of spatial association in a minefield with clutter, Naval Research Logistics 48: 333–347.

Daley, D. and Vere-Jones, D. (2003). An Introduction to the Theory of Point Processes: Volume I: Elementary Theory and Methods, second edn, Springer-Verlag, New York.

de Leeuw, J. and Mair, P. (2009). Multidimensional scaling using majorization: SMACOF in R, Journal of Statistical Software 31(3): 1–30.

Deza, M. M. and Deza, E. (2009). Encyclopedia of Distances, Springer-Verlag, Berlin.

Diez, D. M. (2010). Extensions of distance and prototype methods for point patterns, PhD thesis, University of California, Los Angeles.

Diez, D. M., Schoenberg, F. P. and Woody, C. D. (2012). Algorithms for computing spike time distance and point process prototypes with application to feline neuronal responses to acoustic stimuli, Journal of Neuroscience Methods 203: 186–192.

Diez, D. M., Tranbarger Freier, K. E. and Schoenberg, F. P. (2010). ppMeasures: Point pattern distances and prototypes. R package version 0.1. URL: http://CRAN.R-project.org/package=ppMeasures

Diggle, P. (2003). Statistical Analysis of Spatial Point Patterns, Edward Arnold, London.



Fleming, T. R. and Harrington, D. P. (1981). A class of hypothesis tests for one and two sample censored survival data, Communications in Statistics. Theory and Methods 10: 763–794.

Goldberg, D. H., Victor, J. D., Gardner, E. P. and Gardner, D. (2009). Spike train analysis toolkit: Enabling wider application of information-theoretic techniques to neurophysiology, Neuroinformatics 7(3): 165–178.

Illian, J. B., Møller, J. and Waagepetersen, R. P. (2009). Hierarchical spatial point process analysis for a plant community with high biodiversity, Environmental and Ecological Statistics 16(3): 389–405.

Illian, J., Penttinen, A., Stoyan, H. and Stoyan, D. (2008). Statistical Analysis and Modelling of Spatial Point Patterns, Statistics in Practice, Wiley.

Karr, A. (1991). Point Processes and Their Statistical Inference, second edn, Marcel Dekker, New York.

Kruskal, J. B. (1964). Nonmetric multidimensional scaling: A numerical method, Psychometrika 29(2): 115–129.

Kulkarni, J. E. and Paninski, L. (2007). Common-input models for multiple neural spike-train data, Network: Computation in Neural Systems 18: 375–407.

Mantel, N. and Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease, Journal of the National Cancer Institute 22: 719–748.

Møller, J. and Waagepetersen, R. P. (2004). Statistical Inference and Simulation for Spatial Point Processes, Chapman and Hall/CRC.

Nichols, K., Schoenberg, F. P., Keeley, J. E., Bray, A. and Diez, D. M. (2011). The application of prototype point processes for the summary and description of California wildfires, Journal of Time Series Analysis 32: 420–429.

Ripley, B. D. (1977). Modelling spatial patterns (with discussion), Journal of the Royal Statistical Society B 39: 172–212.

Ripley, B. D. (1981). Spatial Statistics, Wiley, New York.

Ripley, B. D. (1988). Statistical Inference for Spatial Processes, Cambridge University Press, Cambridge.

Schoenberg, F. P. (2011). Introduction to point processes, in J. J. Cochran, L. A. Cox, Jr., P. Keskinocak, J. P. Kharoufeh and J. C. Smith (eds), Wiley Encyclopedia of Operations Research and Management Science, Wiley, pp. 616–617.

Schoenberg, F. P. and Tranbarger, K. E. (2008). Description of earthquake aftershock sequences using prototype point processes, Environmetrics 19: 271–286.

Searcoid, M. O. (2006). Metric Spaces, Springer-Verlag, Berlin.

R Development Core Team (2010). R: A Language and Environment for Statistical Computing, 2.11.1 edn, R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. URL: http://www.R-project.org

Tranbarger, K. E. and Schoenberg, F. P. (2010). On the computation and application of prototype point patterns, Open Applied Informatics Journal 4: 1–9.

Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S, fourth edn, Springer.

Vere-Jones, D. (2009). Some models and procedures for space-time point processes, Environmental and Ecological Statistics 16: 173–195.

Victor, J. D. and Purpura, K. P. (1997). Metric-space analysis of spike trains: theory, algorithms and application, Network: Computation in Neural Systems 8: 127–164.

