
Directional Analysis of Stationary Point Processes

Martina Sormani

Dissertation approved by the Department of Mathematics of the Technische Universität Kaiserslautern for the award of the academic degree of Doktor der Naturwissenschaften (Doctor rerum naturalium, Dr. rer. nat.)

D386

First referee: Prof. Dr. Claudia Redenbach, Technische Universität Kaiserslautern

Second referee: Prof. Dr. Jesper Møller, Aalborg University

Date of the defense: 11.06.2019

Acknowledgements

First of all and most of all I would like to thank my supervisor Claudia Redenbach, who gave me the opportunity to do my PhD and who was a great support and guide during this time, not only regarding issues in mathematics.... I am grateful to Tuomas Rajala, who shared with me lots of code and ideas, and to Prof. Aila Särkkä, especially for her detailed and accurate text corrections. Thanks to Johannes Freitag for sharing the ice data with us, for his suggestions, and for giving me the opportunity to join the DFG project. Thanks also to his PhD student Tetsuro. I would also like to thank the image processing group of the ITWM for letting me use their software and for sharing their knowledge. In particular, thanks to Sonja, for all her help, and to Prakash. I am grateful to professors Lothar Heinrich, Jürgen Franke and Gabriele Steidl for sharing their knowledge and for taking the time to discuss with us. Thanks to Disha, who was always with me in good and bad times. Finally I want to thank my family in Italy, who have always been close to me, and... to Luis and Diego .....

Preface

This work has mainly been supported by the DFG priority programme "Antarktisforschung mit vergleichenden Untersuchungen in arktischen Eisgebieten": FR 2527/2-1, RE 3002/3-1. Partial funding by the DFG-Graduiertenkolleg 1932 and by the Center for Mathematical and Computational Modelling (CM)² in Kaiserslautern is gratefully acknowledged.

List of Symbols

    S subset of Rd

    B Borel σ-algebra induced by the Euclidean metric

    B0 bounded sets of B

    Nlf locally finite subsets of Rd

    Nlf σ-algebra on Nlf

    x point configuration on Rd

    Nx(B) number of points of x in a subset B ⊂ Rd

    X spatial point process on Rd

    XS X ∩ S

    NX(B) number of points of X in a subset B ⊂ Rd

    λ intensity of X

    W window of observation of X

    ∂W border of W

    (Ω,F ,P) probability space

    Λ(2)(·) second order intensity measure

    K(·) reduced second order moment measure

    λ(2) second order product density

    P0(·) Palm measure

    P!0(·) reduced Palm measure

    Z Fry points of X

    W ∗ window of observation of the Fry points

    λZ intensity of the Fry points

    µ,ν measures

    θ = (β, γ, r0) parameters of a Strauss process

    G nearest neighbor distance distribution function

    F empty space function

    g pair correlation function

    K Ripley's K-function

    d(·, ·) distance function

    Br(c) ball of radius r centered at c

    Sd−1 unit sphere in Rd

    kd volume of the d-dimensional unit ball

    I(·) indicator function

    T = RC linear mapping, R rotation matrix, C compression matrix

    R0 element of SOn

    SOn special orthogonal group in Rd

    det(·) determinant of a matrix

    tr(·) trace of a matrix

    AT transpose of the matrix A


Contents

List of Symbols

Introduction

1 Spatial Point Processes
  1.1 General notation
  1.2 Definitions and preliminaries
  1.3 Properties of spatial point patterns
  1.4 Poisson point process (CSR)
  1.5 Summary statistics
    1.5.1 Intensity measures
    1.5.2 Palm distributions
    1.5.3 Second order summary statistics
    1.5.4 First order summary statistics
  1.6 Strauss process
  1.7 The Metropolis Hastings algorithm
    1.7.1 The algorithm
    1.7.2 Convergence of the algorithm
    1.7.3 Simulation of locally stable point processes

2 Directional Analysis
  2.1 Settings
    2.1.1 Aims
    2.1.2 Explicative examples
  2.2 Fry points
  2.3 Integral method
    2.3.1 Estimation of R
  2.4 Projection method
    2.4.1 Estimation of R
  2.5 Ellipsoid method
    2.5.1 Estimation of R
  2.6 Estimation of C
    2.6.1 Integral method
    2.6.2 Projection method

3 Simulation Study
  3.1 Simulation study
  3.2 Estimation of R
    3.2.1 2D
      3.2.1.1 Integral Method
      3.2.1.2 Projection Method
      3.2.1.3 Ellipsoid Method
    3.2.2 3D
      3.2.2.1 Projection Method
      3.2.2.2 Ellipsoid Method
    3.2.3 Discussion
  3.3 Estimation of C
    3.3.1 2D
    3.3.2 3D
    3.3.3 Discussion

4 Directional Analysis - Additional Aspects
  4.1 Influence of noise
    4.1.1 Simulation study
  4.2 Classification algorithms
    4.2.1 Model specification
    4.2.2 MCMC method
    4.2.3 Variational Bayes algorithm
    4.2.4 Comparison of the methods
  4.3 Testing against anisotropy
    4.3.1 Power of the "Projection" test
  4.4 Visualization of the Fry points
    4.4.1 2D
    4.4.2 3D
  4.5 Limit behaviour of the geometric anisotropy transform

5 Application to Ice Data
  5.1 Description of the data
    5.1.1 Division in subsamples
  5.2 Motivation
  5.3 Directional analysis
    5.3.1 Estimation of the interaction radius
    5.3.2 Estimation of R
      5.3.2.1 Talos Dome core
      5.3.2.2 EDML core
      5.3.2.3 Renland core
    5.3.3 Estimation of C
      5.3.3.1 Talos Dome core
      5.3.3.2 EDML core
      5.3.3.3 Discussion
    5.3.4 Representation of the Fry points

Conclusions

Appendices
  A.1 Proof of unbiasedness
  A.2 Expectation of wavelet coefficients
  B.1 Academic Background
  B.2 Akademischer Werdegang

Bibliography

Introduction

In this thesis we consider as a main topic the directional analysis of a stationary point process. The interest in such an analysis has risen in the modern point process literature since, thanks to advances in technology, large and complicated point pattern data, in particular in 3D, have become more common. For such patterns, the assumptions of isotropy and stationarity cannot simply be made but need further investigation. Testing stationarity has been considered by several authors, and many non-stationary models are currently available [1, 2, 27, 39, 49]. Isotropy, on the other hand, is often still assumed without further checking, although several tools to study anisotropy have been suggested in the literature. To render those tools more easily accessible, a paper [51] collecting the existing non-parametric methods was recently published.

In this thesis we focus on and compare three non-parametric methods, which we call the Integral method, the Ellipsoid method and the Projection method. All of them are based on second-order analysis of a point process. The Ellipsoid method was introduced in [52]. The Integral method has been applied in the literature in several versions, for example in [53] or in [35], and is described here in a general context. The Projection method, to the best of our knowledge, is introduced in this thesis; a similar idea in 2D can be found in [26, page 254]. In a simulation study we apply the methods in order to find preferred directions, and we compare their performances. Testing isotropy and visualization of anisotropy, both in 2D and 3D, are also considered.

Directional methods are especially useful to detect directions in regular point patterns, since it can be difficult to visually detect anisotropy in such patterns. In contrast, in clustered patterns, the shape and directions of the clusters can already reveal some information. An example of a regular pattern where it is difficult to visually detect anisotropy is the amacrine cells data (Figure 0.0.1, left), which consists of on cells and off cells. These data have been analyzed several times under the assumptions of stationarity and isotropy, but it was recently detected by Wong and Chiu that both the marginal on and off patterns and their superposition show some sign of anisotropy.

Anisotropy can be generated by several mechanisms. In this thesis we focus on the so-called geometric anisotropy mechanism, which has been considered in the literature both for clustered point patterns, such as the Welsh chapel data (Figure 0.0.1, right) [38, 36], and for regular point processes, such as the amacrine cells [66] and air bubbles in polar ice [53, 52]. Motivated by our application to real data, we pay special attention to the regular case. As in [53, 52], we consider the 3D locations of air bubbles in glacial ice cores. For these data the aim of a directional analysis is to obtain information about the deformation of the ice sheet at different depths. This information is necessary for glaciologists in order to build dating models for the ice. A first directional analysis of the ice data can be found in [53] and [52].


Figure 0.0.1: Locations of 152 amacrine cells labelled 'on' and 142 cells labelled 'off' (left) and the Welsh chapels data (right).


Finally, we consider the influence of isotropic and stationary noise on the results of the directional analysis of a stationary point process. This study is motivated by the ice application: it has recently been discovered that ice core samples may contain noise bubbles, which form due to the relaxation of the ice after the core is taken out of the drilling hole. In this context, the classification algorithms introduced in [54] and [50] are taken into consideration. The limit behavior of the geometric anisotropy mechanism is also described.

An introduction to point process theory and to the main notation is given in Chapter 1. The three main methods, the Integral, the Ellipsoid and the Projection method, are described in Chapter 2, as well as their application in the setting of geometric anisotropy. In Chapter 3 the methods are compared via a simulation study, both in 2D and in 3D. In Chapter 4 we consider the influence of noise, the anisotropy tests and the limiting behavior of the geometric anisotropy mechanism. Finally, in Chapter 5 we apply the methods to the ice data.

Parts of this work have been published in

• C. Redenbach, A. Särkkä, M. Sormani (2015). Classification of Points in Superpositions of Strauss and Poisson Processes. Spatial Statistics, 12, 81-95.

• T. Rajala, C. Redenbach, A. Särkkä, M. Sormani (2016). Variational Bayes Approach for Classification of Points in Superpositions of Point Processes. Spatial Statistics, 15, 85-99.

• T. A. Rajala, A. Särkkä, C. Redenbach, M. Sormani (2016). Estimating geometric anisotropy in spatial point patterns. Spatial Statistics, 15, 139-155.

• T. A. Rajala, C. Redenbach, A. Särkkä, M. Sormani (2018). A review on anisotropy analysis of spatial point patterns. Spatial Statistics.


1 Spatial Point Processes

In this chapter we describe the fundamentals of the theory of spatial point processes which are necessary to introduce our work. We start by giving the formal definition of a spatial point process in Section 1.2 and by describing some important properties that a point process may have in Section 1.3. In Section 1.4 we introduce the Poisson point process, which is a fundamental model in spatial point process theory. In Section 1.5 we describe some of the possible summary statistics used to describe point patterns. Finally, in Section 1.6 we introduce the Strauss process, which will be considered throughout the thesis. The main references we used in this chapter are [26], [40], [59] and [64].

    1.1 General notation

In this section we define some general notation that will be used in the thesis. More specific notation will be introduced later. We denote by I[·] the indicator function and by IB[·] the indicator function of a set B ⊂ Rd which, given x ∈ Rd, is defined as

IB[x] := 1 if x ∈ B, and IB[x] := 0 if x ∉ B.

Given a set B ⊂ Rd, we denote its Lebesgue measure by |B|. In particular, the Lebesgue measure of the d-dimensional unit ball Br(0) with r = 1 will be denoted by kd. We write Sd−1 for the (d−1)-dimensional unit sphere and define the positive half unit sphere as

(Sd−1)+ := {x ∈ Sd−1 : xd > 0},

where xd denotes the last component of x. We denote the Minkowski sum of two sets A and B in Rd as

A ⊕ B = {a + b : a ∈ A, b ∈ B}.

The set Bx = B ⊕ {x} therefore corresponds to the translation of the set B by a point x ∈ Rd. We denote the Euclidean norm by || · || and by d(x, y) := ||x − y|| the distance between two points x, y ∈ Rd. The distance between a point x ∈ Rd and a set B ⊂ Rd is given by

d(x, B) := inf_{y ∈ B} d(x, y).

Given the space Lp(Rd, R) of Lp-Lebesgue integrable functions from Rd to R, we define the corresponding Lp-norm as || · ||Lp. We denote by det(A) the determinant of a matrix A, by tr(A) its trace and by AT its transpose. Finally, we denote the Dirac delta function by δ(·).

We now introduce the notation for three particular types of sets in Rd. We denote by S(u, ε, r) the double conical sector centered at the origin with main direction u ∈ Sd−1, opening angle ε and radius r. In 2D the set will be denoted by S(θ, ε, r), where θ is the angle that u forms with the x-axis (plot 1 of Figure 1.1.1). We denote by L(u, r, hc) the cylinder (3D) or rectangle (2D) with major-axial direction u ∈ Sd−1, height 2r and cross-section half-length hc, where 0 < hc < r. In 2D we use, as for the cone, the notation L(θ, r, hc) (plot 2 of Figure 1.1.1). Finally, we denote by E(u, r, k), where k < 1, the 2D ellipse centered at the origin with major-axial direction u ∈ Sd−1 and semi-axes of length r/k and rk. In 2D we use the notation E(θ, r, k) (plot 3 of Figure 1.1.1).

Figure 1.1.1: Illustration in 2D of the sets S(θ, ε, r), L(θ, r, hc) and E(θ, r, k) for particular choices of the parameters. [Panels: S(π/2, π/4, 1); L(π/2, 1.5, 1), with height 2r and cross-section width 2hc marked; E(π/2, 1, 0.5), with semi-axes rk and r/k marked; u denotes the main direction.]

    1.2 Definitions and preliminaries

Spatial point processes are random countable subsets of a space S. The space S is required to be a locally compact topological space with a countable base, on which a Borel σ-algebra is defined. In this thesis we usually consider S = Rd endowed with the σ-algebra B induced by the Euclidean metric. In some cases we also consider S ⊂ Rd, again endowed with the σ-algebra induced by the Euclidean metric, which is also denoted by B. We now give a formal definition of a point process on S, restricting our attention to point processes whose realizations are locally finite subsets of S. Let B0 be the set of bounded elements of B, x a countable subset of S, Nx(S) its cardinality and Nx(B) the cardinality of the point configuration x restricted to a subset B of S. We define the set Nlf of locally finite subsets of S as

Nlf = {x ⊂ S : Nx(B) < ∞ ∀B ∈ B0}

and equip it with the σ-algebra Nlf generated by the mappings x ↦ Nx(B), B ∈ B0. A spatial point process X on S is then a measurable mapping from a probability space (Ω, F, P) to (Nlf, Nlf).


The distribution of X is given by the probability measure PX on the measurable space (Nlf, Nlf) defined as

PX(F) = P(ω : X(ω) ∈ F) ∀F ∈ Nlf.

In applications, spatial point processes are used as statistical models for the analysis of observed patterns of points, called spatial point patterns or spatial point configurations, where the points represent the locations of some objects of interest. A great variety of objects can be considered, in many different contexts. Typical examples are locations of trees in a forest, locations of stars in galaxies or locations of cells in a tissue. In all these situations the data, at a basic level, simply consist of point coordinates. Since spatial point patterns present a huge variety, one of the primary aims of point process theory is to provide structural methods describing how to find a statistical model which offers a satisfactory explanation of the considered pattern. To this aim, different types of models, which may depend on different parameters, are considered and studied.

In practice, the data of a realization of a spatial point process are collected in a bounded observation window W, which affects the analysis of the data and should therefore be carefully taken into consideration.

1.3 Properties of spatial point patterns

In this section we describe some important properties that spatial point patterns (processes) may have. Given a point pattern, it is in fact useful to check whether it satisfies certain properties, in order to find a correct model for the data and, if possible, to simplify its analysis. We start by describing two properties, namely stationarity and isotropy, that will play a central role throughout the thesis. Let X be a point process on Rd.

1) Stationarity: We say that X is stationary if its distribution is invariant under translations. This means that the point process Y := X + x, where x is an arbitrary fixed point of Rd, has the same probability distribution as X for all x ∈ Rd.

2) Isotropy: We say that X is isotropic if its distribution is invariant under rotations about the origin. This means that the point process Y := R0X, where R0 ∈ SOn, has the same probability distribution as X for all R0 ∈ SOn.

Both the assumption of stationarity and that of isotropy considerably simplify the analysis of a point pattern. To check stationarity, various methods have been proposed in the literature, some of which are fairly standard to use (see for example the quadrat counting method [4, page 165], where one has to assume independence of the points). The hypothesis of isotropy, by contrast, is often confirmed only by a visual check.

In applications we usually distinguish between

1) Regular point patterns: The points show repulsion between each other and are located so as to preserve a certain distance. The repulsion may be caused by some physical limits; for example, the points could represent the centers of spheres of a certain radius r0 (Figure 1.3.1, second plot).


2) Clustered point patterns: The points show attraction between each other and form clusters in which the points lie close together. An example of a clustered pattern is the pattern of seeds spread by a group of plants, where each plant spreads seeds in its proximity (Figure 1.3.1, third plot).

3) Complete Spatial Randomness (CSR): The points do not show any type of interaction and are independently and randomly scattered in space (Figure 1.3.1, first plot). The CSR model plays a major role in spatial statistics and will be taken into consideration in Section 1.4.

In the literature, several models both for regular and for clustered patterns have been proposed. In this thesis we particularly focus on regular point patterns. An additional property of spatial point processes is

Simplicity: Realizations of X almost surely consist of pairwise distinct points, so that almost surely no two points of the process coincide. In most applications, including ours, this does not represent a constraint, since for physical reasons it is impossible for two points to be located at exactly the same place.

Figure 1.3.1: Realizations of a CSR process (first plot), a regular point process (second plot) and a clustered point process (third plot).


    1.4 Poisson point process (CSR)

    Definition 1.4.1. Let µ be a locally finite, diffuse measure on Rd. A point process X on Rd

    such that

    (i) NX(A) ∼ Poisson(µ(A)) ∀A ∈ B0,

(ii) if A1, . . . , Ak ∈ B0 are disjoint sets, NX(A1), . . . , NX(Ak) are independent random variables,

    is called Poisson point process with intensity measure µ.

The Poisson process with intensity measure µ can be defined on a subset S ⊂ Rd in an analogous way. If the measure µ has a density λ with respect to the Lebesgue measure, λ is called the intensity function. If λ is constant, we say that X is a homogeneous Poisson process. It is easy to verify that the homogeneous Poisson process is stationary and isotropic. Note that in the literature the term CSR usually refers to the homogeneous Poisson point process. The Poisson process is also used as a basis for the construction of more complicated models, and it is the most analytically tractable model. Note that in Definition 1.4.1, if X is simple, property (i) implies property (ii).

Definition 1.4.2. Let S ∈ B, let µ be a diffuse locally finite measure on Rd with µ(S) < ∞, and let n ∈ N. A point process X is a µ-binomial point process on S with n points if X := ∪_{i=1}^n {ξi}, where the ξi are independent and µ-uniformly distributed in S, so that

P(ξi ∈ A) = µ(A)/µ(S), A ∈ B, A ⊂ S.

We now consider the restriction XS of a Poisson process X with intensity measure µ to a set S such that µ(S) < ∞.

Proposition 1.4.1. The restriction XS of a Poisson process with µ(S) > 0, conditional on NX(S) = n, is a µ-binomial point process with n points on S.

From Proposition 1.4.1 we can deduce a method to simulate XS: we can generate a random number N ∼ Poisson(µ(S)) and then generate N points, µ-uniformly scattered in S. Proposition 1.4.1 also allows us to characterize the distribution Π of X defined on the measurable space (Nlf, Nlf). In fact

Π(F) = P(X ∈ F)
     = ∑_{n=0}^∞ P(NX(S) = n) P(X ∈ F | NX(S) = n)
     = ∑_{n=0}^∞ (e^{−µ(S)} µ(S)^n / n!) ∫_S ··· ∫_S I[{s1, . . . , sn} ∈ F] (dµ(s1)/µ(S)) ··· (dµ(sn)/µ(S))
     = ∑_{n=0}^∞ (e^{−µ(S)} / n!) ∫_S ··· ∫_S I[{s1, . . . , sn} ∈ F] dµ(s1) ··· dµ(sn), F ∈ Nlf. (1.4.1)

When n = 0 the integrals should be replaced by I[∅ ∈ F].
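The simulation method deduced from Proposition 1.4.1 is simple to implement. A minimal base-R sketch for the homogeneous case on a rectangle S = [0, a] × [0, b] (so that µ(S) = λab and the µ-uniform distribution is the uniform distribution; the function name rpois_window is ours, purely illustrative):

# Simulate a homogeneous Poisson process on S = [0, a] x [0, b]
# following Proposition 1.4.1: first N, then N uniform points.
rpois_window <- function(lambda, a, b) {
  N <- rpois(1, lambda * a * b)      # N ~ Poisson(mu(S)), mu(S) = lambda * |S|
  cbind(x = runif(N, 0, a),          # given N = n, the points are
        y = runif(N, 0, b))          # mu-uniformly scattered in S
}
set.seed(1)
pts <- rpois_window(lambda = 100, a = 1, b = 1)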

1.5 Summary statistics

In this section we introduce different summary statistics used to describe spatial point patterns. Summary statistics can give different kinds of information about the considered spatial pattern, and they can be used to help identify a suitable model for it. In Section 1.5.1 we introduce the so-called intensity measures. In Section 1.5.2 we introduce Palm distributions, which characterize conditional properties of spatial patterns and which are necessary to introduce the second order statistics in Section 1.5.3: Ripley's K-function and the pair correlation function g. Finally, in Section 1.5.4 we introduce three possible first order summary statistics: the empty space function F, the nearest neighbor distance distribution G and the J-function, a combination of F and G.

    1.5.1 Intensity measures

The first order moment measure Λ, also called the intensity measure, of a point process X is defined on the space (S, B) as

Λ(A) = E(NX(A)) ∀A ∈ B,

so Λ(A) represents the expected number of points of X in A. The first order moment measure can have a density λ : S → R+ with respect to the Lebesgue measure. In this case we call λ the intensity function and we can write

Λ(A) = ∫_A λ(ξ) dξ.

For a stationary point process,

∫_A λ(ξ) dξ = Λ(A) = Λ(Aν) = ∫_A λ(ξ + ν) dξ ∀ν ∈ Rd.

This implies that the intensity function is constant, λ(x) = λ, and that Λ(A) = λ|A|. In this case λ can be interpreted as the average number of points per unit volume and, given a realization x of the process in the observation window W, can be estimated by

λ̂ = Nx(W) / |W|. (1.5.1)

    In the Poisson process the intensity measure Λ coincides with µ.
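In practice, estimator (1.5.1) is a one-liner. A sketch assuming the spatstat package (its functions rpoispp, npoints, Window and area):

library(spatstat)
X <- rpoispp(lambda = 100)                   # example homogeneous Poisson pattern on [0,1]^2
lambda_hat <- npoints(X) / area(Window(X))   # estimator (1.5.1): N_x(W) / |W|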

Theorem 1.5.1. (Campbell thm.) Let X be a point process on Rd and f : Rd → R+ a non-negative measurable function. Then

E( ∑_{x∈X} f(x) ) = ∫_{Rd} f(x) Λ(dx).

In the stationary case the equation can be written as

E( ∑_{x∈X} f(x) ) = λ ∫_{Rd} f(x) dx.

    The proof of this theorem can be found e.g. in [57, page 54].
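For intuition, the stationary form of the theorem is easy to check by simulation. The following base-R sketch compares a Monte Carlo estimate of E(∑_{x∈X} f(x)) for a homogeneous Poisson process on [0,1]² with λ ∫ f(x) dx, using the test function f(x, y) = e^{−x−y}, whose integral over the unit square is (1 − e^{−1})²:

set.seed(2)
lambda <- 50
f <- function(x, y) exp(-x - y)      # a non-negative test function on [0,1]^2
sums <- replicate(2000, {
  n <- rpois(1, lambda)              # simulate a Poisson pattern on the unit square
  sum(f(runif(n), runif(n)))         # evaluate sum_{x in X} f(x)
})
mean(sums)                           # Monte Carlo estimate of E(sum f)
lambda * (1 - exp(-1))^2             # lambda * integral of f: the two should agree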

The definition of the first order moment measure can be extended to an arbitrary order n as a measure on the product space (S^n, ⊗^n B):

M(n)(A1 × ··· × An) = E(NX(A1) ··· NX(An)) = E( ∑_{ξ1,...,ξn ∈ X} I[ξ1 ∈ A1, . . . , ξn ∈ An] ), A1, . . . , An ∈ B0.

We can also define the nth order factorial moment measure Λ(n) as

Λ(n)(A1 × ··· × An) = E( ∑≠_{ξ1,...,ξn ∈ X} I[ξ1 ∈ A1, . . . , ξn ∈ An] ), A1, . . . , An ∈ B0, (1.5.2)

where ≠ indicates that the sum runs only over mutually distinct ξ1, . . . , ξn. The name of this measure is due to the fact that

Λ(n)(A × ··· × A) = E(NX(A)(NX(A) − 1) ··· (NX(A) − n + 1)).

The measure M(n)(A1 × ··· × An) represents the expected number of n-tuples that can be formed from the points of the process, taking the i-th point in Ai and permitting repetitions if the intersections between some of the Aj are non-empty, while Λ(n)(A1 × ··· × An) represents the same quantity without permitting those repetitions. The measure Λ(n) can have a density with respect to the Lebesgue measure on (S^n, ⊗^n B), which we denote by λ(n). In the case n = 2, the density λ(2) is called the second order product density. It follows directly from the definition of Λ(2) that

λ(2)(x, y) = λ(2)(y, x) ∀x, y ∈ Rd. (1.5.3)

The Campbell Theorem 1.5.1 can be generalized to the second order factorial moment measure Λ(2) as

Theorem 1.5.2. Let X be a point process on Rd and f : Rd × Rd → R+ a non-negative measurable function. Then

E( ∑≠_{x,y∈X} f(x, y) ) = ∫∫ f(x, y) Λ(2)(d(x, y)) = ∫∫ f(x, y) λ(2)(x, y) dx dy.

We now define the so-called Campbell measures, which will be useful in the next section. The first order Campbell measure on the product space (S × Nlf, B ⊗ Nlf) is defined as

C(A × F) = E(NX(A) I(X ∈ F)) = E( ∑_{ξ∈X} I[ξ ∈ A, X ∈ F] ) ∀A ∈ B0, F ∈ Nlf.

Notice that we have

C(A × Nlf) = Λ(A).

The first order reduced Campbell measure is defined as

C!(A × F) = E( ∑_{ξ∈X} I[ξ ∈ A, X\ξ ∈ F] ) ∀A ∈ B0, F ∈ Nlf.

These measures can be extended to higher orders in the obvious way.


    1.5.2 Palm distributions

The Palm distributions of a spatial point process are probability measures Pξ on (Nlf, Nlf), where ξ ∈ S. We will see that Pξ(F) can be heuristically interpreted as P(X ∈ F | NX(Bε(ξ)) > 0), where ε > 0 is arbitrarily small, so Pξ gives the conditional distribution of X given that there is a point of the process at ξ.

Formally, we define the Palm distributions in the following way. Consider F ∈ Nlf, the first moment measure Λ(·) and the measure C(·, F), where C(·, ·) is the first order Campbell measure. Directly from the definition we have that

C(·, F) ≪ Λ(·),

where ≪ means "absolutely continuous with respect to", since Λ(A) = 0 implies C(A × F) = 0. From the Radon-Nikodym theorem there exists a density dC(· × F)/dΛ : S → R such that

C(A × F) = ∫_A (dC(ξ × F)/dΛ) dΛ(ξ) ∀A ∈ B0.

It is possible to choose this density such that, fixing F, we obtain a Borel measurable function of ξ and, fixing ξ, we obtain a probability measure on (Nlf, Nlf). We call this probability measure the Palm distribution, so

Pξ(·) = dC(ξ × ·)/dΛ.

We now show heuristically that the Palm distribution can be interpreted as the conditional distribution of X given that there is an event at ξ. Indeed, for ε small enough, if we define A := Bε(ξ), we can assume that A contains at most one point of X. With this assumption we have

C(A × F) ≈ E(I(X ∈ F, NX(A) > 0)) = P(X ∈ F, NX(A) > 0)

and

C(A × F) ≈ Pξ(F) Λ(A) ≈ Pξ(F) P(NX(A) > 0),

so

Pξ(F) ≈ P(X ∈ F, NX(A) > 0) / P(NX(A) > 0) = P(X ∈ F | NX(A) > 0).

In an analogous way, using the reduced Campbell measure, we can define the reduced Palm distribution P!ξ(·). Heuristically, P!ξ(·) can be interpreted as the probability distribution of X\ξ given that X has an event at ξ. From the definition of the reduced Palm distribution, using standard techniques in measure theory, the following formula, known as the Campbell-Mecke theorem, can be proved:

E( ∑_{ξ∈X} h(ξ, X\{ξ}) ) = ∫∫ h(ξ, x) dP!ξ(x) dΛ(ξ) (1.5.4)

for non-negative measurable functions h.

Consider now the case where X is stationary. Since the characteristics of the process are the same throughout space, it should not matter which point ξ is fixed when looking at the Palm measure. In fact it can be proved (see [40]) that, if we define

P!0(F) := E( ∑_{ξ∈X∩A} I(X\{ξ} ∈ Fξ) ) / (λ|A|), F ∈ Nlf, A ∈ B0, (1.5.5)

then P!0(F) does not depend on A and

P!ξ(F) = P!0(F(−ξ)), F ∈ Nlf.

In the stationary case we can therefore restrict our attention to P!0, which can also be interpreted as the distribution of the remaining points of X given a "typical point" of X. The Campbell-Mecke theorem, in the stationary case, can be rewritten as

E( ∑_{ξ∈X} h(ξ, X\{ξ}) ) = λ ∫∫ h(ξ, x + ξ) dP!0(x) dξ. (1.5.6)

Consider now a Poisson point process. We can expect that the distribution of the process does not change if we know the position of one point of the process, since the scattering of the points is completely random and does not depend on the other positions.

Theorem 1.5.3. (Slivnyak thm.) Let X be a Poisson process on Rd with intensity measure µ. Then PX = P!ξ for almost all ξ ∈ Rd.

    For a proof of this theorem see [57, Thm 3.3.5, Notes 3.3.3].

    1.5.3 Second order summary statistics

Second order summary statistics, although they do not fully characterize a point process, are believed to capture important statistical properties and therefore constitute a widely used tool for the analysis of point patterns. They are based on the second order factorial moment measure Λ(2), which was defined in Equation (1.5.2). In this section we assume that X is stationary and that the product density λ(2) exists. In this case it can be proved that

λ(2)(x, y) = λ(2)(0, y − x) =: λ(2)(z), z = y − x,

and therefore

Λ(2)(A × B) = ∫_A ∫_B λ(2)(0, y − x) dy dx = ∫_A ∫_{B(−x)} λ(2)(z) dz dx. (1.5.7)

We now define the reduced second-order moment measure K by

λ² K(B) := ∫_B λ(2)(z) dz, B ∈ B. (1.5.8)

From the definitions of K and Λ(2) it follows that

Λ(2)(A × B) = λ² ∫_A K(B(−x)) dx, (1.5.9)

and from the Campbell-Mecke formula (1.5.6) we have

Λ(2)(A × B) = λ ∫_A E!0(NX(B(−x))) dx. (1.5.10)

These two equations lead to

λ K(B) = E!0(NX(B)). (1.5.11)

The quantity λK(B) can therefore be interpreted as the expected number of points in B, excluding the origin, conditioned on 0 belonging to X. When observing X on the whole of Rd, an unbiased estimator of K(B) is given by

λ̂²K(B) := ∑≠_{x∈X∩A, y∈X} I[y − x ∈ B] / |A|, A ∈ B0. (1.5.12)

Unbiasedness follows from Theorem 1.5.2. When observing X in a finite window W we need to deal with edge effects, since smaller distances between points are more likely to be observed than larger ones. In this case an unbiased estimator is given by

λ̂²K(B) = ∑≠_{x,y∈X∩W} I[y − x ∈ B] / |Wx ∩ Wy|, (1.5.13)

where the weights 1/|Wx ∩ Wy| are called translation edge correction weights and were introduced by Ohser and Stoyan in [44].

When choosing B as the ball centered at the origin with radius r, the K-measure coincides, as a function of r, with Ripley's K-function, which is widely used in practice, so that K(r) = K(Br). Note that Ripley's K-function, due to the shape of B, assumes both stationarity and isotropy. In Chapter 2, Section 2.3 we will discuss directional versions of the K-function that take anisotropy into account. For a homogeneous Poisson process Ripley's K-function assumes the values

K(r) = kd r^d.

For clustered processes we expect K(r) ≥ kd r^d for small r, and for regular point processes we expect K(r) ≤ kd r^d for small r.

The cumulative nature of the K-measure can make it hard to interpret and can sometimes obscure details. This is why its derivative is sometimes considered. Rewriting the K-measure as

K(B) = λ^{−2} ∫_B λ(2)(z) dz =: ∫_B g(z) dz,

the integrand

g(z) = λ(2)(z) / λ²

is called the pair correlation function. The pair correlation function is more practical than the product density λ(2) since it is independent of the intensity. By the definition of density we can interpret λ(2)(z) dx dy as the probability of having two points in two infinitesimal volumes dx and dy with difference vector z, while λ dx can be interpreted as the probability of having one point in an infinitesimal volume dx. If the two events of having one point in dx and one point in dy are independent, as in the homogeneous Poisson process, we have g ≡ 1. Values g(z) > 1 for small ||z|| are typical in the case of clustering, while values g(z) < 1 indicate repulsion between the points and are typical for regular patterns.
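In practice these second order statistics are estimated with standard software. A brief sketch assuming the spatstat package (its Kest with correction = "translate" implements the translation weights of (1.5.13), and pcf returns a kernel estimate of g):

library(spatstat)
X <- rpoispp(lambda = 100)                 # an example CSR pattern on [0,1]^2
K <- Kest(X, correction = "translate")     # K-function with translation edge correction
g <- pcf(X)                                # kernel estimate of the pair correlation function
plot(K); plot(g)                           # for CSR: K(r) ~ pi r^2 and g ~ 1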

    1.5.4 First order summary statistics

In this section we briefly introduce some first order summary statistics for stationary point processes. The nearest neighbor distance distribution function G is defined as

G(r) = P0(d(0, X\{0}) ≤ r), r > 0.

G(r) is the probability that there is at least one point of the process within distance r of 0, where 0 is a point belonging to the process. The empty space function F is analogous to the G-function, with the only difference that the Palm distribution is replaced by PX:

F(r) = PX(d(0, X) ≤ r), r > 0,

where in this case 0 almost surely does not belong to the process. F(r) is then the probability of finding, from a generic point of S, at least one event of the process within distance r of this point. Therefore F is the distribution function of the distance between an arbitrary point of S and the nearest point of the process, while G is the distribution function of the distance between the typical point of the process and its nearest neighbor. The J-function is defined as

J(r) = (1 − G(r)) / (1 − F(r)) = P!0(NX(B(0, r)) = 0) / PX(NX(B(0, r)) = 0), ∀r > 0 with F(r) < 1,

where 0 is the typical point of the process. Intuitively, if J(r) takes values smaller than 1, the probability of having an empty space larger than r between points of the process is less than the probability of having the same empty space between a generic point and a point of the process, which is typical in clustered patterns. If instead J(r) is larger than 1, we can expect a more regular pattern. These heuristic observations are confirmed by the fact that for a Poisson process J(r) = 1, as a consequence of Theorem 1.5.3.
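These functions also have ready-made estimators with edge corrections. A minimal sketch, again assuming the spatstat package (functions Gest, Fest and Jest):

library(spatstat)
X <- rpoispp(lambda = 100)
Ghat <- Gest(X)     # nearest neighbor distance distribution function G
Fhat <- Fest(X)     # empty space function F
Jhat <- Jest(X)     # J-function: J = (1 - G) / (1 - F)
plot(Jhat)          # for a Poisson process, J(r) = 1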

    1.6 Strauss process

In this section we first introduce the class of processes that have a density with respect to a Poisson process with intensity measure µ defined on a set S ⊂ Rd with µ(S) < ∞.


Definition 1.6.2. A non-negative measurable function h on Nlf is locally stable if

∃K > 0 such that ∀x ∈ Nlf, ∀ξ ∈ S\x : h(x ∪ {ξ}) ≤ K h(x),

and Ruelle stable if

∃K > 0, c > 0 such that ∀x ∈ Nlf : h(x) ≤ c K^{Nx(S)}.

    Local stability implies Ruelle stability which implies integrability of h [59].

Definition 1.6.3. We call processes that have a locally stable density with respect to Π locally stable point processes.

Definition 1.6.4. Given a point process X that has density p(·) with respect to Π, we define the Papangelou conditional intensity of X as

λ(x, ξ) = p(x ∪ {ξ}) / p(x), x ∈ Nlf, ξ ∈ S\x,

taking λ(x, ξ) = 0 if p(x) = 0.

    Notice that

• The local stability condition implies the existence of an upper bound for the Papangelou conditional intensity.

• The Papangelou conditional intensity does not depend on the normalizing constant of the density p(·), which is unknown in most cases.

• Heuristically, the Papangelou conditional intensity λ(x, ξ) of a process X can be interpreted via

λ(x, ξ) dξ = P(NX(dξ) = 1 | X ∩ (dξ)^c = x ∩ (dξ)^c),

i.e., as the probability of finding a point in an infinitesimal region dξ around ξ, given that the point process agrees with the configuration x outside dξ.

Definition 1.6.5. Suppose we have a point process X with Papangelou conditional intensity λ(x, ξ). We say that X is attractive if

λ(x, ξ) ≤ λ(y, ξ) ∀x ⊆ y ∈ Nlf,

and repulsive if

λ(x, ξ) ≥ λ(y, ξ) ∀x ⊆ y ∈ Nlf.

Intuitively, attractivity means that the chance that ξ ∈ X, given that X\ξ = x, is an increasing function of x, while repulsivity means the opposite.

    We now give the definition of the Strauss process.


Definition 1.6.6. We say that a point process X is a Strauss process with parameters θ = (β, γ, r0), where β > 0, 0 ≤ γ ≤ 1, r0 > 0, if X has density

pθ(x) = (1/Z̃θ) β^{Nx(S)} γ^{sr0(x)} (1.6.1)

with respect to the measure Π induced by a homogeneous Poisson process on S with intensity 1, where Z̃θ is the unknown normalizing constant and

sr0(x) = ∑_{{ξ1,ξ2}⊆x : ξ1≠ξ2} I[d(ξ1, ξ2) ≤ r0]

is the number of pairs of distinct points in the configuration x that are within distance r0 of each other.

    Proposition 1.6.1. The Strauss process is a locally stable, repulsive point process.

Proof. The Papangelou conditional intensity of a Strauss process is equal to

λ(x, ξ) = β^{N_{x∪{ξ}}(S) − Nx(S)} γ^{sr0(x∪{ξ}) − sr0(x)} = β γ^{tr0(x,ξ)}, ξ ∉ x,

where we have denoted

tr0(x, ξ) = sr0(x ∪ {ξ}) − sr0(x),

which is the number of points of the configuration x within distance r0 of ξ. Local stability follows from the fact that

γ^{tr0(x,ξ)} ≤ 1, since 0 ≤ γ ≤ 1 and tr0(x, ξ) ≥ 0,

and repulsivity from the fact that

tr0(x, ξ) ≤ tr0(y, ξ) if x ⊆ y.
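To make the quantities in the proof concrete, the following plain-R sketch (with hypothetical helper names, not from the thesis) computes tr0(x, ξ) and the Papangelou conditional intensity β γ^{tr0(x,ξ)} for a 2D configuration stored as an n × 2 matrix:

# Number of points of the configuration x within distance r0 of xi
t_r0 <- function(x, xi, r0) {
  sum(sqrt((x[, 1] - xi[1])^2 + (x[, 2] - xi[2])^2) <= r0)
}
# Papangelou conditional intensity of the Strauss process:
# lambda(x, xi) = beta * gamma^t_r0(x, xi)   (note: in R, 0^0 = 1, so gamma = 0 works)
papangelou_strauss <- function(x, xi, beta, gamma, r0) {
  beta * gamma^t_r0(x, xi, r0)
}
x <- matrix(runif(40), ncol = 2)                  # a toy configuration of 20 points
papangelou_strauss(x, xi = c(0.5, 0.5), beta = 200, gamma = 0.3, r0 = 0.06)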

The normalizing constant Z̃θ is not explicitly known, and its estimation, if needed, is not straightforward. We mention here a possible approximation used by Cressie and Lawson in [10], based on a Poisson approximation (see [55]):

1/Z̃θ ≈ e^{|W|(β−1)} exp( (β²|W|² / (2|W|)) kd r0^d (γ − 1) ). (1.6.2)

In Definition 1.6.6, β is called the intensity parameter, γ the interaction parameter and r0 the interaction radius. Realizations of the Strauss process have different characteristics depending on the values of these parameters (Figure 1.6.1). Typically, if γ is close to 0, the realizations look more regular than when γ is close to 1. Therefore the parameter γ will also be called the regularity parameter. Consider the extreme cases. If γ = 0, since the density is non-zero only if sr0(x) = 0, we obtain the so-called hardcore process, in which pairs of points at distance less than r0 are prohibited. In the case γ = 1, we instead obtain a Poisson process, which allows arbitrarily close points. Decreasing γ is not the only way to obtain a more regular pattern; another way is to increase r0 while fixing the other parameters. This highlights that the parameters r0 and γ are strongly related to each other: from a pattern alone it is not easy to tell whether, for example, r0 is large or γ is small. This type of correlation can cause problems when the parameters of a Strauss process have to be estimated.

The parameter β is related to the intensity λ of the process. Note that λ cannot be computed explicitly, even if the values of the parameters β, γ and r0 are known. A possible approximation of λ, given the parameters of the process, was introduced by Baddeley and Nair in [3] and is given by

λ̂ = W0(βΓ) / Γ,

where W0 is the principal branch of Lambert's W function (see [9]) and Γ = −kd r0^d log γ.
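Numerically, this approximation is a one-liner once an implementation of W0 is available. A sketch using, for example, the lambertW0 function of the CRAN package lamW (an assumption on our part; any Lambert W implementation would do), with d = 2 so that kd = π:

# Intensity approximation lambda_hat = W0(beta * Gamma) / Gamma,
# with Gamma = -k_d * r0^d * log(gamma) as in the text (here d = 2, k_2 = pi)
library(lamW)                                # provides lambertW0()
beta <- 200; gamma <- 0.3; r0 <- 0.06
Gamma <- -pi * r0^2 * log(gamma)
lambda_hat <- lambertW0(beta * Gamma) / Gamma
lambda_hat                                   # noticeably below beta, reflecting inhibition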

In Figure 1.6.1 we show some realizations of the Strauss process in the observation window [0, 1] × [0, 1] with different values of the parameters r0 and γ, while the parameter β is fixed to 200. The two rows correspond to two different values of r0, and in every row we consider three increasing values of γ.

The Strauss process is a pairwise interaction point process and in particular a Gibbs or Markov point process ([40]). Using the properties of Markov point processes, the definition of the Strauss process on a finite set S can be extended to Rd, e.g. by using the local specification characterization as in [40, page 95]. Such an extension is a stationary point process on Rd (the Poisson process with intensity 1 is stationary, and tr0 is invariant under translations and rotations).


Figure 1.6.1: Simulations of Strauss processes with different values of the parameters γ and r. In the first row r = 0.02, in the second r = 0.06. In the first column γ = 0, in the second γ = 0.3 and in the last column γ = 0.6.

The Strauss process on a finite set S can be simulated, for example, by using the Metropolis Hastings algorithm as described in Section 1.7.3.

    1.7 The Metropolis Hastings algorithm

In this section we briefly introduce the Metropolis Hastings algorithm and show how to apply it to simulate locally stable point processes on a set S ⊂ Rd with |S| < ∞. In the following, (Yn)n≥0 denotes a discrete-time Markov chain on a state space Y with σ-algebra Y, and || · ||v denotes the total variation norm.

Definition 1.7.2. We say that the chain Yn converges in equilibrium to a measure π on (Y, Y) as n → ∞ if

lim_{n→∞} ||P^n(x, ·) − π(·)||v = 0 for π-a.a. x ∈ Y,

where P^n : Y × Y → [0, 1] is the n-step transition probability, which satisfies

P(Yn ∈ A | Y0 = x) = P^n(x, A), A ∈ Y, x ∈ Y.

The Metropolis Hastings algorithm is an MCMC (Markov chain Monte Carlo) method whose aim is to obtain a sample from a distribution with density π with respect to a measure µ defined on a measure space (Y, Y). This algorithm is usually needed when π is known only up to a normalizing constant, so that direct sampling is not available. The basic idea of the method is to simulate, for a sufficiently long time, a discrete-time Markov chain with state space Y that has equilibrium density π.

    1.7.1 The algorithm

The algorithm consists in building the following discrete-time Markov chain. Suppose that at the n-th iteration the chain is in state x. The (n+1)-th step is built by

• proposing a new state y using a density q(y, x) (with respect to µ),

• accepting or rejecting y as the state of the (n+1)-th iteration using the acceptance probability

α(y, x) = min(H(y, x), 1) if π(x)q(y, x) > 0, and α(y, x) = 1 if π(x)q(y, x) = 0,

where H(y, x) is called the Hastings ratio and is given by

H(y, x) = π(y)q(x, y) / (π(x)q(y, x)).

Note that H(y, x) depends on π only through ratios, so in order to apply this algorithm it is not necessary to know the normalizing constant of π.
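Since the normalizing constant cancels in H, the algorithm fits in a few lines of code. The following minimal R sketch uses a hypothetical 1D target π(x) ∝ exp(−x⁴/4) and a symmetric Gaussian random-walk proposal, so that q(y, x) = q(x, y) cancels in the Hastings ratio:

set.seed(1)
log_pi <- function(x) -x^4 / 4        # log of the unnormalized target density
n_iter <- 10000
chain  <- numeric(n_iter)
x      <- 0                           # initial state
for (n in 1:n_iter) {
  y <- x + rnorm(1, sd = 1)           # propose y ~ N(x, 1); symmetric proposal
  H <- exp(log_pi(y) - log_pi(x))     # Hastings ratio for symmetric q
  if (runif(1) < min(H, 1)) x <- y    # accept with probability alpha(y, x)
  chain[n] <- x
}
# after a burn-in, chain is an approximately pi-distributed (dependent) sample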

    1.7.2 Convergence of the algorithm

It can be proved that the Metropolis Hastings algorithm converges in equilibrium to the density π, provided the proposal density q(·, ·) is chosen such that the constructed Markov chain is aperiodic and irreducible [59]. A good proposal density q

• is easy to implement in practice,

• has a high acceptance rate,

• provides good mixing of the chain, so that the whole range of states is visited "often" and not only a part of it,

• guarantees no cyclic behavior of the chain.

Notice that not only the convergence of the algorithm but also its rate of convergence depends on the choice of q; for example, a high rejection rate can make convergence slower. The Metropolis Hastings algorithm gives us a way of sampling from a density π by running a chain for a suitable number of iterations, such that the chain has reached equilibrium. We have, however, to consider that

(i) If we want a multi-dimensional sample, the sample we obtain by running a single chain is not independent.

(ii) The density of the sample is only asymptotically equal to π.

(iii) We do not know the rate of convergence, i.e., for how many iterations the chain should be run before it approximately reaches equilibrium.

Regarding the first point, one could run multiple independent chains, although this leads to a high computational cost. Another possibility is to thin the chain and keep its values only every k-th iteration, obtaining an approximately independent sample. Regarding the third point, since theoretical results are in general difficult to apply, in practice methods such as the ones introduced by Raftery and Lewis in [48] are used. These methods first run the algorithm to obtain one or more pilot samples; the number of iterations is then determined by applying convergence diagnostics to the pilot samples. To avoid the problem of the third point altogether, one can also use an alternative to the Metropolis Hastings algorithm called dominated coupling from the past (DCFTP) [40]. Once it has converged, DCFTP yields an exact simulation of π. It can, however, happen that the algorithm takes a long time to converge.

    1.7.3 Simulation of locally stable point processes

The Metropolis Hastings algorithm can be used to simulate locally stable processes, which have a density p with respect to Π, where p is usually known only up to a normalizing constant. In this case the state space is (Y, Y) = (Nlf, Nlf). It is also possible to use the Metropolis algorithm to simulate from the conditional (on having n points) versions of those densities. Let us first consider the unconditional case. The proposal distribution q can be chosen as follows:

• propose a birth with probability q(x), where the new point u ∈ S is sampled from a density b(x, u) with respect to µ;

• propose the death of an existing point with probability 1 − q(x), where the point ξ ∈ x to delete is sampled from a density d(x, ξ) on the point configuration x.

With this choice of q, the Hastings ratios, to be plugged into α = min(H, 1), are

H(x ∪ {u}, x) = p(x ∪ {u})(1 − q(x ∪ {u})) d(x ∪ {u}, u) / (p(x) q(x) b(x, u)), x ∈ Nlf, u ∈ S,

for a proposed birth and

H(x\{u}, x) = p(x\{u}) q(x\{u}) b(x\{u}, u) / (p(x)(1 − q(x)) d(x, u)), x ∈ Nlf, u ∈ x,

for a proposed death.


Usually all densities are taken to be uniform, so that

q(x) = 1/2, b(x, u) = 1/µ(S), d(x, ·) = 1/Nx(S).

It can be proved that, under some conditions on b, d and q, which are fulfilled by the previous choices, the algorithm converges to a distribution with the specified density p. For the conditional case, when we fix the total number of points to n, the algorithm starts with a point pattern having n points, and at each iteration it proposes to replace an old point with a newly proposed point; for details see [40, page 108]. Two other possible ways to simulate locally stable processes are spatial birth and death processes [26] and dominated coupling from the past [40] (exact simulation). An exact simulation of the Strauss process in 2D can be obtained by using the function rStrauss of the R package spatstat. Both 2D and 3D simulations of the Strauss process using the Metropolis Hastings algorithm can be obtained by using the function rstrauss from the R package rstrauss at https://github.com/antiphon/rstrauss.
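For concreteness, a short sketch of both routes using spatstat (the functions rStrauss, rmhmodel, rmhcontrol and rmh are spatstat's; the parameter values are arbitrary):

library(spatstat)
# exact simulation via dominated coupling from the past
X_exact <- rStrauss(beta = 200, gamma = 0.3, R = 0.06, W = square(1))
# Metropolis Hastings (birth-death) simulation via rmh
model <- rmhmodel(cif = "strauss",
                  par = list(beta = 200, gamma = 0.3, r = 0.06),
                  w = square(1))
X_mh <- rmh(model, control = rmhcontrol(nrep = 1e5))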


2 Directional Analysis

In this chapter we describe different methods for the directional analysis of a stationary point process; to exemplify them, they are applied to two simulated data sets, one regular and one clustered. Although the directional methods are introduced in the general case of a stationary point process X defined on Rd, special attention is given to their application to regular patterns subjected to a particular type of anisotropy mechanism, called geometric anisotropy, which is described in Section 2.1. In Sections 2.3, 2.4 and 2.5 we describe the directional methods. In Section 2.2 we introduce the so-called Fry points, which will be important throughout the chapter.

    2.1 Settings

Let X be a simple stationary point process on Rd with intensity λ and second order product density λ(2). Since we assume that X has no duplicate points, λ(2)(x, x), x ∈ Rd, is not well defined and is set equal to 0. We moreover assume that X is observed in a compact window W ⊂ Rd.

We now describe in detail, and introduce notation for, a particular type of anisotropy mechanism, which has been called geometric anisotropy in [36]. Let X0 be a stationary and isotropic point process and define the point process

X = TX0 = {Tx : x ∈ X0}, (2.1.1)

where T : Rd → Rd is an invertible linear mapping, which corresponds to a d × d matrix also denoted by T. We assume here that det(T) > 0. If det(T) = 1, the transformation T is called volume preserving. T can be decomposed by using the singular value decomposition

T = R1 C R2,

where R1 and R2 correspond to rotations and C is a diagonal matrix with strictly positive entries. Since X0 is isotropic we have that

TX0 = R1CR2X0 ∼ R1CX0.

Therefore it is sufficient to consider a linear mapping T of the form

T = RC. (2.1.2)

The matrix C "rescales" X0 along the coordinate axes, whereas the matrix R rotates the deformed process CX0. The axes obtained by rotating the coordinate axes by R are called the deformation axes of T.
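As a quick illustration, the decomposition T = R1CR2 can be computed numerically with a singular value decomposition. A sketch in R for a hypothetical 2D map Tmat (note that svd() returns orthogonal factors, i.e. rotations possibly combined with reflections):

Tmat <- matrix(c(0.5, 0.2,
                 0.1, 2.1), nrow = 2, byrow = TRUE)  # a hypothetical invertible map
s  <- svd(Tmat)                    # Tmat = u %*% diag(d) %*% t(v)
R1 <- s$u                          # orthogonal factor (rotation up to a reflection)
C  <- diag(s$d)                    # diagonal matrix with strictly positive entries
R2 <- t(s$v)                       # orthogonal factor (rotation up to a reflection)
max(abs(Tmat - R1 %*% C %*% R2))   # ~ 0, confirming the decomposition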


The point process X obtained after the transformation is a stationary point process with intensity λX = det(T−1) λX0. If the matrix C is not a multiple of the identity matrix, X can be anisotropic. Note that if X0 is a stationary Poisson process, X remains a stationary Poisson process, only with a different intensity.

Geometric anisotropy has already been considered in the literature with X0 clustered or regular, both in the 2D and in the 3D case, for real and simulated data. Regarding simulated data, the clustered case has been considered in 2D in [36] with log-Gaussian Cox processes and shot noise Cox processes, in [22], and in [66] with anisotropic Thomas processes. The regular case has been considered in [53] with Matérn hard core processes (in 3D), in [66] with Gibbs hardcore processes (in 2D) and in [52] with Strauss processes (both in 2D and in 3D). In this thesis we focus on the regular case in both 2D and 3D. As in [52], in our simulation study in Chapter 3 we consider realizations of Strauss processes.

Let now X be a point process on R2 or R3 generated by the geometric anisotropy mechanism. Motivated by our application (Chapter 5), we assume T volume preserving. In 2D the scaling matrix C assumes the form (since det(T) = 1)

C = diag(c, 1/c). (2.1.3)

We assume that the strength of compression satisfies 0 < c ≤ 1. In 3D the scaling matrix C assumes the form

C = diag( c2/√c1, c1, 1/(√c1 c2) ), (2.1.4)

where we assume that 0 < c1 ≤ c2/√c1, so that c2 ≥ c1√c1. We call c1 the strength of main compression and c2 the strength of additional compression. If c2 = 1 we have only one axis of compression; the other two deformation axes are elongated with equal strengths. If c1 = c2/√c1 we have one axis of elongation and two axes of compression which are deformed with equal strengths. In both cases T is a spheroidal transform.

Let us now consider 0 < c < 1 in 2D and 0 < c1 < c2/√c1 in 3D. Given our (non-restrictive) assumptions on the order of the diagonal elements of C, in 2D the process is compressed along the image (under the rotation R) of the x-axis and dilated along the image of the y-axis. In 3D the process is compressed along the images of the y and x axes and dilated along the image of z. Since the compression along the image of y is stronger than the compression along x, we say that the image of y is the axis of main compression and the image of x is the axis of additional compression. In 2D the deformation axes can simply be represented by the angle θ̄ ∈ [0, π] that the axis of compression forms with the x-axis (counterclockwise). From now on we will call θ̄ the direction of compression. The matrix R can be expressed as

    R =

    (cos(θ̄) − sin(θ̄)sin(θ̄) cos(θ̄)

    ). (2.1.5)
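As a small illustration, the following R sketch (the helper name make_T is ours, purely for exposition) builds the 2D transformation T = RC from (2.1.2), (2.1.3) and (2.1.5) and checks that it is volume preserving.

```r
# A minimal sketch: build the 2D volume-preserving transformation T = R C
# from the strength of compression c and the direction of compression theta.
make_T <- function(c, theta) {
  C <- diag(c(c, 1 / c))                  # compression along x, dilation along y
  R <- matrix(c(cos(theta), sin(theta),
                -sin(theta), cos(theta)),
              nrow = 2, ncol = 2)         # counterclockwise rotation by theta
  R %*% C
}

Tmat <- make_T(c = 0.5, theta = pi / 4)
det(Tmat)  # equals 1, so T is volume preserving and the intensity is unchanged
```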

In 3D we denote the axes of deformation (in order: the axis of elongation, the axis of additional compression and the axis of main compression) by ū1, ū2, ū3. The same notation will be used to denote the directions of the deformation axes with nonnegative z values, which belong to (S2)+. We call these directions the directions of deformation. In the d-dimensional case we extend the notation in the obvious way.

Besides geometric anisotropy, other anisotropy mechanisms could have been taken into consideration. An example of anisotropic stationary point processes not generated by geometric anisotropy are Poisson processes (or, in general, stationary processes) with increased intensity along directed lines (see, for example, the Poisson line cluster point process (PLCPP) model in [35] and the models in [56]). These processes can be considered stationary if the distribution of the lines is stationary.

    2.1.1 Aims

    Given the assumption of geometric anisotropy, our specific aims are

• Estimate the rotation R, and thus the axes of deformation.

• Estimate the matrix C, that is, the strength c in 2D and the strengths c1 and c2 in 3D.

In Sections 2.3.1, 2.4.1 and 2.5.1 we consider the estimation of R, while in Section 2.6 we consider the estimation of C.

    2.1.2 Explicative examples

In this section we show two realizations of 2D point processes, one regular and one clustered, which we will use to illustrate the basic ideas and the typical results of the considered directional methods. Both examples are constructed using the geometric anisotropy mechanism. For the regular case we chose X0 as a Strauss process with a fixed number of points n = 300 and parameters γ = 0, r0 = 0.04. For the clustered case we chose X0 as a Matérn cluster process with cluster radius 0.03, intensity 10 for the Poisson process that determines the cluster centers, and an average of 40 points per cluster. For the simulation we used the function rMatClust of the R package spatstat. In both the clustered and the regular case we fixed R as the identity matrix, applying no rotation to X0, and we set the strength of compression to c = 0.5. For details on how realizations of these processes can be obtained see Section 3.1. The realization of the regular case is shown in the left plot of Figure 2.1.1 and the realization of the clustered case in the right plot.
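The following sketch indicates how the clustered example can be generated with spatstat; the exact simulation procedure is described in Section 3.1, so this is only a minimal sketch assuming R = I and c = 0.5. The regular case can be produced analogously by replacing rMatClust with a Strauss simulation (e.g. via spatstat's Metropolis–Hastings sampler rmh).

```r
library(spatstat)

# A minimal sketch of the clustered explicative example (R = I, c = 0.5).
set.seed(42)
X0 <- rMatClust(kappa = 10, scale = 0.03, mu = 40)  # isotropic Matérn cluster process
Cmat <- diag(c(0.5, 1 / 0.5))                       # compression along x, dilation along y
X <- affine(X0, mat = Cmat)                         # anisotropic pattern on [0, 0.5] x [0, 2]
plot(X)
# In practice one may simulate X0 on a larger region and crop the transformed
# pattern back to a square window to avoid boundary artifacts.
```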


Figure 2.1.1: Two explicative example patterns. On the left the regular case and on the right the clustered case.

In the clustered pattern, the axes of dilation and compression are visually detectable by looking at the shape of the clusters, which are elongated along the axis of elongation x and compressed along the axis of compression y. In the regular pattern, the compression and dilation axes are not as clearly visible.

    2.2 Fry points

In this section we introduce the so-called Fry points, which will be considered in all the following sections. Fry points were first introduced by Fry in [20]. We define the Fry points of a stationary point process X as

ZA := {y − x, x ≠ y, x ∈ A, y ∈ X}, A ∈ B0. (2.2.1)

In Equation (2.2.1) we need to restrict to x ∈ A ∈ B0 since, if all points of X on Rd were considered, ZA would not be locally finite. In practice, when observing X in a finite observation window W, we can only observe the pairwise difference vectors

ZW := Z := {y − x, x ≠ y, x, y ∈ XW } (2.2.2)

which we also call Fry points. The set Z is symmetric with respect to the origin, since y − x and −(y − x) = x − y both belong to Z, and it is affected by edge effects. We denote the observation window of Z, which depends on W, by W∗. From now on we concentrate only on the set Z.
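As an illustration, the Fry points Z of (2.2.2) can be computed directly from the pairwise differences; a minimal R sketch is given below (the helper fry_points is ours, not part of any package; spatstat also offers the functions fryplot and frypoints for this purpose).

```r
library(spatstat)

# A minimal sketch: compute the Fry points Z, i.e. all pairwise difference
# vectors y - x of distinct points of a planar pattern X observed in W.
fry_points <- function(X) {
  dx <- outer(X$x, X$x, "-")   # dx[i, j] = x-coordinate of point i minus point j
  dy <- outer(X$y, X$y, "-")
  keep <- row(dx) != col(dx)   # drop the null differences x - x
  data.frame(x = dx[keep], y = dy[keep])
}

X <- rpoispp(100)              # an isotropic example pattern on the unit square
Z <- fry_points(X)
plot(Z, asp = 1, pch = ".")    # a point cloud symmetric about the origin
```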

In the next sections we will see that the Fry points can be exploited in order to analyze anisotropy in stationary point processes. Moreover, due to their structure, the Fry points are useful for visualizing anisotropy both in 2D and in 3D (Section 4.4).

Let us first look at the properties of the Fry points Z under isotropy. If X is isotropic and if W = Br(c), r ∈ R+, c ∈ Rd, we will prove that the distribution of the Fry points Z is rotationally symmetric with respect to rotations about the origin. The condition on W is necessary since it implies that the window W∗ is also a ball and therefore invariant under rotations. If W is not a ball we can always restrict ourselves to the largest ball contained in W. If R0 ∈ SO(d) we can write

R0Z = {R0y − R0x, x ≠ y, x, y ∈ XW } = {y − x, x ≠ y, x, y ∈ R0(XW )}. (2.2.3)


We then have that

R_0(X_W) = (R_0X)_{R_0W} \overset{(1)}{\sim} (R_0X)_W \overset{(2)}{\sim} X_W  (2.2.4)

where in (1) we exploited the fact that W is a ball together with the stationarity of X, and in (2) the isotropy of X. From Equations (2.2.3) and (2.2.4) we can easily derive that

R0Z ∼ Z ∀R0 ∈ SO(d). (2.2.5)

Since Definition (2.2.2) considers X restricted to the observation window W, estimations involving X and Z are both affected by edge effects. For instance, in W smaller distances between points are more likely to be observed than larger distances. Edge effects can be treated in different ways. In estimations involving Z, the translational edge correction weights already introduced in Equation 1.5.13 are particularly useful, since they can provide unbiased estimators considering only XW. Another possible edge treatment is given by the so-called minus-sampling. In this case only the differences

    {y − x, ||y − x|| < dist(x, ∂W )} (2.2.6)

are taken into consideration. This means that, fixing x ∈ X, the difference y − x is considered only if y ∈ Br(x), where Br(x) is the largest ball centered at x contained in W. In this way, in case of isotropy, for each point x the additional point y has the same probability of being observed in all directions. The minus-sampling edge correction favors smaller distances and produces a point process which is no longer symmetric with respect to the origin. A minimal sketch of this edge treatment is given below; Figure 2.2.1 then shows the Fry points of the two explicative examples.
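The sketch assumes X is given as a spatstat ppp object; the helper fry_minus and its name are ours, for illustration only.

```r
library(spatstat)

# A minimal sketch of minus-sampling, cf. (2.2.6): the difference y - x is
# kept only if ||y - x|| < dist(x, boundary of W).
fry_minus <- function(X) {
  bd <- bdist.points(X)          # distance of each point to the border of W
  dx <- outer(X$x, X$x, "-")     # column j plays the role of x, row i of y
  dy <- outer(X$y, X$y, "-")
  len <- sqrt(dx^2 + dy^2)
  keep <- (row(dx) != col(dx)) & (len < bd[col(dx)])
  data.frame(x = dx[keep], y = dy[keep])
}
```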

Figure 2.2.1: Fry points of the two explicative example patterns restricted to the window [−0.11, 0.11] × [−0.11, 0.11]. On the left the regular case and on the right the clustered case.
