1 spatial processes and statistical modelling peter green university of bristol, uk bccs gm&css...

38
1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

Upload: melanie-hurst

Post on 28-Mar-2015

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

1

Spatial processes and statistical modelling

Peter GreenUniversity of Bristol, UK

BCCS GM&CSS 2008/09 Lecture 8

Page 2: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

2

• Continuous space

• Discrete space– lattice– irregular - general graphs– areally aggregated

• Point processes– other object processes

Spatial indexing

Page 3: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

3

Space vs. time

• apparently slight difference• profound implications for

mathematical formulation and computational tractability

Page 4: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

4

Requirements of particular application domains• agriculture (design)

• ecology (sparse point pattern, poor data?)

• environmetrics (space/time)

• climatology (huge physical models)

• epidemiology (multiple indexing)

• image analysis (huge size)

Page 5: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

5

Key themes

• conditional independence– graphical/hierarchical modelling

• aggregation– analysing dependence between differently

indexed data– opportunities and obstacles

• literal credibility of models• Bayes/non-Bayes distinction blurred

Page 6: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

6

Why build spatial dependence into a model?• No more reason to suppose

independence in spatially-indexed data than in a time-series

• However, substantive basis for form of spatial dependent sometimes slight - very often space is a surrogate for missing covariates that are correlated with location

Page 7: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

7

Discretely indexed data

Page 8: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

8

Modelling spatial dependence in discretely-indexed fields• Direct• Indirect

– Hidden Markov models– Hierarchical models

Page 9: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

9

Hierarchical models, using DAGs

Variables at several levels - allows modelling of complex systems, borrowing strength, etc.

Page 10: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

10

Modelling with undirected graphsDirected acyclic graphs are a natural

representation of the way we usually specify a statistical model - directionally:

• disease symptom• past future• parameters data ……whether or not causality is understood.But sometimes (e.g. spatial models) there is

no natural direction

Page 11: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

11

Conditional independence

In model specification, spatial context often rules out directional dependence (that would have been acceptable in time series context)

X0 X1 X2 X3 X4

Page 12: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

12

Conditional independence

In model specification, spatial context often rules out directional dependence

X20 X21 X22 X23 X24

X00 X01 X02 X03 X04

X10 X11 X12 X13 X14

Page 13: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

13

Conditional independence

In model specification, spatial context often rules out directional dependence

X20 X21 X22 X23 X24

X00 X01 X02 X03 X04

X10 X11 X12 X13 X14

Page 14: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

14

Directed acyclic graph

)|()( )(pa vVv

v xxpxp

in general:

for example:

a b

c

dp(a,b,c,d)=p(a)p(b)p(c|a,b)p(d|c)In the RHS, any distributions are legal, and uniquely define joint distribution

Page 15: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

15

Undirected (CI) graph

X20 X21 X22

X00 X01 X02

X10 X11 X12Absence of edge denotes conditional independence given all other variables

But now there are non-trivial constraints on conditional distributions

Regular lattice, irregular graph, areal data...

Page 16: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

16

Undirected (CI) graph

X20 X21 X22

X00 X01 X02

X10 X11 X12

C

CC XVXp ))(exp)(

iC

CCii XVXXp ))(exp)|(

)|()|( iiii XXpXXp

then

and so

The Hammersley-Clifford theorem says essentially that the converse is also true - the only sure way to get a valid joint distribution is to use ()

()

clique

Suppose we assume

Page 17: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

17

Hammersley-Clifford

X20 X21 X22

X00 X01 X02

X10 X11 X12

C

CC XVXp )(exp)(

)|()|( iiii XXpXXp

A positive distribution p(X) is a Markov random field

if and only if it is a Gibbs distribution

- Sum over cliques C (complete subgraphs)

Page 18: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

18

Partition function

X20 X21 X22

X00 X01 X02

X10 X11 X12

C

CC XVXp )(exp)(

Almost always, the constant of proportionality in

is not available in tractable form: an obstacle to likelihood or Bayesian inference about parameters in the potential functions

Physicists call

the partition function

))( CC XV

X C

CC XVZ )(exp

Page 19: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

19

Markov properties for undirected graphs• The situation is a bit more complicated

than it is for DAGs. There are 4 kinds of Markovness:

• P – pairwise– Non-adjacent pairs of variables are

conditionally independent given the rest

• L – local– Conditional only on adjacent variables

(neighbours), each variable is independent of all others

Page 20: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

20

• G – global– Any two subsets of variables separated by a

third are conditionally independent given the values of the third subset.

• F – factorisation– the joint distribution factorises as a product

of functions of cliques• In general these are different, but FGLP

always. For a positive distribution, they are all the same.

Page 21: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

21

Gaussian Markov random fields: spatial autoregression

C

CC XVXp )(exp)(

)|()|( iiii XXpXXp is a multivariate Gaussian distribution, and

If VC(XC) is -ij(xi-xj)2/2 for C={i,j} and 0 otherwise, then

is the univariate Gaussian distribution

),(~| 22i

ijjijiii XNXX

ij

iji /12where

Page 22: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

22

Inverse of (co)variance matrix:

dependent case3210

2410

1121

0012

A B C D

non-zero

non-zero),|,cov( CADB

A B C D

A

B

C

D

Gaussian random fields

Page 23: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

23

Non-Gaussian Markov random fields

ji

ji xx

~

)cosh(log)1(exp

C

CC XVXp )(exp)(

Pairwise interaction random fields with less smooth realisations obtained by replacing squared differences by a term with smaller tails, e.g.

Page 24: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

24

Discrete-valued Markov random fields

C

CC XVXp )(exp)(

Besag (1974) introduced various cases of

for discrete variables, e.g. auto-logistic (binary variables), auto-Poisson (local conditionals are Poisson), auto-binomial, etc.

Page 25: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

25

Auto-logistic model

C

CC XVXp )(exp)(

ji

jiiji

ii xxx~

exp

)|()|( iiii XXpXXp is Bernoulli(pi) with

ij

jijiii xpp )())1/(log(

- a very useful model for dependent binary variables (NB various parameterisations)

(Xi = 0 or 1)

Page 26: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

26

Statistical mechanics models

C

CC XVXp )(exp)(

The classic Ising model (for ferromagnetism) is the symmetric autologistic model on a square lattice in 2-D or 3-D. The Potts model is the generalisation to more than 2 ‘colours’

ji

jii

x xxIXpi

~

][exp)(

and of course you can usefully un-symmetrise this.

Page 27: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

27

Auto-Poisson model

C

CC XVXp )(exp)(

ji

jiiji

iii xxxx~

)!log(exp

)|()|( iiii XXpXXp is Poisson ))(exp(

ij

jiji x

For integrability, ij must be 0, so this onlymodels negative dependence: very limited use.

Page 28: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

28

Hierarchical models and hidden Markov processes

Page 29: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

29

Chain graphs

• If both directed and undirected edges, but no directed loops:

• can rearrange to form global DAG with undirected edges within blocks

Page 30: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

30

Chain graphs

• If both directed and undirected edges, but no directed loops:

• can rearrange to form global DAG with undirected edges within blocks

• Hammersley-Clifford within blocks

Page 31: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

31

Hidden Markov random fields• We have a lot of freedom modelling

spatially-dependent continuously-distributed random fields on regular or irregular graphs

• But very little freedom with discretely distributed variables

use hidden random fields, continuous or discrete

• compatible with introducing covariates, etc.

Page 32: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

32

Hidden Markov models

z0 z1 z2 z3 z4

y1 y2 y3 y4

e.g. Hidden Markov chain

observed

hidden

Page 33: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

33

Hidden Markov random fields

Unobserved dependent field

Observed conditionally-independent discrete field

(a chain graph)

Page 34: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

34

Spatial epidemiology applications

independently, for each region i. Options:• CAR, CAR+white noise (BYM, 1989)• Direct modelling of ,e.g. SAR• Mixture/allocation/partition models:

• Covariates, e.g.:

)(~ iii ePoissonY casesexpected

cases

relative risk

)log( i))cov(log( i

izi eYEi

)(

)exp()( Tiizi xeYE

i

Page 35: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

35

Spatial epidemiology applications

Spatial contiguity is usually somewhat idealised

Page 36: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

36

relativerisk

parameters

Richardson & Green (JASA, 2002) used a hidden Markov random field model for disease mapping

)(Poisson~ izi eyi

observedincidence

expectedincidencehidden

MRF

Spatial epidemiology applications

Page 37: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

37)(Poisson~ ii eY

Chain graph for disease mapping

e

Y

zk

)(Poisson~ izi eYi

based on Potts model

Page 38: 1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

38

Larynx cancer in females in France

SMRs

)|1( ypiz

ii Ey /