combining data in species distribution models

Upload: bob-ohara

Post on 14-Dec-2014

DESCRIPTION

Using point process models to combine different data types for species distribution models. Slides for a talk at ISEC 2014, presented on 3rd July.

TRANSCRIPT

Page 1: Combining Data in Species Distribution Models

Combining Data in Species Distribution Models

Bob O’Hara (1), Petr Keil (2), Walter Jetz (2)

(1) BiK-F, Biodiversity and Climate Change Research Centre, Frankfurt am Main, Germany; bobohara

(2) Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA

Page 2: Combining Data in Species Distribution Models

Motivation

Map Of Life

www.mol.org/

Page 3: Combining Data in Species Distribution Models

The Problem

Different data sources

- GBIF
- expert range maps
- eBird and similar citizen science efforts
- organised surveys (BBS, BMSs)

Page 4: Combining Data in Species Distribution Models

Point Process Models

Point process representation of actual distribution

- Continuous space models

Build different sampling models on top

Page 5: Combining Data in Species Distribution Models

Point Processes: Model

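Intensity $\rho(\xi)$ at point $\xi$. Assume covariates (features?) $X(\xi)$, and a random field $\nu(\xi)$:

$$\log(\rho(\xi)) = \eta(\xi) = \sum \beta X(\xi) + \nu(\xi)$$

then, for an area $A$, the number of points $N(A)$ is Poisson distributed:

$$P(N(A) = r) = \frac{\lambda(A)^r \, e^{-\lambda(A)}}{r!}$$

where

$$\lambda(A) = \int_A e^{\eta(\xi)} \, d\xi$$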
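The count distribution above can be checked numerically. The sketch below is plain Python, not the talk's R code. It assumes a constant log-intensity over the area, so that λ(A) = |A| e^η. The values of `eta` and `area` are made up for illustration.

```python
import math

def poisson_pmf(r, lam):
    """P(N(A) = r) for a Poisson count with mean lam = lambda(A)."""
    return lam ** r * math.exp(-lam) / math.factorial(r)

# Hypothetical values: constant log-intensity eta over an area of size |A|,
# so the integral of exp(eta) over A reduces to |A| * exp(eta).
eta = math.log(2.0)   # assumed log-intensity
area = 1.5            # assumed |A|
lam = area * math.exp(eta)

# Probabilities of seeing 0, 1, 2, ... points in A
probs = [poisson_pmf(r, lam) for r in range(50)]
p_empty = poisson_pmf(0, lam)   # probability that A contains no points
```

The probability of an empty area, `p_empty`, equals exp(−λ(A)), which is the quantity the presence/absence models later in the talk build on.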

Page 6: Combining Data in Species Distribution Models

In practice...

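Constrained refined Delaunay triangulation

Approximate $\lambda(A)$ numerically: select some integration points, and sum over those

$$\lambda(A) \approx \sum_{s=1}^{N} |A(s)| \, e^{\eta(s)}$$

where $|A(s)|$ is the area assigned to integration point $s$.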
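A minimal sketch of this weighted sum, in plain Python. For simplicity it assumes a 1-D "area" with equally sized cells, not the triangulated mesh used in the talk. The function `approx_lambda` and the linear `eta` are hypothetical choices that make the exact integral available for comparison.

```python
import math

def approx_lambda(eta, points, weights):
    """Approximate lambda(A) by summing |A(s)| * exp(eta(s)) over integration points."""
    return sum(w * math.exp(eta(s)) for s, w in zip(points, weights))

# Hypothetical example: A = [0, 1] and eta(x) = 1 + 2x
def eta(x):
    return 1.0 + 2.0 * x

n = 1000
weights = [1.0 / n] * n                      # |A(s)|: equal cell lengths
points = [(i + 0.5) / n for i in range(n)]   # integration points at cell midpoints

lam_hat = approx_lambda(eta, points, weights)
lam_exact = math.e * (math.exp(2.0) - 1.0) / 2.0   # closed-form integral of exp(1 + 2x) on [0, 1]
```

With more integration points `lam_hat` moves closer to `lam_exact`, at the cost of more evaluations of η.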

Page 7: Combining Data in Species Distribution Models

Some Data Types

- Abundance
  - e.g. point counts
- Presence/absence
  - surveys, areal lists
- Point observations
  - museum archives, citizen science observations
- Expert range maps

Page 8: Combining Data in Species Distribution Models

Abundance

Assume a small area $A$, so that $\eta(\xi)$ is constant, and observation for a time $t$; then $n(A, t) \sim \mathrm{Po}(e^{\mu_A(A, t)})$ with

$$\mu_A(A, t) = \eta(A) + \log(|A|) + \log(t) + \log(p)$$

where $p$ is the probability of observing each individual.

Don’t know all of $|A|$, $t$ and $p$, so estimate an intercept

Can also add a sampling model to $\log(p)$
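To see why the three unknown terms can be absorbed into one intercept, here is a small plain-Python sketch. All numbers are made up. `expected_count` is a hypothetical helper that evaluates the Poisson mean from the linear predictor for abundance data.

```python
import math

def expected_count(eta, area, duration, p_detect):
    """Poisson mean exp(mu) with mu = eta + log|A| + log t + log p."""
    mu = eta + math.log(area) + math.log(duration) + math.log(p_detect)
    return math.exp(mu)

# Hypothetical values for the log-intensity and the sampling terms
eta = math.log(4.0)
area, duration, p_detect = 0.5, 2.0, 0.25

mean_count = expected_count(eta, area, duration, p_detect)

# The sampling terms only enter as one additive constant on the log scale
intercept = math.log(area) + math.log(duration) + math.log(p_detect)
mean_from_intercept = math.exp(eta + intercept)
```

Only the sum `intercept` affects the mean, so |A|, t and p cannot be separated from count data alone unless some of them are known.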

Page 9: Combining Data in Species Distribution Models

Presence/Absence for ’points’

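As $n(A, t) \sim \mathrm{Po}(e^{\mu_I(A, t)})$, we have $\Pr(n(A, t) > 0) = 1 - \exp(-e^{\mu_I(A, t)})$, so

$$\mathrm{cloglog}(\Pr(n(A, t) > 0)) = \mu_I(A, t)$$

with $\mu_I(A, t)$ of the same form as $\mu_A(A, t)$ before

Again, can make $\log(|A|) + \log(t) + \log(p)$ an intercept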
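The link between a Poisson count and the complementary log-log scale can be verified directly. This plain-Python sketch uses arbitrary values of the linear predictor; `cloglog` and `prob_presence` are hypothetical helper names.

```python
import math

def cloglog(p):
    """Complementary log-log transform: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))

def prob_presence(mu):
    """Pr(n > 0) when n is Poisson with mean exp(mu)."""
    return 1.0 - math.exp(-math.exp(mu))

# Arbitrary linear predictor values; cloglog should map each probability back to mu
mus = [-2.0, -0.5, 0.0, 1.0]
recovered = [cloglog(prob_presence(mu)) for mu in mus]
```

Each entry of `recovered` matches the corresponding `mu` up to rounding. This is why a binomial model with a cloglog link shares its linear predictor with the Poisson model for counts.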

Page 10: Combining Data in Species Distribution Models

Presence only: point process

log Gaussian Cox Process

Likelihood is a Poisson GLM (but with non-integer response)
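One standard route to a Poisson GLM with a non-integer response is the Berman-Turner device: each quadrature point gets a weight w and a pseudo-response y = z / w, where z is 1 at an observed point and 0 otherwise. Whether the talk used exactly this construction is an assumption. The plain-Python sketch below only checks that the weighted Poisson log-likelihood (without the factorial term) equals the quadrature approximation of the point process log-likelihood. All values are made up.

```python
import math

def ppm_loglik(eta_at_data, eta_at_quad, quad_weights):
    """Approximate point process log-likelihood:
    sum of eta at observed points minus the quadrature sum for lambda(A)."""
    integral = sum(w * math.exp(e) for e, w in zip(eta_at_quad, quad_weights))
    return sum(eta_at_data) - integral

def weighted_poisson_loglik(eta, weights, is_data):
    """Weighted Poisson log-likelihood (factorial term dropped) with response y = z / w."""
    total = 0.0
    for e, w, z in zip(eta, weights, is_data):
        y = z / w                       # non-integer pseudo-response
        total += w * (y * e - math.exp(e))
    return total

# Hypothetical quadrature scheme: the first two points are observations
eta = [0.2, -0.1, 0.5, 0.0, -0.3]
weights = [0.15, 0.05, 0.3, 0.3, 0.2]
is_data = [1, 1, 0, 0, 0]

direct = ppm_loglik([e for e, z in zip(eta, is_data) if z], eta, weights)
as_glm = weighted_poisson_loglik(eta, weights, is_data)
```

The two values agree because w (y η − e^η) = z η − w e^η for every quadrature point.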

Page 11: Combining Data in Species Distribution Models

Areal Presence/absence

If an area is large enough, we can’t assume constant covariates, so

$$\Pr(n(A) > 0) = 1 - \exp\left(-\int_A e^{\eta(\xi)} \, d\xi\right)$$

in practice this is calculated as

$$1 - \exp\left(-\sum_s |A(s)| \, e^{\eta(s)}\right)$$

which causes problems with the fitting
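A plain-Python sketch of the areal presence probability, using a 1-D area and a made-up linear log-intensity. It compares the summed version with the shortcut of treating η as constant at its central value. `prob_presence_area` is a hypothetical helper.

```python
import math

def prob_presence_area(eta_at_points, weights):
    """Pr(n(A) > 0) = 1 - exp(-sum of |A(s)| * exp(eta(s)))."""
    lam = sum(w * math.exp(e) for e, w in zip(eta_at_points, weights))
    return 1.0 - math.exp(-lam)

# Hypothetical area [0, 10] with log-intensity eta(x) = -3 + 0.2 x
n = 200
weights = [10.0 / n] * n
points = [10.0 * (i + 0.5) / n for i in range(n)]
eta_vals = [-3.0 + 0.2 * x for x in points]

p_area = prob_presence_area(eta_vals, weights)

# Shortcut that wrongly treats eta as constant at the centre of the area
p_const = 1.0 - math.exp(-10.0 * math.exp(-3.0 + 0.2 * 5.0))
```

In this example `p_area` exceeds `p_const`, since averaging exp(η) over the area gives more weight to the high-intensity end than exp of the average η does.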

Page 12: Combining Data in Species Distribution Models

Expert Range Maps

Not the same as areal presence. Instead, use distance to range as a covariate

- within range, this is 0.
- Have to estimate the slope for outside the range

Use informative priors to force the slope to be negative

[Figure: intensity (0.0 to 1.0) against space (1d, 0 to 100), with the species' range marked]
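The distance-to-range covariate can be sketched in one dimension, in plain Python. The range limits (30 to 70 on a 0 to 100 axis) and the slope of −0.1 are assumptions for illustration, not values from the talk.

```python
import math

def dist_to_range(x, lo, hi):
    """Distance from x to the interval [lo, hi]; 0 anywhere inside the range."""
    if x < lo:
        return lo - x
    if x > hi:
        return x - hi
    return 0.0

def relative_intensity(x, lo, hi, slope):
    """exp(slope * distance): 1 inside the range, decaying outside when slope < 0."""
    return math.exp(slope * dist_to_range(x, lo, hi))

lo, hi = 30.0, 70.0   # assumed range limits
slope = -0.1          # assumed negative slope for the covariate

# Intensity profile along the 1-D space, relative to the value inside the range
profile = [(x, relative_intensity(x, lo, hi, slope)) for x in range(0, 101, 10)]
```

A positive slope would make intensity increase away from the range, which is why the prior is used to keep the slope negative.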

Page 13: Combining Data in Species Distribution Models

Put these together with INLA

Quicker than MCMC

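    library(INLA)
    # SolTim.formula and stk.all (the combined inla.stack) are built elsewhere;
    # they are not shown in the slides.
    SolTim.res <- inla(SolTim.formula,
        family = c('poisson', 'binomial'),              # two likelihoods fitted jointly
        data = inla.stack.data(stk.all),
        control.family = list(list(link = "log"),       # log link for the Poisson part
                              list(link = "cloglog")),  # cloglog link for the binomial part
        control.predictor = list(A = inla.stack.A(stk.all)),
        Ntrials = 1,                                    # one trial per binomial record
        E = inla.stack.data(stk.all)$e,                 # Poisson exposures stored in the stack
        verbose = FALSE)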

Page 14: Combining Data in Species Distribution Models

The Solitary Tinamou

Photo credit: Francesco Veronesi on Flickr (https://www.flickr.com/photos/francesco veronesi/12797666343)

Page 15: Combining Data in Species Distribution Models

Data

[Map legend: Whole region; Expert range; Park, absent; Park, present; eBird; GBIF]

- expert range
- 2 point processes (49 points)
- 28 parks

Page 16: Combining Data in Species Distribution Models

A Fitted Model

              mean    sd    mode
Intercept    -0.30   0.09  -0.30
b.PP          1.37   0.40   1.37
b.GBIF        1.43   0.26   1.43
Forest       -0.03   0.04  -0.03
NPP           0.15   0.05   0.15
Altitude     -0.02   0.04  -0.02
DistToRange  -0.01   0.02  -0.01
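A rough way to read the table is to form mean ± 1.96 sd for each coefficient and see which intervals exclude zero. This assumes the marginal posteriors are close to normal, which the slides do not state. The sketch below is plain Python with the numbers copied from the table; it is not part of the original analysis.

```python
# Posterior means and standard deviations copied from the fitted-model table
estimates = {
    "Intercept": (-0.30, 0.09),
    "b.PP": (1.37, 0.40),
    "b.GBIF": (1.43, 0.26),
    "Forest": (-0.03, 0.04),
    "NPP": (0.15, 0.05),
    "Altitude": (-0.02, 0.04),
    "DistToRange": (-0.01, 0.02),
}

def approx_interval(mean, sd, z=1.96):
    """Approximate 95% interval under a normal approximation (an assumption)."""
    return (mean - z * sd, mean + z * sd)

intervals = {name: approx_interval(m, s) for name, (m, s) in estimates.items()}

# True when the approximate interval lies entirely on one side of zero
excludes_zero = {name: not (lo <= 0.0 <= hi) for name, (lo, hi) in intervals.items()}
```

Among the environmental covariates (Forest, NPP, Altitude), only NPP has an approximate interval that excludes zero.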

Page 17: Combining Data in Species Distribution Models

Predicted Distribution

[Figure: map of the predicted distribution; colour scale from −0.10 to 0.25; legend: Whole region; Expert range; Park, absent; Park, present; eBird; GBIF]

Page 18: Combining Data in Species Distribution Models

Individual Data Types

[Figure: one map panel per data source, with colour scale ranges: Expert Range (−10 to 0); GBIF (−0.060 to −0.048); eBird (−0.060 to −0.048); Parks (−10 to 0); all data (−0.10 to 0.25)]

Page 19: Combining Data in Species Distribution Models

Summary

Parks and expert range seem to drive distribution

NPP is main covariate, not forest or altitude

Page 20: Combining Data in Species Distribution Models

What Next

Multiple species

- already being done elsewhere
- estimate sampling biases

More Data

- Point counts (have it working)

Can we estimate absolute probability of presence?

- Distance sampling?
- Mark-recapture?
- scaling issues (in time and space)

Page 21: Combining Data in Species Distribution Models

Not the final answer...

http://www.gocomics.com/nonsequitur/2014/06/24