Multi-scale Analysis: Options for Modeling
Presence/Absence of Bird Species
Kathryn M. Georgitis (1), Alix I. Gitelman (1), and Nick Danz (2)
(1) Statistics Department, Oregon State University
(2) Natural Resources Research Institute, University of Minnesota-Duluth
The research described in this presentation has been funded by the U.S. Environmental Protection Agency through the STAR Cooperative Agreement CR82-9096-01 Program on Designs and Models for Aquatic Resource Surveys at Oregon State University. It has not been subjected to the Agency's review and therefore does not necessarily reflect the views of the Agency, and no official endorsement should be inferred.
Talk Overview
• Ecological Question of Interest
• Western Great Lakes Breeding Bird Study
• Interesting Features of our Example
• Options for Modeling Species Presence/Absence
  (1) Separate Models for Each Spatial Extent
  (2) One Model for all Spatial Extents
  (3) Model using Functionals of Explanatory Variables
  (4) Graphical Model
Ecological Question of Interest
• How does the relationship between landscape characteristics and presence of a bird species change with scale?
• What scale is the most useful in terms of understanding bird presence/absence?
Concentric Circle Sampling Design
[Figure: concentric circles labeled 100 m, 500 m, and 1000 m]
Western Great Lakes Breeding Bird Study
• Response Variable:
  – Presence/Absence of Pine Warbler
• Explanatory Variables:
  – % land cover within 4 different spatial extents
  – Ten land cover types
Interesting Features of the Data
Correlation between Explanatory Variables

Spatial Extent | pine and oak-pine / spruce-fir | lowland non-forest / n. hardwoods | n. hardwoods / aspen-birch
100m           | -0.31 (0.08)                   | -0.08 (0.08)                      | -0.07 (0.08)
500m           |  0.03 (0.08)                   | -0.17 (0.08)                      | -0.14 (0.08)
1000m          |  0.11 (0.08)                   | -0.24 (0.08)                      | -0.26 (0.08)
5000m          |  0.21 (0.08)                   | -0.58 (0.06)                      | -0.63 (0.06)
Correlation Between Pine and Oak-Pine Measured at Different Scales

Spatial Extent | 100m | 500m        | 1000m       | 5000m
100m           | 1    | 0.81 (0.05) | 0.70 (0.06) | 0.45 (0.07)
500m           |      | 1           | 0.95 (0.03) | 0.70 (0.06)
1000m          |      |             | 1           | 0.79 (0.05)
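The entries in the correlation tables above are pairwise correlations between land cover percentages. As a rough illustration only, the sketch below computes a sample Pearson correlation between one cover type measured at two extents. The function name `pearson` and the five made-up site values are assumptions for illustration and are not the study data.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical % pine and oak-pine at five sites, within 500 m and within 1000 m.
pine_500 = [10.0, 25.0, 40.0, 5.0, 60.0]
pine_1000 = [12.0, 22.0, 35.0, 8.0, 50.0]

r = pearson(pine_500, pine_1000)
print(round(r, 3))
```

Because the extents are nested, values at neighboring extents share much of the same land area, which is one reason high correlations between adjacent scales are to be expected.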
Relationship between Land Cover Variables and Spatial Extent
[Figure: percentage of pine and oak-pine (y-axis, 0 to 60) plotted against spatial extent in m (x-axis, 0 to 5000), with one series each for the Chequamegon, Chippewa, St. Croix, and Superior forests]
Options for Modeling Presence/Absence of Pine Warbler
(1) Separate Models for Each Spatial Extent
(2) One Model for all Spatial Extents
(3) Model using Functionals of Explanatory Variables
(4) Bayesian Network (Graphical) Model
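Option 1: Separate Models Approach
(100m)  M1  : $\log\left(\pi (1-\pi)^{-1}\right) = X_{100}\,\beta_{100}$
(500m)  M5  : $\log\left(\pi (1-\pi)^{-1}\right) = X_{500}\,\beta_{500}$
(1000m) M10 : $\log\left(\pi (1-\pi)^{-1}\right) = X_{1000}\,\beta_{1000}$
(5000m) M50 : $\log\left(\pi (1-\pi)^{-1}\right) = X_{5000}\,\beta_{5000}$
where Y denotes the n-length vector of binary responses with $\Pr(Y_i = 1) = \pi_i$, and $X_{100}$ denotes the matrix of explanatory variables at the 100m scale (similarly for the other scales)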
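Each of the separate models is an ordinary logistic regression. As a minimal sketch, assuming a single explanatory variable plus an intercept, the fit can be obtained by Newton-Raphson as below. The function `fit_logistic`, the synthetic data, and the "true" coefficients are illustrative assumptions, not the authors' code or data.

```python
import math
import random

def fit_logistic(x, y, iters=25):
    """Fit log(pi/(1-pi)) = b0 + b1*x by Newton-Raphson (one predictor)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                  # entries of X'WX (negative Hessian)
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X'WX) delta = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Synthetic data, NOT the study data: x plays the role of a % land cover variable.
rng = random.Random(1)
true_b0, true_b1 = -3.0, 0.08
x = [rng.uniform(0.0, 60.0) for _ in range(500)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(true_b0 + true_b1 * xi))) else 0
     for xi in x]

b0_hat, b1_hat = fit_logistic(x, y)
print(round(b0_hat, 2), round(b1_hat, 3))
```

In practice a statistics package would be used, and the models in the talk include several land cover variables and interactions; the sketch only shows the form of the likelihood being maximized.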
Model | Significant explanatory variables selected using BIC
M1    | lowland conifer, pine and oak-pine
M5    | lowland conifer, pine and oak-pine, spruce-fir, spruce-fir:pine and oak-pine
M10   | pine and oak-pine, spruce-fir, spruce-fir:pine and oak-pine
M50   | pine and oak-pine, forest (a), forest (a):spruce-fir, spruce-fir
(a) The forest variable is an indicator for stands located in the Chequamegon National Forest in Wisconsin.
Option 1: Separate Models Approach
• Disadvantages:
  – does not account for possible relationships between spatial extents
  – multi-collinearity of explanatory variables
  – 2^10 possible models for each spatial extent
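With ten land cover types, each either included in or excluded from a model, the number of candidate main-effects models at a single spatial extent is 2^10. The short sketch below enumerates the subsets to confirm the count. Cover types are represented by index only, and interaction terms are deliberately not counted, which is an assumption about how the figure was obtained.

```python
from itertools import combinations

n_cover_types = 10  # ten land cover types, per the study description

# Every subset of main effects is a candidate model (interactions not counted).
subsets = [c
           for k in range(n_cover_types + 1)
           for c in combinations(range(n_cover_types), k)]

print(len(subsets))        # 1024
print(2 ** n_cover_types)  # 1024
```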
Option 2: One Model for all Spatial Extents
Mall : $\log\left(\pi (1-\pi)^{-1}\right) = X_{all}\,\beta_{all}$
where Y denotes the n-length vector of binary responses with $\Pr(Y_i = 1) = \pi_i$, and
$X_{all} = [\,\ldots\,]$
Spatial extent | Explanatory variables selected using BIC for Mall
100m           | aspen-birch, northern hardwoods, pine and oak-pine, spruce-fir
500m           | none
1000m          | spruce-fir
100m:1000m     | pine and oak-pine:spruce-fir
Option 2: One Model for all Spatial Extents
• Advantages:
  – allows for interactions between scales
• Disadvantages:
  – serious multi-collinearity problems
  – 2^30 possible models
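One way to gauge how serious the multi-collinearity is uses the variance inflation factor (VIF). The VIF is not mentioned in the talk; it is brought in here only as a common diagnostic. For a model with exactly two predictors whose correlation is r, the VIF is 1 / (1 - r^2). The sketch below applies that special case to three of the pine and oak-pine between-scale correlations reported earlier; the function name is an illustrative assumption.

```python
def vif_two_predictors(r):
    """Variance inflation factor for a model with exactly two predictors
    whose sample correlation is r."""
    return 1.0 / (1.0 - r ** 2)

# Between-scale correlations of pine and oak-pine, taken from the talk's table.
reported = [("100m vs 500m", 0.81), ("500m vs 1000m", 0.95), ("100m vs 5000m", 0.45)]

vifs = {pair: vif_two_predictors(r) for pair, r in reported}
for pair, value in vifs.items():
    print(pair, round(value, 2))
```

With a correlation of 0.95, the variance of each coefficient estimate is inflated roughly tenfold relative to uncorrelated predictors, which is consistent with the "serious" label on the slide.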
Option 3: Model using Functionals of Explanatory Variables
• Difference Model Mdiff : $\log\left(\pi (1-\pi)^{-1}\right) = X_{diff}\,\beta_{diff}$
  where $X_{diff}$ is computed element-wise
• Proportional Model Mprop : $\log\left(\pi (1-\pi)^{-1}\right) = X_{prop}\,\beta_{prop}$
  where $X_{prop}$ is computed element-wise
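The slide's exact definitions of the two functionals did not survive, so the following is only a sketch under stated assumptions: the difference functional is taken to be the element-wise difference of the explanatory-variable matrices at two spatial extents, and the proportional functional their element-wise ratio. Which two extents are used, the variable names, and the numbers are all hypothetical.

```python
# Hypothetical % cover for three land cover types (columns) at two sites (rows).
x_small = [[20.0, 5.0, 0.0], [40.0, 10.0, 15.0]]   # assumed smaller extent
x_large = [[25.0, 10.0, 5.0], [30.0, 20.0, 15.0]]  # assumed larger extent

# Element-wise difference of the two matrices.
x_diff = [[s - l for s, l in zip(row_s, row_l)]
          for row_s, row_l in zip(x_small, x_large)]

# Element-wise ratio; None marks cells where the denominator is zero.
x_prop = [[(s / l if l != 0 else None) for s, l in zip(row_s, row_l)]
          for row_s, row_l in zip(x_small, x_large)]

print(x_diff)
print(x_prop)
```

A ratio is undefined wherever a cover type is absent at the denominator extent, so any real implementation would need a rule for those cells.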
Option 3: Model using Functionals of Explanatory Variables
Model | Explanatory variables selected using BIC
Mdiff | pine and oak-pine_diff
Mprop | aspen-birch_prop, pine and oak-pine_prop
Option 3: Model using Functionals of Explanatory Variables
• Advantages:
  – incorporates two spatial extents
• Disadvantages:
  – biologically meaningful?
  – multi-collinearity
  – model selection
Option 4: Graphical Model
• Think of the explanatory variables and the response holistically (i.e., as a single multivariate observation)
[Figure: two diagrams, each over the nodes X1, X2, X3, X4, and Y; one is labeled "Logistic Regression Model" and the other "Bayesian Network (Graphical) Model"]
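The contrast between the two diagrams can be written in terms of which distribution is modeled. A logistic regression specifies only the conditional distribution of Y given the explanatory variables, whereas a Bayesian network factorizes the joint distribution of all nodes according to the graph. The notation below is generic and is an assumption rather than the slides' own: pa(v) denotes the parents of node v in the directed graph.

```latex
\begin{align*}
\text{Logistic regression:} \quad & p(y \mid x_1, x_2, x_3, x_4) \\
\text{Bayesian network:} \quad & p(y, x_1, x_2, x_3, x_4)
  = p\bigl(y \mid \mathrm{pa}(y)\bigr) \prod_{j=1}^{4} p\bigl(x_j \mid \mathrm{pa}(x_j)\bigr)
\end{align*}
```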
Option 4: Graphical Model
For comparison with MALL, we use the same "explanatory" variables
[Figure: diagram with nodes aspen-birch 100m, pine & oak-pine 100m, spruce-fir 1000m, spruce-fir 100m, n. hardwoods 100m, and Pine Warbler]
Option 4: Graphical Model
[Figure: two diagrams, "Diagram of MALL" and "Diagram of Bayesian MALL", each over the nodes spruce-fir 100m, pine & oak-pine 100m, spruce-fir 1000m, aspen-birch 100m, N. hardwoods 100m, and Pine Warbler]
MALL:
  $\log\left(\pi (1-\pi)^{-1}\right) = X_{all}\,\beta_{all}$; $X_{all}$ fixed
Bayesian MALL:
  $Z \sim \mathrm{Multinomial}(P, 100)$
  $\log(\text{spruce-fir}_{1000}) \sim N(\cdot,\cdot)$
  $\log\left(\pi (1-\pi)^{-1}\right) = Z\beta + \gamma \log(\text{spruce-fir}_{1000})$
  where Z = variables in MALL
Option 4: Graphical Model
Comparison of MALL and Bayesian MALL
Land cover type variable                     | MALL          | Bayesian MALL
intercept                                    | -3.87 (1.27)  | -4.20 (1.18)
aspen-birch_100                              | 0.02 (0.01)   | 0.03 (0.01)
northern hardwoods_100                       | 0.03 (0.01)   | 0.03 (0.01)
pine and oak-pine_100                        | 0.06 (0.01)   | 0.10 (0.02)
spruce-fir_100                               | 0.02 (0.01)   | 0.02 (0.01)
log(spruce-fir_1000)                         | 0.3 (0.44)    | 0.34 (0.41)
pine and oak-pine_100 : log(spruce-fir_1000) | -0.02 (0.008) | -0.02 (0.008)
Option 4: Graphical Model
[Figure: two diagrams, labeled "Bayesian MALL" and "Bayesian Network Model", each over the nodes spruce-fir 100m, pine & oak-pine 100m, spruce-fir 1000m, aspen-birch 100m, N. hardwoods 100m, and Pine Warbler]
Bayesian MALL:
  $Z \sim \mathrm{Multinomial}(P, 100)$, where Z = variables in MALL
  $\log(\text{spruce-fir}_{1000}) \sim N(\cdot,\cdot)$
  $\log\left(\pi (1-\pi)^{-1}\right) = Z\beta + \gamma \log(\text{spruce-fir}_{1000})$
Bayesian Network Model:
  $Z_i \sim \mathrm{Multinomial}(P_i, 100)$
  $P_i = (P_{i,1}, P_{i,2}, P_{i,3}, P_{i,4}, P_{i,5})$
  $\log\left(P_{i,1}/(1-P_{i,1})\right) = \log(\text{spruce-fir}_{1000})$
  $\log(\text{spruce-fir}_{1000}) \sim N(\cdot,\cdot)$
  $\log\left(\pi (1-\pi)^{-1}\right) = \beta_0 + \beta_1\,(\text{pine \& oak-pine}_{100})$
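To make the structure of the Bayesian Network Model concrete, the sketch below draws simulated sites by forward (ancestral) sampling: the 1000m node first, then the 100m composition given that node, then the response given pine & oak-pine at 100m. It is an illustration only and says nothing about how the models were fitted. All parameter values, the intercept and slope `a0` and `a1` in the first multinomial probability, the ordering of the five categories, and the fixed split of the remaining probability are assumptions, not values from the talk.

```python
import math
import random

def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))

def simulate_site(rng, mu=1.0, sigma=0.5, a0=-2.0, a1=1.0, b0=-4.0, b1=0.1):
    """Draw one site from a small Bayesian network (all parameter values made up)."""
    # Parent node: log of % spruce-fir within 1000 m.
    log_sf1000 = rng.gauss(mu, sigma)
    # Assumed category order: spruce-fir, pine & oak-pine, aspen-birch,
    # n. hardwoods, other. Only the first probability depends on the parent.
    p1 = logistic(a0 + a1 * log_sf1000)
    rest = [0.4, 0.3, 0.2, 0.1]  # assumed split of the remaining probability
    probs = [p1] + [(1.0 - p1) * r for r in rest]
    # Multinomial(P, 100): 100 draws over the five categories.
    draws = rng.choices(range(5), weights=probs, k=100)
    z = [draws.count(j) for j in range(5)]
    # Response: presence/absence depends only on pine & oak-pine at 100 m.
    y = 1 if rng.random() < logistic(b0 + b1 * z[1]) else 0
    return log_sf1000, z, y

rng = random.Random(0)
sites = [simulate_site(rng) for _ in range(200)]
print(sites[0])
```

In this structure the response is conditionally independent of the other nodes given pine & oak-pine at 100m, which is how a network of this kind can sidestep collinearity among the explanatory variables.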
Option 4: Graphical Model
Comparison of two Bayesian Network Models
Component   | -2 log likelihood for Bayesian MALL | -2 log likelihood for Bayes Network Model
PIWA        | 160.9                               | 179.4
100m Scale  | 25699.5                             | 24478
1000m Scale | 379.4                               | 379.4
Total       | 26239.8                             | 25036.8
BIC total   | 26354 (13)                          | 25062 (11)
Option 4: Graphical Model
• Advantages:
  – considers ecological system holistically
  – can eliminate multi-collinearity
  – biologically meaningful
• Disadvantages:
  – model selection
  – implementation issues
Acknowledgements
Don Stevens, OSU
Jerry Niemi, N.R.R.I Univ. of Minn., Duluth
JoAnn Hanowski, N.R.R.I Univ. of Minn., Duluth