A bootstrap variance estimator for the observed species richness in quadrat sampling
Steen Magnussen, Canadian Forest Service, Victoria BC
Lorenzo Fattorini, University of Siena, IT
Ron McRoberts, USDA Forest Service, St. Paul, Minnesota
TIES09-Bologna, July 5-9, 2009
The number of species (S) in a population is an important indicator of biodiversity.
For many populations a census is infeasible.
A sample survey yields an observed number of species S(n) ≤ S.
Interest in estimating richness.
Model-based estimation of S and its precision.
Quadrat sampling or k-distance sampling is popular and efficient in vegetation surveys.
Sample locations on a grid or systematic non-aligned.
Non-random species associations
Species co-occurrence in a sample unit is predominantly non-random (positive or negative correlation).
Non-randomness gives rise to over-dispersion in the sampling variance of S(n).
Non-random spatial distribution of species has no effect on E(S(n)) but lowers efficiency.
Do we need a variance estimator for S(n)?
The sampling variance of S(n) propagates to estimates of richness.
A variance estimator for the richness estimator $\hat{S}$ should be consistent with the estimator of var[S(n)].
If estimator of S is a function of S(n) and other sample statistics: use delta technique to estimate variance.
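As a worked illustration of the delta technique, assume the richness estimator has the form $\hat{S} = g(S(n), T)$, where T stands for a single additional sample statistic and g is differentiable. Both T and g are hypothetical placeholders, not quantities named in the talk. Under that assumption, a first-order approximation would read:

```latex
\widehat{\mathrm{var}}(\hat{S}) \approx
  \left(\frac{\partial g}{\partial S(n)}\right)^{2} \widehat{\mathrm{var}}[S(n)]
  + 2\,\frac{\partial g}{\partial S(n)}\,\frac{\partial g}{\partial T}\,
    \widehat{\mathrm{cov}}[S(n), T]
  + \left(\frac{\partial g}{\partial T}\right)^{2} \widehat{\mathrm{var}}(T)
```

The first term shows how the sampling variance of S(n) propagates directly into the variance of the richness estimate.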
We have only one design-based variance estimator for S(n), one that can be adapted to S(n) [1], and one based on balanced repeated sample replications [2].
1. Haas PJ, Liu YS, Stokes L. 2006. An estimator of number of species from quadrat sampling. Biometrics 62: 135-14
2. Magnussen S. 2009. A balanced repeated replication estimator of sampling variance for apparent and predicted species richness. For. Sci.
A design-based estimator of variance
Ugland KI, Gray JS, Ellingsen KE. 2003. The species-accumulation curve and estimation of species richness. J. Anim. Ecol. 72: 888-897.
Finite population of N primary sampling units (PSU).
Sampling without replacement.
Impractical but inspirational for the proposed bootstrap estimator of variance.
Designed for sub-sampling applications.
The expectation of S(n)
$$E[S(n)] = S - \binom{N}{n}^{-1} \sum_{j=1}^{S} \binom{N - N_j}{n}$$
where $N_j$ is the number of PSUs in which species j occurs, $N_j / N$ is the relative occurrence of species j in PSUs, and $f_i$ is the number of species found in i of the n sampled PSUs.
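The expectation of S(n) under sampling without replacement follows from the chance that a species present in N_j of the N PSUs is missed by all n sampled PSUs, which is C(N - N_j, n) / C(N, n). The sketch below evaluates that expression for a fully known finite population. The function name and the toy occurrence matrix are illustrative assumptions, not part of the talk.

```python
from math import comb

import numpy as np


def expected_richness(delta, n):
    """E[S(n)] for a wor sample of n PSUs from an N x S 0/1 occurrence matrix."""
    delta = np.asarray(delta).astype(bool)
    N, S = delta.shape
    N_j = delta.sum(axis=0)  # number of PSUs in which species j occurs
    # P(species j is missed) = C(N - N_j, n) / C(N, n);
    # math.comb returns 0 when n exceeds N - N_j.
    p_miss = [comb(int(N - k), n) / comb(N, n) for k in N_j]
    return S - sum(p_miss)


# Hypothetical toy population: 4 PSUs (rows), 3 species (columns).
toy = np.array([[1, 0, 0],
                [1, 1, 0],
                [1, 0, 0],
                [1, 0, 1]])
print(expected_richness(toy, 2))
```

Enumerating the six possible samples of two PSUs from the toy population gives the same mean richness, which is a quick way to check the formula by hand.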
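$$\mathrm{var}[S(n)] = E[S(n)^2] - \left(E[S(n)]\right)^2 = \sum_{i=1}^{S} E[I_i] + 2 \sum_{i<j} E[I_i I_j] - \left(E[S(n)]\right)^2$$
where $I_i = 1$ if the i-th species is in the sample and 0 otherwise,
and $I_i I_j = 1$ if the i-th and j-th species are both in the sample and 0 otherwise.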
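For a fully known finite population, the variance of S(n) can be evaluated from the expectations of the species indicators. The sketch below assumes sampling without replacement and uses inclusion-exclusion for the joint term: E[I_i I_j] = 1 - P(miss i) - P(miss j) + P(miss both), where "miss both" means no sampled PSU contains either species. The function name and toy data are illustrative assumptions.

```python
from itertools import combinations
from math import comb

import numpy as np


def richness_variance(delta, n):
    """var[S(n)] for a wor sample of n PSUs from an N x S 0/1 occurrence matrix."""
    delta = np.asarray(delta).astype(bool)
    N, S = delta.shape
    total = comb(N, n)

    def p_miss(k):
        # Probability that none of k given PSUs is among the n sampled PSUs.
        return comb(N - k, n) / total

    N_i = [int(c) for c in delta.sum(axis=0)]
    e_i = [1.0 - p_miss(k) for k in N_i]  # E[I_i]
    e_s = sum(e_i)                        # E[S(n)]
    pair_sum = 0.0
    for i, j in combinations(range(S), 2):
        # Number of PSUs that contain species i or species j.
        n_union = int((delta[:, i] | delta[:, j]).sum())
        pair_sum += 1.0 - p_miss(N_i[i]) - p_miss(N_i[j]) + p_miss(n_union)
    return e_s + 2.0 * pair_sum - e_s ** 2


# Hypothetical toy population: 4 PSUs (rows), 3 species (columns).
toy = np.array([[1, 0, 0],
                [1, 1, 0],
                [1, 0, 0],
                [1, 0, 1]])
print(richness_variance(toy, 2))
```

The pairwise loop is quadratic in the number of species, which is one reason the exact expression is impractical for anything but small problems.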
The bootstrap estimator
Data: an n × S(n) occurrence matrix $\delta_{n \times S(n)}$ with
$\delta_{ij} = 1$ if species j occurs in the i-th PSU, and
$\delta_{ij} = 0$ otherwise.
• Generate $\delta^{*}_{N \times S(n)}$ by N-n hot-deck imputations for non-sampled PSUs.
• Bootstrap samples (wor) of size n would miss Δ*S(n) species.
• Add Δ*S(n) columns to $\delta^{*}_{N \times S(n)}$.
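Expected number of missed species Δ*S(n)
$$\Delta^{*}S(n) = \binom{N}{n}^{-1} \sum_{j=1}^{S(n)} \binom{N - N_j^{*}}{n}$$
where $N_j^{*}$ is the number of PSUs in $\delta^{*}_{N \times S(n)}$ in which species j occurs.
Add: $\hat{\Delta}^{*}S(n)$ columns to $\delta^{*}_{N \times S(n)}$, with $\delta^{*}_{j} \sim \mathrm{Bernoulli}(N_j^{*} / N)$.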
Adding species (columns)
Select the Δ*S(n) columns to be added from the columns of $\delta^{*}_{N \times S(n)}$ with probability proportional to the chance of being missed in a sample of size n.
Bootstrap sample
Take a size n (wor) random sample from the augmented matrix $\delta^{*}_{N \times (S(n) + \Delta^{*}S(n))}$.
Repeat the sequence of hot-deck imputations, augmentation, and bootstrap sampling B times.
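The sequence of hot-deck imputation, augmentation, and bootstrap sampling can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: donors for the hot-deck step are drawn uniformly from the sampled PSUs, the expected number of missed species is rounded to the nearest integer, and each added column is filled with Bernoulli draws at the relative occurrence of its template column. The function name and the toy sample are hypothetical.

```python
from math import comb

import numpy as np


def bootstrap_richness(sample, N, B=200, seed=0):
    """Return B bootstrap replicates of observed richness (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample).astype(bool)
    n, S_n = sample.shape
    reps = np.empty(B, dtype=int)
    for b in range(B):
        # 1. Hot-deck imputation: each non-sampled PSU copies a random sampled PSU.
        donors = rng.integers(0, n, size=N - n)
        pop = np.vstack([sample, sample[donors]])
        # 2. Chance that each observed species is missed by a wor sample of size n.
        N_j = pop.sum(axis=0)
        p_miss = np.array([comb(int(N - k), n) / comb(N, n) for k in N_j])
        n_add = int(round(p_miss.sum()))  # rounded expected number of missed species
        if n_add > 0:
            # 3. Choose template columns with probability proportional to p_miss,
            #    then fill each new column with Bernoulli(N_j / N) draws (assumption).
            picks = rng.choice(S_n, size=n_add, p=p_miss / p_miss.sum())
            extra = rng.random((N, n_add)) < (N_j[picks] / N)
            pop = np.hstack([pop, extra])
        # 4. Bootstrap sample: n PSUs without replacement; count species present.
        rows = rng.choice(N, size=n, replace=False)
        reps[b] = int(pop[rows].any(axis=0).sum())
    return reps


# Hypothetical sample: 5 PSUs (rows), 4 observed species (columns).
sample = np.array([[1, 0, 0, 1],
                   [1, 1, 0, 0],
                   [1, 0, 0, 0],
                   [0, 0, 1, 0],
                   [1, 1, 0, 0]])
reps = bootstrap_richness(sample, N=40, B=50)
print(reps.mean(), reps.var())
```

The imputation is redrawn inside the loop because the whole sequence, not only the final sampling step, is repeated B times.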
The bootstrap variance estimator
Generate for each bootstrap sample the species sample occurrence indicators
$$I_i^{*,b} \quad \text{and} \quad I_i^{*,b} I_j^{*,b}, \qquad b = 1, \ldots, B, \qquad i, j = 1, \ldots, \max_b \left[ S(n) + \Delta^{*,b}S(n) \right]$$
Compute var(S(n)) as per Ugland et al.
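One way to carry out this step is to replace each expectation in the variance expression by its average over the B bootstrap samples. The sketch below assumes the indicators are stored in a B × K array, where K is the largest species count over the bootstrap samples and unused columns are padded with zeros. The function name and the simulated indicators are illustrative assumptions.

```python
import numpy as np


def variance_from_indicators(I):
    """Variance of richness from a B x K 0/1 array of bootstrap indicators."""
    I = np.asarray(I, dtype=float)
    B = I.shape[0]
    e_i = I.mean(axis=0)   # bootstrap estimates of E[I_i]
    e_ij = (I.T @ I) / B   # bootstrap estimates of E[I_i I_j]; diagonal equals e_i
    e_s = e_i.sum()        # bootstrap estimate of E[S(n)]
    pair_sum = (e_ij.sum() - np.trace(e_ij)) / 2.0  # sum over pairs i < j
    return e_s + 2.0 * pair_sum - e_s ** 2


# Simulated indicators for illustration only: 500 bootstrap samples, 12 species.
rng = np.random.default_rng(1)
I = (rng.random((500, 12)) < 0.6).astype(int)
print(variance_from_indicators(I))
print(np.var(I.sum(axis=1)))
```

Computed this way, the result coincides with the plain variance (divisor B) of the B bootstrap richness values, so the indicator form mainly serves to mirror the design-based expression.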
Assessment of estimator
Simulated wor sampling from large USDA Forest Service FIA collections of plot data.
Sample sizes n = 20, 40, ..., 120 (fp < 0.05).
FIA plot records treated as finite populations.
State-wide inventories from Georgia (GA), Minnesota (MN), and Utah (UT).
Regional inventory from Wisconsin (ASP212).
Monte-Carlo variance = benchmark (10,000 samples).
Coverage of estimated (95%) confidence intervals.
Data set         Year      N (plots)  Species (S)  Species per PSU (min, median, max)  Median no. of trees per plot
ASP212           ca. 2002  5771       76           1, 2, 15                            8
Georgia (GA)     1989      6524       82           1, 4, 16                            16
Georgia (GA)     2006      4429       147          1, 5, 19                            25
Minnesota (MN)   1977      8815       54           1, 4, 11                            19
Minnesota (MN)   2006      5769       70           1, 4, 12                            25
Utah (UT)        1993      2733       20           1, 2, 7                             14
Utah (UT)        2006      2198       20           1, 2, 8                             17
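The Monte-Carlo benchmark and the coverage check can be sketched as follows. The code treats an occurrence matrix as the finite population, draws repeated samples without replacement, and takes the variance of the observed richness as the benchmark. The coverage function assumes a normal-theory interval (estimate ± 1.96 standard errors) around a target value. The interval form, the function names, and the toy population are assumptions rather than details given in the talk.

```python
import numpy as np


def monte_carlo_variance(pop, n, reps=10_000, seed=0):
    """Benchmark variance of S(n) from repeated wor samples of n PSUs."""
    rng = np.random.default_rng(seed)
    pop = np.asarray(pop).astype(bool)
    N = pop.shape[0]
    s = np.empty(reps)
    for r in range(reps):
        rows = rng.choice(N, size=n, replace=False)
        s[r] = pop[rows].any(axis=0).sum()  # observed richness in this sample
    return s.var(ddof=1)


def coverage(estimates, variances, target, z=1.96):
    """Share of intervals estimate +/- z * sqrt(variance) that contain target."""
    estimates = np.asarray(estimates, dtype=float)
    half_width = z * np.sqrt(np.asarray(variances, dtype=float))
    return float(np.mean(np.abs(estimates - target) <= half_width))


# Hypothetical toy population: 4 PSUs (rows), 3 species (columns).
toy = np.array([[1, 0, 0],
                [1, 1, 0],
                [1, 0, 0],
                [1, 0, 1]])
print(monte_carlo_variance(toy, 2, reps=2000))
print(coverage([10.0, 12.0, 20.0], [1.0, 1.0, 1.0], target=11.0))
```

In an assessment like the one described, the variances passed to the coverage function would come from the bootstrap estimator, one per simulated sample.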
Conclusions
The bootstrap variance estimator performs reasonably well in low-intensity sampling in species-rich and species-poor populations.
Tendency to underestimate actual variance.
Coverage of CI95’s typically 0.90-0.93.
About on par with a balanced repeated replication estimator.
Much better than the estimator by Haas et al.