estimating site occupancy rates when detection probabilities are less than one

8
2248 Ecology, 83(8), 2002, pp. 2248–2255 q 2002 by the Ecological Society of America ESTIMATING SITE OCCUPANCY RATES WHEN DETECTION PROBABILITIES ARE LESS THAN ONE DARRYL I. MACKENZIE, 1,5 JAMES D. NICHOLS, 2 GIDEON B. LACHMAN, 2,6 SAM DROEGE, 2 J. ANDREW ROYLE, 3 AND CATHERINE A. LANGTIMM 4 1 Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695-8203 USA 2 U.S. Geological Survey, Patuxent Wildlife Research Center, 11510 American Holly Drive, Laurel, Maryland 20708-4017 USA 3 U.S. Fish and Wildlife Service, Patuxent Wildlife Research Center, 11510 American Holly Drive, Laurel, Maryland 20708-4017 USA 4 U.S. Geological Survey, Florida Caribbean Science Center, Southeastern Amphibian Research and Monitoring Initiative, 7920 NW 71st Street, Gainesville, Florida 32653 USA Abstract. Nondetection of a species at a site does not imply that the species is absent unless the probability of detection is 1. We propose a model and likelihood-based method for estimating site occupancy rates when detection probabilities are ,1. The model provides a flexible framework enabling covariate information to be included and allowing for missing observations. Via computer simulation, we found that the model provides good estimates of the occupancy rates, generally unbiased for moderate detection probabilities (.0.3). We estimated site occupancy rates for two anuran species at 32 wetland sites in Maryland, USA, from data collected during 2000 as part of an amphibian monitoring program, Frog- watch USA. Site occupancy rates were estimated as 0.49 for American toads (Bufo amer- icanus), a 44% increase over the proportion of sites at which they were actually observed, and as 0.85 for spring peepers (Pseudacris crucifer), slightly above the observed proportion of 0.83. Key words: anurans; bootstrap; Bufo americanus; detection probability; maximum likelihood; metapopulation; monitoring; patch occupancy; Pseudacris crucifer; site occupancy. INTRODUCTION We describe an approach to estimating the proportion of sites occupied by a species of interest. We envision a sampling method that involves multiple visits to sites during an appropriate season during which a species may be detectable. However, a species may go unde- tected at these sites even when present. Sites may rep- resent discrete habitat patches in a metapopulation dy- namics context or sampling units (e.g., quadrats) reg- ularly visited as part of a large-scale monitoring pro- gram. The patterns of detection and nondetection over the multiple visits for each site permit estimation of detection probabilities and the parameter of interest, proportion of sites occupied. Our motivation for considering this problem in- volves potential applications in (1) large-scale moni- toring programs and (2) investigations of metapopu- lation dynamics. Monitoring programs for animal pop- ulations and communities have been established throughout the world in order to meet a variety of ob- jectives. Most programs face two important sources of Manuscript received 29 May 2001; revised 11 October 2001; accepted 22 October 2001. 5 Present address: Proteus Research and Consulting Ltd., P.O. Box 5193, Dunedin, New Zealand. E-mail: [email protected] 6 Present address: International Association of Fish and Wildlife Agencies, 444 N. Capitol Street NW, Suite 544, Washington, D.C., 20001 USA. variation that must be incorporated into the design (e.g., see Thompson 1992, Lancia et al. 1994, Thomp- son et al. 1998, Yoccoz et al. 2001, Pollock et al. 2002). The first source of variation is space. Many programs seek to provide inferences about areas that are too large to be completely surveyed. Thus, small areas must be selected for surveying, with the selection being carried out in a manner that permits inference to the entire area of interest (Thompson 1992, Yoccoz et al. 2001, Pol- lock et al. 2002). The second source of variation important to moni- toring program design is detectability. Few animals are so conspicuous that they are always detected at each survey. Instead, some sort of count statistic is obtained (e.g., number of animals seen, heard, trapped, or oth- erwise detected), and a method is devised to estimate the detection probability associated with the count sta- tistic. Virtually all of the abundance estimators de- scribed in volumes such as Seber (1982) and Williams et al. (in press) can be viewed as count statistics divided by estimated detection probabilities. Not allowing for detectability and solely using the count statistic as an index to abundance is unwise. Changes in the count may be a product of random variations or changes in detectability, so it is impossible to make useful infer- ence about the system under investigation. The methods used to estimate detection probabilities of individual animals (and hence abundance) at each site are frequently expensive of time and effort. For

Upload: catherine-a

Post on 15-Oct-2016

218 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: ESTIMATING SITE OCCUPANCY RATES WHEN DETECTION PROBABILITIES ARE LESS THAN ONE

2248

Ecology, 83(8), 2002, pp. 2248–2255q 2002 by the Ecological Society of America

ESTIMATING SITE OCCUPANCY RATES WHEN DETECTIONPROBABILITIES ARE LESS THAN ONE

DARRYL I. MACKENZIE,1,5 JAMES D. NICHOLS,2 GIDEON B. LACHMAN,2,6 SAM DROEGE,2 J. ANDREW ROYLE,3

AND CATHERINE A. LANGTIMM4

1Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695-8203 USA2U.S. Geological Survey, Patuxent Wildlife Research Center, 11510 American Holly Drive,

Laurel, Maryland 20708-4017 USA3U.S. Fish and Wildlife Service, Patuxent Wildlife Research Center, 11510 American Holly Drive,

Laurel, Maryland 20708-4017 USA4U.S. Geological Survey, Florida Caribbean Science Center, Southeastern Amphibian Research and Monitoring Initiative,

7920 NW 71st Street, Gainesville, Florida 32653 USA

Abstract. Nondetection of a species at a site does not imply that the species is absentunless the probability of detection is 1. We propose a model and likelihood-based methodfor estimating site occupancy rates when detection probabilities are ,1. The model providesa flexible framework enabling covariate information to be included and allowing for missingobservations. Via computer simulation, we found that the model provides good estimatesof the occupancy rates, generally unbiased for moderate detection probabilities (.0.3). Weestimated site occupancy rates for two anuran species at 32 wetland sites in Maryland,USA, from data collected during 2000 as part of an amphibian monitoring program, Frog-watch USA. Site occupancy rates were estimated as 0.49 for American toads (Bufo amer-icanus), a 44% increase over the proportion of sites at which they were actually observed,and as 0.85 for spring peepers (Pseudacris crucifer), slightly above the observed proportionof 0.83.

Key words: anurans; bootstrap; Bufo americanus; detection probability; maximum likelihood;metapopulation; monitoring; patch occupancy; Pseudacris crucifer; site occupancy.

INTRODUCTION

We describe an approach to estimating the proportionof sites occupied by a species of interest. We envisiona sampling method that involves multiple visits to sitesduring an appropriate season during which a speciesmay be detectable. However, a species may go unde-tected at these sites even when present. Sites may rep-resent discrete habitat patches in a metapopulation dy-namics context or sampling units (e.g., quadrats) reg-ularly visited as part of a large-scale monitoring pro-gram. The patterns of detection and nondetection overthe multiple visits for each site permit estimation ofdetection probabilities and the parameter of interest,proportion of sites occupied.

Our motivation for considering this problem in-volves potential applications in (1) large-scale moni-toring programs and (2) investigations of metapopu-lation dynamics. Monitoring programs for animal pop-ulations and communities have been establishedthroughout the world in order to meet a variety of ob-jectives. Most programs face two important sources of

Manuscript received 29 May 2001; revised 11 October 2001;accepted 22 October 2001.

5 Present address: Proteus Research and Consulting Ltd.,P.O. Box 5193, Dunedin, New Zealand.E-mail: [email protected]

6 Present address: International Association of Fish andWildlife Agencies, 444 N. Capitol Street NW, Suite 544,Washington, D.C., 20001 USA.

variation that must be incorporated into the design(e.g., see Thompson 1992, Lancia et al. 1994, Thomp-son et al. 1998, Yoccoz et al. 2001, Pollock et al. 2002).

The first source of variation is space. Many programsseek to provide inferences about areas that are too largeto be completely surveyed. Thus, small areas must beselected for surveying, with the selection being carriedout in a manner that permits inference to the entire areaof interest (Thompson 1992, Yoccoz et al. 2001, Pol-lock et al. 2002).

The second source of variation important to moni-toring program design is detectability. Few animals areso conspicuous that they are always detected at eachsurvey. Instead, some sort of count statistic is obtained(e.g., number of animals seen, heard, trapped, or oth-erwise detected), and a method is devised to estimatethe detection probability associated with the count sta-tistic. Virtually all of the abundance estimators de-scribed in volumes such as Seber (1982) and Williamset al. (in press) can be viewed as count statistics dividedby estimated detection probabilities. Not allowing fordetectability and solely using the count statistic as anindex to abundance is unwise. Changes in the countmay be a product of random variations or changes indetectability, so it is impossible to make useful infer-ence about the system under investigation.

The methods used to estimate detection probabilitiesof individual animals (and hence abundance) at eachsite are frequently expensive of time and effort. For

Page 2: ESTIMATING SITE OCCUPANCY RATES WHEN DETECTION PROBABILITIES ARE LESS THAN ONE

August 2002 2249ESTIMATING SITE OCCUPANCY RATES

this reason, these estimation methods are often used indetailed experiments or small-scale investigations, butare not as widely used in large-scale monitoring pro-grams. The methods proposed here to estimate the pro-portion of sites (or more generally, the proportion ofsampled area) occupied by a species can be imple-mented more easily and less expensively than the meth-ods used for abundance estimation. For this reason, ourproposed method should be attractive as a basis forlarge-scale monitoring programs, assuming that theproportion of sites or area occupied is an adequate statevariable with respect to program objectives.

The second motivation for considering this estima-tion problem involves the importance of patch occu-pancy data to the study of metapopulation dynamics.The proportion of patches occupied is viewed as a statevariable in various metapopulation models (e.g., Levins1969, 1970, Lande 1987, 1988, Hanski 1992, 1994,1997). So-called ‘‘incidence functions’’ (e.g., see Di-amond 1975, Hanski 1992) depict the probability ofoccurrence of a species in a patch, expressed as a func-tion of patch characteristics such as area. Under theassumption of a stationary Markov process, incidencefunction data are sometimes used to estimate patch ex-tinction and colonization probabilities (e.g., Hanski1992, 1994, 1997, Moilanen 1999). Given the relevanceof patch occupancy data to metapopulation investiga-tions and models, it seems important to estimate patchoccupancy probabilities properly. For most animalsampling situations, detection of a species is indeedindicative of ‘‘presence,’’ but nondetection of the spe-cies is not equivalent to absence. Thus, we expect mostincidence function estimates of the proportion of patch-es occupied to be negatively biased to some unknowndegree because species can go undetected when pre-sent.

In this paper, we first present general sampling meth-ods that permit estimation of the probability of siteoccupancy when detection probabilities are ,1 andmay vary as functions of site characteristics, time, orenvironmental variables. We then present a statisticalmodel for site occupancy data and describe maximumlikelihood estimation under this model. We illustrateuse of the estimation approach with empirical data onsite occupancy by two anuran species at 32 wetlandsites in Maryland collected during 2000. Finally wediscuss extending this statistical framework to addressother issues such as colony extinction/colonization,species co-occurrence, and allowing for heterogeneousdetection and occupancy probabilities

METHODS

Notation

We use the following notation throughout this article:ci, probability that a species is present at site i; pit,probability that a species will be detected at site i attime t, given presence; N, total number of surveyed

sites; T, number of distinct sampling occasions; nt num-ber of sites where the species was detected at time t;n., total number of sites at which the species was de-tected at least once.

Our use of p, to signify detection probabilities, dif-fers from its customary use in the metapopulation lit-erature, where it is used to denote the probability ofspecies presence (our c). However, our notation is con-sistent with the mark–recapture literature which pro-vides the foundation of our approach.

Basic sampling situation

Here we consider situations in which surveys of spe-cies at N specific sites are performed at T distinct oc-casions in time. Sites are occupied by the species ofinterest for the duration of the survey period, with nonew sites becoming occupied after surveying has be-gun, and no sites abandoned before the cessation ofsurveying (i.e., the sites are ‘‘closed’’ to changes inoccupancy). At each sampling occasion, investigatorsuse sampling methods designed to detect the speciesof interest. Species are never falsely detected at a sitewhen absent, and a species may or may not be detectedat a site when present. Detection of the species at asite is also assumed to be independent of detecting thespecies at all other sites. The resulting data for eachsite can be recorded as a vector of 1’s and 0’s denotingdetection and nondetection, respectively, for the oc-casions on which the site was sampled. The set of suchdetection histories is used to estimate the quantity ofinterest, the proportion of sites occupied by the species.

General likelihood

We propose a method that parallels a closed-popu-lation, mark–recapture model, with an additional pa-rameter (c) that represents the probability of speciespresence. In closed-population models, the focus is toestimate the number of individuals never encounteredby using information garnered from those individualsencountered at least once (e.g., see Otis et al. 1978,Williams et al., in press). In our application, sites areanalogous to individuals except that we observe thenumber of sites with the history comprising T 0’s (sitesat which the species is never detected over the T sam-pling occasions); hence, the total population size ofsites is known, but the focus is to estimate the fractionof those sites that the species actually occupies. Onecould recast this problem into a more conventionalclosed mark–recapture framework by only consideringthose sites where the species was detected at least once.Use of such data with closed-population, capture–re-capture models (e.g., Otis et al. 1978) would yield es-timates of population size that correspond to the num-ber of sites where the species is present. However, thefollowing method enables additional modeling of c tobe investigated (such as including covariate informa-tion).

A likelihood can be constructed using a series of

Page 3: ESTIMATING SITE OCCUPANCY RATES WHEN DETECTION PROBABILITIES ARE LESS THAN ONE

2250 DARRYL I. MACKENZIE ET AL. Ecology, Vol. 83, No. 8

probabilistic arguments similar to those used in mark–recapture modeling (Lebreton et al. 1992). For siteswhere the species was detected on at least one samplingoccasion, the species must be present and was eitherdetected or not detected at each sampling occasion. Forexample, the likelihood for site i with history 01010would be

c (1 2 p )p (1 2 p )p (1 2 p ).i i1 i2 i3 i4 i5

However, nondetection of the species does not implyabsence. Either the species was present and was notdetected after T samples, or the species was not present.For site k with history 00000, the likelihood is

5

c (1 2 p ) + (1 2 c ).Pk kt kt51

Assuming independence of the sites, the product of allterms (one for each site) constructed in this mannercreates the model likelihood for the observed set ofdata, which can be maximized to obtain maximum like-lihood estimates of the parameters.

Note that, at this stage, presence and detection prob-abilities have been defined as site specific. In practice,such a model could not be fit to the data because thelikelihood contains too many parameters: the modellikelihood is over-parameterized. However, the modelis presented in these general terms because, in somecases, the probabilities may be modeled as a functionof site-specific covariates, to which we shall return.

When presence and detection probabilities are con-stant across monitoring sites, the combined model like-lihood can be written as

Tn. n n.2nt tL(c, p) 5 c p (1 2 p )P t t[ ]t51 (1)

N2n.T

3 c (1 2 p ) + (1 2 c) .P t[ ]t51

Using the likelihood in this form, our model could beimplemented with relative ease via spreadsheet soft-ware with built-in function maximization routines, be-cause only the summary statistics (n1, . . . , nT, n.) andN are required. Detection probabilities could be timespecific, or reduced forms of the model could be in-vestigated by constraining p to be constant across timeor a function of environmental covariates.

We suggest that the standard error of c be estimatedusing a nonparametric bootstrap method (Buckland andGarthwaite 1991), rather than the asymptotic (large-sample) estimate involving the second partial deriva-tives of the model likelihood (Lebreton et al. 1992).The asymptotic estimate represents a lower bound onthe value of the standard error, and may be too smallwhen sample sizes are small. A random bootstrap sam-ple of N sites is taken (with replacement) from the Nmonitored sites. The histories of the sites in the boot-strap sample are used to obtain a bootstrap estimate of

c. The bootstrap procedure is repeated a large numberof times, and the estimated standard error is the samplestandard deviation of the bootstrap estimates (Manly1997).

Extensions to the model

Covariates.—It would be reasonable to expect thatc may be some function of site characteristics such ashabitat type or patch size. Similarly, p may also varywith certain measurable variables such as weather con-ditions. This covariate information (X) can be easilyintroduced to the model using a logistic model (Eq. 2)for c and/or p (denote the parameter of interest as uand the vector of model parameters as B:

exp(XB)u 5 . (2)

1 + exp(XB)

Because c does not change over time during the sam-pling (the population is closed), appropriate covariateswould be time constant and site specific, whereas cov-ariates for detection probabilities could be time varyingand site specific (such as air or water temperature).

This is in contrast to mark–recapture models inwhich time-varying individual covariates cannot beused. In mark–recapture, a time-varying individualcovariate can only be measured on those occasionswhen the individual is captured; the covariate value isunknown otherwise. Here, time-varying, site-specificcovariates can be collected and used regardless ofwhether the species is detected. It would not be pos-sible, however, to use covariates that change over timeand cannot be measured independent of the detectionprocess.

If c is modeled as a function of covariates, the av-erage species presence probability is

N

cO ii51c 5 . (3)

NMissing observations.—In some circumstances, it

may not be possible to survey all sites at all samplingoccasions. Sites may not be surveyed for a number ofreasons, from logistic difficulties in getting field per-sonnel to all sites, to the technician’s vehicle breakingdown en route. These sampling inconsistencies can beeasily accommodated using the proposed model like-lihood.

If sampling does not take place at site i at time t,then that occasion contributes no information to themodel likelihood for that site. For example, considerthe history 10 11, where no sampling occurred at time3. The likelihood for this site would be:

cp (1 2 p )p p .1 2 4 5

Missing observations can only be accounted for in thismanner when the model likelihood is evaluated sepa-rately for each site, rather than using the combined formof Eq. 1.

Page 4: ESTIMATING SITE OCCUPANCY RATES WHEN DETECTION PROBABILITIES ARE LESS THAN ONE

August 2002 2251ESTIMATING SITE OCCUPANCY RATES

FIG. 1. Results of the 500 simulated sets of data for N 5 40, with no missing values. Indicated are the average value of, ; the replication-based estimate of the true standard error of , SE( ); and the average estimate of the standard errorc c c c

obtained from 200 nonparametric bootstrap samples, , for various levels of T, p, and c.SE(c)

SIMULATION STUDY

Simulation methods

A simulation study was undertaken to evaluate theproposed method for estimating c. Data were generatedfor situations in which all sites had the same probabilityof species presence, and the detection probability wasconstant across time and sites, c(·)p(·). The effects offive factors were investigated: (1) N 5 20, 40, or 60;(2) c 5 0.5, 0.7, or 0.9; (3) p 5 0.1, 0.3, or 0.5; (4)T 5 2, 5, or 10; (5) probability of a missing observation5 0.0, 0.1, or 0.2.

For each of the 243 scenarios, 500 sets of data weresimulated. For each site, a uniformly distributed, pseu-do-random number between 0 and 1 was generated (y),and if y # c then the site was occupied. Further pseudo-random numbers were generated and similarly com-pared to p to determine whether the species was de-tected at each time period, with additional randomnumbers being used to establish missing observations.The c(·)p(·) model was applied to each set of simulateddata. The resulting estimate of c was recorded and the

nonparametric bootstrap estimate of the standard errorwas also obtained using 200 bootstrap samples.

Simulation results

Fig. 1 presents the simulation results for scenarioswhere N 5 40 with no missing values only, but theseare representative of the results in general. The fullsimulation results are included in the Appendix.

Generally, this method provides reasonable estimatesof the proportion of sites occupied. When detectionprobability is 0.3 or greater, the estimates of c arereasonably unbiased in all scenarios considered for T$ 5. When T 5 2, only when detection probability isat least 0.5 do the estimates of c appear to be reason-able. For low detection probabilities, however, c tendsto be overestimated when the true value is 0.5 or 0.7,but underestimated when c equals 0.9. A closer ex-amination of the results reveals that, in some situationsin which detection probability is low, tends to 1.c

In most cases, the nonparametric bootstrap providesa good estimate of the standard error for , the excep-ction being for situations with low detection probabil-

Page 5: ESTIMATING SITE OCCUPANCY RATES WHEN DETECTION PROBABILITIES ARE LESS THAN ONE

2252 DARRYL I. MACKENZIE ET AL. Ecology, Vol. 83, No. 8

TABLE 1. Relative difference in AIC (DAIC), AIC modelweights (wi), overall estimate of the fraction of sites oc-cupied by each species ( ), and associated standard errorc(SE( )).c

Model, by species DAIC wi c SE( )c

American toadc(Habitat) p(Temperature)c(·) p(Temperature)c(Habitat) p(·)c(·) p(·)

0.000.420.490.70

0.360.240.220.18

0.500.490.490.49

0.130.140.120.13

Spring peeperc(Habitat) p(Temperature)c(·) p(Temperature)c(Habitat) p(·)c(·) p(·)

0.001.72

40.4942.18

0.850.150.000.00

0.840.850.840.85

0.070.070.070.07

ities. Again, this is caused by c estimates close to 1;in such situations, the bootstrap estimate of the stan-dard error is very small, which overstates the precisionof .c

In general, increasing the number of sampling oc-casions improves both the accuracy and precision of

, although in some instances there is little gain incusing 10 occasions rather than five. If only two occa-sions are used, however, accuracy tends to be poorunless detection probabilities are high, and even thenthe standard error of is approximately double that ofcusing five sampling occasions.

Similarly, increasing the number of sites sampled, N,also improves both the accuracy and precision of .c

Not presented here are the simulation results for sce-narios with missing observations. The proposed meth-od appears to be robust to missing data, with the onlynoticeable effect being (unsurprisingly) a loss of pre-cision. In this study, on average, the standard error of

increased by 5% with 10% missing observations, andcby 11% with 20% missing observations. The bootstrapstandard error estimates also increased by a similaramount, accounting well for the loss of information.

FIELD STUDY OF ANURANS AT

MARYLAND WETLANDS

Field methods and data collection

We illustrate our method by considering monitoringdata collected on American toads (Bufo americanus)and spring peepers (Pseudacris crucifer) at 32 wetlandsites located in the Piedmont and Upper Coastal Plainphysiographic provinces surrounding Washington,D.C., and Baltimore, Maryland, USA. Volunteers en-rolled in the National Wildlife Federation/U.S. Geo-logical Survey’s amphibian monitoring program,FrogwatchUSA, visited monitoring sites between 19February 2000 and 12 October 2000. Sites were chosennonrandomly by volunteers and were monitored at theirconvenience. Observers collected information on thespecies of frogs and toads heard calling during a 3-mincounting period taken sometime after sundown. Eachspecies of calling frog and toad was assigned a three-level calling index, which, for this study, was truncatedto reflect either detection (1) or nondetection (0).

The data set was reduced by considering only theportion of data for each species between the dates offirst and last detection exclusive. Truncating the datain this manner ensures that species were available tobe detected throughout that portion of the monitoringperiod, thus satisfying our closure assumption. Includ-ing the dates of first and last detection in the analysiswould bias parameter estimates because the data setwas defined using these points; hence, they were ex-cluded.

Three sites were removed after the truncation be-cause they were never monitored during the redefinedperiod. Fewer than eight of the 29 sites were monitored

on any given day and the number of visits per sitevaried tremendously, with a very large number of miss-ing observations (;90%). Note that in the context ofthis sampling, the entire sampling period included theinterval between the date at which the first wetland wassampled and the date at which all sampling ended. Amissing observation was thus any date during this in-terval on which a wetland was not sampled. Each timea site was visited, air temperature was recorded. Siteswere defined as being either a distinct body of water(pond, lake) or other habitat (swamp, marsh, wet mead-ow). These variables were considered as potential cov-ariates for detection and presence probabilities, re-spectively. The data used in this analysis have beenincluded in the Supplement.

Results of field study

American toad.—Daily records for the 29 sites, mon-itored between 9 March 2000 and 30 May 2000, wereincluded for analysis. Sites were visited 8.9 times onaverage (minimum 5 2, maximum 5 58 times), withAmerican toads being detected at least once at 10 lo-cations (0.34). Three models with covariates and onewithout were fit to the data (Table 1) and ranked ac-cording to AIC (Burnham and Anderson 1998). Thefour models considered have virtually identical weight,suggesting that all models provide a similar descriptionof the data, despite the different structural forms.Therefore we cannot make any conclusive statementregarding the importance of the covariates, but thereis some suggestion that detection probabilities may in-crease with increasing temperature and occupancy ratesmay be lower for habitats consisting of a distinct bodyof water. However, all models provide very similar es-timates of the overall occupancy rate (;0.49), whichis 44% larger than the proportion of sites where toadswere detected at least once. The standard error for theestimate is reasonably large and corresponds to a co-efficient of variation of 27%.

Spring peeper.—Daily records for the 29 sites, mon-itored between 27 February 2000 and 30 May 2000,were included for analysis. Sites were visited, on av-

Page 6: ESTIMATING SITE OCCUPANCY RATES WHEN DETECTION PROBABILITIES ARE LESS THAN ONE

August 2002 2253ESTIMATING SITE OCCUPANCY RATES

erage, 9.6 times (minimum 5 2, maximum 5 66 visits),with spring peepers being detected at least once at 24locations (0.83). The same models as those for theAmerican toad were fit to the spring peeper data andthe results are also displayed in Table 1. Here the twop(·) models have virtually zero weight, indicating thatthe p(Temperature) models provide a much better de-scription of the data. We suspect that this effect is due,partially, to a tapering off of the calling season as springprogresses into summer. The c(Habitat) p(Temperature)model clearly has greatest weight and suggests thatestimated occupancy rates are lower for distinct bodiesof water (0.77) than for other habitat types (1.00). Thisis not unexpected, given spring peepers were actuallydetected at all sites of the latter type. Regardless ofhow the models ranked, however, all models providea similar estimate of the overall occupancy rate that isonly marginally greater than the number of sites wherespring peepers were detected at least once. This sug-gests that detection probabilities were large enough thatspring peepers probably would be detected during themonitoring if present.

DISCUSSION

The method proposed here to estimate site occupancyrate uses a simple probabilistic argument to allow forspecies detection probabilities of ,1. As shown, it pro-vides a flexible modeling framework for incorporatingboth covariate information and missing observations.It also lays the groundwork for some potentially ex-citing extensions that would enable important ecolog-ical questions to be addressed.

From the full simulation results for scenarios withlow detection probabilities, it is very easy to identifycircumstances in which one should doubt the estimatesof c. We advise caution if an estimate of c very closeto 1 is obtained when detection probabilities are low(,0.15), particularly when the number of sampling oc-casions is also small (,7). In such circumstances, thelevel of information collected on species presence/ab-sence is small, so it is difficult for the model to dis-tinguish between a site where the species is genuinelyabsent and a site where the species has merely not beendetected.

Our simulation results may also provide some guid-ance on the number of visits to each site required inorder to obtain reasonable estimates of occupancy rate.If one wishes to visit a site only twice, then it appearsthat the true occupancy rate needs to be .0.7 and de-tection probability (at each visit) should be .0.3. Eventhen, however, precision of the estimate may be low.Increasing the number of visits per site improves theprecision of the estimated occupancy rate, and the re-sulting increase in information improves the accuracyof the estimate when detection probabilities are low.We stress that whenever a survey (of any type) is beingdesigned, some thought should be given to the likelyresults and method of analysis, because these consid-

erations can provide valuable insight on the level ofsampling effort required to achieve ‘‘good’’ results.

Logistical considerations of multiple visits will prob-ably result in some hesitancy to use this approach, butwe suggest that the expenditure of extra effort to obtainunbiased estimates of parameters of interest generallywill be preferable to the expenditure of less effort toobtain biased estimates. If travel time to sites is sub-stantial, then multiple searches or samples may be con-ducted by multiple observers, or even by a single ob-server, at a single trip to a site, e.g., conduct two ormore 3-min amphibian calling surveys in a single nightat the same pond. If large numbers of patches must besurveyed, then it may be reasonable to conduct multiplevisits at a subset of sites for the purpose of estimatingdetection probability, and perhaps associated covariaterelationships. Then this information on detection prob-ability, perhaps modeled as a function of site-specificcovariates, could be applied to sites visited only once.Issues about optimal design require additional work,but it is clear that a great deal of flexibility is possiblein approaches to sampling.

Site occupancy may well change over years or be-tween seasons as populations change; new coloniescould be formed or colonies could become locally ex-tinct. When sites are surveyed on more than one oc-casion between these periods of change, for multipleperiods, the approach described here could be com-bined with the robust design mark–recapture approach(Pollock et al. 1990). For example, suppose that theanuran sampling described in our examples is contin-ued in the future, such that the same wetland sites aresurveyed multiple times each summer, for multipleyears. During the periods when sites are closed tochanges in occupancy, our approach could be used toestimate the occupancy rate as in our example. Thechange in occupancy rates over years could then bemodeled as functions of site colonization and extinctionrates, analogous with the birth and death rates in anopen-population mark–recapture study. Such Markovmodels of patch occupancy dynamics will permit time-specific estimation and modeling of patch extinctionand colonization rates that do not require the assump-tions of p 5 1 or process stationarity invoked in pre-vious modeling efforts (e.g., Erwin et al. [1998] re-quired p 5 1; Hanski [1992, 1994] and Clark and Ro-senzweig [1994] required both assumptions).

Often monitoring programs collect information onthe presence/absence of multiple species at the samesites. An important biological question is whether spe-cies co-occur independently. Does the presence/ab-sence of species A depend upon the occupancy stateof species B? Our method of modeling species presencecould be extended in this direction, enabling such im-portant ecological questions to be addressed. The mod-el could be parameterized in terms of cAB (in additionto cA and cB): the probability that both species A andspecies B are present at a site. However, the number

Page 7: ESTIMATING SITE OCCUPANCY RATES WHEN DETECTION PROBABILITIES ARE LESS THAN ONE

2254 DARRYL I. MACKENZIE ET AL. Ecology, Vol. 83, No. 8

of parameters in the model would increase exponen-tially with the number of species, so reasonably gooddata sets might be required. For example, four addi-tional parameters would be required to model co-oc-currences between species A, B, and C (cAB, cAC, cBC,cABC), but if six species were being modeled, 57 extraparameters would need to be estimated.

Not addressed are situations in which presence anddetection probabilities are heterogeneous, varyingacross sites. Some forms of heterogeneity may be ac-counted for with covariate information such as sitecharacteristics or environmental conditions at the timeof sampling. On other occasions, however, the sourceof heterogeneity may be unknown. We foresee thatcombining our method with the mixture model ap-proach to closed-population, mark–recapture models ofPledger (2000) would be one solution, which enablesthe problem to be contained within a likelihood frame-work. It may also be possible to combine our methodwith other closed-population, mark–recapture methodssuch as the jackknife (Burnham and Overton 1978) orcoverage estimators (Chao et al. 1992). For differentsampling frameworks, where monitoring is performedon a continuous or incidental basis rather than at dis-crete sampling occasions, combining our methods withthe Poisson family of models (Boyce et al. 2001,MacKenzie and Boyce 2001) may also be feasible, par-ticularly for multiple years of data.

The three extensions to the proposed methods arecurrently the focus of ongoing research on this generaltopic of estimating site occupancy rates.

Software to perform the above modeling has beenincluded in the Supplement.

ACKNOWLEDGMENTS

We would like to thank Christophe Barbraud, Mike Conroy,Ullas Karanth, Bill Kendall, Ken Pollock, John Sauer, andRob Swihart for useful discussions of this general estimationproblem and members of the USGS Southeastern AmphibianResearch and Monitoring Initiative for discussions on am-phibian monitoring. Atte Moilanen provided a constructivereview of the manuscript, as did a second anonymous re-viewer. Our thanks also go to the Frogwatch USA volunteersdirected by Sue Muller at Howard County Department ofParks and Recreation for collection of the data.

LITERATURE CITED

Boyce, M. S., D. I. MacKenzie, B. F. J. Manly, M. A. Har-oldson, and D. Moody. 2001. Negative binomial modelsfor abundance estimation of multiple closed populations.Journal of Wildlife Management 65:498–509.

Buckland, S. T., and P. H. Garthwaite. 1991. Quantifyingprecision of mark–recapture estimates using the bootstrapand related methods. Biometrics 47:255–268.

Burnham, K. P., and D. R. Anderson. 1998. Model selectionand inference—a practical information-theoretic approach.Springer-Verlag, New York, New York, USA.

Burnham, K. P., and W. S. Overton. 1978. Estimation of thesize of a closed population when capture probabilities varyamong animals. Biometrika 65:625–633.

Chao, A., S.-M. Lee, and S.-L. Jeng. 1992. Estimating pop-ulation size for capture–recapture data when capture prob-

abilities vary by time and individual animal. Biometrics48:201–216.

Clark, C. W., and M. L. Rosenzweig. 1994. Extinction andcolonization processes: parameter estimates from sporadicsurveys. American Naturalist 143:583–596.

Diamond, J. M. 1975. Assembly of species communities.Pages 342–444 in M. L. Cody and J. M. Diamond, editors.Ecology and evolution of communities. Harvard UniversityPress, Cambridge, Massachusetts, USA.

Erwin, R. M., J. D. Nichols, T. B. Eyler, D. B. Stotts, and B.R. Truitt. 1998. Modeling colony site dynamics: a casestudy of Gull-billed Terns (Sterna nilotica) in coastal Vir-ginia. Auk 115:970–978.

Hanski, I. 1992. Inferences from ecological incidence func-tions. American Naturalist 139:657–662.

Hanski, I. 1994. A practical model of metapopulation dy-namics. Journal of Animal Ecology 63:151–162.

Hanski, I. 1997. Metapopulation dynamics: from conceptsand observations to predictive models. Pages 69–91 in I.A. Hanski and M. E. Gilpin, editors. Metapopulation bi-ology: ecology, genetics, and evolution. Academic Press,New York, New York, USA.

Lancia, R. A., J. D. Nichols, and K. H. Pollock. 1994. Es-timating the number of animals in wildlife populations.Pages 215–253 in T. Bookhout, editor. Research and man-agement techniques for wildlife and habitats. The WildlifeSociety, Bethesda, Maryland, USA.

Lande, R. 1987. Extinction thresholds in demographic mod-els of territorial populations. American Naturalist 130:624–635.

Lande, R. 1988. Demographic models of the northern spottedowl (Strix occidentalis caurina). Oecologia 75:601–607.

Lebreton, J. D., K. P. Burnham, J. Clobert, and D. R. An-derson. 1992. Modeling survival and testing biological hy-potheses using marked animals. A unified approach withcase studies. Ecological Monographs 62:67–118.

Levins, R. 1969. Some demographic and genetic consequenc-es of environmental heterogeneity for biological control.Bulletin of the Entomological Society of America 15:237–240.

Levins, R. 1970. Extinction. Pages 77–107 in M. Gersten-haber, editor. Some mathematical questions in biology. Vol-ume II. American Mathematical Society, Providence,Rhode Island, USA.

MacKenzie, D. I., and M. S. Boyce. 2001. Estimating closedpopulation size using negative binomial models. WesternBlack Bear Workshop 7:21–23.

Manly, B. F. J. 1997. Randomization, bootstrap and MonteCarlo methods in biology. Second edition. Chapman andHall, London, UK.

Moilanen, A. 1999. Patch occupancy models of metapopu-lation dynamics: efficient parameter estimation using im-plicit statistical inference. Ecology 80:1031–1043.

Otis, D. L., K. P. Burnham, G. C. White, and D. R. Anderson.1978. Statistical inference from capture data on closed an-imal populations. Wildlife Monographs 62.

Pledger, S. 2000. Unified maximum likelihood estimates forclosed capture–recapture models using mixtures. Biomet-rics 56:434–442.

Pollock, K. H., J. D. Nichols, C. Brownie, and J. E. Hines.1990. Statistical inference for capture–recapture experi-ments. Wildlife Monographs 107.

Pollock, K. H., J. D. Nichols, T. R. Simons, G. L. Farnsworth,L. L. Bailey, and J. R. Sauer. 2002. Large scale wildlifemonitoring studies: statistical methods for design and anal-ysis. Environmetrics 13:1–15.

Seber, G. A. F. 1982. The estimation of animal abundanceand related parameters. MacMillan Press, New York, NewYork, USA.

Page 8: ESTIMATING SITE OCCUPANCY RATES WHEN DETECTION PROBABILITIES ARE LESS THAN ONE

August 2002 2255ESTIMATING SITE OCCUPANCY RATES

Thompson, S. K. 1992. Sampling. John Wiley, New York,New York, USA.

Thompson, W. L., G. C. White, and C. Gowan. 1998. Mon-itoring vertebrate populations. Academic Press, San Diego,California, USA.

Williams, B. K., J. D. Nichols, and M. J. Conroy. In press.

Analysis and management of animal populations. Academ-ic Press, San Diego, California, USA.

Yoccoz, N. G., J. D. Nichols, and T. Boulinier. 2001. Mon-itoring of biological diversity in space and time; concepts,methods and designs. Trends in Ecology and Evolution 16:446–453.

APPENDIX

Full results of the simulation study are available in ESA’s Electronic Data Archive: Ecological Archives E083-041-A1.

SUPPLEMENT

Software, source code, and the sample data sets are available in ESA’s Electronic Data Archive: Ecological Archives E083-041-S1.