techniques for tracking, evaluating, and reporting the ... · 1987). in cluster sampling, the total...

27
2-1 CHAPTER 2. SAMPLING DESIGN 2.1. INTRODUCTION This chapter discusses recommended methods for designing sampling programs to track and evaluate the implementation of nonpoint source control measures. This chapter does not address whether the management measures (MMs) or best management practices (BMPs) are effective since no water quality sampling is done. Because of the variation in forestry practices and related nonpoint source control measures implemented throughout the United States, the approaches taken by various states to track and evaluate nonpoint source control measure implementation will differ. Nevertheless, all approaches should be based on sound statistical methods for selecting sampling strategies, computing sample sizes, and evaluating data. EPA recommends that states consult with a trained statistician to be certain that the approach, design, and assumptions are appropriate to the task at hand. As described in Chapter 1, implementation monitoring is the focus of this guidance. Effectiveness monitoring is the focus of another guidance prepared by EPA, the Nonpoint Source Monitoring and Evaluation Guide (USEPA, 1996). Dissmeyer (1994) also provides substantial information regarding QA/QC, statistical considerations, BMP effectiveness monitoring, and monitoring methods. The recommendations and examples in this chapter address two primary monitoring goals: Determine the extent to which MMs and BMPs are implemented in accordance with design standards and specifications. Determine whether there has been a change in the extent to which MMs and BMPs are being implemented. For example, State forestry agencies might be interested in whether streamside management areas (SMAs) at harvest sites associated with all types of forest ownerships (industrial, private nonindustrial, federal, and state) are in compliance with design standards. State forestry agencies might also be interested in whether the percentage of owners of nonindustrial private forest land that are correctly implementing the BMPs specified in a voluntary implementation program. 2.1.1. Study Objectives To develop a study design, clear, quantitative monitoring objectives must be developed. For example, the objective might be to estimate to within ±5 percent the percent of harvest sites that have adequate SMAs. Or perhaps a state is getting ready to implement new administrative procedures to ensure that purchasers of timber have been advised of needed work. In this case, detecting a 10 percent change in the number of operators that implement the work specified in the timber sale administration file might be of interest. In the first example, summary statistics are developed to describe the current status, whereas in the second example, some sort of statistical analysis (hypothesis testing) is performed to determine whether a significant change has really occurred. This choice has an impact on how the data are collected. As an example, balanced designs (e.g., two sets of data with the same number of observations in

Upload: others

Post on 21-Jul-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

2-1

CHAPTER 2. SAMPLING DESIGN

2.1. INTRODUCTION

This chapter discusses recommended methodsfor designing sampling programs to track andevaluate the implementation of nonpointsource control measures. This chapter doesnot address whether the managementmeasures (MMs) or best managementpractices (BMPs) are effective since no waterquality sampling is done. Because of thevariation in forestry practices and relatednonpoint source control measuresimplemented throughout the United States, theapproaches taken by various states to trackand evaluate nonpoint source control measureimplementation will differ. Nevertheless, allapproaches should be based on soundstatistical methods for selecting samplingstrategies, computing sample sizes, andevaluating data. EPA recommends that statesconsult with a trained statistician to be certainthat the approach, design, and assumptions areappropriate to the task at hand.

As described in Chapter 1, implementationmonitoring is the focus of this guidance. Effectiveness monitoring is the focus ofanother guidance prepared by EPA, theNonpoint Source Monitoring and EvaluationGuide (USEPA, 1996). Dissmeyer (1994)also provides substantial information regardingQA/QC, statistical considerations, BMPeffectiveness monitoring, and monitoringmethods. The recommendations and examplesin this chapter address two primary monitoringgoals:

• Determine the extent to which MMs andBMPs are implemented in accordance withdesign standards and specifications.

• Determine whether there has been a changein the extent to which MMs and BMPs arebeing implemented.

For example, State forestry agencies might beinterested in whether streamside managementareas (SMAs) at harvest sites associated withall types of forest ownerships (industrial,private nonindustrial, federal, and state) are incompliance with design standards. Stateforestry agencies might also be interested inwhether the percentage of owners ofnonindustrial private forest land that arecorrectly implementing the BMPs specified ina voluntary implementation program.

2.1.1. Study Objectives

To develop a study design, clear, quantitativemonitoring objectives must be developed. Forexample, the objective might be to estimate towithin ±5 percent the percent of harvest sitesthat have adequate SMAs. Or perhaps a stateis getting ready to implement newadministrative procedures to ensure thatpurchasers of timber have been advised ofneeded work. In this case, detecting a 10percent change in the number of operators thatimplement the work specified in the timbersale administration file might be of interest. Inthe first example, summary statistics aredeveloped to describe the current status,whereas in the second example, some sort ofstatistical analysis (hypothesis testing) isperformed to determine whether a significantchange has really occurred. This choice has animpact on how the data are collected. As anexample, balanced designs (e.g., two sets ofdata with the same number of observations in

Page 2: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling Design Chapter 2

2-2

each set) are more typical for hypothesistesting, whereas summary statistics mightrequire unbalanced sample allocations toaccount for variability such as site size, type,and ownership.

2.1.2. Probabilistic Sampling

Most study designs that are appropriate fortracking and evaluating implementation arebased on a probabilistic approach sincetracking every operator is not cost-effective. In a probabilistic approach, individuals arerandomly selected from the entire group. Theselected individuals are evaluated, and theresults provide an unbiased assessment of theentire group. Applying the results fromrandomly selected individuals to the entiregroup is statistical inference. Statisticalinference enables one to determine, forexample, the probable percentage of timbersales with adequate SMAs without visitingevery tract of land. One could also determinewhether the change in timber sales withappropriate streamside management is withinthe range of what could occur by chance orthe change is large enough to indicate a realmodification of operator practices.

The group about which inferences are made isthe population or target population, whichconsists of population units. The samplepopulation is the set of population units thatare directly available for measurement. Forexample, if the objective is to determine thedegree to which adequate SMAs have beenestablished, silvicultural operations for whichSMAs are an appropriate BMP (e.g., timbersales with nearby streams) would be thesample population. Statistical inferences canbe made only about the target populationavailable for sampling. For example, if

implementation of erosion control is beingassessed and only public lands can be sampled,inferences cannot be made about themanagement of private lands.

The most common types of probabilisticsampling that can be used for implementationmonitoring are summarized in Table 2-1. Ingeneral, probabilistic approaches arepreferred. However, there might becircumstances under which targeted samplingshould be used. Targeted sampling refers tousing best professional judgement for selectingsample locations. For example, state forestersdeciding to evaluate all timber sales in a givenwatershed would be targeted sampling. Thechoice of a sampling plan depends on studyobjectives, patterns of variability in the targetpopulation, cost-effectiveness of alternativeplans, types of measurements to be made, andconvenience (Gilbert, 1987).

Simple random sampling is the most basictype of sampling. Each unit of the targetpopulation has an equal chance of beingselected. This type of sampling is appropriatewhen there are no major trends, cycles, orpatterns in the target population (Cochran,1977). Random sampling can be applied in avariety of ways including operator or timbersale selection. Random samples can also betaken at different times at a single site. Figure2-1 provides an example of simple randomsampling from a listing of harvest sites andfrom a map.

Page 3: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling DesignChapter 2

2-3

Table 2-1. Applications of four sampling designs for implementation monitoring.

Sampling Design Comment

Simple RandomSampling

Each population unit has an equal probability of being selected.

Stratified RandomSampling

Useful when a sample population can be broken down into groups, or strata,that are internally more homogeneous than the entire sample population. Random samples are taken from each stratum although the probability ofbeing selected might vary from stratum to stratum depending on cost andvariability.

Cluster Sampling Useful when there are a number of methods for defining population unitsand when individual units are clumped together. In this case, clusters arerandomly selected and every unit in the cluster is measured.

Systematic Sampling This sampling has a random starting point with each subsequentobservation a fixed interval (space or time) from the previous observation.

If the pattern of MM and BMPimplementation is expected to be uniformacross the state, simple random sampling isappropriate to estimate the extent ofimplementation. If, however, implementationis homogeneous only within certain categories(e.g., federal, state, or private lands), stratifiedrandom sampling should be used.

In stratified random sampling, the targetpopulation is divided into groups called stratafor the purpose of obtaining a better estimateof the mean or total for the entire population. Simple random sampling is then used withineach stratum. Stratification involves the useof categorical variables to group observationsinto more units, thereby reducing thevariability of observations within each unit. For example, in a state with federal, state, andprivate forests, there might be differentpatterns of BMP implementation. Lands inthe state could be divided into federal, state,and private as separate strata from whichsamples would be taken. In general, a largernumber of samples should be taken in a

stratum if the stratum is more variable, larger,or less costly to sample than other strata. Forexample, if BMP implementation is morevariable on private lands, a greater number ofsampling sites might be needed in that stratumto increase the precision of the overallestimate. Cochran (1977) found that stratifiedrandom sampling provides a better estimate ofthe mean for a population with a trend,followed in order by systematic sampling(discussed later) and simple random sampling. He also noted that stratification typicallyresults in a smaller variance for the estimatedmean or total than that which results fromcomparable simple random sampling.

Page 4: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,
Page 5: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling DesignChapter 2

2-5

If the state believes that there will be adifference between two or more subsets of thesites, such as between types of ownership orregion, the sites can first be stratified intothese subsets and a random sample takenwithin each subset (McNew, 1990). Stateswith silviculture implementation monitoringprograms commonly divide the sites byownership and county and/or region beforeselecting survey sites. The goal ofstratification is to increase the accuracy of theestimated mean values over what could havebeen obtained using simple random samplingof the entire population. The method makesuse of prior information to divide the targetpopulation into subgroups that are internallyhomogeneous. There are a number of ways to“select” sites, or sets of sites, to be certain thatimportant information will not be lost, or thatMM or BMP use will not be misrepresented asa result of treating all potential survey sites asequal. Figure 2-2 provides an example ofstratified random sampling from a listing ofharvest sites and from a map.

Where data are available, it might be useful tocompare the relative percentages of harvestedtimberland that is classified as having high,medium, and low erosion potentials. In caseswhere sediment is impacting water quality,highly erodible land might be responsible for alarger share of sediment delivery and wouldtherefore be an important target for trackingthe implementation of erosion controls. Astratified random sampling procedure could beused to estimate the percentage of totalharvested timberland with different erosionpotentials that have erosion controls in place. For other water quality problems (e.g.,spawning habitat in decline), otherstratification parameters (e.g., streamclassification) might be more appropriate.

Cluster sampling is applied in cases where it ismore practical to measure randomly selectedgroups of individual units than to measurerandomly selected individual units (Gilbert,1987). In cluster sampling, the totalpopulation is divided into a number ofrelatively small subdivisions, or clusters, andthen some of the subdivisions are randomlyselected for sampling. In one-stage clustersampling, the selected clusters are sampledtotally. In two-stage cluster sampling, randomsampling is performed within each cluster(Gaugush, 1987). For example, this approachmight be useful if a state wants to estimate theproportion of harvest sites that are followingstate-approved MMs or BMPs. All harvestsites in a particular county can be regarded asa single cluster. Once all clusters have beenidentified, specific clusters can be randomlychosen for sampling. Freund (1973) notesthat estimates based on cluster sampling aregenerally not as good as those based on simplerandom samples, but they are more cost-effective. As a result, Gaugush (1987)believes that the difficulty associated withanalyzing cluster samples is compensated forby the reduced sampling requirements andcost. Figure 2-3 provides an example ofcluster sampling from a listing of harvest sitesand from a map.

Systematic sampling is used extensively inwater quality monitoring programs because itis relatively easy to do from a managementperspective. In systematic sampling the firstsample has a random starting point and each

Page 6: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,
Page 7: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,
Page 8: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling Design Chapter 2

2-8

subsequent sample has a constant distancefrom the previous sample. For example, if asample size of 70 is desired from a mailing listof 700 operators, the first sample would berandomly selected from among the first 10people, say the seventh person. Subsequentsamples would then be based on the 17th, 27th,..., 697th person. In comparison, a stratifiedrandom sampling approach might be to sortthe mailing list by county and then torandomly select operators from each county. Figure 2-4 provides an example of systematic samplingfrom a listing of harvest sites and from a map.

In general, systematic sampling is superior tostratified random sampling when only one ortwo samples per stratum are taken forestimating the mean (Cochran, 1977) or whenthere is a known pattern of managementmeasure implementation. Gilbert (1987)reports that systematic sampling is equivalentto simple random sampling in estimating themean if the target population has no trends,strata, or correlations among the populationunits. Cochran (1977) notes that on theaverage, simple random sampling andsystematic sampling have equal variances. However, Cochran (1977) also states that forany single population for which the number ofsampling units is small, the variance fromsystematic sampling is erratic and might besmaller or larger than the variance from simplerandom sampling.

Gilbert (1987) cautions that any periodicvariation in the target population should beknown before establishing a systematicsampling program. Sampling intervals equalto or multiples of the target population's cycleof variation might result in biased estimates ofthe population mean. Systematic sampling canbe designed to capitalize on a periodic

structure if that structure can be characterizedsufficiently (Cochran, 1977). A simple orstratified random sample is recommended,however, in cases where the periodic structureis not well known or whether the randomlyselected starting point is likely to have animpact on the results (Cochran, 1977).

Gilbert (1987) notes that assumptions aboutthe population are required in estimatingpopulation variance from a single systematicsample of a given size. There are, however,systematic sampling approaches that dosupport unbiased estimation of populationvariance. They include multiple systematicsampling, systematic stratified sampling, andtwo-stage sampling (Gilbert, 1987). Inmultiple systematic sampling, more than onesystematic sample is taken from the targetpopulation. Systematic stratified samplinginvolves the collection of two or moresystematic samples within each stratum.

2.1.3. Measurement and SamplingErrors

In addition to making sure that samples arerepresentative of the sample population, it isalso necessary to consider the types of bias orerror that might be introduced into the study. Measurement error is the deviation of ameasurement from the true value (e.g., thepercent compliance with SMA specificationswas estimated as 23 percent and the true valuewas 26 percent). A consistent under- oroverestimation of the true value is referred toas measurement bias. Random

Page 9: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,
Page 10: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling Design Chapter 2

2-10

sampling error arises from the variability fromone population unit to the next (Gilbert,1987), explaining why the proportion ofoperators using a certain BMP differs fromone survey to another.

The goal of sampling is to obtain an accurateestimate by reducing the sampling andmeasurement errors to acceptable levels, whileexplaining as much of the variability aspossible to improve the precision of theestimates (Gaugush, 1987). Precision is ameasure of how close an agreement there isbetween individual measurements of the samepopulation. The accuracy of a measurementrefers to how close the measurement is to thetrue value. If a study has low bias and highprecision, the results will have high accuracy. Figure 2-5 illustrates the relationship betweenbias, precision, and accuracy.

As suggested earlier, numerous sources ofvariability should be accounted for indeveloping a sampling design. Samplingerrors are introduced by virtue of the naturalvariability within any given population ofinterest. Since sampling errors relate to MMor BMP implementation, the most effectivemethod for reducing such errors is to carefullydetermine the target population and to stratifythe target population to minimize thenonuniformity in each stratum.

Measurement errors can be minimized byensuring that site inspections are welldesigned. If data are collected by sending staffout to inspect randomly selected harvest sites,the approach for inspecting the harvest sitesshould be consistent. For example, how dofield personnel determine the percent ofadequate SMAs, or what is the basis for

determining whether a BMP has been properlyimplemented?

Reducing sampling errors below a certainpoint (relative to measurement errors) doesnot necessarily benefit the resulting analysisbecause total error is a function of the twotypes of errors. For example, if measurementerrors such as response or interviewing errorsare large, there is no point in taking a hugesample to reduce the sampling error of theestimate since the total error will be primarilydetermined by the measurement error. Measurement error is of particular concernwhen landowner surveys are used forimplementation monitoring. Likewise,reducing measurement errors would not beworthwhile if only a small sample size wereavailable for analysis because there would be alarge sampling error (and therefore a largetotal error) regardless of the size of themeasurement error. A proper balancebetween sampling and measurement errorsshould be maintained because researchaccuracy limits effective sample size and viceversa (Blalock, 1979).

2.1.4. Estimation and HypothesisTesting

Rather than presenting every observationcollected, the data analyst usually summarizesmajor characteristics with a few descriptivestatistics. Descriptive statistics include anycharacteristic designed to summarize animportant feature of a data set. A pointestimate is a single number that represents thedescriptive statistic. Statistics common toimplementation monitoring is useful toestimate the confidence interval. Theconfidence interval indicates the range inwhich the true value lies. For example, if it is

Page 11: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling DesignChapter 2

2-11

(b)

(c) (d)

(a)

Figure 2-5. Graphical presentation of the relationship between bias, precision, and accuracy(after Gilbert, 1987). (a): high bias + low precision = low accuracy; (b): low bias + low precision= low accuracy; (c): high bias + high precision = low accuracy; and (d): low bias + highprecision = high accuracy.

include proportions, means, medians, totals,and others. When estimating parameters of apopulation, such as the proportion or mean, itestimated that 65 percent of waterbars on skidtrails were installed in accordance with designstandards and specifications and the 90percent confidence limit is ±5 percent, there isa 90 percent chance that between 60 and 70percent of the waterbars were installedcorrectly.

Hypothesis testing should be used todetermine whether the level of MM and BMPimplementation has changed over time. Thenull hypothesis (Ho) is the root of hypothesistesting. Traditionally, Ho is a statement of nochange, no effect, or no difference; forexample, “the proportion of properly installedwaterbars after operator training

Page 12: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling Design Chapter 2

2-12

Table 2-2. Errors in hypothesis testing.

Decision

State of Affairs in the Population

Ho is True Ho is False

Accept Ho 1-"(Confidence level)

$(Type II error)

Reject Ho "(Significance level)

(Type I error)

1-$(Power)

is equal to the proportion of properly installedwaterbars before operator training.” Thealternative hypothesis (Ha) is counter to Ho,traditionally being a statement of change,effect, or difference, for example. If Ho isrejected, Ha is accepted. Regardless of thestatistical test selected for analyzing the data,the analyst must select the significance level(") of the test. That is, the analyst mustdetermine what error level is acceptable basedon the needs of decision makers. There aretwo types of errors in hypothesis testing:

Type I: Ho is rejected when Ho is reallytrue.

Type II: Ho is accepted when Ho is reallyfalse.

Table 2-2 depicts these errors, with themagnitude of Type I errors represented by "and the magnitude of Type II errorsrepresented by $. The probability of making aType I error is equal to the " of the test and isselected by the data analyst. In most cases,managers or analysts will define 1-" to be inthe range of 0.90 to 0.99 (e.g., a confidencelevel of 90 to 99 percent), although there havebeen applications where 1-" has been set to as

low as 0.80. Selecting a 95 percentconfidence level implies that the analyst willreject the Ho when Ho is true (i.e., a falsepositive) 5 percent of the time. The samenotion applies to the confidence interval forpoint estimates described above: " is set to0.10, and there is a 10 percent chance that thetrue percentage of properly installed waterbarsis outside the 60 to 70 percent range. Thisimplies that if the decisions to be made basedon the analysis are major (i.e., affect manypeople in adverse or costly ways) theconfidence level needs to be greater. For lesssignificant decisions (i.e., low costramifications) the confidence level can belower.

Type II error depends on the significancelevel, sample size, and variability, and whichalternative hypothesis is true. Power (1-$) isdefined as the probability of correctly rejectingHo when Ho is false. In general, for a fixedsample size, " and $ vary inversely. For afixed ", $ can be reduced by increasing thesample size (Remington and Schork, 1970).

Page 13: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling DesignChapter 2

2-13

2.2. SAMPLING CONSIDERATIONS

In a document of this brevity, it is not possibleto address all the issues that face technicalstaff who are responsible for developing andimplementing studies to track and evaluate theimplementation of nonpoint source controlmeasures. For example, when is the best timeto implement a survey or do on-site visits? Inreality, it is difficult to pinpoint a single timeof the year. Some BMPs can be checked anytime of the year, whereas others have a smallwindow of opportunity. If the goal of thestudy is to determine the effectiveness of anoperator education program, sampling shouldbe timed to ensure that there was sufficienttime for outreach activities and for theoperators to implement the desired practices. Furthermore, field personnel must haveapproval to perform a site visit on each tractof land to be sampled. Where access isdenied, a randomly selected replacement site isneeded.

2.2.1. Site Selection

From a study design perspective, all of theseissues must be considered together whendetermining the sampling strategy. Siteselection criteria will differ from state to statedepending on the type of forestry practiced inthe state, physical landscape, and intendedpurposes for the information obtained fromthe implementation monitoring. The followinglist indicates the typical site selection criteriaculled from existing state implementationmonitoring programs. (The correspondingstate postal code is presented in parentheses.)

• Site size: minimum of 5 or 10 acres,depending on the region of the state (MN);

minimum 10 acres (SC); minimum 5 acres(MT); minimum 20 acres (ID).

• Proximity to a stream (perennial orintermittent): within 300 feet of a stream, ora lake of at least 10 acres surface area (FL);within 200 feet of a stream (MT); sites didnot have to be associated with streams orwetlands (SC); within 150 feet of a class IIstream (ID).

• Time of harvest: within the past 1 year(SC); 1-3 years prior to the audit (MT);within 2 years of harvest (FL).

• Site preparation: only sites that had notbeen site prepared (SC); either slash piledand burned or waiting burning, or slashbroadcast and scheduled to be burned (MT).

• Volume harvested: at least 7 MBF/ac (MT).

• Compatibility with previous surveys: saleshad to meet the selection criteria of aprevious study for comparability purposes(MT).

Other criteria that might be considered includeerosion risk (e.g., more sampling sites could beplaced in high-erosion-risk areas than in low-risk erosion areas) and beneficial use (biassampling toward high use and/or sensitiveareas).

Page 14: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling Design Chapter 2

2-14

2.2.2. Data to Support Site Selection

A list of harvest sites from which to choosethose to be surveyed can be created frominformation obtained from timber harvesters. Depending on the state, the information isoften in the form of harvest plans and timbersale contracts. These sources of informationnormally include:

• U.S. Forest Service offices.

• The state forestry agency, department, ordivision (for state lands and nonindustrialprivate).

• Private timber companies (Ehinger andPotts, 1991).

In addition, the Bureau of Land Management(BLM) manages a significant acreage offederal property and may have valuableinformation (IDDHW, 1993). Aerialphotographs of the areas to be surveyed canbe used to identify recent harvest sites as well,and this method of identifying the sites tendsto remove site selection biases due to thedistance of sites from roads or other forms ofinconvenience that otherwise might makethem less apt to be chosen for a survey.

The data necessary to select sites for BMPtracking will naturally depend on the siteselection criteria. For instance, if sites mustmeet a minimum of board feet harvested, itwill be necessary to know harvest volumes inorder to select appropriate sites. The amountof data needed will increase as the number ofsite selection criteria increase, and this shouldbe taken into account when deciding on thecriteria, especially given the possibility that

some of the data or types of data collectedmight be unavailable or unreliable.

2.2.3. Example State and FederalPrograms

Several states and federal agencies haveimplemented programs for developing theirimplementation monitoring programs. Thissection describes those implemented byFlorida, Montana, Idaho, and the USDAForest Service.

2.2.3.1. Florida

In Florida, the following site selection criteriaare used (Vowell and Gilpin, 1994):

• All ownership classes are included.

• Only the northernmost 37 counties areincluded because most forestry activitiesoccur in these counties.

• Timber harvesting, site preparation, treeplanting, or some combination must haveoccurred within the past 2 years and within300 feet of an intermittent or perennialstream or a lake 10 acres or larger.

• Each county has a predetermined number ofsurvey sites based on the level of timberremoval reported by the Forest Service.

• Sites are selected from fixed-wing aircraftusing a random, predetermined flight patternin each county.

Page 15: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling DesignChapter 2

2-15

County foresters randomly select qualifyingsites along the flight pattern until they havelocated the number of survey sites assignedto their county.

This approach is a type of stratified randomsampling. The entire population (entire state)is first divided into strata containing thenorthernmost 37 counties based on priorinformation that indicated most forestryoperations occur in those counties. Thesestrata are still too large to conduct randomsampling; therefore, the criteria describedabove are used to reduce the strata to amanageable number given available resources.

2.2.3.2. Montana and Idaho

Montana is interested in certain types ofinformation related to BMP implementation,so they stratify their sample before selectingsites. They follow these steps:

• Information on the sites (e.g., ownership,erosion hazard) is compiled by watershed orbasin.

• A list of all sales and information on them iscompiled for each basin for the time periodof interest (usually within 1-2 years ofharvest date).

• Sites that do not meet the selection criteriaare eliminated.

• Sites that do meet the selection criteria areground-truthed.

This is a stratified random approach: Withindrainage basins, sites are stratified first byownership and then by erosion hazard(Ehinger and Potts, 1991; Schultz, 1992).

Idaho also uses this approach, stratifying sitesby geographic region and administrativecategory. This ensures that differences in MMand BMP implementation among different soil,geologic, and administrative groupings are notlost as would be the case if simple randomsampling were used (IDDHW, 1993).

2.2.3.3. U.S. Forest Service

The U.S. Forest Service (USDA, 1992) hasdeveloped a monitoring system for Region 5 ofthe Forest Service, Best Management PracticeEvaluation Program (BMPEP), with thefollowing objectives:

• Assess the degree of implementation ofBMPs.

• Determine which BMPs are effective.

• Determine which BMPs need improvementor development.

• Fulfill Forest Land and ResourceManagement Plan BMP monitoringcommitments.

• Provide a record of performance formanagement of nonpoint source pollution inRegion 5 of the Forest Service.

These objectives are met through threeevaluation phases: Administrative, on-site, andin-channel. In general, the first two

Page 16: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling Design Chapter 2

2-16

phases deal with issues related toimplementation monitoring, withadministrative evaluation primarily addressingprogrammatic evaluation and on-siteevaluation dealing primarily with individualpractices. In-channel evaluation consistsprimarily of effectiveness monitoring.

In the BMPEP, forests are assigned thenumber and types of evaluations to becompleted each year. To support statisticalinference, the evaluations assigned to eachforest must be performed at randomlyidentified sites. Sites to be evaluated areidentified in two ways: Randomly and byselection (“selected” sites).

Randomly identified sites are essential formaking statistical inferences regarding theimplementation and effectiveness of BMPs. Random sites are picked from a pool of sitesthat meet specified criteria.

Selected sites are identified in various ways:

• Identified as part of a monitoring planprescribed in an environmental assessment,environmental impact study, or landmanagement plan.

• Identified as part of a Settlement ofNegotiated Agreement.

• Part of a routine site visit.

• Follow-up evaluations upstream or In-channel Evaluation Sites, to discoversources of problems.

• Sites that are of particular interest to siteadministrators, specialists, and/or

management due to their sensitivity,uniqueness, and other factors.

• Selected for a particular reason specific tolocal needs.

It is important to note that for statisticalinference, the sample pool can only contain therandomly identified sites, not the “selected”sites. Selected sites must be clearly identifiedand kept separate from the random sites duringdata storage and analysis. Because on-siteevaluation addresses a range of practices,corresponding methods are provided fordeveloping sample pools for randomly selectedsites. For example, the sample pool for SMAsshould be developed using a Sale Area Mapfrom the Pool of Timber Sales and countingthe number of units that have designatedSMAs. This constitutes the SMA sample pool.

The data obtained from the sources discussedabove may not be precisely what are requiredby the state conducting the implementationmonitoring survey. Ehinger and Potts (1991)report the following difficulties encountered inusing data from the National Forest Servicedatabase:

• Flawed assumptions concerning the age of aharvest (some were found to be too old tomeet the survey criteria).

• Uncertain age of roads.

• Units within 200 feet of stream on paperwere in fact farther than 200 feet from astream.

Page 17: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling DesignChapter 2

2-17

The following difficulties were associated withprivate nonindustrial forest units:

• Inadequate database to identify salesmeeting the survey criteria.

• Permission to access landowner property notgranted.

Landowners who did grant permission wereinterested primarily in demonstrating theirBMP efforts to the state forest department, sothis class of ownership was statistically biased.

On private industrial forest sites, a backlog ofslash burnings on sale units was found,preventing their use (because of surveycriteria), and state forests sites were mostlyfound to be farther than 200 feet from astream, making them ineligible for the survey. States must be aware that these kinds oflimitations will be encountered.

2.3. SAMPLE SIZE CALCULATIONS

This section describes methods for estimatingsample sizes to compute point estimates suchas proportions and means, as well as detectingchanges with a given significance level. Usually, several assumptions regarding datadistribution, variability, and cost must be madeto determine the sample size. Someassumptions might result in sample sizeestimates that are too high or too low. Depending on the sampling cost and cost fornot sampling enough data, it must be decidedwhether to make conservative or “best-value”assumptions. Because the cost of visiting anyindividual site or group of sites is relativelyconstant, it is probably cheaper to collect afew extra samples the first time than to realizelater that additional data are needed. In most

cases, the analyst should probably considerevaluating a range of assumptions regardingthe impact of sample size and overall programcost. To maintain document brevity, someterms and definitions used in the remainder ofthis chapter are summarized in Table 2-3. These terms are consistent with those in mostintroductory-level statistics texts, and moreinformation can be found there. Those withsome statistical training will note that some ofthese definitions include an additional termreferred to as the finite population correctionterm (1-N), where N is equal to n/N. In manyapplications, the number of population units inthe sample population (N) is large incomparison to the number of population unitssampled (n), and (1-N) can be ignored. However, depending on the number of units(harvest sites for example) in a particularpopulation, N can become quite small. N isdetermined by the definition of the samplepopulation and the corresponding populationunits. If N is greater than 0.1, the finitepopulation correction factor should not beignored (Cochran, 1977).

Applying any of the equations described in thissection is difficult when no historical data setexists to quantify initial estimates ofproportions, standard deviations, means, orcoefficients of variation. To estimate theseparameters, Cochran (1977) recommends foursources:

• Existing information on the same populationor a similar population.

Page 18: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling Design Chapter 2

2-18

p ' a /n q ' 1& p

x '1

njn

i'1

xi s 2 '1

n&1jn

i'1

(xi& x)2

s ' s 2 Cv ' s /x

d ' *x&µ* dr '*x&µ*

µ

s 2(x) 's 2

n1& N s(x) '

s

n(1&N)0.5

s(Nx)'Ns

n1&N0.5 s(p)'

pq

n1&N0.5

N = total number of population unitsin sample population

n = number of samplesn0 = preliminary estimate of sample

size

a = number of successes

p = proportion of successes

q = proportion of failures (1-p)

xi = ith observation of a sample

x_

= sample mean

s2 = sample variance

s = sample standard deviation

Nx_

= total amount

µ = population mean

F2 = population variance

F = population standard deviation

Cv = coefficient of variation

s2(x_) = variance of sample mean

N = n/N (unless otherwise stated in text)

s(x_) = standard error (of sample mean)

1-N = finite population correction factor

d = allowable error

dr = relative error

Z" = value corresponding to cumulative area of 1-" using thenormal distribution (see Table A1).

t",df = value corresponding to cumulative area of 1-" using thestudent t distribution with df degrees of freedom (see TableA2).

Table 2-3. Definitions used in sample size calculation equations.

• A two-step sample. Use the first-stepsampling results to estimate the neededfactors, for best design, of the second step. Use data from both steps to estimate thefinal precision of the characteristic(s)sampled.

• A “pilot study” on a “convenient” or“meaningful” subsample. Use the results toestimate the needed factors. Here the resultsof the pilot study generally cannot be used in

the calculation of the final precision becauseoften the pilot sample is not representative ofthe entire population to be sampled.

• Informed judgment, or an educated guess.

It is important to note that this document onlyaddresses estimating sample sizes withtraditional parametric procedures. Themethods described in this document should be appropriate in most cases, considering thetype of data expected. If the data to be

Page 19: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling DesignChapter 2

2-19

What sample size is necessary to estimate towithin ±5 percent the proportion of harvest sitesthat have adequate SMAs?

What sample size is necessary to estimate theproportion of harvest sites that have adequateSMAs so that the relative error is less than 5percent?

no '(Z1&"/2)

2 pq

d 2(2-1)

no '(Z1&"/2)

2 q

d 2r p

(2-2)

n '

n0

1%Nfor N > 0.1

no otherwise

(2-3)

sampled are skewed, as with much waterquality data, the analyst should plan totransform the data to something symmetric, ifnot normal, before computing sample sizes(Helsel and Hirsch, 1995). Kupper andHafner (1989) also note that some of theseequations tend to underestimate the necessarysample because power is not taken intoconsideration. Again, EPA recommends thatif the analyst lacks a background in statistics,he/she should consult with a trainedstatistician to be certain that the approach,design, and assumptions are appropriate to thetask at hand.

2.3.1. Simple Random Sampling

In simple random sampling, it is presumed thatthe sample population is relativelyhomogeneous and a difference in samplingcosts or variability is not expected. If the costor variability of any group within the samplepopulation were different, it might be moreappropriate to consider a stratified randomsampling approach.

To estimate the proportion of harvest sitesimplementing a certain BMP or MM, such thatthe allowable error, d, meets the studyprecision requirements (i.e., the trueproportion lies between p-d and p+d with a 1-" confidence level), a preliminary estimate of

sample size can be computed as (Snedecor andCochran, 1980)

If the proportion is expected to be a lownumber, using a constant allowable error mightnot be appropriate. Ten percent plus/minus 5percent has a 50 percent relative error. Alternatively, the relative error, dr, can bespecified (i.e., the true proportion lies betweenp-dr p and p+dr p with a 1-" confidence level)and a preliminary estimate of sample size canbe computed as (Snedecor and Cochran, 1980)

In both equations, the analyst must make aninitial estimate of p before starting the study. In the first equation, a conservative samplesize can be computed by assuming p equal to0.5. In the second equation the sample sizegets larger as p approaches 0 for constant dr,and thus an informed initial estimate of p isneeded. Values of " typically range from 0.01to 0.10. The final sample size is thenestimated as (Snedecor and Cochran, 1980)

where N is equal to no/N. Table 2-4demonstrates the impact on n of selecting p, ",d, dr, and N. For example, 278 random

Page 20: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling Design Chapter 2

2-20

Table 2-4. Comparison of sample size as a function of p, ", d, dr, and N for estimatingproportions using equations 2-1 through 2-3.

Probabilityof Success,

p

Signifi-cancelevel, ""

Allowableerror, d

Relativeerror, dr

Preliminarysamplesize, no

Sample Size, n

Number of Population Units in SamplePopulation, N

500 750 1,000 2,000 Large N

0.1 0.05 0.050 0.500 138 108 117 121 138 138

0.1 0.05 0.075 0.750 61 55 61 61 61 61

0.5 0.05 0.050 0.100 384 217 254 278 322 384

0.5 0.05 0.075 0.150 171 127 139 146 171 171

0.1 0.10 0.050 0.500 97 82 86 97 97 97

0.1 0.10 0.075 0.750 43 43 43 43 43 43

0.5 0.10 0.050 0.100 271 176 199 213 238 271

0.5 0.10 0.075 0.150 120 97 104 107 120 120

What sample size is necessary to estimate theaverage number of acres per harvest site usingerosion controls to within ±25 acres?

What sample size is necessary to estimate the average number of acres per harvest site usingerosion controls to within ±10 percent?

n '(t1&"/2,n&1s/d)2

1 % (t1&"/2,n&1s/d)2/N(2-4)

n ' (t1&"/2,n&1s/d)2 (2-5)

samples are needed to estimate the proportionof 1,000 harvest sites with adequate SMAs towithin ±5 percent (d=0.05) with a 95 percentconfidence level, assuming roughly one-half ofharvest sites have adequate SMAs.Suppose the goal is to estimate the averageacreage per harvest site where erosioncontrols are used. The number of randomsamples required to achieve a desired marginof error when estimating the mean (i.e., thetrue mean lies between x

_-d and x

_+d with a 1-"

confidence level) is (Gilbert, 1987)

If N is large, the above equation can be

simplified to

Since the Student's t value is a function of n,Equations 2-4 and 2-5 are applied iteratively. That is, guess at what n will be, look up t1-"/2,n-1 from Table A2, and compute a revisedn. If the initial guess of n and the revised n aredifferent, use the revised n as the new guess,and repeat the process until the computed

Page 21: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling DesignChapter 2

2-21

n ' (Z1&"/2F/d)2 (2-6)

n '(t1&"/2,n&1Cv /dr)

2

1% (t1&"/2,n&1Cv /dr)2/N

(2-7)

n ' (t1&"/2,n&1Cv /dr)2 (2-8)

value of n converges with the guessed value. If the population standard deviation is known(not too likely), rather than estimated, theabove equation can be further simplified to

To keep the relative error of the meanestimate below a certain level (i.e., the truemean lies between x

_-dr x

_ and x

_+dr x

_ with a 1-"

confidence level), the sample size can becomputed with (Gilbert, 1987)

Cv is usually less variable from study to studythan are estimates of the standard deviation,which are used in Equations 2-4 through 2-6. Professional judgment and experience,typically based on previous studies, arerequired to estimate Cv. Had Cv been known,Z1-"/2 would have been used in place of t1-"/2,n-1

in Equation 2-7. If N is large, Equation 2-7simplifies to:

For Company X, harvest sites range in sizefrom 20 to 400 acres although most are lessthan 80 acres in size. The goal of thesampling program is to estimate the averagenumber of harvested acres using erosioncontrols. However, the investigator isconcerned about skewing the mean estimatewith the few large sites. As a result, thesample population for this analysis is the 430harvested sites with less than 80 total acres. The investigator also wants to keep the

relative error under 15 percent (i.e., dr = 0.15)with a 90 percent confidence level.

Unfortunately, this is the first study thatCompany X has done and there is noinformation about Cv or s. The investigator,however, is familiar with a recent study doneby another company. Based on that study, theinvestigator estimates the Cv as 0.6 and s equalto 30. As a first-cut approximation, Equation2-6 was applied with Z1-"/2 equal to 1.645 andassuming N is large:

n ' (1.645(0.6/0.15)2 ' 43.3 . 44 samples

Since n/N is greater than 0.1 and Cv isestimated (i.e., not known), it is best toreestimate n with Equation 2-7 using 44samples as the initial guess of n. In this case,t1-"/2,n-1 is obtained from Table A2 as 1.6811.

n '(1.6811×0.6/0.15)2

1% (1.6811×0.6 /0.15)2/430' 40.9 . 41 samples

Notice that the revised sample is somewhatsmaller than the initial guess of n. In this caseit is recommended to reapply the Equation 2-7using 41 samples as the revised guess of n. Inthis case, t1-"/2,n-1 is obtained from Table A2 as1.6839.

n '(1.6839×0.6/0.15)2

1% (1.6839×0.6 /0.15)2/430' 41.0 . 41 samples

Since the revised sample size matches theestimated sample size on which t1-"/2,n-1 wasbased, no further iterations are necessary. Theproposed study should include 41 harvestedsites randomly selected from the 430 sites withless than 80 total acres.

Page 22: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling Design Chapter 2

2-22

What sample size is necessary to determinewhether there is a 20 percent difference in BMPimplementation before and after an operatortraining program?

What sample size is necessary to detect a30-acre increase in average harvested acreageper site using erosion controls when comparingprivate and public timber sales?

no ' (Z"% Z2$)2

(p1q1% p2q2)

(p2& p1)2

(2-9)

no ' 10.51[(0.4)(0.6) % (0.6)(0.4)]

(0.6& 0.4)2' 126.1

no ' (Z"% Z2$)2

(s 21 % s 2

2 )

*2(2-10)

When interest is focused on whether the levelof BMP implementation has changed, it isnecessary to estimate the extent ofimplementation at two different time periods.

Alternatively, the proportion from twodifferent populations can be compared. Ineither case, two independent random samplesare taken and a hypothesis test is used todetermine whether there has been a significantchange in implementation. (See Snedecor andCochran (1980) for sample size calculationsfor matched data.) Consider an example inwhich the proportion of waterbars thateffectively divert water from the skid trail willbe estimated at two time periods. Whatsample size is needed?

To compute sample sizes for comparing twoproportions, p1 and p2, it is necessary toprovide a best estimate for p1 and p2, as wellas specifying the significance level and power(1-$). Recall that power is equal to theprobability of rejecting Ho when Ho is false. Given this information, the analyst substitutesthese values into (Snedecor and Cochran,1980)

where Z" and Z2$ correspond to the normaldeviate. Although this equation assumes thatN is large, it is acceptable for practical use(Snedecor and Cochran, 1980). Commonvalues of (Z" + Z2$)

2 are summarized in Table2-5. To account for p1 and p2 being estimated,Z should be replaced with t. In lieu of aniterative calculation, Snedecor and Cochran(1980) propose the following approach: (1) compute no using Equation 2-9; (2) roundno up to the next highest integer, f; and (3)multiply no by (f+3)/(f+1) to derive the finalestimate of n.

To detect a difference in proportions of 0.20with a two-sided test, " equal to 0.05, 1-$equal to 0.90, and an estimate of p1 and p2

equal to 0.4 and 0.6, no is computed as

Rounding 126.1 to the next highest integer, f isequal to 127, and n is computed as 126.1 x130/128 or 128.1. Therefore, 129 samples ineach random sample, or 258 total samples, areneeded to detect a difference in proportions of0.2. Beware of other sources of informationthat give significantly lower estimates ofsample size. In some cases the other sourcesdo not specify 1-$; in all cases, it is importantthat an “apples-to-apples” comparison is beingmade.

To compare the average from two randomsamples to detect a change of * (i.e., x̄2-x̄1), thefollowing equation is used:

Page 23: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling DesignChapter 2

2-23

Power,1-$$

"" for One-sided Test "" for Two-sided Test

0.01 0.05 0.10 0.01 0.05 0.10

0.80 10.04 6.18 4.51 11.68 7.85 6.18

0.85 11.31 7.19 5.37 13.05 8.98 7.19

0.90 13.02 8.56 6.57 14.88 10.51 8.56

0.95 15.77 10.82 8.56 17.81 12.99 10.82

0.99 21.65 15.77 13.02 24.03 18.37 15.77

Table 2-5. Common values of (Z" + Z2$)2 for estimating sample size for use with equations 2-9

and 2-10.

no ' 10.51(302% 302)

202' 47.3 (2-11)

What sample size is necessary to estimate theaverage SMA width per harvest site when thereis a wide variety of stream types and siteconditions?

Common values of (Z" + Z2$)2 are summarized

in Table 2-5. To account for s1 and s2 beingestimated, Z should be replaced with t. In lieuof an iterative calculation, Snedecor andCochran (1980) propose the followingapproach: (1) compute no using Equation 2-10; (2) round no up to the next highest integer,f; and (3) multiply no by (f+3)/(f+1) to derivethe final estimate of n.

Continuing the Company X example above,where s was estimated as 30 acres, theinvestigator will also want to compare theaverage number of harvested acres that usederosion controls to the average number ofharvested acres that used erosion controls in afew years. To demonstrate success, theinvestigator believes that it will be necessaryto detect a 20-acre increase. Although thestandard deviation might change after theoperator training program, there is noparticular reason to propose a different s atthis point. To detect a difference of 20 acreswith a two-sided test, " equal to 0.05, 1-$equal to 0.90, and an estimate of s1 and s2

equal to 30, no is computed as

Rounding 47.3 to the next highest integer, f isequal to 48, and n is computed as (47.3)•(51/49) or 49.2. Therefore 50 samples in eachrandom sample, or 100 total samples, areneeded to detect a difference of 20 acres.

2.3.2. Stratified Random Sampling

The key reason for selecting a stratifiedrandom sampling strategy over simple randomsampling is to divide a heterogeneouspopulation into more homogeneous groups. Ifpopulations are grouped based on size (e.g.,site size) when there is a large number of smallunits and a few larger units, a large gain in

Page 24: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling Design Chapter 2

2-24

xst ' jL

h'1

Wh xh (2-12)

s 2(xst) '1

N 2 jL

h'1

N 2h 1&

nh

Nh

s 2h

nh

s 2h '

1

nh&1jnh

i'1

(xh,i& xh)2 (2-14)

n '

jL

h'1

Wh sh ch jL

h'1

Wh sh / ch

V%1

NjL

h'1

Wh s 2h

nh ' nWh sh / ch

jL

h'1

Wh sh / ch

(2-16)

precision can be expected (Snedecor andCochran, 1980). Stratifying also allows theinvestigator to efficiently allocate samplingresources based on cost. The stratum mean,x_

h, is computed using the standard approachfor estimating the mean. The overall mean, x

_st,

is computed as

where L is the number of strata and Wh is therelative size of the hth stratum. Wh can becomputed as Nh/N where Nh and N are thenumber of population units in the hth stratumand the total number of population unitsacross all strata, respectively. Assuming thatsimple random sampling is used within eachstratum, the variance of x

_st is estimated as

(Gilbert, 1987)

(2-13)

where nh is the number of samples in the hth

stratum and sh2 is computed as (Gilbert, 1987)

There are several procedures for computingsample sizes. The method described belowallocates samples based on stratum size,variability, and unit sampling cost. If s2(x

_st) is

specified as V for a design goal, n can beobtained from (Gilbert, 1987)

(2-15)

where ch is the per unit sampling cost in the hth

stratum and nh is estimated as (Gilbert, 1987)

In the discussion above, the goal is to estimatean overall mean. To apply a stratified randomsampling approach to estimating proportions,ph, pst, phqh, and s2(pst) should be substitutedfor x

_h, x

_st, sh

2, and s2(x_

st) in the aboveequations, respectively.

To demonstrate the above approach, considerthe Company X example again. In addition tothe 430 sites that are less than 80 acres, thereare 100 sites that range in size from 81 to 200acres, 50 sites that range in size from 201 to300 acres, and 20 sites that range in size from301 to 400 acres. Table 2-6 presents threebasic scenarios for estimating sample size. Inthe first scenario, sh and ch are assumed equalamong all strata. Using a design goal of Vequal to 100 and applying Equation 2-15 yieldsa total sample size of 41.9 or 42. Since sh andch are uniform, these samples are allocatedproportionally to Wh, which is referred to asproportional allocation. This allocation canbe verified by comparing the percent sample

Page 25: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling DesignChapter 2

2-25

Farm Size(acres)

Number ofFarms

(Nh)

RelativeSize(Wh)

StandardDeviation

(sh)

UnitSample

Cost(ch)

Sample Allocation

Number %

A) Proportional allocation (sh and ch are constant)

20-80 430 0.7167 30 1 31 70.5

81-200 100 0.1667 30 1 7 15.9

201-300 50 0.0833 30 1 4 9.1

301-400 20 0.0333 30 1 2 4.5

Using Equation 2-15, n is equal to 41.9. Applying Equation 2-16 to each stratum yields a total of44 samples after rounding up to the next integer.

B) Neyman allocation (ch is constant)

20-80 430 0.7167 30 1 35 56.5

81-200 100 0.1667 45 1 13 21.0

201-300 50 0.0833 60 1 9 14.5

301-400 20 0.0333 75 1 5 8.1

Using Equation 2-15, n is equal to 59.3. Applying Equation 2-16 to each stratum yields a total of62 samples after rounding up to the next integer.

C) Allocation where sh and ch are not constant

20-80 430 0.7167 30 1.00 38 61.3

81-200 100 0.1667 45 1.25 12 19.4

201-300 50 0.0833 60 1.50 8 12.9

301-400 20 0.0333 75 2.00 4 6.5

Using Equation 2-15, n is equal to 60.0. Applying Equation 2-16 to each stratum yields a total of62 samples after rounding up to the next integer.

Table 2-6. Allocation of samples.

allocation to Wh. Due to rounding up, a totalof 44 samples are allocated.

Under the second scenario, referred to as theNeyman allocation, the variability betweenstrata changes, but unit sample cost isconstant. In this example, sh increases by 15between strata. Because of the increased

Page 26: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling Design Chapter 2

2-26

n'pq

s(p)2'

(0.56)(0.44)

0.0352' 201 (2-17)

variability in the last three strata, a total of59.3 or 60 samples are needed to meet thesame design goal. So while more samples aretaken in every stratum, proportionally fewersamples are needed in the smaller site sizegroup. For example, using proportionalallocation, more than 70 percent of thesamples are taken in the 20- to 80-acre sitesize stratum, whereas approximately 57percent of the samples are taken in the samestratum using the Neyman allocation.

Finally, introducing sample cost variation willalso affect sample allocation. In the lastscenario it was assumed that it is twice asexpensive to evaluate a harvest site from thelargest size stratum than to evaluate a harvestsite from the smallest size stratum. In thisexample, roughly the same total number ofsamples are needed to meet the design goal,yet more samples are taken in the smaller sizestratum.

2.3.3. Cluster Sampling

Cluster sampling is commonly used whenthere is a choice between the size of thesampling unit (e.g., skid trail versus harvestsite). In general, it is cheaper to sample largerunits than smaller units, but the results tend tobe less accurate (Snedecor and Cochran,1980). Thus, if there is not a unit samplingcost advantage to cluster sampling, it isprobably better to use simple randomsampling. To decide whether to performcluster sampling, it will probably be necessaryto perform a special investigation to quantifysampling errors and costs using the twoapproaches.

Perhaps the best approach to explaining thedifference between simple random sampling

and cluster sampling is to consider an exampleset of results. In this example, the investigatordid an evaluation to determine whether harvestsites had adequate SMAs. Since the state hadtimber harvesting activities across the state, theinvestigator elected to inspect 10 harvest sitesalong each randomly selected river. Table 2-7presents the number of harvest sites along eachriver that had the recommended BMPs. Theoverall mean is 5.6; a little more than one-halfof the sites have implemented therecommended BMPs. However, note thatsince the population unit corresponds to the 10sites collectively, there are only 30 samplesand the standard error for the proportion ofsites using recommended BMPs is 0.035. Hadthe investigator incorrectly calculated thestandard error using the random samplingequations, he or she would have computed0.0287, nearly a 20 percent error.

Since the standard error from the clustersampling example is 0.035, it is possible toestimate the corresponding simple randomsample size to obtain the same precision using

Is collecting 300 samples using a clustersampling approach cheaper than collectingabout 200 simple random samples? If so,cluster sampling should be used; otherwise,simple random sampling should be used.

2.3.4. Systematic Sampling

It might be necessary to obtain an estimate ofthe proportion of harvest sites where cableyarding was implemented using siteinspections. Assuming a record of harvest

Page 27: Techniques for Tracking, Evaluating, and Reporting the ... · 1987). In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters,

Sampling DesignChapter 2

2-27

Table 2-7. Number of harvest sites (out of 10) implementing recommended BMPs.

3 9 5 7 6 4 6 3 5 5

5 7 7 4 7 5 3 8 4 6

8 4 7 4 5 3 3 9 9 7

Grand Total = 168 x

_ = 5.6

s = 1.923p = 5.6/10=0.560s = 1.923/10=0.1923

Standard error using cluster sampling: s(p)=0.1923/(30)0.5=0.035Standard error if simple random sampling assumption had been incorrectly used: s(p)=((0.56)(1-0.56)/300)0.5 =0.0287

sites (where cable yarding was specified in thetimber sale contract or administration file) isavailable in a sequence unrelated to themanner in which this BMP would beimplemented (e.g., in alphabetical order by theoperator's name), a systematic sample can beobtained by selecting a random number rbetween 1 and n, where n is the numberrequired in the sample (Casley and Lury,1982). The sampling units are then r, r +(N/n), r + (2N/n), ..., r + (n-1)(N/n), where Nis total number of available records.

If the population units are in random order(e.g., no trends, no natural strata,uncorrelated), systematic sampling is, onaverage, equivalent to simple randomsampling.

Once the sampling units (in this case, specificharvest sites) have been selected, siteinspections can be made to assess the extent ofcompliance with cable yarding standards andspecifications.