probability sampling


SAMPLING METHODS FOR POPULATION AT INCREASED RISK OF HIV Probability Sampling


Probability Sampling: Simple, Systematic, Cluster, Stratified, Multistage

Mehdi Osooli, Knowledge Hub on HIV Surveillance, Kerman University of Medical Sciences, Iran

The main objectives

1. To introduce the basics of probability sampling

2. To familiarize participants with the practical aspects of each probability sampling method

3. To compare the pros and cons of the different probability sampling methods

Concept and basics of probability sampling methods

One of the most important issues in research is selecting an appropriate sample. Among sampling methods, probability samples are especially important because most statistical tests assume this type of sampling. Probability samples drawn from a population achieve good representativeness and generalizability, although low feasibility or high cost can rule them out and push us toward non-probability sampling methods. In probability sampling, every unit of the population has a known, non-zero chance of being selected. We usually want to estimate some parameters of a population from a sample. Because we do not observe the whole population, these estimates usually carry some error. Fortunately, in probability sampling we can quantify how trustworthy our estimates are, that is, how close they are likely to be to the population parameter values, by computing the standard errors of the estimates. This is not easily possible in non-probability sampling methods.
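To make the idea of a standard error concrete, the following minimal sketch computes the standard error of an estimated proportion under simple random sampling, with the usual finite population correction. The function name and all figures (45 positives in a sample of 300 drawn from 2000) are hypothetical and chosen only for illustration; they do not come from any real survey.

```python
import math


def proportion_with_standard_error(positives, n, population_size=None):
    """Estimate a proportion and its standard error under simple random sampling."""
    p = positives / n
    # Unbiased estimate of the variance of the sample proportion.
    variance = p * (1 - p) / (n - 1)
    if population_size is not None:
        # Finite population correction: sampling a large share of a finite
        # population without replacement reduces the sampling error.
        variance *= 1 - n / population_size
    return p, math.sqrt(variance)


# Hypothetical figures, for illustration only.
p, se = proportion_with_standard_error(positives=45, n=300, population_size=2000)
lower, upper = p - 1.96 * se, p + 1.96 * se
print(f"estimate={p:.3f}  SE={se:.4f}  approx. 95% CI=({lower:.3f}, {upper:.3f})")
```

The interval uses the normal approximation, which is a simplifying assumption; other sampling designs need their own variance formulas.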

Types of probability sampling methods 

Simple Random Sampling

What is it? Simple random sampling means randomly selecting some units from a known and well-defined population. In this method the sampling frame must be known and all units must have the same chance of being selected.

How is it done? (Example) In simple random sampling, n units are selected at random from a population of N, and every unit has an equal chance of being selected. Different methods and tools can be used to create random numbers for sample selection: standard random number tables and software able to generate random numbers, such as OpenEpi or Stata, are available. Example: You have been asked to perform a knowledge, attitudes and practices (KAP) survey in a prison. You have been given the list of all 2000 prisoners, and you think that a sample of 300 would be satisfactory for your work. To choose 300 of them at random for interview, you can use a random number generator to generate 300 numbers between 1 and 2000. Most of the time some numbers will be repeated; these should be replaced by new numbers.

Uses Simple random sampling is a good baseline for comparing the precision of different sampling methods and is also useful for teaching the general rules of probability sampling.

Criticisms Although simple random sampling is possible when the population is not very big, other methods of random sampling are often preferable because they can give more precise estimates of population parameters. In big populations and wide geographical sampling areas it is not easy to obtain a list of all units and select from them at random.
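The prison example can be sketched in a few lines of Python. This is only one possible way to do it, assuming the prisoners are numbered 1 to 2000 on the list; the function name and the seed are illustrative choices. `random.sample` draws without replacement, so no repeated numbers arise and none need replacing.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` distinct list positions from 1..population_size."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    # random.sample selects without replacement, so every position is unique.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# 300 prisoners out of the list of 2000; the seed value is arbitrary.
selected = simple_random_sample(population_size=2000, sample_size=300, seed=42)
print(len(selected), "positions selected; first five:", selected[:5])
```

Recording the seed alongside the sample is a practical way to document how the selection was made.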

Systematic Random Sampling

What is it? In systematic random sampling we use the order of the population list, or the position of units in the population, to choose the sample.

How is it done? (Example) First we need the list of the population. According to the sample size needed, we define an interval “k” (the population size divided by the sample size) and step through the list, selecting every k-th unit. If we want to select 5 units from a population of 50, we define k = 50/5 = 10 and draw a random number between 1 and 10. Suppose the random number is 3. Since k is 10, the first unit is 3 and the second, third, fourth and fifth units are 13, 23, 33 and 43 respectively. Example: We want to estimate the prevalence of HIV infection among volunteer blood donors in Tehran in 2009. The list of blood donors was available in computer software, ordered by the date of referral. We decided to select a sample of 2,000 from 76,000, so k was defined as 76,000/2,000 = 38 and a random number between 1 and 38 was chosen. The number chosen was 12, so the first unit was record number 12; adding 38 gave record number 50 as the second unit, and each subsequent unit was chosen by adding 38 to the previously selected record. Repeated units, identified by the participants’ names, were excluded and replaced by new units selected with the same method.
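The blood donor example can be sketched as follows. This is a minimal illustration, assuming the records are numbered 1 to 76,000 and that the interval is the population size divided by the sample size, rounded down; the function name and seed are hypothetical.

```python
import random


def systematic_sample(population_size, sample_size, seed=None):
    """Return the interval k, the random start, and the selected record numbers."""
    k = population_size // sample_size  # sampling interval
    rng = random.Random(seed)
    start = rng.randint(1, k)  # random start between 1 and k, inclusive
    # Every k-th record after the random start is selected.
    records = [start + i * k for i in range(sample_size)]
    return k, start, records


k, start, records = systematic_sample(population_size=76_000, sample_size=2_000, seed=1)
print("k =", k, "start =", start, "first records:", records[:3])
```

With a start of 12 the first records would be 12, 50, 88, matching the example above; the start actually drawn depends on the seed.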

Uses Systematic random sampling is very easy and less time-consuming than simple random sampling. Because the selected units are spread evenly over the list, its precision is often higher than that of simple random sampling.

Criticisms The chance of selecting a non-representative sample can be high with this method, especially when there is a correlation between the position of a unit in the population list and the characteristic of the unit that is to be observed. In that case, stepping through the list with the interval “k” may systematically jump over certain kinds of units and over-select others.

A schematic example for systematic sampling with k=10

Stratified Random Sampling

What is it? In some situations the population can be divided into subpopulations whose members share some characteristics. In this case it is more reasonable to take random samples from each of these subdivisions, which are called “strata”. The strata should be non-overlapping, internally homogeneous, and different from one another. The size of the population in each stratum should be known.

How is it done? (Example) Stratified sampling is done in two major steps: first define the population strata, and second select a sample from each stratum. Example: You want to estimate the prevalence of STI among female sex workers (FSWs) in a capital city. From the formative assessment you have found that there are three types of FSWs in the city: street-based, hotel-based and brothel-based. They differ from each other somewhat in the percentage with high-risk behavior, in medical consultation, and in the health care services available to them. In the figure below, the level of shading indicates the frequency of risky behaviors.

Figure: three strata of FSWs (street-based, hotel-based, brothel-based).

As all the FSWs in the city are registered with the health department, you have access to the list of all the FSWs. In this case you have three strata of FSWs, and to obtain a precise estimate of STI prevalence in the target population you should select a sample from each stratum.
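The two steps can be sketched in Python. The stratum sizes (1200, 500 and 300) and the total sample of 400 are hypothetical, and proportional allocation is an assumption made for this sketch; the text above does not prescribe how the sample is divided among strata.

```python
import random


def proportional_allocation(stratum_sizes, total_sample):
    """Split the total sample across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    allocation, remainders = {}, {}
    for name, size in stratum_sizes.items():
        # Integer arithmetic avoids floating-point rounding surprises.
        allocation[name], remainders[name] = divmod(total_sample * size, total)
    # Hand out any leftover units to the strata with the largest remainders.
    leftover = total_sample - sum(allocation.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        allocation[name] += 1
    return allocation


def stratified_sample(stratum_sizes, total_sample, seed=None):
    """Draw a simple random sample of list positions within each stratum."""
    rng = random.Random(seed)
    allocation = proportional_allocation(stratum_sizes, total_sample)
    return {
        name: sorted(rng.sample(range(1, size + 1), allocation[name]))
        for name, size in stratum_sizes.items()
    }


# Hypothetical numbers of registered FSWs per stratum.
strata = {"street-based": 1200, "hotel-based": 500, "brothel-based": 300}
allocation = proportional_allocation(strata, total_sample=400)
sample = stratified_sample(strata, total_sample=400, seed=7)
print(allocation)
```

Other allocation rules, such as equal allocation or allocating more to more variable strata, fit the same structure by replacing `proportional_allocation`.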

Uses Four uses can be proposed for stratified random sampling:

1. Stratified sampling is very useful when you want to achieve a certain precision, or obtain information, for specific subdivisions of the population. With this method a smaller sample size can give higher precision.

2. Stratified sampling is helpful in studies covering multiple administrative areas. In this case each area is a stratum, and you allocate and draw the sample in each stratum separately.

3. In some cases the sampling problems differ between parts of the study field. Here again, dividing the population into subdivisions enables you to define specific methods and criteria for the work in each division.

4. Stratified sampling can improve the overall precision of the estimates. When the variability within strata is small, a small sample from each stratum gives satisfactory precision, and combining these stratum estimates gives a precise estimate for the whole population as well.
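Combining stratum estimates into one population estimate, as mentioned in point 4, can be sketched as a weighted average in which each stratum is weighted by its share of the population. All figures below are hypothetical, and the variance formula assumes a simple random sample within each stratum.

```python
import math


def stratified_proportion(strata):
    """Weighted population proportion and its standard error from stratum results."""
    total = sum(s["N"] for s in strata.values())
    estimate, variance = 0.0, 0.0
    for s in strata.values():
        weight = s["N"] / total           # stratum share of the population
        p = s["positives"] / s["n"]       # stratum-level proportion
        fpc = 1 - s["n"] / s["N"]         # finite population correction
        estimate += weight * p
        variance += weight**2 * fpc * p * (1 - p) / (s["n"] - 1)
    return estimate, math.sqrt(variance)


# Hypothetical stratum sizes (N), sample sizes (n) and STI-positive counts.
results = {
    "street-based": {"N": 1200, "n": 240, "positives": 60},
    "hotel-based": {"N": 500, "n": 100, "positives": 10},
    "brothel-based": {"N": 300, "n": 60, "positives": 9},
}
estimate, se = stratified_proportion(results)
print(f"population estimate={estimate:.4f}  SE={se:.4f}")
```

Skipping the weights would bias the population estimate whenever the strata are sampled at different rates.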

Criticisms To benefit from this method, the units within each stratum must be similar, with little variation; in the real world this is not easily achieved.

Cluster Random Sampling

What is it? In sampling from big populations, the most common method is cluster sampling. In this method we divide the population into subdivisions called clusters. Ideally each cluster is a representative sub-sample of the population; this means that the population units within each cluster are heterogeneous.

How is it done? (Example) Clusters are parts of the population that contain almost the same mix of elements or units as the whole population, but on a smaller scale. Cluster sampling can be done in one step or in multiple steps. In one-step cluster sampling, after selecting a few clusters, we include all the units of the selected clusters in our sample, while in multi-stage cluster sampling we randomly choose just some units within the selected clusters. Example: To assess the satisfaction of HIV-positive patients with hospital-based health care services in the city of Kerman, you can treat each hospital in the city as a cluster and allocate a random sample to each selected hospital to reach your desired sample size. Tehran, for example, has around 120 hospitals; in the one-step method all patients from each selected hospital are included, while in multi-stage cluster sampling we first select some hospitals and then select a number of patients within each selected hospital (cluster).

Uses In sampling from wide geographical areas it is possible to define neighboring regions as clusters. In household surveys each family can also be considered a cluster.
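The difference between one-step and multi-stage cluster sampling can be sketched as below. The 120 hospitals and their patient counts are made-up values, and selecting hospitals by simple random sampling is an assumption of this sketch.

```python
import random


def cluster_sample(cluster_sizes, n_clusters, n_per_cluster=None, seed=None):
    """Select clusters at random, then all units (one-step) or some units (two-stage)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(cluster_sizes), n_clusters)  # stage 1: clusters
    sample = {}
    for cluster in chosen:
        size = cluster_sizes[cluster]
        if n_per_cluster is None:
            # One-step cluster sampling: take every unit in the cluster.
            sample[cluster] = list(range(1, size + 1))
        else:
            # Stage 2: simple random sample of units within the cluster.
            take = min(n_per_cluster, size)
            sample[cluster] = sorted(rng.sample(range(1, size + 1), take))
    return sample


# Hypothetical frame: 120 hospitals with between 20 and 50 eligible patients each.
hospitals = {f"hospital_{i:03d}": 20 + (i % 7) * 5 for i in range(1, 121)}

one_step = cluster_sample(hospitals, n_clusters=10, seed=3)
two_stage = cluster_sample(hospitals, n_clusters=10, n_per_cluster=15, seed=3)
print(sum(len(v) for v in one_step.values()), "patients in the one-step sample")
print(sum(len(v) for v in two_stage.values()), "patients in the two-stage sample")
```

Only the selected hospitals need a patient list, which is the main practical appeal of the method.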

Criticisms For the same sample size, the precision of a cluster sample is usually lower than that of a stratified sample, so a bigger sample size is needed to achieve the same precision.

Multi-stage sampling

What is it? In this method you carry out several sampling steps, possibly using different sampling methods, to obtain your desired sample. It is possible to combine probability and non-probability methods, but keep in mind that non-probability samples are not representative of the population.

How is it done? (Example) First we assess the study population and its characteristics to see which sampling methods fit. Then, according to the population’s structure, we define our sampling framework and take our sample.

Uses The multi-stage method can combine the benefits of each sampling type. Big national surveys are usually done using multi-stage sampling methods.

Criticisms The multi-stage method needs careful design. Inference from the sample and the final analysis are somewhat complicated and must be adapted to the sampling procedure and the different methods that were used.


Summary Table 1 provides a brief summary of various conventional sampling techniques including their advantages and disadvantages.

Table 1 - Summary of conventional sampling techniques.

Sampling: Simple random

Steps:
1. Construct sample frame for survey population
2. Select people randomly from sample frame using random number table or lottery draw

Advantages:
1. Concept is easy to understand and analyse

Disadvantages:
1. Requires sample frame of entire target population
2. Logistically difficult if sample geographically dispersed
3. Using random number/lottery time-consuming

Sampling: Systematic

Steps:
1. Create a list of the target population
2. Calculate sampling interval (SI)
3. Select random start between 1 and SI & select that person
4. Add SI to random start and select person, etc.

Advantages:
1. Random numbers or lottery not required
2. Easy to analyse

Disadvantages:
1. Requires sample frame of entire target population
2. Logistically difficult if sample geographically dispersed

Sampling: Stratified

Steps:
1. Define the strata and construct sample frame for each stratum
2. Take a simple/systematic sample from each stratum
3. Calculate indicator estimates for each stratum and for population

Advantages:
1. Produces unbiased estimates of indicators for the strata
2. Can increase precision of indicator estimates

Disadvantages:
1. Requires sample frame of entire survey population
2. Logistically difficult if sample geographically dispersed
3. Requires sample large enough to make precise estimates for each stratum
4. Population estimates require weighting

Table 1 - Summary of conventional sampling techniques, continued.

Sampling: Cluster: Probability proportional to size (PPS) or equal probability sampling

Steps:
1. Construct sample frame of clusters
2. Calculate SI, select random start between 1 & SI
3. Select cluster whose cumulative size contains the random start
4. Add SI to random start & select cluster
5. Sample equal numbers of people from selected clusters

Advantages:
1. Only need sample frame of clusters and individuals in selected clusters
2. Sample concentrated in geographical areas

Disadvantages:
1. Decreases precision of estimates; thus, requires larger sample size
2. Size of clusters required prior to sampling

Sampling: Cluster: Equal probability, fixed cluster size

Steps:
1. Construct sample frame of clusters
2. Select clusters using simple/systematic sampling
3. Sample equal numbers of people from selected clusters

Advantages:
1. Only need sample frame of clusters and individuals in selected clusters
2. Sample concentrated in geographical areas
3. Don’t need cluster sizes prior to sampling

Disadvantages:
1. Decreases precision of estimates; thus, requires larger sample size
2. Weighted analysis required for unbiased estimates
3. Size of clusters required for weighted analysis

Sampling: Cluster: Equal probability, proportional cluster size

Steps:
1. Construct sample frame of clusters
2. Select clusters using simple/systematic sampling
3. Sample equal proportions of people per cluster

Advantages:
1. Only need sample frame of clusters and individuals in selected clusters
2. Sample concentrated in geographical areas

Disadvantages:
1. Decreases precision of estimates; thus, requires larger sample size
2. Size of clusters required for proportional sampling
3. Sample size; thus, precision of estimates unpredictable
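Steps 1 to 4 of the PPS cluster row in Table 1 can be sketched as a systematic walk along the cumulative cluster sizes. The cluster names and sizes are hypothetical, and the optional `start` argument exists only so that a chosen random start can be reproduced; step 5 (sampling equal numbers of people within the selected clusters) is not shown.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def pps_systematic(cluster_sizes, n_clusters, start=None, seed=None):
    """Select clusters with probability proportional to size (systematic PPS)."""
    names = list(cluster_sizes)
    cumulative = list(accumulate(cluster_sizes[name] for name in names))
    interval = cumulative[-1] // n_clusters  # sampling interval (SI)
    if start is None:
        start = random.Random(seed).randint(1, interval)  # between 1 and SI
    selected = []
    for i in range(n_clusters):
        point = start + i * interval
        # The first cluster whose cumulative size reaches the point contains it.
        selected.append(names[bisect_left(cumulative, point)])
    # A cluster larger than the interval can legitimately appear more than once.
    return selected


# Hypothetical cluster sizes; total 1000, so SI = 250 for four clusters.
sizes = {"A": 120, "B": 300, "C": 80, "D": 200, "E": 150, "F": 150}
print(pps_systematic(sizes, n_clusters=4, seed=11))
```

Larger clusters cover more of the cumulative range and are therefore more likely to be hit, which is why the cluster sizes must be known before sampling.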


Question(s) to be discussed

Looking at Table 1, answer the following questions:

a. What are the advantages of using a systematic sampling method?

b. What are the steps to take when using the Cluster: equal probability, fixed cluster size sampling method?

c. What is the disadvantage of using the stratified sampling method when it comes to making population estimates?
