9. selecting a sample


Upload: razif-shahril

Post on 13-Jan-2017


Page 1: 9. Selecting a sample

KNOWLEDGE FOR THE BENEFIT OF HUMANITY

RESEARCH METHODOLOGY (HFS4343)

SELECTING A SAMPLE

Dr. Mohd Razif Shahril

School of Nutrition & Dietetics

Faculty of Health Sciences

Universiti Sultan Zainal Abidin

Page 2: 9. Selecting a sample

SCHOOL OF NUTRITION AND DIETETICS • UNIVERSITI SULTAN ZAINAL ABIDIN

Topic Learning Outcomes

At the end of this lecture, students should be able to:

• define sampling terminology

• differentiate types of sampling in quantitative and qualitative research

• calculate sample size

Page 3: 9. Selecting a sample

The concept of sampling

• Study population: sampling units

• Sample: select a few sampling units from the study population

• Collect information from these people to find answers to your research questions

Page 4: 9. Selecting a sample

The concept of sampling

• Sampling is the process of selecting a few (a sample) from a bigger group (the sampling population)

– to become the basis for estimating or predicting the prevalence of an unknown piece of information, situation or outcome regarding the bigger group.

• A sample is a subgroup of the population you are interested in

• Advantages of selecting a sample

– Saves time as well as financial and human resources

• Disadvantages of selecting a sample

– Possibility of error in estimation exists

Page 5: 9. Selecting a sample


Sampling terminology

• Study population

• Sample

• Sample size (n)

• Sampling design/ strategy

• Sampling unit/ element

• Sampling frame

• Sample statistics

• Population mean

Activity: Briefly explain these terms in the e-Kelip Forum.

Page 6: 9. Selecting a sample

Principles of sampling

1. In a majority of cases of sampling there will be a difference between the sample statistics and the true population mean, which is attributable to the selection of the units in the sample.

2. The greater the sample size, the more accurate the estimate of the true population mean.

3. The greater the difference in the variable under study in a population for a given sample size, the greater the difference between the sample statistics and the true population mean.

Page 7: 9. Selecting a sample

Factors affecting the inferences drawn from a sample

• The size of the sample

– Findings based upon larger samples have more certainty than those based on smaller ones.

– The larger the sample size, the more accurate the findings.

• The extent of variation in the sampling unit

– The greater the standard deviation, the higher the standard error for a given sample size in your estimates.

– The higher the variation with respect to the characteristics under study in the study population, the greater the uncertainty for a given sample.
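Both factors can be seen in the standard error of a sample mean, SE = SD / √n. This is a standard result that the slide does not state explicitly. The sketch below is illustrative only: the function name and the example numbers are assumptions, not values from the lecture.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of a sample mean: SD divided by the square root of n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sd / math.sqrt(n)


# For a fixed SD, a larger sample gives a smaller standard error.
for n in (25, 100, 400):
    print("n =", n, "SE =", standard_error(10.0, n))

# For a fixed n, a larger SD gives a larger standard error.
for sd in (5.0, 10.0, 20.0):
    print("SD =", sd, "SE =", standard_error(sd, 100))
```

Quadrupling the sample size only halves the standard error, which is one reason precision becomes expensive as the target gets tighter.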

Page 8: 9. Selecting a sample

Aims in selecting a sample

• Maximum precision in estimates within the given sample size

• Avoid bias in the selection of your sample. Bias in the selection of a sample can occur if:

– Sampling is done by a non-random method

– The sampling frame does not cover the sampling population accurately and completely

– A section of a sampling population is impossible to find or refuses to co-operate

Page 9: 9. Selecting a sample

Types of sampling

• Random/probability sampling design

• Non-random/non-probability sampling design

• Systematic sampling: mixed design

Page 10: 9. Selecting a sample


Random/ probability sampling design

• Simple random sampling

• Stratified random sampling

– Proportionate stratified sampling

– Disproportionate stratified sampling

• Cluster sampling

Activity: Find videos or animations related to sampling design and share them in Blendspace. Briefly present them to class members.
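Two of the designs listed above can be sketched in a few lines. This is a minimal illustration, assuming the usual textbook definitions: simple random sampling gives every unit an equal chance of selection, and proportionate stratified sampling draws from each stratum according to its share of the population. The function names and the sampling frame are hypothetical.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(list(frame), n)


def proportionate_stratified_sample(strata, n, seed=None):
    """Sample each stratum in proportion to its share of the population.

    `strata` maps a stratum label to the list of units in that stratum.
    Per-stratum sizes are rounded, so the total can differ slightly from n.
    """
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for label, units in strata.items():
        k = round(n * len(units) / total)
        sample[label] = rng.sample(units, k)
    return sample


# Hypothetical sampling frame: 60 female and 40 male student IDs.
frame = {"female": list(range(0, 60)), "male": list(range(60, 100))}
picked = proportionate_stratified_sample(frame, 10, seed=1)
print({label: len(units) for label, units in picked.items()})
```

With a 60/40 frame and n = 10, the stratified draw takes 6 and 4 units respectively, mirroring the population split.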

Page 11: 9. Selecting a sample

Non-random/non-probability sampling design

• Quota sampling

• Accidental sampling

• Judgemental/ purposive sampling

• Expert sampling

• Snowball sampling

Systematic sampling: mixed design

Activity: Find videos or animations related to sampling design and share them in Blendspace. Briefly present them to class members.

Page 12: 9. Selecting a sample

CALCULATION OF SAMPLE SIZE

The following slides, through to the end, were prepared by Prof. Dr. Lua Pei Lin, UniSZA.

Page 13: 9. Selecting a sample


Page 14: 9. Selecting a sample

WHY SAMPLE SIZE CALCULATION?

• Common question: How many subjects are needed to obtain a significant result?

• If the sample size (n) is too small:

– Waste of time

– No conclusive results

– Clinically significant differences may not be ‘statistically significant’

• If the sample size (n) is too large:

– Extra subjects are given a therapy which may not be efficacious

– Waste of resources (unnecessary exposure)

Kamarul Imran Musa (2012). Sample Size Estimation. In Research Methodology in Health Sciences. KKMED Publications, Universiti Sains Malaysia, Kota Bharu.

Page 15: 9. Selecting a sample

IF NOT CALCULATED?

• Is the study ethical?

• Would the study waste resources?

• Should the study still be conducted?

• Could it be a study limitation?

• Statistically insignificant results may be generated, i.e. no real difference OR lack of power?

• How does it affect the budget?

• Problems with the sampling method?

• Study outcomes cannot be compared/generalised?

Page 16: 9. Selecting a sample

STUDY AIMS vs. SAMPLE SIZE

• Study aims/objectives are almost always:

– To estimate the prevalence or proportion of a disease, condition etc.

– To compare the prevalence or proportion of a disease, condition etc.

• For categorical variables: point prevalence/proportion and 95% confidence interval (CI).

• For numerical variables: mean and 95% confidence interval (CI).

• Each aim/objective should have its own sample size.

• Choose the biggest sample size.

Page 17: 9. Selecting a sample

RECALL! TYPES OF ERROR

• Type I error; Alpha:

– Reject Ho when it is true.

– No difference in reality, but the p value is < 0.05.

• Type II error; Beta:

– Ho not rejected when it is false.

– Difference exists in reality, but the p value is > 0.05.

• Power of study = 1 – Type II error (or 1 – Beta)

– Higher power, bigger sample size needed.

– Typically, power is set at 80%.
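The z-values that appear in the sample size formulas later in these slides (1.96 and 0.84) follow directly from the chosen alpha and power. A minimal sketch using Python's standard library; the helper names are illustrative assumptions:

```python
from statistics import NormalDist


def z_for_alpha(alpha: float) -> float:
    """Z(1-alpha/2): two-sided critical value for a Type I error rate alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_for_power(power: float) -> float:
    """Z(1-beta): z-value for the desired power, where power = 1 - beta."""
    return NormalDist().inv_cdf(power)


print(round(z_for_alpha(0.05), 2))  # 1.96
print(round(z_for_power(0.80), 2))  # 0.84
print(round(z_for_power(0.90), 2))  # 1.28
```

Raising the power from 80% to 90% raises the z-value from 0.84 to 1.28, which is why higher power calls for a bigger sample.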

Page 18: 9. Selecting a sample

RECALL! ONE- VS. TWO-TAILED

• Also called one-sided or two-sided tests.

• One-tailed test:

– Only ONE assumption/direction: negative OR positive.

– Researcher is very confident of the outcome.

– Unusual in health or medical studies.

– E.g.: Does treatment A enhance patient satisfaction?

• Two-tailed test:

– BOTH negative and positive assumptions/directions are allowed.

– Outcome can be either good or bad.

– More conservative (‘safe’).

– Requires a bigger sample size.

– E.g.: Does treatment A alter patient satisfaction?
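One way to see why a two-tailed test needs a bigger sample: at the same alpha, its critical z-value is larger, because alpha is split across both tails. A short sketch under that assumption (variable names are illustrative):

```python
from statistics import NormalDist

alpha = 0.05

# One-tailed: all of alpha sits in a single tail.
z_one_tailed = NormalDist().inv_cdf(1 - alpha)

# Two-tailed: alpha is split equally between the two tails.
z_two_tailed = NormalDist().inv_cdf(1 - alpha / 2)

print(round(z_one_tailed, 3))  # 1.645
print(round(z_two_tailed, 3))  # 1.96
```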

Page 19: 9. Selecting a sample

TYPES OF CALCULATION

• Sample size for ESTIMATION.

• Sample size for HYPOTHESIS TESTING.

• For estimation:

– To produce an estimate with an acceptable CI.

– Too large: very narrow CI (over-precise).

– Too small: very wide CI (imprecise).

• For hypothesis testing:

– To enable the study to have acceptable power to detect significant differences between two groups.

– Too large: statistically significant differences although the clinical difference is very small (not practically important).

– Too small: no statistically significant difference although the clinical difference is large (cannot show a difference although it is important).

Page 20: 9. Selecting a sample

1) n FOR ESTIMATION

• Applies to studies which plan to estimate:

a) Means of variables

– Example of study aim = To estimate the MEAN stress score among patients in Hospital ABC.

b) Proportions of variables

– Example of study aim = To estimate the PROPORTION of obesity among sumo wrestlers.

Page 21: 9. Selecting a sample

1a) ESTIMATION OF MEANS

• Use the SINGLE MEAN FORMULA:

$$n = \left( \frac{Z_{1-\alpha/2} \, \sigma}{\Delta} \right)^2$$

– n = sample size required

– Z 1-α/2 = z-value for the desired confidence level (for a 95% CI, Z = 1.96)

– σ = standard deviation (from a previous/pilot study)

– Δ = precision (decided by the researcher)

Page 22: 9. Selecting a sample

1a) EXAMPLE: Mean Estimate

• To estimate the mean weekly exercise duration of UniSZA athletes, how many subjects are needed?

1. At 95% CI, Z = 1.96

2. Standard deviation, σ, from a previous study = 43 mins

3. Precision, Δ, decided at = 10 mins

4. Calculate:

$$n = \left( \frac{1.96 \times 43}{10} \right)^2 \approx 71 \text{ athletes}$$
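The calculation above can be checked with a short sketch of the single mean formula, n = (Z × σ / Δ)². The function name is an illustrative assumption, and the value is returned unrounded so the rounding choice stays visible:

```python
def n_single_mean(z: float, sd: float, precision: float) -> float:
    """Unrounded sample size for estimating one mean: (z * sd / precision) ** 2."""
    if precision <= 0:
        raise ValueError("precision must be positive")
    return (z * sd / precision) ** 2


# Inputs from the slide: 95% CI, SD = 43 mins, precision = 10 mins.
n = n_single_mean(z=1.96, sd=43, precision=10)
print(round(n, 2))  # 71.03
```

The slide reports this as 71 athletes. Many texts round sample sizes up instead, which would give 72 here.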

Page 23: 9. Selecting a sample

1b) ESTIMATION OF PROPORTIONS

• Use the SINGLE PROPORTION FORMULA:

$$n = \left( \frac{Z_{1-\alpha/2}}{\Delta} \right)^2 P \, (1 - P)$$

– n = sample size required

– Z 1-α/2 = z-value for the desired confidence level (for a 95% CI, Z = 1.96)

– P = proportion with the disease (from a previous/pilot study)

– Δ = precision (decided by the researcher)

Page 24: 9. Selecting a sample

1b) EXAMPLE: Proportion Estimate

• To estimate the proportion of adults with metabolic syndrome, how many subjects are needed?

1. At 95% CI, Z = 1.96

2. Proportion, P, from a previous study = 40%

3. Precision, Δ, decided at = 2.5%

4. Calculate:

$$n = \left( \frac{1.96}{0.025} \right)^2 \times 0.4 \, (1 - 0.4) \approx 1{,}476 \text{ adults}$$
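A matching sketch for the single proportion formula, n = (Z / Δ)² × P(1 − P). The function name is an illustrative assumption; proportions are passed as fractions rather than percentages:

```python
import math


def n_single_proportion(z: float, p: float, precision: float) -> float:
    """Unrounded sample size for estimating one proportion."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if precision <= 0:
        raise ValueError("precision must be positive")
    return (z / precision) ** 2 * p * (1 - p)


# Inputs from the slide: 95% CI, P = 40%, precision = 2.5%.
n = n_single_proportion(z=1.96, p=0.40, precision=0.025)
print(math.ceil(n))  # 1476
```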

Page 25: 9. Selecting a sample

ISSUES IN ESTIMATIONS

• If a proportion, report as ‘Prevalence’.

• Interpretation:

– Prevalence of obesity = 37% (95% CI: 27%, 47%)

– The proportion of obese individuals in our study group is 37%, and we are 95% confident that the proportion of obese individuals in the population lies between 27% and 47%.

• Deciding the precision, Δ, is the hardest part.

• If past research reported a 95% CI for the same variable, then the precision is HALF of the 95% CI WIDTH.
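Applying the half-width rule to the obesity interval quoted above (27% to 47%) gives a precision of 10 percentage points. A tiny sketch; the function name is an illustrative assumption:

```python
def precision_from_ci(lower: float, upper: float) -> float:
    """Precision taken as half of the reported confidence interval width."""
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    return (upper - lower) / 2


# 95% CI from the slide: 27% to 47%, written as fractions.
print(round(precision_from_ci(0.27, 0.47), 2))  # 0.1
```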

Page 26: 9. Selecting a sample

2) n FOR HYPOTHESIS TESTING

• Applies to studies which plan to compare:

a) Means between two groups

– Example of study aim = To compare mean systolic BP between patients with and without stroke.

b) Proportions between two groups

– Example of study aim = To compare the prevalence (proportion) of obese male and female athletes.

Page 27: 9. Selecting a sample

2a) COMPARING MEANS

$$n = \frac{2 \, \sigma^2}{\Delta^2} \times \left( Z_{1-\alpha/2} + Z_{1-\beta} \right)^2$$

– n = sample size required per group

– SD or σ = standard deviation (from a previous/pilot study)

– Z 1-α/2 = 1.96 if Type I error (alpha) is at 5% (2-sided)

– Z 1-β = 0.84 if Type II error (beta) is at 20% (power = 80%)

– Δ = detectable difference (decided by the researcher)

Page 28: 9. Selecting a sample

2a) EXAMPLE: Compare Means

• To compare mean systolic BP between patients with stroke (A) vs. patients without stroke (B):

1. SD (from a previous study) = 10 mmHg

2. Detectable difference, Δ, decided at = 5 mmHg

3. Type I error at 5%; Z 1-α/2 = 1.96

4. Type II error at 20%; Power = 80%; Z 1-β = 0.84

5. Calculate:

$$n = \frac{2 \, (10^2)}{5^2} \times (1.96 + 0.84)^2 = 62.72 \approx 63 \text{ per group}$$
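A minimal sketch of the per-group sample size for comparing two means, written in the standard form in which the detectable difference Δ is squared: n = 2 × SD² × (Z 1-α/2 + Z 1-β)² / Δ². The function name and the default z-values (5% two-sided alpha, 80% power) are illustrative assumptions. With SD = 10 mmHg and Δ = 5 mmHg, this form gives about 63 subjects per group.

```python
import math


def n_two_means(sd: float, delta: float,
                z_alpha: float = 1.96, z_beta: float = 0.84) -> float:
    """Unrounded per-group sample size for comparing two means."""
    if delta == 0:
        raise ValueError("detectable difference must be non-zero")
    return 2 * sd ** 2 / delta ** 2 * (z_alpha + z_beta) ** 2


# Inputs from the slide: SD = 10 mmHg, detectable difference = 5 mmHg.
n = n_two_means(sd=10, delta=5)
print(round(n, 2))   # 62.72
print(math.ceil(n))  # 63
```

Halving the detectable difference quadruples the required sample, because Δ enters the formula squared.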

Page 29: 9. Selecting a sample

2b) COMPARING PROPORTIONS

$$n = \frac{p_1 (1 - p_1) + p_0 (1 - p_0)}{(p_1 - p_0)^2} \times \left( Z_{1-\alpha/2} + Z_{1-\beta} \right)^2$$

– n = sample size required per group

– p1 = proportion in Group 1 (from a previous/pilot study)

– p0 = proportion in Group 2 (from a previous/pilot study, or decided by the researcher)

– Z 1-α/2 = 1.96 if Type I error (alpha) is at 5% (2-sided)

– Z 1-β = 0.84 if Type II error (beta) is at 20% (power = 80%)

– Δ = detectable difference, p1 – p0 (decided by the researcher)

Page 30: 9. Selecting a sample

2b) EXAMPLE: Compare Proportion (i)

• To compare the prevalence of obesity between male and female athletes:

1. p1 = 27% (from a previous/pilot study)

2. p0 = 37% (expected to be 10 percentage points more than p1)

3. Z 1-α/2 = 1.96 for Type I error at 5% (2-sided)

4. Z 1-β = 0.84 for Type II error at 20% (power = 80%)

5. Δ = 10% (detectable difference, clinically important)

Page 31: 9. Selecting a sample

2b) EXAMPLE: Compare Proportion (ii)

$$n = \frac{p_1 (1 - p_1) + p_0 (1 - p_0)}{(p_1 - p_0)^2} \times \left( Z_{1-\alpha/2} + Z_{1-\beta} \right)^2$$

$$n = \frac{0.27 \, (1 - 0.27) + 0.37 \, (1 - 0.37)}{(0.27 - 0.37)^2} \times (1.96 + 0.84)^2 \approx 337$$

= 337 athletes per group, i.e. 337 males and 337 females
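The same calculation as a sketch in code. The function name and the default z-values (5% two-sided alpha, 80% power) are illustrative assumptions; proportions are passed as fractions:

```python
def n_two_proportions(p1: float, p0: float,
                      z_alpha: float = 1.96, z_beta: float = 0.84) -> float:
    """Unrounded per-group sample size for comparing two proportions."""
    if p1 == p0:
        raise ValueError("p1 and p0 must differ")
    variance_sum = p1 * (1 - p1) + p0 * (1 - p0)
    return variance_sum / (p1 - p0) ** 2 * (z_alpha + z_beta) ** 2


# Inputs from the slides: p1 = 27%, p0 = 37%.
n = n_two_proportions(p1=0.27, p0=0.37)
print(round(n, 2))  # 337.28
```

The slide reports 337 per group. Rounding up, as many texts recommend, would give 338. The formula is symmetric in p1 and p0, so swapping the two groups does not change the result.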

Page 32: 9. Selecting a sample

REMEMBER!

• Type I error (alpha) – False Positive

• Type II error (beta) – False Negative

• Power of Study (1-beta) = True Positive

• Detectable Difference (DD) = Detectable Alternative

Page 33: 9. Selecting a sample

INFLUENTIAL FACTORS

http://stattrek.com/sample-size/simple-random-sample.aspx

• Cost considerations (e.g., maximum budget, desire to minimize cost).

• Administrative concerns (e.g., complexity of the design, research deadlines).

• Minimum acceptable level of precision.

• Confidence level.

• Variability within the population or subpopulation (e.g., stratum, cluster) of interest.

• Sampling method.

Page 34: 9. Selecting a sample

Thank You