Hypothesis Testing and Sample Size Calculation
Po Chyou, Ph. D.
Director, BBC
Hypothesis Testing on
• Population mean(s)
• Population median(s)
• Population proportion(s)
• Population variance(s)
• Population correlation(s)
• Association based on contingency table(s)
• Coefficients based on regression model
• Odds ratio
• Relative risk
• Trend analysis
• Survival distribution(s) / curve(s)
• Goodness of fit
Hypothesis Testing
1. Definition of a Hypothesis
An assumption made for the sake of argument
2. Establishing Hypotheses
Null hypothesis - H0
Alternative hypothesis - Ha
3. Testing Hypotheses
Is H0 true or not?
Hypothesis Testing
4. Type I and Type II Errors
Type I error: we reject H0 but H0 is true
α = Pr(reject H0 | H0 is true) = Pr(Type I error) = level of significance in hypothesis testing
Type II error: we accept H0 but H0 is false
β = Pr(accept H0 | H0 is false) = Pr(Type II error)
Hypothesis Testing
5. Steps of Hypothesis Testing
- Step 1 Formulate the null hypothesis H0 in statistical terms
- Step 2 Formulate the alternative hypothesis Ha in statistical terms
- Step 3 Set the level of significance α and the sample size n
- Step 4 Select the appropriate statistic and the rejection region R
- Step 5 Collect the data and calculate the statistic
Hypothesis Testing
5. Steps of Hypothesis Testing (continued)
- Step 6 If the calculated statistic falls in the rejection region R, reject H0 in favor of Ha; if the calculated statistic falls outside R, do not reject H0
[Figure: standard normal density for a two-tailed test, with area α/2 in each tail. Reject H0 when Z < –Z α/2 or Z > Z α/2; do not reject H0 otherwise.]
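The Step 6 decision rule for a two-tailed Z test can be sketched in a few lines of Python. This is only an illustration: the function name `two_tailed_decision` is hypothetical, and the critical value is taken from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist


def two_tailed_decision(z: float, alpha: float = 0.05) -> str:
    """Apply Step 6 to a calculated Z statistic (two-tailed test)."""
    # Critical value Z_{alpha/2}: upper alpha/2 point of the standard normal.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Rejection region R is |Z| > Z_{alpha/2}.
    if abs(z) > z_crit:
        return "reject H0"
    return "do not reject H0"


critical_value = NormalDist().inv_cdf(1 - 0.05 / 2)
print(round(critical_value, 2))      # 1.96
print(two_tailed_decision(2.5))      # reject H0
print(two_tailed_decision(-1.0))     # do not reject H0
```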
Hypothesis Testing
6. An Example
A random sample of 400 persons included 240 smokers and 160 non-smokers. Of the smokers, 192 had CHD, while only 32 non-smokers had CHD.
Could a health insurance company claim that the proportion of smokers having CHD differs from the proportion of non-smokers having CHD?

Notation:

| | CHD | No CHD | Total |
|---|---|---|---|
| Smokers | x1 | n1 - x1 | n1 |
| Non-Smokers | x2 | n2 - x2 | n2 |
| Total | | | n = n1 + n2 |

Observed data:

| | CHD | No CHD | Total |
|---|---|---|---|
| Smokers | 192 | 48 | 240 |
| Non-Smokers | 32 | 128 | 160 |
| Total | | | 400 |
Hypothesis Testing
Example (continued)
Let P1 = the true proportion of smokers having CHD
P2 = the true proportion of non-smokers having CHD
- Step 1 H0 : P1 = P2
- Step 2 Ha : P1 ≠ P2
- Step 3 α = .05, n = 400
Hypothesis Testing
Example (continued)
- Step 4 statistic:
$$Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1-\hat{P})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where $\hat{P}_1 = \frac{x_1}{n_1}$, $\hat{P}_2 = \frac{x_2}{n_2}$, and $\hat{P} = \frac{x_1 + x_2}{n_1 + n_2}$ are the sample proportions
[Figure: standard normal density with α/2 = .025 in each tail. Reject H0 when Z < –Z.025 = -1.96 or Z > Z.025 = 1.96; do not reject H0 otherwise.]
Hypothesis Testing
Example (continued)
- Step 5
$$\hat{P}_1 = \frac{x_1}{n_1} = \frac{192}{240} = .80$$
$$\hat{P}_2 = \frac{x_2}{n_2} = \frac{32}{160} = .20$$
$$\hat{P} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{192 + 32}{240 + 160} = \frac{224}{400} = .56$$
$$Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1-\hat{P})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{.80 - .20}{\sqrt{(.56)(1-.56)\left(\frac{1}{240} + \frac{1}{160}\right)}} = \frac{.60}{.05066} = 11.84 > 1.96$$
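As a check on the arithmetic in Step 5, the pooled two-proportion Z statistic can be computed directly from the four counts. The sketch below uses only the standard library; the function name `two_proportion_z` is a hypothetical helper, not something named in the slides.

```python
from math import sqrt


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Z statistic for H0: P1 = P2, using the pooled proportion."""
    p1_hat = x1 / n1                      # sample proportion, group 1
    p2_hat = x2 / n2                      # sample proportion, group 2
    p_pooled = (x1 + x2) / (n1 + n2)      # pooled proportion under H0
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


# Smokers: 192 of 240 with CHD; non-smokers: 32 of 160 with CHD.
z = two_proportion_z(192, 240, 32, 160)
print(round(z, 2))   # 11.84
print(z > 1.96)      # True, so the statistic falls in the rejection region
```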
Hypothesis Testing
Example (continued)
- Step 6
Reject H0 and conclude that smokers had a significantly higher proportion of CHD than non-smokers.
[P-value < .0000001]
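The reported bound on the P-value can be verified numerically. The sketch below assumes the two-sided P-value 2·(1 − Φ(|Z|)) and writes it with `math.erfc`, because computing `1 - cdf` directly would round to zero for a statistic this large. The function name `two_sided_p_value` is hypothetical.

```python
from math import erfc, sqrt


def two_sided_p_value(z: float) -> float:
    """Two-sided P-value for a standard normal statistic."""
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)); erfc keeps tiny tails accurate.
    return erfc(abs(z) / sqrt(2))


p = two_sided_p_value(11.84)
print(p < 1e-7)                            # True
print(round(two_sided_p_value(1.96), 3))   # 0.05
```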
Hypothesis Testing
7. Contingency Table Analysis
The chi-square distribution (χ²)
[Figure: chi-square density with 1 degree of freedom and α = .05. Reject H0 when χ² > χ².05, 1 = 3.841; do not reject H0 otherwise.]
Hypothesis Testing
Equation for chi-square for a contingency table
For i = 1, 2 and j = 1, 2
$$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(O_{11} - E_{11})^2}{E_{11}} + \frac{(O_{12} - E_{12})^2}{E_{12}} + \frac{(O_{21} - E_{21})^2}{E_{21}} + \frac{(O_{22} - E_{22})^2}{E_{22}}$$
Hypothesis Testing
Equation for chi-square for a contingency table (cont.)
$$E_{11} = \frac{n_1 m_1}{n}, \qquad E_{12} = n_1 - \frac{n_1 m_1}{n} = \frac{n_1 m_2}{n}$$
$$E_{21} = \frac{n_2 m_1}{n}, \qquad E_{22} = n_2 - \frac{n_2 m_1}{n} = \frac{n_2 m_2}{n}$$

Observed counts:

| | j = 1 | j = 2 | Total |
|---|---|---|---|
| i = 1 | O11 | O12 | n1 |
| i = 2 | O21 | O22 | n2 |
| Total | m1 | m2 | n = n1 + n2 = m1 + m2 |

Expected counts:

| | j = 1 | j = 2 | Total |
|---|---|---|---|
| i = 1 | E11 | E12 | n1 |
| i = 2 | E21 | E22 | n2 |
| Total | m1 | m2 | |
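The expected-count formulas (row total times column total, divided by n) can be sketched as a small function. The helper name `expected_counts` is hypothetical, and the input used here is the smoker/CHD table from the earlier example.

```python
def expected_counts(observed):
    """Expected cell counts E_ij = (row total i) * (column total j) / n."""
    row_totals = [sum(row) for row in observed]          # n1, n2
    col_totals = [sum(col) for col in zip(*observed)]    # m1, m2
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: smokers, non-smokers. Columns: CHD, no CHD.
expected = expected_counts([[192, 48], [32, 128]])
for row in expected:
    print([round(e, 1) for e in row])
# [134.4, 105.6]
# [89.6, 70.4]
```

Each row of expected counts sums to the observed row total, which is why E12 and E22 can also be obtained by subtraction as on the slide.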
Hypothesis Testing
Example : Same as before
- Step 1 H0 : there is no association between smoker status and CHD
- Step 2 Ha : there is an association between smoker status and CHD
- Step 3 α = .05, n = 400
- Step 4 statistic:
$$\chi^2 = \frac{(O_{11} - E_{11})^2}{E_{11}} + \frac{(O_{12} - E_{12})^2}{E_{12}} + \frac{(O_{21} - E_{21})^2}{E_{21}} + \frac{(O_{22} - E_{22})^2}{E_{22}}$$
Hypothesis Testing
Example (continued) : Same as before
[Figure: chi-square density with 1 degree of freedom and α = .05. Reject H0 when χ² > χ².05, 1 = 3.841; do not reject H0 otherwise.]
Hypothesis Testing
Example (continued) : Same as before
- Step 5

Observed counts (notation):

| | CHD | No CHD | Total |
|---|---|---|---|
| Smokers | O11 | O12 | n1 |
| Non-Smokers | O21 | O22 | n2 |
| Total | m1 | m2 | n |

Observed counts (data):

| | CHD | No CHD | Total |
|---|---|---|---|
| Smokers | 192 | 48 | 240 |
| Non-Smokers | 32 | 128 | 160 |
| Total | 224 | 176 | 400 |

Expected counts (notation):

| | CHD | No CHD |
|---|---|---|
| Smokers | E11 | E12 |
| Non-Smokers | E21 | E22 |
Hypothesis Testing
Example (continued) : Same as before
- Step 5 (continued)
$$E_{11} = \frac{n_1 m_1}{n} = \frac{240 \times 224}{400} = 134.4$$
$$E_{12} = n_1 - \frac{n_1 m_1}{n} = 240 - 134.4 = 105.6$$
$$E_{21} = \frac{n_2 m_1}{n} = \frac{160 \times 224}{400} = 89.6$$
$$E_{22} = n_2 - \frac{n_2 m_1}{n} = 160 - 89.6 = 70.4$$

Expected counts:

| | CHD | No CHD |
|---|---|---|
| Smokers | 134.4 | 105.6 |
| Non-Smokers | 89.6 | 70.4 |
Hypothesis Testing
Example (continued) : Same as before
- Step 5 (continued)
$$\chi^2 = \frac{(O_{11} - E_{11})^2}{E_{11}} + \frac{(O_{12} - E_{12})^2}{E_{12}} + \frac{(O_{21} - E_{21})^2}{E_{21}} + \frac{(O_{22} - E_{22})^2}{E_{22}}$$
$$= \frac{(192 - 134.4)^2}{134.4} + \frac{(48 - 105.6)^2}{105.6} + \frac{(32 - 89.6)^2}{89.6} + \frac{(128 - 70.4)^2}{70.4}$$
$$\approx 24.69 + 31.42 + 37.03 + 47.13$$
$$\approx 140.26 > 3.841$$
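The chi-square sum above can be reproduced with a short sketch that loops over the four cells. The function name `chi_square_statistic` is a hypothetical helper; the observed and expected counts are the ones given in the example.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_ij - E_ij)^2 / E_ij over all cells of the table."""
    total = 0.0
    for o_row, e_row in zip(observed, expected):
        for o, e in zip(o_row, e_row):
            total += (o - e) ** 2 / e
    return total


observed = [[192, 48], [32, 128]]
expected = [[134.4, 105.6], [89.6, 70.4]]
chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 2))   # 140.26
print(chi2 > 3.841)     # True, so the statistic falls in the rejection region
```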
Hypothesis Testing
Example (continued) : Same as before
- Step 6
Reject H0 and conclude that there is an association between smoker status and CHD.
[P-value < .0000001]
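For a 2 × 2 table, the chi-square statistic has 1 degree of freedom and equals the square of the pooled two-proportion Z statistic, so both analyses of this example lead to the same P-value. The sketch below illustrates this with the example counts. The helper name `chi2_1df_p_value` is hypothetical, and the tail formula holds only for 1 degree of freedom.

```python
from math import erfc, sqrt


def chi2_1df_p_value(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # With 1 df the variable is a squared standard normal,
    # so P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))


# Pooled Z statistic for 192/240 versus 32/160, as in the earlier example.
p_pooled = (192 + 32) / (240 + 160)
z = (192 / 240 - 32 / 160) / sqrt(p_pooled * (1 - p_pooled) * (1 / 240 + 1 / 160))

print(round(z ** 2, 2))                        # 140.26, matching the chi-square value
print(round(chi2_1df_p_value(3.841), 3))       # 0.05
print(chi2_1df_p_value(140.26) < 1e-7)         # True
```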
Sample Size Estimation and Statistical Power Calculation
Definition of Power
Recall:
β = Pr(accept H0 | H0 is false) = Pr(Type II error)
Power = 1 - β = Pr(reject H0 | H0 is false)
Sample Size Estimation for Intervention on Tick Bites Among Campers
Assumptions
1. Given that the proportion (PCON) of tick bites among campers in the control group is constant.
2. Given that the proportion (PINT) of tick bites among campers in the intervention group is reduced by 50% compared to that of the control group after intervention has been implemented.
3. Given that a one- or two-tailed test is of interest with 80% power and a type-I error of 5%.
Sample Size Estimation for Intervention on Tick Bites Among Campers
Summary Table 1

| PCON | PINT | Required N for each group (two-tailed) | Required N for each group (one-tailed) |
|---|---|---|---|
| .01 | .005 (50% reduction) | 4500 | 3600 |
| .05 | .025 (50% reduction) | 1170 | 922 |
| .10 | .050 (50% reduction) | 475 | 374 |
| .15 | .075 (50% reduction) | 305 | 240 |
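The slides do not state which formula or table produced these sample sizes. As a rough illustration only, the sketch below uses one common normal-approximation formula for comparing two proportions with equal group sizes and no continuity correction. Its results need not match the table above, and the function name `n_per_group` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p_con, p_int, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate N per group to detect p_con versus p_int (assumed formula)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_tailed else z(1 - alpha)
    z_beta = z(power)
    p_bar = (p_con + p_int) / 2
    # Null-hypothesis term uses the average proportion; the alternative
    # term uses each group's own variance.
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p_con * (1 - p_con) + p_int * (1 - p_int))) ** 2
    return ceil(numerator / (p_con - p_int) ** 2)


for p_con in (0.01, 0.05, 0.10, 0.15):
    p_int = p_con / 2   # 50% reduction, as in assumption 2
    print(p_con, p_int,
          n_per_group(p_con, p_int, two_tailed=True),
          n_per_group(p_con, p_int, two_tailed=False))
```

Whatever formula is used, the pattern in the table holds: the rarer the outcome in the control group, the larger the sample needed to detect the same relative reduction, and a one-tailed test needs fewer subjects than a two-tailed one.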
Statistical Power Calculation for Intervention on Obesity of Women in MESA
Assumptions
1. Given that the proportion (PCON) of women who are obese at baseline (i.e., the control group) is constant. There are a total of 840 women in the control group. Based on our preliminary data analysis results, approximately 50% of these 840 women at baseline are obese (BMI >= 27.3).
2. Given that the proportion (PINT) of women who are obese in the intervention group is reduced by 5% or more compared to that of the control group after intervention has been implemented. There are a total of 680 women who had been newly recruited. Based on our preliminary data analysis results, 50% of these 680 newly recruited women are obese. Assuming that 60% of these women agree to participate, we will have 200 women to be targeted for intervention.
Statistical Power Calculation for Intervention on Obesity of Women in MESA (continued)
Assumptions
3. Given that a one-tailed test is of interest with a type-I error of 5%, then the estimated statistical powers are shown in Table 1 for detecting a difference of 5% or more in the proportion of obesity between the control group and the intervention group.

Table 1

| PCON (n=840) | PINT (n=680) | Difference | Power |
|---|---|---|---|
| .50 | .45 | .05 | 61% |
| .50 | .44 | .06 | 75% |
| .50 | .43 | .07 | 85% |
| .50 | .42 | .08 | 92% |
| .50 | .41 | .09 | 96% |
| .50 | .40 | .10 | 98% |
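The slides do not say how the powers in Table 1 were computed. The sketch below uses a simple normal approximation for a one-tailed comparison of two independent proportions, with the unpooled standard error, as one plausible way to obtain figures of this kind. Its output may differ slightly from the tabulated values, and the function name `power_one_tailed` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_one_tailed(p_con, p_int, n_con, n_int, alpha=0.05):
    """Approximate power of a one-tailed test of p_con versus p_int."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha)
    # Standard error of the difference in sample proportions (unpooled).
    se = sqrt(p_con * (1 - p_con) / n_con + p_int * (1 - p_int) / n_int)
    # Power = Pr(reject H0 | H0 is false) under the normal approximation.
    return nd.cdf(abs(p_con - p_int) / se - z_alpha)


powers = [power_one_tailed(0.50, p_int, 840, 680)
          for p_int in (0.45, 0.44, 0.43, 0.42, 0.41, 0.40)]
for p_int, pw in zip((0.45, 0.44, 0.43, 0.42, 0.41, 0.40), powers):
    print(p_int, f"{pw:.0%}")
```

As in the table, power rises steadily as the difference between the two proportions grows from .05 to .10.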
Reference
Jacob Cohen, “Statistical Power Analysis for the Behavioral Sciences”, Academic Press, 1977
Take Home Message:
• You’ve got questions: Data? STATISTICS? ...
• Contact Biostatistics and consult with an experienced biostatistician
– Po Chyou, Director, Senior Biostatistician (ext. 9-4776)
– Dixie Schroeder, Secretary (ext. 1-7266)
OR
• Do it at your own risk
Free Handout