12.2 (13.2) Comparing Two Proportions
The Sampling Distribution of $\hat{\rho}_1 - \hat{\rho}_2$
• When the samples are large, the distribution of $\hat{\rho}_1 - \hat{\rho}_2$ is approximately Normal.
• If the sampling distributions of both $\hat{\rho}_1$ and $\hat{\rho}_2$ are Normal, then the sampling distribution of $\hat{\rho}_1 - \hat{\rho}_2$ is approximately Normal.
• Conditions: $n_1\hat{\rho}_1 \ge 10$, $n_1(1 - \hat{\rho}_1) \ge 10$, $n_2\hat{\rho}_2 \ge 10$, $n_2(1 - \hat{\rho}_2) \ge 10$
• In fact, we are safe performing significance tests based on $\hat{\rho}_1 - \hat{\rho}_2$ when all values are at least 5.
Example
• The movie A Civil Action tells the story of a major legal battle that took place in the small town of Woburn, Massachusetts. A town well that supplied water to East Woburn residents was contaminated by industrial chemicals. During the period that the residents drank the water from this well, a sample of 414 births showed 16 birth defects. On the West side of Woburn a sample of 228 babies born during the same period revealed 3 birth defects. The plaintiffs suing the companies responsible for the contamination claimed that these data show that the rate of birth defects was significantly higher in East Woburn, where the contaminated well water was in use. How strong is the evidence supporting this claim? What should the judge for this case conclude?
Example continued
• Construct a 95% confidence interval for ρ1 – ρ2.
• Step 1 We want to compare the rates of birth defects in East and West Woburn.
– Populations of interest: The first population is the babies born in East Woburn in the time period in question. The second population is the babies born in West Woburn in the same time period.
– Parameters of interest: ρ1 = the proportion of all East Woburn babies born with birth defects. ρ2 = the proportion of all West Woburn babies born with birth defects.
– Hypotheses: H0: ρ1 = ρ2, Ha: ρ1 > ρ2
– Sample proportions: $\hat{\rho}_1 = \frac{16}{414} = 0.0386$, $\hat{\rho}_2 = \frac{3}{228} = 0.0132$
Example Step 2
• SRS – We do not have enough information on how the babies in the samples were selected. We will assume the two samples are SRSs.
• Normality
$n_1\hat{\rho}_1 = 414(0.0386) = 15.98$, $n_1(1 - \hat{\rho}_1) = 398.02$
$n_2\hat{\rho}_2 = 228(0.0132) = 3.01$, $n_2(1 - \hat{\rho}_2) = 224.99$
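We may have a problem with $n_2\hat{\rho}_2 = 3.01$, since we would like to have at least 5 successes (birth defects) in each sample. Ideally we would increase the West Woburn sample size to about 379, where the observed rate of 0.0132 would give 5 expected successes. We will note the problem and proceed with caution.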
• Independence – The two samples are independent. We also assume that the number of births in each part of town during the time period is at least ten times the corresponding sample size.
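The Normality check above can be reproduced with a short Python sketch. The function name `expected_counts` and the variable names are illustrative choices, not part of the original slides; only the counts (16 of 414, 3 of 228) and the "at least 5" rule come from the example. Because the code uses unrounded sample proportions, it reports 16 and 3 successes rather than the slide's 15.98 and 3.01.

```python
import math

def expected_counts(successes, n):
    """Return (n * p_hat, n * (1 - p_hat)) for a single sample."""
    p_hat = successes / n
    return n * p_hat, n * (1 - p_hat)

# Woburn data from the example: (birth defects, births sampled)
east = expected_counts(16, 414)
west = expected_counts(3, 228)

threshold = 5  # the "at least 5" rule quoted for significance tests
for name, counts in (("East", east), ("West", west)):
    ok = all(c >= threshold for c in counts)
    print(name, [round(c, 2) for c in counts], "OK" if ok else "too small")

# Sample size at which West Woburn would show 5 expected defects,
# assuming its observed rate of 0.0132 (rounded as on the slide) held.
needed_n2 = math.ceil(5 / 0.0132)
print(needed_n2)
```

The West Woburn sample is the only one that falls short, which is why the example proceeds with caution.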
Example Step 3
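From Table C, z* = 1.96
$(\hat{\rho}_1 - \hat{\rho}_2) \pm z^* \sqrt{\frac{\hat{\rho}_1(1 - \hat{\rho}_1)}{n_1} + \frac{\hat{\rho}_2(1 - \hat{\rho}_2)}{n_2}}$
$= (0.0386 - 0.0132) \pm 1.96 \sqrt{\frac{(0.0386)(0.9614)}{414} + \frac{(0.0132)(0.9868)}{228}}$
$= 0.0254 \pm 0.0237$
$= (0.0017, 0.0491)$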
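As a cross-check on the interval calculation, here is a minimal Python sketch using only the standard library. The function name `two_proportion_ci` and its parameters are hypothetical names chosen for illustration. It takes z* from `statistics.NormalDist` instead of Table C and keeps unrounded proportions, so its endpoints agree with the hand calculation only to about three decimal places.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample z interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    # Critical value z*: about 1.96 for 95% confidence
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# East Woburn: 16 defects in 414 births; West Woburn: 3 in 228
low, high = two_proportion_ci(16, 414, 3, 228)
print(round(low, 4), round(high, 4))
```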
Example Step 4
• Step 4 Interpretation
We are 95% confident that the difference between the birth defect rates for East and West Woburn (East minus West) is between 0.17 and 4.91 percentage points. The interval does not include 0, so we reject H0. There is some evidence for a positive difference between the birth defect rates, supporting the plaintiffs' case. However, the Normality condition was not fully met (only about 3 birth defects in the West Woburn sample), which weakens the strength of this evidence.
Example
• Using the data from the previous problem, A Civil Action, we will perform a significance test comparing the two proportions.
• Step 1 We want to compare the rates of birth defects in East and West Woburn.
– Populations of interest: The first population is the babies born in East Woburn in the time period in question. The second population is the babies born in West Woburn in the same time period.
– Parameters of interest: ρ1 = the proportion of all East Woburn babies born with birth defects. ρ2 = the proportion of all West Woburn babies born with birth defects.
– Hypotheses: H0: ρ1 = ρ2, Ha: ρ1 > ρ2
Example Step 2
• SRS – We do not have enough information on how the babies in the samples were selected. We will assume the two samples are SRSs.
• Normality
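We will analyze Normality based on the combined proportion.
$\hat{\rho}_1 = \frac{16}{414} = 0.0386$, $\hat{\rho}_2 = \frac{3}{228} = 0.0132$, $\hat{\rho}_c = \frac{16 + 3}{414 + 228} = 0.0296$
$n_1\hat{\rho}_c = (414)(0.0296) = 12.25$, $n_1(1 - \hat{\rho}_c) = (414)(0.9704) = 401.75$
$n_2\hat{\rho}_c = (228)(0.0296) = 6.75$, $n_2(1 - \hat{\rho}_c) = (228)(0.9704) = 221.25$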
Since all values are larger than 5 we are safe to use a Normal approximation.
• Independence – The two samples are independent. We also assume that the number of births in each part of town during the time period is at least ten times the corresponding sample size.
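The pooled Normality check can be sketched the same way in Python. The function name `pooled_expected_counts` and the dictionary keys are illustrative assumptions; the counts come from the example. Rounded to two decimals, the results should line up with the four values computed by hand above.

```python
def pooled_expected_counts(x1, n1, x2, n2):
    """Expected successes and failures in each sample under H0,
    using the combined (pooled) sample proportion."""
    p_c = (x1 + x2) / (n1 + n2)
    return {
        "n1*p_c": n1 * p_c,
        "n1*(1-p_c)": n1 * (1 - p_c),
        "n2*p_c": n2 * p_c,
        "n2*(1-p_c)": n2 * (1 - p_c),
    }

# East Woburn: 16 defects in 414 births; West Woburn: 3 in 228
counts = pooled_expected_counts(16, 414, 3, 228)
for label, value in counts.items():
    print(label, round(value, 2))

# The test is considered safe when every expected count is at least 5
print(all(value >= 5 for value in counts.values()))
```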
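Example Step 3
Calculations
Test Statistic:
$z = \frac{\hat{\rho}_1 - \hat{\rho}_2}{\sqrt{\hat{\rho}_c(1 - \hat{\rho}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.0386 - 0.0132}{\sqrt{(0.0296)(0.9704)\left(\frac{1}{414} + \frac{1}{228}\right)}} = 1.82$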
P-value
Right-tailed test. Use Table A.
P-value = P(Z > 1.82) = 1 – P(Z < 1.82) = 1 – 0.9656 = 0.0344
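The test statistic and P-value can be checked with a small Python sketch. The function name `two_proportion_z_test` is a hypothetical name for illustration, and `statistics.NormalDist` stands in for the Table A lookup. Since the code does not round z before finding the tail area, its P-value is close to, but not exactly, the table-based 0.0344.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Right-tailed z test of H0: p1 = p2 against Ha: p1 > p2,
    using the pooled proportion in the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)
    se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value

# East Woburn: 16 defects in 414 births; West Woburn: 3 in 228
z, p_value = two_proportion_z_test(16, 414, 3, 228)
print(round(z, 2), round(p_value, 3))
```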
Example Step 4
• Interpretation – The P-value of 0.0344 is smaller than α = 0.05, so we reject H0: it is unlikely that we would obtain a difference in sample proportions as large as the one observed if the null hypothesis were true. The judge would probably conclude that the companies who contaminated the well were responsible for the higher proportion of birth defects in East Woburn.