Confidence Intervals for Two Proportions, Section 6.1

Upload: caroline-cunningham

Post on 13-Dec-2015


TRANSCRIPT

Page 1: 1 Confidence Intervals for Two Proportions Section 6.1

Confidence Intervals for Two Proportions

Section 6.1

Page 2: 1 Confidence Intervals for Two Proportions Section 6.1

Section 6.1: CI for Two Proportions

• We are interested in confidence intervals for the difference p1 – p2 between the unknown values of two population proportions.

Page 3: 1 Confidence Intervals for Two Proportions Section 6.1

6.1 Confidence Intervals for the difference p1 – p2 between two population proportions

• In this section we deal with two populations whose data are qualitative.
• For nominal data we compare the population proportions of the occurrence of a certain event.
• Examples
– Comparing the effectiveness of new drug versus older one
– Comparing market share before and after advertising campaign
– Comparing defective rates between two machines

Page 4: 1 Confidence Intervals for Two Proportions Section 6.1

Parameter and Statistic

• Parameter
– When the data are qualitative, we can only count the occurrences of a certain event in the two populations, and calculate proportions.
– The parameter we want to estimate is p1 – p2.

• Statistic
– An unbiased estimator of p1 – p2 is $\hat{p}_1 - \hat{p}_2$ (the difference between the sample proportions).

Page 5: 1 Confidence Intervals for Two Proportions Section 6.1

• Two random samples are drawn from two populations.
• The number of successes in each sample is recorded.
• The sample proportions are computed.

Sample 1: sample size n1, number of successes x1, sample proportion $\hat{p}_1 = x_1 / n_1$

Sample 2: sample size n2, number of successes x2, sample proportion $\hat{p}_2 = x_2 / n_2$

Point Estimator: $\hat{p}_1 - \hat{p}_2$

Page 6: 1 Confidence Intervals for Two Proportions Section 6.1

Large-sample CI for two proportions

For two independent samples of sizes n1 and n2 with sample proportions of successes $\hat{p}_1$ and $\hat{p}_2$ respectively, an approximate level C confidence interval for p1 – p2 is

$$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

where z* is the appropriate value from the z-table that depends on the confidence level C.

C is the area under the standard normal curve between −z* and z*.

Use this method when

$$n_1\hat{p}_1 \ge 10, \quad n_1(1-\hat{p}_1) \ge 10, \quad n_2\hat{p}_2 \ge 10, \quad n_2(1-\hat{p}_2) \ge 10$$
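The large-sample interval and its conditions could be coded roughly as follows. This is a minimal sketch, not part of the original slides: the function name `two_proportion_ci` and its parameters are illustrative assumptions, and z* is taken from Python's standard-library normal distribution rather than a printed z-table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2 (hypothetical helper).

    x1, x2 are the numbers of successes; n1, n2 are the sample sizes.
    Returns the pair (lower, upper).
    """
    # Large-sample conditions: n * p_hat is the success count and
    # n * (1 - p_hat) is the failure count, so check the counts directly.
    if min(x1, n1 - x1, x2, n2 - x2) < 10:
        raise ValueError("need at least 10 successes and 10 failures per sample")

    p1_hat = x1 / n1
    p2_hat = x2 / n2

    # z* leaves area C between -z* and z* under the standard normal curve.
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)

    # Standard error of the difference between the sample proportions.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se
```

Raising an error when a condition fails is one possible design choice; a caller could equally be warned and left to decide.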

Page 7: 1 Confidence Intervals for Two Proportions Section 6.1

Example: confidence interval for p1 – p2

p. 2

• Estimating the cost of life saved
– Two drugs are used to treat heart attack victims:
   • Streptokinase (available since 1959, costs $460)
   • t-PA (genetically engineered, costs $2900).
– The maker of t-PA claims that its drug outperforms Streptokinase.
– An experiment was conducted in 15 countries.
   • 20,500 patients were given t-PA
   • 20,500 patients were given Streptokinase
   • The number of deaths by heart attacks was recorded.

Page 8: 1 Confidence Intervals for Two Proportions Section 6.1

Example: confidence interval for p1 – p2 (cont.)

• Experiment results
– A total of 1497 patients treated with Streptokinase died.
– A total of 1292 patients treated with t-PA died.

• Estimate the difference in the death rates when using Streptokinase and when using t-PA.

Page 9: 1 Confidence Intervals for Two Proportions Section 6.1

Example: confidence interval for p1 – p2 (cont.)

• Solution
– The problem objective: Compare the outcomes of two treatments.
– The data are nominal (a patient lived or died).
– The parameter to be estimated is p1 – p2.
   • p1 = death rate with Streptokinase
   • p2 = death rate with t-PA

Page 10: 1 Confidence Intervals for Two Proportions Section 6.1

Example: confidence interval for p1 – p2 (cont.)

• Compute: Manually
– Sample proportions:

$$\hat{p}_1 = \frac{1497}{20500} = .0730, \qquad \hat{p}_2 = \frac{1292}{20500} = .0630$$

– The 95% confidence interval estimate is

$$(\hat{p}_1 - \hat{p}_2) \pm 1.96 \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

$$= (.0730 - .0630) \pm 1.96 \sqrt{\frac{.0730(1-.0730)}{20500} + \frac{.0630(1-.0630)}{20500}}$$

$$= .0100 \pm .0049 = (.0051, .0149)$$
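The manual computation can be checked with a few lines of Python. This is only a sketch of the same arithmetic: the variable names are illustrative, and the counts are the ones given in the example (1497 and 1292 deaths out of 20,500 patients each).

```python
from math import sqrt

# Counts from the example: deaths out of 20,500 patients per drug.
n1, x1 = 20500, 1497   # Streptokinase
n2, x2 = 20500, 1292   # t-PA

p1_hat = x1 / n1
p2_hat = x2 / n2

# Standard error of the difference, then the 95% margin (z* = 1.96).
se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
margin = 1.96 * se
diff = p1_hat - p2_hat

lower, upper = diff - margin, diff + margin
print(f"p1_hat = {p1_hat:.4f}, p2_hat = {p2_hat:.4f}")
print(f"{diff:.4f} +/- {margin:.4f} -> ({lower:.4f}, {upper:.4f})")
```

Rounded to four decimals, this reproduces .0100 ± .0049 and the interval (.0051, .0149).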

Page 11: 1 Confidence Intervals for Two Proportions Section 6.1

Example: confidence interval for p1 – p2 (cont.)

• Interpretation
– The interval (.0051, .0149) for p1 – p2 does not contain 0; it is entirely positive, which indicates that p1, the death rate for Streptokinase, is greater than p2, the death rate for t-PA.
– We estimate that the death rate for Streptokinase is between .51 and 1.49 percentage points higher than the death rate for t-PA.

Page 12: 1 Confidence Intervals for Two Proportions Section 6.1

Example: 95% confidence interval for p1 – p2

$$(\hat{p}_1 - \hat{p}_2) \pm 1.96 \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

The age at which a woman gives birth to her first child may be an important factor in the risk of later developing breast cancer. An international study conducted by WHO selected women with at least one birth and recorded if they had breast cancer or not and whether they had their first child before their 30th birthday or after.

                            Cancer   Sample Size   Sample proportion
Age at First Birth > 30        683          3220   21.2%  ($\hat{p}_1$)
Age at First Birth <= 30      1498        10,245   14.6%  ($\hat{p}_2$)

The parameter to be estimated is p1 – p2.
p1 = cancer rate when age at 1st birth > 30
p2 = cancer rate when age at 1st birth <= 30

$$(.212 - .146) \pm 1.96 \sqrt{\frac{.212(.788)}{3220} + \frac{.146(.854)}{10{,}245}}$$

$$= .066 \pm 1.96(.008) \quad \text{or} \quad .066 \pm .016$$

$$(.05, .082)$$

We estimate that the cancer rate when age at first birth > 30 is between .05 and .082 higher than when age <= 30.

Page 13: 1 Confidence Intervals for Two Proportions Section 6.1

Beware!! Common Mistake !!!

A common mistake is to calculate a one-sample confidence interval for p1, a one-sample confidence interval for p2, and to then conclude that p1 and p2 are equal if the confidence intervals overlap.

This is WRONG because the variability in the sampling distribution for $\hat{p}_1 - \hat{p}_2$ from two independent samples is more complex and must take into account variability coming from both samples. Hence the more complex formula for the standard error.

$$SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

Page 14: 1 Confidence Intervals for Two Proportions Section 6.1

INCORRECT

Two single-sample 95% confidence intervals: The confidence interval for the rightie BA and the confidence interval for the leftie BA overlap, suggesting no significant difference between Ryan Howard’s ABILITY to hit right-handed pitchers and his ABILITY to hit left-handed pitchers.

           Hits    AB    phat (BA)
Rightie     126   394    .320
Leftie       50   222    .225

Rightie interval: (0.274, 0.366)
Leftie interval: (0.170, 0.280)

CORRECT

The 2-sample 95% confidence interval of the form

$$(\hat{p}_R - \hat{p}_L) \pm 1.96 \sqrt{\frac{\hat{p}_R(1-\hat{p}_R)}{n_R} + \frac{\hat{p}_L(1-\hat{p}_L)}{n_L}}$$

for the difference $p_R - p_L$ between the ABILITIES is (.023, .167). Interval is entirely positive, suggesting significant difference between Howard's ABILITIES to hit righties and lefties (evidence that $p_R$ is larger than $p_L$).

[Number line: the interval from .023 to .167, centered at .095, lies entirely to the right of 0.]
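The contrast between the two approaches can be reproduced from the hits and at-bats in the table. The sketch below is illustrative only: the helper names `one_sample_ci` and `two_sample_ci` are assumptions, not something the slides define. Because it works from unrounded proportions, its bounds may differ from the slide's figures in the third decimal place.

```python
from math import sqrt

Z_STAR = 1.96  # z* for 95% confidence


def one_sample_ci(x, n):
    """Large-sample 95% interval for a single proportion (hypothetical helper)."""
    p_hat = x / n
    margin = Z_STAR * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def two_sample_ci(x1, n1, x2, n2):
    """Large-sample 95% interval for p1 - p2 (hypothetical helper)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    margin = Z_STAR * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin


# Hits and at-bats from the table.
rightie = one_sample_ci(126, 394)
leftie = one_sample_ci(50, 222)
difference = two_sample_ci(126, 394, 50, 222)

# The two one-sample intervals overlap ...
intervals_overlap = rightie[0] <= leftie[1] and leftie[0] <= rightie[1]
# ... yet the interval for the difference does not contain 0.
difference_excludes_zero = difference[0] > 0 or difference[1] < 0

print("rightie:", rightie)
print("leftie:", leftie)
print("difference:", difference)
print("overlap:", intervals_overlap, "| excludes 0:", difference_excludes_zero)
```

Both flags come out true for these data, which is exactly the situation in which the overlap rule gives the wrong conclusion.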

Page 15: 1 Confidence Intervals for Two Proportions Section 6.1

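Reason for Contradictory Result

It's always true that

$$\sqrt{a + b} \le \sqrt{a} + \sqrt{b}.$$

Specifically,

$$\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \le \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}} + \sqrt{\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

$$SE(\hat{p}_1 - \hat{p}_2) \le SE(\hat{p}_1) + SE(\hat{p}_2)$$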
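A short justification of the inequality, followed by the Ryan Howard numbers, may help make the point concrete. The standard errors below are not on the slides; they are computed from the hits and at-bats on the previous slide and rounded to four decimals.

```latex
% Why the inequality holds for a, b >= 0:
\[
(\sqrt{a} + \sqrt{b})^2 = a + b + 2\sqrt{ab} \;\ge\; a + b
\quad\Longrightarrow\quad
\sqrt{a + b} \;\le\; \sqrt{a} + \sqrt{b}.
\]

% Ryan Howard data (R = righties, L = lefties):
\[
SE(\hat{p}_R) \approx 0.0235, \qquad
SE(\hat{p}_L) \approx 0.0280, \qquad
SE(\hat{p}_R - \hat{p}_L) \approx 0.0366 \;\le\; 0.0235 + 0.0280 = 0.0515.
\]

% The overlap check requires a larger gap than the two-sample interval does:
\[
1.96 \times 0.0366 \approx 0.0717
\;<\; \hat{p}_R - \hat{p}_L \approx 0.0946
\;<\; 1.96 \times 0.0515 \approx 0.1010.
\]
```

Two one-sample intervals fail to overlap only when the observed difference exceeds 1.96 times the sum of the two standard errors. The two-sample interval excludes 0 as soon as the difference exceeds 1.96 times the smaller combined standard error. Howard's observed difference falls between the two thresholds, which explains the contradictory result.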