Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc.

Estimating Proportions with Confidence

Chapter 10

Principal Idea: Survey 150 randomly selected students and 41% think marijuana should be legalized.

If we report that between 33% and 49% of all students at the college think that marijuana should be legalized, how confident can we be that we are correct?

Confidence interval: an interval of estimates that is likely to capture the population value.

Objective: how to calculate and interpret a confidence interval estimate of a population proportion.

10.1 The Language and Notation of Estimation

• Unit: an individual person or object to be measured.

• Population (or universe): the entire collection of units about which we would like information or the entire collection of measurements we would have if we could measure the whole population.

• Sample: the collection of units we will actually measure or the collection of measurements we will actually obtain.

• Sample size: the number of units or measurements in the sample, denoted by n.

More Language and Notation of Estimation

• Population proportion: the fraction of the population that has a certain trait/characteristic, or the probability of success in a binomial experiment, denoted by p. The value of the parameter p is not known.

• Sample proportion: the fraction of the sample that has a certain trait/characteristic, denoted by p̂. The statistic p̂ is an estimate of p.

The Fundamental Rule for Using Data for Inference is that available data can be used to make inferences about a much larger group if the data can be considered to be representative with regard to the question(s) of interest.

10.2 Margin of Error

Media Descriptions of Margin of Error:

• The difference between the sample proportion and the population proportion is less than the margin of error about 95% of the time, or for about 19 of every 20 sample estimates.

• The difference between the sample proportion and the population proportion is more than the margin of error about 5% of the time, or for about 1 of every 20 sample estimates.

Example 10.1 Teens and Interracial Dating

1997 USA Today/Gallup Poll of teenagers across country: 57% of the 497 teens who go out on dates say they’ve been out with someone of another race or ethnic group.

Reported margin of error for this estimate was about 4.5%.

• In surveys of this size, the difference between the sample estimate of 57% and the true percent is likely* to be less than 4.5% one way or the other.

• There is, however, a small chance that the sample estimate might be off by more than 4.5%.

* The value of how ‘likely’ is often 95%.

Example 10.2 If I Won the Lottery …

If you won 10 million dollars in the lottery, would you continue to work or stop working?

1997 Gallup Poll: 59% of the 616 employed respondents said they would continue to work.

Reported information about this poll:

• Results based on telephone interviews with a randomly selected sample of 1014 adults, conducted Aug 22–25, ‘97.

• Among this group, 616 are employed full-time/part-time.

• For results based on this sample of “workers,” one can say with 95% confidence that the error attributable to sampling could be plus or minus 4 percentage points.

10.3 Confidence Intervals

Confidence interval: an interval of values computed from sample data that is likely to include the true population value.

Interpreting the Confidence Level

• The confidence level is the probability that the procedure used to determine the interval will provide an interval that includes the population parameter.

• If we consider all possible randomly selected samples of the same size from a population, the confidence level is the fraction or percent of those samples for which the confidence interval includes the population parameter.

Note: The confidence level is often expressed as a percent. Common levels are 90%, 95%, 98%, and 99%.

Constructing a 95% Confidence Interval for a Population Proportion

Sample estimate ± Margin of error

In the long run, about 95% of all confidence intervals computed in this way will capture the population value of the proportion, and about 5% of them will miss it.

Be careful: The confidence level only expresses how often the procedure works in the long run. Any one specific interval either does or does not include the true unknown population value.
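The long-run reading of the confidence level can be checked by simulation. The sketch below is illustrative only. It assumes a hypothetical true proportion of 0.41 and samples of 150 (numbers borrowed from the opening survey example, where the true value is of course unknown), builds each interval as sample estimate ± 2 standard errors (the margin of error described in Section 10.4), and counts how often the interval captures the assumed true value. The function names `ci_95` and `coverage` are made up for this example.

```python
import math
import random


def ci_95(p_hat, n):
    """Approximate 95% interval: sample estimate +/- 2 standard errors."""
    margin = 2 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def coverage(true_p, n, num_samples, seed=0):
    """Fraction of simulated samples whose interval captures true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(num_samples):
        # Draw one random sample of n yes/no answers and count the "yes" answers.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = ci_95(successes / n, n)
        if low <= true_p <= high:
            hits += 1
    return hits / num_samples


if __name__ == "__main__":
    # Assumed (hypothetical) true proportion and sample size.
    print(coverage(true_p=0.41, n=150, num_samples=2000))
```

With a few thousand simulated samples the captured fraction should land near 0.95, though the exact figure varies with the seed. Each individual interval in the loop either captures 0.41 or misses it, which is the point of the "Be careful" note.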

Example 10.1 Teens and Interracial Dating (cont)

Poll: 57% of dating teens sampled had gone out with somebody of another race/ethnic group.

Margin of error was 4.5%.

95% Confidence Interval: 57% ± 4.5%, or 52.5% to 61.5%

We have 95% confidence that somewhere between 52.5% and 61.5% of all American teens who date have gone out with somebody of another race or ethnic group.

Example 10.2 Winning the Lottery and Work (cont)

Poll: 40% of employed workers sampled would quit working if they won the lottery.

Margin of error was 4%.

95% Confidence Interval Estimate: Sample estimate ± Margin of error

40% ± 4%, or 36% to 44%

With 95% confidence, somewhere between 36% and 44% of working Americans would say they would quit working if they won $10 million in the lottery.

Interval does not cover 50% => Appears that fewer than half of all working Americans think they would quit if won lottery.

10.4 Calculating A Margin of Error for 95% Confidence

For a 95% confidence level, the approximate margin of error for a sample proportion is

$$\text{Margin of error} = 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Note: The “95% margin of error” is simply two standard errors, or 2 s.e.(p̂).
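As a rough check of the 95% margin of error (two standard errors, where the standard error is the square root of p̂(1 - p̂)/n), the sketch below applies it to the two polls from Section 10.2. The function name `margin_of_error_95` is an illustrative choice, not something from the slides.

```python
import math


def margin_of_error_95(p_hat, n):
    """Approximate 95% margin of error: two standard errors of p_hat."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return 2 * standard_error


if __name__ == "__main__":
    # Example 10.1: 57% of 497 dating teens.
    print(f"{margin_of_error_95(0.57, 497):.3f}")
    # Example 10.2: 59% of 616 employed respondents.
    print(f"{margin_of_error_95(0.59, 616):.3f}")
```

Both results come out close to the reported margins of about 4.5% and 4%. Small gaps are expected, since pollsters may round or use a slightly different formula.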

Factors that Determine Margin of Error

1. The sample size, n. When sample size increases, margin of error decreases.

2. The sample proportion, p̂. If the proportion is close to either 1 or 0 most individuals have the same trait or opinion, so there is little natural variability and the margin of error is smaller than if the proportion is near 0.5.

3. The “multiplier” 2. Connected to the “95%” aspect of the margin of error. Later you’ll learn: the exact value for 95% is 1.96 and how to change the multiplier to change the level.

Example 10.3 Pollen Count Must Be High

Poll: Random sample of 883 American adults. “Are you allergic to anything?”

Results: 36% of the sample said “yes”, p̂ = .36

$$\text{95\% margin of error} = 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 2\sqrt{\frac{.36(1-.36)}{883}} = .032$$

95% Confidence Interval: .36 ± .032, or about .33 to .39

We can be 95% confident that somewhere between 33% and 39% of all adult Americans have allergies.

The Conservative Estimate of Margin of Error

Conservative estimate of the margin of error = $\frac{1}{\sqrt{n}}$

• It usually overestimates the actual size of the margin of error.

• It works (conservatively) for all survey questions based on the same sample size, even if the sample proportions differ from one question to the next.

• Obtained when p̂ = .5 in the margin of error formula.

Example 10.3 Really Bad Allergies (cont)

Poll: Random sample of 883 American adults. 3% of the sample experience “severe” symptoms.

$$\text{Conservative margin of error} = \frac{1}{\sqrt{883}} = .034 \text{ or } 3.4\%$$

95% (conservative) Confidence Interval: 3% ± 3.4%, or -0.4% to 6.4%

When p̂ is far from .5, the conservative margin of error is too conservative. The 95% margin of error using p̂ = .03 is just .011 or 1.1%, for an interval from 1.9% to 4.1%.
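The gap between the conservative margin of error (1 divided by the square root of n) and the margin computed from the observed proportion can be seen by running both on the allergy poll. This is a minimal sketch with illustrative function names; it uses only the figures quoted in Example 10.3 (n = 883, sample proportions .36 and .03).

```python
import math


def margin_of_error_95(p_hat, n):
    """Approximate 95% margin of error: two standard errors of p_hat."""
    return 2 * math.sqrt(p_hat * (1 - p_hat) / n)


def conservative_margin_of_error(n):
    """Margin of error at p_hat = .5, which simplifies to 1 / sqrt(n)."""
    return 1 / math.sqrt(n)


if __name__ == "__main__":
    n = 883
    conservative = conservative_margin_of_error(n)
    for p_hat in (0.36, 0.03):
        actual = margin_of_error_95(p_hat, n)
        print(f"p-hat = {p_hat:.2f}: actual {actual:.3f}, "
              f"conservative {conservative:.3f}")
```

For the sample proportion of .36 the two margins are nearly the same, while for .03 the conservative value is roughly three times too wide, which is why the conservative interval dips below zero.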

10.5 General Theory of CIs for a Proportion

Developing the 95% Confidence Level

From the sampling distribution of p̂ we have: For 95% of all samples,

-2 standard deviations < p̂ - p < 2 standard deviations

We don’t know the true standard deviation, so we use the standard error. For approximately 95% of all samples,

-2 standard errors < p̂ - p < 2 standard errors

which implies for approximately 95% of all samples,

p̂ - 2 standard errors < p < p̂ + 2 standard errors

General Description of the Approximate 95% CI for a Proportion

Approximate 95% CI for the population proportion:

p̂ ± 2 standard errors

The standard error is

$$\text{s.e.}(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Interpretation: For about 95% of all randomly selected samples from the population, the confidence interval computed in this manner captures the population proportion.

Necessary Conditions: np̂ and n(1 - p̂) are both greater than 10, and the sample is randomly selected.

General Format for Confidence Intervals

For any confidence level, a confidence interval for either a population proportion or a population mean can be expressed as

Sample estimate ± Multiplier × Standard error

The multiplier is affected by the choice of confidence level.

More about the Multiplier

Multiplier, denoted as z*, is the standardized score such that the area between -z* and z* under the standard normal curve corresponds to the desired confidence level.

Note: Increase confidence level => larger multiplier.
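The multiplier can also be computed rather than read from a table. The sketch below assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist`; the helper name `multiplier` is illustrative. It converts the desired area between -z* and z* into an area to the left of z* and inverts the standard normal curve at that point.

```python
from statistics import NormalDist


def multiplier(confidence_level):
    """Return z* so that the area between -z* and z* under the standard
    normal curve equals confidence_level (a fraction such as 0.95)."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # Half of the leftover area sits in each tail, so the area to the
    # left of z* is the confidence level plus one tail.
    area_to_left = confidence_level + (1 - confidence_level) / 2
    return NormalDist().inv_cdf(area_to_left)


if __name__ == "__main__":
    for level in (0.90, 0.95, 0.98, 0.99):
        print(f"{level:.0%} confidence: z* is about {multiplier(level):.3f}")
```

The 90% value comes out near 1.645, which table-based work usually rounds to 1.65, and the values grow as the confidence level rises.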

Formula for a Confidence Interval for a Population Proportion p

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

where

• p̂ is the sample proportion.

• z* denotes the multiplier.

• $\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ is the standard error of p̂.

Example 10.6 Intelligent Life Elsewhere?

Poll: Random sample of 935 Americans. “Do you think there is intelligent life on other planets?”

Results: 60% of the sample said “yes”, p̂ = .60

$$\text{s.e.}(\hat{p}) = \sqrt{\frac{.6(1-.6)}{935}} = .016$$

90% Confidence Interval: .60 ± 1.65(.016), or .60 ± .026

98% Confidence Interval: .60 ± 2.33(.016), or .60 ± .037

Note: entire interval is above 50% => high confidence that a majority believe there is intelligent life.

Example 10.6 Intelligent Life Elsewhere? (cont)

Poll: Random sample of 935 Americans. “Do you think there is intelligent life on other planets?”

Results: 60% of the sample said “yes”, p̂ = .60

We want a 50% confidence interval. If the area between -z* and z* is .50, then the area to the left of z* is .75. From Table A.1 we have z* ≈ .67.

50% Confidence Interval: .60 ± .67(.016), or .60 ± .011

Note: Lower confidence level results in a narrower interval.
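The intervals in Example 10.6 can be reproduced for any confidence level with a short function. This is a sketch under the same normal-curve assumptions as the slides; `confidence_interval` is an illustrative name, and the multiplier comes from `statistics.NormalDist` (Python 3.8+) instead of Table A.1, so the last digit may differ slightly from table-based arithmetic.

```python
import math
from statistics import NormalDist


def confidence_interval(p_hat, n, confidence_level):
    """Return (low, high, margin) for p_hat +/- z* x standard error."""
    # Area to the left of z* is the midpoint between the level and 1.
    z_star = NormalDist().inv_cdf((1 + confidence_level) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin, margin


if __name__ == "__main__":
    # Example 10.6: 60% of 935 sampled Americans said "yes".
    for level in (0.50, 0.90, 0.98):
        low, high, margin = confidence_interval(0.60, 935, level)
        print(f"{level:.0%} CI: {low:.3f} to {high:.3f} (margin {margin:.3f})")
```

Running the loop shows the margin shrinking as the confidence level drops, which is the trade-off the example illustrates: a narrower interval is bought with less confidence.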

Conditions for Using the Formula

1. Sample is randomly selected from the population.
Note: Available data can be used to make inferences about a much larger group if the data can be considered to be representative with regard to the question(s) of interest.

2. Normal curve approximation to the distribution of possible sample proportions assumes a “large” sample size. Both np̂ and n(1 - p̂) should be at least 10 (although some say these need only to be at least 5).

10.6 Choosing a Sample Size

Table provides 95% conservative margin of error for various sample sizes n

Important features:

1. When sample size is increased, margin of error decreases.

2. When a large sample size is made even larger, the improvement in accuracy is relatively small.
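Both features can be seen by tabulating the conservative 95% margin of error, 1 divided by the square root of n, for a range of sample sizes. The sketch below is not the slide's table: the sample sizes listed are arbitrary choices for illustration, and the function name is hypothetical.

```python
import math


def conservative_margin_table(sample_sizes):
    """Pair each sample size n with its conservative margin 1 / sqrt(n)."""
    return [(n, 1 / math.sqrt(n)) for n in sample_sizes]


if __name__ == "__main__":
    # Illustrative sample sizes, not taken from the original table.
    sizes = [100, 400, 1000, 1600, 2500, 10000]
    for n, margin in conservative_margin_table(sizes):
        print(f"n = {n:>6}: margin of error about {margin:.1%}")
```

Quadrupling the sample from 100 to 400 halves the margin from 10% to 5%, while quadrupling from 2,500 to 10,000 only moves it from 2% to 1%.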

The Effect of Population Size

For most surveys, the number of people in the population has almost no influence* on the accuracy of sample estimates.

Margin of error for a sample size of 1000 is about 3% whether the number of people in the population is 30,000 or 200 million.

* As long as the population is at least ten times as large as the sample.

10.7 Using Confidence Intervals to Guide Decisions

Principle 1. A value not in a confidence interval can be rejected as a possible value of the population proportion. A value in a confidence interval is an “acceptable” possibility for the value of a population proportion.

Principle 2. When the confidence intervals for proportions in two different populations do not overlap, it is reasonable to conclude that the two population proportions are different.

Example 10.7 Which Drink Tastes Better?

Taste Test: A sample of 60 people taste both drinks and 55% like taste of Drink A better than Drink B.

Makers of Drink A want to advertise these results. Makers of Drink B make a 95% confidence interval for the population proportion who prefer Drink A.

95% Confidence Interval:

$$.55 \pm 2\sqrt{\frac{.55(1-.55)}{60}} = .55 \pm .13$$

Note: Since .50 is in the interval, there is not enough evidence to claim that Drink A is preferred by a majority of population represented by the sample.
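Principle 1 applied to the taste test amounts to asking whether .50 falls inside the approximate 95% interval. The sketch below uses illustrative helper names, and the second call uses a hypothetical sample of 600 tasters (not part of the example) to show how the same 55% would read with ten times the data.

```python
import math


def ci_95(p_hat, n):
    """Approximate 95% interval: sample estimate +/- 2 standard errors."""
    margin = 2 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def is_plausible(value, p_hat, n):
    """Principle 1: a value inside the interval cannot be rejected."""
    low, high = ci_95(p_hat, n)
    return low <= value <= high


if __name__ == "__main__":
    # Example 10.7: 55% of 60 tasters preferred Drink A.
    print(ci_95(0.55, 60), is_plausible(0.50, 0.55, 60))
    # Hypothetical larger study with the same sample proportion.
    print(ci_95(0.55, 600), is_plausible(0.50, 0.55, 600))
```

With 60 tasters the interval runs from roughly .42 to .68, so an even split remains plausible. With the hypothetical 600 tasters the interval would sit entirely above .50.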

Case Study 10.1 ESP Works with Movies

ESP Study by Bem and Honorton (1994)

• Subjects (receivers) described what another person (sender) was seeing on a screen.

• Receivers shown 4 pictures, asked to pick which they thought sender had actually seen.

• Actual image shown randomly picked from 4 choices.

• Image was either a single, “static” image or a “dynamic” short video clip, played repeatedly (the additional three choices shown were always of the same type as the actual image).

Case Study 10.1 ESP Works (cont)

Bem and Honorton (1994) ESP Study Results

Is there enough evidence to say that the % of correct guesses for dynamic pictures is significantly above 25%?

95% CI:

$$.405 \pm 2\sqrt{\frac{.405(1-.405)}{190}} = .405 \pm .072, \text{ or } .333 \text{ to } .477$$

Can claim the true % of correct guesses is significantly better than what would occur from random guessing.

Case Study 10.2 Nicotine Patches vs Zyban

Study (New England Journal of Medicine, 3/4/99):

• 893 participants randomly allocated to four treatment groups: placebo, nicotine patch only, Zyban only, and Zyban plus nicotine patch.

• Participants blinded: all used a patch (nicotine or placebo) and all took a pill (Zyban or placebo).

• Treatments used for nine weeks.

Case Study 10.2 Nicotine (cont)

Conclusions:

• Zyban is effective (no overlap of Zyban and no Zyban CIs).

• Nicotine patch is not particularly effective (overlap of patch and no patch CIs).

Case Study 10.3 What a Great Personality

Would you date someone with a great personality even though you did not find them attractive?

Women: 61.1% of 131 answered “yes.” 95% confidence interval is 52.7% to 69.4%.

Men: 42.6% of 61 answered “yes.” 95% confidence interval is 30.2% to 55%.

Conclusions:

• Higher proportion of women would say yes. CIs slightly overlap.

• Women CI narrower than men CI due to larger sample size.
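The two conclusions can be checked numerically. The sketch below assumes the multiplier 1.96 (the exact 95% value mentioned in Section 10.4), which appears to come closer to the quoted intervals than the rounded multiplier 2. Because it starts from the rounded percentages rather than the raw counts, the endpoints may differ from the slide in the last digit. The helper names are illustrative.

```python
import math

Z_STAR_95 = 1.96  # assumed multiplier for 95% confidence


def ci_95(p_hat, n):
    """95% interval for a proportion: p_hat +/- 1.96 standard errors."""
    margin = Z_STAR_95 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def intervals_overlap(first, second):
    """True when two (low, high) intervals share at least one value."""
    return first[0] <= second[1] and second[0] <= first[1]


if __name__ == "__main__":
    women = ci_95(0.611, 131)
    men = ci_95(0.426, 61)
    print("Women:", tuple(round(x, 3) for x in women))
    print("Men:  ", tuple(round(x, 3) for x in men))
    print("Overlap:", intervals_overlap(women, men))
    print("Widths:", round(women[1] - women[0], 3), round(men[1] - men[0], 3))
```

The lower end of the women's interval falls just below the upper end of the men's, which is the slight overlap noted above, and the men's interval is visibly wider because it rests on fewer than half as many respondents.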

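In Summary: Confidence Interval for a Population Proportion p

General CI for p:

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Approximate 95% CI for p:

$$\hat{p} \pm 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Conservative 95% CI for p:

$$\hat{p} \pm \frac{1}{\sqrt{n}}$$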