section 7.2

46
Agresti/Franklin Statistics, 1 of 87 Section 7.2 How Can We Construct a Confidence Interval to Estimate a Population Proportion?

Upload: jenn

Post on 05-Jan-2016

43 views

Category:

Documents


0 download

DESCRIPTION

Section 7.2. How Can We Construct a Confidence Interval to Estimate a Population Proportion?. Finding the 95% Confidence Interval for a Population Proportion. We symbolize a population proportion by p The point estimate of the population proportion is the sample proportion - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Section 7.2

Agresti/Franklin Statistics, 1 of 87

Section 7.2

How Can We Construct a Confidence Interval to Estimate a

Population Proportion?

Page 2: Section 7.2

Agresti/Franklin Statistics, 2 of 87

Finding the 95% Confidence Interval for a Population Proportion

We symbolize a population proportion by p The point estimate of the population

proportion is the sample proportion We symbolize the sample proportion by p̂

Page 3: Section 7.2

Agresti/Franklin Statistics, 3 of 87

Finding the 95% Confidence Interval for a Population Proportion

A 95% confidence interval uses a margin of error = 1.96(standard errors)

[point estimate ± margin of error] =

errors) ard1.96(stand p̂

Page 4: Section 7.2

Agresti/Franklin Statistics, 4 of 87

Finding the 95% Confidence Interval for a Population Proportion

The exact standard error of a sample proportion equals:

This formula depends on the unknown population proportion, p

In practice, we don’t know p, and we need to estimate the standard error

n

pp )1(

Page 5: Section 7.2

Agresti/Franklin Statistics, 5 of 87

Finding the 95% Confidence Interval for a Population Proportion

In practice, we use an estimated standard error:

n

ppse

)ˆ1(ˆ

Page 6: Section 7.2

Agresti/Franklin Statistics, 6 of 87

Finding the 95% Confidence Interval for a Population Proportion

A 95% confidence interval for a population proportion p is:

n

)p̂-(1p̂ se with 1.96(se), p̂

Page 7: Section 7.2

Agresti/Franklin Statistics, 7 of 87

Example: Would You Pay Higher Prices to Protect the Environment?

In 2000, the GSS asked: “Are you willing to pay much higher prices in order to protect the environment?”

• Of n = 1154 respondents, 518 were willing to do so

Page 8: Section 7.2

Agresti/Franklin Statistics, 8 of 87

Example: Would You Pay Higher Prices to Protect the Environment?

Find and interpret a 95% confidence interval for the population proportion of adult Americans willing to do so at the time of the survey

Page 9: Section 7.2

Agresti/Franklin Statistics, 9 of 87

Example: Would You Pay Higher Prices to Protect the Environment?

0.48) (0.42, 0.03 0.45

)015.0(96.1 1.96(se)p̂

015.01154

)55.0)(45.0(

45.01154

518p̂

se

Page 10: Section 7.2

Agresti/Franklin Statistics, 10 of 87

Sample Size Needed for Large-Sample Confidence Interval for a Proportion

For the 95% confidence interval for a proportion p to be valid, you should have at least 15 successes and 15 failures:

15 )p̂-n(1 and 15 ˆ pn

Page 11: Section 7.2

Agresti/Franklin Statistics, 11 of 87

“95% Confidence”

With probability 0.95, a sample proportion value occurs such that the confidence interval contains the population proportion, p

With probability 0.05, the method produces a confidence interval that misses p

Page 12: Section 7.2

Agresti/Franklin Statistics, 12 of 87

How Can We Use Confidence Levels Other than 95%?

In practice, the confidence level 0.95 is the most common choice

But, some applications require greater confidence

To increase the chance of a correct inference, we use a larger confidence level, such as 0.99

Page 13: Section 7.2

Agresti/Franklin Statistics, 13 of 87

A 99% Confidence Interval for p

2.58(se) ˆ p

Page 14: Section 7.2

Agresti/Franklin Statistics, 14 of 87

Different Confidence Levels

Page 15: Section 7.2

Agresti/Franklin Statistics, 15 of 87

Different Confidence Levels

In using confidence intervals, we must compromise between the desired margin of error and the desired confidence of a correct inference

• As the desired confidence level increases, the margin of error gets larger

Page 16: Section 7.2

Agresti/Franklin Statistics, 16 of 87

What is the Error Probability for the Confidence Interval Method?

The general formula for the confidence interval for a population proportion is:

Sample proportion ± (z-score)(std. error)

which in symbols is

z(se) ˆ p

Page 17: Section 7.2

Agresti/Franklin Statistics, 17 of 87

What is the Error Probability for the Confidence Interval Method?

Page 18: Section 7.2

Agresti/Franklin Statistics, 18 of 87

Summary: Confidence Interval for a Population Proportion, p

A confidence interval for a population proportion p is:

n

)p̂-(1p̂z p̂

Page 19: Section 7.2

Agresti/Franklin Statistics, 19 of 87

Summary: Effects of Confidence Level and Sample Size on Margin of Error

The margin of error for a confidence interval:

• Increases as the confidence level increases

• Decreases as the sample size increases

Page 20: Section 7.2

Agresti/Franklin Statistics, 20 of 87

What Does It Mean to Say that We Have “95% Confidence”?

If we used the 95% confidence interval method to estimate many population proportions, then in the long run about 95% of those intervals would give correct results, containing the population proportion

Page 21: Section 7.2

Agresti/Franklin Statistics, 21 of 87

Section 7.3

How Can We Construct a Confidence Interval To Estimate a

Population Mean?

Page 22: Section 7.2

Agresti/Franklin Statistics, 22 of 87

How to Construct a Confidence Interval for a Population Mean

Point estimate ± margin of error The sample mean is the point

estimate of the population mean The exact standard error of the

sample mean is σ/ In practice, we estimate σ by the

sample standard deviation, s

n

Page 23: Section 7.2

Agresti/Franklin Statistics, 23 of 87

How to Construct a Confidence Interval for a Population Mean

For large n…• and also

For small n from an underlying population that is normal…

The confidence interval for the population mean is:

)n

z(

x

Page 24: Section 7.2

Agresti/Franklin Statistics, 24 of 87

How to Construct a Confidence Interval for a Population Mean

In practice, we don’t know the population standard deviation

Substituting the sample standard deviation s for σ to get se = s/ introduces extra error

To account for this increased error, we replace the z-score by a slightly larger score, the t-score

n

Page 25: Section 7.2

Agresti/Franklin Statistics, 25 of 87

How to Construct a Confidence Interval for a Population Mean

In practice, we estimate the standard error of the sample mean by se = s/

Then, we multiply se by a t-score from the t-distribution to get the margin of error for a confidence interval for the population mean

n

Page 26: Section 7.2

Agresti/Franklin Statistics, 26 of 87

Properties of the t-distribution

The t-distribution is bell shaped and symmetric about 0

The probabilities depend on the degrees of freedom, df

The t-distribution has thicker tails and is more spread out than the standard normal distribution

Page 27: Section 7.2

Agresti/Franklin Statistics, 27 of 87

t-Distribution

Page 28: Section 7.2

Agresti/Franklin Statistics, 28 of 87

Summary: 95% Confidence Interval for a Population Mean

A 95% confidence interval for the population mean µ is:

To use this method, you need:• Data obtained by randomization

• An approximately normal population distribution

1-n df );( t.025

n

sx

Page 29: Section 7.2

Agresti/Franklin Statistics, 29 of 87

Example: eBay Auctions of Palm Handheld Computers

Do you tend to get a higher, or a lower, price if you give bidders the “buy-it-now” option?

Page 30: Section 7.2

Agresti/Franklin Statistics, 30 of 87

Example: eBay Auctions of Palm Handheld Computers

Consider some data from sales of the Palm M515 PDA (personal digital assistant)

During the first week of May 2003, 25 of these handheld computers were auctioned off, 7 of which had the “buy-it-now” option

Page 31: Section 7.2

Agresti/Franklin Statistics, 31 of 87

Example: eBay Auctions of Palm Handheld Computers

“Buy-it-now” option:

235 225 225 240 250 250 210

Bidding only:

250 249 255 200 199 240 228 255 232 246 210 178 246 240 245 225 246 225

Page 32: Section 7.2

Agresti/Franklin Statistics, 32 of 87

Example: eBay Auctions of Palm Handheld Computers

Summary of selling prices for the two types of auctions:

buy_now N Mean StDev Minimum Q1 Median Q3

no 18 231.61 21.94 178.00 221.25 240.00 246.75 yes 7 233.57 14.64 210.00 225.00 235.00 250.00

buy_now Maximum

no 255.00

yes 250.00

Page 33: Section 7.2

Agresti/Franklin Statistics, 33 of 87

Example: eBay Auctions of Palm Handheld Computers

Page 34: Section 7.2

Agresti/Franklin Statistics, 34 of 87

Example: eBay Auctions of Palm Handheld Computers

To construct a confidence interval using the t-distribution, we must assume a random sample from an approximately normal population of selling prices

Page 35: Section 7.2

Agresti/Franklin Statistics, 35 of 87

Example: eBay Auctions of Palm Handheld Computers

Let µ denote the population mean for the “buy-it-now” option

The estimate of µ is the sample mean:

x = $233.57 The sample standard deviation is:

s = $14.64

Page 36: Section 7.2

Agresti/Franklin Statistics, 36 of 87

Example: eBay Auctions of Palm Handheld Computers

The 95% confidence interval for the “buy-it-now” option is:

which is 233.57 ± 13.54 or (220.03, 247.11)

)7

64.14(44.257.233)(025.

n

stx

Page 37: Section 7.2

Agresti/Franklin Statistics, 37 of 87

Example: eBay Auctions of Palm Handheld Computers

The 95% confidence interval for the mean sales price for the bidding only option is:

(220.70, 242.52)

Page 38: Section 7.2

Agresti/Franklin Statistics, 38 of 87

Example: eBay Auctions of Palm Handheld Computers

Notice that the two intervals overlap a great deal:

• “Buy-it-now”: (220.03, 247.11)

• Bidding only: (220.70, 242.52)

There is not enough information for us to conclude that one probability distribution clearly has a higher mean than the other

Page 39: Section 7.2

Agresti/Franklin Statistics, 39 of 87

How Do We Find a t- Confidence Interval for Other Confidence Levels?

The 95% confidence interval uses t.025

since 95% of the probability falls between - t.025 and t.025

For 99% confidence, the error probability is 0.01 with 0.005 in each tail and the appropriate t-score is t.005

Page 40: Section 7.2

Agresti/Franklin Statistics, 40 of 87

If the Population is Not Normal, is the Method “Robust”?

A basic assumption of the confidence interval using the t-distribution is that the population distribution is normal

Many variables have distributions that are far from normal

Page 41: Section 7.2

Agresti/Franklin Statistics, 41 of 87

If the Population is Not Normal, is the Method “Robust”?

How problematic is it if we use the t- confidence interval even if the population distribution is not normal?

Page 42: Section 7.2

Agresti/Franklin Statistics, 42 of 87

If the Population is Not Normal, is the Method “Robust”?

For large random samples, it’s not problematic

The Central Limit Theorem applies: for large n, the sampling distribution is bell-shaped even when the population is not

Page 43: Section 7.2

Agresti/Franklin Statistics, 43 of 87

If the Population is Not Normal, is the Method “Robust”?

What about a confidence interval using the t-distribution when n is small?

Even if the population distribution is not normal, confidence intervals using t-scores usually work quite well

We say the t-distribution is a robust method in terms of the normality assumption

Page 44: Section 7.2

Agresti/Franklin Statistics, 44 of 87

Cases Where the t- Confidence Interval Does Not Work

With binary data

With data that contain extreme outliers

Page 45: Section 7.2

Agresti/Franklin Statistics, 45 of 87

The Standard Normal Distribution is the t-Distribution with df = ∞

Page 46: Section 7.2

Agresti/Franklin Statistics, 46 of 87

The 2002 GSS asked: “What do you think is the ideal number of children in a family?”

The 497 females who responded had a median of 2, mean of 3.02, and standard deviation of 1.81. What is the point estimate of the population mean?

a. 497

b. 2

c. 3.02

d. 1.81