central limit theorem · 2016. 11. 28. · central limit theorem two assumptions 1. the sampled...

Central limit Theorem Sample Distribution Models for Means and Proportions


Post on 27-Feb-2021


Page 1: Central limit Theorem · 2016. 11. 28. · Central Limit Theorem Two assumptions 1. The sampled values must be independent 2. The sample size, n, must be large enough •The mean

Central Limit Theorem: Sample Distribution Models for Means and Proportions

Central Limit Theorem: Two assumptions

1. The sampled values must be independent

2. The sample size, n, must be large enough

• The mean of a random sample has a sampling distribution whose shape can be approximated by a Normal model.

• The larger the sample, the better the approximation will be.

• This is regardless of the shape of the distribution of the population being sampled from or the shape of the distribution of the sample.
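The claims above can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with mean 1 and SD 1 (a strongly right-skewed shape chosen for the demonstration, not taken from the slides), and the function name `simulate_sample_means`, the sample size, and the number of repetitions are hypothetical choices.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` random samples of size `n` from a right-skewed
    exponential population (mean 1, SD 1) and return the sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

means = simulate_sample_means(n=50, reps=2000)
center = statistics.fmean(means)   # should sit near the population mean, 1
spread = statistics.stdev(means)   # should sit near 1 / sqrt(50)

print(f"mean of sample means: {center:.3f}")
print(f"SD of sample means:   {spread:.3f}")
```

Even though the population is far from Normal, the sample means cluster symmetrically around 1 with a spread close to 1/√50 ≈ 0.141. Rerunning with a larger `n` should tighten the cluster further.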

Distribution of sample proportions

• Population has a fixed proportion

• To estimate the population proportion, a sample is taken and a sample proportion is calculated

If samples are repeatedly taken with the same sample size:

• The mean of the sampling distribution would be the population proportion, $\mu(\hat{p}) = p$

• The standard deviation would be $\sigma(\hat{p}) = \sqrt{\dfrac{pq}{n}}$, where $q = 1 - p$

Conditions to check for the assumptions

1. Success/Failure: The expected numbers of successes and failures are both at least 10: $np \ge 10$ and $nq \ge 10$

2. 10% Condition: Each sample is less than 10% of the population

3. Randomization: The sample was obtained through random sampling techniques, or we can at least assume that the sample is representative.

All conditions have been met to use the Normal model for the distribution of sample proportions.
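The two numeric conditions are easy to check mechanically. The helpers below are a minimal sketch: the function names and the example values are hypothetical, and the Randomization condition is left out because it is a judgment about how the data were collected rather than a calculation.

```python
def success_failure_ok(n, p, threshold=10):
    """True when the expected successes n*p and expected failures
    n*(1 - p) both reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

def ten_percent_ok(n, population_size):
    """True when the sample is less than 10% of the population."""
    return n < 0.10 * population_size

# Hypothetical values, for illustration only.
large_enough = success_failure_ok(n=100, p=0.30)    # 30 and 70 expected
too_small = success_failure_ok(n=30, p=0.10)        # only 3 expected successes
small_share = ten_percent_ok(n=100, population_size=5000)

print(large_enough, too_small, small_share)
```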

• If samples were repeatedly taken with the same sample size, then by the CLT the distribution of sample proportions would be approximately Normal:

$\hat{p} \sim N\left(p, \sqrt{\dfrac{pq}{n}}\right)$

Example: Skittles

• According to the manufacturer of the candy Skittles, 20% of the candy produced is red. What is the probability that a large bag of Skittles with 58 candies contains at least 17 red?

Conditions:

1. 10% condition: 58 Skittles is less than 10% of all Skittles produced.

2. Success/Failure: $np = 58(0.20) = 11.6 \ge 10$ and $nq = 58(0.80) = 46.4 \ge 10$. There are at least 10 expected successes and failures.

3. Randomization: Though the bag is not a random sample, we can assume it is representative of the population.

All conditions have been met to use the Normal model for the distribution of sample proportions.

• Mean: $\mu(\hat{p}) = p = 0.20$

• Standard Deviation: $\sigma(\hat{p}) = \sqrt{\dfrac{pq}{n}} = \sqrt{\dfrac{(0.20)(0.80)}{58}} \approx 0.0525$

• So the model for $\hat{p}$ becomes N(0.20, 0.0525)

• Sample proportion: $\hat{p} = \dfrac{17}{58} \approx 0.293$

• Then to find the probability that we get a sample proportion of 0.293 or higher:

$P(\hat{p} \ge 0.293) \approx 0.0383$
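The Skittles calculation can be reproduced without a calculator. This is a sketch that assumes Python's standard-library `statistics.NormalDist` as a stand-in for the calculator's Normal functions; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n, reds = 0.20, 58, 17

sd = sqrt(p * (1 - p) / n)   # standard deviation of the sample proportion
p_hat = reds / n             # observed sample proportion, 17/58

# Upper-tail area of the Normal model N(p, sd) at or above p_hat.
prob = 1 - NormalDist(mu=p, sigma=sd).cdf(p_hat)

print(f"SD = {sd:.4f}, p-hat = {p_hat:.3f}, P(p-hat >= {p_hat:.3f}) = {prob:.4f}")
```

Because the code carries full precision instead of the rounded 0.293 and 0.0525, the tail probability may differ from 0.0383 in the last displayed digit.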

Confidence Intervals: 1-Proportion z-intervals

Distribution of Sample Proportions

• From previous work: the distribution of sample proportions follows a Normal model, $N\left(p, \sqrt{\dfrac{pq}{n}}\right)$

• But most of the time we don’t know what the population proportion is.

• We take samples to try to find the population proportion.

• $\hat{p}$ is the estimate of p

• Since we don’t know p, we can’t find the standard deviation.

• We’ll estimate it with the Standard Error: $SE(\hat{p}) = \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$

Confidence Interval

• An interval based on the sample proportion, within which we have a stated level of confidence that the true population proportion lies.

• The size of the interval depends on the sample size and the level of confidence.

• The larger the sample size, the smaller the interval.

• The larger the confidence level, the larger the interval.

• Every confidence interval has the same basic setup: estimate ± ME

• ME is the margin of error

• For a one-proportion sample:

$\hat{p} \pm z^* \cdot SE(\hat{p})$

$\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$

where z* is the critical value, the z value associated with the level of confidence, so that

$ME = z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$

Critical Values – some basics

Level of Confidence    z*
90%                    1.645
95%                    1.960
99%                    2.576

To find the critical value given a level of confidence

1. Subtract level of confidence from 1

2. Divide difference by 2

3. Use the invNorm( ) function on the calculator and take the absolute value of the result

Ex. 90% confidence

1-.9 = 0.10

0.10/2 = 0.05

invNorm(0.05) = -1.645

z* = 1.645
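The same three steps can be sketched in Python. The assumption here is that `statistics.NormalDist().inv_cdf` plays the role of the calculator's invNorm( ); the function name `critical_value` is hypothetical.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Return z* for a two-sided interval at the given confidence level."""
    tail = (1 - confidence) / 2              # steps 1 and 2: area in one tail
    return abs(NormalDist().inv_cdf(tail))   # step 3: invNorm, made positive

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {critical_value(level):.3f}")
```

The loop should reproduce the three table values, and the function works for any other confidence level between 0 and 1.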

Conditions

• Randomization

• 10% Condition (Independence)

• Success/Failure: this uses the sample proportion since we don’t know the population proportion: $n\hat{p} \ge 10$; $n\hat{q} \ge 10$

All conditions have been met to use the Normal model for a 1-proportion z-interval

An experiment finds that 27% of 53 subjects report improvement after using a new medicine. Create a 95% confidence interval for the actual cure rate.

$n = 53$, $\hat{p} = 0.27$

Conditions:

1) Random: assume representative sample

2) 10% Condition: It is safe to assume that 53 subjects is less than 10% of all subjects

3) Success/Failure: $n\hat{p} = 53(0.27) = 14.31 \ge 10$ and $n\hat{q} = 53(0.73) = 38.69 \ge 10$

All conditions have been met to use the Normal model for a 1-proportion z-interval.

Mechanics:

$n = 53$, $\hat{p} = 0.27$, $CL = 95\%$, $z^* = 1.96$

CI: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.27 \pm 1.96 \sqrt{\dfrac{(0.27)(0.73)}{53}} = 0.27 \pm 0.1195 = (0.1505, 0.3895)$

Conclusion:

We are 95% confident that the true proportion of subjects that show improvement lies between 15.05% and 38.95%.
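The mechanics above can be wrapped in a small reusable function. This is a sketch rather than a reference implementation: `one_prop_z_interval` is a hypothetical name, and it computes z* with `statistics.NormalDist` instead of reading it from a table.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(p_hat, n, confidence):
    """Return (low, high) for a one-proportion z-interval."""
    z_star = abs(NormalDist().inv_cdf((1 - confidence) / 2))
    me = z_star * sqrt(p_hat * (1 - p_hat) / n)   # margin of error
    return p_hat - me, p_hat + me

low, high = one_prop_z_interval(p_hat=0.27, n=53, confidence=0.95)
print(f"({low:.4f}, {high:.4f})")
```

The same call with a different `p_hat`, `n`, and `confidence` handles any other one-proportion interval, provided the conditions have been checked first.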

p. 456 #11. In January 2007 Consumer Reports published their study of bacterial contamination of chicken sold in the United States. They purchased 525 broiler chickens from various kinds of food stores in 23 states and tested them for types of bacteria that cause food-borne illnesses. Laboratory results indicated that 83% of these chickens were infected with Campylobacter. Construct a 95% confidence interval.

p. 456 #11. Contaminated Chicken

$n = 525$, $\hat{p} = 0.83$

Conditions:

• Random: assume sample is representative

• 10% Condition: 525 chickens is less than 10% of all chickens for sale

• Success/Failure: $n\hat{p} = 525(0.83) = 435.75 \ge 10$ and $n\hat{q} = 525(0.17) = 89.25 \ge 10$

All conditions have been met to use the Normal model for a 1-proportion z-interval.

CI: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.83 \pm 1.96 \sqrt{\dfrac{(0.83)(0.17)}{525}} = (0.7979, 0.8621)$

We are 95% confident that the true proportion of broiler chickens infected with Campylobacter lies between 79.8% and 86.2%.

p. 456 #18. Direct mail advertisers send solicitations (a.k.a. “junk mail”) to thousands of potential customers in the hope that some will buy the company’s product. The acceptance rate is usually quite low. Suppose a company wants to test the response to a new flyer, and sends it to 1000 people randomly selected from their mailing list of over 200,000 people. They get orders from 123 of the recipients. Create a 90% confidence interval for the percentage of people the company contacts who may buy something.

p. 456 #18. Junk Mail

$n = 1000$, $\hat{p} = \dfrac{123}{1000} = 0.123$

Conditions:

• Random: stated as a random sample

• 10% Condition: 1000 people is less than 10% of the 200,000 people on the mailing list

• Success/Failure: $n\hat{p} = 123 \ge 10$ and $n\hat{q} = 877 \ge 10$

All conditions have been met to use the Normal model for a 1-proportion z-interval.

CI: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.123 \pm 1.645 \sqrt{\dfrac{(0.123)(0.877)}{1000}} = (0.1059, 0.1401)$

We are 90% confident that the true proportion of people contacted that buy something lies between 10.6% and 14.0%.

What does __% confidence mean?

Stock Statement:

About ___% of random samples of size (n) will produce confidence intervals that contain the true proportion of ___

Ex. #18. What does 90% confidence mean?

About 90% of random samples of size 1000 will produce confidence intervals that contain the true proportion of people contacted who will buy something.
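The stock statement can be illustrated by simulation. The sketch below makes one loud assumption: it pretends the true proportion is 0.123 (in practice it is unknown), then draws many random samples of size 1000 and counts how often the resulting 90% interval captures that value. The function name `coverage` and the number of repetitions are hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(true_p, n, confidence, reps, seed=0):
    """Fraction of simulated one-proportion z-intervals containing true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = abs(NormalDist().inv_cdf((1 - confidence) / 2))
    hits = 0
    for _ in range(reps):
        # One random sample: count the "successes" among n draws.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        me = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - me <= true_p <= p_hat + me:
            hits += 1
    return hits / reps

# true_p = 0.123 is assumed for the demonstration only.
rate = coverage(true_p=0.123, n=1000, confidence=0.90, reps=1000)
print(f"share of intervals containing the true proportion: {rate:.3f}")
```

The printed share should land near 0.90, which is what "90% confidence" describes: a property of the interval-building procedure across many samples, not of any single interval.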

d) Since 5% lies below the interval, it is suggested that the company run the mass mailing.

From a previous experiment we found the cure rate to be 27%. How many subjects would we need in a new experiment to create a 98% confidence interval with a ME of only ±5%?

$p = 0.27$, $CL = 98\%$, $z^* = 2.326$, $ME = 0.05$

$ME = z^* \sqrt{\dfrac{pq}{n}}$

$0.05 = 2.326 \sqrt{\dfrac{(0.27)(0.73)}{n}}$

Solving for n gives about 426.5, which rounds up to $n = 427$

• 95% Confidence, ME = 0.03, $\hat{p} = 0.36$:

$0.03 = 1.96 \sqrt{\dfrac{(0.36)(0.64)}{n}}$, so $n = 984$

• 90% Confidence, ME = 0.045, $\hat{p} = 0.27$:

$0.045 = 1.645 \sqrt{\dfrac{(0.27)(0.73)}{n}}$, so $n = 264$
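Solving the margin-of-error formula for n gives $n = \dfrac{(z^*)^2\, pq}{ME^2}$, always rounded up to the next whole subject. The function below is a minimal sketch of that rearrangement; the name `required_sample_size` is hypothetical, and z* is computed with `statistics.NormalDist` rather than taken from a table.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(p, margin, confidence):
    """Smallest whole n whose margin of error is at most `margin`."""
    z_star = abs(NormalDist().inv_cdf((1 - confidence) / 2))
    return ceil(z_star ** 2 * p * (1 - p) / margin ** 2)

# The three sample-size problems worked above.
print(required_sample_size(p=0.27, margin=0.05, confidence=0.98))
print(required_sample_size(p=0.36, margin=0.03, confidence=0.95))
print(required_sample_size(p=0.27, margin=0.045, confidence=0.90))
```

Rounding up rather than to the nearest integer matters: rounding down would leave the margin of error slightly larger than requested.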