Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 12: Inference About A Population

Upload: gwen-douglas

Post on 03-Jan-2016



Chapter 12

Inference About A Population


Inference With Variance Unknown…

Previously, we looked at estimating and testing the population mean when the population standard deviation ($\sigma$) was known or given:

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \qquad \text{and} \qquad \bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

But in general we do not know the actual population standard deviation and have to estimate it from the data.

As soon as we do this, the z-statistic used in these formulas is replaced by a t-statistic (the Student t-statistic), provided we are sampling from a normal population.


Inference With Variance Unknown…

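When $\sigma$ is unknown, we use its point estimator $s$, and the z-statistic is replaced by the t-statistic:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

Note that the t-statistic has one parameter, called the “degrees of freedom”. (The normal distribution, by comparison, has two parameters: the mean and the standard deviation.)

The degrees of freedom for the single-mean problems we are working on is given by d.f. = $\nu = n - 1$.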


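Testing $\mu$ when $\sigma$ is unknown…

When the population standard deviation is unknown and the population is normal, the test statistic for testing hypotheses about $\mu$ is:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

which is Student t distributed with $\nu = n - 1$ degrees of freedom. The confidence interval estimator of $\mu$ is given by:

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}, \qquad \nu = n - 1$$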


Example 12.1…

In a clinical trial, if the average time for a drug to take effect is greater than 450 minutes, it is declared ineffective. Thus, each new drug must be subjected to the following hypothesis test. If the null hypothesis is rejected in favor of the alternative hypothesis, the drug is not approved.

$H_0: \mu \le 450$

$H_1: \mu > 450$

In general we would use a 5% level of significance and in this example we are going to randomly sample 50 patients.

IDENTIFY


Example 12.1…

Our test statistic is:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

With n = 50 data points, we have n – 1 = 49 degrees of freedom. The hypothesis in question is:

$H_1: \mu > 450$

Our rejection region becomes:

$$t > t_{\alpha,\nu} = t_{0.05,\,49} \approx 1.677$$

Thus we will reject the null hypothesis in favor of the alternative if our calculated test statistic falls in this region.

COMPUTE


Example 12.1…

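From the data, we calculate $\bar{x} = 460.38$, $s = 38.83$ and thus:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{460.38 - 450}{38.83/\sqrt{50}} \approx 1.89$$

Since

$$t \approx 1.89 > t_{0.05,\,49} \approx 1.677$$

we reject $H_0$ in favor of $H_1$. That is, there is sufficient evidence to conclude that the new drug is not effective.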

COMPUTE
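The arithmetic in Example 12.1 can be checked with a short Python sketch. The function name `t_statistic` is a hypothetical helper, and the critical value is assumed to be read from a t table (about 1.677 for 49 degrees of freedom at the 5% level) rather than computed here.

```python
import math

def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """One-sample t statistic: (x_bar - mu_0) / (s / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Summary statistics quoted in Example 12.1.
t_value = t_statistic(sample_mean=460.38, hypothesized_mean=450,
                      sample_sd=38.83, n=50)

# Upper-tail critical value t_{0.05, 49}; an assumed input taken from a
# t table, not something this sketch derives.
t_critical = 1.677

# One-sided test of H1: mu > 450, so reject H0 when t is in the upper tail.
reject_h0 = t_value > t_critical
print(f"t = {t_value:.2f}, reject H0: {reject_h0}")
```

Because the alternative is one-sided, only the upper tail is compared; a two-sided test would instead compare the absolute value of t against a critical value at α/2.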


Example 12.2…

How do we estimate the mean time for a drug to take effect when the standard deviation is unknown? Same problem but different data.

Assume we take a random sample of n = 83 patients and measure the time it takes for the drug to take effect. We want to construct a 95% confidence interval for the mean time, i.e. what is:

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

IDENTIFY


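Example 12.2…

From the data, we calculate the sample mean $\bar{x}$ and the sample standard deviation $s$.

For this term

$$t_{\alpha/2,\,\nu} = t_{0.025,\,82} \approx 1.989$$

and so:

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = \bar{x} \pm 1.989\,\frac{s}{\sqrt{83}}$$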

COMPUTE


Example 12.2…

We are 95% confident that the population mean, $\mu$, i.e. the mean time for the drug to become effective, lies between 13.20 minutes and 16.84 minutes.

If the sample size had been n = 21, what value of t would you use?

If the sample size had been n = 5, what value of t would you use?

INTERPRET
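The two questions above come down to looking up $t_{0.025,\,n-1}$ for different sample sizes. As a rough sketch, the value can also be computed numerically with only the Python standard library: integrate the Student t density with Simpson's rule and invert it by bisection. The helper names `t_pdf`, `t_cdf` and `t_critical` are hypothetical, and the search range of 0 to 50 is an assumption that suits ordinary confidence levels with n ≥ 2. In practice a statistics library routine such as `scipy.stats.t.ppf` would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, n):
    """Two-sided critical value t_{alpha/2, n-1}, found by bisection."""
    df = n - 1
    target = 1 - (1 - confidence) / 2   # e.g. 0.975 for 95% confidence
    lo, hi = 0.0, 50.0                  # assumed search range
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for n in (83, 21, 5):
    print(n, round(t_critical(0.95, n), 3))
```

The printed values should be close to 1.989, 2.086 and 2.776, matching a t table. The critical value grows as n shrinks, which is why small samples give wider intervals for the same s.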


Check Requisite Conditions…

The Student t distribution is robust, which means that if the population is nonnormal, the results of the t-test and confidence interval estimate are still valid provided that the population is “not extremely nonnormal”.

To check this requirement, draw a histogram of the data and see how “bell shaped” the resulting figure is. If the histogram is extremely skewed, the data could be considered “extremely nonnormal”, and hence the t-statistic would not be valid. There are also formal statistical tests of the hypothesis that the data come from a normal distribution. It is always wise to check this, especially if the sample size is small.
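A quick, informal version of this check can be sketched in Python: bin the data into a rough histogram and compute a moment-based sample skewness. The helper names and both data sets below are made up purely for illustration, and the sketch assumes the data are not all identical. It is not one of the formal normality tests mentioned above.

```python
def histogram_counts(data, bins=5):
    """Count observations in equal-width bins between min and max."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # The maximum value is placed in the last bin.
        index = min(int((x - lo) / width), bins - 1)
        counts[index] += 1
    return counts

def sample_skewness(data):
    """Moment-based skewness: third central moment over sd cubed."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Hypothetical data sets, invented for this illustration.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 5, 9, 20]

for name, data in (("symmetric", symmetric), ("right_skewed", right_skewed)):
    print(name, histogram_counts(data), round(sample_skewness(data), 2))
```

A skewness near zero with a roughly bell-shaped set of counts is reassuring; a large positive or negative skewness suggests looking more carefully before relying on the t-statistic.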


Inference About Population Variance…

If we are interested in drawing inferences about a population’s variability, the parameter we need to investigate is the population variance, $\sigma^2$.

The sample variance ($s^2$) is an unbiased, consistent and efficient point estimator for $\sigma^2$. Moreover, when the sampled population is normal, the statistic

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$

has a chi-squared distribution with n – 1 degrees of freedom.


Testing & Estimating Population Variance

The test statistic used to test hypotheses about $\sigma^2$ is:

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$

(which is chi-squared with $\nu = n - 1$ degrees of freedom).
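As a small illustration of the chi-squared statistic for a variance, the sketch below evaluates $(n-1)s^2/\sigma_0^2$. The function name and the numbers (n = 25, s² = 0.8, hypothesized variance 1.0) are hypothetical and do not come from the slides, and the critical value is assumed to be looked up in a chi-squared table.

```python
def chi_squared_statistic(sample_variance, hypothesized_variance, n):
    """Test statistic (n - 1) * s^2 / sigma_0^2, with n - 1 degrees of freedom."""
    return (n - 1) * sample_variance / hypothesized_variance

# Hypothetical inputs chosen only to show the calculation.
n = 25
chi2_value = chi_squared_statistic(sample_variance=0.8,
                                   hypothesized_variance=1.0, n=n)

# Lower-tail critical value for 24 degrees of freedom at the 5% level,
# assumed to be read from a chi-squared table.
lower_critical = 13.848

# For H1: sigma^2 < 1.0 we would reject H0 when the statistic is small.
reject_h0 = chi2_value < lower_critical
print(f"chi-squared = {chi2_value:.1f}, reject H0: {reject_h0}")
```

Unlike the t distribution, the chi-squared distribution is not symmetric, so lower-tail and upper-tail critical values have to be looked up separately.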


Inference: Population Proportion…

Test statistic for p:

$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$

The confidence interval estimator for p is given by:

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

(both of which require that np > 5 and n(1 – p) > 5). If this condition is not satisfied, we can still work the problem with a different statistical approach.
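The z-test and interval for a proportion can be sketched with Python's standard library, using `statistics.NormalDist` for the normal quantile. The function names and the sample figures (120 successes out of 200, null value 0.5) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

def normal_approximation_ok(n, p):
    """Rule of thumb from the text: np > 5 and n(1 - p) > 5."""
    return n * p > 5 and n * (1 - p) > 5

def proportion_z_statistic(successes, n, p0):
    """z = (p_hat - p0) / sqrt(p0 (1 - p0) / n)."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def proportion_confidence_interval(successes, n, confidence=0.95):
    """p_hat +/- z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 120 successes in 200 trials, testing p = 0.5.
z_value = proportion_z_statistic(successes=120, n=200, p0=0.5)
low, high = proportion_confidence_interval(successes=120, n=200)
print(normal_approximation_ok(200, 0.5), round(z_value, 3),
      round(low, 4), round(high, 4))
```

Note that the test statistic uses the hypothesized p in the standard error, while the interval uses the sample proportion, as in the two formulas above.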


Determine the sample size necessary to estimate the population proportion to within ± B with 95% confidence…

Two methods: in each case we choose a value for $\hat{p}$, then solve the equation

$$B = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

for n, which gives

$$n = \left(\frac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{B}\right)^2$$

Method 1: we have no knowledge of even a rough value of $\hat{p}$. This is a ‘worst case scenario’, so we substitute $\hat{p} = 0.50$.

Method 2: we have some idea about the value of $\hat{p}$. This is a better scenario, and we substitute our estimated value.


Selecting the Sample Size…

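Method 1: no knowledge of the value of $\hat{p}$, use 50%:

$$n = \left(\frac{1.96\sqrt{(0.50)(0.50)}}{B}\right)^2 = \frac{0.9604}{B^2}$$

Method 2: some idea about a possible value, say 20%:

$$n = \left(\frac{1.96\sqrt{(0.20)(0.80)}}{B}\right)^2 \approx \frac{0.6147}{B^2}$$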

Thus, we can sample fewer people if we already have a reasonable estimate of the population proportion before starting.
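To make the comparison concrete, here is a minimal Python sketch of the sample-size calculation. The bound B = 0.03 is an assumed value picked for illustration, since the transcript does not preserve the bound used on the slide, and `sample_size_for_proportion` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p_guess, bound, confidence=0.95):
    """Smallest n with z_{alpha/2} * sqrt(p (1 - p) / n) <= bound."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_exact = (z * math.sqrt(p_guess * (1 - p_guess)) / bound) ** 2
    return math.ceil(n_exact)   # round up so the bound is not exceeded

bound = 0.03  # assumed error bound B, for illustration only

n_worst_case = sample_size_for_proportion(0.50, bound)  # Method 1
n_informed = sample_size_for_proportion(0.20, bound)    # Method 2
print(n_worst_case, n_informed)
```

With this assumed bound, Method 1 asks for 1068 respondents and Method 2 for 683. The product p(1 – p) is largest at p = 0.50, which is why the no-information case always gives the larger sample.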