11.1 – Significance Tests: The basics

Upload: neil-alston

Post on 31-Dec-2015



TRANSCRIPT

Page 1: 11.1 – Significance Tests:  The basics

11.1 – Significance Tests: The basics

Page 2: 11.1 – Significance Tests:  The basics

Inference:

to assess the evidence provided by the sample to claim information about the population.

x̄ → μ

p̂ → p

Page 3: 11.1 – Significance Tests:  The basics

Hypothesis:

A claim made about a population parameter. Sample data are gathered to assess whether the evidence supports the claim.

Page 4: 11.1 – Significance Tests:  The basics

Null Hypothesis:

The statement being tested. We believe this to be true until we get evidence against it.

NOTATION:

$H_0: \mu = \mu_0$

Page 5: 11.1 – Significance Tests:  The basics

Alternate Hypothesis:

Statement we hope or suspect is true instead of the null hypothesis

NOTATION:

$H_a: \mu > \mu_0$, $H_a: \mu < \mu_0$, or $H_a: \mu \neq \mu_0$

Page 6: 11.1 – Significance Tests:  The basics

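$H_a: \mu > \mu_0$ (One-Tailed, right)

$H_a: \mu < \mu_0$ (One-Tailed, left)

$H_a: \mu \neq \mu_0$ (Two-Tailed)

[Figures: sampling distribution of x̄ with the rejection region shaded for each alternative]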

Page 7: 11.1 – Significance Tests:  The basics

Test Statistic:

A sample statistic that is computed from the data. It helps us to make a statistical decision. Do we have enough evidence to reject the null hypothesis or not?

$\text{test statistic} = \dfrac{\text{estimate} - \text{hypothesized value}}{\text{standard deviation of the estimate}}$

$Z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
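As a minimal sketch of the formula above, the one-sample z statistic can be computed in Python. The function name `z_statistic` and the numbers passed to it are illustrative assumptions, not part of the slides.

```python
import math

def z_statistic(sample_mean, mu_0, sigma, n):
    """One-sample z statistic: (estimate - hypothesized value) / SD of the estimate."""
    standard_error = sigma / math.sqrt(n)  # standard deviation of the sample mean
    return (sample_mean - mu_0) / standard_error

# Hypothetical numbers chosen only for illustration.
z = z_statistic(sample_mean=52.0, mu_0=50.0, sigma=8.0, n=64)
print(round(z, 2))  # 2.0
```

Here the standard error is 8 / √64 = 1, so a sample mean 2 units above the hypothesized value sits 2 standard errors away.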

Page 8: 11.1 – Significance Tests:  The basics

p-value:

• This value measures how much evidence you have against the null hypothesis.

• Small p-values indicate the outcome measured from the sample data is unlikely given the null hypothesis is true. It provides strong evidence against your null hypothesis.
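One way to turn a z statistic into a p-value is sketched below, using the standard library's `statistics.NormalDist`. The function name `p_value` and the labels "greater", "less", and "two-sided" are this sketch's own choices, not notation from the slides.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """P-value of a z statistic, computed assuming the null hypothesis is true.

    alternative: "greater" (Ha: mu > mu_0), "less" (Ha: mu < mu_0),
    or "two-sided" (Ha: mu != mu_0). These labels are illustrative.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        return 1 - std_normal.cdf(z)       # area to the right of z
    if alternative == "less":
        return std_normal.cdf(z)           # area to the left of z
    if alternative == "two-sided":
        return 2 * std_normal.cdf(-abs(z))  # both tails
    raise ValueError(f"unknown alternative: {alternative!r}")

# Larger |z| means a smaller p-value, i.e. stronger evidence against the null.
for z in (0.5, 1.5, 2.5):
    print(z, round(p_value(z, "two-sided"), 4))
```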

Page 9: 11.1 – Significance Tests:  The basics

Statistically Significant:

An event unlikely to occur by chance. If your p-value is smaller than a cutoff fixed in advance, the result is statistically significant. That cutoff is called alpha, α.

Page 10: 11.1 – Significance Tests:  The basics

Significance Level:

The decisive p-value we fix in advance. This states when the null hypothesis should be rejected. This level is compared to the p-value. Common levels of rejection are α = 0.10, α = 0.05, and α = 0.01.

If p < α, then reject the null

If p ≥ α, then fail to reject the null

Page 11: 11.1 – Significance Tests:  The basics

Conditions:

SRS

Normality

Independence

Page 12: 11.1 – Significance Tests:  The basics

Example #1 – State the notation for the null and alternative hypothesis

a. Suppose we work in the quality control department of Ruffles Potato Chips. The quality control manager wants us to verify that the filling machine is calibrated properly. We wish to determine if the mean amount of chips in a bag is different from the advertised 12.5 ounces. The company is concerned if there are too many or too few chips in the bag.

$H_0: \mu = 12.5$ and $H_a: \mu \neq 12.5$

Page 13: 11.1 – Significance Tests:  The basics

$H_0: \mu = \$89$ and $H_a: \mu < \$89$

Example #1 – State the notation for the null and alternative hypothesis

b. According to the US Department of Agriculture, the mean farm rent in Indiana was $89 per acre in 1995. A researcher for the USDA claims that the mean rent has decreased since then. He randomly selected 50 farms from Indiana and determined the mean farm rent to be $67.

Page 14: 11.1 – Significance Tests:  The basics

$H_0: \mu = 0$ and $H_a: \mu > 0$

Example #1 – State the notation for the null and alternative hypothesis

c. Researchers claim to have found a brain protein that blocks the craving for fatty food and therefore increases the loss of body fat. To test this theory, 100 people are treated with the protein and the reduction in body fat is measured.

Page 15: 11.1 – Significance Tests:  The basics

Example #2

At the bakery where you work, loaves of bread are supposed to weigh 1 pound. From experience, the weights of loaves produced at the bakery follow a Normal distribution with standard deviation σ = 0.13 pounds. You believe that new personnel are producing loaves that are heavier than 1 pound. As supervisor of Quality Control, you want to test your claim at the 5% significance level. You weigh 20 loaves and obtain a mean weight of 1.05 pounds.

a. Identify the parameter of interest. State your null and alternative hypotheses.

μ = True mean weight of loaves of bread

$H_0: \mu = 1$ and $H_a: \mu > 1$

Page 16: 11.1 – Significance Tests:  The basics


b. Verify the conditions are met.

SRS (must assume)

Normality (yes, the population is approximately Normal; therefore, so is the sampling distribution)

Independence (there are more than 200 loaves of bread)

Page 17: 11.1 – Significance Tests:  The basics


c. Calculate the test statistic and the P-value. Illustrate using the graph provided.

$Z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \dfrac{1.05 - 1}{0.13 / \sqrt{20}} = \dfrac{0.05}{0.029} = 1.72$

Page 18: 11.1 – Significance Tests:  The basics

[Figure: standard Normal curve with the area to the right of Z = 1.72 shaded]

P(Z > 1.72) = 1 – P(Z < 1.72) = 1 – 0.9573 = 0.0427
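The table lookup can be checked numerically. The following is a small sketch using the numbers from the bakery example; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values from the bakery example.
x_bar, mu_0, sigma, n = 1.05, 1.0, 0.13, 20

z = (x_bar - mu_0) / (sigma / math.sqrt(n))
p = 1 - NormalDist().cdf(z)  # Ha: mu > 1, so use the upper tail

print(f"z = {z:.2f}, P-value = {p:.4f}")  # z = 1.72, P-value = 0.0427
```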

Page 21: 11.1 – Significance Tests:  The basics


d. State your conclusions clearly in complete sentences.

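Because the P-value (0.0427) is less than 0.05, I would reject the null hypothesis at the 0.05 level. There is enough evidence to conclude that the new personnel are producing loaves with a mean weight greater than 1 pound.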

Page 22: 11.1 – Significance Tests:  The basics

11.2 - Carrying Out Significance Tests

Page 23: 11.1 – Significance Tests:  The basics

Steps to Hypothesis Testing: PHANTOMS

P: Parameter of interest

H: Hypothesis

A: Assumptions

N: Name of Test

T: Test Statistic

O: Obtain P-Value

M: Make a Statistical Decision

S: Summary in context of problem.

Page 24: 11.1 – Significance Tests:  The basics

One-Sample Z-Test:

Testing the mean when σ is known.

Page 25: 11.1 – Significance Tests:  The basics

Calculator Tip: Z-Test

STAT → TESTS → Z-Test

Page 26: 11.1 – Significance Tests:  The basics

Example #1

An energy official claims that the oil output per well in the US has declined from the 1998 level of 11.1 barrels per day. He randomly samples 50 wells throughout the US and determines the mean output to be 10.7 barrels per day. Assume σ = 1.3 barrels. Test the official's claim at the α = 0.05 level.

P: Mean oil output per well in the US

H: $H_0: \mu = 11.1$ and $H_a: \mu < 11.1$

Page 27: 11.1 – Significance Tests:  The basics

A: SRS (says so)

Normality ($n \geq 30$, so by the CLT, approx normal)

Independence (safe to assume more than 500 wells in the US)

N: ZTest

Page 28: 11.1 – Significance Tests:  The basics

T: $Z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \dfrac{10.7 - 11.1}{1.3 / \sqrt{50}} = \dfrac{-0.4}{0.1838} = -2.176$

Page 29: 11.1 – Significance Tests:  The basics

O: [Figure: standard Normal curve with the area to the left of Z = -2.176 shaded]

$P(Z < -2.176) = 0.0146$

Page 32: 11.1 – Significance Tests:  The basics

M: $p = 0.0146 < \alpha = 0.05$

Reject the Null

Page 33: 11.1 – Significance Tests:  The basics

S: There is enough evidence to reject the claim that the average oil output per well in the US is 11.1 barrels per day.
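The T, O, and M steps of this example can be sketched as one function. This is an illustration under the stated assumptions (σ known, lower-tailed alternative); the name `z_test_less` is hypothetical, and it is not the calculator's Z-Test routine.

```python
import math
from statistics import NormalDist

def z_test_less(x_bar, mu_0, sigma, n, alpha):
    """One-sample z test of H0: mu = mu_0 against Ha: mu < mu_0."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))  # T: test statistic
    p = NormalDist().cdf(z)                      # O: lower-tail P-value
    # M: compare the P-value with the significance level
    decision = "reject H0" if p < alpha else "fail to reject H0"
    return z, p, decision

# Values from the oil-well example.
z, p, decision = z_test_less(x_bar=10.7, mu_0=11.1, sigma=1.3, n=50, alpha=0.05)
print(f"z = {z:.3f}, P-value = {p:.4f}, decision: {decision}")
```

Software evaluates the tail area at the unrounded z and reports a P-value of about 0.0148, slightly different from the 0.0146 on the slide, which matches a table lookup at z = -2.18. The decision is the same either way.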

Page 34: 11.1 – Significance Tests:  The basics

Example #2

The average daily volume of Dell computer stock in 2000 was 31.8 million shares with a standard deviation of 14.8 million shares according to Yahoo! A stock analyst claims that the stock volume in 2001 is different from the 2000 level. Based on a random sample of 35 trading days in 2001, he finds the sample mean to be 37.2 million shares. Test the analyst’s claim at the α = 0.01 level.

P: Mean volume of Dell computer stock

H: $H_0: \mu = 31.8$ and $H_a: \mu \neq 31.8$

Page 35: 11.1 – Significance Tests:  The basics

A: SRS (says so)

Normality ($n \geq 30$, so by the CLT, approx normal)

Independence (safe to assume more than 350 trading days)

N: ZTest

Page 36: 11.1 – Significance Tests:  The basics

T: $Z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \dfrac{37.2 - 31.8}{14.8 / \sqrt{35}} = \dfrac{5.4}{2.50} = 2.16$

Page 37: 11.1 – Significance Tests:  The basics

O: [Figure: standard Normal curve with both tails shaded, beyond Z = -2.16 and Z = 2.16]

$2[P(Z < -2.16)] = 2[0.0154] = 0.0308$

Page 40: 11.1 – Significance Tests:  The basics

M: $p = 0.0308 > \alpha = 0.01$

Fail to reject the Null

Page 41: 11.1 – Significance Tests:  The basics

S: There is not enough evidence to claim that the average daily volume of Dell stock is different from 31.8 million shares.

Page 42: 11.1 – Significance Tests:  The basics

Duality of Confidence Intervals and Hypothesis Testing

If the confidence interval does not contain μo, we have evidence that supports the alternative hypothesis, so we reject the null hypothesis at the α level (where the confidence level is 1 – α).

Note: The Confidence Interval matches the two-tailed test only!
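The duality can be checked numerically: for a two-tailed z test at level α and a (1 – α) z interval, "the test rejects" and "the interval excludes μo" should always agree. The sketch below uses made-up data (x̄ = 100, σ = 15, n = 36); the function names are illustrative.

```python
import math
from statistics import NormalDist

def two_sided_rejects(x_bar, mu_0, sigma, n, alpha):
    """True if the two-tailed z test rejects H0: mu = mu_0 at level alpha."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    return 2 * NormalDist().cdf(-abs(z)) < alpha

def interval_excludes(x_bar, mu_0, sigma, n, alpha):
    """True if mu_0 lies outside the (1 - alpha) z confidence interval."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sigma / math.sqrt(n)
    return not (x_bar - margin <= mu_0 <= x_bar + margin)

# Hypothetical data: x_bar = 100, sigma = 15, n = 36, alpha = 0.05.
for mu_0 in (90, 94, 96, 100, 104, 106, 110):
    agree = (two_sided_rejects(100, mu_0, 15, 36, 0.05)
             == interval_excludes(100, mu_0, 15, 36, 0.05))
    print(mu_0, agree)  # the two methods agree for every mu_0 tried
```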

Page 43: 11.1 – Significance Tests:  The basics

Example #3

The average daily volume of Dell computer stock in 2000 was 31.8 million shares with a standard deviation of 14.8 million shares according to Yahoo! A stock analyst claims that the stock volume in 2001 is different from the 2000 level. Based on a random sample of 35 trading days in 2001, he finds the sample mean to be 37.2 million shares. Test the analyst’s claim at the α = 0.01 level.

a. What was your conclusion from this hypothesis test in Example #2?

We failed to reject the null hypothesis that the mean volume is 31.8 million shares per day.

Page 44: 11.1 – Significance Tests:  The basics

b. Construct a 99% confidence interval for the true average daily volume of Dell Computer stock in 2001.

Note: We already did P and A

N: Z-Interval

Page 45: 11.1 – Significance Tests:  The basics

I: $\bar{x} \pm z^* \dfrac{\sigma}{\sqrt{n}} = 37.2 \pm 2.576 \cdot \dfrac{14.8}{\sqrt{35}} = 37.2 \pm 6.444 = (30.756,\ 43.644)$
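The interval can be reproduced with a short sketch. The helper name `z_interval` is an assumption for illustration; the critical value z* comes from `NormalDist.inv_cdf` rather than a table.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 2.576 for 99%
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Values from the Dell example.
low, high = z_interval(x_bar=37.2, sigma=14.8, n=35, confidence=0.99)
print(f"({low:.3f}, {high:.3f})")  # (30.756, 43.644)
```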

Page 46: 11.1 – Significance Tests:  The basics

C: I am 99% confident the true mean daily volume of Dell stock in 2001 is between 30.756 and 43.644 million shares.

c. Does this interval reaffirm your statistical decision from the hypothesis test? Explain.

Yes. 31.8 is in the interval, so we cannot conclude that the mean volume is different from 31.8.

Page 47: 11.1 – Significance Tests:  The basics

Example #4

Does marijuana use affect anger expression? Assume that for all non-users, the mean score on an anger expression scale is 41.5 with a standard deviation of 6.05. For a random sample of 47 frequent marijuana users, the mean score was 44.

a. Test the claim that marijuana affects the expression of anger at the α = 0.05 level.

P: True mean anger expression for marijuana users

H: $H_0: \mu = 41.5$ and $H_a: \mu \neq 41.5$

Page 48: 11.1 – Significance Tests:  The basics

A: SRS (says so)

Normality ($n \geq 30$, so by the CLT, approx normal)

Independence (safe to assume more than 470 marijuana users)

N: ZTest

Page 49: 11.1 – Significance Tests:  The basics

T: $Z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \dfrac{44 - 41.5}{6.05 / \sqrt{47}} = \dfrac{2.5}{0.88} = 2.83$

Page 50: 11.1 – Significance Tests:  The basics

O: [Figure: standard Normal curve with both tails shaded, beyond Z = -2.83 and Z = 2.83]

$2[P(Z < -2.83)] = 2[0.0023] = 0.0046$

Page 53: 11.1 – Significance Tests:  The basics

M: $p = 0.0046 < \alpha = 0.05$

Reject the Null

Page 54: 11.1 – Significance Tests:  The basics

S: There is enough evidence to reject the claim that the average anger expression for marijuana users is 41.5.

Does marijuana use affect anger expression?

Yes, anger expression is different for marijuana users.

Page 55: 11.1 – Significance Tests:  The basics

b. Calculate a 95% confidence interval for the mean anger expression of frequent marijuana users. Does this interval reaffirm your statistical decision in part a?

N: Z-Interval

Page 56: 11.1 – Significance Tests:  The basics

I: $\bar{x} \pm z^* \dfrac{\sigma}{\sqrt{n}} = 44 \pm 1.96 \cdot \dfrac{6.05}{\sqrt{47}} = 44 \pm 1.72966 = (42.27,\ 45.73)$

Page 57: 11.1 – Significance Tests:  The basics

C: I am 95% confident the true mean anger expression for marijuana users is between 42.27 and 45.73.

b. Does this interval reaffirm your statistical decision in part a?

Yes. 41.5 is not in the interval, so we reject the claim that the mean is 41.5.

Page 58: 11.1 – Significance Tests:  The basics

11.3 – Use and Abuse of Tests

11.4 – Using Inference to Make Decisions

Page 59: 11.1 – Significance Tests:  The basics

What level to use?

How plausible is Ho? If it represents an assumption that the people you must convince have believed for years, strong evidence (small α) will be needed.

What are the consequences of rejecting Ho? Rejecting it might mean making major changes to accept Ha.

Consider the sample: do you need to increase the sample size or look for outliers? Is the sample a true representation of the population?

Remember that a certain percent of the time (e.g., 5% when α = 0.05) you will reject a true null hypothesis by chance alone. Repeating the test on new samples helps to check this.

Page 60: 11.1 – Significance Tests:  The basics

What level to use?

• Typically OK to use α = 0.05

• Beware of treating p-values of 0.049 and 0.051 differently! There is no sharp border between "significant" and "not significant".
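To see how little separates those two p-values, the sketch below converts two-tailed p-values back into the |z| that would produce them. The helper name `z_for_two_sided_p` is illustrative.

```python
from statistics import NormalDist

def z_for_two_sided_p(p):
    """|z| whose two-tailed P-value equals p."""
    return NormalDist().inv_cdf(1 - p / 2)

for p in (0.049, 0.050, 0.051):
    print(p, round(z_for_two_sided_p(p), 3))
```

The test statistics behind p = 0.049 and p = 0.051 differ by roughly 0.02, far too little to justify opposite conclusions.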

Page 61: 11.1 – Significance Tests:  The basics

Errors in Hypothesis Testing:

Because a statistician must make inferences (or conclusions) based on random data that are subject to sampling error, we can make mistakes in hypothesis testing. In fact, there are two types of errors that can be made.

                    Ho True                  Ho False
Reject Ho           Type I Error (p = α)     Correct (p = 1 – β, the power of the test)
Do not Reject Ho    Correct                  Type II Error (p = β)

Note: You will never have to calculate β.
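A small simulation can illustrate why the Type I error probability equals α. This sketch assumes the null hypothesis is true and draws each sample mean directly from its sampling distribution, Normal(μo, σ/√n), instead of simulating individual observations. The function name, the seed, and the parameter values are all illustrative.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(mu_0, sigma, n, alpha, trials, seed=0):
    """Fraction of simulated samples, with H0 true, where a two-tailed z test rejects."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    se = sigma / math.sqrt(n)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        x_bar = rng.gauss(mu_0, se)  # sample mean drawn under H0
        if abs((x_bar - mu_0) / se) > z_star:
            rejections += 1          # H0 is true, so every rejection is a Type I error
    return rejections / trials

rate = type_i_error_rate(mu_0=1.0, sigma=0.13, n=20, alpha=0.05, trials=20000)
print(rate)  # should land close to alpha = 0.05
```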

Page 62: 11.1 – Significance Tests:  The basics

To reduce type II error and increase the power of the test:

• Increase the sample size

• Increase the significance level alpha. (Be careful: if we choose an alpha so small that it almost guarantees never making a Type I error, then the Type II error probability is large, because it would be hard to reject the null under any circumstance.)
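Both effects can be seen in a short power calculation for an upper-tailed z test. This is a sketch under assumptions: it reuses the bakery setting (μo = 1, σ = 0.13) and supposes, purely for illustration, that the true mean is 1.05 pounds. The function name `power_upper_tail` is hypothetical.

```python
import math
from statistics import NormalDist

def power_upper_tail(mu_0, mu_true, sigma, n, alpha):
    """Power of the z test of H0: mu = mu_0 vs Ha: mu > mu_0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)              # reject H0 when Z > z_crit
    shift = (mu_true - mu_0) / (sigma / math.sqrt(n))   # mean of Z when mu = mu_true
    return 1 - std_normal.cdf(z_crit - shift)

# Assumed true mean of 1.05 pounds (hypothetical).
for n in (20, 40, 80):
    for alpha in (0.01, 0.05, 0.10):
        print(n, alpha, round(power_upper_tail(1.0, 1.05, 0.13, n, alpha), 3))
```

Power rises as n grows and as α grows, which is the trade-off described in the bullets above.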

Page 63: 11.1 – Significance Tests:  The basics

Example #1: In a criminal trial, the defendant is held to be innocent until shown to be guilty beyond a reasonable doubt. If we consider hypotheses

H0: defendant is innocent
Ha: defendant is guilty

we can reject H0 only if the evidence strongly favors Ha.

1. Make a diagram that shows the truth about the defendant and the possible verdicts, and that identifies the two types of error. Which type of error is more serious?

                    Ho True                         Ho False
Reject Ho           I: Innocent and found guilty    Guilty and found guilty
Do not Reject Ho    Innocent and found innocent     II: Guilty and found innocent

Type I error is more serious

Page 64: 11.1 – Significance Tests:  The basics

2. Is this goal better served by a test with α = 0.20 or a test with α = 0.01? Explain your answer.

α = 0.01, because the probability of a Type I error (an innocent defendant found guilty) would then be smaller.

Page 65: 11.1 – Significance Tests:  The basics

3. Explain what is meant by the power of the test in this setting.

Power: the probability that a defendant who is in fact guilty is found guilty.

Page 66: 11.1 – Significance Tests:  The basics

Example #2: For each of the following situations, state the null and alternative hypotheses, and identify when a Type I and a Type II error would occur.

a. A company specializing in parachute assembly claims that its competitor’s main parachute failure rate is more than 1%. You perform a hypothesis test to determine whether the company’s claim is true. Which error is more serious?

Ho: The main parachute failure rate is 1%

Ha: The main parachute failure rate is more than 1%

Page 67: 11.1 – Significance Tests:  The basics

                    Ho True                                             Ho False
Reject Ho           I: Failure rate is 1% and think it's more than 1%   Failure rate is not 1% and think it's more than 1%
Do not Reject Ho    Failure rate is 1% and think it is 1%               II: Failure rate is not 1% and think it is 1%

Type II error is more serious

Page 68: 11.1 – Significance Tests:  The basics

Example #2: For each of the following situations, state the null and alternative hypotheses, and identify when a Type I and a Type II error would occur.

b. A company that produces snack foods uses a machine to package 454 gram bags of pretzels. If it is working properly, the bags will be exactly 454 grams. You perform a hypothesis test to determine whether the company is packaging the right amount of grams per bag.

Ho: The mean weight of pretzels in a bag is 454 grams

Ha: The mean weight of pretzels in a bag is not 454 grams

Page 69: 11.1 – Significance Tests:  The basics

                    Ho True                                     Ho False
Reject Ho           I: 454 grams in bag and don’t think 454g    Not 454g in bag and don’t think 454g
Do not Reject Ho    454 grams in bag and think 454g             II: Not 454g in bag and think 454g