hypothesis testing iii 2/15/12

Hypothesis Testing III 2/15/12 • Statistical significance • Errors • Power • Significance and sample size Section 4.3 Professor Kari Lock Morgan Duke University


TRANSCRIPT

Page 1: Hypothesis Testing III 2/15/12

Hypothesis Testing III
2/15/12

• Statistical significance
• Errors
• Power
• Significance and sample size

Section 4.3
Professor Kari Lock Morgan
Duke University

Page 2: Hypothesis Testing III 2/15/12

• Project 1 Proposal (due today, 5pm)

• Homework 4 (due Monday)

• If you turn in your HW4 by 5pm on Friday (either slide it under the door of Old Chem 216 or email it to your TA), it will be graded by class on Monday

• NO LATE HOMEWORK ACCEPTED!

• Study/prepare for Exam 1!

To Do

Page 3: Hypothesis Testing III 2/15/12

• Exam 1:
  • In-class portion: Wednesday, 2/22
  • Lab portion: Thursday, 2/23

• In-class portion: (75%)
  • Open only to a calculator and one double-sided page of notes prepared by you
  • Emphasis on conceptual understanding

• Lab portion: (25%)
  • Open to everything except communication of any form with other humans
  • Emphasis on actually analyzing data

Exam 1

Page 4: Hypothesis Testing III 2/15/12

• Last year’s in-class and lab midterms, with solutions, are available on the course website (under documents)

• Full solutions to ALL the essential synthesis and review problems from Units 1 and 2 are available on the course website

• Doing problems is the key to success!!!

Practice

Page 5: Hypothesis Testing III 2/15/12

• Work lots of practice problems!

• Take last year’s exams under realistic conditions (time yourself, do it all before looking at the solutions, etc.)

• Prepare a good cheat sheet and use it when working problems

• Read the corresponding sections in the book if there are concepts you are still confused about

Keys to In-Class Exam Success

Page 6: Hypothesis Testing III 2/15/12

• Primarily, make sure you know how to summarize, visualize, create an interval, and conduct a test for any one variable or relationship between two variables.

• Beyond that, make sure you are comfortable with the content from the labs

• Open-book does NOT mean you don’t have to study. You will not have time to look up every command you need during the exam.

Keys to Lab Exam Success

Page 7: Hypothesis Testing III 2/15/12

You have LOTS of opportunities for help!

• Today, 3 – 5 pm (Prof Morgan)
• Friday, 1 – 3 pm (Prof Morgan)
• Sunday, 5 – 7 pm (Jessica)
• Sunday, 7 – 9 pm (Michael)
• Monday, 3 – 4 pm (Prof Morgan)
• Monday, 4 – 6 pm (Christine)
• Tuesday, 3 – 6 pm (Prof Morgan)
• Tuesday, 6 – 8 pm (Yue)

(My office hours next week have been moved to Monday and Tuesday to answer questions before the exam)

Office Hours before Exam

Page 8: Hypothesis Testing III 2/15/12

Statistical Conclusions

Strength of evidence against H0:

Formal decision of hypothesis test, based on α = 0.05:

statistically significant

not statistically significant

Page 9: Hypothesis Testing III 2/15/12

• Resveratrol, an ingredient in red wine and grapes, has been shown to promote weight loss in rodents, and has recently been investigated in primates (specifically, the Grey Mouse Lemur).

• A sample of lemurs had various measurements taken before and after receiving resveratrol supplementation for 4 weeks

Red Wine and Weight Loss

BioMed Central (2010, June 22). “Lemurs lose weight with ‘life-extending’ supplement resveratrol.” Science Daily.

Page 10: Hypothesis Testing III 2/15/12

In the test to see if the mean resting metabolic rate is higher after treatment, the p-value is 0.013.

Using α = 0.05, is this difference statistically significant? (should we reject H0: no difference?)

(a) Yes

(b) No

Red Wine and Weight Loss

Page 11: Hypothesis Testing III 2/15/12

In the test to see if the mean body mass is lower after treatment, the p-value is 0.007.

Using α = 0.05, is this difference statistically significant? (should we reject H0: no difference?)

(a) Yes

(b) No

Red Wine and Weight Loss

Page 12: Hypothesis Testing III 2/15/12

In the test to see if locomotor activity changes after treatment, the p-value is 0.980.

Using α = 0.05, is this difference statistically significant? (should we reject H0: no difference?)

(a) Yes

(b) No

Red Wine and Weight Loss

Page 13: Hypothesis Testing III 2/15/12

In the test to see if mean food intake changes after treatment, the p-value is 0.035.

Using α = 0.05, is this difference statistically significant? (should we reject H0: no difference?)

(a) Yes

(b) No

Red Wine and Weight Loss
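The formal decision rule used on these slides can be written as a short Python sketch. The function name `is_significant` is only an illustrative choice; the p-values are the ones quoted on the lemur slides, and α = 0.05 is the significance level the slides use.

```python
def is_significant(p_value, alpha=0.05):
    """Formal decision: reject H0 when the p-value is below alpha."""
    return p_value < alpha


# p-values quoted on the resveratrol (lemur) slides
lemur_p_values = {
    "resting metabolic rate": 0.013,
    "body mass": 0.007,
    "locomotor activity": 0.980,
    "food intake": 0.035,
}

decisions = {name: is_significant(p) for name, p in lemur_p_values.items()}

for name, reject in decisions.items():
    verdict = "reject H0" if reject else "do not reject H0"
    print(f"{name}: p = {lemur_p_values[name]:.3f} -> {verdict}")
```

Changing `alpha` to 0.01 in the call would change some of these decisions, which is why the significance level should be chosen before looking at the data.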

Page 14: Hypothesis Testing III 2/15/12

Suppose many researchers around the world are all investigating the same question of interest. If the null hypothesis is true, using α = 0.05, what proportion of hypothesis tests will incorrectly reject the null?

a) None

b) 95%

c) 5%

d) It depends

Formal Decisions
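One way to explore this question is by simulation. The sketch below is not from the slides: it assumes each "researcher" draws 30 values from a normal population in which H0 (mean = 0) really is true, runs a two-sided z-test with a known standard deviation of 1, and rejects when the p-value is below α. The sample size, the number of studies, and the seed are arbitrary illustrative choices.

```python
import math
import random
from statistics import NormalDist


def two_sided_z_p_value(sample, null_mean=0.0, sigma=1.0):
    """Two-sided p-value for a z-test of the mean with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - null_mean) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_i_error_rate(num_studies=10_000, n=30, alpha=0.05, seed=1):
    """Fraction of simulated studies that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(num_studies):
        # Data generated under H0: the true mean equals the null value 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_z_p_value(sample) < alpha:
            rejections += 1
    return rejections / num_studies


rate = type_i_error_rate()
print(f"Proportion of studies that rejected a true H0: {rate:.3f}")
```

The printed proportion should land close to the chosen α, and rerunning with a different `alpha` moves it accordingly.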

Page 15: Hypothesis Testing III 2/15/12

• There are four possibilities:

                        Decision
Truth       Reject H0             Do not reject H0
H0 true     TYPE I ERROR          correct decision
H0 false    correct decision      TYPE II ERROR

Errors

Page 16: Hypothesis Testing III 2/15/12

• In the test to see if resveratrol is associated with food intake, the p-value is 0.035.

• If resveratrol is not associated with food intake, a Type I Error would have been made

• In the test to see if resveratrol is associated with locomotor activity, the p-value is 0.980.

• If resveratrol is associated with locomotor activity, a Type II Error would have been made

Red Wine and Weight Loss

Page 17: Hypothesis Testing III 2/15/12

• Usually, we have no way of knowing whether an error has been made, without doing another study

• Analogously, we have no way of knowing whether our confidence interval actually contains the true parameter

Errors

Page 18: Hypothesis Testing III 2/15/12

If the null hypothesis is true, what is the probability of making a Type I error?

a) 0
b) α
c) 1 – α
d) It depends

Errors

Page 19: Hypothesis Testing III 2/15/12

• Why would you use a smaller significance level, like α = 0.01?

• Why would you use a larger significance level, like α = 0.10?

Significance Level

Page 20: Hypothesis Testing III 2/15/12

A person is innocent (H0) until proven guilty (Ha).

Evidence must be beyond the shadow of a doubt.

Types of mistakes in a verdict?

Convict an innocent: Type I error

Release a guilty: Type II error

Analogy to Law

Page 21: Hypothesis Testing III 2/15/12

If the alternative hypothesis is true, what is the probability of making a Type II error?

a) 0
b) α
c) 1 – α
d) It depends

Errors

Page 22: Hypothesis Testing III 2/15/12

• The power of a hypothesis test is the probability of correctly detecting a significant effect, if there is one (correctly rejecting the null hypothesis when it is false)

Power

power = 1 – (probability of making a Type II error)

Page 23: Hypothesis Testing III 2/15/12

• There are four possibilities:

                        Decision
Truth       Reject H0                            Do not reject H0
H0 true     TYPE I ERROR                         correct decision
            If H0 true, probability = α          If H0 true, probability = 1 – α
H0 false    correct decision                     TYPE II ERROR
            If Ha true, probability = power      If Ha true, probability = 1 – power

Errors

Page 24: Hypothesis Testing III 2/15/12

What factors influence the power of a test?

1. Sample size

2. True value or effect size

3. Variability of values (standard deviation)

Power
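The effect of these three factors can be checked with a small calculation. The sketch below is an illustration under stated assumptions, not the method used in the course: it uses a one-sided z-test of H0: µ = µ0 against Ha: µ > µ0 with a known standard deviation, for which power = 1 – Φ(z_α – effect·√n / σ), where Φ is the standard normal CDF and z_α is the cutoff for significance level α. The function name and all numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z-test of H0: mu = mu0 vs Ha: mu > mu0.

    effect: true mean minus the null value (assumed, for illustration)
    sigma:  standard deviation of the values, treated as known
    n:      sample size
    """
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)   # reject H0 when z exceeds this
    shift = effect * math.sqrt(n) / sigma        # mean of the z statistic under Ha
    return 1 - std_normal.cdf(critical_z - shift)


# Hypothetical baseline, then change one factor at a time.
baseline = z_test_power(effect=0.5, sigma=2.0, n=25)
bigger_sample = z_test_power(effect=0.5, sigma=2.0, n=100)
bigger_effect = z_test_power(effect=1.0, sigma=2.0, n=25)
less_variable = z_test_power(effect=0.5, sigma=1.0, n=25)

print(f"baseline:             {baseline:.3f}")
print(f"larger sample size:   {bigger_sample:.3f}")
print(f"larger effect size:   {bigger_effect:.3f}")
print(f"smaller std. dev.:    {less_variable:.3f}")
```

With an effect of exactly zero the function returns α, which matches the Type I error probability when H0 is true.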

Page 25: Hypothesis Testing III 2/15/12

Power

If you want to increase the power of your test, what can you do?

a) Increase the sample size

b) Make the true value farther from the null value

c) Decrease the standard deviation of your variables

d) Any of the above

Page 26: Hypothesis Testing III 2/15/12

www.lock5stat.com/statkey

Significance and Sample Size

H0: p = 0.5
Ha: p > 0.5

p̂ = 0.6

n = 10 vs. n = 100
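The slide points to StatKey for a randomization test. As a rough stand-in, the sketch below computes an exact binomial p-value instead, assuming a one-sided alternative (Ha: p > 0.5) and assuming that p̂ = 0.6 corresponds to 6 successes out of 10 and 60 out of 100. The function name is illustrative.

```python
from math import comb


def upper_tail_p_value(successes, n, null_p=0.5):
    """P(X >= successes) when X ~ Binomial(n, null_p), i.e. when H0 is true."""
    return sum(
        comb(n, k) * null_p**k * (1 - null_p) ** (n - k)
        for k in range(successes, n + 1)
    )


# Same sample proportion (0.6), two different sample sizes.
p_small = upper_tail_p_value(successes=6, n=10)
p_large = upper_tail_p_value(successes=60, n=100)

print(f"n = 10:  p-value = {p_small:.3f}")
print(f"n = 100: p-value = {p_large:.3f}")
```

The same sample proportion gives a p-value well above 0.05 with n = 10 and one below 0.05 with n = 100, so the formal decision depends on the sample size as well as on the observed statistic.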

Page 27: Hypothesis Testing III 2/15/12

Significance and Sample Size

The smaller the sample size, the

(a) smaller

(b) larger

the chance of a Type II error (failing to reject the null hypothesis, even when it is false).

Page 28: Hypothesis Testing III 2/15/12

Significance and Sample Size

• Failing to get significant results does NOT mean the null hypothesis is true

• This is particularly true for small sample sizes. Unless the truth is very far from the null value, it is hard to find significant results if the sample size is small.

• With small sample sizes, Type II Errors are very likely!

Page 29: Hypothesis Testing III 2/15/12

• With small sample sizes, even large differences or effects may not be significant

• With large sample sizes, even a very small difference or effect can be significant

• A statistically significant result is not always practically significant, especially with large sample sizes

Statistical vs Practical Significance

Page 30: Hypothesis Testing III 2/15/12

• Example: Suppose a weight loss program recruits 10,000 people for a randomized experiment.

• A difference in average weight loss of only 0.5 lbs could be found to be statistically significant

• Suppose the experiment lasted for a year. Is a loss of ½ a pound practically significant?

Statistical vs Practical Significance
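A back-of-the-envelope version of this example can be sketched as follows. The slides give only the 10,000 participants and the 0.5 lb difference; the even split into two groups of 5,000 and the standard deviation of 10 lbs in each group are assumptions made purely for illustration, as is the use of a two-sample z-test.

```python
import math
from statistics import NormalDist


def two_sample_z(diff_in_means, sd, n_per_group):
    """z statistic and two-sided p-value for a difference in two means.

    Assumes both groups have the same size and the same standard deviation.
    """
    standard_error = math.sqrt(sd**2 / n_per_group + sd**2 / n_per_group)
    z = diff_in_means / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# 10,000 people split evenly (assumed), sd of 10 lbs (assumed).
z_big, p_big = two_sample_z(diff_in_means=0.5, sd=10.0, n_per_group=5_000)

# The same 0.5 lb difference with only 50 people per group (assumed).
z_small, p_small = two_sample_z(diff_in_means=0.5, sd=10.0, n_per_group=50)

print(f"5,000 per group: z = {z_big:.2f}, p-value = {p_big:.4f}")
print(f"   50 per group: z = {z_small:.2f}, p-value = {p_small:.4f}")
```

Under these assumptions the half-pound difference is statistically significant at α = 0.05 with the large sample and nowhere near significant with the small one, while its practical importance is the same in both cases.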

Page 31: Hypothesis Testing III 2/15/12

Videogames and GPA

If you get put with a roommate who brings a videogame to college, does that lower your GPA?

What are the null and alternative hypotheses?

a) H0: pv – pnv = 0, Ha: pv – pnv < 0
b) H0: µv – µnv = 0, Ha: µv – µnv < 0
c) H0: pv – pnv < 0, Ha: pv – pnv = 0
d) H0: µv – µnv < 0, Ha: µv – µnv = 0

Page 32: Hypothesis Testing III 2/15/12

If you get put with a roommate who brings a videogame to college, does that lower your GPA?

A study at Berea College conducted this test and obtained a p-value of 0.036. What does this mean?

a) The probability that H0 is true is 0.036
b) The probability that H0 is false is 0.036
c) The probability of seeing a difference in mean GPA as extreme as that in the sample is 0.036
d) If H0 is true, the probability of seeing a difference in mean GPA as extreme as that in the sample is 0.036

Source: Stinebrickner, R. and Stinebrickner, T.R. (2008). “The Causal Effect of Studying on Academic Performance,” The B.E. Journal of Economic Analysis & Policy: Vol. 8: Iss. 1 (Frontiers), Article 14.

Videogames and GPA

Page 33: Hypothesis Testing III 2/15/12

In the study about roommates bringing videogames to college and GPA, the p-value is 0.036. Using α = 0.05, what would you conclude?

a) People assigned roommates who bring videogames have significantly lower GPAs

b) People assigned roommates who bring videogames do not have lower GPAs

c) Nothing

Videogames and GPA

Page 34: Hypothesis Testing III 2/15/12

Based on this p-value, can you conclude that getting put with a roommate who brings a videogame to campus causes you to have a lower GPA, on average?

a) Yes

b) No

Videogames and GPA

Page 35: Hypothesis Testing III 2/15/12

• The p-value alone tells you whether there is a significant association between two variables, but NOT whether this is a causal association

• The data collection method tells you whether causal conclusions are possible, but not whether an association is significant

• If the study is a randomized experiment AND the p-value indicates statistically significant results, only then can you conclude that the explanatory variable has a causal effect on the response variable

Significance and Causation
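The three bullets above amount to a two-question check, which can be sketched in Python. The function name and the wording of the returned strings are illustrative only; the rule itself is the one stated on the slide.

```python
def conclusion(p_value, randomized_experiment, alpha=0.05):
    """Combine the p-value (significance) with the study design (causation)."""
    significant = p_value < alpha
    if significant and randomized_experiment:
        return "significant, and a causal conclusion is possible"
    if significant:
        return "significant association, but no causal conclusion"
    return "not statistically significant"


# The same p-value leads to different conclusions depending on the design.
print(conclusion(0.036, randomized_experiment=True))
print(conclusion(0.036, randomized_experiment=False))
print(conclusion(0.068, randomized_experiment=True))
```

The p-value decides the first question and the data collection method decides the second; neither one can answer both.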

Page 36: Hypothesis Testing III 2/15/12

Roommates are assigned randomly at Berea College. Based on this knowledge and the p-value (0.036), can you conclude that getting put with a roommate who brings a videogame to campus causes you to have a lower GPA, on average?

a) Yes

b) No

Videogames and GPA

Page 37: Hypothesis Testing III 2/15/12

The study also tested whether students who bring a videogame to college themselves have lower GPAs, on average. The p-value of this test is 0.068. Using α = 0.05, what would you conclude?

a) People who bring videogames have significantly lower GPAs
b) People who bring videogames do not have lower GPAs
c) In this study, the difference in average GPA between students who bring videogames and those who don’t is not statistically significant
d) Nothing
e) Either (c) or (d)

Videogames and GPA

Page 38: Hypothesis Testing III 2/15/12

• In making formal decisions, reject H0 if the p-value is less than α, otherwise do not reject H0

• There are two types of errors that can be made in hypothesis testing: rejecting a true null or failing to reject a false null

• The larger your sample size, the higher your chance of finding a significant result, if one exists

Summary