
Post on 27-Jun-2020


BINF702 SPRING 2014 - Systems Biology (binf.gmu.edu/jsolka/spg2014/binf702/lectures/chpt7_p1_st.pdf)

BINF702 SPRING 2014 Chapter 7 Hypothesis Testing 1

BINF702 SPRING 2014

Chapter 7 – Hypothesis Testing: One-Sample Inference


Section 7.1 - Introduction

Ex. 7.1 – (Cardiovascular Disease, Pediatrics) – A current area of research interest is the familial aggregation of cardiovascular risk factors in general and lipid levels in particular. Suppose the “average” cholesterol level in children is 175 mg/dL. A group of men who have died from heart disease within the past year is identified, and the cholesterol levels of their offspring are measured. Two hypotheses are considered:

The average cholesterol level of these children is 175 mg/dL.

The average cholesterol level of these children is greater than 175 mg/dL.


Section 7.2 – General Concepts

Ex. 7.2 (Obstetrics) Suppose we want to test the hypothesis that mothers with low socioeconomic status (SES) deliver babies whose birthweights are lower than “normal.” To test this hypothesis, a list is obtained of birthweights from 100 consecutive, full-term, live-born deliveries from the maternity ward of a hospital in a low-SES area. The mean birthweight (xbar) is found to be 115 oz, with a sample standard deviation (s) of 24 oz. Suppose we know from nationwide surveys based on millions of deliveries that the mean birthweight in the United States is 120 oz. Can we actually say that the underlying mean birthweight from this hospital is lower than the national average?


Def. 7.1 – The null hypothesis, denoted by H0, is the hypothesis to be tested. The alternative hypothesis, denoted by H1, is the hypothesis that in some way contradicts the null hypothesis.

Ex. 7.3 (7.2 Obstetrics Example)

Eq. 7.1

H0: μ = μ0

H1: μ < μ0

where μ0 is the mean birthweight in the United States.


Decision  | Truth: H0                     | Truth: H1
Accept H0 | H0 is true and H0 is accepted | H1 is true and H0 is accepted
Reject H0 | H0 is true and H0 is rejected | H1 is true and H0 is rejected

Correct decisions: H0 is true and H0 is accepted; H1 is true and H0 is rejected.


Def. 7.2 – The probability of a type I error is the probability of rejecting the null hypothesis when H0 is true.

Def. 7.3 – The probability of a type II error is the probability of accepting the null hypothesis when H1 is true. This probability is a function of μ as well as other factors.


Ex. 7.4 (Obstetrics)

What is a type I error in the case of the birthweight data?



Def. 7.4 – The probability of a type I error is usually denoted by α and is commonly referred to as the significance level of a test.

Def. 7.5 – The probability of a type II error is usually denoted by β.

Def. 7.6 – The power of a test is defined as

1 − β = 1 − probability of a type II error = Pr(rejecting H0 | H1 true)


Section 7.3 – One-Sample Test for the Mean of a Normal Distribution: One-Sided Alternatives

Def. 7.7 – The acceptance region is the range of values of xbar (the test statistic in general) for which H0 is accepted.

Def. 7.8 – The rejection region is the range of values of xbar (the test statistic in general) for which H0 is rejected.

Def. 7.9 – A one-tailed test is a test in which the values of the parameter being studied (in this case μ) under the alternative hypothesis are allowed to be either greater than or less than the values of the parameter under the null hypothesis (μ0), but not both.


Eq. 7.2 (One-Sample t Test for the Mean of a Normal Distribution with Unknown Variance (Alternative Mean < Null Mean)) To test the hypothesis

H0: μ = μ0, σ unknown vs. H1: μ < μ0, σ unknown

with a significance level of α, we compute

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

If t < t_{n-1,α}, then we reject H0.

If t >= t_{n-1,α}, then we accept H0.
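Eq. 7.2 can be applied directly to the summary statistics quoted in Ex. 7.2. The following is a minimal sketch in R; the significance level of 0.05 is an assumption, since the example does not state one, and the variable names are chosen here for illustration.

```r
# Summary statistics quoted in Ex. 7.2 (birthweights in oz)
n     <- 100   # number of deliveries
xbar  <- 115   # sample mean
s     <- 24    # sample standard deviation
mu0   <- 120   # national mean birthweight (value under H0)
alpha <- 0.05  # assumed significance level, not stated in the example

# Test statistic from Eq. 7.2
t_stat <- (xbar - mu0) / (s / sqrt(n))

# Lower-tail critical value t_{n-1, alpha}
t_crit <- qt(alpha, df = n - 1)

# Critical-value decision: reject H0 when t < t_{n-1, alpha}
reject_H0 <- t_stat < t_crit
cat("t =", round(t_stat, 3), "| critical value =", round(t_crit, 3),
    "| reject H0:", reject_H0, "\n")
```

Under these assumptions the test statistic falls below the critical value, so H0 would be rejected at the 0.05 level.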


Def. 7.10 – The value t in Equation 7.2 is called a test statistic, because the test procedure is based on this statistic.

Def. 7.11 – The value t_{n-1,α} in Equation 7.2 is called a critical value, because the outcome of the test depends on whether the test statistic t < t_{n-1,α} = critical value, whereby we reject H0, or t >= t_{n-1,α}, whereby we accept H0.

Def. 7.12 – The general approach where we compute a test statistic and determine the outcome of a test by comparing the test statistic to a critical value determined by the type I error is called the critical-value method of hypothesis testing.


Ex. 7.10


Def. 7.13 – The p-value for any hypothesis test is the α level at which we would be indifferent between accepting or rejecting H0, given the sample data at hand. That is, the p-value is the α level at which the given value of the test statistic (such as t) is on the borderline between the acceptance and rejection regions.

Eq. 7.3 – p = Pr(t_{n-1} <= t)
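The lower-tail probability in Eq. 7.3 is what R's `pt` function returns. As a hedged illustration, the sketch below evaluates it for the birthweight summary statistics of Ex. 7.2 (n = 100, xbar = 115, s = 24, null mean 120); the variable names are illustrative.

```r
# Test statistic for the Ex. 7.2 birthweight summary statistics
n      <- 100
t_stat <- (115 - 120) / (24 / sqrt(n))

# Eq. 7.3: p = Pr(t_{n-1} <= t), the lower-tail area of the t distribution
p_value <- pt(t_stat, df = n - 1)
cat("t =", round(t_stat, 3), "| one-sided p-value =", signif(p_value, 3), "\n")
```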

Ex. 7.12


Def. 7.14 – The p-value can also be thought of as the probability of obtaining a test statistic as extreme as or more extreme than the actual test statistic obtained, given that the null hypothesis is true.

N.B. – I actually like this definition a little better than Def. 7.13.


Eq. 7.4 – (Guidelines for Judging the Significance of a p-Value)

If .01 <= p < .05, then the results are significant.

If .001 <= p < .01, then the results are highly significant.

If p < .001, then the results are very highly significant.

If p > .05, then the results are considered not statistically significant (sometimes denoted by NS).

However, if .05 <= p < .10, then a trend toward statistical significance is sometimes denoted.
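The guidelines in Eq. 7.4 amount to a lookup on the p-value. A small helper, hypothetical and not part of the course code, could express them as follows; it treats p = .05 exactly as not significant, which is an assumption about the boundary case.

```r
# Map a single p-value to the verbal labels of Eq. 7.4.
# judge_p is a hypothetical helper name chosen for this sketch.
judge_p <- function(p) {
  stopifnot(is.numeric(p), length(p) == 1, p >= 0, p <= 1)
  if (p < 0.001) {
    "very highly significant"
  } else if (p < 0.01) {
    "highly significant"
  } else if (p < 0.05) {
    "significant"
  } else if (p < 0.10) {
    "not significant (trend toward significance)"
  } else {
    "not significant (NS)"
  }
}

# Apply the helper to a few example p-values
for (p in c(0.0004, 0.004, 0.02, 0.07, 0.30)) {
  cat(p, "->", judge_p(p), "\n")
}
```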


Eq. 7.5 – (Determination of Statistical Significance for Results from Hypothesis Tests) Either of the following methods can be used to establish whether results from hypothesis tests are statistically significant:

1. The test statistic t can be computed and compared with the critical value t_{n-1,α} at an α level of .05. Specifically, if H0: μ = μ0 versus H1: μ < μ0 is being tested and t < t_{n-1,.05}, then H0 is rejected and the results are declared statistically significant (p < .05). Otherwise, H0 is accepted and the results are declared not statistically significant (p >= .05). We called this approach the critical-value method (Def. 7.12).

2. The exact p-value can be computed, and if p < .05, then H0 is rejected and the results are declared statistically significant. Otherwise, if p >= .05, then H0 is accepted and the results are declared not statistically significant. We will refer to this approach as the p-value method.


Eq. 7.6 (One-Sample t Test for the Mean of a Normal Distribution with Unknown Variance (Alternative Mean > Null Mean)) To test the hypothesis

H0: μ = μ0 vs. H1: μ > μ0

with a significance level of α, the best test is based on t, where

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

If t > t_{n-1,1-α}, then we reject H0.

If t <= t_{n-1,1-α}, then we accept H0.


Eq. 7.6 (cont.)

The p-value for this test is given by

p = Pr(t_{n-1} > t)
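For the upper-tailed test, the rejection rule and p-value of Eq. 7.6 can be sketched in R as below. Only the null mean of 175 mg/dL comes from Ex. 7.1; the sample size, sample mean, sample standard deviation, and significance level are hypothetical values assumed purely for illustration.

```r
# Hypothetical summary statistics (assumed for illustration only)
n     <- 10    # number of children sampled
xbar  <- 200   # sample mean cholesterol, mg/dL
s     <- 50    # sample standard deviation, mg/dL
mu0   <- 175   # "average" cholesterol level in children from Ex. 7.1
alpha <- 0.05  # assumed significance level

# Test statistic, upper-tail critical value, and p-value from Eq. 7.6
t_stat  <- (xbar - mu0) / (s / sqrt(n))
t_crit  <- qt(1 - alpha, df = n - 1)                   # t_{n-1, 1-alpha}
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)  # Pr(t_{n-1} > t)

reject_H0 <- t_stat > t_crit
cat("t =", round(t_stat, 3), "| critical value =", round(t_crit, 3),
    "| p =", signif(p_value, 3), "| reject H0:", reject_H0, "\n")
```

With these assumed numbers the test statistic stays below the critical value, so H0 would be accepted at the 0.05 level.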

Ex. 7.18


Section 7.4 – One-Sample Test for the Mean of a Normal Distribution: Two-Sided Alternatives

Def. 7.15 – A two-tailed test is a test in which the values of the parameter being studied (in this case μ) under the alternative hypothesis are allowed to be either greater than or less than the values of the parameter under the null hypothesis (μ0).


Eq. 7.10 (One-Sample t Test for the Mean of a Normal Distribution with Unknown Variance (Two-Sided Alternative)) To test the hypothesis H0: μ = μ0 versus H1: μ ≠ μ0 with a significance level of α, the best test is based on

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

If |t| > t_{n-1,1-α/2}, then we reject H0.

If |t| <= t_{n-1,1-α/2}, then we accept H0.


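Eq. 7.11 (p-value for the One-Sample t Test for the Mean of a Normal Distribution (Two-Sided Alternative))

Let

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

$$p = \begin{cases} 2 \Pr(t_{n-1} \le t), & \text{if } t \le 0 \\ 2\left[1 - \Pr(t_{n-1} \le t)\right], & \text{if } t > 0 \end{cases}$$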
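The two branches of Eq. 7.11 translate directly into R. The sketch below uses a hypothetical function name, `two_sided_t_pvalue`, and evaluates it on the Ex. 7.2 birthweight summary statistics as if a two-sided alternative had been specified, which is an assumption made only for illustration.

```r
# Two-sided p-value for the one-sample t test (Eq. 7.11).
# two_sided_t_pvalue is a hypothetical helper name.
two_sided_t_pvalue <- function(t_stat, n) {
  df <- n - 1
  if (t_stat <= 0) {
    2 * pt(t_stat, df)          # 2 * Pr(t_{n-1} <= t)
  } else {
    2 * (1 - pt(t_stat, df))    # 2 * [1 - Pr(t_{n-1} <= t)]
  }
}

# Ex. 7.2 summary statistics: n = 100, xbar = 115, s = 24, mu0 = 120
t_stat <- (115 - 120) / (24 / sqrt(100))
p <- two_sided_t_pvalue(t_stat, n = 100)
cat("t =", round(t_stat, 3), "| two-sided p-value =", signif(p, 3), "\n")
```

Because the t distribution is symmetric, both branches give the same value as `2 * pt(-abs(t_stat), df)`.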


Eq. 7.12 – The p-value is the probability under the null hypothesis of obtaining a test statistic as extreme as or more extreme than the observed test statistic, where, because a two-sided alternative hypothesis is being used, extremeness is measured based on the absolute value of the test statistic.


Eq. 7.13 (One-Sample z Test for the Mean of a Normal Distribution with Known Variance (Two-Sided Alternative)) To test the hypothesis H0: μ = μ0 vs. H1: μ ≠ μ0 with a significance level of α, where the underlying standard deviation σ is known, the best test is based on

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

If z < z_{α/2} or z > z_{1-α/2}, then we reject H0.

If z_{α/2} <= z <= z_{1-α/2}, then we accept H0.


To compute a two-sided p-value we have

$$p = \begin{cases} 2\,\Phi(z), & \text{if } z \le 0 \\ 2\left[1 - \Phi(z)\right], & \text{if } z > 0 \end{cases}$$


Eq. 7.14 (One-Sample z Test for the Mean of a Normal Distribution with Known Variance (One-Sided Alternative) (μ1 < μ0)) To test the hypothesis H0: μ = μ0 versus H1: μ < μ0 with a significance level of α, where the underlying standard deviation σ is known, the best test is based on

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

If z < z_α, then H0 is rejected; if z >= z_α, then H0 is accepted. The p-value is given by p = Φ(z).


Eq. 7.14 (One-Sample z Test for the Mean of a Normal Distribution with Known Variance (One-Sided Alternative) (μ1 > μ0)) To test the hypothesis H0: μ = μ0 versus H1: μ > μ0 with a significance level of α, where the underlying standard deviation σ is known, the best test is based on

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

If z > z_{1-α}, then H0 is rejected; if z <= z_{1-α}, then H0 is accepted. The p-value is given by p = 1 − Φ(z).
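Base R has no built-in one-sample z test, so the rules of Eq. 7.13 and Eq. 7.14 can be collected in a small function. The sketch below uses a hypothetical name, `z_test`, and treats the Ex. 7.2 standard deviation of 24 oz as a known population value, which is an assumption made only to have numbers to run.

```r
# One-sample z test with known sigma (Eq. 7.13 and Eq. 7.14).
# z_test is a hypothetical helper, not a base R function.
z_test <- function(xbar, mu0, sigma, n,
                   alternative = c("two.sided", "less", "greater"),
                   alpha = 0.05) {
  alternative <- match.arg(alternative)
  z <- (xbar - mu0) / (sigma / sqrt(n))
  if (alternative == "less") {
    p <- pnorm(z)                          # p = Phi(z)
    reject <- z < qnorm(alpha)             # z < z_alpha
  } else if (alternative == "greater") {
    p <- 1 - pnorm(z)                      # p = 1 - Phi(z)
    reject <- z > qnorm(1 - alpha)         # z > z_{1-alpha}
  } else {
    p <- if (z <= 0) 2 * pnorm(z) else 2 * (1 - pnorm(z))
    reject <- z < qnorm(alpha / 2) || z > qnorm(1 - alpha / 2)
  }
  list(z = z, p_value = p, reject_H0 = reject)
}

# Assumed inputs: xbar = 115, mu0 = 120, sigma = 24 (treated as known), n = 100
res_less <- z_test(115, 120, 24, 100, alternative = "less")
res_two  <- z_test(115, 120, 24, 100, alternative = "two.sided")
cat("z =", round(res_less$z, 3),
    "| p (less) =", signif(res_less$p_value, 3),
    "| p (two-sided) =", signif(res_two$p_value, 3), "\n")
```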


Cardiovascular Disease, Nutrition

7.26 pg. 286

What are the null and alternative hypotheses?

What is the equation for the test statistic?

What is the calculated p-value?


Section 7.5 – The Power of a Test

Ex. 7.25 (Ophthalmology) A new drug is proposed for people with high intraocular pressure (IOP), to prevent the development of glaucoma. A pilot study is conducted with the drug among 10 patients. Their mean IOP decreases by 5 mm Hg with an sd of 10 mm Hg after 1 month of using the drug. The investigators propose to study 100 participants in the main study. Is this a sufficient sample size for the study?

This is a power question. Power = Pr(declare a significant difference with a sample size of 100 if the true mean decline in IOP is 5 mm Hg with an sd of 10 mm Hg)


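Consider the case H0: μ = μ0 vs. H1: μ = μ1 < μ0.

$$
\begin{aligned}
\text{Power} &= \Pr(\text{reject } H_0 \mid H_0 \text{ false}) = \Pr(Z < z_\alpha \mid \mu = \mu_1) \\
&= \Pr\left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < z_\alpha \,\middle|\, \mu = \mu_1 \right) \\
&= \Pr\left( \bar{X} < \mu_0 + z_\alpha \sigma/\sqrt{n} \,\middle|\, \mu = \mu_1 \right)
\end{aligned}
$$

We know that under H1, X̄ ~ N(μ1, σ²/n), hence

$$
\text{Power} = \Phi\left[ \frac{\mu_0 + z_\alpha \sigma/\sqrt{n} - \mu_1}{\sigma/\sqrt{n}} \right] = \Phi\left[ z_\alpha + \frac{(\mu_0 - \mu_1)\sqrt{n}}{\sigma} \right]
$$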


Eq. 7.19 (Power for the One-Sample z Test for the Mean of a Normal Distribution with Known Variance (One-Sided Alternative)) The power of the test for the hypothesis H0: μ = μ0 vs. H1: μ = μ1, where the underlying distribution is normal and the population variance (σ²) is assumed known, is given by

$$\Phi\left( z_\alpha + \frac{|\mu_0 - \mu_1|\sqrt{n}}{\sigma} \right) = \Phi\left( -z_{1-\alpha} + \frac{|\mu_0 - \mu_1|\sqrt{n}}{\sigma} \right)$$
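As one possible way to approach the question in Ex. 7.25, Eq. 7.19 can be evaluated with the pilot-study values. The sketch below rests on several assumptions that the example does not state: a one-sided test at α = 0.05, a null mean change of 0, a true mean change of −5 mm Hg, and the pilot sd of 10 mm Hg treated as the known σ. The function name is hypothetical.

```r
# Power of the one-sided, one-sample z test (Eq. 7.19).
# power_one_sided_z is a hypothetical helper name.
power_one_sided_z <- function(alpha, mu0, mu1, n, sigma) {
  pnorm(qnorm(alpha) + abs(mu0 - mu1) * sqrt(n) / sigma)
}

# Assumed setup for Ex. 7.25: H0 mean change 0, H1 mean change -5, sigma 10
power_100 <- power_one_sided_z(alpha = 0.05, mu0 = 0, mu1 = -5, n = 100, sigma = 10)
power_10  <- power_one_sided_z(alpha = 0.05, mu0 = 0, mu1 = -5, n = 10,  sigma = 10)

cat("power with n = 100:", round(power_100, 4), "\n")
cat("power with n = 10: ", round(power_10, 4), "\n")
```

Under these assumptions a study of 100 participants would have power very close to 1, whereas a study the size of the pilot would have power below one half.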


Eq. 7.20 (Factors Affecting the Power)

1. If the significance level is made smaller (α decreases), z_α decreases and hence the power decreases.

2. If the alternative mean is shifted further away from the null mean (|μ0 – μ1| increases), then the power increases.

3. If the standard deviation of the distribution of individual observations increases (σ increases), then the power decreases.

4. If the sample size increases (n increases), then the power increases.
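The four effects listed in Eq. 7.20 can be checked numerically by varying one input of the one-sided power formula at a time. The baseline values below (α = 0.05, null mean 175, alternative mean 190, n = 10, σ = 50) are assumed for illustration, and the function name is hypothetical.

```r
# One-sided z-test power (Eq. 7.19); hypothetical helper name
power_one_sided_z <- function(alpha, mu0, mu1, n, sigma) {
  pnorm(qnorm(alpha) + abs(mu0 - mu1) * sqrt(n) / sigma)
}

# Assumed baseline scenario
base <- power_one_sided_z(0.05, 175, 190, 10, 50)

# Change one factor at a time, holding the others at the baseline
smaller_alpha  <- power_one_sided_z(0.01, 175, 190, 10, 50)  # factor 1
larger_shift   <- power_one_sided_z(0.05, 175, 200, 10, 50)  # factor 2
larger_sigma   <- power_one_sided_z(0.05, 175, 190, 10, 60)  # factor 3
larger_n       <- power_one_sided_z(0.05, 175, 190, 40, 50)  # factor 4

print(round(c(base = base, smaller_alpha = smaller_alpha,
              larger_shift = larger_shift, larger_sigma = larger_sigma,
              larger_n = larger_n), 3))
```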


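Ex. 7.28

# Power of the one-sided, one-sample z test (Eq. 7.19)
# a = significance level, m0 = null mean, m1 = alternative mean,
# n = sample size, s = known population standard deviation
eq7.19 <- function(a, m0, m1, n, s) {
  t1 <- -qnorm(1 - a)                 # -z_{1-a}, which equals z_a
  t2 <- abs(m0 - m1) * sqrt(n) / s    # |m0 - m1| * sqrt(n) / s
  pnorm(t1 + t2)                      # return the power visibly
}

> source("C:/Documents and Settings/Owner/My Documents/Work/Beth/gmu/fall05/binf702/R_code/eq7.19.R")
> power = eq7.19(.01,175,190,10,50)
> power
[1] 0.08415344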


Eq. 7.21 (Power for the One-Sample z Test for the Mean of a Normal Distribution (Two-Sided Alternative)) The power of the two-sided test H0: μ = μ0 versus H1: μ ≠ μ0 for the specific alternative μ = μ1, where the underlying distribution is normal and the population variance (σ²) is assumed known, is given exactly by

$$\Phi\left[ -z_{1-\alpha/2} + \frac{(\mu_0 - \mu_1)\sqrt{n}}{\sigma} \right] + \Phi\left[ -z_{1-\alpha/2} + \frac{(\mu_1 - \mu_0)\sqrt{n}}{\sigma} \right]$$

and is approximated by

$$\Phi\left[ -z_{1-\alpha/2} + \frac{|\mu_0 - \mu_1|\sqrt{n}}{\sigma} \right]$$


Section 7.5 – The Power of a Test (Ex. 7.32)

Can you write R code to implement the approximation case of Eq. 7.21?
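One possible answer, offered as a sketch rather than as the course solution, implements both forms of Eq. 7.21 so the approximation can be compared with the exact value. The function names are hypothetical, and the inputs reuse the numbers from the Ex. 7.28 call with an assumed two-sided α of 0.05, because the data for Ex. 7.32 are not given here.

```r
# Approximate two-sided power (Eq. 7.21, approximation case)
power_two_sided_approx <- function(alpha, mu0, mu1, n, sigma) {
  pnorm(-qnorm(1 - alpha / 2) + abs(mu0 - mu1) * sqrt(n) / sigma)
}

# Exact two-sided power (Eq. 7.21, exact case), for comparison
power_two_sided_exact <- function(alpha, mu0, mu1, n, sigma) {
  z     <- qnorm(1 - alpha / 2)
  shift <- (mu0 - mu1) * sqrt(n) / sigma
  pnorm(-z + shift) + pnorm(-z - shift)
}

# Assumed inputs: alpha = 0.05, mu0 = 175, mu1 = 190, n = 10, sigma = 50
approx_power <- power_two_sided_approx(0.05, 175, 190, 10, 50)
exact_power  <- power_two_sided_exact(0.05, 175, 190, 10, 50)
cat("approximate power:", round(approx_power, 4),
    "| exact power:", round(exact_power, 4), "\n")
```

The approximation drops the smaller of the two exact terms, so it is always slightly below the exact power.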


Section 7.5 – The Power of a Test (Ex. 7.32 The Code in Action)


Section 7.6 – Sample-Size Determination

Eq. 7.26 (Sample-Size Estimation When Testing for the Mean of a Normal Distribution (One-Sided Alternative)) Suppose we wish to test H0: μ = μ0 vs. H1: μ = μ1, where the data are normally distributed with mean μ and known variance σ². The sample size needed to conduct a one-sided test with a significance level α and probability of detecting a significant difference = 1 – β is given by

$$n = \frac{\sigma^2 \left( z_{1-\beta} + z_{1-\alpha} \right)^2}{(\mu_0 - \mu_1)^2}$$
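Eq. 7.26 is short enough to code directly. In the sketch below the function name is hypothetical, the result is rounded up to the next whole subject, and the inputs (null mean 120, alternative mean 115, σ = 24, α = 0.05, power 0.80) are assumed values chosen only to exercise the formula.

```r
# Sample size for a one-sided, one-sample z test (Eq. 7.26).
# power is 1 - beta, so z_{1-beta} = qnorm(power).
sample_size_one_sided <- function(alpha, power, mu0, mu1, sigma) {
  n_raw <- sigma^2 * (qnorm(power) + qnorm(1 - alpha))^2 / (mu0 - mu1)^2
  ceiling(n_raw)  # round up to a whole number of subjects
}

# Assumed inputs for illustration
n_needed <- sample_size_one_sided(alpha = 0.05, power = 0.80,
                                  mu0 = 120, mu1 = 115, sigma = 24)
cat("required sample size:", n_needed, "\n")
```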


Eq. 7.27 (Factors Affecting Sample Size)

1. The sample size increases as σ² increases.

2. The sample size increases as the significance level is made smaller (α decreases).

3. The sample size increases as the required power increases (1 – β increases).

4. The sample size decreases as the absolute value of the distance between the null and alternative means (|μ0 – μ1|) increases.


Eq. 7.28 (Sample-Size Estimation When Testing for the Mean of a Normal Distribution (Two-Sided Alternative)) Suppose we wish to test H0: μ = μ0 vs. H1: μ = μ1, where the data are normally distributed with mean μ and known variance σ². The sample size needed to conduct a two-sided test with a significance level α and probability of detecting a significant difference = 1 – β is given by

$$n = \frac{\sigma^2 \left( z_{1-\beta} + z_{1-\alpha/2} \right)^2}{(\mu_0 - \mu_1)^2}$$


Ex. 7.37, Can you write R code to implement Eq. 7.28 and use it to solve this example?
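A possible implementation of Eq. 7.28 is sketched below. The data for Ex. 7.37 are not reproduced in these notes, so the inputs (null mean 120, alternative mean 115, σ = 24, α = 0.05, power 0.80) are assumed stand-ins and the function name is hypothetical; substitute the values from the example to solve it.

```r
# Sample size for a two-sided, one-sample z test (Eq. 7.28).
# power is 1 - beta, so z_{1-beta} = qnorm(power).
sample_size_two_sided <- function(alpha, power, mu0, mu1, sigma) {
  n_raw <- sigma^2 * (qnorm(power) + qnorm(1 - alpha / 2))^2 / (mu0 - mu1)^2
  ceiling(n_raw)  # round up to a whole number of subjects
}

# Assumed inputs, not the Ex. 7.37 data
n_needed <- sample_size_two_sided(alpha = 0.05, power = 0.80,
                                  mu0 = 120, mu1 = 115, sigma = 24)
cat("required sample size:", n_needed, "\n")
```

For the same inputs the two-sided requirement is larger than the one-sided requirement of Eq. 7.26, because z_{1-α/2} exceeds z_{1-α}.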



Eq. 7.29 (Sample-Size Based on Confidence-Interval Width) Suppose we wish to estimate the mean of a normal distribution with sample variance s², and require that the two-sided 100% × (1 − α) CI for μ be no wider than L. The number of objects needed is approximately

$$n = \frac{4 z_{1-\alpha/2}^2 s^2}{L^2}$$
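Eq. 7.29 can be evaluated in one line of R. The sketch below wraps it in a hypothetical function and uses assumed inputs (s = 24, maximum width L = 10, α = 0.05), rounding up to a whole number.

```r
# Sample size so that the two-sided 100% x (1 - alpha) CI is no wider than L (Eq. 7.29).
# sample_size_ci_width is a hypothetical helper name.
sample_size_ci_width <- function(alpha, s, L) {
  ceiling(4 * qnorm(1 - alpha / 2)^2 * s^2 / L^2)
}

# Assumed inputs for illustration
n_needed <- sample_size_ci_width(alpha = 0.05, s = 24, L = 10)
cat("required sample size:", n_needed, "\n")
```

Halving the allowed width L multiplies the unrounded sample size by four, since L enters the formula squared.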


Section 7.7 – The Relationship Between Hypothesis Testing and Confidence Intervals

Eq. 7.30 (The Relationship Between Hypothesis Testing and Confidence Intervals (Two-Sided Case)) Suppose we are testing H0: μ = μ0 versus H1: μ ≠ μ0. H0 is rejected with a two-sided level α test iff the two-sided 100% × (1 − α) confidence interval for μ does not contain μ0. H0 is accepted with a two-sided level α test iff the two-sided 100% × (1 − α) confidence interval for μ does contain μ0.


Eq. 7.31 – The two-sided 100% × (1 − α) confidence interval for μ contains all values μ0 such that we accept H0 using a two-sided test with significance level α, where the hypotheses are H0: μ = μ0 versus H1: μ ≠ μ0. Conversely, the 100% × (1 − α) confidence interval does not contain any value μ0 for which we can reject H0, using a two-sided test with significance level α, where H0: μ = μ0 and H1: μ ≠ μ0.
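The equivalence stated in Eq. 7.30 and Eq. 7.31 can be checked numerically. The sketch below uses the t-based interval and two-sided t test with the Ex. 7.2 summary statistics; the grid of candidate null means and α = 0.05 are assumptions made for illustration.

```r
# Ex. 7.2 summary statistics and an assumed significance level
n <- 100; xbar <- 115; s <- 24; alpha <- 0.05
se  <- s / sqrt(n)
t_q <- qt(1 - alpha / 2, df = n - 1)   # t_{n-1, 1-alpha/2}

# Two-sided 100% x (1 - alpha) confidence interval for mu
ci <- c(xbar - t_q * se, xbar + t_q * se)

# Two-sided t test of H0: mu = mu0 for a grid of candidate mu0 values
candidates <- seq(105, 125, by = 0.5)
rejected   <- abs((xbar - candidates) / se) > t_q
inside_ci  <- candidates >= ci[1] & candidates <= ci[2]

cat("95% CI: (", round(ci[1], 2), ",", round(ci[2], 2), ")\n")
cat("test and CI agree for every candidate:", all(rejected == !inside_ci), "\n")
```

Every candidate value that lies inside the interval is accepted by the test, and every value outside it is rejected.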


Section 7.8 – Bayesian Inference

In Bayesian inference we make assumptions about the distribution of our parameter. A flat or noninformative prior is given by

Eq. 7.33 Pr(μ) = c

where c is a constant.


We wish to determine the posterior distribution of μ given our sample. It seems intuitive that if the distribution is normal and σ is known, then

Eq. 7.34

$$\Pr(\mu \mid x_1, \ldots, x_n) = \Pr(\mu \mid \bar{x})$$

The sample mean xbar is referred to as a sufficient statistic for μ.


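$$\Pr(\mu \mid \bar{x}) = \frac{\Pr(\bar{x} \mid \mu)\,\Pr(\mu)}{\Pr(\bar{x})}$$

$$\Pr(\mu \mid \bar{x}) \propto \Pr(\bar{x} \mid \mu)\,\Pr(\mu)$$

Eq. 7.35

$$\Pr(\mu \mid \bar{x}) \propto \Pr(\bar{x} \mid \mu)$$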


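In the case of a continuous distribution we may make the following approximation

$$\Pr(\bar{x} \mid \mu) \approx f(\bar{x} \mid \mu)$$

We also know that

$$\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

Hence we have Eq. 7.36

$$f(\bar{x} \mid \mu) = \frac{1}{\sqrt{2\pi}\,\sigma/\sqrt{n}} \exp\left[ -\frac{1}{2} \left( \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \right)^2 \right] = \frac{1}{\sqrt{2\pi}\,\sigma/\sqrt{n}} \exp\left[ -\frac{1}{2} \left( \frac{\mu - \bar{x}}{\sigma/\sqrt{n}} \right)^2 \right]$$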


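Hence 7.35 and 7.36 imply 7.37

$$\Pr(\mu \mid \bar{x}) \propto \frac{1}{\sqrt{2\pi}\,\sigma/\sqrt{n}} \exp\left[ -\frac{1}{2} \left( \frac{\mu - \bar{x}}{\sigma/\sqrt{n}} \right)^2 \right]$$

But the RHS is a bona fide pdf, so we can replace the proportional sign with an equality sign. Therefore we have Eq. 7.38

$$\mu \mid \bar{x} \sim N\left(\bar{x}, \frac{\sigma^2}{n}\right)$$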


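Our Bayesian Confidence Interval is given by Eq. 7.39

$$\Pr(\mu_1 < \mu < \mu_2) = 1 - \alpha$$

where

$$\mu_1 = \bar{x} - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}, \qquad \mu_2 = \bar{x} + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$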

N.B. – In the case of uninformative priors our Bayesian CI matches the frequentist CI. As the priors become more informative, there are marked differences between the two calculated CIs. Please examine Ex. 7.44.
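The agreement between the two intervals under a flat prior can be seen numerically. The sketch below takes the quantiles of the posterior N(xbar, σ²/n) from Eq. 7.38 and compares them with the usual z-based interval; the inputs (xbar = 115, σ = 24 treated as known, n = 100, α = 0.05) are assumed values for illustration.

```r
# Assumed inputs for illustration
xbar <- 115; sigma <- 24; n <- 100; alpha <- 0.05
post_sd <- sigma / sqrt(n)   # sd of the posterior N(xbar, sigma^2 / n)

# Bayesian interval: central 1 - alpha quantiles of the posterior (Eq. 7.38, Eq. 7.39)
bayes_ci <- qnorm(c(alpha / 2, 1 - alpha / 2), mean = xbar, sd = post_sd)

# Frequentist z-based interval with known sigma
freq_ci <- xbar + c(-1, 1) * qnorm(1 - alpha / 2) * post_sd

cat("Bayesian interval:   ", round(bayes_ci, 3), "\n")
cat("Frequentist interval:", round(freq_ci, 3), "\n")
```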


Homework

Set # 1

1,2,3,4,7,8,9,23, 24, 25

Set #2

10,11,12, 13, 14, 19, 20, 22, 43, 44, 45


Test # 2 (4/14/11)

Chapter 4

Chapter 5

Chapter 6