
Multiple testing: False discovery rate and the q-value

Stefanie Scheid

Max-Planck-Institute for Molecular Genetics

Computational Diagnostics

June 17, 2002


Articles

• Storey, J.D. (2001a): The positive False Discovery Rate: A Bayesian Interpretation and the q-value, submitted

• Storey, J.D. (2001b): A Direct Approach to False Discovery Rates, submitted

• Storey, J.D., Tibshirani, R. (2001): Estimating False Discovery Rates Under Dependence, with Applications to DNA Microarrays, submitted

• http://www-stat.stanford.edu/~jstorey/


The multiple testing problem

• random variables X1, . . . , Xk from the same family of distributions P = {Pθ, θ ∈ Θ}

• a set of m hypotheses for θ

• conclusions must be drawn from one sample

Why not test separately?

m independent test statistics, each tested at level α

⇒ Prob(at least one is falsely rejected) = 1 − (1 − α)^m =: α_m

⇒ control the multiple level α_m with a multiple testing procedure
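As a quick numerical illustration, a minimal Python sketch of how fast the multiple level α_m grows with m (α = 0.05 chosen arbitrarily):

    # Multiple level alpha_m = 1 - (1 - alpha)^m for m independent tests at level alpha.
    alpha = 0.05
    for m in (1, 10, 100, 1000):
        alpha_m = 1 - (1 - alpha) ** m
        print(f"m = {m:4d}: Prob(at least one false rejection) = {alpha_m:.3f}")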

Two kinds of multiple testing procedures: One-step

• test each null hypothesis “independently” of the outcomes of the others

• e.g. Bonferroni: test each hypothesis at level α/m ⇒ multiple level = α
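A minimal sketch of the Bonferroni rule in Python (the p-values below are invented for illustration):

    alpha = 0.05
    p_values = [0.0001, 0.003, 0.02, 0.2, 0.6]   # hypothetical p-values
    m = len(p_values)

    # Bonferroni (one-step): reject the i-th hypothesis whenever p_i <= alpha / m.
    rejected = [p <= alpha / m for p in p_values]
    print(rejected)   # [True, True, False, False, False]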

Two kinds of multiple testing procedures: Multi-step

• test in sorted order, dependent on the outcomes of the other tests

• e.g. Bonferroni-Holm: sort according to p-values and test at increasing levels α/m, α/(m − 1), . . . , α ⇒ multiple level = α
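A sketch of the Bonferroni-Holm step-down rule with the same invented p-values; testing stops at the first p-value that misses its threshold:

    alpha = 0.05
    p_values = [0.0001, 0.003, 0.02, 0.2, 0.6]   # hypothetical p-values
    m = len(p_values)

    # Bonferroni-Holm (multi-step): compare the i-th smallest p-value
    # with alpha / (m - i) and stop at the first failure.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for step, i in enumerate(order):
        if p_values[i] <= alpha / (m - step):
            rejected[i] = True
        else:
            break
    print(rejected)   # [True, True, False, False, False]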


Microarray data

• k samples (e.g. in 2 groups) and m genes

• observe gene expression as intensity values:

          sample 1   sample 2   . . .   sample k
gene 1         404       1873   . . .        151
gene 2       -2015       -716   . . .       1227
...            ...        ...   . . .        ...
gene m         126         42   . . .         85

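To connect this layout with the testing problem, a hypothetical sketch in Python: one test per gene, here a two-sample t-test across the k samples (the data matrix is random placeholder data, and the split into two groups of 10 is an assumption of the sketch):

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)
    m, k = 1000, 20                        # m genes, k samples (two groups of 10)
    X = rng.normal(size=(m, k))            # placeholder intensity values

    # One test statistic per gene: compare the first 10 samples with the last 10.
    p_values = np.array([ttest_ind(X[g, :10], X[g, 10:]).pvalue for g in range(m)])
    print(p_values[:5])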


Number of outcomes

                   Not rejected   Rejected    Σ
Null true                U            V       m0
Alternative true         T            S       m1
Σ                        W            R       m

Problems

• a multiple test procedure controls Prob(V ≥ 1) ≤ α

• m is huge ⇒ false rejections are likely to occur

• better: control the proportion #{falsely rejected} / #{rejected in total}

• intuitive definition of the false discovery rate:

FDR = E[V/R]
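A hypothetical Monte Carlo sketch of this definition (the split into m0 true nulls and m − m0 alternatives and the p-value distributions are invented for illustration; V/R is taken as 0 when R = 0, the case discussed below):

    import numpy as np

    rng = np.random.default_rng(1)
    m, m0 = 1000, 900          # hypothetical: 900 true nulls, 100 alternatives
    threshold = 0.01           # fixed rejection area: reject if p <= threshold

    ratios = []
    for _ in range(200):
        # p-values: uniform under the null, concentrated near 0 under the alternative
        p_null = rng.uniform(size=m0)
        p_alt = rng.beta(0.1, 1.0, size=m - m0)
        V = np.sum(p_null <= threshold)           # falsely rejected
        R = V + np.sum(p_alt <= threshold)        # rejected in total
        ratios.append(V / R if R > 0 else 0.0)    # V/R, taken as 0 when R = 0
    print(np.mean(ratios))    # Monte Carlo estimate of E[V/R]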


Difference between multiple testing and FDR

• Multiple testing:

Fixed error rate ⇒ estimated rejection area

• FDR:

Fixed rejection area ⇒ estimated error rate

What if R = 0?

• Benjamini and Hochberg: FDR = E[V/R | R > 0] · Prob(R > 0)

“the rate that false discoveries occur”

• Storey: pFDR = E[V/R | R > 0]

“the rate that discoveries are false”


FDR in Bayesian terms

Theorem: m identical hypothesis tests are performed with independent statistics T1, . . . , Tm and rejection area C. A null hypothesis is true with a-priori probability π0 = Prob(H = 0). Then

pFDR(C) = π0 · Prob(T ∈ C | H = 0) / Prob(T ∈ C) = Prob(H = 0 | T ∈ C).

Algorithms for calculating F̂DR and p̂FDR in Storey (2001b).

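A simplified Python sketch in the spirit of such a plug-in estimate, not the exact algorithm of the papers; the tuning value λ, the rejection area {p ≤ γ}, and the use of the large p-values to estimate π0 are assumptions of this sketch:

    import numpy as np

    def pfdr_estimate(p_values, gamma, lam=0.5):
        """Plug-in estimate of pFDR for the rejection area {p <= gamma}.

        pi0 is estimated from the share of p-values above lam, which should be
        mostly null; simplified sketch, not the exact published algorithm.
        """
        p = np.asarray(p_values)
        m = len(p)
        pi0_hat = np.sum(p > lam) / (m * (1.0 - lam))        # estimate of Prob(H = 0)
        pr_reject = max(np.sum(p <= gamma), 1) / m           # estimate of Prob(T in C)
        prob_r_pos = 1.0 - (1.0 - gamma) ** m                # term accounting for the conditioning on R > 0
        return (pi0_hat * gamma) / (pr_reject * prob_r_pos)

Here the statistic is the p-value itself, so Prob(T ∈ C | H = 0) = γ for the rejection area {p ≤ γ}.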

Simulation study

1. independence between genes

2. loose (“clumpy”) dependence

3. general dependence

1. + 2.: p̂FDR is very accurate

3.: p̂FDR is biased upward (overestimation of π0)

Reasons for “clumpy dependence” in microarrays

• genes work in pathways

⇒ small groups of genes interact to produce an overall process

• cross-hybridization

⇒ genes with molecular similarity have an evolutionary and/or functional relationship

p-value vs. q-value

• Definition: For a nested set of rejection areas {C}, define the p-value of an observed statistic T = t to be:

p-value(t) = min_{C: t ∈ C} Prob(T ∈ C | H = 0).

• Definition: For an observed statistic T = t, define the q-value of t to be:

q-value(t) = min_{C: t ∈ C} pFDR(C) = min_{C: t ∈ C} Prob(H = 0 | T ∈ C).

“posterior Bayesian p-value”
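Continuing the sketch above (and reusing the hypothetical pfdr_estimate function), q-values could be obtained by taking, for each observed p-value, the minimum estimated pFDR over all rejection areas {p ≤ γ} that contain it; again a simplified illustration, not the published algorithm:

    import numpy as np

    def q_values(p_values, lam=0.5):
        """q-value of each statistic: minimum estimated pFDR over all rejection
        areas {p <= gamma} containing it (simplified sketch)."""
        p = np.asarray(p_values)
        order = np.argsort(p)[::-1]        # indices from largest to smallest p-value
        q = np.empty_like(p)
        running_min = 1.0
        for i in order:
            running_min = min(running_min, pfdr_estimate(p, gamma=p[i], lam=lam))
            q[i] = running_min
        return q

Called on a vector of p-values, this returns one q-value per test.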

Conclusion

• for large sets of hypotheses, FDRs are more appropriate than multiple testing procedures

• one has to be sure about the rejection area

• the positive FDR (“rate that discoveries are false”) is more appropriate than the FDR (“rate that false discoveries occur”)

• the q-value can be reported with every statistic (“minimum pFDR over which that statistic can be rejected”)