

Genetic Epidemiology 14:779-784 (1997)

False Discoveries in Genome Scanning

Eugene I. Drigalenko and Robert C. Elston

Department of Epidemiology and Biostatistics, Rammelkamp Center for Education and Research, MetroHealth Campus, Case Western Reserve University, Cleveland, Ohio.

Methods of multiple comparisons were applied to linkage analysis in the case of genome scanning. Data for Problem 2A were used. p-Values were calculated for all 440,400 possible tests of linkage. Plots of distribution functions and false discovery rate are shown. © 1997 Wiley-Liss, Inc.

Key words: Bonferroni method, linkage analysis, multiple comparisons, p-values

INTRODUCTION

The problem of how to avoid false discovery of linkage is a key one in genome scanning. Results of Genetic Analysis Workshop 9 show that this problem remains unresolved, as no strict tests were used in that workshop [Blangero, 1995; Hodge, 1995]. The purpose of this study is to apply methods of multiple comparisons to linkage analysis in the case of genome scanning using the data for Problem 2A. These data consist of 200 simulated replicates of sets of nuclear families, so we can compare our results with the true answers. Each replicate consists of a set of data on 239 families, comprising 1,164 individuals and 771 sib pairs. Data are given for most individuals on six quantitative traits and 367 markers spaced 2 cM apart on 10 chromosomes. Six traits and 367 markers thus give a total of 2,202 (6 × 367) tests per replicate. The five quantitative traits Q1-Q5 were influenced by six major genes MG1-MG6, having known positions on chromosomes 4, 5, 8, 9, and 10. The sixth trait EF (environmental factor) was purely environmental.

The sib-pair method [Haseman and Elston, 1972] was used to test for linkage because it does not require the mode of inheritance of the quantitative trait to be specified. A p-value for the hypothesis of linkage between each of the traits and each of the marker loci was calculated using the regression statistics given by the program SIBPAL [S.A.G.E., 1996]. This was done for each of the 200 replicates, giving a total of 440,400 p-values in all.

Address reprint requests to Dr. Robert C. Elston, Department of Epidemiology and Biostatistics, Rammelkamp Center for Education and Research, MetroHealth Campus, Case Western Reserve University, 2500 MetroHealth Drive, Room R259, Cleveland, OH 44109-1998.


Distribution of P-values

To study the empirical distribution of the p-values, the 440,400 tests were divided into three groups:

(i) “environmental factor” -- the 73,400 (367 × 200) tests involving the trait EF, which was independent of all markers;

(ii) “genetic, linked” -- the 162,000 (5 × 162 × 200) tests of each of the traits Q1-Q5 to markers not farther than 34.5 cM from trait loci, which is the maximum mappable distance, corresponding to a recombination fraction of 0.3 when Kosambi’s mapping function is used [Risch, 1991]: all the 54 markers on chromosomes 4 and 5, markers 11-42 on chromosome 8, markers 1-34 on chromosome 9, and markers 1-33 on chromosome 10. Because all five traits were correlated with each other, we did not differentiate among the traits with respect to which loci they were linked to.

(iii) “genetic, unlinked” -- the 205,000 (5 × 205 × 200) tests of the traits Q1-Q5 to the markers on chromosomes not involved (chromosomes 1, 2, 3, 6, and 7) or to the markers farther than 34.5 cM from any trait locus.

Groups (i) and (iii) are situations in which the null hypothesis is true in the sense of linkage being absent. However, the null hypothesis also includes the assumptions that the squared sib-pair differences are normally distributed, an assumption that can never be strictly true, and that all sib pairs are independent. Thus groups (i) and (iii) enable us to investigate the robustness of the method against these distributional assumptions in a large sample.
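The sizes of the three groups and the map distance quoted in (ii) can be checked with a short Python sketch. The function name `kosambi_cm` and the variable names are illustrative assumptions, not part of S.A.G.E.; the formula is the standard Kosambi mapping function, which for a recombination fraction of 0.3 gives roughly 34.7 cM, close to the 34.5 cM used in the text.

```python
import math


def kosambi_cm(theta):
    """Kosambi map distance in cM for a recombination fraction 0 <= theta < 0.5."""
    # d (Morgans) = (1/4) * ln((1 + 2*theta) / (1 - 2*theta)); multiply by 100 for cM.
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


# Counts taken from the text: 367 markers, 200 replicates, five genetic traits plus EF.
n_markers, n_replicates = 367, 200
linked_markers, unlinked_markers = 162, 205

group_sizes = {
    "environmental factor": 1 * n_markers * n_replicates,
    "genetic, linked": 5 * linked_markers * n_replicates,
    "genetic, unlinked": 5 * unlinked_markers * n_replicates,
}
total_tests = sum(group_sizes.values())

print(group_sizes, total_tests)
print(round(kosambi_cm(0.3), 2), "cM")
```

The three group sizes sum to the 440,400 tests analyzed, which confirms that the groups partition the full set of tests.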

Figure 1 shows the complete cumulative distribution of each of these three groups of p-values. The distribution of the p-values for the “genetic, unlinked” group is indistinguishable from a uniform distribution. The distribution of the “genetic, linked” group lies above the uniform distribution because it includes true alternative hypotheses (linkage), while the “environmental factor” group is very slightly below, indicating that the test is then slightly conservative. Figure 2 shows the lower left hand corner of Figure 1 magnified. We can see that when linkage is absent the p-values are close to uniform up to very small values, there being a very small tendency to be liberal in the “genetic, unlinked” group.

Figure 3 shows the plots in Figure 1 transformed to magnify the departures from the cumulative uniform distribution: the value of the cumulative uniform distribution was subtracted from the empirical cumulative distributions of the three groups. The larger p-values of the “genetic, linked” group should appear in Figure 3 as a straight line corresponding to a uniform distribution of true null hypotheses, and the intercept of this line on the ordinate axis would give the proportion of all tests in which the alternative hypothesis is true. This was pointed out by Schweder and Spjøtvoll [1982] in terms of the kind of plot shown in Figure 1, but a similar result would hold for the transformed plot shown in Figure 3. The absence of such a straight line in this figure may be explained by the low power of the method (or equivalently, the small effect of the linked loci), so that true alternative hypotheses can have high p-values relatively close to 1, mixing in with the p-values of the true null hypotheses and thereby preventing their detection.
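The transformation behind Figure 3, and the idea of reading off the proportion of true nulls from the larger p-values, can be sketched in Python on simulated data. Everything here is an assumption made for illustration: the p-values are not the workshop data, the alternative distribution (a uniform variate raised to the fifth power) is arbitrary, and the estimator with threshold `lam` is only in the spirit of Schweder and Spjøtvoll [1982].

```python
import random


def ecdf_departure(pvals, grid):
    """Empirical CDF of the p-values minus the uniform CDF at each (ascending) grid point."""
    ordered = sorted(pvals)
    n = len(ordered)
    departures = []
    j = 0
    for t in grid:
        while j < n and ordered[j] <= t:
            j += 1
        departures.append(j / n - t)
    return departures


def null_proportion_estimate(pvals, lam=0.5):
    """Estimate the share of true nulls from the p-values above lam, assumed to be mostly nulls."""
    n_large = sum(1 for p in pvals if p > lam)
    return n_large / ((1.0 - lam) * len(pvals))


rng = random.Random(1997)                               # fixed seed for reproducibility
null_p = [rng.random() for _ in range(16000)]           # true nulls: uniform p-values
alt_p = [rng.random() ** 5 for _ in range(4000)]        # assumed weak alternatives
mixed = null_p + alt_p

grid = [i / 100 for i in range(1, 100)]
dep_null = ecdf_departure(null_p, grid)
dep_mixed = ecdf_departure(mixed, grid)
pi0_hat = null_proportion_estimate(mixed)

print("max |departure|, nulls only:", max(abs(d) for d in dep_null))
print("departure at p = 0.10, mixture:", dep_mixed[9])
print("estimated proportion of true nulls:", pi0_hat)
```

In this toy mixture 80% of the tests are true nulls, yet the estimate tends to come out higher, because some alternatives have large p-values and are counted with the nulls. This is the same low-power effect invoked above to explain the absence of a straight line in Figure 3.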

Rejections

Fig. 1. Empirical cumulative distribution of three groups of tests, corresponding to the presence of an environmental factor, a linked trait locus, or an unlinked trait locus. The thick line is for the genetic, unlinked group, which at this magnification is indistinguishable from the cumulative uniform distribution.

Fig. 2. Lower left hand corner of Fig. 1, magnified. The thick line is the cumulative uniform distribution.

Fig. 3. Plot shown in Fig. 1 transformed to magnify departure from the uniform distribution, which corresponds to the abscissa.

For the smallest p-values, which are those of interest, the Bonferroni, Holm [1979], and Benjamini and Hochberg [1995] criteria give the same or very similar critical regions, so for brevity we report here only results for the Bonferroni method. It should be noted that for dependent hypotheses (close markers in our case) this method loses power [Schwager, 1984]. The main point of the Bonferroni method for multiple comparisons is that, to control type I errors at level α in N tests, we take α/N as the critical value.
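To illustrate why the three criteria tend to agree for the smallest p-values, the following Python sketch implements each rule and applies it to a small made-up list of p-values. The function names and the example values are hypothetical and are not taken from the workshop data.

```python
def bonferroni_reject(pvals, alpha):
    """Reject every hypothesis whose p-value is at most alpha / N."""
    n = len(pvals)
    return [p <= alpha / n for p in pvals]


def holm_reject(pvals, alpha):
    """Holm's step-down rule: compare the k-th smallest p-value with alpha / (N - k + 1)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    reject = [False] * n
    for rank, i in enumerate(order):          # rank = 0 for the smallest p-value
        if pvals[i] <= alpha / (n - rank):
            reject[i] = True
        else:
            break                             # stop at the first non-rejection
    return reject


def bh_reject(pvals, alpha):
    """Benjamini-Hochberg step-up rule: find the largest k with p_(k) <= alpha * k / N."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / n:
            k = rank
    reject = [False] * n
    for i in order[:k]:
        reject[i] = True
    return reject


# Hypothetical p-values for ten tests.
example_p = [1e-6, 2e-5, 0.012, 0.3, 0.6, 0.9, 0.04, 0.5, 0.7, 0.2]
n_bonferroni = sum(bonferroni_reject(example_p, 0.05))
n_holm = sum(holm_reject(example_p, 0.05))
n_bh = sum(bh_reject(example_p, 0.05))
print(n_bonferroni, n_holm, n_bh)
```

All three rules use the same threshold, α/N, for the very smallest p-value, so they can differ only further up the ordered list. In this toy example the Bonferroni and Holm rules reject the same two hypotheses and the Benjamini and Hochberg rule rejects one more.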

We calculated Bonferroni-adjusted p-values [Rosenthal and Rubin, 1983], i.e., every p-value was multiplied by the number of comparisons. When for a set of tests such an adjusted p-value is smaller than some cutoff value, such as 0.05 or 0.01, we reject H₀ and say that we have found linkage. When this happens but there is in fact no linkage, we have a false rejection of the null hypothesis, or “false discovery” [Holm, 1979; Soric, 1989].

We made all 440,400 tests (6 × 367 × 200) and found that the only linkage significant with p < 0.05 (adjusted) is between trait Q3 and marker D4G3 in replicate (data set) number 21, the nonadjusted p-value being 2.84 × 10⁻⁸. This result is poor because it does not take into account the fact that the data are separated into 200 sets. Taking this into account, we can examine the results as if 200 research groups, working independently, reported significant linkages (rejections). Every p-value is then adjusted for the number of tests in a replicate, i.e., 2,202. The rejections that are now found at the 0.05 level are shown in Table I.
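The two adjustments described here can be worked through with a minimal Python sketch. The helper name `bonferroni_adjust` is hypothetical, and capping the adjusted value at 1 is a common convention assumed here rather than something stated in the text.

```python
def bonferroni_adjust(p, n_tests):
    """Bonferroni-adjusted p-value: the raw p-value times the number of tests, capped at 1."""
    return min(1.0, p * n_tests)


p_raw = 2.84e-8                    # nonadjusted p-value for Q3 and D4G3 in replicate 21
n_all = 6 * 367 * 200              # all tests over all replicates
n_per_replicate = 6 * 367          # tests within a single replicate

adj_all = bonferroni_adjust(p_raw, n_all)
adj_replicate = bonferroni_adjust(p_raw, n_per_replicate)
critical_per_replicate = 0.05 / n_per_replicate

print(adj_all)                     # about 0.0125, below the 0.05 cutoff
print(adj_replicate)               # about 6.3e-05
print(critical_per_replicate)      # about 2.3e-05 on the unadjusted scale
```

Adjusting within a replicate multiplies each p-value by 2,202 instead of 440,400, which is why many more rejections appear at the 0.05 level under the second approach.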


TABLE I. Marker Loci at Which Linkages Were Found on the Basis of a 0.05 Cutoff, the Critical p-Value Being Adjusted for 2,202 Tests. Numbers in Parentheses Indicate the Number of Replicates in Which the Linkage Was Found, If More Than One.

Trait   Chromosome   Marker loci
Q1      2            6
        3            2
        5            11, 12(2), 13, 14, 15(3), 17(2), 18(3), 19(3)
        8            38, 45
        10           6
Q2      1            29, 31
        4            4, 7, 9, 11
        8            24(2), 26, 32, 34, 35
Q3      4            1, 2, 3, 10
        5            7, 10
        6            13, 15
        9            8
Q4      1            22
        8            20(2), 22(2), 23(2), 24(4), 25(2), 26(6), 27(6), 28(3), 29(7), 30(2), 31(2), 32(3), 33(2), 34(2), 35, 38, 39
        9            6, 7, 11, 13
Q5      2            33
        8            15, 27, 27, 28, 30
        9            6(2), 7, 8(2), 9(2), 10, 11

For all 440,400 tests taken together, we calculated the empirical false discovery rate as a function of the adjusted significance level (cutoff) used. The result, shown in Figure 4, indicates little change in the false discovery rate for adjusted p-values from 0.03 to 0.2. We repeated this calculation after dividing the markers into the even- and odd-numbered ones, to obtain two 4 cM maps, the results of which were pooled. Finally, we analogously obtained results for an 8 cM map. In each case, the results regarding false discovery rate were similar. The 440,400 tests are not independent; this would increase the sampling error, but not cause any bias, in the result found.
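The empirical false discovery rate at a given cutoff is simply the share of rejections that fall on true null hypotheses. A minimal Python sketch follows; the function name `empirical_fdr` and the toy records are hypothetical, and in the simulated workshop data the truth label would come from the known positions of the trait loci.

```python
def empirical_fdr(results, cutoff):
    """Proportion of rejections that are false, or None when nothing is rejected.

    results: iterable of (adjusted_p, truly_linked) pairs.
    """
    rejected = [linked for adj_p, linked in results if adj_p < cutoff]
    if not rejected:
        return None
    n_false = sum(1 for linked in rejected if not linked)
    return n_false / len(rejected)


# Hypothetical adjusted p-values with their true linkage status.
toy_results = [
    (0.001, True), (0.004, True), (0.02, False), (0.03, True),
    (0.08, True), (0.15, False), (0.4, False),
]

for cutoff in (0.03, 0.05, 0.1, 0.2):
    print(cutoff, empirical_fdr(toy_results, cutoff))
```

Evaluating the function over a grid of cutoffs gives a curve of the kind plotted in Figure 4.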

CONCLUSIONS

In a large sample (771 sib pairs), the probability of type I error is well controlled in the Haseman-Elston test as implemented in the SIBPAL program of S.A.G.E. [1996].

It appears that when we are dealing with complex traits and a reasonably large sample size, the exact critical value used to test for linkage has relatively little effect on the false discovery rate, i.e., the posterior probability of type I error. Although the generality of these findings remains to be determined, these preliminary results suggest that for adjusted p-values anywhere between 0.03 and 0.2 (i.e., unadjusted p-values between 1.5 × 10⁻⁵ and 1.0 × 10⁻⁴), somewhere between 6% and 12% of all reported linkages are false discoveries.

ACKNOWLEDGMENTS

This work was supported in part by research grant F05 TW05285-01 from the Fogarty International Center, and by U.S. Public Health Service Resource Grant P41 RR03655 from the National Center for Research Resources and research grant GM 28356 from the National Institute of General Medical Sciences.

Fig. 4. Plot of the proportion of linkage findings that are false discoveries, against the Bonferroni-adjusted p-value used to declare a finding of linkage.

REFERENCES

Benjamini Y, Hochberg Y (1995): Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc B 57:289-300.

Blangero J (1995): Genetic analysis of a common oligogenic trait with quantitative correlates: summary of GAW9 results. Genet Epidemiol 12:689-706.

Haseman JK, Elston RC (1972): The investigation of linkage between a quantitative trait and a marker locus. Behav Genet 2:3-19.

Hodge SE (1995): An oligogenic disease displaying weak marker associations: a summary of contributions to problem 1 of GAW9. Genet Epidemiol 12:545-554.

Holm S (1979): A simple sequentially rejective multiple test procedure. Scand J Statist 6:65-70.

Risch N (1991): A note on multiple testing procedure in linkage analysis. Am J Hum Genet 48:1058-1064.

Rosenthal R, Rubin DB (1983): Ensemble-adjusted p values. Psych Bull 94:540-541.

S.A.G.E. (1996): Statistical Analysis for Genetic Epidemiology, Release 3.0. Computer program package available from the Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland.

Schwager SJ (1984): Bonferroni sometimes loses. Am Statist 38:192-197.

Schweder T, Spjøtvoll E (1982): Plots of P-values to evaluate many tests simultaneously. Biometrika 69:493-502.

Soric B (1989): Statistical “discoveries” and effect-size estimation. J Am Statist Assoc 84:608-610.