Testing for Mutual Exclusivity

This article was downloaded by: [McMaster University] On: 28 October 2014, At: 11:20 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Journal of Applied Statistics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/cjas20 Testing for Mutual Exclusivity Jonathan R. Bradley a & David L. Farnsworth a a School of Mathematical Sciences, Rochester Institute of Technology , Rochester, NY, USA Published online: 08 Oct 2009. To cite this article: Jonathan R. Bradley & David L. Farnsworth (2009) Testing for Mutual Exclusivity, Journal of Applied Statistics, 36:11, 1307-1314, DOI: 10.1080/02664760802582306 To link to this article: http://dx.doi.org/10.1080/02664760802582306 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. 
Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms- and-conditions



Journal of Applied Statistics, Vol. 36, No. 11, November 2009, 1307–1314

Testing for Mutual Exclusivity

Jonathan R. Bradley and David L. Farnsworth∗

School of Mathematical Sciences, Rochester Institute of Technology, Rochester, NY, USA

(Received 24 September 2008; final version received 23 October 2008)

A test for two events being mutually exclusive is presented for the case in which there are known rates of misclassification of the events. The test can be utilized in other situations, such as to test whether a set is a subset of another set. In the test, the null value of the probability of the intersection is replaced by the expected value of the number determined to be in the intersection by the imperfect diagnostic tools. The test statistic is the number in a sample that is judged to be in the intersection. Medical testing applications are emphasized.

Keywords: intersection test; misclassification rate; misdiagnosis rate; mutual exclusivity test; power; subset test

1. Introduction

If you wanted to determine whether the intersection of two sets is empty, you might take a sample and look for elements in the intersection. If there is just one element of the sample in the intersection, probably you would decide that the sets are not mutually exclusive. However, consider the situation in which the classification tools used on the sample are imprecise. Then, there could be some elements incorrectly identified as being in the intersection, so it is less clear whether the intersection is empty. Re-examining those elements that appear to be in the intersection could be a relatively quick way to make a decision. We consider situations in which re-examinations are not feasible. The statistical question is: How many elements can be determined by faulty classification tools as being in the intersection and still allow us to decide that indeed the intersection is empty in the population?

Williams and Strachan [16] investigated whether it is possible for patients to have psoriasis and eczema at the same time. They concluded that the two conditions can co-exist. The authors discussed at length threats to correct diagnoses, which were done by medical personnel during routine periodic general exams. The present statistical test is designed for circumstances such as this in which there might be errors in the diagnoses and it is not feasible to perform rediagnoses. Their example is revisited below.

It is not surprising that the authors' literature search yielded no tests for mutual exclusivity. It is difficult to directly test such a hypothesis with the standard methods, since, for example, the usual z-score of a test for a proportion is undefined, having a zero in its denominator under the null hypothesis. We take advantage of the fact that misclassifications create non-zero expected values for the quantities that would appear in the denominator.

∗Corresponding author. Email: [email protected]

ISSN 0266-4763 print/ISSN 1360-0532 online. © 2009 Taylor & Francis. DOI: 10.1080/02664760802582306. http://www.informaworld.com

Since the application that initially drew the authors' attention to the issue of testing for mutual exclusivity with imperfect observations is medical testing with the possibility of misdiagnoses, events will be called, for example, 'has condition A' or 'has disease A'. The word 'diagnosis' will sometimes be used instead of 'specification' or 'classification'. There are other applications besides medical testing. In the political science or social science realms, it could be claimed that possessing certain opinions on two topics are mutually exclusive or that individuals would not engage in two behaviors. Then, the misclassification might be the result of a faulty opinion poll, unformed opinions of the subjects, or lying.

The test for mutual exclusivity can be used to test the null hypothesis that one set is a subset of another set. Using the methods developed herein, we test the hypothesis that individuals with one of two psychiatric conditions will develop the other. Further extensions are discussed in the last section.

2. The test for mutual exclusivity

Designate by $A$ and $B$ the events of having the two conditions or diseases. Complements are written $A^c$ and $B^c$. The events of being diagnosed with these two conditions are $A^+$ and $B^+$, respectively. The events of being diagnosed as not having the two conditions are $A^-$ and $B^-$, respectively.

Assume that the precisions of the classification tools or clinical tests are known. In particular, the numerical values of the sensitivities $P(A^+ \mid A)$ and $P(B^+ \mid B)$ and their complements, which are the false negative rates $P(A^- \mid A)$ and $P(B^- \mid B)$, are known. Also, the values of the specificities $P(A^- \mid A^c)$ and $P(B^- \mid B^c)$ and their complements, which are the false positive rates $P(A^+ \mid A^c)$ and $P(B^+ \mid B^c)$, are known.

Usually, for medical tests, these misdiagnosis rates are established by their manufacturers or are available from the literature. Searching the United States Centers for Disease Control and Prevention's site www.cdc.gov with a keyword such as 'false positive' produces many examples. There is a vast literature on methods of estimating misclassification rates. Many involve double sampling, second samples, or a perfect diagnostic tool called a gold standard. For example, see Duffy et al. [4] for medical studies, Johnson et al. [10] for the quality control setting, and Biemer and Trewin [2] and Kuha and Skinner [11] for surveys. A remedy for situations in which only interval estimates of the misdiagnosis rates are known is presented in the next to last section.

We assume that the diagnoses are conditionally independent and nondifferential [4; 8, p. 168; 9, p. 356; 11, pp. 638–639]. The assumption of conditional independence implies that $P(A^+ \cap B^+ \mid A \cap B) = P(A^+ \mid A \cap B)\,P(B^+ \mid A \cap B)$. The assumption of nondifferential diagnoses implies $P(A^+ \mid A \cap B) = P(A^+ \mid A)$ and $P(B^+ \mid A \cap B) = P(B^+ \mid B)$. The assumptions include negative diagnoses and complements of $A$ and $B$. Hui and Walter [8] based their justification of these assumptions in part on the fact that the diagnostic tools for the two diseases can be very different.

In the sample of size $n$, there are $x$ individuals diagnosed with condition A only, $y$ individuals with condition B only, $w$ individuals with both conditions, and $n - x - y - w$ individuals diagnosed with neither condition.

The null hypothesis is $P(A \cap B) = 0$, and the alternative hypothesis is $P(A \cap B) > 0$. The binomial count $W$ of individuals in $A^+ \cap B^+$ is the test statistic. The point estimator of $P(A^+ \cap B^+)$ is $W/n$.

The appendix contains a derivation of the relationship between the probability $P(A^+ \cap B^+)$ that an individual will be diagnosed with both conditions and the probability $P(A \cap B)$ that an individual has both conditions. The other probabilities in the relationship are the known misdiagnosis rates $P(A^+ \mid A^c)$, $P(A^- \mid A)$, $P(B^+ \mid B^c)$, and $P(B^- \mid B)$ and the probabilities $P(A^+ \cap B^-)$ and $P(A^- \cap B^+)$, which can be estimated from the data by $x/n$ and $y/n$, respectively.

The false positive rate for the intersection under the null hypothesis is given by Equation (A.7). To find the maximum likelihood estimate of $P(A^+ \cap B^+)$, use the known values and maximum likelihood estimates of each factor on the right-hand side [7, p. 316]. The test statistic is the binomially distributed $W$, whose null distribution's probability parameter is estimated by

$$\hat{p} = \frac{P(B^+ \mid B^c)(x/n) + P(A^+ \mid A^c)(y/n) - P(A^+ \mid A^c)\,P(B^+ \mid B^c)}{1 - P(A^+ \mid A^c) - P(B^+ \mid B^c)}. \tag{1}$$

As an example, consider the artificial data for which $n = 5000$, $x = y = 600$, and $w = 44$. The false positive rates are $P(A^+ \mid A^c) = P(B^+ \mid B^c) = 0.03$. From Equation (1), the estimated value of the number in the sample that will be diagnosed as being in the intersection is $n\hat{p} = 5000\{(0.03)(600/5000) + (0.03)(600/5000) - (0.03)(0.03)\}/\{1 - 0.03 - 0.03\} = (5000)(0.0063)/(0.94) = (5000)(0.00670) = 33.51$. Using the normal distribution approximation to the binomial distribution with the continuity correction [3, pp. 152–154, 306–309; 7, pp. 222–224], the P-value is $P(W \geq 44) = P\{Z \geq (44 - 0.5 - 33.51)/\sqrt{5000(0.00670)(0.99330)}\} = P(Z \geq 1.79) = 0.037$, for standard normal $Z$. For level of significance 0.05, the null hypothesis of an empty intersection would be rejected. For small sample sizes, we use the exact binomial test [3, pp. 309–310; 7, pp. 266–267].
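The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration under the stated inputs: the function names `null_proportion` and `upper_tail_normal` are assumptions introduced here, not part of the paper, and the code just evaluates Equation (1) followed by the continuity-corrected normal approximation. Evaluated directly, the expression gives z of about 1.73 and a P-value of about 0.042, slightly different from the figures quoted in the text, with the same decision at the 0.05 level.

```python
import math

def null_proportion(n, x, y, fp_a, fp_b):
    """Equation (1): estimated P(A+ and B+) when P(A and B) = 0."""
    numerator = fp_b * x / n + fp_a * y / n - fp_a * fp_b
    return numerator / (1.0 - fp_a - fp_b)

def upper_tail_normal(w, n, p):
    """P(W >= w) by the normal approximation with continuity correction."""
    z = (w - 0.5 - n * p) / math.sqrt(n * p * (1.0 - p))
    # Upper tail of the standard normal via the complementary error function.
    return z, 0.5 * math.erfc(z / math.sqrt(2.0))

# Artificial data from the text.
n, x, y, w = 5000, 600, 600, 44
p_hat = null_proportion(n, x, y, fp_a=0.03, fp_b=0.03)
z, p_value = upper_tail_normal(w, n, p_hat)

print(f"expected count n*p_hat = {n * p_hat:.2f}")
print(f"z = {z:.2f}, one-sided P-value = {p_value:.3f}")
```

For small samples, the same `p_hat` could instead be passed to an exact binomial tail sum, as the text suggests.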

3. Estimation and power

An estimate of $P(A \cap B)$ can be obtained by substituting the known conditional misdiagnosis rates of the diagnostic tools and $x/n$, $y/n$, and $w/n$ into Equation (A.6). This maximum likelihood estimator is obtained by using these maximum likelihood estimators for the probabilities of diagnoses. The maximum likelihood estimate of $P(A \cap B)$ is

$$\frac{\{1 - P(A^+ \mid A^c) - P(B^+ \mid B^c)\}(w/n) - P(B^+ \mid B^c)(x/n) - P(A^+ \mid A^c)(y/n) + P(A^+ \mid A^c)\,P(B^+ \mid B^c)}{\{1 - P(A^- \mid A) - P(A^+ \mid A^c)\}\{1 - P(B^- \mid B) - P(B^+ \mid B^c)\}}. \tag{2}$$

For the numerical example in the previous section with the additional information that the false negative rates are $P(A^- \mid A) = P(B^- \mid B) = 0.02$, the estimate of the probability, $P(A \cap B)$, that an individual in the population will have both conditions is 0.00238, so that the estimate of the number of individuals in the sample's intersection is $(5000)(0.00238) = 11.9$.

The power can be obtained with standard methods for binomial tests [3, pp. 306–310; 7, pp. 265–272]. For the current example with level of significance 0.05, the critical value of $W$ is $w_c = 33.51 + 1.645\sqrt{5000(0.00670)(0.99330)} = 43.00$. Say that we want the power at $P(A \cap B) = 0.005$. Obtain the corresponding value of $W$ from Equation (2): $0.005 = \{(0.94)(w/5000) - 0.0063\}/\{(0.95)(0.95)\}$, so $w = 57.51$. The power is $P(W \geq w_c) = P(W \geq 43.00) = P\{Z \geq (43.00 - 0.5 - 57.51)/\sqrt{5000(57.51/5000)(4942.49/5000)}\} = P(Z \geq -1.99) = 0.977$.
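The power calculation can be retraced as follows. This is a minimal sketch, assuming the artificial data and error rates of the running example; the variable names are illustrative only. It computes the critical value from the null proportion of Equation (1), inverts Equation (2) for the expected count at $P(A \cap B) = 0.005$, and applies the normal approximation.

```python
import math

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Running example: sample counts and (assumed known) error rates.
n = 5000
x = y = 600
fp_a = fp_b = 0.03   # false positive rates
fn_a = fn_b = 0.02   # false negative rates

# Null proportion from Equation (1) and the level-0.05 critical value.
p_null = (fp_b * x / n + fp_a * y / n - fp_a * fp_b) / (1 - fp_a - fp_b)
w_crit = n * p_null + 1.645 * math.sqrt(n * p_null * (1 - p_null))

# Invert Equation (2): proportion diagnosed with both when P(A and B) = 0.005.
target = 0.005
denom = (1 - fn_a - fp_a) * (1 - fn_b - fp_b)
p_alt = (target * denom + fp_b * x / n + fp_a * y / n - fp_a * fp_b) / (1 - fp_a - fp_b)
w_alt = n * p_alt

# Power by the normal approximation with continuity correction.
z = (w_crit - 0.5 - w_alt) / math.sqrt(n * p_alt * (1 - p_alt))
power = normal_upper_tail(z)

print(f"critical value = {w_crit:.2f}, expected count under alternative = {w_alt:.2f}")
print(f"z = {z:.2f}, power = {power:.3f}")
```

Changing `target` gives the power at other values of $P(A \cap B)$, which is a quick way to trace out a power curve.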

4. A clinical example

Consider the aforementioned study by Williams and Strachan [16], who claimed that it is possible for patients aged 11 years to have psoriasis (Condition A) and eczema (Condition B) at the same time, that is, the two conditions are not mutually exclusive. In their sample of $n = 9263$ 11-year-old British children, $x = 42$ were diagnosed with psoriasis alone, $y = 220$ were diagnosed with eczema alone, $w = 3$ were diagnosed with both conditions, and the remainder were diagnosed with neither condition. Take both false positive rates $P(A^+ \mid A^c)$ and $P(B^+ \mid B^c)$ to be 0.01, which are compatible with the authors' discussion. Under the null hypothesis of an empty intersection in the population, from Equation (1) the predicted proportion in the sample's intersection is $\{(0.01)(42/9263) + (0.01)(220/9263) - (0.01)(0.01)\}/\{1 - 0.01 - 0.01\} = 0.0001866$. The predicted number in the sample that would be diagnosed as having both conditions is $(9263)(0.0001866) = 1.73$. Since $W$ is a binomial random variable with $n = 9263$ and $p = 0.0001866$, the P-value is $P(W \geq 3) = 0.250$, so there is not sufficient evidence to say that the intersection is not empty. The events of having psoriasis and having eczema are apparently disjoint, so one would be able to predict that an individual with eczema would not have psoriasis and vice versa. Although the conclusion is that these children do not have both conditions, for illustrative purposes, from Equation (2) and using the false negative rates $P(A^- \mid A) = P(B^- \mid B) = 0.01$, the estimated value of $P(A \cap B)$ is 0.000140 and the estimated number of children in the sample with both conditions is $(9263)(0.000140) = 1.30$.
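The clinical example can be checked with an exact binomial tail. The sketch below assumes the counts and the 0.01 error rates used in the text; `binom_upper_tail` is a hypothetical helper written for this illustration. It evaluates Equation (1), the exact P-value, and the Equation (2) estimate.

```python
import math

def binom_upper_tail(w, n, p):
    """Exact P(W >= w) for W ~ Binomial(n, p), via the complement of P(W < w)."""
    lower = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(w))
    return 1.0 - lower

# Counts reported for the psoriasis (A) and eczema (B) study.
n, x, y, w = 9263, 42, 220, 3
fp_a = fp_b = 0.01   # false positive rates taken in the text
fn_a = fn_b = 0.01   # false negative rates taken in the text

# Equation (1): proportion diagnosed with both conditions under the null.
p_null = (fp_b * x / n + fp_a * y / n - fp_a * fp_b) / (1 - fp_a - fp_b)
p_value = binom_upper_tail(w, n, p_null)

# Equation (2): estimate of P(A and B).
pi_ab = ((1 - fp_a - fp_b) * w / n - fp_b * x / n - fp_a * y / n + fp_a * fp_b) / (
    (1 - fn_a - fp_a) * (1 - fn_b - fp_b)
)

print(f"p_null = {p_null:.7f}, expected count = {n * p_null:.2f}")
print(f"exact P(W >= {w}) = {p_value:.3f}")
print(f"estimated P(A and B) = {pi_ab:.6f}, estimated count = {n * pi_ab:.2f}")
```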

5. Test of being a subset

The present test can be used to test the null hypothesis that event $B$ is a subset of event $A$, that is, $P(A^c \cap B) = 0$. An application is the hypothesis that individuals with condition B must have condition A. This is a test for events being nested.

Consider the following example. Generalized anxiety disorder (GAD) is a mental disorder in which a person suffers from consistent and excessive worry. Major depression (MD) is characterized by intense and enduring feelings of sadness. The null hypothesis is that in the population of American undergraduate- and graduate-level university students the set of individuals with GAD is, over time, a subset of those with MD. The thinking is that periods of more-or-less constant anxiety lead to MD with repeated stressors in this population. Data from Eisenberg et al. [5] are used below to test this hypothesis. They mentioned the high comorbidity of GAD and MD, but did not pose this hypothesis.

Let $C$ be the event of having MD and $B$ be the event of having GAD. The null hypothesis is that $B \subset C$, which is equivalent to $C^c \cap B$ being empty or $P(C^c \cap B) = 0$. The alternative hypothesis is $P(C^c \cap B) > 0$.

Eisenberg et al. [5] employed the Patient Health Questionnaire. They used the sensitivity and specificity of the instrument for MD as $P(C^+ \mid C) = 0.73$ and $P(C^- \mid C^c) = 0.98$, respectively, and the sensitivity and specificity for GAD as $P(B^+ \mid B) = 0.63$ and $P(B^- \mid B^c) = 0.97$, respectively, from Spitzer et al. [15]. These numbers are similar to those in other studies [1]. To convert to the notation of the test for mutual exclusivity, designate by $A$ the event $C^c$ of not having MD. Then, $P(A^+ \mid A^c) = P(C^- \mid C) = 1 - 0.73 = 0.27$. Also, $P(B^+ \mid B^c) = 1 - 0.97 = 0.03$.

In Eisenberg et al. [5, p. 538], the sample size was $n = 2843$, the number diagnosed with neither MD nor GAD was $x = 2606$, the number with MD and GAD was $y = 56$, and the number without MD and with GAD was $w = 55$. From Equation (1),

$$\hat{p} = \frac{(0.03)(2606/2843) + (0.27)(56/2843) - (0.27)(0.03)}{1 - 0.27 - 0.03} = 0.0353.$$

Since the requisite conditions for using the normal approximation to the binomial are satisfied, the P-value is $P(W \geq 55) = P[Z \geq \{55 - 0.5 - (0.0353)(2843)\}/\sqrt{2843(0.0353)(0.9647)}] = P(Z \geq -4.66) \approx 1.00$. At any level of significance, we fail to reject the null hypothesis that the set of individuals with GAD is a subset of those with MD. So, psychiatrists can predict that an individual with GAD will develop MD in an environment with repeated stressors. With this knowledge, a change in the patient's environment becomes more urgent. This provides more understanding about worry; in particular, in stressful environments, worry almost certainly develops into depression.
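The subset test reuses Equation (1) after relabelling, which the following sketch illustrates. It assumes the counts and the sensitivity and specificity values quoted above, with $A$ standing for "not MD" and $B$ for GAD; the variable names are illustrative.

```python
import math

# Counts in the mutual-exclusivity notation (A = not MD, B = GAD).
n = 2843
x = 2606   # diagnosed A only: neither MD nor GAD
y = 56     # diagnosed B only: MD and GAD
w = 55     # diagnosed A and B: GAD without MD

fp_a = 1 - 0.73   # P(A+ | A^c) = P(C- | C), one minus the MD sensitivity
fp_b = 1 - 0.97   # P(B+ | B^c), one minus the GAD specificity

# Equation (1) under the null hypothesis that GAD is a subset of MD.
p_hat = (fp_b * x / n + fp_a * y / n - fp_a * fp_b) / (1 - fp_a - fp_b)

# Normal approximation with continuity correction for P(W >= w).
z = (w - 0.5 - n * p_hat) / math.sqrt(n * p_hat * (1 - p_hat))
p_value = 0.5 * math.erfc(z / math.sqrt(2.0))

print(f"p_hat = {p_hat:.4f}, expected count = {n * p_hat:.1f}")
print(f"z = {z:.2f}, one-sided P-value = {p_value:.4f}")
```

The observed count of 55 falls well below the roughly 100 false classifications expected under the null hypothesis, which is why the one-sided P-value is essentially one.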


6. Conditions

There are certain conditions that need to be met in order to run this test. Clearly, there must be the possibility of false positives.

The terms $1 - P(A^+ \mid A^c) - P(B^+ \mid B^c)$, $1 - P(A^- \mid A) - P(A^+ \mid A^c)$, and $1 - P(B^- \mid B) - P(B^+ \mid B^c)$ in Equations (1) and (2) must be positive. Any diagnostic procedure with error rates so large that these terms are not positive would not be clinically effective.

For some numerical examples, the estimate of the intersection's false positive rate, $\hat{p}$ in Equation (1), under the null hypothesis can be less than zero or more than one. In those cases, the maximum likelihood estimate is zero or one, respectively [14, pp. 720–722]. The value of $\hat{p}$ in Equation (1) will be less than zero when $x/n$ and $y/n$ are small compared with the false positive rates. In those cases, the test for mutual exclusivity cannot be employed because the relatively large misdiagnosis rates cause overwhelming uncertainty in the observations. In somewhat similar settings, remedies have been suggested to obtain nonzero estimates. These include using Bayesian analyses [6,12] and latent variable analysis [13].

There is the possibility that $\hat{p}$ in Equation (1) may be negative in cases in which $P(A^+ \cap B^+)$ is nonnegative from Equation (A.7). Using Equation (1), an estimator of the variance of $\hat{p}$ is

$$\frac{\{P(B^+ \mid B^c)\}^2\,\mathrm{Var}(X/n) + \{P(A^+ \mid A^c)\}^2\,\mathrm{Var}(Y/n)}{\{1 - P(A^+ \mid A^c) - P(B^+ \mid B^c)\}^2}. \tag{3}$$

For the values in the artificial data, Equation (3) yields the estimate $9.36 \times 10^{-6}$ and an estimated standard deviation of 0.00306. In this example, $\hat{p} = 0$ is 2.19 standard deviations from the computed $\hat{p} = 0.00670$.

The requirements of the diagnoses being conditionally independent and nondifferential should be investigated for each experiment. If they are false, then in the test for mutual exclusivity Equation (A.7) is replaced by $P(A^+ \cap B^+) = P(A^+ \cap B^+ \cap (A \cap B)^c)/P((A \cap B)^c)$, whose estimator could be drastically different from Equation (1). Gustafson [6, Chapter 5] discusses misclassifications that are not conditionally independent and nondifferential.

Perhaps the greatest threat to the validity of decisions based on employing these procedures is in the assumption that the misdiagnosis rates are known. If they are not known exactly, a remedy is to perform the calculations with a range of rates that might be correct for the population. Fortunately, the formulas for these procedures are quite simple, so this tactic does not cause excessive numerical work. For example, for the artificial data with $n = 5000$, $x = y = 600$, and $w = 44$, say that it is reasonable to believe that both false positive rates that appear in Equation (1) are in the interval (0.025, 0.030). Using 0.025 gives an expected count of $n\hat{p} = 28.29$ and P-value = 0.002. All P-values for misdiagnosis rates in this interval are between 0.002 and 0.037. Indeed, for small values of the rates, $\hat{p}$ in Equation (1) is increasing. Use the largest value, 0.037, for the P-value. If these misdiagnosis rates are in (0.025, 0.035), then using 0.035 for both gives an expected count of $n\hat{p} = 38.58$ and P-value = 0.212, yielding a different conclusion from the static use of just the one misdiagnosis rate of 0.03.
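Scanning a range of rates is easy to script. The sketch below is an assumption-laden illustration, not the authors' code: it takes both false positive rates equal, uses the artificial data, and applies the continuity-corrected normal approximation at each candidate rate. The grid of rates and the function name `expected_and_p_value` are choices made here for illustration.

```python
import math

def expected_and_p_value(fp, n=5000, x=600, y=600, w=44):
    """Expected null count n*p_hat from Equation (1) and the one-sided P-value,
    assuming both false positive rates equal fp."""
    p_hat = (fp * x / n + fp * y / n - fp * fp) / (1 - 2 * fp)
    z = (w - 0.5 - n * p_hat) / math.sqrt(n * p_hat * (1 - p_hat))
    return n * p_hat, 0.5 * math.erfc(z / math.sqrt(2.0))

# Candidate false positive rates spanning the interval (0.025, 0.035).
rates = [0.025, 0.0275, 0.030, 0.0325, 0.035]
results = {fp: expected_and_p_value(fp) for fp in rates}

for fp, (expected, p_value) in results.items():
    print(f"rate = {fp:.4f}: expected count = {expected:6.2f}, P-value = {p_value:.3f}")

# A conservative report uses the largest P-value over the plausible range.
worst_case = max(p for _, p in results.values())
print(f"largest P-value over the grid = {worst_case:.3f}")
```

Reporting the largest P-value over the plausible range follows the conservative choice recommended in the text.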

7. Concluding comments

There are extensions of the test's procedure. The methodology to test for mutual exclusivity can be easily adapted to test $P(A \cap B) = p_0$ for $p_0 \neq 0$. In that case, Equation (A.7) is replaced by the rearrangement of Equation (A.6) retaining $P(A \cap B)$. The replacement for Equation (1) contains $p_0$. The test, estimator, and power proceed with those changes. This generalization has wide applicability. Both this extension and the test for mutual exclusivity are easily implemented computationally.


The test for mutual exclusivity and its extensions might be further expanded to contain sensitivities and specificities that are not constant and to relax the requirements of the diagnoses being conditionally independent and nondifferential.

These tests offer a different perspective on misclassifications. Instead of misclassifications being something to be dealt with as threats to validity, we use them to answer questions that we would not be able to answer otherwise. This is a more utilitarian perspective on misclassifications.

References

[1] S. Becker, K. Al Zaid, and E. Al Faris, Screening for somatization and depression in Saudi Arabia: A validation study of the PHQ in primary care, Int. J. Psychiatr. Med. 32(3) (2002), pp. 271–283.

[2] P.P. Biemer and D. Trewin, A review of measurement error effects on the analysis of survey data, in Survey Measurement and Process Quality, L. Lyberg et al., eds., Wiley, New York, 1997, pp. 603–632.

[3] J.L. Devore, Probability and Statistics for Engineering and the Sciences, 7th ed., Duxbury, Belmont, CA, 2008.

[4] S.W. Duffy, T.E. Rohan, and N.E. Day, Misclassification in more than one factor in a case-control study: A combination of Mantel-Haenszel and maximum likelihood approaches, Stat. Med. 8 (1989), pp. 1529–1536.

[5] D. Eisenberg, S.E. Gollust, E. Golberstein, and J.L. Hefner, Prevalence and correlates of depression, anxiety, and suicidality among university students, Am. J. Orthopsychiatr. 77(4) (2007), pp. 534–542.

[6] P. Gustafson, Measurement Error and Misclassification in Statistics and Epidemiology: Impacts and Bayesian Adjustments, Chapman & Hall/CRC, Boca Raton, Florida, 2004.

[7] R.V. Hogg, J.W. McKean, and A.T. Craig, Introduction to Mathematical Statistics, 6th ed., Pearson Prentice Hall, Upper Saddle River, New Jersey, 2005.

[8] S.L. Hui and S.D. Walter, Estimating the error rates of diagnostic tests, Biometrics 36 (1980), pp. 167–171.

[9] S.L. Hui and X.H. Zhou, Evaluation of diagnostic tests without gold standards, Stat. Methods Med. Res. 7 (1998), pp. 354–370.

[10] N.L. Johnson, S. Kotz, and X. Wu, Inspection Errors for Attributes in Quality Control, Chapman & Hall, London, 1991.

[11] J. Kuha and C. Skinner, Categorical data analysis and misclassification, in Survey Measurement and Process Quality, L. Lyberg et al., eds., Wiley, New York, 1997, pp. 633–670.

[12] R.A. Lew and P.S. Levy, Estimation of prevalence on the basis of screening tests, Stat. Med. 8 (1989), pp. 1225–1230.

[13] E. Rahme and L. Joseph, Estimating the prevalence of a rare disease: Adjusted maximum likelihood, Statistician 47(1) (1998), pp. 149–158.

[14] V.K. Rohatgi, Statistical Inference, Wiley, New York, 1984.

[15] R.L. Spitzer, K. Kroenke, J.B.W. Williams, and The Patient Health Questionnaire Primary Care Group, Validation and utility of a self-report version of PRIME–MD: The PHQ Primary Care Study, J. Am. Med. Assoc. 282 (1999), pp. 1737–1744.

[16] H.C. Williams and D.P. Strachan, Psoriasis and eczema are not mutually exclusive diseases, Dermatology 189 (1994), pp. 238–240.

A. Appendix: $P(A \cap B)$ as a Function of $P(A^+ \cap B^+)$

A formula is derived for the probability $P(A \cap B)$ that an individual in the population has conditions A and B in terms of the probability $P(A^+ \cap B^+)$ that an individual in a sample is diagnosed with both conditions. It is a linear function. All other expressions in the function are known error rates for the diagnostic tools and the probabilities $P(A^+ \cap B^-)$ and $P(A^- \cap B^+)$, which can be estimated from the sample's values for $x/n$ and $y/n$.

For simplicity, designate by $\pi_{AB}$ the probability $P(A \cap B)$ that an individual has both conditions A and B; designate by $\pi_A$ the probability $P(A \cap B^c)$ that an individual has condition A alone; and designate by $\pi_B$ the probability $P(A^c \cap B)$ that an individual has condition B alone. Then, $P(A^c \cap B^c) = 1 - \pi_A - \pi_B - \pi_{AB}$. Also, set $p_1 = P(A^- \mid A)$, $p_2 = P(A^+ \mid A^c)$, $p_3 = P(B^- \mid B)$, and $p_4 = P(B^+ \mid B^c)$. So, the sensitivity of the clinical test for condition A is $1 - p_1$. The specificity of the test for condition A is $1 - p_2$. The sensitivity and specificity for condition B are $1 - p_3$ and $1 - p_4$, respectively.


For each individual, there are four possible outcomes of the two clinical tests. Since $A \cap B$, $A \cap B^c$, $A^c \cap B$, and $A^c \cap B^c$ are mutually exclusive, using that the diagnoses are conditionally independent and nondifferential, the respective probabilities of having each of the four outcomes are

$$P(A^+ \cap B^+) = (1 - p_1)(1 - p_3)\pi_{AB} + (1 - p_1)p_4\pi_A + p_2(1 - p_3)\pi_B + p_2 p_4(1 - \pi_A - \pi_B - \pi_{AB}), \tag{A.1}$$

$$P(A^+ \cap B^-) = (1 - p_1)p_3\pi_{AB} + (1 - p_1)(1 - p_4)\pi_A + p_2 p_3\pi_B + p_2(1 - p_4)(1 - \pi_A - \pi_B - \pi_{AB}), \tag{A.2}$$

$$P(A^- \cap B^+) = p_1(1 - p_3)\pi_{AB} + p_1 p_4\pi_A + (1 - p_2)(1 - p_3)\pi_B + (1 - p_2)p_4(1 - \pi_A - \pi_B - \pi_{AB}), \tag{A.3}$$

$$P(A^- \cap B^-) = p_1 p_3\pi_{AB} + p_1(1 - p_4)\pi_A + (1 - p_2)p_3\pi_B + (1 - p_2)(1 - p_4)(1 - \pi_A - \pi_B - \pi_{AB}). \tag{A.4}$$

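These can be expressed in matrix notation as

$$\begin{pmatrix} P(A^+ \cap B^+) \\ P(A^+ \cap B^-) \\ P(A^- \cap B^+) \\ P(A^- \cap B^-) \end{pmatrix} = \begin{pmatrix} (1 - p_1)(1 - p_3) & (1 - p_1)p_4 & p_2(1 - p_3) & p_2 p_4 \\ (1 - p_1)p_3 & (1 - p_1)(1 - p_4) & p_2 p_3 & p_2(1 - p_4) \\ p_1(1 - p_3) & p_1 p_4 & (1 - p_2)(1 - p_3) & (1 - p_2)p_4 \\ p_1 p_3 & p_1(1 - p_4) & (1 - p_2)p_3 & (1 - p_2)(1 - p_4) \end{pmatrix} \begin{pmatrix} \pi_{AB} \\ \pi_A \\ \pi_B \\ 1 - \pi_{AB} - \pi_A - \pi_B \end{pmatrix}.$$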

Since each of its columns sums to one, the coefficient matrix is a stochastic matrix. Its determinant is $(1 - p_1 - p_2)^2(1 - p_3 - p_4)^2$, so the matrix is nonsingular unless $p_1 + p_2 = 1$ or $p_3 + p_4 = 1$.

Since Equation (A.4) can be obtained algebraically from Equations (A.1) to (A.3), we use these equations for the three unknowns $\pi_{AB}$, $\pi_A$, and $\pi_B$. Expressed in matrix notation, the equations become

$$\begin{pmatrix} P(A^+ \cap B^+) \\ P(A^+ \cap B^-) \\ P(A^- \cap B^+) \end{pmatrix} = \begin{pmatrix} (1 - p_1)(1 - p_3) - p_2 p_4 & (1 - p_1 - p_2)p_4 & p_2(1 - p_3 - p_4) \\ (1 - p_1)p_3 - p_2(1 - p_4) & (1 - p_1 - p_2)(1 - p_4) & -p_2(1 - p_3 - p_4) \\ p_1(1 - p_3) - (1 - p_2)p_4 & -(1 - p_1 - p_2)p_4 & (1 - p_2)(1 - p_3 - p_4) \end{pmatrix} \begin{pmatrix} \pi_{AB} \\ \pi_A \\ \pi_B \end{pmatrix} + \begin{pmatrix} p_2 p_4 \\ p_2(1 - p_4) \\ (1 - p_2)p_4 \end{pmatrix}. \tag{A.5}$$
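The column-sum and determinant properties of the 4 × 4 coefficient matrix can be checked numerically. The sketch below uses only the standard library; the error rates are arbitrary illustrative values, and `det` and `coefficient_matrix` are helper names introduced here, not from the paper.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def coefficient_matrix(p1, p2, p3, p4):
    """Coefficient matrix of Equations (A.1)-(A.4); rows are the diagnosed
    outcomes, columns are the true states (AB, A only, B only, neither)."""
    return [
        [(1 - p1) * (1 - p3), (1 - p1) * p4, p2 * (1 - p3), p2 * p4],
        [(1 - p1) * p3, (1 - p1) * (1 - p4), p2 * p3, p2 * (1 - p4)],
        [p1 * (1 - p3), p1 * p4, (1 - p2) * (1 - p3), (1 - p2) * p4],
        [p1 * p3, p1 * (1 - p4), (1 - p2) * p3, (1 - p2) * (1 - p4)],
    ]

# Arbitrary illustrative error rates (assumed values, not from the paper).
p1, p2, p3, p4 = 0.02, 0.03, 0.05, 0.04
M = coefficient_matrix(p1, p2, p3, p4)

column_sums = [sum(row[j] for row in M) for j in range(4)]
claimed_det = (1 - p1 - p2) ** 2 * (1 - p3 - p4) ** 2

print("column sums:", [round(s, 12) for s in column_sums])
print("determinant:", det(M), "closed form:", claimed_det)
```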


Solving Equation (A.5) for $\pi_{AB}$ gives

$$\pi_{AB} = \frac{(1 - p_2 - p_4)P(A^+ \cap B^+) - p_4 P(A^+ \cap B^-) - p_2 P(A^- \cap B^+) + p_2 p_4}{(1 - p_1 - p_2)(1 - p_3 - p_4)},$$

that is,

$$P(A \cap B) = \frac{\{1 - P(A^+ \mid A^c) - P(B^+ \mid B^c)\}P(A^+ \cap B^+) - P(B^+ \mid B^c)P(A^+ \cap B^-) - P(A^+ \mid A^c)P(A^- \cap B^+) + P(A^+ \mid A^c)P(B^+ \mid B^c)}{\{1 - P(A^- \mid A) - P(A^+ \mid A^c)\}\{1 - P(B^- \mid B) - P(B^+ \mid B^c)\}}. \tag{A.6}$$

This expression is used to estimate $P(A \cap B)$.

Under the null hypothesis $P(A \cap B) = 0$, rearranging Equation (A.6) gives

$$P(A^+ \cap B^+) = \frac{P(B^+ \mid B^c)P(A^+ \cap B^-) + P(A^+ \mid A^c)P(A^- \cap B^+) - P(A^+ \mid A^c)P(B^+ \mid B^c)}{1 - P(A^+ \mid A^c) - P(B^+ \mid B^c)}. \tag{A.7}$$

Equation (A.7) is used to find the expected proportion to be observed as being in the intersection in the sample.
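A round-trip check makes the algebra concrete: generate the diagnosed-outcome probabilities from chosen true probabilities with Equations (A.1) to (A.3), then confirm that Equation (A.6) recovers $\pi_{AB}$ and that Equation (A.7) reproduces $P(A^+ \cap B^+)$ when $\pi_{AB} = 0$. The true probabilities and error rates below are arbitrary assumed values, and the function names are illustrative.

```python
def diagnosed_probs(pi_ab, pi_a, pi_b, p1, p2, p3, p4):
    """Equations (A.1)-(A.3): P(A+ B+), P(A+ B-), P(A- B+)."""
    pi_0 = 1 - pi_ab - pi_a - pi_b   # probability of having neither condition
    pp = (1 - p1) * (1 - p3) * pi_ab + (1 - p1) * p4 * pi_a + p2 * (1 - p3) * pi_b + p2 * p4 * pi_0
    pm = (1 - p1) * p3 * pi_ab + (1 - p1) * (1 - p4) * pi_a + p2 * p3 * pi_b + p2 * (1 - p4) * pi_0
    mp = p1 * (1 - p3) * pi_ab + p1 * p4 * pi_a + (1 - p2) * (1 - p3) * pi_b + (1 - p2) * p4 * pi_0
    return pp, pm, mp

def pi_ab_from_diagnoses(pp, pm, mp, p1, p2, p3, p4):
    """Equation (A.6): P(A and B) from the diagnosed-outcome probabilities."""
    numerator = (1 - p2 - p4) * pp - p4 * pm - p2 * mp + p2 * p4
    return numerator / ((1 - p1 - p2) * (1 - p3 - p4))

def null_pp(pm, mp, p2, p4):
    """Equation (A.7): P(A+ B+) implied by P(A and B) = 0."""
    return (p4 * pm + p2 * mp - p2 * p4) / (1 - p2 - p4)

# Arbitrary illustrative error rates and true probabilities (assumed values).
rates = dict(p1=0.02, p2=0.03, p3=0.02, p4=0.03)

pp, pm, mp = diagnosed_probs(0.004, 0.11, 0.12, **rates)
recovered = pi_ab_from_diagnoses(pp, pm, mp, **rates)

pp0, pm0, mp0 = diagnosed_probs(0.0, 0.11, 0.12, **rates)
implied = null_pp(pm0, mp0, rates["p2"], rates["p4"])

print(f"recovered pi_AB = {recovered:.6f}")
print(f"P(A+ B+) under the null: direct = {pp0:.6f}, from (A.7) = {implied:.6f}")
```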
