The Journal of Foot & Ankle Surgery 53 (2014) 252–253
Contents lists available at ScienceDirect. Journal homepage: www.jfas.org

Investigators Corner
Anything You Can Do I Can Do Better
Daniel C. Jupiter, PhD
Associate Professor of Surgery, Department of Surgery, Texas A&M Health Science Center, College of Medicine; and Research Scientist I, Scott and White Memorial Clinic and Hospital, Temple, TX
Keywords: equivalence testing; non-inferiority testing
Financial Disclosure: None reported. Conflict of Interest: None reported.
Address correspondence to: Daniel C. Jupiter, PhD, 2401 South 31st Street, Temple, TX 76508. E-mail address: email@example.com
1067-2516/$ - see front matter. http://dx.doi.org/10.1053/j.jfas.2013.06.003

Abstract
Newly introduced drugs or treatments may not be substantively more effective than current therapies, but these drugs or treatments may have distinct advantages in terms of lower costs or fewer or less severe side effects. Demonstrating the utility of a novel treatment is thus unlike usual hypothesis testing, in which researchers seek to prove that treatments differ (i.e., that one treatment is better than another). Instead, researchers must prove that the treatments are equivalent in effectiveness (i.e., that the treatments do not differ). I discuss here how to execute this type of study: the non-inferiority study.
© 2014 by the American College of Foot and Ankle Surgeons. All rights reserved.
When we think of clinical trials of drugs, we usually think of the comparison of a drug to a placebo. When we think of trials of novel surgical techniques or of postoperative patient management protocols, we usually think of the comparison between 2 techniques where we expect to see a noticeable difference between them. Most of our statistical techniques are designed to study such differences. In fact, the entire machinery of the p value is phrased in terms of discovering differences. Researchers set up the null hypothesis of no difference in order to reject it, thus establishing the presence of a difference.
There may be times, however, when researchers wish to compare drugs or treatments, expecting and hoping not to see a difference. As an example, consider the case in which a drug is currently on the market, but it is rather expensive or has some unpleasant side effects. When a cheaper drug or a drug with fewer side effects comes along that is effective, we would prefer to use that drug. It need not be more effective than the current drug. Indeed, we might even tolerate a minimal loss of efficacy, given that it is much cheaper or has fewer deleterious effects. Similarly, if we develop a less invasive surgical technique, we would prefer to use the less invasive technique, as long as the postoperative results are essentially the same as those obtained with the current, more invasive, technique.
This type of comparison is entirely different from what we are used to. We have no desire to prove that there are differences; rather, we want to show sameness! And here, the machinery of the p value seems to fall apart. Look at the example of the novel drug mentioned earlier and assume that it is, indeed, an effective treatment. If we carry out our usual statistical tests that look for a difference in treatment effect, we will simply fail to reject the null hypothesis. At this point, we might be tempted to claim that we have shown that there is no difference between the 2 treatments, declare victory, and start prescribing the new drug. If we did so, we would be making a logical error, such as that discussed in an earlier Investigators Corner (1): that of misinterpreting non-rejection of the null hypothesis. In short, not seeing a difference does not prove that there is no difference.
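The pitfall is easy to see in a quick simulation. In the hypothetical sketch below (all numbers invented; Python with NumPy and SciPy), 2 drugs truly differ by 5 mm Hg, yet with only 10 patients per arm a standard t-test usually fails to reject the null; non-rejection clearly does not establish equality.

```python
# Hypothetical simulation: a real 5 mm Hg difference with small samples.
# Failing to reject the null here does NOT mean the drugs are equal.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
trials, misses = 1000, 0
for _ in range(trials):
    old = rng.normal(120, 8, size=10)  # SBP on current drug
    new = rng.normal(125, 8, size=10)  # SBP on new drug (truly worse by 5)
    _, p = stats.ttest_ind(old, new)
    if p >= 0.05:                      # "no significant difference"
        misses += 1
print(f"{misses / trials:.0%} of small trials saw no significant difference")
```

With roughly 8 mm Hg of patient-to-patient variability and only 10 patients per arm, well over half of these simulated trials fail to reach p < 0.05 despite a genuine underlying difference.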
What are we to do, then, if our arsenal of tools appears to be ill-equipped to address this type of problem? The key lies in rethinking not how we use the p value but how we set up our null hypothesis. Imagine that this drug is used to lower systolic blood pressure (SBP) from stage 1 hypertensive levels to low prehypertensive levels. The currently available drug lowers SBP to an average of 120 mm Hg, but it has side effects and is expensive. The newly proposed drug has many fewer side effects and is much cheaper, but it only reduces SBP to an average of 125 mm Hg. Essentially, our new drug is as effective as the old, or perhaps a little worse, but worse by an amount that doctors, patients, and insurance companies may be willing to accept, given the positive aspects of the new drug.
What if, rather than trying to show that our new drug is better than the old, which it is not, we try to show that our new drug is not worse than the old? In other words, we set up the null hypothesis as follows:
Null hypothesis: The novel drug reduces SBP to a level at least 10 mm Hg higher than the level to which the currently available drug reduces SBP.
Given this null hypothesis, the alternative hypothesis, which we prove if we reject the null hypothesis above, is this:
Alternative hypothesis: The novel drug reduces SBP to a level no more than 10 mm Hg higher than the level to which the currently available drug reduces SBP.
In other words, the new drug is not that much worse than the currently available drug. We are back on solid ground, using the p value properly and not relying on a misinterpretation of the non-rejection of the null hypothesis to prove our point.
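Concretely, this null hypothesis can be tested by shifting the new-drug measurements down by the margin and running an ordinary one-sided two-sample t-test. The sketch below uses Python with SciPy (version 1.6 or later, for the `alternative` argument) and invented data in which the new drug truly sits about 5 mm Hg above the old, well inside the 10 mm Hg margin.

```python
# Hypothetical non-inferiority test with a 10 mm Hg margin.
# H0: mean(new) - mean(old) >= 10   (new drug unacceptably worse)
# H1: mean(new) - mean(old) <  10   (new drug non-inferior)
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
old = rng.normal(120, 8, size=100)  # SBP on current drug
new = rng.normal(125, 8, size=100)  # SBP on new drug

margin = 10.0
# Subtracting the margin from the new-drug arm turns the shifted null
# into a standard one-sided two-sample t-test.
t_stat, p_one = stats.ttest_ind(new - margin, old, alternative="less")
print(f"t = {t_stat:.2f}, one-sided p = {p_one:.4g}")
if p_one < 0.05:
    print("Reject H0: the new drug is non-inferior within the margin")
```

Note that rejection here proves only "not more than 10 mm Hg worse"; it says nothing about the new drug being equal to or better than the old.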
I have just outlined the logic of the non-inferiority test, which is the test to show that 1 thing is not substantively worse than another. I could just as easily consider non-superiority testing or, by combining these 2, consider equivalence testing. Equivalence testing shows that the 2 treatments are not substantively different. In the setting of the hypothetical drugs discussed earlier, this means that the new drug reduces SBP to a level no more than 10 mm Hg higher than the current drug and that the current drug reduces SBP to a level no more than 10 mm Hg higher than the new drug. In short, the difference between the drugs is no more than 10 mm Hg in either direction.
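Equivalence testing runs the same machinery twice: two one-sided tests (often abbreviated TOST), one against each edge of the ±10 mm Hg margin, and equivalence is declared only if both nulls are rejected. A sketch with invented data, again in Python with SciPy:

```python
# Hypothetical equivalence (TOST) test with a +/- 10 mm Hg margin.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
old = rng.normal(120, 8, size=100)  # SBP on current drug
new = rng.normal(121, 8, size=100)  # nearly the same true effect

margin = 10.0
# Test 1 -- H0: mean(new) - mean(old) >= +margin (new substantively worse)
_, p_upper = stats.ttest_ind(new - margin, old, alternative="less")
# Test 2 -- H0: mean(new) - mean(old) <= -margin (new substantively better)
_, p_lower = stats.ttest_ind(new + margin, old, alternative="greater")
p_tost = max(p_upper, p_lower)  # both one-sided nulls must be rejected
print(f"TOST p = {p_tost:.4g}")
if p_tost < 0.05:
    print("Both nulls rejected: equivalent within +/- 10 mm Hg")
```

Taking the larger of the 2 one-sided p values is the conventional way to report a single TOST p value, since equivalence requires both rejections.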
I conclude with 2 remarks. My first remark is about the effect difference that we should look for in designing this type of study. Clearly, if I had hypothesized that my new drug reduced SBP levels to within 50 mm Hg of those of the current drug, clinicians would have been less than impressed. That difference is far too large. In general, in looking for non-inferiority or equivalence, the allowable difference between treatments should be less than clinically significant. In this way we ensure that, although we are not achieving exact equivalence, we are close to doing so, at least from a patient perspective. My second remark is simply that now that we understand the mechanics of non-inferiority and equivalence testing, there is little excuse not to use them in our own study designs. With this and a previous (1) Investigators Corner in hand, we should never again be fooled by proofs using non-rejection of the null.
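The choice of margin also drives the size of the trial: the smaller the allowable difference, the more patients are needed. The rough calculation below is a standard normal-approximation sketch, not from the original column, assuming the 2 drugs are truly identical, one-sided alpha of 0.025, 90% power, and a standard deviation of 8 mm Hg.

```python
# Approximate per-arm sample size for a non-inferiority trial, assuming
# a true difference of 0: n = 2 * (z_alpha + z_beta)^2 * sd^2 / margin^2.
import math
from scipy.stats import norm

def noninferiority_n(margin, sd=8.0, alpha=0.025, power=0.90):
    z_a = norm.ppf(1 - alpha)  # ~1.96 for one-sided alpha = 0.025
    z_b = norm.ppf(power)      # ~1.28 for 90% power
    return math.ceil(2 * (z_a + z_b) ** 2 * sd ** 2 / margin ** 2)

print(noninferiority_n(10.0))  # generous 10 mm Hg margin -> 14 per arm
print(noninferiority_n(2.0))   # strict 2 mm Hg margin -> 337 per arm
```

Shrinking the margin from 10 to 2 mm Hg multiplies the required sample by roughly 25, which is the practical tension behind keeping the margin just below clinical significance rather than far below it.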
Reference
1. Jupiter D. Turning a negative into a positive. J Foot Ankle Surg 52(4):556–557, 2013.