robustness of results and conclusions in...

77
1 ROBUSTNESS OF RESULTS AND CONCLUSIONS IN SYSTEMATIC REVIEWS, TRIALS AND ABSTRACTS Anders W. Jørgensen PhD thesis 31 May 2011 The Nordic Cochrane Centre Rigshospitalet Faculty of Health Sciences University of Copenhagen

Upload: lengoc

Post on 14-Aug-2019

215 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

1

ROBUSTNESS OF RESULTS AND CONCLUSIONS IN

SYSTEMATIC REVIEWS, TRIALS AND ABSTRACTS

Anders W. Jørgensen

PhD thesis

31 May 2011

The Nordic Cochrane Centre

Rigshospitalet

Faculty of Health Sciences

University of Copenhagen

Page 2: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

2

TABLE OF CONTENTS

Preface and acknowledgements ..........................................................................................3

Structure of the thesis.......................................................................................................3

Acknowledgement............................................................................................................3

Financial support ..............................................................................................................3

Original Papers ....................................................................................................................4

Abstract................................................................................................................................5

Danish Summary..................................................................................................................6

Introduction ..........................................................................................................................7

Systematic reviews...........................................................................................................7

Bias ..................................................................................................................................8

The abstracts role in dissemination of information ............................................................9

Large losses to follow up, ITT and imputation of missing data ..........................................9

Multiple imputation .........................................................................................................10

Objectives ......................................................................................................................11

Methods and results...........................................................................................................12

Paper 1...........................................................................................................................12

Discussion of paper 1..................................................................................................18

Paper 2...........................................................................................................................21

Discussion of paper 2..................................................................................................29

Paper 3...........................................................................................................................31

Discussion of paper 3..................................................................................................39

Paper 4...........................................................................................................................41

Discussion of paper 4..................................................................................................60

Paper 5...........................................................................................................................62

Discussion of paper 5..................................................................................................67

Conclusion .........................................................................................................................69

Implications ....................................................................................................................70

Reccommendations for future research..........................................................................71

References ........................................................................................................................72

Page 3: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

3

PREFACE AND ACKNOWLEDGEMENTS

STRUCTURE OF THE THESIS

The structure of this thesis follows the guidelines from the Faculty of Health Sciences,

University of Copenhagen. First I will give an introduction to the context of this thesis and

present the objectives. The next part of the thesis consists of five articles, each followed by

an additional discussion of the methods and findings in the context of observations done by

other researchers on similar topics. Finally, I present an overall conclusion.

ACKNOWLEDGEMENTS

It is a great opportunity and a pleasure to thank those who made this thesis possible. I

would like to thank my supervisor Peter C. Gøtzsche, who in the first place triggered me as

a medical student into this field with his inspiring lectures about evidence based medicine,

and secondly helped me with financial support and facilities, but most importantly, for his

enthusiasm, drive and determination that also made this thesis possible.

I would also like to thank my co-authors for their constructive discussions and collaboration

on the papers.

My gratitude goes to all my colleagues at The Nordic Cochrane Centre for their support and

for making my three years as a PhD student a pleasant time.

Finally, I am forever indebted to my wife Anne who gave birth to our two wonderful girls and

made moving to a new house possible during my PhD studies, but above all for her

optimism, love and care.

FINANCIAL SUPPORT

I would like to thank The Nordic Cochrane Centre, The Danish Council for Independent

Research in Medical Sciences (FSS), The Health Insurance Foundation, The Medical

Research Foundation of the Capital Region of Denmark and Copenhagen University

Hospital, which funded this PhD.

Page 4: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

4

ORIGINAL PAPERS

The thesis is based on the following papers:

Jørgensen AW, Hilden J, Gøtzsche PC. Cochrane reviews compared with industry

supported meta-analyses and other meta-analyses of the same drugs: systematic review.

BMJ 2006;333:782.

Jørgensen AW, Maric KL, Tendal B, Faurschou A, Gøtzsche PC. Industry-supported meta-

analyses compared with meta-analyses with non-profit or no support: Differences in

methodological quality and conclusions. BMC Med Res Methodol 2008;8:60.

Jørgensen AW, Jørgensen KJ, Gøtzsche PC. Unbalanced reporting of benefits and harms

in abstracts on rofecoxib. Eur J Clin Pharmacol 2010;66:341-7.

Jørgensen AW, Lundstrøm LH, Wetterslev J, Astrup A, Gøtzsche PC. We should not

continue to use complete case analysis and single imputations like LOCF and BOCF. A

comparison of results from different imputation techniques of missing data from an anti-

obesity drug trial. (Manuscript to be submitted)

Gøtzsche PC, Jørgensen AW. Opening up data at the European Medicines Agency.

BMJ 2011;342:d2686.

Page 5: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

5

ABSTRACT

Bias in drug trials, e.g. introduced through lack of allocation concealment, lack of blinding, a

large amount of missing outcome data, no intention-to-treat analysis and selective outcome

reporting, is common and conclusions often favour the sponsor's product in drug trials.

Therefore, a critical appraisal of the research methodology and results is essential, but this

is not an easy task when important details and data are kept unpublished.

The aim of this PhD was to explore the robustness of results and conclusions in reviews,

randomised trials and abstracts. This was done in 4 studies. In the first study, we compared

Cochrane reviews with industry-supported meta-analyses and other meta-analyses of the

same drugs. In the second, we compared all industry-supported meta-analyses of drug-

drug comparisons with those without industry support. In the third, we studied the balance

between benefits and harms reported in abstracts on rofecoxib (Vioxx), and in the fourth,

we compared results from different imputation methods for missing outcome data in an anti-

obesity drug trial where we had access to individual patient data. A fifth paper, in which we

describe how we succeeded to get access to unpublished trial reports possessed by the

European Medicines Agency, has also been included in this thesis.

The results from the first two studies demonstrated that industry-supported reviews of drugs

should be read with caution, as they were less transparent and had more favourable

conclusions than other reviews. The third study found that the reporting of benefits and

harms in abstracts on Vioxx was unbalanced. The fourth showed great variation in the

results when different imputation techniques were used.

I conclude that systematic reviews, randomised trials and abstracts should be read with

caution, especially when the research is industry supported, as important information is

often not published. We suggest more transparency and access to unpublished trials and

the raw data and trial protocols.

Page 6: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

6

DANISH SUMMARY

Bias i lægemiddelforsøg, f.eks. opstået ved utilstrækkelig skjult allokering, manglende

blinding, manglende data, ingen ’intention-to-treat’-analyse og selektiv rapportering af

forsøgsresultater, er almindelig, og konklusionerne er ofte til fordel for sponsorens

lægemiddel. Derfor er det vigtigt, at man kritisk vurderer metoder og resultater, hvilket dog

ikke er nemt, når vigtige detaljer og data ikke bliver offentliggjort.

Formålet med ph.d.-projektet, som blev opdelt i 4 delprojekter, var at undersøge

pålideligheden af resultater og konklusioner i oversigtsartikler, artikler om lægemiddelforsøg

og artikelresumeer. I det første projekt sammenlignede vi Cochrane-oversigter med

firmastøttede og andre meta-analyser af de samme lægemidler. I det andet sammenlignede

vi firmastøttede meta-analyser af lægemiddelsammenligninger med ikke-frimastøttede

meta-analyser. I det tredje undersøgte vi balancen mellem gavnlige og skadelige virkninger

beskrevet i artikelresuméer om rofecoxib (Vioxx), og i det fjerde sammenlignede vi

forskellige imputationsmetoder for manglende data i et et forsøg med et lægemiddel mod

fedme, hvor vi havde adgang til individuelle patientdata. En femte artikel, hvor vi beskriver

hvordan det lykkedes os at få aktindsigt i hemmeligholdte forsøgsrapporter hos den

Europæiske Lægemiddelstyrelse, er også indkluderet i ph.d.-afhandlingen.

Resultaterne fra de to første projekter viste, at man skal være særlig kritisk overfor

firmastøttede meta-analyser, da de er sværere at gennemskue og oftere anbefaler det

eksperimentielle lægemiddel end andre meta-analyser. I det tredje projekt fandt vi en

ubalanceret beskrivelse af gavnlige og skadelige effekter i artikelresuméer om rofecoxib.

Det fjerde viste en betydelig variation i resultaterne, når forskellige imputationsmetoder blev

brugt.

Jeg konkluderer, at systematiske oversigtsartikler, artikler om lægemiddelforsøg og

artikeresuméer bør læses med omhu, fordi vigtig information ofte ikke offentliggøres, og

især når forskningen er støttet af firmaer. Jeg foreslår mere gennemsigtighed og fri adgang

til ikke-offentliggjorte lægemiddelforsøg, individuelle patientdata og forsøgsprotokoller.

Page 7: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

7

INTRODUCTION

Health professionals, including physicians, make a lot of decisions every day in their clinical

practice about patients’ diagnoses and treatments. They try to choose the best test for

diagnosing and the most effective intervention for treating diseases. A clinical decision is

based on the setting, the patient’s values and expectations and the physician’s experience,

skills, personality, and of course the knowledge, which primarily comes from the medical

literature (1). However, to make informed decisions requires available and reliable

information from unbiased research that seeks to answer clinical questions relevant to

patients, but to get this kind of information is difficult because the amount of healthcare

research is overwhelming (2) and most is not applicable to clinical decision-making.

Evidence-based medicine, defined as the integration of best research evidence with clinical

expertise and patient values (3), is helpful in making informed decisions. Practicing

evidence-based medicine is about trying to improve the quality of the information on which

the decisions are based by avoiding information overload and by finding and applying the

most useful information. A task that is not unique to medicine, but to all informed decisions

(4).

Within the framework of evidence-based medicine, study designs have traditionally been

arranged and graded hierarchically with randomised controlled trials (RCTs) ranking higher

than observational studies (5,6). Unknown and uncontrollable prognostic factors can bias

the result in observational studies if they are unbalanced, e.g. between the controls and

cases. Due to randomisation the chance of unbalanced prognostic factors in RCTs is low

and therefore the risk of bias is low.

SYSTEMATIC REVIEWS

Despite being retrospective, systematic reviews of RCTs rank higher than RCTs because

they summarize, if possible in a meta-analysis, all the evidence from the available RCTs

resulting in a higher degree of precision (5-7). The Cochrane Collaboration is an

independent organisation that produces systematic reviews (Cochrane reviews) that aim at

providing information for healthcare decisions by addressing the choices that politicians,

clinicians and patients face and by reporting meaningful outcomes. To minimize bias and to

promote transparency of methods, Cochrane reviews are based on pre-defined, peer-

reviewed and published protocols (8).

Page 8: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

8

BIAS

Systematic reviews and RCTs rank high in the hierarchy of evidence, but when conducted

poorly the risk of bias is high. Major limitations in study methodology increase the risk of

bias in these studies and lower their quality. For RCTs, these limitations in study design

include lack of allocation concealment, lack of blinding, large losses to follow-up and failure

to adhere to an intention-to-treat (ITT) analysis (9). If a meta-analysis is based on trials with

a high risk of bias, the result from the meta-analysis will also have a high risk of being

biased. For systematic reviews, poor searching and improper selection of studies increase

the risk of bias. All this can be assessed if the papers of trials and systematic reviews report

important details transparently, but this is not always the case.

A major pitfall for evidence-based medicine is missing or unpublished information.

Published trials may only include some of the outcomes investigated and analyses actually

undertaken, and if there is a systematic difference between the reported and unreported

outcomes, it may lead to bias. Such selective outcome reporting bias has been

demonstrated in studies of RCTs (10,11). The same is true for publication bias, where

positive results get published more often than other results (12,13). This means that the

information that doctors and other health personal rely on represent the most positive

information and thus the informed decision becomes an illusion.

Part of the solution to publication and selective reporting bias is to let independent

researchers have access to data at the medicines regulatory authorities. When

manufacturers apply for marketing approval, they submit detailed reports of all trials,

published or unpublished to the medicines agencies. Such invaluable information should be

used to provide a more reliable basis for an informed decision.

There are several motives for deliberate introduction of bias in research. Interests, financial

and non-financial, may conflict with the responsibility to conduct, design and report trials

objectively. Little research has been done to investigate non-financial conflict of interests

(14), whereas numerous studies have documented that financial conflicts of interest may

influence the study outcome, and the evidence has been synthesized in several systematic

reviews (15-18).

Page 9: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

9

THE ABSTRACT'S ROLE IN DISSEMINATION OF INFORMATION

When finding the best available evidence, the search often begins with databases. In

addition to the title of the article, the databases usually include an abstract, which is a

condensation of the information in the article. This makes the abstract the key entry to the

information and the part of a paper that - second to the title - is most often read. Sometimes

the abstract is the only published information about a trial, e.g. presented at a conference,

and sometimes clinical decisions are based solely on abstracts (19,20). Abstracts are

therefore a very important information source when evaluating an intervention, which

underlines the importance of well-written abstracts. Since medical interventions and tests

have the potential to be both beneficial and harmful, the benefits and harms must be equally

prioritised and reported in the medical literature including the abstract (20).

LARGE LOSSES TO FOLLOW UP, AND IMPUTATION OF MISSING DATA

Attrition is inevitable in large long-term RCTs. Some subjects will discontinue the treatment

due to harms, lack of benefit or other reasons, for example moving home, which makes

participation in the trial troublesome. If there is a systematic difference in attrition between

the intervention and the control group, the risk of bias is high if only the subjects who

completed the trial are analysed. Intention-to-treat (ITT) is a strategy to reduce this bias and

implies that all randomised subjects are included in the analyses, regardless of whether

they received the treatment to which they were originally allocated, or were subsequently

withdrawn or deviated from the protocol in other respects (21). In other words, an ITT

analysis estimates the effect of the intention to give a particular treatment and thus mimics

clinical practice where patients also discontinue treatments for the same reasons as stated

above. Therefore, the principle of ITT is widely accepted and appears in several guidelines

and recommendations (22,23). Frequently, however, the ITT approach is inadequately

described and inappropriately applied (8,21). To make a proper ITT analysis, it is important

that data are available for all patients, but missing data is a common problem in all types of

medical research.

Attrition and missing data do not always introduce bias. If data are missing by chance then

the available data can be regarded as a random sample of the ‘full’ data set and there is no

need for imputation. But this situation is not common in trials and missing data must

therefore be imputed.

In trials of anti-obesity drugs, the most used method for imputing missing data is ‘last

observation carried forward’ (LOCF) (24). This method replaces missing measurements of

body weight with the last measured weight. For example, if 5 weight measurements have

been planned, but a patient only attends the first 3 visits, the missing weight measurements

Page 10: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

10

on visits 4 and 5 are replaced by the measurement at visit 3. To assume that a subject’s

weight is unchanged after dropping out of a trial is hard to justify, as most patients regain

much of their lost weight within a short period of time after they stop treatment (25). Further,

LOCF overestimates the precision of the effect estimate because the dataset is analysed as

if it were a ‘complete’ dataset with no missing data (26). Therefore, LOCF is likely to be a

biased analysis.

There are other and more reliable techniques for handling missing data (27). Attention has

been drawn to multiple imputation (MI) (28) and it was recently recommended for handling

missing data in obesity trials (24).

MULTIPLE IMPUTATION

Multiple imputation may be the most reliable technique (29). It is a stepwise procedure. In

the first stage, multiple copies of the dataset are being created with the missing values

replaced by imputed values. The imputed values are randomly drawn from their predictive

distribution based on the observed data. Since we cannot know the true values of the

missing data, the imputation procedure must inject appropriate variability into the multiple

imputed values to account for the uncertainty when the missing values are predicted (which

single imputation such as LOCF doesn't do). In the second stage, the multiple data sets that

have been generated are analysed with standard statistical methods and the results, which

differ from dataset to dataset due to the variation introduced in the imputation of the missing

values, are averaged. The final estimate is unbiased because we are averaging over the

distribution of the missing data given the observed data and incorporating the uncertainty of

the imputed missing values into the confidence intervals.

Page 11: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

11

OBJECTIVES

I wanted to explore the robustness of results and conclusions in systematic reviews,

randomised trials and abstracts and to get access to unpublished data at the European

Medicines Agency. I undertook five projects with the aims:

1. To compare Cochrane reviews with industry-supported meta-analyses and other

meta-analyses of the same drugs for differences in conclusions and methodological

quality.

2. To assess all meta-analyses indexed on PubMed of drug-drug comparisons

published in 2004 and compare those with industry support with those with other or

no support.

3. To quantify how often benefits and harms in terms of gastrointestinal bleeding and

cardiovascular thrombosis, respectively, were reported in abstracts on rofecoxib.

4. To analyse the weight change and compare the results when using four different

methods for handling missing data in an randomised trial of an anti-obesity drug.

5. To describe the character of the arguments in our case when we applied for access

to unpublished trial reports and protocols of two anti-obesity drugs at the European

Medicines Agency.

Page 12: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

12

METHODS AND RESULTS

PAPER 1

The aim of the first study was to compare methodological quality and conclusions in

Cochrane reviews with those in industry supported meta-analyses and other meta-analyses

of the same drugs.

Page 13: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

Research

Cochrane reviews compared with industry supported meta-analysesand other meta-analyses of the same drugs: systematic reviewAnders W Jørgensen, Jørgen Hilden, Peter C Gøtzsche

AbstractObjective To compare the methodological quality andconclusions in Cochrane reviews with those in industrysupported meta-analyses and other meta-analyses of the samedrugs.Design Systematic review comparing pairs of meta-analysesthat studied the same two drugs in the same disease and werepublished within two years of each other.Data sources Cochrane Database of Systematic Reviews (2003,issue 1), PubMed, and Embase.Data extraction Two observers independently extracted dataand used a validated scale to judge the methodological qualityof the reviews.Results 175 of 1596 Cochrane reviews had a meta-analysis thatcompared two drugs. Twenty four meta-analyses that matchedthe Cochrane reviews were found: eight were industrysupported, nine had undeclared support, and seven had nosupport or were supported by non-industry sources. On a 0-7scale, the median quality score was 7 for Cochrane reviews and3 for other reviews (P < 0.01). Compared with industrysupported reviews and reviews with undeclared support,Cochrane reviews had more often considered the potential forbias in the review—for example, by describing the method ofconcealment of allocation and describing excluded patients orstudies. The seven industry supported reviews that hadconclusions recommended the experimental drug withoutreservations, compared with none of the Cochrane reviews(P = 0.02), although the estimated treatment effect was similaron average (z = 0.46, P = 0.64). Reviews with undeclared supportand reviews with not for profit support or no support hadconclusions that were similar in cautiousness to the Cochranereviews.Conclusions Industry supported reviews of drugs should beread with caution as they were less transparent, had fewreservations about methodological limitations of the includedtrials, and had more favourable conclusions than thecorresponding Cochrane reviews.

IntroductionBias in drug trials is common and often favours the sponsor’sproduct.1–3 Critical, systematic reviews that aggregate theavailable information in a neutral manner are therefore essential.Cochrane reviews aim to minimise bias and avoid conflicts ofinterest,4 and, on average, they may have greater methodologicalrigour than systematic reviews published in paper basedjournals.5 6 We therefore hypothesised that Cochrane reviewswould be more transparent and less biased than industry

supported systematic reviews. We aimed to compare Cochranereviews with other meta-analyses of the same drugs, which wedivided into those that had industry support, those withundeclared support, and those that had non-profit support or nosupport.

MethodsWe searched for pairs that consisted of a Cochrane review and asimilar review in a paper based journal. A Cochrane review waseligible if it used meta-analysis to compare at least two differentdrugs or classes of drugs; was published in the Cochrane Databaseof Systematic Reviews 2003, issue 1; could be matched with a meta-analysis of the same drugs and diseases published in full in apaper based journal within two years before or after the mostrecent substantive amendment of the Cochrane review; and hadno authors in common with the Cochrane review.

We defined support by the pharmaceutical industry as provi-sion of grants, authorship, or other major assistance such as helpwith the statistical analysis. We did not consider provision of ref-erences or unpublished trial reports as support.

One investigator (AWJ) hand searched all reviews in theCochrane Database of Systematic Reviews 2003, issue 1 for drugcomparisons. For each potentially eligible Cochrane review, wesought possibly eligible paper based reviews by searchingPubMed (January 1966 to July 2003) for the same diseases anddrugs combined with “meta-analysis” or meta-analysis[pt]. Fromonline inspection of titles and abstracts, we selected meta-analyses for examination of the full text. When we found nomatch in PubMed, we searched Embase (WebSPIRS 5) (1980 toAugust 2003). When we found more than one match with thesame type of support, we chose the one with the closest publica-tion date to the Cochrane review.

AWJ and PCG independently assessed each pair of reviews inrandom order, by reading the Cochrane review first in half of thepairs. We used a pilot tested data sheet and resolveddisagreements by discussion. We were not blinded. We extracteddata on the date of the most recent substantive amendment tothe Cochrane review and of the publication of the paper basedreview; names of relevant drugs and diseases; types of support;number and type of sources used to identify trials for the review;searches for unpublished trials; and descriptions of concealmentof allocation, details of blinding, and excluded patients and trials.

We assessed the methodological quality of the reviews withOxman and Guyatt’s index, which is a validated tool with nine

An appendix, three extra tables, and extra references w1-w48 are on bmj-.com

Cite this article as: BMJ, doi:10.1136/bmj.38973.444699.0B (published 6 October 2006)

BMJ

BMJ Online First bmj.com page 1 of 5

Page 14: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

items considering the potential for bias and an overallassessment on a 0-7 scale.7–9 We judged the review authors’ con-clusions by assessing whether the experimental intervention wasrecommended without reservations or whether it was notrecommended or recommended only with reservations.3 As ourdata were paired, we compared quantitative data with theWilcoxon-Pratt one sample rank sum test and binary data with asign test; P values are two sided.

For the industry supported reviews, we also assessed whetherthe estimated treatment effects were different from thosereported in the Cochrane reviews. We used the first reportedoutcome with data in the industry supported review that was alsopresented in the Cochrane review, and we also did thecorresponding analysis in which we started with the firstreported outcome in the Cochrane review. We calculated pooledcomparative z scores, after adjustment for the number of patientscontributing to the outcome and for the number of patients thatwere common to the two analyses (see appendix on bmj.com fordetails).

ResultsThe Cochrane Database of Systematic Reviews 2003, issue 1contained 1596 Cochrane reviews, of which we excluded 1421,mostly because they did not compare drugs (figure). In 72 of theremaining 175 reviews, the Medline and Embase searches identi-fied potentially eligible paper based reviews, of which weexcluded 48, mostly because their publication date differed bymore than two years from the last substantive update of theCochrane review, resulting in 24 matched pairs.w1-w48 OneCochrane revieww1 was paired with a paper based review withindustry supportw2 and also with one with undeclared supportw3;one of these reviewsw2 presented only a subgroup analysis, andwe therefore substituted it with an older review that containedthe same trials.w4 In eight of the 24 pairs the paper based reviewshad industry support, w1 w4-w18 in nine they had undeclaredsupport,w1 w3 w19-w34 and in seven they had non-profit or nosupport.w35-w48

The overall median quality score was 7 for the 24 Cochranereviews and 3 for the other reviews (P < 0.001; table A on

bmj.com). The mean years of publication for the three sets ofpairs were 2000 versus 2000 for Cochrane reviews versus indus-try supported reviews, 2000 versus 1999 for Cochrane reviewsversus reviews with undeclared support, and 2000 versus 2000for Cochrane reviews versus reviews with non-profit or nosupport; the median differences in number of included trialswere 0, 1, and 1 (table B on bmj.com).

Cochrane reviews versus industry supported reviews (eightpairs)Cochrane reviews were of higher quality than industrysupported reviews (P < 0.01). They also more often stated thesearch methods used to find studies (P = 0.06), searched compre-hensively (P = 0.06), avoided bias in the selection of studies(P = 0.03), reported criteria for assessing the validity of the stud-ies (P = 0.03), used appropriate criteria in assessing the studies(P < 0.01) (table A), described methods of concealment of alloca-tion (P = 0.02), and described excluded patients (P = 0.03) andstudies (P = 0.03), and they used more sources to identify studies(P = 0.02) (table B).

One of the industry supported reviews had no conclusion, asit referred to physiological characteristics of the drug.w6 Theother seven reviews supported by industry all recommended theexperimental drug without reservations, compared with none ofthe Cochrane reviews (P < 0.01). This difference was related tointerpretation of the data (table) and consideration of costs. Theauthors of six of the eight Cochrane reviews had reservationsabout the quality or relevance of the trials or their findings,w1 w5 w7

w9 w11 w17 and two noted that the effect decreased with increasingsample size.w5 w9 Seven mentioned the higher cost of theexperimental drug as a problem compared with none of theindustry supported reviews, of which two claimed that economicanalyses had shown that the experimental drug was costeffective.w4 w18 In contrast to these interpretations, the estimatedtreatment effect was similar, on average, in the pairs of reviews(pooled z = 0.46, P = 0.64; appendix and table C on bmj.com).However, the scatter of the comparative z scores was high(�2 = 19.4, df = 6, P = 0.004), which partly reflects differentialinclusion of trials and patients despite our close matching (tableC) and partly could be caused by selective or biased handling ofdata.

Cochrane reviews versus reviews with undeclared support(nine pairs)The results for the comparison with reviews with undeclaredsupport were similar to those for the industry supported reviews(table A), except that no significant differences existed for thestated search methods (P = 1.00) or efforts to avoid bias in theselection of studies (P = 0.22) and the recommendations werewithout reservations in one Cochrane review and in two otherreviews (P = 1.00).

The paper based reviews were often biased and poorly done.One seemed to have included non-randomised studies.w22

Another included retrospective studies, had arbitrary entry crite-ria, and seemed to have preferentially selected those studies anddata that were in favour of the experimental drug; the outcomewas analysed in eight different ways, and the biggest differencewas emphasised.w26 A third review presented a highly misleadingresult in the abstract that was based on an indirect comparison oftreatment arms,w30 and, in contrast to the Cochrane review,w29 theauthors failed to include all eligible studies and failed to detectsample size bias. A fourth review found no differences betweenthe drugs but stated, with reference to an economic analysis:“Coupled with potential cost savings driven by the reduced needfor hospitalization and revascularization procedures, it is not dif-

Reviews in 2003, issue 1 (n=1596)

Cochrane reviews eligible for a pair (n=175)

Pairs included (n=24)

Excluded (n=1421) Withdrawn from Cochrane Library (n=16) Did not review drug v drug comparisons (n=1209) Compared combination of drugs with one drug (n=25) Had no proper meta-analysis (n=171)

Search forCochranereviews inCochraneLibrary

Finalexaminationand inclusionor exclusionof bestmatchingpaper basedmeta-analysis

Onlinesearch

Excluded (n=48) Beyond ±2 years (n=30) Did not contain a meta-analysis (n=8) Inaccessible (n=2) On side effects only (n=3) Not the same diseases or drugs (n=4) Author in common with Cochrane review (n=1)

Potential pairs (n=72) PubMed (n=58) Embase (n=14)

Searches for pairs of reviews (first or most obvious reason for exclusionindicated)

Research

page 2 of 5 BMJ Online First bmj.com

Page 15: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

ficult to understand why clinicians, administrators and medicalorganizations representing the interests of physicians and thewelfare of patients look favourably upon [low molecular weightheparin].”w3

Cochrane reviews versus reviews with non-profit or nosupport (seven pairs)We found no significant differences between Cochrane reviewsand reviews with non-profit or no support (table A). The recom-mendations were without reservations in two Cochrane reviewsand in one other review.

However, in three pairs of reviews the conclusions favoureddifferent drugs. In one set, although the same trials wereincluded, the authors had different views on the trade-offbetween benefits and harms. The Cochrane review strongly sup-ported the use of warfarin in patients with atrial fibrillation withaverage or greater risk of stroke,w35 whereas the paper basedreview strongly favoured antiplatelet drugs.w36 In another set, thepaper review found better control of haemorrhage with

octreotide.w40 The Cochrane review did not find any differencesbut referred to another meta-analysis when it emphasised thatterlipressin is the only drug for which a reduction in mortalityhas been found compared with no treatment.w39 We havepublished a comment with this review, as we doubt that the effecton mortality is reliable.

In the third set, the paper review included only three doubleblind randomised trials and concluded that methadone shouldbe used for heroin dependence; it noted in an addendum thatLAAM (levomethadyl acetate hydrochloride) had been with-drawn from the market because of cardiotoxicity.w46 TheCochrane review included 18 studies, three of which were cohortstudies that were not analysed separately from the randomisedtrials, and concluded that LAAM seems to be more effective thanmethadone.w45 A fourth Cochrane review found a major samplesize bias but failed to make reservations in the abstract as regardsthe mortality benefit, which was based on all trials.w44

Characteristics of pairs of Cochrane reviews (C) and industry supported paper based reviews (I) of the same drugs

References Disease InterventionsNo of included trials

(both reviews/Conly/I only)

No ofincludedpatients

(C/I)

Significant differencefavouring drug of interest

CommentsBenefits

(C/I) Harms (C/I)

w1, w4

Acute coronarysyndrome

Enoxaparin vunfractionatedheparin

2/0/0 7081/7081 Yes/yes No/no C: control drug is cheaper; new trials with longer follow-up are needed.I: health economic study shows that enoxaparin lowers total costs ofcare. Our comment: both reviews found that control drug causessignificantly fewer minor bleeds

w5, w6 Schizophreniaor similarpsychoses

Amisulpride vtypicalantipsychotics

14/0/0 1702/1700 Yes/yes Yes/yes Both reviews noted that funnel plot suggested publication bias infavour of amisulpride. C: the result must be considered with provisos;amisulpride is expensive. I: dismissed finding, noting that according tomanufacturer no further studies had been done; no conclusion. Ourcomment: bias should not be dismissed; reasons for sample size biasother than selective publication exist

w7, w8 Rheumatoidarthritis

Celecoxib v otherNSAIDs

4/1/1 4465/4191 No/no Yes/yes C: 12 month results from CLASS study suggest that short term benefitof celecoxib on gastrointestinal ulcers may not persist; this isimportant as rheumatoid arthritis is a chronic disease and patients arelikely to be taking celecoxib for extended periods; increased cost. I: nosuch reservations

w9, w10 Schizophrenia Risperidone vhaloperidol

6/4/0 2326/1047 Yes/yes No/yes C: more weight gain with risperidone, which is costly; includedschizophrenia-like psychoses as no evidence that these should betreated differently from schizophrenia; funnel plots showed greaterbenefit for risperidone in smaller studies. I: all results favouredrisperidone; no comment on possible sample size bias. Our comment:inconsistency in inclusion of patients in relation to dose of risperidonein I review; excluded trials had rather negative results

w11, w12 Asthma Inhaledfluticasone vbudesonide andbeclometasone

14/16/0 7775/3564 Yes/yes No/yes C: plasma cortisol is an unreliable measure of harm. I: plasma cortisolfavours experimental drug. Our comment: no information on wherestudies in I review came from and only those published after 1995included; C review included 10 such studies that were missing in Ireview; complete discrepancy between two reviews in studies withcortisol values, and for several studies in I review C authors did notknow that cortisol had been measured; authors of I review had accessto individual patient data; all requests by authors of C review to getaccess to further data from authors of trial reports were unsuccessful

w13, w14 Asthma Salmeterol vtheophylline

3/2/6 777/1330 No/yes Yes/yes C: trend towards better effect of salmeterol, but inconsistent reportingof data precluded meta-analysis; cost analysis needed. I: significantdifferences in favour of salmeterol for all efficacy outcomes. Ourcomment: I review included only studies carried out by company; threemissing studies in C review were unpublished, two were abstracts, andone was untraceable as reference in I review was wrong

w15, w16 Depression Paroxetine v TCAs 15/17/24 5910/3758 NA/no Yes/yes C: SSRIs up to 30 times more costly than TCAs. I: no search strategy;company’s worldwide clinical database used. Our comment: 12 trials inC review indisputably fulfilled inclusion criteria for I review but weremissing; 20 trials included only in I review were “data on file;” Creview had missed three small trials, two of which were published onlyas abstracts

w17, w18 Depression Venlafaxine votherantidepressants

0/4/5 708/450 No/yes NA/no C: costs of SSRIs and limited benefit do not justify routine first lineuse. I: economic analyses have shown that venlafaxine is cost effective.Our comment: no significant benefit in favour of venlafaxine in Creview; I review was obscure; unclear selection criteria (“some selectedSSRIs” and “certain TCAs”); impossible to verify which trials wereincluded as references were to trial protocols; inappropriatemeta-analysis of study arms

NA=not available; NSAID=non-steroidal anti-inflammatory drug; SSRI=selective serotonin reuptake inhibitor; TCA=tricyclic antidepressant.

Research

BMJ Online First bmj.com page 3 of 5

Page 16: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

DiscussionWe found that although some Cochrane reviews had clear meth-odological deficiencies, these were fewer, on average, than inreviews published in paper based journals.

LimitationsA minor limitation was that we could not be blinded, as the lay-out of Cochrane reviews is unique; blinding has little impact onextraction of data for reviews.10 A more important limitation isthat our sample was small and needs to be replicated.Furthermore, we are affiliated with a Cochrane centre, andCochrane reviews are done according to a handbook11 that wasdeveloped partly by Andy Oxman, who also participated in thedevelopment of the validated index that we used for evaluatingmethodological quality.9 12 To help readers to make their ownjudgments, we have provided details on the comparison withindustry supported reviews (table) and added items to our dataextraction sheet that are indisputably important for the reliabil-ity of reviews, such as adequate concealment of allocation.13 14

Our findings correspond to another recent finding thatCochrane reviews assess methodological quality more often thando other reviews,15 although because of space constraints somepaper based reviews might have been conducted better than wasreported.

Industry supported reviewThe estimated treatment effects in industry supported reviewswere similar to those of Cochrane reviews, but the former haduniformly positive recommendations for the experimental drug,without reservations about methodological limitations of the tri-als or costs, in contrast to none of the Cochrane reviews. Thissuggests that the main problem with industry supported reviewslies in how conclusions are formulated.

We compared pairs of similar reviews published within a nar-row time frame. Other such pairwise comparisons have beenanecdotal. A meta-analysis found a similar drop-out rate forfluoxetine as for tricyclic antidepressants (P = 0.4),16 17 whereas acompany employee reported a marked difference in favour offluoxetine (P < 0.001)18 in subsequent correspondence. This issurprising, as the industry supported meta-analysis containedfewer patients and included “data on file” reports, which are usu-ally less favourable than published ones.19 20 In contrast to indus-try supported authors, authors of Cochrane reviews oftencannot get access to such data, as indicated in two of thereviews.w11 w37

As another example, a meta-analysis supported by Merckconcluded in 2001 that no increased risk of arterial thrombosisexisted with the company’s drug rofecoxib,21 but a meta-analysisnot supported by industry showed an increased risk,22 which wasapparent in publications available to the authors of the industrysupported meta-analysis. Rofecoxib was withdrawn because ofthromboses in 2004.

The influence of industry on trial reports is similar to ourfindings. A survey found that none of 56 trials of non-steroidalanti-inflammatory drugs supported by the manufacturerpresented results that were unfavourable to the company.23

Another survey found that the conclusions recommended theexperimental drug as the drug of choice five times as often if thetrial was funded by for profit organisations, even afteradjustment for the effect size.3

Reviews with undeclared supportThe conclusions of paper based reviews with undeclared supportwere more cautious than those for industry supported reviews.

We contacted the authors after we had assembled our data. Eightdeclared that they had not received any external funding orother type of support (which we exemplified as help with the sta-tistical analyses), and one replied that he had not received anyother financial support.w3 We do not know whether these replieswere comprehensive, and we suspect that some authors hadreceived undeclared support or had allowed the company toreview the paper and insert text, as suggested by the recommen-dation for low molecular weight heparin above.w3

Interpretation of financial supportThe interpretation of financial support is not always straightfor-ward. The authors of a paper based review that we classified as“non-profit support” noted that it was supported in part by pub-lic sources but did not describe the nature of the other part, andtwo of the authors had previously received “unrestricted grants”from the manufacturer of octreotide.w40 The authors of thematching Cochrane review declared that they had no conflicts ofinterest but added that they had “no permanent financialcontracts” with companies producing the comparator, terlipress-in.w39 Finally, one of the Cochrane reviews had industry support,as Upjohn had funded secondary analyses of the author’s owntrial for use in the review.w47 The current policy in the CochraneCollaboration is that industry support of Cochrane reviews is notacceptable.24

Other problems with reviewsLess rigorously controlled studies than ours have reported ondiscrepant conclusions between systematic reviews assessing thesame subject. The major reasons were incomplete searches,differential inclusion of trials, insufficient attention to the qualityof the trials and to bias detection, and differences ininterpretation.25–32 Some reviews missed more than half of theavailable trials,26–28 and a review of meta-analyses of analgesicinterventions found that those with positive conclusions hadlower quality scores on the Oxman and Guyatt index.8 A recentstudy of antihypertensive drugs found that the conclusions ofmeta-analyses were positive in 91% of the papers with financialties and in 72% of other papers.33 However, the conclusions ofCochrane reviews also tend to be too positive.34

Our examination of the comparative z scores revealed morescatter than expected, which indicates that some effect estimatesmight have been biased, and, furthermore, that the confidenceinterval in a meta-analysis generally exaggerates the precision inthe underlying data.

ConclusionsIndustry supported reviews of drugs are less transparent thanCochrane reviews and have few reservations about methodologi-cal limitations of the included trials; their conclusions should beread with caution. We believe that details of concealment of allo-cation, blinding, inclusion and exclusion criteria for trials, searchstrategies, and estimated effects in each included trial need to bereported to allow readers to judge the reliability of reviews. Toimprove transparency, access to the protocol should be available.Protocols for Cochrane reviews are published in the CochraneLibrary, and protocols for other systematic reviews can be regis-tered free of charge at the UK national research register throughthe Centre for Reviews and Dissemination in York, UK.

We thank Stefan Leucht, Deborah J Cook, Deborah Goebert, Michael Lud-wig, Max H Pittler, Chiel Springer, Steven L West, Frederick A Spencer, andPatrick Chien for providing information on support and funding of theirstudies and the Iberoamerican Cochrane Centre, Barcelona for providingstudy facilities.Contributors: AWJ wrote the draft protocol and manuscript, and PCG con-tributed. AWJ and PCG extracted data. JH did the statistical analyses that

Research

page 4 of 5 BMJ Online First bmj.com

Page 17: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

compared estimated treatment effects. All authors commented on the finalmanuscript. PCG is the guarantor.Funding: None.Competing interests: AWJ and PCG are affiliated with the Nordic CochraneCentre. The views expressed in this article represent those of the authorsand are not necessarily the views or the official policy of the Cochrane Col-laboration.

1 Lexchin J, Bero LA, Djulbegovic B, Clark O. Pharmaceutical industry sponsorship andresearch outcome and quality: systematic review. BMJ 2003;326:1167-70.

2 Bekelman JE, Li Y, Gross CP. Scope and impact of financial conflicts of interest in bio-medical research: a systematic review. JAMA 2003;289:454-65.

3 Als-Nielsen B, Chen W, Gluud C, Kjaergard LL. Association of funding andconclusions in randomized drug trials: a reflection of treatment effect or adverseevents? JAMA 2003;290:921-8.

4 The Cochrane Collaboration. The Cochrane Collaboration—principles.www.cochrane.org/docs/tenprinciples.htm (accessed 18 Sept 2006).

5 Jadad AR, Cook DJ, Jones A, Klassen TP, Tugwell P, Moher M, et al. Methodology andreports of systematic reviews and meta-analyses: a comparison of Cochrane reviewswith articles published in paper-based journals. JAMA 1998;280:278-80.

6 Jadad AR, Moher M, Browman GP, Booker L, Sigouin C, Fuentes M, et al. Systematicreviews and meta-analyses on treatment of asthma: critical evaluation. BMJ2000;320:537-40.

7 Oxman AD, Guyatt GH. Guidelines for reading literature reviews. Can Med Assoc J1988;138:697-703.

8 Jadad AR, McQuay HJ. Meta-analyses to evaluate analgesic interventions: a systematicqualitative review of their methodology. J Clin Epidemiol 1996;49:235-43.

9 Oxman AD, Guyatt GH. Validation of an index of the quality of review articles. J ClinEpidemiol 1991;44:1271-8.

10 Berlin JA. Does blinding of readers affect the results of meta-analyses? Lancet1997;350:185-6.

11 Higgins JPT, Green S, eds. Cochrane handbook for systematic reviews of interventions4.2.5 [updated May 2005]. www.cochrane.org/resources/handbook/hbook.htm(accessed 19 Sept 2005).

12 Oxman AD, Guyatt GH, Singer J, Goldsmith CH, Hutchison BG, Milner RA, et al.Agreement among reviewers of review articles. J Clin Epidemiol 1991;44:91-8.

13 Juni P, Altman DG, Egger M. Systematic reviews in health care: assessing the quality ofcontrolled clinical trials. BMJ 2001;323:42-6.

14 Juni P, Egger M. Allocation concealment in clinical trials. JAMA 2002;288:2407-8.15 Moja LP, Telaro E, D’Amico R, Moschetti I, Coe L, Liberati A. Assessment of methodo-

logical quality of primary studies by systematic reviews: results of the metaquality crosssectional study. BMJ 2005;330:1053.

16 Song F, Freemantle N, Sheldon TA, House A, Watson P, Long A, et al. Selective serot-onin reuptake inhibitors: meta-analysis of efficacy and acceptability. BMJ1993;306:683-7.

17 Smith G, Egger M. Meta-analysis: unresolved issues and future developments. BMJ1998;316:221-5.

18 Nakielny J. Effective and acceptable treatment for depression [letter]. BMJ1993;306:1125.

19 Stern JM, Simes JR. Publication bias: evidence of delayed publication in a cohort studyof clinical research projects. BMJ 1997;315:640-5.

20 Hemminki E. Study of information submitted by drug companies to licensing authori-ties. BMJ 1980;280:833-6.

21 Konstam MA, Weir MR, Reicin A, Shapiro D, Sperling RS, Barr E, et al. Cardiovascularthrombotic events in controlled, clinical trials of rofecoxib. Circulation2001;104:2280-8.

22 Jüni P, Nartey L, Reichenbach S, Sterchi R, Dieppe PA, Egger M. Risk of cardiovascularevents and rofecoxib: cumulative meta-analysis. Lancet 2004;364:2021-9.

23 Rochon PA, Gurwitz JH, Simms RW, Fortin PR, Felson DT, Minaker KL, et al. A study ofmanufacturer-supported trials of nonsteroidal anti-inflammatory drugs in thetreatment of arthritis. Arch Intern Med 1994;154:157-63.

24 Cochrane Collaboration. Commercial sponsorship and the Cochrane Collaboration.www.cochrane.org/docs/commercialsponsorship.htm (accessed 16 June 2005).

25 Gøtzsche PC. Steroids and peptic ulcer: an end to the controversy? J Intern Med1994;236:599-601.

26 Katerndahl DA, Lawler WR. Variability in meta-analytic results concerning the value ofcholesterol reduction in coronary heart disease: a meta-meta-analysis. Am J Epidemiol1999;149:429-41.

27 Chalmers TC, Berrier J, Sacks HS, Levin H, Reitman D, Nagalingam R. Meta-analysis ofclinical trials as a scientific discipline. II: replicate variability and comparison of studiesthat agree and disagree. Stat Med 1987;6:733-44.

28 Linde K, Willich SN. How objective are systematic reviews? Differences betweenreviews on complementary medicine. J R Soc Med 2003;96:17-22.

29 Gøtzsche PC, Pødenphant J, Olesen M, Halberg P. Meta-analysis of second-lineantirheumatic drugs: sample size bias and uncertain benefit. J Clin Epidemiol1992;45:587-94.

30 Cook DJ, Reeve BK, Guyatt GH, Heyland DK, Griffith LE, Buckingham L, et al. Stressulcer prophylaxis in critically ill patients: resolving discordant meta-analyses. JAMA1996;275:308-14.

31 Olsen O, Gøtzsche PC. Screening for breast cancer with mammography. CochraneDatabase Syst Rev 2001;(4):CD001877.

32 Hopayian K. The need for caution in interpreting high quality systematic reviews. BMJ2001;323:681-4.

33 Yank V, Rennie D, Bero LA. Are authors’ financial ties with pharmaceutical companiesassociated with positive results or conclusions in meta-analyses on antihypertensivemedications? 5th International Congress on Peer Review and Biomedical Publication,Chicago, 16-18 September 2005:16 (www.jama-peer.org).

34 Olsen O, Middleton P, Ezzo J, Gøtzsche PC, Hadhazy V, Herxheimer A, Kleijnen J,McIntosh H. Quality of Cochrane reviews: assessment of sample from 1998. BMJ2001;323:829-32.

(Accepted 23 August 2006)

doi 10.1136/bmj.38973.444699.0B

Nordic Cochrane Centre, Rigshospitalet, DK-2100 Copenhagen Ø, DenmarkAnders W Jørgensen physicianPeter C Gøtzsche director

Department of Biostatistics, Panum Institute, University of Copenhagen, DenmarkJørgen Hilden associate professorCorrespondence to: P C Gøtzsche [email protected]

What is already known on this topic

Bias commonly occurs in trials of healthcare interventionsand often favours the sponsor’s product

Anecdotal reports have suggested that industry supportedmeta-analyses may also be more flawed than othermeta-analyses

What this study adds

Industry supported reviews were of lesser quality thanCochrane reviews of the same drugs and alwaysrecommended the experimental drug without reservations,which none of the Cochrane reviews did

Industry supported meta-analyses of drugs were lesstransparent and had few reservations about methodologicallimitations of included trials

Reviews with undeclared support and those with not forprofit support or no support had similarly cautiousconclusions to matched Cochrane reviews

Research

BMJ Online First bmj.com page 5 of 5

Page 18: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

18

DISCUSSION OF PAPER 1

In this study, we compared pairs of meta-analyses that consisted of Cochrane reviews

closely matched to meta-analyses of the same drugs in reviews with different kinds of

support published in paper based journals. Overall, we found that Cochrane reviews were

of better methodological quality than the other reviews although some of the Cochrane

reviews had clear methodological deficiencies.

We used an instrument developed by Andy Oxman and Gordon Guyatt to assess the

overall methodological quality of the meta-analyses. Both researchers have also been

involved with authoring the Cochrane Handbook, which is the official guide on how to

conduct a Cochrane review. Therefore, their instrument has been criticised for favouring

Cochrane reviews over other reviews explaining the differences in methodological quality

between the set of reviews. However, to our knowledge it was the only validated tool among

the more than 24 other tools available when we did the study and it was also the most used

(30). Since then, the 11-item assessment tool AMSTAR has been developed and validated

(31,32). AMSTAR addresses similar domains as the Oxman-Guyatt Index and therefore I

do not believe that this tool would have altered our results.

We assessed the methodological quality of meta-analyses by reading the published paper

only and one could therefore argue that we assessed the quality of reporting rather than the

actual methods. This seems to be correct. An observational study found that authors of

randomised controlled trials frequently used concealment of allocation and blinding, despite

the failure to report these methods (33) and a comparison of the protocol and the published

paper of 73 trials showed that the protocols described the blinding methods more precisely

than the papers (34). However, the details reported may still be a good surrogate for the

actual methodology. In an assessment of RCTs in breast cancer, Liberati and colleagues

found that the average quality score was 50% of the maximum. After the principal

investigators of the trials had been interviewed, the quality increased only marginally, to

57% (35). I am not aware of similar studies on meta-analyses and systematic reviews. One

should also consider that when authors are asked what they did, their replies may not be

completely accurate but might tend to be what 'looks good.'

In a letter in BMJ criticising our work, the author argued that the difference in

methodological quality between Cochrane reviews and paper based meta-analysis was

mainly due to space restrictions in journals compared to almost no restrictions in Cochrane

Library (36). Several studies have found that Cochrane reviews are of higher

methodological quality on average than other reviews (37-41), but Cochrane reviews

published in the Cochrane Library do not seem to differ from Cochrane reviews published

Page 19: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

19

also in paper based journals (42,43). Therefore, word limitation is likely to have played a

minor role for our results, and a more reasonable explanation is that The Cochrane

Collaboration since its formation has been leading in setting the methodological standards

for systematic reviews and meta-analyses. Most of the authors of the PRISMA statement on

good reporting of systematic reviews (44), which most scientific journals adhere to, have

been or are involved with the collaboration.

We also compared the conclusions of Cochrane reviews and industry-supported reviews

and found that the latter had uniformly positive recommendations for the experimental drug

despite the results in the meta-analyses being similar. The industry-supported reviews had

no reservations regarding the risk of bias in the included trials or the cost of the drugs in

contrast to none of the Cochrane reviews. Reviews with undeclared support and reviews

with not for profit or no support had conclusions that were similar to the conclusions in

Cochrane reviews.

Our findings were reinforced by a study of a large cohort of meta-analyses on

antihypertensive drugs published before 2004, which found that meta-analyses with

favourable conclusions were associated with financial ties to a drug company and the

association increased when adjusting for methodological quality (45). Another study on

review articles and meta-analyses on passive smoking found that the conclusions were

strongly associated with author affiliation to the tobacco industry (46)

Numerous studies have documented that financial conflicts of interest influence the study

outcome and the evidence has been synthesized in several systematic reviews (15-18). In

industry-supported trials, both the conclusions and the results seem to favour the sponsor’s

product (18) when compared to other trials. Differences in design such as study objectives

and the selection of dosage and type of administration of the study drugs can partly explain

this (47). Compared to other trials, industry-supported trials are more often placebo

controlled (48), probably because they are less expensive to conduct than active controlled

trials and because the medicines regulatory authorities almost only require this type of

research for marketing approval (48). We, as well as another study of meta-analyses (45),

did not find a relation between funding source and results. Different meta-analyses of the

same intervention and disease are likely to include the same studies since most trials in

meta-analyses are published, and despite the challenges in data extraction and analysis,

meta-analyses can often be replicated to a reasonable degree by other researchers (49,50).

Re-analysing trials requires the trial protocol and patient data on file and these are only

rarely publicly available. Therefore, the risk of undetected data manipulation, selective

outcome reporting and publication bias seems to be higher in trials than in meta-analyses,

and this may be a part of the explanation that the results in industry-funded trials, but not in

Page 20: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

20

meta-analyses, are more in favour of the sponsor’s product compared to other sources of

support.

The undetected bias in trials is the Achilles’ heel of meta-analyses (and systematic

reviews). Turner et al. conducted two separate meta-analyses of the same antidepressant

trials (51). One with trial results from FDA reviews and one with results from trial reports

published in medical journals. The authors found that the effect size in the journals was 32%

larger than when all FDA registered trials were included, ranging from 11 to 69% for

individual drugs. In addition, 23 of 74 trials registered by FDA were not published and only

one of these had positive results whereas the rest had negative or questionable results (51).

Therefore, as suggested by Jefferson et al., meta-analyses and systematic reviews that rely

on the published literature only run the risk of being an “unsolicited authoritative advertising

for the drug industry” (52).

Page 21: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

21

PAPER 2

The aim of this study was similar to the first, but with a broadened scope. We compared all

industry-supported meta-analyses of drug-drug comparisons with those without industry

support indexed in PubMed and published in 2004 for differences in methodological quality

and conclusions.

Page 22: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

BioMed Central

Page 1 of 7(page number not for citation purposes)

BMC Medical Research Methodology

Open AccessResearch articleIndustry-supported meta-analyses compared with meta-analyses with non-profit or no support: Differences in methodological quality and conclusionsAnders W Jørgensen*, Katja L Maric, Britta Tendal, Annesofie Faurschou and Peter C Gøtzsche

Address: The Nordic Cochrane Centre, Department 3343, Rigshospitalet, Blegdamsvej 9, DK-2100 Copenhagen, Denmark

Email: Anders W Jørgensen* - [email protected]; Katja L Maric - [email protected]; Britta Tendal - [email protected]; Annesofie Faurschou - [email protected]; Peter C Gøtzsche - [email protected]

* Corresponding author

AbstractBackground: Studies have shown that industry-sponsored meta-analyses of drugs lack scientific rigour and havebiased conclusions. However, these studies have been restricted to certain medical specialities. We compared allindustry-supported meta-analyses of drug-drug comparisons with those without industry support.

Methods: We searched PubMed for all meta-analyses that compared different drugs or classes of drugs publishedin 2004. Two authors assessed the meta-analyses and independently extracted data. We used a validated scale forjudging the methodological quality and a binary scale for judging conclusions. We divided the meta-analysesaccording to the type of support in 3 categories: industry-supported, non-profit support or no support, andundeclared support.

Results: We included 39 meta-analyses. Ten had industry support, 18 non-profit or no support, and 11undeclared support. On a 0–7 scale, the median quality score was 6 for meta-analyses with non-profit or nosupport and 2.5 for the industry-supported meta-analyses (P < 0.01). Compared with industry-supported meta-analyses, more meta-analyses with non-profit or no support avoided bias in the selection of studies (P = 0.01),more often stated the search methods used to find studies (P = 0.02), searched comprehensively (P < 0.01),reported criteria for assessing the validity of the studies (P = 0.02), used appropriate criteria (P = 0.04), describedmethods of allocation concealment (P = 0.05), described methods of blinding (P = 0.05), and described excludedpatients (P = 0.08) and studies (P = 0.15). Forty percent of the industry-supported meta-analyses recommendedthe experimental drug without reservations, compared with 22% of the meta-analyses with non-profit or nosupport (P = 0.57).

In a sensitivity analysis, we contacted the authors of the meta-analyses with undeclared support. Eight who repliedthat they had not received industry funding were added to those with non-profit or no support, and 3 who didnot reply were added to those with industry support. This analysis did not change the results much.

Conclusion: Transparency is essential for readers to make their own judgment about medical interventionsguided by the results of meta-analyses. We found that industry-supported meta-analyses are less transparent thanmeta-analyses with non-profit support or no support.

Published: 9 September 2008

BMC Medical Research Methodology 2008, 8:60 doi:10.1186/1471-2288-8-60

Received: 7 August 2008Accepted: 9 September 2008

This article is available from: http://www.biomedcentral.com/1471-2288/8/60

© 2008 Jørgensen et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Page 23: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

BMC Medical Research Methodology 2008, 8:60 http://www.biomedcentral.com/1471-2288/8/60

Page 2 of 7(page number not for citation purposes)

BackgroundThe primary source of knowledge used by physicians isthe medical literature [1]. The amount of information onhealth care interventions is hardly manageable, however,and the need for systematic reviews and meta-analyses toaggregate the information and give a basis for rationaldecision-making is therefore obvious [2]. Systematicreviews and meta-analyses of randomised controlled trialsare considered to be the highest level of evidence [3].

Bias in industry-sponsored drug trials is common and thesponsor's product is often favoured [4-7]. Concerning sys-tematic reviews and meta-analyses, little information hasbeen available on sponsor bias until recently. We foundthat, although the results were similar, industry-supportedmeta-analyses lacked scientific rigour and were morelikely to recommend the experimental drug, compared toCochrane reviews of the same disease and drugs [8]. Astudy on anti-hypertensive drugs found that meta-analy-ses with financial ties to only one drug company wereassociated with conclusions in favour of that company[9].

For this report, we assessed all meta-analyses indexed onPubMed of drug-drug comparisons published in 2004.We hypothesized that meta-analyses supported by thepharmaceutical industry are of poorer methodologicalquality and have conclusions favouring the experimentaldrug, compared to meta-analyses with non-profit or nosupport.

MethodsWe collected meta-analyses that compared different drugsor classes of drugs, were published in full and in English.A meta-analysis was defined as a review with quantita-tively combined data from at least two studies. Meta-anal-yses that we had authored ourselves and meta-analyses ofplacebo controlled trials were excluded.

Literature searchWe searched PubMed for meta-analyses published in2004 in English. We used a validated search strategy [10]and combined it with "Drug Therapy" [MeSH] OR "drugtherapy" [Subheading] OR "Pharmaceutical Preparations"[MeSH] OR "drugs".

One author (KLM) reviewed the titles and abstracts of allpotentially eligible meta-analyses, and if in doubt, the fulltext of the article was retrieved. Another author (AWJ)assessed all the included and 10% of the excluded meta-analyses for eligibility. Disagreements were resolved bydiscussion.

Data extractionTwo authors independently assessed the included meta-analyses. Data extraction was done unblinded. A thirdauthor resolved disagreements.

We used a pilot tested data sheet and extracted data ondrugs and diseases; types of support; sources used to iden-tify trials for the review; searches for unpublished trials;and descriptions of concealment of allocation, details ofblinding, and excluded patients and trials.

We assessed the methodological quality of the reviewswith the Oxman and Guyatt index, which is a validatedtool with nine items considering the potential for bias andan overall assessment on a 0–7 scale [11-13]. Further-more, we judged the review authors' conclusions byassessing whether the experimental intervention was rec-ommended without reservations, or whether it was notrecommended (or recommended with reservations) [6].

Data analysisWe divided the meta-analyses according to the type ofsupport in 3 categories: industry-supported, non-profitsupport or no support, and undeclared support.

Industry support was defined as authorship, provision ofgrants to authors of the meta-analysis, or other majorassistance such as help with the statistical analysis. We didnot consider provision of references or unpublished trialreports as support.

We compared industry-supported meta-analyses withmeta-analyses with non-profit support or no support withthe Mann-Whitney test for categorical data and withFisher's exact test for binary data; P values are two-sided.

ResultsThe search in PubMed identified 1188 records of meta-analyses. Most were ineligible because they did not com-pare drugs (Figure 1). We included 39 meta-analyses, 10of which were Cochrane reviews. Ten had industry sup-port, 18 non-profit or no support, and 11 undeclared sup-port (Table 1 and additional file 1: References of includedmeta-analyses).

Methodological qualityThe median quality score was 6 for the 18 meta-analyseswith non-profit or no support and 2.5 for the ten industry-supported meta-analyses (P < 0.01) (Table 2). More meta-analyses with non-profit or no support avoided bias in theselection of studies (P = 0.01); only five industry-sup-ported meta-analyses reported specific inclusion criteriaand two only included studies provided by the supportingcompany. Meta-analyses with non-profit or no supportmore often stated the search methods used to find studies

Page 24: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

BMC Medical Research Methodology 2008, 8:60 http://www.biomedcentral.com/1471-2288/8/60

Page 3 of 7(page number not for citation purposes)

(P = 0.02) and searched comprehensively (P < 0.01). Onlyfour (40%) industry-supported meta-analyses searchedfor unpublished studies and five (50%) searched in theCochrane Library and MEDLINE. For meta-analyses withnon-profit or no support the numbers were 16 (89%) and

14 (78%) respectively. Meta-analyses with non-profit orno support more often reported criteria for assessing thevalidity of the studies (P = 0.02), used appropriate criteria(P = 0.04), described methods of allocation concealment(P = 0.05), described methods of blinding (P = 0.05), and

Table 1: Characteristics of included meta-analyses.

Search strategy incl. Detailed description of

Ref # Medline and CLIB

Unpublished studies

Allocation concealment

Blinding Excluded patients

Excluded studies

Recommends the experimental drug without reservations

Quality score

Industry support

1 ✔ ✔ ✔ 4

2 ✔ ✔ 23 ✔ 24 ✔ ✔ ✔ 35 ✔ ✔ ✔ 46 ✔ 27 ✔ 18 ✔ ✔ 29 ✔ ✔ 310 ✔ 6

Non-profit or no

support

11 ✔ 2

12 ✔ ✔ ✔ 313 ✔ ✔ ✔ ✔ 314 ✔ ✔ ✔ ✔ 615 ✔ ✔ 316 ✔ ✔ ✔ 417 ✔ ✔ ✔ ✔ 718 ✔ ✔ 519 ✔ ✔ ✔ ✔ 620 ✔ ✔ ✔ 321 ✔ ✔ ✔ ✔ ✔ 722 ✔ ✔ ✔ ✔ ✔ ✔ 723 ✔ ✔ ✔ ✔ ✔ ✔ 724 ✔ ✔ ✔ ✔ ✔ 625 ✔ ✔ ✔ ✔ ✔ 726 ✔ ✔ ✔ ✔ ✔ ✔ 727 ✔ ✔ ✔ ✔ ✔ ✔ 628 ✔ ✔ ✔ 6

Support not declared

29 ✔ ✔ 5

30 331 232 ✔ ✔ ✔ ✔ ✔ 533 ✔ ✔ 334 ✔ ✔ 335 ✔ 236 137 ✔ 338 ✔ ✔ ✔ ✔ 339 ✔ ✔ 2

CLIB: The Cochrane Library

Page 25: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

BMC Medical Research Methodology 2008, 8:60 http://www.biomedcentral.com/1471-2288/8/60

Page 4 of 7(page number not for citation purposes)

described excluded patients (P = 0.08) and studies (P =0.15).

Authors' conclusionFour industry-supported meta-analyses (40%) recom-mended the experimental drug without reservations, com-pared with four meta-analyses with non-profit or nosupport (22%) (P = 0.57). The supporting companies ofall 4 meta-analyses that were recommending the experi-mental drug without reservation were also selling thedrug, but not the control drug. This only applied to 3 ofthe 6 industry-supported meta-analyses that did not rec-ommend or only recommended the experimental drugwith reservations.

Post hoc sensitivity analysesWe did two sensitivity analyses. First, we excluded the 10Cochrane reviews from the analysis, as these reviews hadhigh quality scores (median of 7); nine from the meta-analyses with non-profit or no support and one fromthose with industry support. This resulted in a medianquality score of 3 and 2 respectively (P = 0.06).

Second, we contacted the authors of the 11 meta-analyseswith undeclared support. Four declared that they had notreceived any external funding or other type of support(which we exemplified as help with the statistical analy-ses), four replied that they had received funding from not-profit organisations and three did not reply. We added theeight who replied to the meta-analyses with non-profit orno support and the three who did not reply to the meta-

Search for meta-analyses and reasons for exclusionFigure 1Search for meta-analyses and reasons for exclusion.

Page 26: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

BMC Medical Research Methodology 2008, 8:60 http://www.biomedcentral.com/1471-2288/8/60

Page 5 of 7(page number not for citation purposes)

analyses with industry support. This sensitivity analysisdid not change the results much, e.g. the median qualityscore was 5 for meta-analyses with non-profit or no sup-port and 2 for industry-supported meta-analyses (P <0.01)

DiscussionWe found that meta-analyses with non-profit or no sup-port are of better methodological quality on average thanthose with industry support. Lack of allocation conceal-ment and blinding, and high attrition rates in randomisedcontrolled trials may bias results of meta-analyses, but ifthe authors fail to describe these details, the reader is notable to judge if the meta-analysis is reliable. Most indus-try-supported meta-analyses failed on these counts; thisagrees with results we have published previously [8].

Cochrane reviewsCochrane reviews seem to have a better methodologicalquality on average than other meta-analyses [8,14-17].

In this study, ten Cochrane reviews contributed with highmethodological quality mainly to meta-analyses withnon-profit or no support, as only one Cochrane reviewwas supported by the industry. The policy in the CochraneCollaboration is that industry support of Cochranereviews is not allowed [18].

Forty percent of the industry-supported meta-analysesand 22 percent of those with non-profit or no support rec-ommended the experimental drug without reservation.This difference was not statistically significant, but wehave previously found a significant difference in a com-parison that was closely matched for drugs and diseases (P< 0.01) [8]. A large survey of 124 meta-analyses of anti-hypertensive drugs found that those with financial ties toonly one drug company were significantly associated withconclusions in favour of that company [9].

Studies of trials have found similar results. A survey foundthat none of 56 trial reports of non-steroidal anti-inflam-matory drugs supported by the manufacturer presentedresults that were unfavourable to the company [19].Another survey found that trials funded by for profitorganizations were more likely to recommend the experi-mental drug as the drug of choice (odds ratio = 5.3) com-pared with trials funded by non-profit organisations [6].And recently a study of randomised trials of head-to-headcomparisons of statins with other drugs found that notonly the conclusions but also the results were in favour ofthe sponsor's product [7].

LimitationsWe were not blinded, as the layout of some papers, e.g.Cochrane reviews, is unique and impossible to blind.

Table 2: Methodological quality of meta-analyses.

Questions Supported by industry Non-profit support or no support P value* Undeclared support(N = 10) (N = 18) (N = 11)

1. Were the search methods used to find evidence on the primary question stated?

6 18 0.02 6

2. Was the search for evidence reasonably comprehensive?

4 17 <0.01 6

3. Were the criteria used for deciding which studies to include reported?

5 17 0.03 10

4. Was bias in the selection of studies avoided?

1 12 0.01 3

5. Were the criteria used for assessing the validity of the included studies reported?

3 15 0.02 5

6. Was the validity of all studies referred to in the text assessed using appropriate criteria?

1 10 0.04 1

7. Were the methods used to combine the findings of the relevant studies reported?

9 17 1.00 10

8. Were the findings of the relevant studies combined appropriately?

7 16 0.46 8

9. Were the conclusions made by the author(s) supported by the data reported?

9 15 1.00 9

Overall quality (1–7) (median score) 2.5 6 <0.01 3

Number of meta-analyses that obtained an affirmative answer to the questions of the Oxman and Guyatt index; and median overall quality score.* Supported by industry vs non-profit or no support.

Page 27: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

BMC Medical Research Methodology 2008, 8:60 http://www.biomedcentral.com/1471-2288/8/60

Page 6 of 7(page number not for citation purposes)

However, blinding when assessing meta-analyses has littleimpact [20].

A substantial part of the meta-analyses with non-profit orno support were Cochrane reviews and these are madeaccording to the Cochrane Handbook [21]. Andy Oxmanparticipated in the development of the validated indexthat was used for evaluating methodological quality andhe has also participated in developing the CochraneHandbook.

Our definition of industry support does not distinguishbetween different amounts of support, and our judge-ment of support is based on details reported in the meta-analyses. This can theoretically lead to misclassification ofthe support, as industry support may range from very littleto generous, and details about some types of support maybe lacking more often than others. However, the defini-tion is operational and we believe that it includes themost important types of industry support. Lack of detailsor transparency in meta-analyses may also have led tomisjudgement of the methodological quality, and it hasbeen argued that the methodological superiority ofCochrane reviews can be explained by the fact that thereare no word limits in the Cochrane Library. However, themethodological quality of Cochrane reviews published inregular journals do not seem to differ from Cochranereviews published in The Cochrane Library [16,17]. Fur-thermore, important methodological details shouldalways be made available in journals with a word limit,either in the article itself, or in material on the journal'swebsite.

ConclusionTransparency is essential for readers to make their ownjudgment about medical interventions guided by theresults of meta-analyses. We found that industry-sup-ported meta-analyses are less transparent than meta-anal-yses with non-profit support or no support.

Competing interestsAll authors are affiliated with the Nordic Cochrane Cen-tre. We did not receive any funding or support for thisstudy.

Authors' contributionsAWJ drafted the manuscript and PCG helped. AWJ andKLM participated in writing the protocol, the literaturesearch, data extraction and statistical analysis. BT, AF andPCG participated in data extraction. All authors reviewedand approved the final manuscript.

Additional material

References1. Stinson ER, Mueller DA: Survey of health professionals' infor-

mation habits and needs. JAMA 1980, 243:140-3.2. Cook DJ, Mulrow CD, Haynes RB: Systematic Reviews: Synthe-

sis of Best Evidence for Clinical Decisions. Ann Intern Med 1997,126:376-80.

3. Harbour R, Miller J: A new system for grading recommenda-tions in evidence based guidelines. BMJ 2001, 323:334-6.

4. Lexchin J, Bero LA, Djulbegovic B, Clark O: Pharmaceutical indus-try sponsored and research outcome and quality: systematicreview. BMJ 2003, 326:1167-70.

5. Bekelman JE, Li Y, Gross CP: Scope and impact of financial con-flicts of interest in biomedical research: a systematic review.JAMA 2003, 289:454-65.

6. Als-Nielsen B, Chen W, Gluud C, Kjaergard LL: Association offunding and conclusions in randomized drug trials. JAMA2003, 290:921-8.

7. Bero L, Oostvogel F, Bacchetti P, Lee K: Factors associated withfindings of published trials of drug-drug comparisons: whysome statins appear more efficacious than others. PLoS Med2007, 4:e184.

8. Jørgensen AW, Hilden J, Gøtzsche PG: Cochrane reviews com-pared with industry supported meta-analysis and othermeta-analysis of the same drugs: systematic review. BMJ2006, 333:782-5.

9. Yank V, Rennie D, Bero LA: Financial ties and concordancebetween results and conclusions in meta-analyses: retro-spective cohort study. BMJ 2007, 335:1202-5.

10. Montori VM, Wilczynski NL, Morgan D, Haynes RB: Optimalsearch strategies for retrieving systematic reviews fromMedline: analytical survey. BMJ 2005, 330:68-71.

11. Oxman AD, Guyatt GH: Guidelines for reading literaturereviews. CMAJ 1988, 138(8):697-703.

12. Jadad AR, McQuay HJ: Meta-analyses to evaluate analgesicinterventions: a systematic qualitative review of their meth-odology. J Clin Epidemiol 1996, 49:235-43.

13. Oxman AD, Guyatt GH: Validation of an index of the quality ofreview articles. J Clin Epidemiol 1991, 44:1271-8.

14. Jadad AR, Cook DJ, Jones A, Klassen TP, Tugwell P, Moher M, MoherD: Methodology and reports of systematic reviews and meta-analyses: a comparison of Cochrane reviews with articlespublished in paper-based journals. JAMA 1998, 280:278-80.

15. Jadad AR, Moher M, Browman GP, Booker L, Sigouin C, Fuentes M,Stevens R: Systematic reviews and meta-analyses on treat-ment of asthma: critical evaluation. BMJ 2000, 320:537-40.

16. Delaney A, Bagshaw SM, Ferland A, Laupland K, Manns B, Doig C:The quality of reports of critical care meta-analyses in theCochrane Database of Systematic Reviews: an independentappraisal. Crit Care Med 2007, 35:589-94.

17. Collier A, Heilig L, Schilling L, Williams H, Dellavalle RP: CochraneSkin Group systematic reviews are more methodologicallyrigorous than other systematic reviews in dermatology. Br JDermatol 2006, 155:1230-5.

18. Cochrane Collaboration: Commercial sponsorship and theCochrane Collaboration. (accessed 14 March 2008) [http://www.cochrane.org/docs/commercialsponsorship.htm].

19. Rochon PA, Gurwitz JH, Simms RW, Fortin PR, Felson DT, MinakerKL, et al.: A study of manufacturer-supported trials of nons-teroidal anti-inflammatory drugs in the treatment of arthri-tis. Arch Intern Med 1994, 154:157-63.

20. Berlin JA: Does blinding of readers affect the results of meta-analyses? Lancet 1997, 350:185-6.

Additional file 1References of included meta-analysesClick here for file[http://www.biomedcentral.com/content/supplementary/1471-2288-8-60-S1.pdf]

Page 28: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

Publish with BioMed Central and every scientist can read your work free of charge

"BioMed Central will be the most significant development for disseminating the results of biomedical research in our lifetime."

Sir Paul Nurse, Cancer Research UK

Your research papers will be:

available free of charge to the entire biomedical community

peer reviewed and published immediately upon acceptance

cited in PubMed and archived on PubMed Central

yours — you keep the copyright

Submit your manuscript here:http://www.biomedcentral.com/info/publishing_adv.asp

BioMedcentral

BMC Medical Research Methodology 2008, 8:60 http://www.biomedcentral.com/1471-2288/8/60

Page 7 of 7(page number not for citation purposes)

21. Higgins JPT, Green S, editors: Cochrane Handbook for SystematicReviews of Interventions Version 5.0.0 [updated February 2008]. TheCochrane Collaboration (accessed 10 March 2008) 2008 [http://www.cochrane.org/resources/handbook].

Pre-publication historyThe pre-publication history for this paper can be accessedhere:

http://www.biomedcentral.com/1471-2288/8/60/prepub

Page 29: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

29

DISCUSSION OF PAPER 2

Compared to the first project, the second was similar, but with a broadened scope. We

assessed a cohort of meta-analyses from 2004, but apart from that we used the same

methods and compared industry-supported meta-analyses with meta-analyses with non-

profit or no support for differences in conclusions and methodological quality. Overall, the

meta-analyses with non-profit or no support were of better methodological quality than

industry-supported meta-analyses. The major contributor to a high median quality score

was Cochrane reviews and when they were excluded, the differences were smaller but still

apparent (P=0.06). Therefore, one may argue that the difference in methodological quality

was more related to the high quality of Cochrane reviews than to source of support.

It is true that Cochrane reviews had a great impact on our results, but it seems that the

relation between methodological quality and source of support was not only confounded by

Cochrane reviews. From a publication by Yank et al. (45) in which the authors assessed

meta-analyses of anti-hypertension drugs, it was possible for me to compare those with

financial ties to one or more drug companies (n=72) with other meta-analyses (n=52). Only

one of these meta-analyses was a Cochrane review. The authors also used the quality

instrument by Oxman and Guyatt to assess methodological quality, but in a modified form,

which meant that the meta-analyses with the lowest quality scored 0 and the ones with the

highest quality scored 18. The median quality score of meta-analyses with financial ties to

one or more drug companies was 4.5 and it was 9 for other meta-analyses (P=0.03, Mann-

Whitney U test). The median quality score for meta-analyses with ties to only one company

was 3.

Contrary to meta-analyses, industry-supported trials do not have poorer methodological

quality than other trials and there is some evidence that they are of better quality (16). An

explanation for this can be that some of the development of trial methodology has been

done within the drug industry and that the regulatory authorities primarily require submission

of high quality trials and only under special circumstances also meta-analyses (53). In

contrast to this the development of meta-analyses and systematic reviews has been outside

the medical industry. Over the recent two decades the Cochrane Collaboration, which is

independent of the medical industry, has been the major contributor to the development of

systematic reviews and meta-analyses.

It has been claimed that meta-analyses conducted by the industry are mainly for internal

decision-making or regulatory purposes and from a drug company’s point of view - when

conducting a meta-analysis of their own drugs - systematic searches of the literature is not

important, as the company has access to all trials of that drug (54). This requires that the

Page 30: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

30

company have a complete database of all trials and have access to the trials conducted by

independent researchers, but this is not always the case.

When the authors of a Cochrane review on neuraminidase inhibitors for influenza searched

for trials they identified a large oseltamivir trial by Roche Shanghai that Roche Basel

claimed it did not know about. It was omitted from Roche’s list of 101 sponsored and

supported trials despite the existence of an English language study report (52).

The ICH9 (International Conference on Harmonisation of technical requirements for

registration of pharmaceuticals for human use) guideline only mentions the possibilities of

including a meta-analysis in the application for marketing approval without discussing the

methodology of meta-analyses and systematic reviews. Therefore it seems that the medical

industry has failed to acknowledge the important methodological aspects that are required

when conducting meta-analyses enabling readers to make their own judgement about the

results and conclusions. This is crucial, especially considering that industry meta-analyses

seem to have more favourable conclusions than other meta-analyses.

Page 31: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

31

PAPER 3

In the third study, we quantified how often benefits and harms in terms of gastrointestinal

bleeding and cardiovascular thrombosis were reported in abstracts on rofecoxib, how often

the drug was described favourably, and how this pattern changed over time.

Page 32: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

SPECIAL ARTICLE

Unbalanced reporting of benefits and harms in abstractson rofecoxib

Anders W. Jørgensen & Karsten Juhl Jørgensen &

Peter C. Gøtzsche

Received: 19 August 2009 /Accepted: 19 January 2010 /Published online: 17 February 2010# Springer-Verlag 2010

AbstractPurpose It was predicted from the mechanism of actionthat, compared to older non-steroidal anti-inflammatorydrugs, rofecoxib (Vioxx) would reduce gastrointestinalbleeding, but also that it would increase the occurrence ofcardiovascular thrombosis. From the patient’s point of view,both effects are important and should be investigated andreported similarly. We studied how they have been reportedover time.Methods We searched PubMed for abstracts on rofecoxibthat commented on gastrointestinal bleeding or cardiovas-cular thrombosis or both. Two researchers, blinded to dateof publication and authors, assessed the abstracts indepen-dently. We judged the authors' view on rofecoxib andcomments on gastrointestinal bleeding and thrombosis asbeing favourable, neutral or unfavourable towards rofe-coxib.Results We included 393 abstracts commenting ongastrointestinal bleeding (72%) and cardiovascularthrombosis (54%) or both. Before October 2000, allabstracts (n=27) mentioned only gastrointestinal bleedingand 89% were positive towards rofecoxib. The year beforethe withdrawal of rofecoxib (October 2003 to September2004) (n=46), 59% of abstracts commented on gastroin-testinal bleeding only, 17% on thrombosis only, 24% onboth and 67% were still positive. From October 2006 toSeptember 2007 (n=54), 13% mentioned gastrointestinal

bleeding, 54% thrombosis, 33% mentioned both and only11% were positive.Conclusions The reporting of benefits and harms was notbalanced and changed markedly over time. Knowledge ofincreased risk of thrombosis existed early on, but the harmscame into focus too late, when the drug was alreadywithdrawn, and when tens of thousands of patients hadbeen harmed unnecessarily.

Keywords Rofecoxib . Cyclooxygenase-2 inhibitors .

Abstracts . Bias . Thrombosis . Gastrointestinal bleeding

Introduction

Rofecoxib was marketed in 1999 as first-line treatment ofosteoarthritis with the claim that it reduced pain as effectivelyas conventional non-steroidal anti-inflammatory drugs(NSAIDs), but with less gastrointestinal bleeding. However,the drug also caused serious thrombotic cardiovascular events[1, 2]. On 30 September 2004, Merck, the manufacturer,withdrew it from the market. Rofecoxib has been estimatedto have caused the death of tens of thousands of patientsbecause of thromboses [3].

The part of a paper that is most often read is the abstract andsometimes clinical decisions are based solely on abstracts[4, 5]. The recently published CONSORT guideline forabstracts states that any important adverse (or unexpected)effects of an intervention should be described in the abstract[5]. We quantified how often benefits and harms in terms ofgastrointestinal bleeding and cardiovascular thrombosis,respectively, were reported in abstracts on rofecoxib, howoften the drug was described favourably, and how thispattern changed over time.

A. W. Jørgensen (*) :K. J. Jørgensen : P. C. GøtzscheThe Nordic Cochrane Centre,Department 3343, Rigshospitalet, Blegdamsvej 9,2100 Copenhagen, Denmarke-mail: [email protected]

Eur J Clin Pharmacol (2010) 66:341–347DOI 10.1007/s00228-010-0791-8

Page 33: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

Methods

Search strategy and data extraction

One author searched PubMed (24 September 2007) usingthe search terms “Vioxx OR rofecoxib”. All records with anabstract were assessed for inclusion by two observersindependently. At the same time, they extracted the datausing a pilot-tested data sheet. Any disagreements weresettled by discussion. The observers were blinded to anyinformation about authors and institutions and to the date ofpublication, and assessed only the title and the text of theabstract. The blinding was obtained by exporting therelevant parts of the PubMed records into Microsoft Excel.

Inclusion and exclusion criteria

We included abstracts that commented on the effect ofrofecoxib on gastrointestinal bleeding or cardiovascularthrombosis or both, and contained a comment reflecting theauthors’ view on rofecoxib. We accepted abstracts thatimplicitly referred to these harms by using phrases such as“gastrointestinal adverse effect” and “cardiovascular risk”.We also included abstracts that commented on the effect ofcyclooxygenase 2 (COX-2) inhibitors and other NSAIDs ingeneral when these included rofecoxib, albeit indirectly,e.g. “celecoxib had no tangible advantage in terms ofserious gastrointestinal complications … overall mortalitywas higher with celecoxib than in the placebo group. Thedifference was similar to that observed in placebo-controlled trials of rofecoxib in Alzheimer’s disease.”

We excluded abstracts that only commented on harmsthat did not involve gastrointestinal bleeding or cardiovas-cular thrombosis, such as nausea or hypertension. We alsoexcluded abstracts that did not have a comment reflectingthe authors view on rofecoxib, e.g. “The manufacturerclaims that in clinical studies rofecoxib inhibits COX-2 butnot COX-1, has the power of high-dose NSAIDs—diclofenac and ibuprofen—and superior GI [gastrointesti-nal] safety profile compared to conventional NSAIDs.”Abstracts of in vitro studies, animal studies, medicaldevices and pharmacokinetics were also excluded.

Evaluation of the authors’ view on rofecoxib

We categorised the authors’ view on rofecoxib as eitherfavourable, neutral or unfavourable. If an active comparatorwas not used as a reference, we accepted placebo orstatements that did not involve a comparator. The judge-ment was preferentially made using the conclusion of theabstract. If this was not possible, we used statements in theresults section or elsewhere, e.g. “Because of its morefavorable gastrointestinal toxicity profile compared with

non-selective NSAIDs, rofecoxib is safer in patients …”.We also judged the individual comments on gastrointestinalbleeding and cardiovascular thrombosis in the same way.

Analysis

We used graphs and descriptive statistics to assess how thereporting in abstracts changed over time. Because rofecoxibwas withdrawn on 30 September 2004, our 1-year intervalsare from October to September, e.g. year 2000 was definedas October 1999 to September 2000, both months included.For abstracts that only contained information about the yearof publication, we used the date the citation was added tothe PubMed database [EDAT].

Results

Our PubMed search identified 2,047 records and weincluded 393 abstracts. Most records were excludedbecause there was no abstract in PubMed or because theabstract did not contain a comment on gastrointestinalbleeding or cardiovascular thrombosis (Fig. 1). Twenty-nine of the excluded abstracts mentioned hypertension, butdid not comment on cardiovascular thrombosis.

Reporting of harms over time

During the whole observation period, 181 of the includedabstracts (46%) commented on gastrointestinal bleedingonly, 110 (28%) on cardiovascular thrombosis only and 102(26%) commented on both. Of the 283 (181+102) abstractscommenting on gastrointestinal bleeding, 141 (50%) usedthe explicit terms “ulcer”, “gastrointestinal bleeding” or“perforation”, or “serious gastrointestinal adverse effect”.The remaining 142 abstracts (50%) used the less explicitterms “gastrointestinal risk”, “gastrointestinal safety”, or“gastrointestinal adverse effect”. Of the 212 (110+102)abstracts commenting on cardiovascular thrombosis, 137(65%) used the explicit terms “thrombosis”, “thromboem-bolic” or “thrombotic effect”, “myocardial infarction” or“stroke”. The remaining 75 abstracts (35%) used the lessexplicit terms “cardiovascular risk”, “cardiovascular safety”or “cardiovascular adverse effect”.

Until and including September 2000, no abstractscommented on cardiovascular thrombosis (the first waspublished in November 2000), i.e. 100% (n=27) of theabstracts commented only on gastrointestinal bleeding(Fig. 2). The percentage of abstracts that only commentedon gastrointestinal bleeding decreased to 59% in 2004 (n=46), before rofecoxib was withdrawn, and to 13% in 2007(n=54). The percentage of abstracts that commented onlyon cardiovascular thrombosis increased from 0 to 17% in

342 Eur J Clin Pharmacol (2010) 66:341–347

Page 34: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

2004 and to 54% in 2007. The percentage commenting onboth gastrointestinal bleeding and cardiovascular thrombo-sis increased from 0 to 24% in 2004, and to 33% in 2007.The greatest change in reporting was seen immediately afterthe withdrawal of rofecoxib in 2004 (Fig. 2).

Authors’ general view on rofecoxib over time

Until and including September 2000 (n=27), the proportionof abstracts favouring rofecoxib was 89%. In 2004 (n=46),it was 67%. The greatest change was seen after thewithdrawal of rofecoxib and in 2007 (n=54) where only

11% of the abstracts were positive (Fig. 3). We investigatedthe robustness of this result by including only thoseabstracts where our judgement was based on the conclusionsection of the abstracts. This graph had a similar slope(dotted line in Fig. 3).

Authors’ views on harms before and after the withdrawalof rofecoxib

Two hundred and eleven abstracts (54%) were publishedbefore the withdrawal of rofecoxib in 2004 and 182 (46%)were published after.

Records from PubMed (n=2047)

Excluded (n=677)No abstract available (n=585)

Excluded (n=977)Nothing on b;eeding or thrombosis in text (n=525)

Vioxx or rofecoxib not mentioned in abstract or title (n=92)

Animal study (n=282)

Other laboratory studies (n=59)In vitro study (n=103)

Autnors' opinion not clear (n=8)

Included abstracts (n=393)

Sortingelectronically

Readingabstracts

Fig. 1 Search for relevantabstracts

0%

20%

40%

60%

80%

100%

1998 1999 2000 2001 2002 2003 2004 2005 2006 2007

0

20

40

60

80

100

Gastrointestinal (GI) bleeding only (%)Cardiovascular (CV) thrombosis only (%)Both GI bleeding and CV thrombosis (%)Abstracts published per year (total number)Withdrawal of rofecoxib

NumbersPercentFig. 2 Abstracts published peryear. Total number of abstractspublished per year and percent-age commenting on gastrointes-tinal bleeding, cardiovascularthrombosis or both. The threeproportions add up to 100%

Eur J Clin Pharmacol (2010) 66:341–347 343

Page 35: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

Before the withdrawal, 193 (91%) abstracts commentedon gastrointestinal bleeding, and 168 (87%) of them werefavourable towards rofecoxib, 19 (10%) were neutral and 6(3%) unfavourable. Fifty-six (27%) abstracts commentedon cardiovascular thrombosis, and 5 (9%) of those werefavourable towards rofecoxib, 31 (55%) neutral and 20(36%) unfavourable.

After the withdrawal, the effect on gastrointestinalbleeding was mentioned in 89 (49%) abstracts, and 67(75%) of those were favourable towards rofecoxib, 14(16%) neutral and 8 (9%) unfavourable. The thromboticeffect was mentioned in 156 (86%) abstracts and, none ofthem were favourable towards rofecoxib, 26 (17%) neutraland 130 (83%) unfavourable.

Discussion

We found that most abstracts on rofecoxib reported only onthe beneficial effect regarding less gastrointestinal bleeding,and that they were generally in favour of rofecoxib, fromthe introduction of the drug in 1999 to its withdrawal in2004. After the withdrawal, most abstracts reported on theharmful effects, cardiovascular thrombosis, and few were infavour of rofecoxib.

Such findings might be expected for drugs withimportant but rare harms that are unknown when the drugs

are introduced on the market and only discovered later.However, this is not the only explanation for our findings.It has been documented that the company suppressedcardiovascular harms in the scientific literature [6] andintimidated researchers and speakers who were critical ofrofecoxib [6–8].

Before its introduction, it was predicted from themechanism of action that the drug should reduce theincidence of gastrointestinal bleeding [9] but also increasethe incidence of thrombosis, compared with non-selectiveNSAIDs [10–12]. Two trials conducted by Merck, 090 [3,13, 14] and VIGOR [15], both showed that rofecoxibincreased the risk of cardiovascular events significantly.However, the first trial, which ended in 1999, was notpublished in a scientific journal until 2006 [14]. The secondtrial was published in the New England Journal ofMedicine, but the increased risk of myocardial infarctionwas interpreted as a beneficial aspirin-like prophylacticeffect of the control NSAID [15]. This interpretation wasspeculative and was later refuted. Furthermore, three casesof myocardial infarction in the rofecoxib arm had beenomitted from the paper [16].

In 2001, it was documented in a systematic review thatCOX-2 inhibitors increased the risk of cardiovascularevents [17], and a cumulative meta-analysis of trials from2004 showed that a clear relationship between rofecoxiband increased risk of myocardial infarction existed already

0%

20%

40%

60%

80%

100%

1998 1999 2000 2001 2002 2003 2004 2005 2006 2007

All favourableGastrointestinal (GI) bleeding onlyCardiovascular (CV) thrombosis onlyBoth GI bleeding and CV thrombosisWithdrawal of rofecoxibSensitivity analysis of abstracts with a conclusion section

Fig. 3 Abstracts favouringrofecoxib. Percentage ofabstracts favouring rofecoxibamong those commenting ongastrointestinal bleeding, car-diovascular thrombosis or both

344 Eur J Clin Pharmacol (2010) 66:341–347

Page 36: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

by the end of 2000 [18]. Two other meta-analyses did notfind evidence of an increase in cardiovascular risk withrofecoxib, but they were conducted by employees of Merck[19, 20].

Over the studied time period, there has been an increasedfocus on harms [21–23] and the quality of reporting trialresults in abstracts [5], which may have had an impact onour results. However, our sample of abstracts did notexclusively consist of trial abstracts, and the increasedattention to harms does not explain the dramatic change infocus from beneficial to harmful effect when rofecoxib waswithdrawn.

The safety data from trials on rofecoxib were far toopositive compared to a real-world setting. None of the trialsin the application for marketing approval were designed toevaluate the cardiovascular risk [6]. In fact, they includedpatients that had an unusually low cardiovascular risk.Medicare patients in Tennessee, who were treated withrofecoxib in clinical practice, had a baseline risk of gettinga myocardial infarction that was eight times higher than thatfor the patients in the trials [18]. Patients at high risk ofdeveloping peptic ulcers were also often left out of trials on

rofecoxib. In 2002, an analysis of cardiovascular adverseevents was added to the protocols of three studies [23]including the one [2] that led to the withdrawal of rofecoxib[1]. This was considered breaking news and is likely tohave initiated the change in focus from beneficial effects toharms.

Publishing and disseminating scientific papers on med-ical interventions is an important marketing strategy for thepharmaceutical industry [24], and Merck’s active role in thewriting of journal articles is likely to have influenced howrofecoxib was portrayed and perceived by the clinicians. Itis difficult to explore Merck’s role in more detail in relationto our results. Merck’s information control could have beenclarified by looking at reporting in relation to type offinancial support. We did not attempt to do this, as ghostauthorship and other forms of support from drug companiesare often not revealed in scientific papers [25], and Merckused guest and ghost authors for many of the papers onrofecoxib [26]. Merck also conducted a seeding trial [27],the ADVANTAGE trial, published in Annals of InternalMedicine [28], and sponsored the Australasian Journal ofBone and Joint Medicine, which looked like a peer-reviewed medical journal but was only a marketing tool.Most of the articles in the journal presented data favourableto Merck products, including rofecoxib, without disclosingsponsorship [29].

A strategy to increase drug sales that has been used byMerck [30] and many other drug companies is to stimulateoff-label use [24, 31]. This may also be the case forrofecoxib [32] and could explain why many abstractsmentioned or evaluated the effect of rofecoxib in relationto other conditions than arthritis (Fig. 4). After havingassessed one-third of the abstracts (n=1,370), one observerdecided to register the conditions (apart from arthritis) thatrofecoxib was proposed for. These were mainly neurolog-ical disorders, cancer and pain related to minor surgery. TheU.S. Food and Drug Administration approved rofecoxib forosteoarthritis, acute pain, primary dysmenorrhea and rheu-matoid arthritis.

We searched for abstracts in PubMed only, as PubMed isthe most widely used database for medical research. It islikely that more abstracts would have been included if wehad searched additional databases, but we would not expectit to have led to any important changes in our results.

It has been suggested that increases in blood pressurerelated to rofecoxib are a mechanism for the increase in therisk of cardiovascular events [33]. We excluded 26 abstractsfor the reason that they only commented on hypertension,but they would not have changed the results much as theywere scattered over the years 2001 to 2007.

We believe that if the reporting of benefits and harms inabstracts is unbalanced, doctors will get a false perceptionof the drug’s value. In particular, readers need information

Neurological disorders Hemicrania continua Schizophrenia Sclerosis Alzheimers dementia Migraine Premenstrual migraine Surgery Prevention of urethral strictures after TURP Pre-medication for tonsilectomy Pre-medication for uterine curettage Hernia operations Post CABG Pre-medication for ear-nose-throat surgery in general Minor dental surgery (e.g. removal of molars) Minor orthopaedic surgery Cancer Treatment for glioblastoma multiforme Protection against colorectal neoplasia in familiar polyposis Treatment of malignant melanoma and sarcomas Treatment of prostate cancer Treatment of bone cancer Treatment of breast cancer Treatment of lung cancer Other Reduction of atherosclerosis among ACS-patients post-infarctionCongenial nephrogenous diabetes insipidus Menstrual pain Endometriosis Non-bacterial prostatitis Haemophilic arthropathy Premenstrual acne Prevention of ectopic ossification in arthroplasty

Fig. 4 Conditions for which the effect of rofecoxib was mentioned

Eur J Clin Pharmacol (2010) 66:341–347 345

Page 37: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

on deaths, and on harms that can be lethal, such as thrombosescauses by COX-2 inhibitors. In the drug literature, there isplenty of evidence of flawed research [34–43], ghost-writtenarticles [24, 25, 44–47], intimidation of researchers [44, 45,48–56] and misleading and false statements in researchpapers and marketing [24, 44–47, 57–65].

We suggest that studies like ours should be done onother drugs than rofecoxib, preferably with newly marketeddrugs associated with high expectations.

Conclusions

The basic principle of balanced reporting of benefits andharms seems to have been seriously distorted in abstracts onrofecoxib, although the harms were equally predictable as thebenefits from the mechanism of action of the drug. Before thewithdrawal of rofecoxib, abstracts mostly reported ongastrointestinal bleeding and were in favour of rofecoxib.The harms came in focus too late, when the drug had alreadybeen withdrawn, and when tens of thousands of patients hadbeen harmed unnecessarily [3, 66, 67].

Funding None.

References

1. Merck News Release (2004) Merck announces voluntary world-wide withdrawal of VIOXX. http://www.merck.com/newsroom/vioxx/pdf/vioxx_press_release_final.pdf. Accessed 23 June 2009

2. Bresalier RS, Sandler RS, Quan H, Bolognese JA, Oxenius B,Horgan K, Lines C, Riddell R, Morton D, Lanas A, Konstam MA,Baron JA, Adenomatous Polyp Prevention on Vioxx (APPROVe)Trial Investigators (2005) Cardiovascular events associated withrofecoxib in a colorectal adenoma chemoprevention trial. N Engl JMed 352:1092–1102

3. U.S. Senate Finance Committee (2004) Testimony of David J.Graham, MD, MPH, November 18 http://finance.senate.gov/hearings/testimony/2004test/111804dgtest.pdf. Accessed 23 June2009

4. Barry HC, Ebell MH, Shaughnessy AF, Slawson DC, Nietzke F(2001) Family physicians’ use of medical abstracts to guidedecision making: style or substance? J Am Board Fam Pract14:437–442

5. Hopewell S, Clarke M, Moher D, Wager E, Middleton P, AltmanDG, Schulz KF (2008) CONSORT for reporting randomised trialsin journal and conference abstracts. Lancet 371:281–283

6. Krumholz HM, Ross JS, Presler AH, Egilman DS (2007) Whathave we learned from Vioxx? BMJ 334:120–123

7. Fries JF (2001) Letter to Raymond Gilmartin re: physicianintimidation. 9 Jan. Merck. Bates No MRK-ABH0002204 toMRK-ABH0002207. http://www.vioxxdocuments.com/Documents/Krumholz_Vioxx/Fries2001.pdf. Accessed 23 June 2009

8. Rout M (2009) Vioxx maker Merck and Co drew up doctor hit list.The Australian. http://www.theaustralian.news.com.au/business/story/0,,25272600-5018983,00.html. Accessed 23 June 2009

9. Knijff-Dutmer EAJ, Martens A, vander Laar MAFJ (1999) Effectsof nabumetone compared with naproxen on platelet aggregation inpatients with rheumatoid arthritis. Ann Rheum Dis 58:257–259

10. Schonbeck U, Sukhova GK, Graber P, Coulter S, Libby P (1999)Augmented expression of cyclooxygenase-2 in human atheroscle-rotic lesions. Am J Pathol 155:1281–1291

11. Catella-Lawson F, McAdam B, Morrison BW, Kapoor S, KujubuD, Antes L, Lasseter KC, Quan H, Gertz BJ, FitzGerald GA(1999) Effects of specific inhibition of cyclooxygenase-2 onsodium balance, hemodynamics, and vasoactive eicosanoids. JPharmacol Exp Ther 289:735–741

12. McAdam BF, Catella-Lawson F, Mardini IA, Kapoor S, Lawson JA,FitzGerald GA (1999) Systemic biosynthesis of prostacyclin bycyclooxygenase (COX)-2: the human pharmacology of a selectiveinhibitor of COX-2. Proc Natl Acad Sci USA 96:272–277

13. US Food and Drug Administration (2001) Memorandum. http://www.fda.gov/ohrms/dockets/ac/01/briefing/3677b2_06_cardio.pdf. Accessed 23 June 2009

14. Weaver AL, Messner RP, Storms WW, Polis AB, Najarian DK,Petruschke RA, Geba GP, Tershakovec AM (2006) Treatment ofpatients with osteoarthritis with rofecoxib compared with nabu-metone. J Clin Rheumatol 12:17–25

15. Bombardier C, Laine L, Reicin A, Shapiro D, Burgos-Vargas R,Davis B, Day R, Ferraz MB, Hawkey CJ, Hochberg MC, KvienTK, Schnitzer TJ, VIGOR Study Group (2000) Comparison ofupper gastrointestinal toxicity of rofecoxib and naproxen inpatients with rheumatoid arthritis. N Engl J Med 343:1520–1528

16. Curfman GD ,Morrissey S, Drazen JM (2005) Expression of concern:Bombardier et al, “Comparison of upper gastrointestinal toxicity ofrofecoxib and naproxen in patients with rheumatoid arthritis”N Engl JMed 2000;343:1520–1528. N Engl J Med 353:2813–14

17. Mukherjee D, Nissen SE, Topol EJ (2001) Risk of cardiovascularevents associated with selective COX-2 inhibitors. JAMA286:954–959

18. Jüni P, Nartey L, Reichenbach S, Sterchi R, Dieppe PA, Egger M(2004) Risk of cardiovascular events and rofecoxib: cumulativemeta-analysis. Lancet 364:2021–2029

19. Konstam MA, Weir MR, Reicin A, Shapiro D, Sperling RS, BarrE, Gertz BJ (2001) Cardiovascular thrombotic events in con-trolled, clinical trials of rofecoxib. Circulation 104:2280–2288

20. Reicin AS, Shapiro D, Sperling RS, Barr E, Yu Q (2002)Comparison of cardiovascular thrombotic events in patients withosteoarthritis treated with rofecoxib versus nonselective nonste-roidal anti-inflammatory drugs (ibuprofen, diclofenac, and nabu-metone). Am J Cardiol 89:204–209

21. Ioannidis JPA, Evans SJW, Gøtzsche PC, O’Neill RT, Altman DG,Schulz K, Moher D, for the CONSORT Group (2004) Betterreporting of harms in randomized trials: an extension of theCONSORT statement. Ann Intern Med 141:781–788

22. Bernal-Delgado E, Fisher ES (2008) Abstracts in high profilejournals often fail to report harm. BMC Med Res Methodol 8:14

23. Kim PS, Reicin AS (2004) Rofecoxib, Merck, and the FDA. NEngl J Med 351:2875–2878

24. Steinman MA, Bero LA, Chren MM, Landefeld CS (2006)Narrative review: the promotion of gabapentin: an analysis ofinternal industry documents. Ann Intern Med 145:284–293

25. Gøtzsche PC, Hróbjartsson A, Johansen HK, Haahr MT, AltmanDG, Chan AW (2007) Ghost authorship in industry-initiatedrandomised trials. PLoS Med 4:e19

26. Ross JS, Hill KP, Egilman DS, Krumholz HM (2008) Guestauthorship and ghostwriting in publications related to rofecoxib: acase study of industry documents from rofecoxib litigation. JAMA299:1800–1812

27. Hill KP, Ross JS, Egilman DS, Krumholz HM (2008) TheADVANTAGE seeding trial: a review of internal documents.Ann Intern Med 149:251–258

346 Eur J Clin Pharmacol (2010) 66:341–347

Page 38: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

28. Lisse JR, Perlman M, Johansson G, Shoemaker JR, Schechtman J,Skalky CS, Dixon ME, Polis AB, Mollen AJ, Geba GP,ADVANTAGE Study Group (2003) Gastrointestinal tolerabilityand effectiveness of rofecoxib versus naproxen in the treatment ofosteoarthritis: a randomized, controlled trial. Ann Intern Med139:539–546

29. Grant B (2009) Merck published fake journal. The Scientist http://www.the-scientist.com/blog/display/55671. Accessed 23 June2009

30. U.S. Food and Drug Administration (1998) Warning letter fromFDA’s division of drug marketing and communications to Merckregarding aggrastat. http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/EnforcementActivitiesbyFDA/WarningLettersandNoticeofViolationLetterstoPharmaceuticalCompanies/UCM167857.pdf. Accessed 07 January 2010

31. Ratner M (2009) Pfizer settles largest ever fraud suit for off-labelpromotion. Nat Biotechnol 27:961–962

32. Tesorieri H (2008) Federal grand jury probes Merck's handling ofVioxx. Wall Street Journal. http://online.wsj.com/article/SB120182583816633697.html. Accessed 07 January 2010

33. Singh G, Miller JD, Huse DM, Pettitt D, D'Agostino RB, RussellMW (2003) Consequences of increased systolic blood pressure inpatients with osteoarthritis and rheumatoid arthritis. J Rheumatol30:714–719

34. Gøtzsche PC (1989) Methodology and overt and hidden bias inreports of 196 double-blind trials of nonsteroidal, antiinflamma-tory drugs in rheumatoid arthritis. Controlled Clin Trials 10:31–56

35. Bero LA, Rennie D (1996) Influences on the quality of publisheddrug studies. Int J Tech Assess Health Care 12:209–237

36. Safer DJ (2002) Design and reporting modifications in industry-sponsored comparative psychopharmacology trials. J Nerv MentDis 190:583–592

37. Melander H, Ahlqvist-Rastad J, Meijer G, Beermann B (2003)Evidence b(i)ased medicine -selective reporting from studiessponsored by pharmaceutical industry: review of studies in newdrug applications. BMJ 326:1171–1173

38. Bero L, Oostvogel F, Bacchetti P, Lee K (2007) Factors associatedwith findings of published trials of drug–drug comparisons: whysome statins appear more efficacious than others. PLoS Med 4:e184

39. Hrachovec JB (2001) Reporting of 6-month vs 12-month data in aclinical trial of celecoxib. JAMA 286:2398

40. Jüni P, Rutjes AW, Dieppe PA (2002) Are selective COX 2inhibitors superior to traditional non steroidal anti-inflammatorydrugs? BMJ 324:1287–1288

41. Brophy JM (2005) Selling safety—lessons from muraglitazar.JAMA 294:2633–2635

42. Nissen SE, Wolski K, Topol EJ (2005) Effect of muraglitazar ondeath and major adverse cardiovascular events in patients withtype 2 diabetes mellitus. JAMA 294:2581–2586

43. Vedula SS, Bero L, Scherer RW, Dickersin K (2009) Outcomereporting in industry-sponsored trials of gabapentin for off-labeluse. N Engl J Med 361:1963–1971

44. Grill M (2007) Kranke Geschäfte: wie die Pharmaindustrie unsmanipuliert. Rowohlt Verlag, Hamburg

45. Mundy A (2001) Dispensing with the truth. St. Martin's Press,New York

46. Angell M (2004) The truth about the drug companies: how theydeceive us and what to do about it. Random House, New York

47. Petersen M (2008) Our daily meds. Sarah Crichton, New York48. Schafer A (2004) Biomedical conflicts of interest: a defence of the

sequestration thesis—learning from the cases of Nancy Olivieriand David Healy. J Med Ethics 30:8–24

49. Revill J (2005) Doctor accuses drug giant of ‘unethical’ secrecy.Observer. http://www.guardian.co.uk/uk/2005/dec/04/health.healthandwellbeing. Accessed 23 June 2009

50. Revill J (2005) How the drugs giant and a lone academic went towar. Observer. http://www.guardian.co.uk/uk/2005/dec/04/health.businessofresearch. Accessed 23 June 2009

51. Bosely S (2004) Junket time in Munich for the medical profession—and it´s all on the drug firms. Guardian. http://www.guardian.co.uk/uk/2004/oct/05/highereducation.businessofresearch.Accessed 23 June 2009

52. Tanne JH (2007) FDA places “black box” warning on antidiabetesdrugs. BMJ 334:1237

53. Mello MM, Clarridge BR, Studdert DM (2005) Academic medicalcenters’ standards for clinical-trial agreements with industry. NEngl J Med 352:2202–2210

54. Shuchman M (1999) Drug company threatens legal action overCanadian guidelines. BMJ 319:1388

55. Williams HC (2003) Evening primrose oil for atopic dermatitis.BMJ 327:1358–1259

56. Gornall J (2009) Industry attack on academics. BMJ 338:b73657. Braitwaite J (1984) Corporate crime in the pharmaceutical

industry. Routledge & Kegan Paul, London58. Avorn J (2005) Powerful medicines: the benefits, risks, and costs

of prescription drugs. Vintage, New York59. Goozner M (2005) The $800 million pill: the truth behind the cost

of new drugs. University of California Press, Berkeley60. CBS News (2007) OxyContin’s deception costs firm $634 M.

http://www.cbsnews.com/stories/2007/05/10/health/main2785453.shtml. Accessed 23 June 2009

61. Graham DJ (2006) COX-2 inhibitors, other NSAIDs, andcardiovascular risk: the seduction of common sense. JAMA296:1653–1656

62. Avorn J (2006) Dangerous deception—hiding the evidence ofadverse drug effects. N Engl J Med 355:2169–2171

63. Steinman MA, Harper GM, Chren MM, Landefeld CS, Bero LA(2007) Characteristics and impact of drug detailing for gabapentin.PLoS Med 4:e134

64. Ziegler MG, Lew P, Singer BC (1995) The accuracy of druginformation from pharmaceutical sales representatives. JAMA273:1296–1298

65. Perry T (2008) Neurontin: clinical pharmacologic opinion of Dr.Thomas L. Perry. http://dida.library.ucsf.edu/pdf/oxx18p10.Accessed 23 June 2009

66. Topol EJ (2004) Failing the public health–rofecoxib, Merck, andthe FDA. N Engl J Med 351:1707–1709

67. Graham DJ, Campen D, Hui R, Spence M, Cheetham C, Levy G,Shoor S, Ray WA (2005) Risk of acute myocardial infarction andsudden cardiac death in patients treated with cyclo-oxygenase 2selective and non-selective non-steroidal anti-inflammatory drugs:nested case-control study. Lancet 365:475–481

Eur J Clin Pharmacol (2010) 66:341–347 347

Page 39: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

39

DISCUSSION OF PAPER 3

In the third project, we assessed the reporting of cardiovascular thrombosis and

gastrointestinal bleeding in abstracts on the cyclooxygenase-2 inhibitor rofecoxib.

Rofecoxib was marketed in 1999 under the trade name Vioxx as being as effective as

conventional non-steroidal anti-inflammatory drugs (NSAIDs), but with less gastrointestinal

bleeding, for treating osteoarthritis and rheumatoid arthritis. It was withdrawn due to serious

thrombotic cardiovascular adverse events in 2004.

We found that, despite the manufacturer’s awareness of an increased risk of cardiovascular

thrombosis, the harm received little attention from the introduction of the drug to its

withdrawal. In this period, most abstracts reported on the benefits of less gastrointestinal

bleeding and they were in general in favour of rofecoxib. After withdrawal, most abstracts

reported on the harmful effects, cardiovascular thrombosis, and few were in favour of

rofecoxib.

This study did not have a control or reference drug and therefore the evolution of the

descriptions of harms and benefits in abstracts cannot be categorised as more or less

extreme than the ‘average drug’ marketed at the same time. However, a control drug was

not necessary to illustrate that important harms of rofecoxib were underreported in

abstracts. This is an important problem, as this skews the impression one gets of an

intervention, and unfortunately, the problem is general, as it also exists in trials (55), review

articles (56,57) and cost benefit analyses (58).

The change in reporting of harms and benefits over time for a particular drug or class of

drugs has not previously been studied in abstracts. A more recent study about the evolution

of trial methodology in psychopharmacological research found that the quality of reporting

the results in abstracts improved over time (59), but it did not specify if the reporting of

harms was considered a quality criterion. Nevertheless, there has been an increased focus

on harms and the quality of reporting in abstracts, which may have influenced our findings

to some degree, although it cannot explain the dramatic change in focus from a beneficial to

a harmful effect after rofecoxib was withdrawn from the market.

Some abstracts are more informative than others, e.g. structured abstracts, which often

allow more liberal word counts. Informative abstracts usually present the background,

objectives, methods, results and conclusions of a survey or a trial and indicative abstracts

presents fewer subheadings (60). Some of the abstracts that we included were indicative

and due to the nature of those abstracts, important information may have been left out.

Page 40: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

40

Therefore the style of abstracts plays an important role for the quality of information

presented.

It can be discussed whether safety results should be presented in the abstract

if they were not pre-planned as primary or secondary outcomes. Nevertheless, the

CONSORT statement for abstracts mentions that authors should describe any important

adverse (or unexpected) effects of an intervention, to which I agree. Harms are not always

the primary outcome and generally receive too little attention, as illustrated by the late

amendment of an analysis of cardiovascular adverse events to three study reports in 2002

(61). One of them that let to the withdrawal of rofecoxib began in 2000 (62), but a trial

finished in 1999 had already shown an increased risk of cardiovascular adverse events.

This trial, however, was first published in 2006 (63).

Page 41: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

41

PAPER 4

Our primary aim of the fourth study was to analyse the change in body weight and compare

the results when using four different methods for handling missing data.

Page 42: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

42

25. May 2011 We should not continue to use complete case analysis and single imputations like LOCF and BOCF. A comparison of results from different imputation techniques of missing data from an anti-obesity drug trial. Anders W Jørgensen (MD)1, Lars H Lundstrøm (MD, PhD) 2, Jørn Wetterslev (MD, PhD) 2, Arne Astrup (Professor, DrMedSci) 3 and Peter C. Gøtzsche (Professor, DrMedSci) 1,4 1The Nordic Cochrane Centre, Dept 3343, Rigshospitalet, Blegdamsvej 9, DK-2100 Copenhagen, Denmark 2Copenhagen Trial Unit, Copenhagen Centre of Clinical Intervention Research, Dept 3344, Rigshospitalet, Blegdamsvej 9, DK-2100 Copenhagen, Denmark 3 Department of Human Nutrition, Faculty of Life Sciences, University of Copenhagen, Rolighedsvej 30, DK-1958 Frederiksberg, Denmark 4Institute of Medicine and Surgery, Faculty of Health Sciences, University of Copenhagen, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark Corresponding author: Anders W Jørgensen The Nordic Cochrane Centre Rigshospitalet Blegdamsvej 9 DK-2100 Copenhagen Denmark e-mail: [email protected] Abstract: 222 words Text: 3766 words

Page 43: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

43

Abstract Background: In randomised trials of medical interventions the most reliable analysis follows the intention-to-treat (ITT) principle. However, the ITT analysis requires that missing outcome data have to be imputed. Different imputation techniques may give different results and some may lead to bias. In anti-obesity drug trials, many data are missing and the most used imputation method is last observation carried forward (LOCF). LOCF is generally considered conservative, but there are more reliable methods such as multiple imputation (MI). Objectives: To compare four different methods of handling missing data in a 60-week placebo controlled anti-obesity drug trial on topiramate. Methods: We compared an analysis of complete cases with datasets where missing body weight measurements had been replaced using three different imputation methods: LOCF, baseline carried forward (BOCF) and MI. Results: 561 participants were randomised. At the end of the trial, 85% of the participants were lost to follow up. Compared to placebo, there was a significantly greater weight loss with the topiramate in all analyses: 9.5 kg (SE 1.17) in the complete case analysis (N=86), 6.8 kg (SE 0.66) using LOCF (N=561), 6.4 kg (SE 0.90) using MI (N=561) and 1.5 kg (SE 0.28) using BOCF (N=561). Conclusions: The different imputation methods gave very different results. LOCF was not conservative compared to the more reliable MI, in fact LOCF had a lower SE. Introduction Attrition has been described as the bane of clinical trials on anti-obesity drugs (1). In most studies more than one third of the subjects drop out after one year (2), and depending on the type of missingness the results will be biased (3). Missingness can be classified as missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR). Only when missingness is MCAR the use of complete case analysis of the data, obtained exclusively from participants with all data observed, may give an unbiased result because the analysis then is a random sample of the full dataset. However, in most studies MCAR is rarely the case (4) and when missingness is MAR, missingness depends on the observed data. Finally, when missingness is MNAR, it also depends on some unobserved data. One of the most commonly applied methods for handling attrition in obesity research is ‘last observation carried forward’ (LOCF) (2). In an analysis of the outcome measure with LOCF a missing weight measurement of a participant at the end of trial is replaced by the participant’s last observed value. To assume that one’s weight is unchanged after dropping out of a trial seems hard to justify as participants tend to regain much of their lost weight within a short period of time after having stopped the intervention (5). Therefore, baseline carried forward (BOCF) has been proposed as a more reliable imputation strategy (6). However, both BOCF and LOCF overestimate the precision of the effect estimate because the dataset is analysed, after single imputation of the missing data, as if it was a ‘complete’ dataset with no missing data (7). Intuitively one has doubt in imputed data and thus the p-values and confidence interval should increase. Therefore BOCF and LOCF provide undue certainty of the effect estimate even under MCAR.

Page 44: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

44

There are other and more reliable techniques for handling missing data (8), and they can also be used when missingness is MAR. Attention has been drawn to the multiple imputation (MI) technique (9), which for some time has been recommended for handling missing data in obesity trials (2). The multiple imputation technique is a stepwise procedure. First, based on the observed data a plausible multivariable distribution for the missing values is estimated and they are being replaced by values randomly drawn from this distribution resulting in a complete dataset. Second, the first step is repeated multiple times generating multiple datasets. Third, the datasets are then analysed separately producing multiple estimates, e.g. weight change, and fourth, the multiple estimates are pooled resulting in one single estimate. Compared to LOCF and BOCF the precision in this case will not be unduly overestimated because the uncertainty of the imputed values due to multiple sampling from the distribution is taken into account. We had access to an individual patient dataset from a large randomised three armed weight loss trial that compared diet plus topiramate with diet plus placebo (10). Our primary aim was to analyse the weight change and compare the results by using four different methods for handling missing data. Thus we analysed the dataset of complete cases and datasets where missing weight measurements had been replaced by three different imputation methods LOCF, BOCF and MI. Our second aim was to report the results of the analysis of the primary outcome measure at the time-point specified in the trial protocol. Material and Methods The trial The trial was a randomised weight maintenance trial (n=561) that compared placebo (n=187) with topiramate 96 mg (n=190) and 192 mg (n=184) per day. It was designed to proceed for a total of 82 weeks; a 8-week non-pharmacological low-calorie diet run-in phase that was followed by the randomisation, a 60-week intervention phase, hereafter a 14-week drug tapering period and lastly a follow-up period. The data included assessments of weight from 26 visits plus standard baseline values (age, height, sex etc.), a variety of blood sample analyses and measures of hip and waist circumferences. Each visit corresponded to a specific number of weeks in the trial. Subjects that withdrew from the study were tapered off intervention over two to three weeks and during that period their weight was measured. More details about the methodology have been published previously (10). However, the sponsor terminated the trial prematurely due to low tolerability to the drug, and therefore the publication only contained results on a subset of patients. The primary outcome measure being percent weight change from enrolment to end of intervention phase has never been published. Data Data was provided by the first author (AA) of a previous report of this trial (10) and imported to SPSS 18.0. Missingness We assessed the mechanism of missingness by using Little’s test (11)and by plotting weights of subjects with missing data with those with data.

Page 45: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

45

Comparison of imputation methods We used the data from baseline (randomisation) to week 60. For simplicity we pooled the topiramate arms. We analysed the mean weight change in the placebo and the pooled topiramate group and the difference between the two from baseline to week 60. We also analysed percentage change in the same way. The results were plotted against time for comparisons between the four following methods. Complete case analysis This did not involve any imputation and was an analysis of data from participants on intervention, i.e. all available data from baseline and to week 60. Patients that had their weights measured on the visit of interest e.g. week 60, but had missing weight measurements on other visits, e.g. week 40 and 52, were also included. Last observation carried forward We substituted missing weight measurements at 60 weeks with the last observed measurement. We allowed carrying forward the baseline value if this was the last observed measurement. Baseline carried forward We substituted missing weight measurements with the baseline weight. Mutiple imputation We imputed the missing weight measurements with values of weight obtained by the ‘Fully conditional specification method’ in SPSS ver. 18. This is an iterative Markov chain Monte Carlo method that can be used when the pattern of missing data is monotone (i.e. a subject attends all visits till a visit is missed and never return) or none-monotone (12). We used a linear regression model that contained the variables: intervention group, sex, race, age and baseline values (height, waist circumference, plasma glucose, triglycerides, HDL-cholesterol, HDL/LDL-ratio, insulin, haemoglobin and haemoglobin A1c). Additionally we included variables for weight, but only at visits prior to the visit of interest. E.g. we only included weight measurements from baseline to week 20 for the analysis on week 20. We log-transformed weight to satisfy the normality assumption which seemed to hold when we assessed Q-Q plots and used the Kolmogorov–Smirnov test (p>0.2, df=561), but not the Shapiro–Wilk test (p=0.04, df=561). We also assessed convergence by plotting the means and standard deviations by iteration and imputation. The default number of 10 iterations in SPSS 18 seemed to be too low (web figure) as the standard deviation gradually increased up till and stabilized at 400 iterations. Therefore we chose 500 iterations and 10 imputations assuring an efficiency of the imputation of 99%. The primary outcome measure at 60 weeks in the trial protocol We estimated the mean percent weight change from enrolment to week 60 using the same methods as described above. However, the two topiramate groups receiving different dosages were not pooled. For completeness we also re-analysed the data on the subset of patients previously published (10). The subset was chosen because the authors wanted to reduce the risk of bias due to trial termination and consisted of participants who received at least one dose of study drug, provided at least one post-baseline efficacy evaluation, had the opportunity to complete 44 weeks of treatment before the study closedown announcement and allowed only data collected before the closedown announcement and

Page 46: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

46

up to week 44 to be included in the analysis of efficacy. Using these criteria it was not possible to get the exact same subset, and therefore we allowed data collected 3 days after the closedown announcement to be included in the analysis. However, in our analysis the number of patients in the topiramate 192 mg group (n=105) differed from the original analysis (n=99). Results Details about the study population have previously been published (10). Baseline characteristics are available in web table 1 and include the variables used in our analyses after MI. Missingness Missing data on weight gradually increased from week 0 to 44 (from 0% to 27%), but onwards missingness increased much more (Figure 1). Only 15% (n=86) of the patients were still on treatment and had their weight recorded on week 60. The reason for missingness varied, but mostly because of trial termination (10). The mechanism was unlikely to be MCAR with a statistical significant Little’s test (P<0.01)(11). On most visits participants who missed the following visit seemed to weigh more than those who attended (Figure 1). During the 60 weeks only 13% of the participants attended all visits and the most frequently occurring patterns of missingness (66% of the participants) were monotone; 17%, 17%, 9% and 9% had missing visits onwards from week 44, 48, 52 and 56 (figure 3). Comparison of imputation methods over time and with increasing missingness From baseline to week 44 the estimated difference in mean weight change between placebo and topiramate increased by all four methods. From week 44 to week 60 the difference increased in the complete case analysis and in LOCF, but decreased in MI and BOCF (Figure 3). The complete case analysis estimated the greatest difference in weight loss (-9.5 kg on week 60) and BOCF the smallest (-1.5 kg on week 60). MI and LOCF were similar with MI resulting in a slightly greater difference from the beginning and throughout most of the trial and a slightly smaller difference in the end (-6.4 kg on week 60) when compared to LOCF (-6.8 kg on week 60). These differences were a result of an overall weight loss in the topiramate group (figure 4) and weight gain in the placebo group (Figure 5). The change within the pooled topiramate group was again biggest in the complete case analysis (-5.9 kg on week 60) and smallest in the BOCF (-0,9 kg on week 60). Also, MI and LOCF estimated a similar weight change in the beginning of the trial, but in the end MI (-3.4 kg on week 60) showed a much smaller weight loss than LOCF (-5.5 kg on week 60) (Figure 4). The change within the placebo group was similar in the complete case analysis and BOCF in the beginning, however in the end the complete case analysis estimated the biggest weight gain (3,7 kg on week 60) and BOCF the smallest (0.5 kg on week 60). In the placebo group MI and LOCF were similar in the beginning, but in the end MI (3.0 kg on week 60) showed a greater weight gain than LOCF (1.3 kg on week 60) (Figure 5). We got similar results for the analyses of the difference in weight change between the groups and the weight change within the groups when analysing the percentage change from baseline.

Page 47: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

47

The trial’s primary outcome measure at 60 weeks Regardless the methods of analysis participants in the topiramate groups on average lost more weight than participants in the placebo group. The primary outcome measure at end of trial according to the trial protocol was the percentage change from enrolment to 60 weeks (week –8 to week 60) and for placebo compared to topiramate 96 mg the mean difference was -10.5% (SE = 2.2) in the complete case analysis, -6.1% (SE = 0.7) using LOCF, -5.5% (SE = 1.1) using MI and -1.7% (SE = 0.4) using BOCF. For placebo compared to topiramate 192 mg/day the mean difference was -10.0% (SE = 1.9), -7.5% (SE = 0.8), -7.3% (SE = 1.0) and -1.5% (SE = 0.4) using complete case analysis, LOCF, MI and BOCF respectively (Table 1). When we re-analysed the percentage change from enrolment to week 44 of the subset of participants our results were similar, but not identical to those published previously (webtable 2). The mean difference between placebo and topiramtate 96 mg was -7.4% in the complete case analysis, -6.2% using LOCF, -6.0% using MI and -5.6% using BOCF. For placebo compared to topiramate 192 mg the mean difference was -8.0%, -7.5%, -7.6% and -6.4% using complete case analysis, LOCF, MI and BOCF respectively. Discussion We compared 4 methods of analysing the effect of topiramate in a weight loss and maintenance trial. We found that in the beginning of the trial LOCF and MI estimated similar body weight changes, but over time and with high attrition they estimated different body weight changes. This however, did not have a big effect on the difference in body weight change between topiramate and placebo, which was similar throughout the trial. Complete case analysis estimated the greatest difference and BOCF the smallest. We also estimated the weight change as original planned in the trial protocol and as it was done in a previous publication of the trial (10), and our results regardless of method confirmed that in this weight maintenance trial topiramate produces a greater weight-loss than placebo. Other placebo-controlled trials have shown that topiramate reduces weight (13). In several simulation studies MI has been shown to provide both a confidence interval with better coverage (realistic precision) and a less biased estimate of the intervention effect than both complete case analysis and single imputations using single imputations (14)(15)(16). In these studies the method has been to simulate missingness from a complete dataset and further with different imputation techniques trying to amend the missingness calculating an estimate of the intervention effect from the imputed dataset. Accordingly, having the correct estimate of the intervention effect from the complete dataset, these studies have been able to evaluate how close the result of an imputation method comes to the correct estimate from the original complete dataset. The conclusion of such simulation studies have been overwhelmingly in favour of MI. In trials on anti-obesity drugs it has previously been shown that LOCF estimate similar effect sizes (2), but overestimates the precision compared to MI (8) and therefore describing LOCF as a conservative analysis is wrong.

Page 48: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

48

Our results can be interpreted in relation to the per-protocol (PP) hypothesis that assumes that all participants adhere to treatment, which is an unrealistic scenario in trials on anti-obesity drugs, or the intention-to-treat (ITT) hypothesis that assumes that some participants do not adhere to treatment. Thus the PP analysis estimates the weight loss as if the participants adhere to the treatment and the ITT analysis estimates the weight loss of the intention to give the treatment regardless of adherence (17). If a drug truly reduces the weight, then BOCF will result in a conservative estimate (smaller weight loss) compared to a PP analysis, but maybe a realistic estimate compared to a ITT analysis (6), as participants tend to regain some of their lost weight within a after ended treatment (5). The interpretation of the LOCF analyses is difficult due to the course of weight change during a weight loss trial. Most of the weight loss occurs early during treatment, then levels out and some is regained in the end of a trial (18). Compared with a PP analysis, it may be reasonable to assume that LOCF underestimates the weight loss in the short term and overestimates it in the long term. On the other hand, compared with the ITT hypothesis, it is likely that LOCF overestimates the weight loss, as BOCF seems more realistic. Our MI analysis is more compatible with a PP analysis than an ITT analysis and should be interpreted as such (17). We had weight data for some of the withdrawn participants from a 2 to 3 weeks tapering off period in the MI analysis, however during that period they were still on treatment, but on a smaller dose, and therefore the result is still a good estimation of the weight loss as if all participants had continued the treatment. If we had had complete data on some of the participants that dropped out or did not adhere to the treatment, we could have used their data for MI, which then would have reflected an ITT analysis (17). The complete case analysis can be an unbiased PP analysis, but only when the participants in the analysis can be regarded as a random sample of the study population (when the missing mechanism is MCAR). In our dataset the missing mechanism was not MCAR. It seemed that those who had missing data weighted more than those without and thus complete case analysis overestimated the weight loss. MCAR is only rarely the case in clinical trials (4). Our results should also be interpreted in relation to how results from weight loss trials are being reported in scientific articles in general. Often, only the weight change within the treatment groups is stated, and sometimes also the difference in weight change between the groups is stated. We have reported both and found that MI and LOCF resulted in similar differences between topiramate and placebo, but that the weight change within the groups was smaller in the MI analysis than in LOCF and this difference increased over time and with increasing missing data. The most likely explanation for this is that LOCF, but not MI, ignores the course of weight change (described above). Figure 4 and 5 both show that the graphs for MI is more or less a translation of the complete case analysis, suggesting that the bias LOCF introduce compared to MI is more or less the same in both treatment groups, and therefore MI and LOCF estimates similar difference between the groups. Further, MI introduced greater uncertainty of the intervention effect estimate than both the single imputations LOCF and BOCF. Obviously LOCF is biased towards the

Page 49: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

49

alternative hypothesis being true and BOCF is biased towards the null hypothesis being true and both methods have unduly high estimates of the precision (table 1 & figure 3-5). Limitation We do not know the true treatment effect of topiramate or the weight of the missing participants and thus it is impossible to validate our findings. Also we do not know the exact mechanism of missingness. If the mechanism is MNAR all imputation methods are likely to be biased. However, MI may provide less biased results in this situation as well (16). Most participants dropped out if the trial because of trial termination and this is not a common reason for attrition in obesity trials. However the reason for termination was safety and tolerability issues with the drug. We did not include safety or harm in our imputation model and thus missingness may be at risk of depending on data that are ‘unobserved’ by the MI procedure. However, the harms may still be correlated to the variables in our model and figure 1 showed that missingness could be predicted by the weight of the participants. Further, MI has been found to be a reliable imputation method in weight loss trials when only relaying on measurements over time (2). Suggestions for future research The major reason for missing data when participants’ quit the treatment is their lack of interest for having their weight measured and for obvious reasons trialists cannot force the patient to show up for a visit. Therefore we need weight loss trials that include incentatives for participants to be followed-up and other logistic methods for measuring the weight and side effects of those participants who decide to quit treatment. Trialists are aware of this problem and most trial protocols specify that investigators shall do their best to have a last measurement at the end of the specified follow up period, but do not provide the incentatives or methods. The protocol for the trial in this study stated: “Participants withdrawn from the study prior to completion of treatment period will be encouraged […] to attend study visits with assessments equivalent to those performed at Visit 23 (Week 60)” A similar statement was in the protocol for the RIO-North America trial, a weight loss study of rimonabant (19): “For patients with premature treatment discontinuation or patients considered lost to follow-up, the [case report form] must be filled in up to the last visit performed. The Investigator should make every effort to re-contact and to identify the reason why the patient failed to attend the visit and to determine his/her health status and to retrieve study medication.” When the trial was published it was criticised in an editorial for only measuring those participants who completed the trial (53%) rather than all patients who were not lost to follow-up (93%) (18). It is indeed possible to have a high follow-up as seen in a recently published weight loss trial of free meals and an intensive weight loss program. After 24 month the investigators had weight data on 92% of the study participants (20).

Page 50: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

50

Sometimes some of the participants that withdrew from a trial are measured and analysed, but unfortunately not published. In a placebo controlled trial of long-term treatment with sibutramine the authors only published an unadjusted ITT analysis (LOCF) (n=464) (21) that showed a greater weight reduction with the drug compared to an adjusted analysis of all available patients (including those who withdrew) (n=205) which was available in a trial report submitted to the Danish Medicines Agency. We believe that it is possible to get complete weight data for some and a final planned weight measurement for many of the withdrawn participants. When imputation is undertaken reliable techniques, such as MI, should be used and to make a proper ITT analysis imputation should also be based on weight data collected after withdrawal. We can only then provide a meaningful answer to the question of how much weight one can expect to loose on average when using a particular intervention. Conclusion The different imputation methods gave different results, but all showed that topiramate reduced weight compared to placebo. In anti obesity trials imputation is obligatory due to the amount and type of missing data and because the complete case analysis is biased. However, the ITT analysis using LOCF, which in general is considered a conservative analysis, was in this trial similar to the PP analysis that used the more reliable MI method and therefore we suggest that post withdrawal weight data must be obtained to make a proper ITT analysis and MI should replace the standard imputation strategies LOCF and BOCF because they overestimate the precision. 1. Fabricatore AN, Wadden TA, Moore RH, Butryn ML, Gravallese EA, Erondu NE, et al.

Attrition from randomized controlled trials of pharmacological weight loss agents: a systematic review and analysis. Obes Rev 2009;10:333-341.

2. Elobeid MA, Padilla MA, McVie T, Thomas O, Brock DW, Musser B, et al. Missing data in randomized clinical trials for weight loss: scope of the problem, state of the field, and performance of statistical methods. PLoS ONE 2009;4:e6624.

3. Sterne JAC, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ 2009;338:b2393.

4. Donders ART, van der Heijden GJMG, Stijnen T, Moons KGM. Review: a gentle introduction to imputation of missing values. J Clin Epidemiol 2006;5:1087-1091.

5. Methods for voluntary weight loss and control. NIH Technology Assessment Conference Panel. Consensus Development Conference, 30 March to 1 April 1992. Ann Intern Med 1993;119:764-770.

6. Ware JH. Interpreting incomplete data in studies of diet and weight loss. NEJM 2003;348:2136-2137.

7. Beunckens C, Molenberghs G, Kenward MG. Direct likelihood analysis versus simple forms of imputation for missing data in randomized clinical trials. Clin Trials 2005;2:379-386.

Page 51: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

51

8. Molenberghs G, Kenward M. Missing data in clinical studies. Wiley 2007.

9. Gadbury GL, Coffey CS, Allison DB. Modern statistical methods for handling missing repeated measurements in obesity trial data: beyond LOCF. Obes Rev 2003;4:175-184.

10. Astrup A, Caterson I, Zelissen P, Guy-Grand B, Carruba M, Levy B, et al. Topiramate: long-term maintenance of weight loss induced by a low-calorie diet in obese subjects. Obes Res 2004;12:1658-1669.

11. Little RJA. A Test of Missing Completely at Random for Multivariate Data with Missing Values. Journal of the American Statistical Association. 1988;83:1198-1202.

12. PASW Missing Values 18. http://support.spss.com/productsext/statistics/documentation/18/client/User%20Manuals/English/PASW%20Missing%20Values%2018.pdf (accessed 14 Feb 2011).

13. Kramer CK, Leitão CB, Pinto LC, Canani LH, Azevedo MJ, Gross JL. Efficacy and safety of topiramate on weight loss: a meta-analysis of randomized controlled trials. Obes Rev 2011;12:e338-347.

14. Sinharay S, Stern HS, Russell D. The use of multiple imputation for the analysis of missing data. Psychol Methods 2001;6:317-329.

15. Schafer JL, Graham JW. Missing data: Our view of the state of the art. Psychological Methods 2002;7:147-177.

16. Schafer JL. Multiple imputation: a primer. Stat Methods Med Res 1999;8:3-15.

17. Carpenter J, Kenward M. Missing data in randomised controlled trials - a practical guide. http://www.haps.bham.ac.uk/publichealth/methodology/docs/invitations/Final_Report_RM04_JH17_mk.pdf (accessed 14 Feb 2011).

18. Simons-Morton DG, Obarzanek E, Cutler JA. Obesity research--limitations of methods, measurements, and medications. JAMA 2006;295:826-828.

19. Pi-Sunyer FX, Aronne LJ, Heshmati HM, Devin J, Rosenstock J. Effect of rimonabant, a cannabinoid-1 receptor blocker, on weight and cardiometabolic risk factors in overweight or obese patients: RIO-North America: a randomized controlled trial. JAMA 2006;295:761-775.

20. Rock CL, Flatt SW, Sherwood NE, Karanja N, Pakiz B, Thomson CA. Effect of a free prepared meal and incentivized weight loss program on weight loss and weight loss maintenance in obese and overweight women: a randomized controlled trial. JAMA 2010;304:1803-1810.

21. Smith IG, Goulder MA. Randomized placebo-controlled trial of long-term treatment with sibutramine in mild to moderate obesity. J Fam Pract 2001;50:505-512.

Page 52: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

52

Figure 1. Mean weight of patients attending and missing the following visit. Some of the missing patients return and therefore week 28 and 32 both include 448 patients.

Figure 2. The most frequently occurring patterns of missingness.

% of all subjects

9

9

17 Missed

17 Attended

3

3

3

3

2

Patt

ern

s

0 2 4 6 8 12 16 20 24 28 32 36 40 44 48 52 56 60

Trial duration (weeks)

Number of patients

Trial duration / weeks

85

95

105

115

56

1

54

7

53

3

52

1

51

2

50

1

48

7

47

5

45

7

44

8

44

8

43

6

43

0

40

9

29

9

20

1

14

5

86

0 2 4 6 8 12 16 20 24 28 32 36 40 44 48 52 56 60

Me

an

we

igh

t /

kg

attending next visit

missing next visit

Page 53: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

53

Figure 3. Analysis of difference in weight change between placebo and topiramate pooled (96 and 192 mg/day) from baseline to week 60 using different methods

0

13%

20%

27%

85%

0

1

2

3

4

5

6

7

8

9

10

0 10 20 30 40 50 60

Trial duration / weeks

Ch

an

ge

fro

m b

as

eli

ne

/ k

g

Completers

LOCF

MI (all data)

BOCF

% subjects missing

Page 54: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

54

Figure 4. Analysis of weight loss over time in topiramate pooled (96 or 192 mg/day) group using different imputation methods

0

84%

27%

13%

21%

0

1

2

3

4

5

6

7

8

0 10 20 30 40 50 60 70

Trial duration / weeks

Ch

an

ge

fro

m b

as

eli

ne

/ k

g

Completers

LOCF

MI (all data)

BOCF

% subjects missing

Page 55: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

55

Figure 5. Analysis of weight loss over time in the placebo group using different imputation methods

Analysis of weight lossover time in placebo group using different imputation

methods

085%

23%

13%

7%

-4

-3

-2

-1

0

1

2

0 10 20 30 40 50 60 70

Trial duration / weeks

Ch

an

ge

fro

m b

as

eli

ne

/ k

g

Completers

LOCF

MI (all data)

BOCF

% subjects missing

Page 56: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

56

Table 1. Percentage weight change from enrolment (- 8 week) to 60 weeks

Placebo Topiramate 96 mg Topiramate 192 mg

Analysis Mean SE Mean SE Mean SE

Completers (n=28) (n=31) (n=27)

Change, % 4,2 1,13 -6,3 1,82 -5,8 1,59

LOCF (n=187) (n=190) (n=184)

Change, % 1,3 0,52 -4,8 0,53 -6,2 0,66

MI (n=187) (n=190) (n=184)

Change, % 3,0 0,69 -2,5 0,79 -4,3 0,79

BOCF (n=187) (n=190) (n=184)

Change, % 0,6 0,20 -1,0 0,34 -0,9 0,28

SE: standard error Each difference (topiramate - placebo) had a P-value < 0.001 (Unpaired T-test)

Page 57: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

57

Web material Web figure 1. Convergence chart showing the mean and standard deviation of the imputed values of log transformed weight measurements at each iteration of the multiple imputation method for each of the 5 requested imputations. The purpose of this plot is to look for patterns in the lines. There should not be any, and these should look suitably “random” which is the case for the mean, but note that the standard deviation increases up to iteration number 450 and the reaches a plateau.

Page 58: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

58

Webtable 1. Baseline values for variables included in the model for multiple imputaion.

Placebo Topiramate 96 mg Topiramate 192 mg All

Number of subjects 187 190 184 561

Sex

Male, no. (%) 39 (21) 53 (28) 44 (24) 136 (24)

Female, no. (%) 148 (79) 137 (72) 140 (76) 425 (76)

Race

White, no. (%) 182 (97) 188 (99) 181 (98) 551 (98)

Black, no. (%) 4 (2) 1 (0.5) 2 (1) 3 (1)

Asian, no. (%) 1 (1) 1 (0.5) 1 (1) 7 (1)

Mean baseline values

Age, years (SD) 44 (11) 44 (10) 44 (11) 44 (11)

Weight, kg (SD) 96.6 (14.7) 98.8 (13.8) 99.3 (15.1) 98.2 (14.5)

Height, cm (SD) 167 (9) 168 (9) 168 (9) 168 (9)

Waist size, cm (SD) 105 (12) 106 (11) 107 (11) 106 (11)

Plasma glucose, mmol/L (SD)* 6.6 (1.8) 6.6 (1.8) 6.7 (1.8) 6.6 (1.8)

Triglycerides, mmol/L (SD) 1.2 (0.5) 1.1 (0.6) 1.3 (0.8) 1.2 (0.6)

HDL cholesterol, mmol/L (SD) 1.2 (0.3) 1.2 (0.3) 1.2 (0.3) 1.2 (0.3)

HDL/LDL cholesterol, ratio, (SD) 2.8 (1.0) 2.9 (1.0) 2.8 (0.9) 2.8 (1.0)

Insulin, mcUI/ml (SD) 10.2 (5.0) 10.7 (8.4) 11.2 (6.2) 10.7 (6.7)

Hemoglobin, g/L (SD) 143 (13) 144 (12) 143 (13) 143.3 (12.3)

Hemoglobin 1Ac, percent (SD) 5.5 (0.5) 5.5 (0.5) 5.5 (0.5) 5.5 (0.5)

*Oral glucose intolerance test

Page 59: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

59

Web table 2. Percentage weight change from enrolment (- 8 week) to week 44 for the subset of subjects published in Obes. Res. 2004;12:1658-1669.

Placebo Topiramate 96 mg Topiramate 192 mg

Analysis Mean SE Mean SE CI95% Mean SE CI95%

Low High Low High

Completers (n=83) (n=71) (n=81)

Change, % -9.6 0.78 -17.0 0.81 -17.6 0,98

Difference,% (topiramate - placebo) 7.4 1.12 5.1 9.6 8.0 1.3 5.6 10.5

LOCF (n=99) (n=96) (n=105)

Change, % -9.1 0.69 -15.3 0.71 -16.6 0.84

Difference,% (topiramate - placebo) 6.2 0.99 4.2 8.1 7.5 1.10 5.4 9.7

MI (n=99) (n=96) (n=105)

Change, % -8.8 0.77 -14.7 0.86 -16.4 0.93

Difference,% (topiramate - placebo) 5.9 1.15 3.7 8.2 7.6 1.23 5.2 10.0

BOCF (n=99) (n=96) (n=105)

Change, % -9.6 0.65 -15.1 0.69 -16.0 0.81

Difference,% (topiramate - placebo) 5.6 0.95 3.7 7.4 6.4 1.05 4.4 8.5

SE: standard error CI95%: 95% confidenceinterval Each difference (topiramate - placebo) had a P-value < 0.001 (Unpaired T-test)

Page 60: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

60

DISCUSSION OF PAPER 4

In this project, we compared 4 methods of analysing the effect of a weight-reducing drug in

a weight maintenance trial. We found that, over time and with increasing attrition, the

difference in body weight change between the 4 methods also increased. When attrition

was high LOCF and MI estimated different body weight changes within the treatment

groups, but similar differences between the groups. Complete case analysis estimated the

greatest difference and baseline carried forward (BOCF) the smallest. As expected, LOCF

and BOCF both had higher precision than MI and the complete case analysis.

A major limitation to our study is that we do not know the true treatment effect of topiramate

or the body weight of the missing participants and it is therefore not possible to validate our

findings against a gold standard - the correct result. Also, we do not know the exact

mechanism of missingness. Missingness can be classified as missing completely at

random (MCAR), missing at random (MAR), or missing not at random (MNAR). When

missingness is MCAR the available data can be regarded as a random sample of the ‘full’

dataset and only in this case the complete case analysis exclusively on participants with all

data observed gives an unbiased result. However, in most studies MCAR is rarely the case.

When missingness is MAR, missingness depends on the observed data and in this case MI

can make reliable imputation of the missing data using the distribution of the observed.

Finally, when missingness is MNAR, missingness also depends on unobserved data. For

obvious reasons unobserved data cannot be included in the imputation method and

therefore all methods are likely to be biased. Nevertheless, MI may provide less biased

results in this situation as well, compared to the other methods (64) and has been found to

be a reliable imputation method in trials of anti-obesity drugs (24).

In general, MI is considered one of the most reliable methods for imputing missing data. It

has been available for more than 20 years and in standard statistical programs for about 10

years. But MI is rarely used in randomised trials and has only recently been proposed for

trials on anti-obesity drugs (24). There has been and still is a steadfast tradition of using

LOCF in these trials, but policy is changing. The European Medicines Agency has from

2011 implemented a guideline for handling missing data in clinical trials (65) that the drug

industry has to follow. The guideline does not find LOCF, (but BOCF) appropriate for

chronic conditions such as obesity that will return to baseline when the treatment is stopped

and it describes MI as a more proper imputation technique. Therefore, I assume that we will

see more trials using BOCF and MI in the future. With these changes, drug companies will

be more motivated for minimising attrition and missing data because increasing attrition and

missing data will result in a treatment effect that approaches zero when BOCF is used.

Page 61: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

61

Likewise, there will be a larger proportion of withdrawn patients with complete data as their

data are required for a proper ITT analysis.

A Cochrane review on long-term use of the anti-obesity drugs sibutramine, orlistat and

rimonabant found that all trials reported ITT-analyses using LOCF. The authors of the

review preferred this analysis to a complete case analysis, as it is more conservative, but

acknowledge that the bias introduced with LOCF can go in both directions, and due to high

attrition rates they conclude that more methodological rigorous trials are needed (66). An

average attrition rate of 37% at 1 year in trials on anti-obesity drugs (67) is high, but not

compared to the ‘real world’. A survey of orlistat and sibutramine users in British Columbia,

Canada, found that the persistence rates with the drugs were 6% and 8%, respectively,

after one year. Therefore the high attrition rates in trials on anti-obesity drugs may not be the

biggest problem as they reflect the ‘real world’, whereas failure to follow up on participants

that discontinue the treatment and the use of LOCF comprise major methodological

problems. It shouldn't be difficult to obtain a final body weight even for patients who

dropped out of the study.

Authors of systematic reviews and meta-analyses have methods for imputing missing

outcome data in trials but they are of limited quality and the best methods require individual

patient data (68). Individual patient data are only rarely available. Therefore, the solution to

the problem with missing data remains with those who design and conduct the trial. It is

indeed possible to have low rates of missing data, as seen in a recently published weight

loss trial of free meal plus motivated structured weight loss program compared with usual

care. After 24 months, the investigators had weight data on 92% of the study participants

(69). Most weight loss drug trials also include a diet or behavioural guidance, which can be

used to minimise missing weight data. Further, many trial protocols describe that outcome

data should be obtained from withdrawn patients.

Page 62: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

62

PAPER 5

The aim of the fifth paper was to provide other researchers with arguments for increased

transparency in drug regulation and to inform them about EMA’s widened access to

documents such as trial reports and protocols.

Page 63: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

Opening up data at the European Medicines AgencyWidespread selective reporting of research results means we don’t know the true benefits andharms of prescribed drugs. Peter Gøtzsche and Anders Jørgensen describe their efforts to getaccess to unpublished trial reports from the European Medicines Agency

Peter C Gøtzsche professor, Anders W Jørgensen PhD student

Nordic Cochrane Centre, Rigshospitalet and University of Copenhagen, Dept 3343, Blegdamsvej 9, DK-2100 Copenhagen Ø, Denmark

Doctors cannot choose the best treatments for their patientsdespite the existence of hundreds of thousands of randomisedtrials. The main reason is that research results are being reportedselectively. Comparisons of published drug trials withunpublished data available at drug regulatory agencies haveshown that the benefits of drugs have been much over-rated1-3

and the harms under-rated.4Comparisons of trial protocols withpublished papers have also shownwidespread selective reportingof favourable results.5 6

Selective reporting can have disastrous consequences. Rofecoxib(Vioxx) has probably caused about 100 000 unnecessary heartattacks in the United States alone,7 and class 1 antiarrhythmicdrugs probably caused the premature death of about 50 000Americans each year in the 1980s.8 An early trial found ninedeaths among patients taking the antiarrhythmic drug and onlyone among those taking placebo, but it was never publishedbecause the company abandoned the drug for commercialreasons.9

Allowing researchers access to unpublished trial reportssubmitted to drug regulatory agencies is important for publichealth. Such reports are very detailed and provide more reliabledata than published papers,1-4 but it has been virtually impossibleto get access to them.We eventually succeeded in getting accessto reports held by the EuropeanMedicines Agency (EMA) afterthree years of trying. Our case has set an important precedent,and we summarise here the process and the arguments.

Our application for accessOn 29 June 2007 we applied for access to the clinical studyreports and corresponding protocols for 15 placebo controlledtrials of two anti-obesity drugs, rimonabant and orlistat. Themanufacturers had submitted the reports to the EMA to obtainmarketing approval in the European Union. We explained thatwe wanted to explore the robustness of the results by adjustingfor the many missing data on weight loss and to study selectivepublication by comparing protocols and unpublished resultswith those in published reports.

The information was important for patients because anti-obesitypills are controversial. The effect on weight loss in the publishedtrials is small,10 and the harms are substantial. People have diedfrom cardiac and pulmonary complications11 or have experiencedpsychiatric disturbances, including suicidal events,12 and mostof the drugs have been deregistered for safety reasons.A basic principle in the European Union is to allow its citizensthe widest possible access to the documents its agencies possess(box 1).13But there are exemptions, and the EMA refuses accessif disclosure would threaten commercial interests unless thereis an over-riding public interest.14 We argued in our first letterto the EMA that secrecy was not in the best interests of thepatients because biased reporting of drug trials is common.2 5

Furthermore, we hadn’t found any information that couldcompromise commercial interests in 44 trial protocols ofindustry initiated trials we had reviewed previously.5

Without any comment on our arguments, the EMA replied thatthe documents could not be released because it would underminecommercial interests. We appealed to the EMA’s executivedirector, Thomas Lönngren, and asked him to explain why theEMA considered that the commercial interests of the drugindustry should over-ride the welfare of patients. We arguedthat the EMA’s attitude increased the risk of patients dyingbecause their doctors prescribed drugs for themwithout knowingwhat the true benefits and harms were. He sent us a similar letterto the EMA’s first letter, ignoring our request for clarification,and told us we could lodge a complaint with the Europeanombudsman, which we did.Over the following three years the EMA put forward severalarguments to avoid disclosing the documents: protection ofcommercial interests, no over-riding public interest, theadministrative burden involved, or the worthlessness of the datato us after the EMA had redacted them (box 2). It also did notrespond to the ombudsman’s letters before his rather generousdeadlines had run out.

Correspondence to: P C Gøtzsche, [email protected]

Reprints: http://journals.bmj.com/cgi/reprintform Subscribe: http://resources.bmj.com/bmj/subscribers/how-to-subscribe

BMJ 2011;342:d2686 doi: 10.1136/bmj.d2686 Page 1 of 4

Analysis

ANALYSIS

Page 64: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

Box 1: Basic principles on citizens’ access to EU documents13

“Any citizen of the Union, and any natural or legal person residing or having its registered office in a Member State,has a right of access to documents of the institutions, subject to the principles, conditions and limits defined in thisRegulation.”“Openness enables citizens to participate more closely in the decision-making process and guarantees that theadministration enjoys greater legitimacy and is more effective and more accountable to the citizen in a democraticsystem. Openness contributes to strengthening the principles of democracy and respect for fundamental rights as laiddown in Article 6 of the EU Treaty and in the Charter of Fundamental Rights of the European Union.”

Box 2: The path to the data

The delays on our part amounted to 130 days (11% of the time); we awaited replies for 1028 days.29 Jun 2007:We asked the EMA to provide access to the clinical study reports and their corresponding protocols onrimonabant and sibutramine20 Aug 2007: The EMA replied that the documents could not be released because they came under the exception ofcommercial interests24 Aug 2007:We explained that the EMA’s lack of transparency violated basic principles in the EU treaty and that itleads to suboptimal treatment of patients17 Sept 2007:With no comment on our arguments, the EMA referred again to commercial interests and noted wecould institute court proceedings against the EMA or complain to the European ombudsman8 Oct 2007:We appealed to the ombudsman, noting that the published literature on drugs is flawed and arguing thatprotocols and study reports did not disclose anything that could undermine commercial interests30 Jan 2008: The EMA replied to two letters from the ombudsman, referred to protection of commercial interests andmentioned that it could not identify any over-riding public interest that could justify disclosure of the requested documents26 Feb 2008:We told the ombudsman that the EMA had failed to explain why commercial interests would be undermined28 Apr 2008: The EMA replied to the ombudsman that it needed to protect the data against unfair commercial use;that evaluating the balance between benefits and risks of medicines is the EMA’s job; and that redaction of personaldata would cause disproportionate effort17 Jun 2008: In our reply to the ombudsman, we argued against this and noted that if commercial success dependson withholding data that are important for rational decision making by doctors and patients, there is somethingfundamentally wrong with our priorities in healthcare22 Jan 2009: The ombudsman proposes a friendly solution to the EMA and asks it to grant us access to the documentsor provide a convincing explanation why such access cannot be granted26 Feb 2009: The EMA restates the commercial interests; claims that we have not given evidence of an over-ridingpublic interest; and refers to the workload involved in redacting the documents10 Mar 2009: The ombudsman again proposes a friendly solution to the EMA and asks it to clarify its reasoning7 Apr 2009: The EMA repeats its previous arguments.19 May 2009:We again counter the EMA’s arguments: the EMA has provided no evidence that the documents arecommercially sensitive; many patients had been harmed by selective publication of trial data on COX 2 inhibitors; andredacting the documents should be quick and easy31 Aug 2009:We tell the ombudsman that we have received trial data from the Danish Medical Agency on a thirdanti-obesity drug, sibutramine6 Oct 2009: The ombudsman goes to the EMA to inspect the documents we had requested19 May 2010: The ombudsman issues a draft recommendation that the EMA should grant us access to the documentsor provide a convincing explanation as to why not7 Jun 2010: In a press release the ombudsman accuses the EMA of maladministration because of its refusal to grantaccess31 Aug 2010: The EMA informs the ombudsman that it will provide access1 Feb 2011:We receive the data

Protection of commercial interestsProtection of commercial interests was the EMA’s over-ridingargument. It would undermine the protection of commercialinterests to allow us access, it said, as the documents representedthe full details of the clinical development programme and themost substantial part of the applicant’s investment. Competitorscould use them as a basis for developing the same or a similardrug and gather valuable information on the long term clinicaldevelopment strategy of the company to their own economicadvantage.

We explained that the clinical study reports and protocols arebased on well known principles that can be applied to any drugtrial; that the clinical study reports describe the clinical effectsof drugs; and that nothing in the EMA’s guidelines forpreparation of such reports indicates that any informationincluded in them can be considered a trade secret. The trialprotocols are always sent to the clinical investigators, and it isunlikely that companies would have left in any information thatcould be of commercial value (such as a description of the drugsynthesis). We also noted that the clinical study reports and trialprotocols represent the last phase of drug development, which

Reprints: http://journals.bmj.com/cgi/reprintform Subscribe: http://resources.bmj.com/bmj/subscribers/how-to-subscribe

BMJ 2011;342:d2686 doi: 10.1136/bmj.d2686 Page 2 of 4

ANALYSIS

Page 65: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

has been preceded by many years of preclinical development.Other companies could hardly use them as a basis for developingsimilar drugs. In fact, unpublished trial data are generally lesspositive than published ones,1-6 and competitors would thereforebe less likely to start drug development if they had access tothe unpublished results. Other companies are more likely to beinterested in in vitro, animal, and early human studies, and drugcompanies have no problems with publishing such studiesbecause the results may attract investors.The European ombudsman, P Nikiforos Diamandouros,considered that commercial interests might be at stake but notedthat the risk of an interest being undermined must be reasonablyforeseeable and not purely hypothetical. He could not see thataccess would “specifically and actually” undermine commercialinterests. He inspected the relevant reports and protocols at theEMA and concluded that the documents did not containcommercially confidential information. He therefore criticisedthe EMA’s refusal to grant us access.

Over-riding public interest in disclosureEven if commercial interests were undermined by disclosure,access would still have to be granted if there was an over-ridingpublic interest. The EMA argued that it could not identify anyover-riding public interest and remarked that the evaluation ofsafety and efficacy of drugs is its responsibility—the EMAconstantly monitors drugs and updates its assessment reportsand requires changes in product information as appropriate.We considered this insufficient. Monitoring adverse effectsreported by doctors to drug agencies would not have revealedthat rofecoxib causes heart attacks. Few such events are reported,and heart attacks are common in people with arthritis.Postmarketing passive surveillance systems can therefore usuallynot detect whether a drug leads to more heart attacks thanexpected; randomised trials are needed for this.We provided more evidence of the detrimental effects ofselective publication but to no avail. The EMA continued toclaim that we had not documented the existence of anover-riding public interest. We noted that we could not provethis in this specific case because we were denied access to thedata, but we drew attention to the fact that the total number ofpatients in the main clinical studies of orlistat differed accordingto the source of the information: published reports, the EMA’swebsite, and the website of the US Food and DrugAdministration.The ombudsman indicated that we had established an over-ridingpublic interest, but he did not take a definitive stance on whetheran over-riding public interest existed because this questionneeded answering only if disclosure undermined commercialinterests. He asked the EMA to justify its position that therewasn’t an over-riding public interest, but the EMA avoidedreplying by saying that we had not given evidence of theexistence of such an interest. We believe that we had.Furthermore, the EMA’s argument was irrelevant. A suspectasked for his alibi on the day of the crime doesn’t get off thehook by asking for someone else’s alibi.

Administrative burdenAccording to the EMA, the redaction of (unspecified) “personaldata” would cause the EMA a disproportionate effort that woulddivert attention from its core business, as it would meanredacting 300 000-400 000 pages. This was surprising. TheDanish Drug Agency had not seen the workload as a problemwhen it granted us access to the reports for the anti-obesity drug

sibutramine, which was locally approved in Denmark. The 56study reports we received comprised 14 309 pages in total, andwe requested only 15 study reports from the EMA (the pivotalstudies described in the European Public Assessment Reports(EPARs) on rimonabant and orlistat). The ombudsman declaredthat the EMA had overestimated the administrative burdeninvolved.

Worthlessness of data after redactionThe EMA argued that, “as a result of the redaction exercise, thedocuments will be deprived of all the relevant information andthe remaining parts of them will be worthless for the interest ofthe complainant.”From what we know of clinical trial reports and protocols itstruck us as odd that they would contain so much personal datathat the documents became worthless. The ombudsman notedthat the requested documents do not identify patients by namebut by their identification and test centre numbers, and heconcluded that the only personal data are those identifying thestudy authors and principal investigators and to redact thisinformation would be quick and easy.The EMA also remarked that a possible future release of theassessment reports of the EMA’s Committee for MedicinalProducts for Human Use and the (co)rapporteur assessmentreports “could satisfy the request of the complainants.” Thesereports were not available and they would have been worthlessto us because they are merely summaries used for regulatorydecisions.

MaladministrationThe EMA was completely resistant to our arguments and thosefrom the ombudsman. However, after the ombudsman accusedthe EMA of maladministration in a press release on 7 June2010,15 three years after our request, the EMA reversed itsstance. The EMA now gave the impression that it had favoureddisclosure all the time, agreed with the ombudsman’s reasoning,and noted that the same principles would be applied for futurerequests for access but that it would consider the need to redactpart of the documents.The EMA’s last letter was unclear: “The Agency will do itsutmost to implement its decision as quickly as possible, in anycase within the next 3 months at the latest. The Agency willkeep the European Ombudsman promptly informed of the exactimplementation date.”It was not clear whether the three months was the deadline forsending the reports to us, for implementing its new policy, orboth. We received the data we requested from the EMA on 1February 2011, which in some cases included individual patientdata in anonymised format, identified by individual and testcentre numbers.

Concluding remarksAccording to the EMA’s responses to the ombudsman, the EMAput protecting the profits of the drug companies ahead ofprotecting the lives and welfare of patients. Moreover the EMA'sposition is inconsistent because it resisted requests to give accessto trial data on adult patients while providing access to data onpaediatric trials, in accordance with EU legislation.16 TheDeclaration of Helsinki gives authors the duty to make publiclyavailable the results of their research on humans.17 Thedeclaration also says that, “Medical research involving humansubjects must . . . be based on a thorough knowledge of the

Reprints: http://journals.bmj.com/cgi/reprintform Subscribe: http://resources.bmj.com/bmj/subscribers/how-to-subscribe

BMJ 2011;342:d2686 doi: 10.1136/bmj.d2686 Page 3 of 4

ANALYSIS

Page 66: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

scientific literature.” If the knowledge base is incomplete,patients may suffer and cannot give fully informed consent9 andresearch resources are wasted. The EMA should be promotingaccess to full information that will aid rational decision making,not impede it.Our case sets an important precedent. On 30 November 2010the EMA declared it would widen public access to documents,including trial reports and protocols.18 We recommend that theFDA and other drug regulatory agencies should follow suit.Access should be prompt—for example, within three monthsof the regulator’s decision—and documents should be providedin a useful format. Drug agencies should get rid of the hugepaper mountains and require electronic submissions from thedrug companies, including the raw data, which should also bemade publicly available.

Contributors and sources: PCG wrote the first draft, both authorscommented on it and are guarantors.Funding: Sygekassernes Helsefond and Rigshospitalet, University ofCopenhagen.Competing interests: All authors have completed the unified competinginterest form at www.icmje.org/coi_disclosure.pdf (available on requestfrom the corresponding author) and declare no support from anyorganisation for the submitted work; no financial relationships with anyorganisation that might have an interest in the submitted work in theprevious three years; and no other relationships or activities that couldappear to have influenced the submitted work.All documents in this case (133 pages) are available at www.cochrane.dk/research/EMA, together with a comprehensive 26 page report of thecase including 54 references.Provenance and peer review: Not commissioned; externally peerreviewed.

1 Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. Selective publication ofantidepressant trials and its influence on apparent efficacy.NEngl J Med 2008;358:252-60.

2 Melander H, Ahlqvist-Rastad J, Meijer G, Beermann B. Evidence b(i)asedmedicine—selective reporting from studies sponsored by pharmaceutical industry: reviewof studies in new drug applications. BMJ 2003;326:1171-3.

3 Rising K, Bacchetti P, Bero L. Reporting bias in drug trials submitted to the Food andDrug Administration: a review of publication and presentation. PLoS Med 2008;5:e217.

4 Healy D. Let them eat Prozac. New York University Press, 2004.5 Chan A-W, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG. Empirical evidence for

selective reporting of outcomes in randomized trials: comparison of protocols to publishedarticles. JAMA 2004;291:2457-65.

6 Vedula SS, Bero L, Scherer RW, Dickersin K. Outcome reporting in industry-sponsoredtrials of gabapentin for off-label use. N Engl J Med 2009;361:1963-71.

7 Lenzer J. FDA is incapable of protecting US “against another Vioxx.” BMJ 2004;329:1253.8 Moore TJ. Deadly medicine: why tens of thousands of heart patients died in America’s

worst drug disaster. Simon & Schuster, 1995.9 Cowley AJ, Skene A, Stainer K, Hampton JR. The effect of lorcainide on arrhythmias and

survival in patients with acute myocardial infarction: an example of publication bias. Int JCardiol 1993;40:161-6.

10 Padwal R, Li SK, Lau DCW. Long-term pharmacotherapy for obesity and overweight.Cochrane Database Syst Rev 2003;4:CD004094 (updated 2009).

11 Mundy A. Dispensing with the truth. St Martin’s Press, 2001.12 Food and Drug Administration. FDA briefing document. NDA 21-888. Zimulti (rimonabant)

tablets, 20 mg. 2007. www.fda.gov/ohrms/dockets/ac/07/briefing/2007-4306b1-fda-backgrounder.pdf.

13 Regulation (EC) No 1049/2001 of the European Parliament and of the Council of 30 May2001 regarding public access to European Parliament, Council and Commissiondocuments.Official Journal of the European Communities 2001;L145:43-8. www.europarl.europa.eu/RegData/PDF/r1049_en.pdf.

14 Rules for the implementation of Regulation (EC) No 1049/2001on access to EMEAdocuments. EMEA/MB/203359/2006 Rev 1 Adopted. Management board meeting 19December 2006. www.ema.europa.eu/docs/en_GB/document_library/Other/2010/02/WC500070829.pdf.

15 Diamandouros PN. Ombudsman: European Medicines Agency should disclose clinicalreports on anti-obesity drugs. European ombudsman press release, 7 June 2010. www.ombudsman.europa.eu/press/release.faces/en/4940/html.bookmark.

16 Choonara I. Regulation of drugs for children in Europe. BMJ 2007;335:1221-2.17 World Medical Association. Declaration of Helsinki—ethical principles for medical research

involving human subjects. 2008. www.wma.net/en/30publications/10policies/b3/index.html.

18 EMA. European Medicines Agency widens public access to documents. Press release,30 November 2010. www.ema.europa.eu/ema/index.jsp?curl=pages/news_and_events/news/2010/11/news_detail_001158.jsp&mid=WC0b01ac058004d5c1&murl=menus/news_and_events/news_and_events.jsp.

Accepted: 7 March 2011

Cite this as: BMJ 2011;342:d2686

Reprints: http://journals.bmj.com/cgi/reprintform Subscribe: http://resources.bmj.com/bmj/subscribers/how-to-subscribe

BMJ 2011;342:d2686 doi: 10.1136/bmj.d2686 Page 4 of 4

ANALYSIS

Page 67: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

67

DISCUSSION OF PAPER 5

In the fifth project, we described how we managed to get access to unpublished trial reports

and protocols at the European Medicines Agency (EMA). In June 2007, we applied for

access to the clinical study reports and their corresponding protocols of the placebo

controlled trials of four anti-obesity pills, rimonabant, orlistat, sibutramine and

amfepramone. The first two were submitted by the manufacturers to EMA for obtaining

marketing approval in the European Union and the last two to the Danish Medicines Agency

(DMA). Shortly after our application, the DMA told us that amfepramone was approved in

the 1960’s and that DMA did not have the requested documents on this drug. However, we

still hoped to get access to the three other drugs, as anti-obesity drugs are controversial.

The effect on weight loss in the published trials is small (66) and the harms are substantial.

People have died from cardiac and pulmonary complications or have experienced

psychiatric disturbances including suicide attempts, and most of the drugs have been pulled

off the market for safety reasons.

In August 2009, the DMA provided us with the trial reports on sibutramine, but not the

protocols, as they were not part of the submission to DMA. In February 2011, we received

both trial reports and protocols of orlistat and rimonabant from EMA.

EMA and DMA had a very different view on our application and handled it differently. EMA

rejected our request immediately whereas DMA was very positive. DMA granted us access

to the documents 12 month after our application was submitted, but EMA did that 36 month

after our submission. In both cases, a third party got involved. Abbott, the producer of

sibutramine, appealed DMA’s decision on releasing the documents to the Danish Ministry of

Health and we appealed EMA’s decision to the European Ombudsman.

In both cases, the primary argument for not allowing us access to the documents was that it

would undermine the protection of commercial interests, as the documents represented the

full details of the clinical development programme. Competitors could use them as a basis

for developing similar drugs and gather valuable information on the development strategy of

the company. We explained to EMA that the documents are based on well-known principles

that can be applied to any drug trial; that the clinical study reports describe the clinical

effects of drugs; and that nothing in EMA’s guidelines for preparation of such reports

indicates that any information included in them can be considered a trade secret. We also

noted that unpublished trial data are generally less positive than published ones.

The manufacturer of sibutramine was assisted by a professor from the University of

Copenhagen who came up with arguments against the release of the documents.

Page 68: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

68

Apparently, he had knowledge about the information in the documents and argued that

Neurosearch could use the information in their development of a similar product,

tesofensine. This professor was a member of the tesofensine advisory board. He was also

the primary investigator of a large tesofensine trial and the primary investigator of a large

sibutramine trial, and it is common that drug companies allow doctors to be members of

advisory boards, consultants and investigators for competing companies. I really find this a

strange clash of competing interests and wonder what precautions the companies take to

protect their interests.

We received the documents from both EMA and DMA, but the outcome was different. We

only received the core trial reports from DMA without appendices, as these were not

available. EMA provided us with the full trial reports including appendices. The DMA had

redacted a substantially amount of ‘personal data’. This was patient numbers and

descriptions of side effects. The reason for this was that DMA claimed that the patient

numbers could be used to identify the patients and the descriptions of side effects were

similar to health records. Also, investigator names in the documents from DMA were

redacted, but only because they were regarded as commercially confidential. The European

Ombudsman concluded that the only “personal data” are those identifying the study authors

and principal investigators. EMA only redacted the curriculum vitae of the investigators and

therefore, we received documents with names and addresses of investigators and

individual patient data listings, including detailed descriptions of side effects before any

coding had taken place.

Further DMA sent the documents in binders whereas EMA sent pdf files. The latter makes

electronic searches possible and navigation through the many pages easier. However, it

can be improved. The preferable way must be a database with individual anonymised

patient data that can be imported to standard statistical software. Independent researchers

would then be able to make their own analyses and conclusions.

Page 69: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

69

CONCLUSIONS

Systematic reviews, randomised trials and abstracts are key resources for evidence-based

medicine, but they should always be read with caution because important information may

have been unreported for various reasons.

In the two first studies, we found that industry-supported meta-analyses are less

transparent, describe fewer limitations in the included trials and have more favourable

conclusions than other meta-analyses. Ignorance due to lack of expertise or unwillingness

due to commercial interests seem to be the best possible explanations for the low

transparency and methodological quality in industry-supported meta-analyses, whereas

there is much evidence in support of commercial interests being the main reason for biased

conclusions. Unpublished data and selective outcome reporting is probably the greatest

threat to systematic reviews and meta-analyses.

In the third study of abstracts on rofecoxib, we found that the basic principle of balanced

reporting of benefits and harms was generally violated, although the harms were equally

predictable as the benefits from the mechanism of action of the drug. Before the withdrawal

of rofecoxib, abstracts mostly reported on benefits and were in favour of rofecoxib. Only

after the drug had been withdrawn, the harms came in focus, and only due to court trials

unpublished data were revealed that showed that the harms were evident early on, long

before the drug was withdrawn.

In the fourth study, we found that various imputation methods give different results. In most

trials, imputation is obligatory because the complete case analysis is biased. The general

perception that LOCF is a conservative analysis is wrong. It overestimates the precision and

sometimes also the treatment effect. Therefore the more reliable MI should be used, but to

get the most reliable result, trial investigators should do their utmost to collect outcome data

on all trial participants and also report the results.

In the fifth paper, we conclude that there is something fundamentally wrong with our

priorities in health care if commercial success of the drug industry is dependent on

withholding data that are needed for rational decision-making for doctors and patients.

Page 70: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

70

IMPLICATIONS

Healthcare professionals, politicians and patients should realize that commercial interests

play an important role in how healthcare information is assembled and disseminated. They

should also prioritise independent and transparent research, which, for example is one of

the aims of the Cochrane Collaboration.

All research should be read with caution. Systematic reviews and meta-analyses should

report details of concealment of allocation, blinding, inclusion and exclusion criteria for

trials, search strategies, estimated effects, and review authors should assess the risk of

bias in each included trial and should also have their protocols published to allow readers to

judge the reliability of the reviews. This has been specified in the PRISMA statement for

reporting systematic reviews and meta-analyses (44).

Plans for handling missing data in trials should not only be pre-specified in the protocol, but

should also be based on evidence. Authors of trial protocols should discuss whether

imputation is needed and if so also justify the method. The SPIRIT initiative (Standard

Protocols Items for Randomised Trials) is currently preparing a guideline for writing trial

protocols, which needs to discuss imputations.

Our access to clinical trial protocols and study reports sets an important precedent and on

30 November 2010, EMA declared it would widen public access to documents. We

recommend FDA and other drug agencies to follow suit. The access should be quick (e.g. 3

months) and in a useful format. Drug agencies should get rid of the huge paper loads and

require electronic submissions from the drug companies, including the raw data, which

should also be made publicly available.

Page 71: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

71

RECCOMMENDATIONS FOR FUTURE RESEARCH

Research on systematic reviews should focus on the implications of having access to

unpublished trial reports. One study suggested it is not worth the effort to search for

unpublished trials because they have poor or unclear methodological quality (70), but other

studies clearly show that including unpublished trial reports in meta-analyses will impact on

their results (51,71,72).

The increased focus on harms and the attempts to improve the quality of abstracts will

probably result in a more balanced reporting of benefits and harms. One way to study this is

to compare the reporting of benefits and harms in abstracts on new dugs with abstracts on

old drugs.

The increased access to unpublished study reports and detailed descriptions of harms

gives the possibility of assessing how harms are coded. The medical industry is required to

use the Medical Dictionary for Regulatory Activities (MEDDRA) to classify harms in

randomised trials. MEDDRA is being developed continually, but only little research seems

to have been done on inter-coder variation. A study of coding harms in an HIV trial found a

12% difference between 2 coders’ decision on the preferred term for harms (73). A study on

the reliability of reading electrocardiograms found much disagreement between

cardiologists. Four cardiologists read 537 ECGs and 20% of were found compatible with

coronary heart disease by at least 1 observer, but all readers only agreed on 4% (74).

Therefore it is likely that the inter-coder variation will increase with more coders, but it needs

to be studied.

Page 72: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

72

REFERENCES

1. Stinson ER, Mueller DA. Survey of health professionals’ information habits and needs. Conducted through personal interviews. JAMA 1980;243:140-143.

2. Fraser AG, Dunstan FD. On the impossibility of being expert. BMJ 2010;341:c6815.

3. Sackett D. Evidence-based medicine: How to practice and teach EBM. 2nd ed. Churchill Livingstone 2000.

4. Blair A. Information overload, the early years - The Boston Globe. http://www.boston.com/bostonglobe/ideas/articles/2010/11/28/information_overload_the_early_years/?page=full (accessed 29 Nov 2010).

5. Oxford centre for Evidence-based Medicine - Levels of Evidence. http://www.cebm.net/?o=1025 (accessed 29 Nov 2010).

6. Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, et al. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008;336:924-926.

7. Haynes RB. Of studies, syntheses, synopses, summaries, and systems: the »5S« evolution of information services for evidence-based healthcare decisions. Evid Based Med 2006;11:162-164.

8. Higgins J, Cochrane Collaboration. Cochrane handbook for systematic reviews of interventions. Wiley-Blackwell 2008.

9. Guyatt GH, Oxman AD, Kunz R, Vist GE, Falck-Ytter Y, Schünemann HJ. What is »quality of evidence« and why is it important to clinicians? BMJ 2008;336:995-998.

10. Chan A-W, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA 2004;291:2457-2465.

11. Chan A-W, Altman DG. Identifying outcome reporting bias in randomised trials on PubMed: review of publications and survey of authors. BMJ 2005;330:753.

12. Hopewell S, Loudon K, Clarke MJ, Oxman AD, Dickersin K. Publication bias in clinical trials due to statistical significance or direction of trial results. Cochrane Database Syst Rev 2009;(1):MR000006.

13. Scherer RW, Langenberg P, von Elm E. Full publication of results initially presented in abstracts. Cochrane Database Syst Rev 2007;(2):MR000005.

14. Making sense of non-financial competing interests. PLoS Med 2008;5:e199.

Page 73: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

73

15. Bekelman JE, Li Y, Gross CP. Scope and impact of financial conflicts of interest in biomedical research: a systematic review. JAMA 2003;289:454-465.

16. Lexchin J, Bero LA, Djulbegovic B, Clark O. Pharmaceutical industry sponsorship and research outcome and quality: systematic review. BMJ 2003;326:1167-1170.

17. Sismondo S. Pharmaceutical company funding and its consequences: a qualitative systematic review. Contemp Clin Trials 2008;29:109-113.

18. Schott G, Pachl H, Limbach U, Gundert-Remy U, Ludwig W-D, Lieb K. The financing of drug trials by pharmaceutical companies and its consequences. Part 1: a qualitative, systematic review of the literature on possible influences on the findings, protocols, and quality of drug trials. Dtsch Arztebl Int 2010;107:279-285.

19. Barry HC, Ebell MH, Shaughnessy AF, Slawson DC, Nietzke F. Family physicians’ use of medical abstracts to guide decision making: style or substance? J Am Board Fam Pract 2001;14:437-442.

20. Hopewell S, Clarke M, Moher D, Wager E, Middleton P, Altman DG, et al. CONSORT for reporting randomised trials in journal and conference abstracts. The Lancet 2008;371:281-283.

21. Hollis S, Campbell F. What is meant by intention to treat analysis? Survey of published randomised controlled trials. BMJ 1999;319:670-674.

22. Unnebrink K, Windeler J. Intention-to-treat: methods for dealing with missing values in clinical trials of progressively deteriorating diseases. Stat Med 2001;20:3931-3946.

23. Wright CC, Sim J. Intention-to-treat approach to data from randomized controlled trials: a sensitivity analysis. J Clin Epidemiol 2003;56:833-842.

24. Elobeid MA, Padilla MA, McVie T, Thomas O, Brock DW, Musser B, et al. Missing data in randomized clinical trials for weight loss: scope of the problem, state of the field, and performance of statistical methods. PLoS ONE 2009;4:e6624.

25. Methods for voluntary weight loss and control. NIH Technology Assessment Conference Panel. Consensus Development Conference, 30 March to 1 April 1992. Ann. Intern. Med 1993;119:764-770.

26. Beunckens C, Molenberghs G, Kenward MG. Direct likelihood analysis versus simple forms of imputation for missing data in randomized clinical trials. Clin Trials 2005;2:379-386.

27. Molenberghs G, Kenward M. Missing data in clinical studies. Wiley 2007.

28. Gadbury GL, Coffey CS, Allison DB. Modern statistical methods for handling missing repeated measurements in obesity trial data: beyond LOCF. Obes Rev 2003;4:175-184.

Page 74: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

74

29. Donders ART, van der Heijden GJMG, Stijnen T, Moons KGM. Review: a gentle introduction to imputation of missing values. J Clin Epidemiol 2006;59:1087-1091.

30. Egger M. Systematic reviews in health care: meta-analysis in context. 2nd ed. BMJ 2001.

31. Shea BJ, Bouter LM, Peterson J, Boers M, Andersson N, Ortiz Z, et al. External validation of a measurement tool to assess systematic reviews (AMSTAR). PLoS ONE 2007;2:e1350.

32. Shea BJ, Hamel C, Wells GA, Bouter LM, Kristjansson E, Grimshaw J, et al. AMSTAR is a reliable and valid measurement tool to assess the methodological quality of systematic reviews. J Clin Epidemiol 2009;62:1013-1020.

33. Devereaux PJ, Choi PT-L, El-Dika S, Bhandari M, Montori VM, Schünemann HJ, et al. An observational study found that authors of randomized controlled trials frequently use concealment of randomization and blinding, despite the failure to report these methods. J Clin Epidemiol 2004;57:1232-1236.

34. Hróbjartsson A, Pildal J, Chan A-W, Haahr MT, Altman DG, Gøtzsche PC. Reporting on blinding in trial protocols and corresponding publications was often inadequate but rarely contradictory. J Clin Epidemiol 2009;62:967-973.

35. Liberati A, Himel HN, Chalmers TC. A quality assessment of randomized control trials of primary treatment of breast cancer. J Clin Oncol 1986;4:942-951.

36. Deeks JJ. Word limits best explain failings of industry supported meta-analyses. BMJ 2006;333:1021.

37. Jadad AR, Cook DJ, Jones A, Klassen TP, Tugwell P, Moher M, et al. Methodology and reports of systematic reviews and meta-analyses: a comparison of Cochrane reviews with articles published in paper-based journals. JAMA 1998;280:278-280.

38. Jadad AR, Moher M, Browman GP, Booker L, Sigouin C, Fuentes M, et al. Systematic reviews and meta-analyses on treatment of asthma: critical evaluation. BMJ 2000;320:537-540.

39. Lundh A, Knijnenburg SL, Jørgensen AW, van Dalen EC, Kremer LCM. Quality of systematic reviews in pediatric oncology – a systematic review. Cancer Treat Rev 2009;35:645-652.

40. Moseley AM, Elkins MR, Herbert RD, Maher CG, Sherrington C. Cochrane reviews used more rigorous methods than non-Cochrane reviews: survey of systematic reviews in physiotherapy. J Clin Epidemiol 2009;62:1021-1030.

41. Boluyt N, van der Lee JH, Moyer VA, Brand PLP, Offringa M. State of the evidence on acute asthma management in children: a critical appraisal of systematic reviews. Pediatrics 2007;120:1334-1343.

Page 75: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

75

42. Delaney A, Bagshaw SM, Ferland A, Laupland K, Manns B, Doig C. The quality of reports of critical care meta-analyses in the Cochrane Database of Systematic Reviews: an independent appraisal. Crit Care Med 2007;35:589-594.

43. Collier A, Heilig L, Schilling L, Williams H, Dellavalle RP. Cochrane Skin Group systematic reviews are more methodologically rigorous than other systematic reviews in dermatology. Br J Dermatol 2006;155:1230-1235.

44. Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JPA, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. Ann Intern Med 2009;151:W65-94.

45. Yank V, Rennie D, Bero LA. Financial ties and concordance between results and conclusions in meta-analyses: retrospective cohort study. BMJ 2007;335:1202-1205.

46. Barnes DE, Bero LA. Why review articles on the health effects of passive smoking reach different conclusions. JAMA 1998;279:1566-1570.

47. Nieto A, Mazon A, Pamies R, Linana JJ, Lanuza A, Jiménez FO, et al. Adverse effects of inhaled corticosteroids in funded and nonfunded studies. Arch Intern Med 2007;167:2047-2053.

48. Katz KA, Karlawish JH, Chiang DS, Bognet RA, Propert KJ, Margolis DJ. Prevalence and factors associated with use of placebo control groups in randomized controlled trials in psoriasis: a cross-sectional study. J Am Acad Dermatol 2006;55:814-822.

49. Gøtzsche PC, Hróbjartsson A, Maric K, Tendal B. Data extraction errors in meta-analyses that use standardized mean differences. JAMA 2007;298:430-437.

50. Tendal B, Higgins JPT, Jüni P, Hróbjartsson A, Trelle S, Nüesch E, et al. Disagreements in meta-analyses using outcomes measured on continuous or rating scales: observer agreement study. BMJ 2009;339:b3128.

51. Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. Selective publication of antidepressant trials and its influence on apparent efficacy. N Engl J Med 2008;358:252-260.

52. Jefferson T, Doshi P, Thompson M, Heneghan C. Ensuring safe and effective drugs: who can do what it takes? BMJ 2011;342:c7258.

53. EMA. Points to consider on applications with 1. meta-analyses; 2. one pivotal study. 2001 May 31. www.ema.europa.eu/pdfs/human/ewp/233099fen.pdf (accessed 8 Mar 2011)

54. Senn SJ. The many modes of meta. Drug Information journal 2000;34:535-549.

55. Ioannidis JP, Lau J. Completeness of safety reporting in randomized trials: an evaluation of 7 medical areas. JAMA 2001;285:437-443.

Page 76: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

76

56. Hopewell S, Wolfenden L, Clarke M. Reporting of adverse events in systematic reviews can be improved: survey results. J Clin Epidemiol 2008;61:597-602.

57. McIntosh HM, Woolacott NF, Bagnall A-M. Assessing harmful effects in systematic reviews. BMC Med Res Methodol 2004;4:19.

58. Craig D, McDaid C, Fonseca T, Stock C, Duffy S, Woolacott N. Are adverse effects incorporated in economic models? A survey of current practice. Int J Technol Assess Health Care 2010;26:323-329.

59. Brunoni AR, Tadini L, Fregni F. Changes in clinical trials methodology over time: a systematic review of six decades of research in psychopharmacology. PLoS ONE 2010;5:e9479.

60. National Information Standards Organization (U.S.). Guidelines for abstracts: an American national standard. Bethesda Md. NISO Press 1997.

61. Kim PS, Reicin AS. Rofecoxib, Merck, and the FDA. N Engl J Med 2004;351:2875-2878.

62. Bresalier RS, Sandler RS, Quan H, Bolognese JA, Oxenius B, Horgan K, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. N Engl J Med 2005;352:1092-1102.

63. Weaver AL, Messner RP, Storms WW, Polis AB, Najarian DK, Petruschke RA, et al. Treatment of patients with osteoarthritis with rofecoxib compared with nabumetone. J Clin Rheumatol 2006;12:17-25.

64. Schafer JL. Multiple imputation: a primer. Stat Methods Med Res 1999;8:3-15.

65. Guideline on Missing Data in Confirmatory Clinical Trials. 10 Jul 2002. http://www.ema.europa.eu/ema/pages/includes/document/open_document.jsp?webContentId=WC500096793 (accessed 24 Mar 2011)

66. Padwal RS, Rucker D, Li SK, Curioni C, Lau DCW. Long-term pharmacotherapy for obesity and overweight. Cochrane Database of Systematic Reviews 2003, Issue 4. Art. No.: CD004094. DOI: 10.1002/14651858.CD004094.pub2.

67. Fabricatore AN, Wadden TA, Moore RH, Butryn ML, Gravallese EA, Erondu NE, et al. Attrition from randomized controlled trials of pharmacological weight loss agents: a systematic review and analysis. Obes Rev 200910:333-341.

68. Higgins JPT, White IR, Wood AM. Imputation methods for missing outcome data in meta-analysis of clinical trials. Clin Trials 2008;5:225-239.

69. Rock CL, Flatt SW, Sherwood NE, Karanja N, Pakiz B, Thomson CA. Effect of a free prepared meal and incentivized weight loss program on weight loss and weight loss maintenance in obese and overweight women: a randomized controlled trial. JAMA 2010;304:1803-1810.

Page 77: ROBUSTNESS OF RESULTS AND CONCLUSIONS IN …nordic.cochrane.org/sites/nordic.cochrane.org/files/public/uploads/theses/PhD_Thesis... · Therefore, a critical appraisal of the research

77

70. van Driel ML, De Sutter A, De Maeseneer J, Christiaens T. Searching for unpublished trials in Cochrane reviews may not be worth the effort. J Clin Epidemiol 2009;62:838-844.e3.

71. Melander H, Ahlqvist-Rastad J, Meijer G, Beermann B. Evidence b(i)ased medicine--selective reporting from studies sponsored by pharmaceutical industry: review of studies in new drug applications. BMJ 2003;326:1171-1173.

72. Rising K, Bacchetti P, Bero L. Reporting bias in drug trials submitted to the Food and Drug Administration: review of publication and presentation. PLoS Med 2008;5:e217.

73. Tonéatti C, Saïdi Y, Meiffrédy V, Tangre P, Harel M, Eliette V, et al. Experience using MedDRA for global events coding in HIV clinical trials. Contemp Clin Trials 2006;27:13-22.

74. Higgins IT, Cochrane AL, Thomas AJ. Epidemiological studies of coronary disease. Br J Prev Soc Med 1963;17:153-165.