chelsea ho research project report

Page 1: Chelsea Ho Research Project Report

Promoting clinical decision-making based on sound evidence generated by

high-quality systematic reviews: the roles of the Cochrane Collaboration

and a standardised appraisal instrument

Chelsea Ho

1.0 Abstract

Background: Recent studies suggest that systematic reviews may have been generating

imprecise and biased results due to poor methodological conduct. Furthermore, there is growing

concern that such unreliable findings are being implemented into the clinical setting, due to the lack

of an appropriate appraisal tool to be used by practitioners in making the final judgement as to whether

these findings can be trusted. This has prompted widespread awareness of the need to pioneer a novel

approach that would effectively address these two issues, because previous initiatives have seen

limited success thus far.

Objective: By investigating controversies and recent advances pertaining to systematic

review methodology and critical appraisal techniques, this literature review aims to recommend ways

in which stringent, evidence-informed health decision-making can be achieved.

Methods: MEDLINE, EMBASE and Cochrane Collaboration databases were searched to

review the literature. AMSTAR (a measurement tool for the assessment of multiple systematic

reviews) appraisals were conducted on two pairs of Cochrane and paper-based reviews. Results were

compared and differences were explored. Limitations, extensions and recommendations were

discussed.

Findings: We anticipate that poor systematic review methodology and the lack of a

standardised, comprehensive appraisal tool will continue to threaten appropriate clinical

decision-making. Cochrane reviews were generally found to be superior in quality compared to

paper-based reviews. The AMSTAR instrument appears to be a user-friendly, reliable and efficient

appraisal tool.

Recommendations: To ensure that future clinical decisions are based only on sound evidence

generated by high-quality systematic reviews, we propose a two-tier approach: 1) to achieve high

methodological rigour, prospective review authors are encouraged to register under the Cochrane

Collaboration, and 2) a standardised instrument for the appraisal of published reviews (e.g.

AMSTAR) needs to be available to practitioners, to encourage routine appraisals of these reviews

prior to implementation of their findings into clinical practice.

2.0 Introduction

2.1 The rise of systematic reviews

With an ever-increasing number of studies being published in the health sciences, it is almost

impossible for clinicians to keep abreast of findings related to their fields1. The need for

healthcare professionals to stay updated has prompted the novel science of research synthesis, with

Archibald Cochrane providing the seminal framework for the conduct of a systematic review2. The

purpose of a systematic review is to summarise the outcomes of similar studies so that the

practitioner can determine the ‘bottom line’ about the most likely effect of an exposure factor1, 3. This

is achieved by developing thorough protocols and search methods a priori, followed by synthesising

all relevant findings in a precise and unbiased manner. A meta-analysis component may be involved

whereby statistical techniques are used to incorporate quantitative data from various studies into a

single summary result1. The methodology of a typical systematic review involves a multi-step process

as shown in figure 2.1.1.
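
To make the idea of a single summary result concrete, the sketch below pools study estimates using inverse-variance (fixed-effect) weighting, one common meta-analytic technique. It is an illustration only: this report does not state which pooling method any of the cited reviews used, and the function name `pool_fixed_effect` and the three effect estimates are hypothetical.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted (fixed-effect) summary of study estimates.

    Returns the pooled estimate, its standard error and a 95% confidence
    interval based on the normal approximation.
    """
    # More precise studies (smaller standard error) receive larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci


# Hypothetical effect estimates (e.g. log risk ratios) and standard errors.
estimates = [0.10, 0.30, 0.20]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se, ci = pool_fixed_effect(estimates, std_errors)
print(f"pooled = {pooled:.3f}, SE = {pooled_se:.3f}, "
      f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

A random-effects model would be a reasonable alternative when the included studies are heterogeneous; the fixed-effect form is shown here only because it is the shortest to write down.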

Figure 2.1.1: Outline of the processes required for conducting systematic reviews4

2.2 Ensuring high-quality systematic reviews: a revised approach is paramount

Since the 1990s, the number of systematic reviews being conducted each year has been on the

rise; by 2010, about four thousand articles were being published annually5. Of recent concern,

however, is the very real possibility that the majority of these reviews do not meet appropriate

methodological standards. Shea et al.6 define methodological quality as the extent to which the

design of a systematic review is able to yield unbiased results, one key indicator being the ability for

these findings to be consistently replicated by other independent parties. Poor methodological quality

increases the likelihood of generating false summary results, which in turn could threaten health

outcomes of potential patients should these findings be implemented in the clinical setting.

Such concerns have been substantiated by several studies which suggest that the

methodological quality of most systematic reviews is disappointingly poor. Sacks et al.7 demonstrated

that only 28% of the reviews that they randomly selected and assessed were of adequate quality, while

a survey (using the Oxman and Guyatt index) of 80 reviews led by Jadad and McQuay8 reported a

median score of 4 out of 7, indicating a major flaw. These poor findings are quite ironic given that

there are currently more than 24 tools available to either 1) guide authors in conducting high-quality

systematic reviews and meta-analyses, or 2) assist clinicians in appraising and double-checking the

methodological quality of these reviews, once they are published. A few notable examples include

the STROBE and CONSORT statements, MOOSE guidelines, Sacks’ Checklist and a 9-item scale

by Oxman and Guyatt9. Shea et al.6 suggest that perhaps the continual proliferation of such checklists

might have caused confusion among researchers and practitioners, which in turn could have

discouraged them from thoroughly adhering to methodological guidelines or conducting routine

appraisals. Some of these tools were also found to be outdated, suboptimal and complicated10, thereby

causing further discouragement.

As indicated by the above studies, the accuracy of findings of current systematic reviews

cannot simply be trusted. Because the 24 tools appear to have achieved only limited success in promoting

improved review methodology or assisting clinicians through the appraisal process, a different

strategy is needed to ultimately ensure that only accurate findings generated from high-quality

reviews are transferred into clinical practice.

2.3 Research objectives

In consideration of the ongoing debate surrounding the methodological rigour and accuracy

of findings in today’s systematic reviews, as well as the appropriateness of various appraisal

instruments, this literature review therefore aims to evaluate historical contributions and recent

advances related to these fields. Two key areas of investigation will include the role of the Cochrane

Collaboration in facilitating improved review methodology and accuracy of findings, as well as the

suitability of the recently developed AMSTAR as an appraisal tool for clinicians. Overall,

comparing and critiquing the effectiveness of these two initiatives will enable

recommendations to be formulated which may ensure that future clinical decision-making is based

only on unbiased evidence extracted from high-quality systematic reviews.

2.4 Methodology and search strategy

A preliminary investigation into the statistical challenges related to healthcare was conducted.

Using the Griffith Library Catalogue, Google Scholar and PubMed, a general search for articles

dating from 1985 onwards (table 2.4.1) was performed. These sites enabled access to a range of

electronic databases, including MEDLINE, EMBASE, BioMed Central, ProQuest and the Cochrane

Collaboration. Grey literature such as newspaper articles, textbooks, conference proceedings and

government publications were also consulted. Throughout the literature review, the EndNote

Referencing Tool (version X7.4) was used to organise the relevant sources.

Due to their increasing relevance in modern healthcare research, we chose to focus on

controversies surrounding systematic reviews. In particular, we explored how methodological quality

of systematic reviews and available appraisal techniques would influence the application of findings

into practice. Around this time, the research question and objectives were established, refined and

finalised prior to conducting AMSTAR appraisals on selected reviews.

As an extension, selected features of the AMSTAR instrument were explored. Firstly, a trial

AMSTAR appraisal was performed by two independent assessors on a systematic review by

Mickenautsch et al.11. Results (section 5.2) were compared against those produced by Sequeira-Byron

et al.12, who used the same AMSTAR instrument to assess this particular systematic review. Once it

appeared that AMSTAR was a user-friendly checklist, even for a novice user, we

proceeded to select four candidate systematic reviews using the same databases listed earlier in this

section. These four articles were grouped into two pairs, with each pair containing a Cochrane and

paper-based review of similar research questions and publication dates (table 3.3.1). Thorough

AMSTAR appraisals were subsequently performed for these reviews. Findings and limitations were

discussed, and final recommendations were made.

Table 2.4.1: Key terms of search strategy used throughout literature review on databases

3.0 Discussion

3.1 Cochrane Collaboration

In 1993, in response to the sudden influx of systematic reviews and growing concerns over their

methodological quality, a renowned international organisation called the Cochrane Collaboration was

founded13. It is an initiative which strives to promote the high-quality conduct of systematic

reviews, in order to produce reliable evidence that will empower clinicians to make informed

decisions13. It achieves this by creating various databases and online discussion spaces which

facilitate continual accessibility, transparency and revision of all of its registered systematic reviews

thus far. It has also implemented an elaborate peer review system involving several stages, including

the critiquing of a priori protocols as well as the continual evaluation of methodology by content

experts and potential users after publication14. Furthermore, the organisation provides detailed and

explicit instructions related to methodology, as outlined in the Cochrane Handbook for Systematic

Reviews of Interventions13. All these features, when used in conjunction with the organisation’s

Review Manager Software (RevMan), provide a solid framework for the conduct of high-quality

systematic reviews. While some articles published in other scientific journals (“paper-based

reviews”) may likewise employ Cochrane methodology, these articles do not undergo the same

stringent peer-review process as their registered Cochrane counterparts, nor are they always able to

harness the collaborative support and advice of the online Cochrane community13.

According to most studies, Cochrane’s aim to protect the quality and relevance of its

systematic reviews is well on track. When Jadad15 reviewed a random sample of 50 reviews from

both Cochrane and paper-based journals, a majority (70%) of the 10 reviews which demonstrated

higher quality were from Cochrane. Likewise, Olsen et al.16 found minor or no problems in most

(71%) of the 53 Cochrane reviews that were randomly selected for appraisal. These findings make

sense in light of the collaboration’s stringent guidelines related to methodology and their extensive

review process as described previously.

However, a study by Shea et al.17 produced contradictory results. When appraised by the

Oxman and Guyatt index, the overall methodological quality of both Cochrane and paper-based

reviews was found to be low, with average scores of 3.35 (95% CI = 2.83, 3.87) and 3.42 (95%

CI = 2.92, 3.93) respectively, out of 7. The authors did acknowledge, however, that the unusually low

scores observed in Cochrane reviews in particular might not be representative of their actual

quality. A prime example is that while only 4% of Cochrane reviews that they surveyed explicitly

mentioned developing a protocol a priori, in reality all Cochrane reviews are required to publish a

comprehensive protocol in the collaboration’s database before conducting the actual review.

Furthermore, while Shea et al.17 found that numerous Cochrane reviews did not adequately report on

their search strategies in the final publication, this may be because the methodological rigour of all

Cochrane reviews had already been approved at the a priori protocol stage, so the search strategy

was assumed not to need repeating in the final publication. This limitation is further discussed in

section 3.5.

Although the accuracy of the study by Shea et al.17 is contested and general

consensus remains in favour of Cochrane reviews being superior in quality to paper-based

articles, the Cochrane Collaboration has nevertheless taken some measures to improve the standards

of its reviews. This has included further training and development for its reviewers, setting up a more

effective system for the post-publication peer-review process and establishing a more comprehensive

post-publication refereeing protocol17. The Cochrane Collaboration should be applauded for

pioneering the establishment of a global community dedicated to high-quality production of

systematic reviews, and is expected to play an influential role in improving these standards in the

future.

3.2 AMSTAR instrument

While the Cochrane Collaboration has garnered an international reputation as the leading

organisation which assists authors to produce high-quality systematic reviews, no single instrument

has established itself as clinicians’ preferred choice for appraising the

methodological quality of already-published reviews6. Some possible reasons for this have already

been explored in section 2.2. In addition, one should remember that some tools, such as PRISMA

(Preferred Reporting Items for Systematic Reviews and Meta-Analyses), were not designed to be

used by clinicians for assessing the methodological quality of systematic reviews in the first place. This

is because their checklists may have been optimised for other purposes, such as helping review

authors achieve transparent and complete reporting techniques18, 19.

Just as there is a role for authors to ensure high methodological rigour in their reviews, there

should likewise be a responsibility for practitioners to double-check the validity of published

systematic reviews in order to make the final call as to whether their findings can be trusted. Such an

appraisal process is important because, although adherence to Cochrane guidelines should theoretically

eliminate any likelihood of producing biased reviews, in reality this may not always be the case.

In response to the lack of a universal appraisal instrument, Shea et al.10 developed AMSTAR,

a measurement tool for the “assessment of multiple systematic reviews”. This instrument aims to be

user-friendly to the novice assessor by enabling the efficient appraisal of numerous reviews within a

short span of time, while ensuring that all key aspects of the methodology are addressed. Development

of the AMSTAR instrument involved building upon empirical evidence and integrating relevant

elements of previous tools, while incorporating expert opinion throughout the process. Specifically,

various elements from the enhanced Overview Quality Assessment Questionnaire (OQAQ) as well

as Sacks’ Checklist10 were combined. The authors determined that these two tools were the best

among the 24 options, because they are comprehensive and have already been rigorously developed,

but are merely outdated in some areas6. Additionally, the elements of language restriction, publication

bias and publication status were added to the raw list10. These were absent in previous tools but are now

deemed important due to methodological advances and the nature of publishing modern literature10.

Combination of these elements resulted in a preliminary 37-item assessment tool. This hybrid

tool was then subjected to exploratory factor analysis to identify the underlying components of the

list. From this, a 29-item assessment tool was derived which corresponded to 11 key components10.

This tool was then critiqued and examined by an international panel of clinicians, methodologists,

epidemiologists and reviewers who possessed expertise in review methodology and critical appraisal

techniques10. The result of this collaboration is an updated, comprehensive and streamlined

instrument known as AMSTAR (figure 3.2.1).

Figure 3.2.1: AMSTAR checklist10

Since its inception in 2007, careful psychometric validations of the AMSTAR instrument have

been conducted to assess its overall appropriateness as an appraisal instrument for clinicians.

Generally, assessors found that AMSTAR can be easily and efficiently applied to a large number of

systematic reviews, while thoroughly addressing the most relevant and crucial aspects of

methodological quality6. Shea et al.6 conducted an internal validation whereby the performance of

AMSTAR was compared to previously accepted tools such as OQAQ and Sacks’ Checklist. When

the instruments were applied to a random sample of 30 reviews by two separate assessors, the results

were in favour of AMSTAR being the superior candidate, as it achieved the highest standards related

to the parameters of reliability, feasibility, agreement and construct validation6. These findings were

supported by an external validation performed by the Canadian Agency for Drugs and Technologies

in Health (CADTH)20.

AMSTAR may play a crucial role in empowering clinicians to effectively appraise and

double-check the methodology of published systematic reviews. The roles of AMSTAR and the

Cochrane Collaboration are indeed distinct but may be both necessary as part of an overall strategy

to ensure that the implementation of future clinical measures is based only on accurate summary

results generated from high-quality reviews.

3.3 Selected Cochrane versus paper-based systematic reviews

As an extension of the literature review, two pairs of Cochrane versus paper-based systematic

reviews were appraised using AMSTAR, as mentioned in section 2.4. Full AMSTAR results are

documented in section 5.1. As shown in table 3.3.1, the first pair investigated the impact of Omega-3

fatty acids on the treatment of depression, while the second pair explored the effect of probiotics on

preventing upper respiratory tract infections. This extension may allow us to evaluate the applicability

of the AMSTAR instrument at an anecdotal level.

Very importantly, it should be acknowledged that the quality of each of the four reviews might

not accurately represent the general methodological rigour associated with their respective categories.

The primary aim of these appraisals is to demonstrate the applicability of AMSTAR; any comparison

of results is secondary and approached with caution. Nevertheless, from these appraisals we were

able to extract and highlight some of the differences between the selected Cochrane and paper-based

reviews, which may reflect differences in the general literature. These differences are discussed

in section 3.4.

Table 3.3.1: Selected pairs of Cochrane and paper-based systematic reviews based on similar research questions

Pair | General topic | Cochrane review | Paper-based review

1 | Omega-3 on depression | A: Appleton et al.21 | B: Bloch and Hannestad22

2 | Probiotics on preventing upper respiratory tract infections (URTIs) | H: Hao et al.23 | O: Ozen24

It is worthwhile to highlight some inconsistencies associated with the comparison of each pair.

For Pair 1, while both systematic reviews explored the same research question, there is a three-year

difference in publication date. Any contrast between them may therefore reflect the evolution of

general review methodology within that three-year gap, instead of the actual characteristics which

differentiate the Cochrane and paper-based reviews. On the other hand, while both reviews in Pair 2

were conducted in 2015, the research topics were slightly different. The Cochrane review focused on

URTIs in the general population while its paper-based counterpart focused on paediatric URTIs.

Nevertheless, after considering both drawbacks, AMSTAR results of Pair 2 were selected for further

discussion as their similar publication dates would encourage a fairer comparison related to their

methodology.

3.4 AMSTAR results and general observations of the selected pair

For Pair 2, the Cochrane review by Hao et al.23 demonstrated higher methodological rigour

than the paper-based review by Ozen24, with AMSTAR scores of 11/11 (high) and 3/11 (low)

respectively (section 5.1). The perfect score by the Cochrane review aligns with the results of Jadad15

and Olsen et al.16, who found that Cochrane reviews generally demonstrated higher-quality

methodology (section 3.1). Meanwhile, the results did not support the findings by Shea et al.17, since

the large score difference between the Cochrane and paper-based systematic reviews – by eight points

– contradicted their claims that Cochrane reviews were not much better in quality compared to

paper-based reviews17. If it is true that the quality of Cochrane reviews was once almost as low as

their paper-based counterparts during the initial stages of the Collaboration as claimed by Shea et

al.17, then clearly the organisation has adopted effective measures to improve the methodological

rigour of its systematic reviews. Of course, one should remember that the individual AMSTAR scores

achieved by Hao et al.23 and Ozen24 do not necessarily reflect the typical quality of Cochrane and

paper-based reviews.

From these scores, eight differences (criteria 1-5; 8-10) between the two reviews were

identified. These criteria were fulfilled by the Cochrane review but not the paper-based review

(section 5.1). Out of these, three criteria were selected for further discussion: the development of a

rigorous a priori design (criterion 1), the inclusion of grey literature (criterion 4) and publication bias

(criterion 10). We chose to discuss the a priori component as it is already known that Cochrane

actively mandates the design of a protocol before conducting any review, while this preliminary step

is less regulated in other journals13. We also selected the topic of grey literature and publication bias

for further discussion, since they were not included in previous checklists until the development of

AMSTAR10. Our selections do not serve to suggest which aspects of systematic review methodology

may exert greater influence on the overall validity of findings.
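
The tally behind these scores can be sketched in a few lines of Python. The item-level pattern is reconstructed from the text above (the Cochrane review met all 11 criteria; the paper-based review missed criteria 1-5 and 8-10). Treating each criterion as simply met or unmet, and the helper names `amstar_score` and `differing_items`, are assumptions made for illustration rather than part of the AMSTAR instrument itself.

```python
N_ITEMS = 11  # AMSTAR criteria are numbered 1 to 11

# Cochrane review (Hao et al.): all 11 criteria fulfilled.
cochrane = {item: True for item in range(1, N_ITEMS + 1)}

# Paper-based review (Ozen): criteria 1-5 and 8-10 not fulfilled.
missed_by_paper = {1, 2, 3, 4, 5, 8, 9, 10}
paper_based = {item: item not in missed_by_paper
               for item in range(1, N_ITEMS + 1)}


def amstar_score(answers):
    """Count the criteria judged as fulfilled."""
    return sum(answers.values())


def differing_items(a, b):
    """Return the criteria on which two appraisals disagree."""
    return sorted(item for item in a if a[item] != b[item])


print(amstar_score(cochrane), amstar_score(paper_based))  # 11 3
print(differing_items(cochrane, paper_based))  # [1, 2, 3, 4, 5, 8, 9, 10]
```

Keeping the item-level answers, rather than only the totals, is what allows the eight differing criteria to be listed and discussed individually.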

Unfortunately, the actual findings regarding the effect of probiotics on URTIs could not

be compared between the two reviews, as Hao et al.23 investigated URTIs in general while Ozen24

focused on paediatric URTIs specifically.

3.4.1 Criterion 1 - was an a priori design provided?

As part of the transparent and structured nature of systematic reviews, an a priori design in

the form of a detailed protocol is highly advocated. The benefits of such protocols have been

recognised by the Cochrane Collaboration; the organisation mandates the publication of protocols in

the collaboration’s Review Manager Software using a uniform format, before allowing authors to

perform the actual review13. This serves to minimise any potential for bias, promotes transparency of

methods, reduces duplication and allows efficient peer-review of the planned procedures13. While all

efforts should be made to adhere to the original protocol as closely as possible, Cochrane recognises

that certain aspects might nevertheless require change in order to adapt to unanticipated

circumstances13.

The AMSTAR criterion for an a priori design was therefore fulfilled by the Cochrane review

by Hao et al.23. However, it was not fulfilled by the paper-based review by Ozen24, which does not

mention any protocol or published research objectives. Without collaborative scrutiny

of the protocol, unidentified flaws might have persisted through the various statistical

procedures, which could ultimately have led to biased conclusions.

3.4.2 Criterion 4 - was status of publication (grey literature) used as an inclusion criterion?

Another common area of contention in systematic review methodology concerns the extent to

which grey literature should be incorporated into reviews. Grey literature ranges from unpublished

observations to dissertations and conference proceedings: evidence which does not appear in published

journals13. Authors have incorporated grey literature to different extents

in an attempt to be at least somewhat inclusive of all available literature2. This variation is certainly

reflected when comparing the reviews by Hao et al.23 and Ozen24. Hao et al.23 fulfilled the criterion

by searching the WHO International Clinical Trials Registry Platform (via a portal,

http://apps.who.int/trialsearch/) and ClinicalTrials.gov (https://clinicaltrials.gov/), while Ozen24 did

not mention any such search at all. However, one should note that Hao et al.23 did not elaborate on

how the grey literature was accessed, even though it is known that such data can sometimes be

difficult to retrieve in a systematic manner.

It is likely that such disparities in search protocol would affect the type and amount of data

being retrieved. Should such data be combined in a meta-analysis, results would inevitably be skewed

depending on whether or not certain pieces of grey literature were included. This may lead to

conflicting findings between systematic reviews, which would in turn complicate the process of

making reliable clinical decisions in the healthcare sector. While inclusion of unpublished data

remains controversial, a survey by Cook et al.25 revealed that most methodologists (77.7%) agree that

grey literature should not be automatically excluded, unless it specifically fails to meet the inclusion

criteria25. This view aligns with that of the AMSTAR developers10. Cook et al.25 go on to argue

that it is not the naive inclusion or exclusion of grey literature which determines a systematic review’s

validity; instead, a combination of both published and unpublished data should be incorporated on

the basis of reliability and adherence to appropriate methodological standards. Just like systematic

reviews, the quality of grey literature can be critically appraised by considering ‘Levels of Evidence’

as described in The Steering Committee on Clinical Practice Guidelines for the Care and Treatment

of Breast Cancer and other decision-aid tools26, 27. This may enable clinicians to discern whether or

not a particular piece of grey literature should be included.

3.4.3 Criterion 10 - was the likelihood of publication bias assessed?

Publication bias refers to the extent to which findings in the published literature are systematically

unrepresentative of the overall population of completed studies28. This often occurs because studies

which report positive and statistically significant findings are more likely to be published than studies

which do not detect significance. This poses a serious threat to the validity of conclusions derived

from systematic reviews and meta-analyses; an overwhelming proportion of positive results, for

example, may skew a conclusion to declare statistical significance where it may in fact not exist.

Fortunately, Rothstein et al.28 state that the existence of publication bias and the extent of its

impact can be identified and empirically demonstrated using various statistical methods. One method

recommended by the AMSTAR developers10 is the funnel plot, a visual tool consisting of a

simple scatter plot of the effect estimates from individual studies against a measure of study size28. Asymmetrical

funnel plots may be indicative of the existence of publication bias relating to a particular research

topic (figure 3.4.3.1).
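
Funnel-plot asymmetry can also be summarised numerically. The sketch below uses the regression approach usually attributed to Egger and colleagues, a technique swapped in here for illustration and not one this report applies: each study's standardised effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests asymmetry. The function name `egger_intercept` and both data sets are hypothetical, and the sketch computes only the intercept, not the accompanying significance test.

```python
def egger_intercept(effects, std_errors):
    """Least-squares fit of (effect / SE) on (1 / SE).

    Returns (intercept, slope). An intercept near zero is consistent with
    a symmetrical funnel plot; a large intercept suggests asymmetry.
    """
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardised effect
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical symmetrical funnel: small studies scatter evenly around 0.2.
sym_effects = [0.2, 0.1, 0.3, 0.0, 0.4]
sym_ses = [0.1, 0.2, 0.2, 0.4, 0.4]

# Hypothetical biased funnel: small studies with low effects are missing.
asym_effects = [0.2, 0.3, 0.4]
asym_ses = [0.1, 0.2, 0.4]

sym_intercept, _ = egger_intercept(sym_effects, sym_ses)
asym_intercept, _ = egger_intercept(asym_effects, asym_ses)
print(f"symmetrical: {sym_intercept:.2f}, asymmetrical: {asym_intercept:.2f}")
```

With only a handful of studies, as in these toy data, any such measure is unstable, which is why the data are meant only to show the direction of the effect.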

Given that publication bias clearly influences the outcomes of systematic reviews, it is both disappointing and

concerning that Ozen24 did not even acknowledge its potential to skew the results,

which renders the study less credible. While Hao et al.23 also did not assess the presence of

publication bias, this decision was justified by the fact that their small number of trials (fewer than 10)

would make any detection of publication bias difficult in the first place.

It is crucial that future systematic reviews should consider the impact of publication bias on

the validity of their conclusions.

Figure 3.4.3.1: Example funnel plots: a) symmetrical alignment without bias, where open circles show smaller studies with no positive effects; b) asymmetrical alignment with publication bias, where smaller studies with no positive effects are absent. Based on a figure by Sterne and Egger29.

3.5 Limitations

While AMSTAR has gained respect and has been embraced by various professional

healthcare and policy institutions, there are two main flawed assumptions related to its use.

Firstly, as highlighted by Shea et al.17, the appraisal process relies on the supposition that what

gets reported in the published systematic review directly reflects its actual methodology. For

paper-based reviews in particular, it may not be possible to report every detail of the methodology due to

factors such as word limits and readability for the intended audience. This might have been

the case for our selected paper-based reviews; Bloch and Hannestad22 produced only 11 pages, while

Ozen24 produced 13 pages. This restriction may foster the mistaken view that reduced reporting

is a reflection of poor methodology, which is not always the case. In contrast, the Cochrane

Handbook for Systematic Reviews of Interventions13 states that there is no formal word limit (although

authors are strongly encouraged to stay below 10,000 words) for registered Cochrane articles, thus

permitting the opportunity for Cochrane authors to provide extensive descriptions on methodology.

This is indeed observed in our selected Cochrane reviews, in which Appleton et al.21 and

Hao et al.23 produced 134 and 69 pages respectively. Due to such differences in the amount of

permitted reporting, it is therefore unsurprising that registered Cochrane reviews have generally

received higher scores than paper-based reviews as previously investigated in section 3.1. We

anticipate that the discordance between methodological conduct and reporting standards may

continue to pose a major limitation in allowing clinicians to make accurate and representative

assessments of future systematic reviews.

Secondly, the AMSTAR appraisal relies on the assumption that the methodology being

reported is always conducted accurately and without errors. The developers of AMSTAR have aimed

to produce an instrument that can be easily used by practitioners and the general public to evaluate

the quality of systematic reviews within a short time span17. Assessors are therefore expected only

to identify whether or not a particular procedure was carried out, as opposed

to thoroughly scrutinising the accuracy of the statistical methods used.

3.6 Improvements and extensions

In consideration of the above limitations, we propose several improvements to the reporting

and appraisal processes associated with systematic review methodology, as well as the revision of

some aspects of the AMSTAR instrument.

To address the first limitation as described in section 3.5, review authors are encouraged to

strive for transparent and complete reporting of their methodology, within the permissible word

limits. This may be facilitated by adhering to guidelines provided by the PRISMA statement, which

has been specifically designed to promote high-quality reporting18, 19. This should minimise any

potential for discordance between the actual methodology versus what gets reported, thereby allowing

clinicians to correctly appraise the methodological rigour of systematic reviews. However, if it seems

likely that the authors’ reporting might have misrepresented the actual methodology, it is important

for clinicians to gather as much of the available information on the review methodology as possible. This can

include any supplementary data, protocols and feedback, as well as directly contacting the review

authors. We acknowledge, however, that this task could potentially be extremely laborious and

consequently deter clinicians from conducting any appraisals in the first place. As such, perhaps

organisations such as the Cochrane Collaboration could consider establishing a central database

solely dedicated to the compilation of all such relevant information related to each specific systematic

review, which can subsequently be retrieved by any assessor.

Furthermore, we support the efforts of Kung et al.2 in proposing a more thorough point-scoring

system as part of their development of the Revised Assessment of Multiple Systematic

Reviews (R-AMSTAR). This differs from the original AMSTAR instrument in that a graded score

is awarded to each of the 11 domains. Based on criteria specified in the appendix of Kung et al.2,

each domain is awarded from a minimum of 1 point to a maximum of 4 points, depending on how

adequately the sub-criteria within each domain are covered. This corresponds to a total R-AMSTAR

score ranging from 11 to 44, which is then assigned a quality grade of A, B, C or D. While an external

validation of the R-AMSTAR tool by Pieper et al.30 found that the revised instrument seems to possess

poor measurement qualities, further collaborative refinement of the R-AMSTAR tool may eventually

enable a more accurate quantification of the methodological quality of systematic reviews, and

thereby supersede the current version. Further extending R-AMSTAR, we agree with Sequeira-Byron

et al.12 in their proposal to assign weightings (in addition to scores) to individual criteria, based on

the extent of their likely influence on the overall quality of these reviews. For example, the

thoroughness of the literature search and the decision to include grey literature (criteria 3

& 4) may exert a greater impact on the accuracy of a review’s findings than whether or not a list of

studies was provided in the published report (criterion 5).
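
The arithmetic behind these two proposals can be sketched as follows. This is an illustration only: the function names are hypothetical, the grade thresholds of Kung et al.2 are not reproduced, and the example weights are placeholders rather than values proposed by Sequeira-Byron et al.12.

```python
N_DOMAINS = 11                   # R-AMSTAR retains the 11 AMSTAR domains
MIN_POINTS, MAX_POINTS = 1, 4    # each domain is awarded 1 to 4 points


def r_amstar_total(domain_scores):
    """Sum the 11 domain scores (each 1 to 4), giving a total of 11 to 44."""
    if len(domain_scores) != N_DOMAINS:
        raise ValueError(f"expected {N_DOMAINS} domain scores")
    for score in domain_scores:
        if not MIN_POINTS <= score <= MAX_POINTS:
            raise ValueError("each domain score must be between 1 and 4")
    return sum(domain_scores)


def weighted_quality(domain_scores, weights):
    """Weighted mean of the domain scores, rescaled to the range 0 to 1.

    The weights are hypothetical: they stand in for a judgement about how
    strongly each criterion influences overall review quality.
    """
    r_amstar_total(domain_scores)  # reuse the validation above
    if (len(weights) != N_DOMAINS or any(w < 0 for w in weights)
            or sum(weights) == 0):
        raise ValueError("need 11 non-negative weights, not all zero")
    weighted_mean = (sum(s * w for s, w in zip(domain_scores, weights))
                     / sum(weights))
    return (weighted_mean - MIN_POINTS) / (MAX_POINTS - MIN_POINTS)


# Hypothetical review: strong overall, but weak on searching (criteria 3 and 4).
scores = [4, 4, 1, 1, 4, 4, 4, 4, 4, 4, 4]
equal_weights = [1] * N_DOMAINS
search_heavy = [1, 1, 3, 3, 1, 1, 1, 1, 1, 1, 1]  # illustrative weights only

print(r_amstar_total(scores))
print(round(weighted_quality(scores, equal_weights), 3))
print(round(weighted_quality(scores, search_heavy), 3))
```

With equal weights the weak search lowers the rescaled score only modestly. Weighting criteria 3 and 4 more heavily lowers it further, which is the behaviour the weighting proposal aims for.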

We acknowledge that incorporating individual scores and weightings may increase the

complexity of the original AMSTAR instrument and perhaps reduce its user-friendliness. To address

this issue, one could consider developing software or an application that

allows assessors to enter raw results and then calculates a summary score.
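
As a rough indication of how little such a tool would need to do, the sketch below tallies the original AMSTAR checklist, taking the answers recorded for review O in Table 5.1.1 as input. The function name and the answer labels are hypothetical, and the sketch assumes that only a "yes" earns a point, with "no" and "not applicable" scoring zero.

```python
VALID_ANSWERS = {"yes", "no", "n/a"}


def amstar_score(answers):
    """Count the criteria answered "yes" across the 11 AMSTAR items.

    `answers` maps criterion number (1 to 11) to "yes", "no" or "n/a".
    Only "yes" earns a point, so the score ranges from 0 to 11.
    """
    if set(answers) != set(range(1, 12)):
        raise ValueError("answers are required for criteria 1 to 11")
    for criterion, answer in answers.items():
        if answer not in VALID_ANSWERS:
            raise ValueError(
                f"criterion {criterion}: unrecognised answer {answer!r}")
    return sum(1 for answer in answers.values() if answer == "yes")


# Answers for review O as recorded in Table 5.1.1.
review_o = {
    1: "no", 2: "no", 3: "no", 4: "no", 5: "no", 6: "yes",
    7: "yes", 8: "no", 9: "n/a", 10: "no", 11: "yes",
}
print(amstar_score(review_o))  # 3, matching the total in Table 5.1.1
```

A weighted or graded variant would replace the final count with a weighted sum, leaving the data entry step unchanged for the assessor.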

These recommendations should be subject to further investigation. Most importantly, internal

and external validations should be performed to ensure that the construct, content and criterion

validity provided by the original AMSTAR instrument are preserved, while making the necessary

improvements. AMSTAR will continue to be a living document, with revisions made

as the need arises.

3.7 Final recommendations

In consideration of our review of the literature and AMSTAR appraisals, we recommend a

two-tiered approach to address our research objectives (section 2.3).

Firstly, with regard to the process of conducting systematic reviews, prospective authors are

encouraged to register their studies under the Cochrane Collaboration (left column of figure 3.7.2).

The organisation has successfully assisted authors to achieve high methodological rigour thus far,

and it is expected that the proportion of high-quality Cochrane reviews will continue to increase. If

registration under the Collaboration is not possible, authors are nevertheless encouraged to adopt the

Cochrane approach to ensure high methodological standards (right column of figure 3.7.2). Secondly,

in relation to appraising published systematic reviews, we argue that a standardised

instrument is needed for clinicians to verify their methodological quality, regardless of the

reviews’ affiliation to the Cochrane Collaboration. Such a tool should encourage clinicians to

routinely sieve out poor-quality reviews in an efficient yet comprehensive manner, while minimising

any confusion related to the variation among previous appraisal tools. Although presenting some

limitations, we believe that AMSTAR is one suitable candidate for such an instrument, due to its

user-friendliness, reliability and relevance. Perhaps state and federal healthcare organisations, as well

as WHO, may assist in the implementation of a standardised instrument through policies, guidelines,

conferences and other professional development events.

One should remember that the roles of the Cochrane Collaboration and a standardised

appraisal instrument are distinct. Nevertheless, they are both vital in providing a stringent two-tiered

framework (figure 3.7.1), which may ultimately ensure that only valid findings generated by

high-quality systematic reviews are transferred to the clinical setting. The overall recommended

process, from beginning the conduct of a systematic review to implementation of findings, is

presented in figure 3.7.2.

Figure 3.7.1: Summary of the recommended two-tiered approach, which may assist in ensuring that only valid findings from high-quality systematic reviews are considered for implementation.

Figure 3.7.2: Recommended process for the conduct, publication and appraisal of systematic reviews, as well as the implementation of findings. The left column is the preferred route due to the affiliation with Cochrane. The roles of the Cochrane Collaboration and AMSTAR should be distinct, with Cochrane being of higher relevance to authors in helping to promote high methodological rigour, while AMSTAR should primarily be employed by clinicians in verifying the methodological validity of the published review.

4.0 Conclusion

Our review of the literature suggests that the practical application of accurate, unbiased

findings generated from only high-quality systematic reviews will continue to pose a major challenge

to the modern healthcare industry. The rate at which systematic reviews are published is evidently on the

rise, but the methodological rigour achieved by their authors, as well as the issue of finding an

appropriate appraisal tool for clinicians, remain controversial among relevant individuals and

organisations.

We believe that the Cochrane Collaboration and the standardisation of an appraisal

instrument, such as AMSTAR, may provide distinct yet equally significant benefits in enabling such

a challenge to be overcome. The online nature of the Cochrane Collaboration and its stringent

guidelines have been successful in facilitating a culture of constructive scrutiny and revision of its

systematic reviews. This has resulted in improved methodological rigour compared to traditional

paper-based reviews. Meanwhile, a standardised appraisal instrument, such as AMSTAR, may

encourage clinicians to conduct routine appraisals on the relevant systematic reviews, and

consequently empower them to determine whether the findings of such reviews can be trusted. In

addressing our research objectives (section 2.3), we therefore recommend a two-tier approach

involving the Cochrane Collaboration and AMSTAR (as summarised in figures 3.7.1 and 3.7.2) to

ultimately ensure that only reliable evidence is used to influence future clinical practice. Further

research should be directed towards the refinement of the existing AMSTAR tool, while ensuring that

its current merits are retained.

It is essential that academicians, practitioners and policy makers in the healthcare industry

cooperate in adopting strategies that are ultimately in the best interests of the relevant parties

involved, especially patients.

5.0 Appendices

5.1 Results of AMSTAR appraisals

Table 5.1.1: AMSTAR scores of selected Cochrane and paper-based systematic reviews, and criterion details. Checklist developed by Shea et al.10. Appraisals were conducted by Chelsea Ho and Thomas Frost, with any conflicts being resolved by discussion.

COMPONENT, with results for reviews A, B, H and O (identified in Table 5.1.2)

1. Was an a priori design provided?* The research question and inclusion criteria should be established before the conduct of the review. Note: Need to refer to a protocol, ethics approval, or pre-determined/a priori published research objectives to score a “yes.” *While all efforts should be made to adhere to the original protocol as closely as possible, it is acknowledged that certain aspects will inevitably be subject to change in order to adapt to unanticipated circumstances.

A: ✓  B: ✗  H: ✓  O: ✗

2. Was there duplicate study selection and data extraction? There should be at least two independent data extractors and a consensus procedure for disagreements should be in place. Note: 2 people do study selection, 2 people do data extraction, consensus process or one person checks the other’s work.

A: ✓  B: ✓  H: ✓  O: ✗

3. Was a comprehensive literature search performed? At least two electronic sources should be searched. The report must include years and databases used (e.g., Central, EMBASE, and MEDLINE). Key words and/or MESH terms must be stated and where feasible the search strategy should be provided. All searches should be supplemented by consulting current contents, reviews, textbooks, specialized registers, or experts in the particular field of study, and by reviewing the references in the studies found. Note: If at least 2 sources + one supplementary strategy used, select “yes” (Cochrane register/Central counts as 2 sources; a grey literature search counts as supplementary).

A: ✓  B: ✗  H: ✓  O: ✗

4. Was the status of publication (i.e. grey literature) used as an inclusion criterion? The authors should state that they searched for reports regardless of their publication type. The authors should state whether or not they excluded any reports (from the systematic review), based on their publication status, language etc. Note: If review indicates that there was a search for “grey literature” or “unpublished literature,” indicate “yes.” SINGLE database, dissertations, conference proceedings, and trial registries are all considered grey for this purpose. If searching a source that contains both grey and non-grey, must specify that they were searching for grey/unpublished lit.

A: ✓  B: ✗  H: ✓  O: ✗

5. Was a list of studies (included and excluded) provided? A list of included and excluded studies should be provided. Note: Acceptable if the excluded studies are referenced. If there is an electronic link to the list but the link is dead, select “no.”

A: ✓  B: ✓  H: ✓  O: ✗

6. Were the characteristics of the included studies provided? In an aggregated form such as a table, data from the original studies should be provided on the participants, interventions and outcomes. The ranges of characteristics in all the studies analysed e.g., age, race, sex, relevant socioeconomic data, disease status, duration, severity, or other diseases should be reported. Note: Acceptable if not in table format as long as they are described as above.

A: ✓  B: ✓  H: ✓  O: ✓

7. Was the scientific quality of the included studies assessed and documented? 'A priori' methods of assessment should be provided (e.g., for effectiveness studies if the author(s) chose to include only randomized, double-blind, placebo controlled studies, or allocation concealment as inclusion criteria); for other types of studies alternative items will be relevant. Note: Can include use of a quality scoring tool or checklist, e.g., Jadad scale, risk of bias, sensitivity analysis, etc., or a description of quality items, with some kind of result for EACH study (“low” or “high” is fine, as long as it is clear which studies scored “low” and which scored “high”; a summary score/range for all studies is not acceptable).

A: ✓  B: ✓  H: ✓  O: ✓

8. Was the scientific quality of the included studies used appropriately in formulating conclusions? The results of the methodological rigor and scientific quality should be considered in the analysis and the conclusions of the review, and explicitly stated in formulating recommendations. Note: Might say something such as “the results should be interpreted with caution due to poor quality of included studies.” Cannot score “yes” for this question if scored “no” for question 7.

A: ✓  B: ✓  H: ✓  O: ✗

9. Were the methods used to combine the findings of studies appropriate? For the pooled results, a test should be done to ensure the studies were combinable, to assess their homogeneity (i.e., Chi-squared test for homogeneity, I²). If heterogeneity exists a random effects model should be used and/or the clinical appropriateness of combining should be taken into consideration (i.e., is it sensible to combine?). Note: Indicate “yes” if they mention or describe heterogeneity, i.e., if they explain that they cannot pool because of heterogeneity/variability between interventions.

A: ✓  B: ✓  H: ✓  O: N/A

10. Was the likelihood of publication bias assessed? An assessment of publication bias should include a combination of graphical aids (e.g., funnel plot, other available tests) and/or statistical tests (e.g., Egger regression test, Hedges-Olken). Note: If no test values or funnel plot included, score “no”. Score “yes” if mentions that publication bias could not be assessed because there were fewer than 10 included studies.

A: ✓  B: ✓  H: ✓  O: ✗

11. Was the conflict of interest included? Potential sources of support should be clearly acknowledged in both the systematic review and the included studies. Note: To get a “yes,” must indicate source of funding or support for the systematic review AND for each of the included studies.

A: ✓  B: ✗  H: ✓  O: ✓

TOTAL SCORE A: 11  B: 7  H: 11  O: 3

OVERALL QUALITY RATING A: High  B: Medium  H: High  O: Low

Table 5.1.2: Selected reviews under two categories: Cochrane vs. paper-based and topic.

Topic | Cochrane | Paper-based
Omega-3 on depression | A: Appleton et al.21 | B: Bloch and Hannestad22
Probiotics on preventing URTI | H: Hao et al.23 | O: Ozen24

5.2 Results of the trial AMSTAR appraisal

Table 5.2.1: AMSTAR appraisals of the systematic review performed by Mickenautsch et al.11. Appraisals were conducted internally (left column) and also by Sequeira-Byron et al.12 (right column). Although some differences (1, 5, 8, 9) in scores between the two appraisals existed, most results (2-4, 6-7, 10-11) were agreed upon, rendering the AMSTAR appraisal tool seemingly reliable and consistent, upon initial testing. Appraisals were conducted by Chelsea Ho and Thomas Frost, with any conflicts being resolved by discussion.

Criterion | Internal appraisal | Appraisal by Sequeira-Byron et al.12 | Comments
1 | ✗ | ✗ | Protocol was not presented
2 | ✓ | ✓ |
3 | ✓ | ✓ |
4 | ✓ | ✓ |
5 | ✓ | ✗ | We believe that this criterion was fulfilled as both included and excluded studies were referenced
6 | ✓ | ✓ |
7 | ✓ | ✓ |
8 | ✓ | N/A | We believe that the conclusion acknowledged the quality of the studies and thereby stated the need for further well-designed randomised trials for confirmation of these results
9 | N/A | ✗ |
10 | ✗ | ✗ |
11 | ✗ | ✗ |
Total score | 7/11 | 5/11 |

6.0 Conflicts of Interest

The author declares no affiliation with the Cochrane Collaboration, AMSTAR or any other

organisations relevant to this literature review.

7.0 Acknowledgements

I would like to thank Mrs Nikki Fozzard for her continuous enthusiasm in supervising me for

this literature review, Dr Natalie Colson for providing her insights and sharing her expertise in the

field of systematic reviews, Thomas Frost for appraising the selected systematic reviews and assisting

in the moderation process, and Professor Mark Forwood for coordinating the Summer Semester

research projects.

8.0 Bibliography

1. Uman LS. Systematic reviews and meta-analyses. J Can Acad Child Adolesc Psychiatry.

2011;20(1):57-9.

2. Kung J, Chiappelli F, Cajulis OO, Avezova R, Kossan G, Chew L, et al. From systematic

reviews to clinical recommendations for evidence-based health care: validation of revised

assessment of multiple systematic reviews (R-AMSTAR) for grading of clinical relevance.

Open Dent J. 2010;4(1):84-91.

3. Boland A, Cherry MG, Dickson R. Doing a systematic review: a student's guide. Los Angeles:

SAGE; 2014.

4. Glasziou P. Systematic reviews in health care. GB: Cambridge University Press; 2004.

5. Bastian H, Glasziou P, Chalmers I. Seventy-five trials and eleven systematic reviews a day:

how will we ever keep up? PLoS Med. 2010;7(9):e1000326.

6. Shea BJ, Hamel C, Wells GA, Bouter LM, Kristjansson E, Grimshaw J, et al. AMSTAR is a

reliable and valid measurement tool to assess the methodological quality of systematic reviews.

J Clin Epidemiol. 2009;62(10):1013-20.

7. Sacks HS, Berrier J, Reitman D, Ancona-Berk VA, Chalmers TC. Meta-analyses of randomized

controlled trials. N Engl J Med. 1987;316(8):450-5.

8. Jadad AR, McQuay HJ. Meta-analyses to evaluate analgesic interventions: a systematic

qualitative review of their methodology. J Clin Epidemiol. 1996;49(2):235-43.

9. Vandenbroucke JP. STREGA, STROBE, STARD, SQUIRE, MOOSE, PRISMA,

GNOSIS, TREND, ORION, COREQ, QUOROM, REMARK… and CONSORT: for whom

does the guideline toll? J Clin Epidemiol. 2009;62(6):594-6.

10. Shea BJ, Grimshaw JM, Wells GA, Boers M, Andersson N, Hamel C, et al. Development of

AMSTAR: a measurement tool to assess the methodological quality of systematic reviews.

BMC Med Res Methodol. 2007;7(1):10-6.

11. Mickenautsch S, Leal SC, Yengopal V, Bezerra AC, Cruvinel V. Sugar-free chewing gum and

dental caries: a systematic review. J Appl Oral Sci. 2007;15(2):83-8.

12. Sequeira-Byron P, Fedorowicz Z, Jagannath VA, Sharif MO. An AMSTAR assessment of the

methodological quality of systematic reviews of oral healthcare interventions published in the

Journal of Applied Oral Science (JAOS). J Appl Oral Sci. 2011;19(5):440-7.

13. Cochrane handbook for systematic reviews of interventions. Version 5.1.0 ed. England: The

Cochrane Collaboration; 2011.

14. Jadad AR, Cook DJ, Jones A, Klassen TP, Tugwell P, Moher M, et al. Methodology and reports

of systematic reviews and meta-analyses: a comparison of Cochrane reviews with articles

published in paper-based journals. JAMA. 1998;280(3):278-80.

15. Jadad AR. Systematic reviews and meta-analyses on treatment of asthma: critical evaluation.

Br Med J. 2000;320:537-40.

16. Olsen O, Middleton P, Ezzo J, Gøtzsche PC, Hadhazy V, Herxheimer A, et al. Quality of

Cochrane reviews: assessment of sample from 1998. Br Med J. 2001;323:829-32.

17. Shea B, Moher D, Graham I, Pham B, Tugwell P. A comparison of the quality of Cochrane

reviews and systematic reviews published in paper-based journals. Eval Health Prof.

2002;25(1):116-29.

18. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews

and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097.

19. Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gotzsche PC, Ioannidis JP, et al. The PRISMA

statement for reporting systematic reviews and meta-analyses of studies that evaluate health

care interventions: explanation and elaboration. PLoS Med. 2009;6(7):e1000100.

20. Shea BJ, Bouter LM, Peterson J, Boers M, Andersson N, Ortiz Z, et al. External validation of

a measurement tool to assess systematic reviews (AMSTAR). PLoS One. 2007;2(12):e1350.

21. Appleton KM, Sallis HM, Perry R, Ness AR, Churchill R. Omega-3 fatty

acids for depression in adults. Cochrane Database Syst Rev. 2015;11:CD004692.

22. Bloch MH, Hannestad J. Omega-3 fatty acids for the treatment of depression: systematic review

and meta-analysis. Mol Psychiatry. 2012;17(12):1272-82.

23. Hao Q, Dong BR, Wu T. Probiotics for preventing acute upper respiratory tract infections.

Cochrane Database Syst Rev. 2015;2:CD006895.

24. Ozen M. Probiotics for the prevention of pediatric upper respiratory tract infections: a

systematic review. Expert Opin Biol Ther. 2015;15(1):9-20.

25. Cook DJ, Guyatt GH, Ryan G, Clifton J, Buckingham L, Willan A, et al. Should unpublished

data be included in meta-analyses?: Current convictions and controversies. JAMA.

1993;269(21):2749-53.

26. The Steering Committee on clinical practice guidelines for the care and treatment of breast

cancer. CMAJ. 1998;158(3):S1-2.

27. Benzies KM, Premji S, Hayden KA, Serrett K. State-of-the-evidence reviews: advantages and

challenges of including grey literature. Worldviews Evid Based Nurs. 2006;3(2):55-61.

28. Rothstein H, Sutton AJ, Borenstein M, Wiley B. Publication bias in meta-analysis:

prevention, assessment and adjustments. Chichester: Wiley; 2005.

29. Sterne JA, Egger M. Funnel plots for detecting bias in meta-analysis: guidelines on choice of

axis. J Clin Epidemiol. 2001;54(10):1046-55.

30. Pieper D, Buechter RB, Li L, Prediger B, Eikermann M. Systematic review found AMSTAR,

but not R(evised)-AMSTAR, to have good measurement properties. J Clin Epidemiol.

2015;68(5):574-83.