Pharmacoeconomics: 24May2016
Reporting guidelines for the use of expert judgement in
model-based economic evaluations
Cynthia P Iglesias1
Alexander Thompson2
Wolf H Rogowski3,4
Katherine Payne2*
Author institutions:
1 Department of Health Sciences, Centre for Health Economics and the Hull and York Medical School, University of York
2 Manchester Centre for Health Economics, The University of Manchester
3 Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Health Economics and Health Care Management, Neuherberg, Germany
4 Department of Health Care Management, Institute of Public Health and Nursing Research, Health Sciences, University of Bremen, Germany
*corresponding author: Manchester Centre for Health Economics, 4th floor, Jean McFarlane
Building, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
Keywords: expert judgement, expert elicitation, expert opinion, economic evaluation, model,
guidelines
Acknowledgements
CPI contributed to the design and the conduct of the study, commented on drafts of the
manuscript and produced the final version of the manuscript. AT contributed to the design
and the conduct of the study, analysed the data, commented on drafts of the manuscript and
approved the final version of the manuscript. WR contributed to the design and the conduct
of the study, commented on drafts of the manuscript and approved the final version of the
manuscript. KP conceived the idea for this study, contributed to the design and the conduct of
the study, produced a first draft of the manuscript, commented on subsequent drafts and
approved the final version of the manuscript. We also thank the following members of the
Health Economics Expert Elicitation Group (HEEEG) for their contribution to this study: Laura Bojke,
Qi Cao, Ali Daneshkhah, Christopher Evans, Bogdan Grigore, Anthony O'Hagan, Vincent
Levy, Claire Rothery, Fabrizio Ruggeri, Ken Stein, Matt Stevenson, Patricia Vella Bonanno.
Compliance with Ethical Standards
(a) No funding was received for this study and (b) the authors of this paper (Cynthia P Iglesias,
Alexander Thompson, Wolf H Rogowski, and Katherine Payne) declare that they have no conflict
of interest regarding its publication.
Abstract
Introduction: Expert judgement has a role in model-based economic evaluations (EE) of
healthcare interventions. This study aimed to produce reporting criteria for two types of study
design to use expert judgement in model-based EE: (1) an expert elicitation (quantitative)
study; and (2) a Delphi study to collate (qualitative) expert opinion.
Method: A two-round on-line Delphi process identified the degree of consensus for four core
definitions (expert; expert parameter values; expert elicitation study; expert opinion) and two
sets of reporting criteria in a purposive sample of experts. The initial set of reporting criteria
comprised: 17 statements for reporting a study to elicit parameter values and/or distributions;
11 statements for reporting a Delphi survey to obtain expert opinion. Fifty experts were
invited to become members of the Delphi panel by e-mail. Data analysis summarised the
extent of agreement (using a pre-defined 75% ‘consensus’ threshold) on the definitions and
suggested reporting criteria. Free text comments were analysed using thematic analysis.
Results: The final panel comprised 12 experts. Consensus was achieved for the definitions
of: expert (88%); expert parameter values (83%); and expert elicitation study (83%). Sixteen
criteria were recommended for reporting an expert elicitation study and 11 criteria for
reporting a Delphi study to collate expert opinion.
Conclusion: This study has produced guidelines for reporting two types of study design to
use expert judgement in model-based EE: (1) an expert elicitation study requiring 16 reporting
criteria; and (2) a Delphi study to collate expert opinion requiring 11 reporting criteria.
Introduction
Model-based EE, in general, and model-based cost effectiveness analysis (CEA), specifically,
now have a core role in informing reimbursement decisions and the production of clinical
guidelines [1-5]. Early model-based CEA has emerged as a particularly useful approach [6] in
the development phases of new diagnostic or treatment options. The practical application of
model-based CEA can be severely hampered when: i) analysis is conducted early in the
development phase of a new technology [7]; and ii) there is limited randomised controlled
trial or observational data available to populate the model (e.g. devices and diagnostics) [8].
This paucity of available data to populate model-based CEA has stimulated reliance on
expert judgement, a phenomenon that extends beyond the health context and has
attracted attention in ecology, environmental science and engineering [9, 10].
Expert judgement has many potential roles in the context of model-based CEA. Qualitative
expressions of judgement can: frame the scope of a model-based CEA; define care pathways;
assist in conceptualising the model structure; and investigate a model's face validity. Quantitative
expressions of expert judgement can contribute to defining point estimates of key model
parameters and characterising the uncertainty around them. The process of aggregating (collating
or pooling) the views of a group of experts can be performed using: i) consensus methods (e.g. a
Delphi survey) to identify the extent of consensus in a qualitative sense; ii) mixed methods (e.g.
the nominal group technique), which bring experts together to exchange views and draw forth a
single quantitative expression of their collective judgement using mathematical elicitation
methods (e.g. roulette); and iii) mathematical aggregation to pool individual quantitative
expressions of judgement using statistical methods (e.g. linear pooling).
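To make the mathematical aggregation route concrete, the following sketch illustrates linear pooling, where the pooled density is a weighted average of the individual experts' densities. It is illustrative only: the Beta parameters, the equal weights and the function names are hypothetical assumptions, not values or methods taken from this study.

```python
# Illustrative sketch of linear opinion pooling (mathematical aggregation).
# Each expert's judgement about an event probability is represented here as
# a Beta density; all parameter values below are hypothetical.
from math import gamma

def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x (0 < x < 1)."""
    const = gamma(a + b) / (gamma(a) * gamma(b))
    return const * x ** (a - 1) * (1 - x) ** (b - 1)

def linear_pool(x, expert_params, weights=None):
    """Pooled density: sum_i w_i * f_i(x), with weights summing to 1."""
    if weights is None:  # equal weights unless stated otherwise
        weights = [1 / len(expert_params)] * len(expert_params)
    return sum(w * beta_pdf(x, a, b)
               for w, (a, b) in zip(weights, expert_params))

# Three hypothetical experts' elicited Beta distributions:
experts = [(4, 6), (8, 12), (3, 7)]
pooled_density = linear_pool(0.35, experts)
```

Because each component is a proper density and the weights sum to one, the pooled function is itself a proper density; this is the main appeal of linear pooling over more elaborate combination rules.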
Within the area of expert judgement, many concepts and terms are used
interchangeably, often without clear definitions being offered, leading to inconsistent use
of terminology. Table 1 summarises dictionary-based definitions of key terms. From the
definitions in Table 1 it is possible to develop a potential nomenclature system shown
graphically in Figure 1. The proposed nomenclature suggests separating methods to elicit
expert judgement in terms of whether the study design is underpinned by primarily a
qualitative or quantitative paradigm. By definition “elicitation” implies getting information
from somebody in a “particular way”. This suggests that the term “expert elicitation” may be
better suited to describe methods aimed at drawing forth experts’ judgements expressed in a
quantitative format (i.e. as probability density functions). In contrast, the term ‘expert
opinion’ may be better suited to describe studies that aim to draw forth the opinions or beliefs
of experts expressed in a qualitative format. This study uses the proposed nomenclature
system to characterise studies that draw on expert judgement.
<Table 1 here>
<Figure 1 here>
Anecdotally, the ‘Delphi process’ is emerging as the most widely used consensus method
in the context of model-based EE. It is not possible to formally quantify the extent of the
use of the Delphi in practice because of inconsistencies in terminology, description and
application of the ‘method’. Generally, using a ‘Delphi process’ is described as a survey
approach that involves at least two rounds to allow respondents, collectively grouped into an
‘expert panel’, to formulate and change their opinions on difficult topics [11]. Researchers
commonly refer to the Delphi as a consensus method but are often not explicit that the
method can only establish consensus if a clear set of decision rules specifies when consensus
has been reached [12]. In practice, the Delphi process is not a single method and has been
adapted to answer different research questions. Sullivan & Payne (2011) [13] specified three
types of Delphi, defined by their stated purpose and the research question to be answered [14]. A
‘classical’ Delphi could be used, for example, to inform a decision-analytic model structure.
A ‘policy’ Delphi could be used to identify what value judgements are used by the decision-
making body appraising health technologies. A ‘decision’ Delphi could provide a consensus
view on the care pathways needed to inform the selection of model comparator(s). Although
this potentially useful distinction between the three types of Delphi approach was
suggested, the recommendation has not translated into published examples in the context of
model-based economic evaluations, potentially as a result of the lack of clear reporting
guidelines.
Importantly, Sullivan & Payne (2011) [13] were explicit that the Delphi should not be used as
a method of behavioural aggregation to generate parameter values; instead, more appropriate
quantitative mathematical aggregation methods should be used. A substantial literature exists
on the role and use of quantitative methods to elicit expert values (e.g. roulette, quartile,
bisection, tertile, probability, hybrid) [15, 16]. These quantitative methods have in common
that they are rooted in mathematical and statistical Bayesian frameworks. In 2006, O’Hagan
and colleagues produced a seminal textbook describing the rationale and application of
quantitative methods to elicit experts’ probability values [15]. Within the collective set of
methods to draw forth judgements for use in a model-based CEA (see Fig. 1), there is
division amongst researchers on which methods are most suitable and a lack of empirical
research to support the use of one method over another [16].
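As an illustration of one such format, the roulette method asks an expert to allocate a fixed number of chips across bins spanning a parameter's plausible range; the resulting histogram can then be smoothed into a parametric distribution. The sketch below fits a Beta distribution by the method of moments. This is a sketch under stated assumptions, not a prescription from the literature cited above, and the bin edges and chip counts are hypothetical.

```python
# Illustrative sketch: turning a roulette-style chip allocation into a
# fitted Beta distribution via the method of moments. All numbers are
# hypothetical, not data from any elicitation study.

def fit_beta_from_chips(bin_edges, chips):
    """Fit Beta(a, b) to chips placed in bins over [0, 1]."""
    total = sum(chips)
    mids = [(lo + hi) / 2 for lo, hi in zip(bin_edges, bin_edges[1:])]
    probs = [c / total for c in chips]
    mean = sum(p * m for p, m in zip(probs, mids))
    var = sum(p * (m - mean) ** 2 for p, m in zip(probs, mids))
    # Method-of-moments estimates for a Beta distribution
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common

# Ten chips over five bins for a probability parameter:
edges = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
chips = [1, 3, 4, 2, 0]
a, b = fit_beta_from_chips(edges, chips)
```

The fitted Beta preserves the mean and variance of the expert's histogram, which is one simple way to carry an elicited judgement into a probabilistic sensitivity analysis; other smoothing choices (e.g. least-squares fitting) are equally defensible.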
While there is no agreement on the appropriate type of study to elicit expert judgement, there
is a consensus view on a need for standardised reporting. Grigore et al (2013) reported a
systematic review of 14 studies using the quantitative elicitation of probability distributions
from experts undertaken to inform model‐based EE in healthcare [17]. The review identified
variation in the type of elicitation approach used and a failure to report key aspects of the
methods, concluding that better reporting is needed. The potential strengths of using expert
judgements in the context of eliciting quantitative values for model-based CEA can only be
realised if studies are well-designed to minimise bias, conducted appropriately and reported
with clarity and transparency. With the provision of a set of key criteria for reporting on
quantitative estimates, papers can be quality assessed to assist with peer review and to aid
those who may use the expert judgements in their own analysis. O’Hagan and colleagues’
textbook offers a potential starting point for reporting criteria for an expert elicitation study to
generate quantitative values but needs adaptation to be practical and feasible in the context of
writing up a study for publication in a peer reviewed manuscript [15].
Evans & Crawford (2000) commented on the use of the Delphi as a consensus-generating
method and suggested the need for: a clear definition of the techniques used; agreement on
criteria for reaching consensus; and the conduct of validation exercises [19]. Eleven reporting
criteria were offered but these were not underpinned by agreement within the research
community and have not been taken forward in practice [19]. More generally, Hasson and
colleagues offered a set of reporting criteria for the Delphi but these were not specific to the
context of using Delphi methods in a model-based CEA [12]. The aim of this study was to
produce reporting criteria for two types of study design used when identifying expert
judgements for use in model-based CEA: a ‘consensus’ Delphi study as the most frequently
used method in “expert opinion” studies; and an “expert elicitation” study.
Method
This study followed published recommendations on how to develop reporting guidelines for
health services research [20]. A rapid review of the literature failed to identify existing
reporting criteria. In the absence of existing reporting criteria a two-round on-line Delphi
process [21] was used to identify the degree of consensus in a purposive sample of experts.
The objectives were to understand the degree of consensus on definitions of core terms
relevant in the context of obtaining expert judgement for use in model-based EEs; and
reporting criteria in this context for (1) an expert elicitation study and (2) a Delphi study to
collate expert opinion. Ethical approval was required for this study because the Delphi
involved asking respondents for their contact e-mail address. Approval was granted by The
University of Manchester Research Ethics Committee (project reference 15462).
The expert panel
In this study, an expert was defined as someone who had previous experience of conducting a
study to identify expert judgements in the context of EE or who had written on this topic. The
sampling frame was informed by a published systematic review of 14 expert elicitation
studies to inform model‐based EEs in healthcare [17]. This sampling frame was
supplemented with hand searching of relevant journals. A sampling frame of 50 potential
members of an expert panel representing the views from different healthcare jurisdictions was
generated. The study aimed to recruit a sample size of 15 experts, representing the views of more
than 20% of the total available international experts. Contact e-mail addresses were obtained from
the published studies and updated using a Google search. The sample size was pre-specified in
a dated protocol (available from the corresponding author on request) and based on two
criteria (1) the practicalities of using a Delphi method; and (2) the size of the potential pool of
experts with knowledge of using expert judgements in the context of model-based economic
evaluations. As sample size calculations for Delphi studies are not available, it is necessary to
rely on pragmatic approaches to define the relevant sample size. A published systematic
review identified the potential pool of experts to be ~50, thus a sample size of 15 would
represent a substantial proportion of the available pool.
Experts were invited to become members of the Delphi panel by e-mail with a link to the
on-line survey (hosted using SelectSurvey.NET) and an attached study information sheet.
Respondents gave ‘assumed’ consent to take part by completing round-one of the survey and
indicating whether they were willing to be sent a second survey and to be named as a member of
the Health Economics Expert Elicitation Group (HEEEG).
The Delphi
Round one of the Delphi survey (Appendix 1) comprised four sections: definitions of four
concepts (expert; expert parameter values; expert elicitation study; a study designed to collect
expert opinion); reporting criteria for an expert elicitation study (17 criteria); reporting
criteria for a Delphi survey (11 criteria); background questions on the expert.
Concept definitions were created by a group of four health economists (authors of this paper)
based on their knowledge of the expert judgement literature and the deliberative processes
described in the introduction. Respondents were asked to rate their agreement with each
definition as described using a five-point scale (see Table 2). A free text section at the end of
the ‘definitions section’ and again at the end of the survey asked for comments on the
definitions and general comments, respectively.
<Table 2 here>
In round-one, each of the suggested reporting criteria was presented as a statement. Each
statement included in sections two and three of the first-round survey was informed by three
published sources [15, 12, 21]. Respondents were asked to indicate ‘whether you think each
of the stated criteria is required as a minimum standard for reporting the design and
conduct’ of a study to identify expert values for use in a model-based EE (section two) or a
(Delphi) method used to collate expert opinion (section three). The focus was to establish
which reporting criteria the experts thought would be essential to include in a standalone
expert opinion or expert elicitation study, or in a study reporting an expert opinion or expert
elicitation exercise as an element of a wider EE study. A question at the end of sections two
and three, respectively, asked respondents to indicate which, if any, criteria could be removed
if the study identifying expert judgement was reported as part of the model-based EE paper.
Respondents rated each criterion using a five-point scale (see Table 2).
Respondents who indicated that they were willing to take part were sent a second-round
survey containing a summary of their own responses to round-one alongside a summary of the
expert panel’s responses. The second-round survey (Appendix 2) comprised three sections:
re-worked definitions of four concepts (expert; expert parameter values; expert elicitation
study; expert opinion); reporting criteria for expert elicitation studies for which no consensus
was reached in round-one; reporting criteria for a Delphi survey to obtain expert opinion for
which no consensus was reached in round-one. Respondents were asked if they had any
comments on criteria for which consensus had been reached in round-one and general
comments.
Data analysis
Data analysis aimed to summarise the extent of agreement about the appropriateness of the
core definitions and the requirement to use the suggested reporting criteria. Only round-two
results were used in the final analysis. Consensus was pre-defined as at least 75% of panel
members agreeing that a definition was appropriate or that a reporting criterion was ‘required’
(a rating of 4 or 5; see Table 2). Free text comments collated in each round of the survey were
also analysed using thematic analysis (see Appendix 3).
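The pre-defined consensus rule can be expressed as a short check. This is a minimal sketch; the function name and the example ratings are hypothetical, not panel data from this study.

```python
# Minimal sketch of the pre-defined consensus rule described above:
# a definition or criterion reaches consensus when at least 75% of
# panel members rate it 4 or 5 on the five-point scale.

def reaches_consensus(ratings, threshold=0.75):
    """True if >= threshold of ratings are 4 or 5 ('agree'/'strongly agree')."""
    agree = sum(1 for r in ratings if r >= 4)
    return agree / len(ratings) >= threshold

# Hypothetical ratings from a 12-member panel (10/12 = 83% agree):
panel_ratings = [5, 4, 4, 5, 3, 4, 4, 5, 4, 2, 4, 5]
consensus = reaches_consensus(panel_ratings)  # True at the 75% threshold
```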
Results
The first and second rounds of the Delphi survey were conducted in November 2015 and
January 2016, respectively. In round-one, 17 of the 50 invited experts (34% response rate)
completed the survey. Of these 17 respondents, 13 (26% response rate) agreed to
be a member of the expert panel for round-two of the Delphi survey. Appendix 4 shows the
level of consensus reached on each definition or statement and the response distribution from
each round of the Delphi survey.
The expert panel
The final expert panel, completing all questions in both rounds, comprised 12 experts (24%
response rate). These experts named their primary role as: Health Economist (n=2); Decision
Analyst (n=3); Operations Researcher (n=1); Outcomes Researcher (n=1); Statistician (n=2);
Clinician (n=1); HTA Researcher (n=1); Decision Maker (n=1). Of these 12 experts, nine
(75%) had been working in their primary role for more than 10 years and five (42%) had
published more than three studies using methods to identify expert judgement.
Key definitions
Box 1 shows the ‘agreed’ definitions for four concepts (expert; expert parameter values;
expert elicitation study; expert opinion). In round-one, consensus was only achieved for the
definition of expert (n=15; 88%). The analysis of free text comments (see Appendix 3) was
used to modify definitions for an expert elicitation study and for expert opinion. In round-
two, consensus was achieved on the definitions for expert parameter values (n=10; 83%) and
expert elicitation study (n=10; 83%). The expert panel was evenly divided on the
appropriateness (n=6; 50%) of the definition offered for ‘expert opinion’. The free text
comments highlighted the reasons for the lack of consensus, which seemed to focus on the
attempt to use ‘opinion’ to reflect a study underpinned by a ‘qualitative paradigm’ and
‘elicitation’ to reflect a ‘quantitative paradigm’, with respondents suggesting that the use of
the word ‘opinion’ should be specific to the context of the study.
<Box 1 here>
Reporting criteria for an expert elicitation study
Table 3 shows the agreed 16 reporting criteria recommended for an expert elicitation study
seeking to attach a value and/or a distribution to a parameter for use in a model-based EE.
The expert panel felt that all 16 criteria should be reported in a standalone paper reporting the
expert elicitation study and when an expert elicitation study is included within a paper
reporting the EE it informs. In the latter case, this may require appropriate use of
supplementary appendices.
<Table 3 here>
In the first round of the Delphi, the expert panel reached the pre-defined threshold for
consensus (75% of experts) on the relevance of 15 of the original 17 potential reporting
criteria. The two criteria that failed to reach consensus were: data recording (n=7; 41% of
experts agreed it was needed) and ethical issues (n=9; 53% of experts agreed it was needed).
Following the second round, consensus was reached for the inclusion of ethical issues (n=9;
75%) but not for the approach taken to data recording (n=8; 67%). The expert panel viewed
the reporting criteria dealing with data aggregation and the interpretation of results to be the
most important, as 100% consensus was achieved in round one of the Delphi. Appendix 3
summarises the free
text comments made on the reporting criteria.
Reporting criteria for a Delphi study
Table 4 shows the final 11 criteria recommended for reporting a study designed to identify
the degree of consensus of opinion in an expert panel. All 11 criteria were judged to be
appropriate for reporting a standalone Delphi study and/or one reported within the EE
that the Delphi study had informed.
<Table 4 here>
After round-one, the expert panel agreed on seven of the 11 criteria (see Appendix 4). No
consensus was reached on the need for reporting: the research rationale (n=12; 71% of
experts agreed); the literature review (n=15; 73%); the survey (n=11; 67%); and ethical issues
(n=9; 53%). After round-two, consensus was reached for 10 criteria. The expert panel did not
reach the pre-defined threshold for consensus (75% of experts) on including ethical issues
(n=9; 67%). However, the need to report ethical issues is a general requirement for most
journals and for this reason was included in the final list of required criteria. Appendix 3
summarises the free text comments offered on these reporting criteria.
Discussion
A two-round Delphi survey was used to generate criteria for two types of study designs
commonly used to incorporate expert judgement in model-based EEs of healthcare
interventions. A literature is emerging, both within and outside a healthcare focus, on the use
of quantitative expert judgement. O’Hagan and colleagues’ substantive textbook takes the
reader from the rationale and theoretical underpinning of the use of expert judgement through
the range of available elicitation methods and remaining areas where future research is
required [15].
Commentators have started to assimilate the range of methods available to identify expert
judgement [22] and written about their experiences in designing elicitation studies [23]. In
addition, there are emerging guidelines on the steps to follow when designing an expert
elicitation study [24]. Knol et al [25] suggested a seven-step procedure. Implementation software
is now also readily available (e.g. SHELF, MATCH) [26-28]. A lack of agreement on which
methods should be used to elicit parameter values and the uncertainty (i.e. strength of belief)
associated with them is a remaining challenge. Probabilistic characterisation of a parameter
value is key to being able to incorporate expert judgements into model-based EEs. The
most relevant method to elicit a quantitative expression of expert judgement is likely to
depend on the type of parameter in question. Selecting the most appropriate method for
eliciting a parameter and its distribution is an open area for future empirical research.
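To illustrate how a quantile-based format such as the bisection method might translate elicited judgements into a distribution, the sketch below matches a normal distribution to an elicited median and quartiles. The normal assumption and all elicited values are hypothetical choices made purely for illustration; other parametric families could be matched in the same way.

```python
# Hedged sketch of quantile-based elicitation: the expert states a
# median and quartiles, and a parametric distribution is chosen to
# match them. A normal distribution is assumed here for illustration.

def normal_from_quartiles(q25, q50, q75):
    """Return (mu, sigma) of a normal distribution matching the
    elicited median and quartiles; 0.6745 is the standard normal
    upper quartile, so IQR = 2 * 0.6745 * sigma."""
    mu = q50
    sigma = (q75 - q25) / (2 * 0.6745)
    return mu, sigma

# Hypothetical elicited judgements about a mean length of stay (days):
mu, sigma = normal_from_quartiles(5.0, 7.0, 9.0)
```

In practice the elicited quartiles are rarely symmetric about the median, and the degree of asymmetry is itself informative about which parametric family fits the expert's judgement best.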
Compared with quantitative methods to elicit expert judgements, developing methods to
conduct Delphi surveys in the context of model-based EEs has attracted less attention in the
literature commonly read by health economists, decision analysts or operations researchers.
The communities of academic nurses and pharmacists have both embraced this method,
writing the majority of published material on the use of Delphi as a survey method in health
services research [12, 29]. Developing a standardised toolkit for the design and conduct of a
Delphi survey is problematic because the Delphi is not really a single method [11]. It is also
challenging because it is not clear whether the Delphi is a qualitative or a quantitative
technique. If used to
collate opinions (i.e. qualitative expressions of expert judgement) then it can be viewed to be
underpinned by a qualitative paradigm. However, if used as a means of behavioural
aggregation, as has been done in some applications in the context of generating parameter
values for model-based EE, then it could be viewed to be underpinned by a quantitative
paradigm [30]. The view taken in this study was in line with the recommendations of Sullivan
& Payne (2011), suggesting that a Delphi study is best suited to identifying qualitative
expressions of expert opinion [13]. It should also be recognised that a Delphi survey is only
one of a number of available approaches to use as a consensus method and was chosen as the
focus for this study because other approaches, such as the nominal group technique, have
attracted less attention and use in the practice of model-based EEs.
There are a number of existing reporting criteria and guidelines published for use when
reporting EEs, such as the CHEERS statement [31], and for assessing the methodological
quality of EEs [32], conducting decision-analytic modelling [33] and conducting stated
preference studies [34].
existing guidelines, and the ones suggested in this study, all work on the premise that
standardisation of reporting is desirable. The belief that it is good to standardise reporting may
appear unequivocal; however, caution is necessary. The standardisation of reporting
should not serve to stagnate or halt future empirical research to improve the methods or
applied use of expert opinion or expert elicitation studies.
This study had a number of limitations. First, it assumed that Delphi studies have an
appropriate role in drawing forth qualitative expressions (“expert opinion”) of expert judgement
on how to report an “expert opinion” study. The tautological nature of this assumption is
acknowledged by the authors. However, in the absence of other methods, using a Delphi
survey was a practical solution to identify the extent of consensus on how to report a Delphi.
Another potential limitation is that the study relied on collating the opinions of experts who had
published studies using quantitative approaches to elicit probability distributions for model-
based economic evaluations. It was not feasible to use the opinions of known experts with
experience in the use of Delphi methods, because they could not be identified due to the
lack of consistent keywords in published manuscripts reporting model-based economic
evaluations using the Delphi method and the lack of reporting guidelines for such studies.
This study achieved a final sample size of 12 experts. Sample size calculations for a Delphi
survey are not as established, or as well defined, as those for a randomised controlled trial.
Delphi processes in health care have used sample sizes of between 4 and 3,000 panel members
[29]. Some guidance has suggested that the optimum size of a panel is between 7 and 12
members, with seven members being the minimum panel size, but some researchers have
recommended panels of 300 to 500 people [11]. The optimal size of a panel is a subject for
empirical research but it has been suggested that the size of a panel should be governed by
the purpose of the investigation [29]. Using authorship of peer-reviewed publications as a
proxy for expertise, this study identified a potential sampling frame of 50 respondents.
A related study using a Delphi survey, which arguably had a larger potential pool of
respondents (all health economists), achieved a panel membership of 23 experts to develop
criteria for the assessment of the methodological quality of EEs [32]. We must acknowledge
that the final
sample of 12 experts can only represent the views of a select few and their opinions may not
be generalisable to a wider audience. The test for the acceptance of these reporting criteria is
whether the guidelines are taken up and used in future studies.
The study was not able to achieve a consensus view on the definition of ‘expert opinion’. The
aim was to distinguish between a study designed to elicit a quantitative expression of
expert judgement on parameter values and the uncertainty surrounding them (expert
elicitation) and a study designed to draw forth qualitative expressions of expert
judgement (expert opinion). This distinction was not achieved in the definition we used.
However, we maintain that it is important to make this distinction because, to paraphrase one
of the respondents in this study, when using expert judgement in model-based EEs we may
not be using the study to collate expert opinion (e.g. to inform the model structure) but might
instead want to generate a value for a parameter and its distribution (to inform a model input
parameter).
The expert panel was also divided on the focus on the Delphi as a consensus method,
suggesting that other approaches could have been used. Ideally, this study would have
generated reporting criteria for all types of consensus methods suitable for use in the context
of a model-based EE, but this was not practical or feasible. There was further disagreement
within the panel around whether it is appropriate to use the Delphi as a behavioural
aggregation method to combine expert estimates of a parameter value. We advocate the use
of the Delphi, in line with previous recommendations [13], as a method useful for collating
opinions (qualitative expressions of expert judgement) on, for example, care pathways or
model structures. However, we recognise that this view is not underpinned by empirical
research and future studies should compare and contrast whether mathematical or behavioural
aggregation methods are robust and practical when used in model-based EEs of healthcare
interventions.
Conclusion
This study has produced guidelines for reporting two types of study design to use expert
judgements in model-based EEs: (1) an expert elicitation study requiring 16 reporting
criteria; and (2) a Delphi study to collate expert opinion requiring 11 reporting criteria. These two sets
of criteria are suggested as guidelines to be used by journals wanting to assess the quality of
reporting such studies and researchers designing such studies or critiquing their quality.
Further research is required to develop and apply the methods to elicit parameter values
and/or their distributions and collate expert opinions. Specifically, it is necessary to conduct
empirical research on whether mathematical or behavioural aggregation methods are robust
and practical when used in model-based EE of healthcare interventions.
References
1. Dakin H, Devlin N, Feng Y, Rice N, O'Neill P, Parkin D. The influence of cost-effectiveness and other factors on NICE decisions. Health Economics. 2015;24(10):1256-71.
2. Brennan A, Chick SE, Davies R. A taxonomy of model structures for economic evaluation of health technologies. Health Economics. 2006;15(12):1295-310.
3. National Institute for Clinical Excellence. Guide to the methods of technology appraisal. London: NICE; 2004.
4. Sculpher MJ, Claxton K, Drummond M, McCabe C. Whither trial-based economic evaluation for health care decision making? Health Economics. 2006;15(7):677-87.
5. Marsden G, Wonderling D. Cost-effectiveness analysis: role and implications. Phlebology. 2013;28(suppl 1):135-40.
6. Annemans L, Genesté B, Jolain B. Early modelling for assessing health and economic outcomes of drug therapy. Value in Health. 2000;3(6):427-34.
7. IJzerman MJ, Steuten LM. Early assessment of medical technologies to inform product development and market access. Applied Health Economics and Health Policy. 2011;9(5):331-47.
8. Iglesias CP. Does assessing the value for money of therapeutic medical devices require a flexible approach? Expert Review of Pharmacoeconomics & Outcomes Research. 2015;15(1):21-32.
9. Morgan MG. Use (and abuse) of expert elicitation in support of decision making for public policy. Proceedings of the National Academy of Sciences of the United States of America. 2014;111(20):7176-84. doi:10.1073/pnas.1319946111.
10. Kuhnert PM, Martin TG, Griffiths SP. A guide to eliciting and using expert knowledge in Bayesian ecological models. Ecology Letters. 2010;13(7):900-14. doi:10.1111/j.1461-0248.2010.01477.x.
11. Mullen PM. Delphi: myths and reality. Journal of Health Organization and Management. 2003;17(1):37-52.
12. Hasson F, Keeney S, McKenna H. Research guidelines for the Delphi survey technique. Journal of Advanced Nursing. 2000;32(4):1008-15.
13. Sullivan W, Payne K. The appropriate elicitation of expert opinion in economic models. Pharmacoeconomics. 2011;29(6):455.
14. Rauch W. The decision Delphi. Technological Forecasting and Social Change. 1979;15(3):159-69.
15. O'Hagan A. Uncertain judgements: eliciting experts' probabilities. Chichester: Wiley; 2006.
16. Gosling JP. Methods for eliciting expert opinion to inform health technology assessment. Vignette commissioned by the MRC Methodology Advisory Group. Medical Research Council (MRC) and National Institute for Health Research (NIHR). 2014. https://www.mrc.ac.uk/documents/pdf/methods-for-eliciting-expert-opinion-gosling-2014/.
17. Grigore B, Peters J, Hyde C, Stein K. Methods to elicit probability distributions from experts: a systematic review of reported practice in health technology assessment. Pharmacoeconomics. 2013;31(11):991-1003.
18. Evans D. Hierarchy of evidence: a framework for ranking evidence evaluating healthcare interventions. Journal of Clinical Nursing. 2003;12(1):77-84.
19. Evans C, Crawford B. Expert judgement in pharmacoeconomic studies. Pharmacoeconomics. 2000;17(6):545-53.
20. Moher D, Schulz KF, Simera I, Altman DG. Guidance for developers of health research reporting guidelines. PLoS Medicine. 2010;7(2):e1000217.
21. Jones J, Hunter D. Consensus methods for medical and health services research. BMJ. 1995;311(7001):376.
22. Johnson SR, Tomlinson GA, Hawker GA, Granton JT, Feldman BM. Methods to elicit beliefs for Bayesian priors: a systematic review. Journal of Clinical Epidemiology. 2010;63(4):355-69.
Pharmacoeconomics: 24May2016
23. Kadane J, Wolfson LJ. Experiences in elicitation. Journal of the Royal Statistical Society: Series D (The Statistician). 1998;47(1):3-19.
24. Kinnersley N, Day S. Structured approach to the elicitation of expert beliefs for a Bayesian-designed clinical trial: a case study. Pharmaceutical Statistics. 2013;12(2):104-13.
25. Knol AB, Slottje P, van der Sluijs JP, Lebret E. The use of expert elicitation in environmental health impact assessment: a seven step procedure. Environmental Health. 2010;9(1):19.
26. O'Hagan A. SHELF: the Sheffield Elicitation Framework. 2010. http://www.tonyohagan.co.uk/shelf/. Accessed 26/02/2016.
27. Morris DE, Oakley JE, Crowe JA. A web-based tool for eliciting probability distributions from experts. Environmental Modelling & Software. 2014;52:1-4.
28. Morris E, Oakley J. MATCH. 2014. http://optics.eee.nottingham.ac.uk/match/uncertainty.php/. Accessed 26/02/2016.
29. Cantrill J, Sibbald B, Buetow S. The Delphi and nominal group techniques in health services research. International Journal of Pharmacy Practice. 1996;4(2):67-74.
30. Melton LJ, Thamer M, Ray N, Chan J, Chesnut C, Einhorn T, et al. Fractures attributable to osteoporosis: report from the National Osteoporosis Foundation. Journal of Bone and Mineral Research. 1997;12(1):16-23.
31. Husereau D, Drummond M, Petrou S, Carswell C, Moher D, Greenberg D, et al. Consolidated Health Economic Evaluation Reporting Standards (CHEERS) statement. BMJ. 2013;346:f1049.
32. Evers S, Goossens M, De Vet H, Van Tulder M, Ament A. Criteria list for assessment of methodological quality of economic evaluations: Consensus on Health Economic Criteria. International Journal of Technology Assessment in Health Care. 2005;21(2):240-5.
33. Ramos MCP, Barton P, Jowett S, Sutton AJ. A systematic review of research guidelines in decision-analytic modeling. Value in Health. 2015;18(4):512-29.
34. Lancsar E, Louviere J. Conducting discrete choice experiments to inform healthcare decision making: a user's guide. Pharmacoeconomics. 2008;26(8):661-77.
35. Oxford Dictionaries. Oxford University Press. 2016. http://www.oxforddictionaries.com/definition/english. Accessed 26/02/2016.
36. Garthwaite PH, Kadane JB, O'Hagan A. Statistical methods for eliciting probability distributions. Journal of the American Statistical Association. 2005;100(470):680-701.
37. The Free Dictionary. Farlex. 2016. http://www.thefreedictionary.com/UK. Accessed 26/02/2016.
38. Macmillan Dictionary. Macmillan Publishers. 2016. http://www.macmillandictionary.com/. Accessed 26/02/2016.
Box 1: ‘Agreed’ definitions of core concepts
An expert is: ‘someone who has in-depth knowledge of the topic of interest gained through their life experience, education or training’.
A study designed to generate expert parameter values uses:
‘a quantitative elicitation method to: derive point estimates and distributions for model input parameters; and/or pool expert judgement.’
The purpose of an expert elicitation study is:
‘to quantify a parameter value that appropriately represents the knowledge/judgement of the expert and the degree of uncertainty in that knowledge/judgement’.
A study designed to collate expert opinion uses:
‘a qualitative consensus method (e.g. a Delphi method or other approach) to collate views from experts to: frame the scope of the model-based economic evaluation; inform model conceptualisation; identify model face validity; quantify point estimates without specifying a distribution; pool elicitation results’.
Table 1: Dictionary definitions of some core concepts
Concept | Definition | Source
opinion | a view or judgement formed about something, not necessarily based on fact or knowledge; the beliefs or views of a group or majority of people; a statement of advice by an expert on a professional matter | Oxford Dictionaries [35]
judgement | the ability to make considered decisions or come to sensible conclusions; an opinion or conclusion | Oxford Dictionaries [35]
belief | an acceptance that something exists or is true, especially one without proof; something one accepts as true or real; a firmly held opinion | Oxford Dictionaries [35]
knowledge | facts, information, and skills acquired through experience or education; the theoretical or practical understanding of a subject; the sum of what is known; information held on a computer system; true, justified belief; certain understanding, as opposed to opinion | Oxford Dictionaries [35]
expert | A) individual ‘whose knowledge can support informed judgement and prediction about the issues of interest’; B) ‘someone who has knowledge of the subject of interest gained through their life experience, education or training’ | A) Morgan (2014) [9]; B) Garthwaite (2005) [36]
elicitation | i) the act of getting information or a reaction from somebody, often with difficulty; ii) stimulation that calls up (draws forth) a particular class of behaviours; iii) the process of making someone react in a particular way | i) Oxford Dictionaries [35]; ii) The Free Dictionary [37]; iii) Macmillan Dictionary [38]
Table 2: Rating scale
Score | Rating | Description

Five-point scale used to rate definitions*
1 | Strongly disagree | This means that you think the definition as written does NOT define the term and requires extensive modification.
2 | Disagree | This means that you think the definition as written does NOT define the term but requires only minor modification.
3 | Neither agree nor disagree | This means that you do not have a strong opinion on the definition as written.
4 | Agree | This means that you think the definition as written DOES define the term but requires only some minor modification.
5 | Strongly agree | This means that you think the definition as written DOES define the term and requires no modification.

Five-point scale used to rate reporting criteria*
1 | Definitely not required | This means that you think that the criterion should NOT be included in the reporting criteria.
2 | Possibly not required | This means that you think that the criterion could probably be omitted without any loss of key detail.
3 | No strong opinion | This means that you do not have a strong opinion on whether the criterion is, or is not, required.
4 | Possibly required | This means that the criterion could be included but it is not vital (it would be ‘nice to have’).
5 | Definitely required | This means that you think that the criterion should be included in the reporting criteria, otherwise key detail will not be reported.

*There was also an option for the respondent to indicate that they ‘do not know’ the answer.
Table 3: Reporting criteria for an expert elicitation study
Criterion | Description | Note
Research rationale | The need for using an expert elicitation exercise should be described. | This should ideally include some reference to the design and conduct of systematic reviews to identify key input parameters for the decision-analytic model, and a statement confirming that these reviews did not identify data relevant for the model-based economic analysis as specified.
Research problem | All uncertain quantities (model input parameters) to be elicited should be described. | In some instances there may be a substantial number of uncertain quantities, and a degree of 'pre-selection' will have occurred to identify a relevant subset. Clear justification for the model parameters identified as key for the decision problem needs to be provided.
Measurement type of uncertain quantities | The rationale for the measurement type of each uncertain quantity elicited should be described. | Measurement types of uncertain quantities include (but are not limited to): scalar quantities (i.e. numbers); proportions (e.g. probabilities); ratios (e.g. odds, hazard); risks (e.g. relative); and rates (e.g. mortality). Some measures are easier to understand and elicit than others; it is therefore important to fully justify the selection of any measurement type.
Definition of an expert | The nature of the expert population should be described to clearly state what topic of expertise they represent and why. | It is unlikely that a single expert will be sufficient, and it is generally necessary to elicit judgement from a group of experts selected to represent the views of a larger population.
Number of experts | The selection criteria and final number of experts recruited to provide expert judgement should be reported. | Selection criteria need to be described in detail. There should be clear, specific, pre-defined criteria stating how experts were selected and if/how their elicited quantities were used.
Preparation | There should be clear reference to a protocol that describes the design and conduct of the elicitation exercise. | None.
Piloting | Whether the elicitation process was piloted should be clearly reported, together with a summary of any modifications made. | The selection and number of experts used in the piloting process should be reported. Key aspects that may have required modification include: selection of experts; measurement type and number of uncertain quantities to be elicited; training exercise; framing of the elicitation question; method of aggregation.
Data collection | The approach used to collect the data should be reported. | Data can be collected from individual experts or from group(s) of experts. Collecting data from individual experts means that a mathematical aggregation process may need to be used; collecting data from group(s) of experts means that behavioural aggregation methods may be used.
Administration | The mode of administering the elicitation exercise should be reported. | Elicitation exercises can be conducted face-to-face or via telephone and/or computer. In a limited number of situations it may be feasible to collect the data using a self-administered online or postal survey, but this is unlikely to be successful in most instances. Both face-to-face and telephone data collection are likely to be supported by a computer.
Training | The use of training materials should be reported and the materials made available. | This may include background training materials sent to the experts and/or training in the use of probabilities and the nature of distributions. The materials should explain the efforts made to avoid influencing experts' knowledge and judgement. In practice, this recommendation will require a copy of the training materials to be included, likely as electronic supplementary material.
The exercise | The number and framing of questions used in the exercise should be reported and made available. | This will require a copy of the elicitation exercise to be included, which is likely to be presented as electronic supplementary material.
Data aggregation | The type of aggregation method (mathematical or behavioural) should be reported, together with a description of the method or process used to aggregate the data. | Mathematical aggregation (relevant when data were collected from multiple individual experts) can be conducted using a range of methods, for example Bayesian methods, opinion pooling or Cooke's method. Behavioural aggregation (relevant when data were collected from group(s) of experts) can be conducted using processes such as the Delphi or nominal group technique.
Measures of performance for data aggregation | The processes followed to estimate measures of performance (calibration/information) for data aggregation need to be fully described. | Calibration is the process of measuring the performance of experts by comparing their judgements with 'seed parameters' (parameters whose true values are known, or can be found within the duration of a study). Calibration scores represent the probability that any differences between an expert's probabilities and the observed values of 'seed parameters' might have arisen by chance. Information represents the degree to which an expert's distribution is concentrated, relative to some user-selected background measure.
Ethical issues | The ethical issues for the expert sample and research community should be described. | The use of expert elicitation should acknowledge the issues of ethical responsibility, anonymity, reliability, and validity in an ongoing manner throughout the data collection and aggregation process.
Presentation of results | The individual and aggregated point estimate(s) and distribution(s) for each uncertain quantity should be presented. | The units of measurement should be clear, and attention should be paid to the style of presentation, which may benefit from the use of figures rather than relying on a tabular format.
Interpretation of results | The interpretation of the uncertain quantities elicited should be presented, together with a description of how the results will be used in the model-based economic analysis. | This should include an explanation of how the reader should interpret the results. It should be recognised that the number and type of experts used will affect the results obtained. The interpretation of results should comment on the degree of uncertainty observed.
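The 'Data aggregation' criterion in Table 3 names opinion pooling as one mathematical aggregation method. As an illustrative aside only (not part of the reporting criteria), a minimal sketch of linear opinion pooling is shown below; the function name, discrete-bin representation and weights are assumptions made for this example, and real elicitation software (e.g. SHELF or MATCH) works with fitted continuous distributions rather than discrete bins.

```python
# Illustrative sketch (assumed names): linear opinion pooling combines
# experts' discrete probability distributions by (weighted) averaging.

def linear_opinion_pool(expert_distributions, weights=None):
    """Pool experts' discrete distributions over the same set of bins.

    expert_distributions: list of lists, each summing to 1.
    weights: optional per-expert weights (default equal); normalised internally.
    """
    n = len(expert_distributions)
    if weights is None:
        weights = [1.0] * n
    total = sum(weights)
    weights = [w / total for w in weights]  # normalise so pooled result sums to 1
    n_bins = len(expert_distributions[0])
    pooled = [0.0] * n_bins
    for w, dist in zip(weights, expert_distributions):
        for i, p in enumerate(dist):
            pooled[i] += w * p
    return pooled

# Two experts' beliefs about an uncertain quantity falling in three ranges:
pooled = linear_opinion_pool([[0.2, 0.5, 0.3], [0.4, 0.4, 0.2]])
# with equal weights, pooled is approximately [0.3, 0.45, 0.25]
```

The equal-weight case corresponds to treating all experts as equally credible; calibration against seed parameters (as in Cooke's method) is one principled way to choose unequal weights instead.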
Table 4: Reporting criteria for a Delphi study designed to identify expert opinions
Criterion | Description | Note
Research problem | The research problem should be clearly defined and ideally framed explicitly as a research question to be addressed. | When clarifying the research problem, remember that the Delphi technique is a group facilitation technique and as such only lends itself to group involvement.
Research rationale | The topic and the use of the Delphi method should be justified. | The Delphi method is best used when the research requires anonymity to avoid the dominance of one opinion. It should also be remembered that the strength of the Delphi method lies in its use of iteration, in which the process of gaining opinion occurs in rounds, allowing individuals to change their opinion.
Literature review | The rationale for using the Delphi method must be informed by a clear description of the evidence base for the topic of the study. | The Delphi method is best applied where unanimity of opinion does not exist owing to a poor evidence base. This section should also describe the process of determining the most important issues to refer to in the design of the initial round of the Delphi.
Data collection | This should include a clear explanation of the Delphi method employed. | This should be sufficiently detailed for a reader to be able to duplicate the process of conducting the Delphi method, including a description of the types of questions used (qualitative or quantitative, and the ranking, rating or scoring scale used). This section should describe which medium was used to collect the data (electronic or written communication), how results from previous rounds were fed back to the experts, and whether feedback was given on the group and/or individual responses.
The survey | A copy of each round of the survey used in the Delphi method should be presented. | The use of journal supplementary appendices should be exploited to allow the reader access to a full copy of the survey used for each round of the Delphi.
Rounds | This should state the number of rounds planned and used, together with the plans for moving from one round to the next. | The structure of the initial round (either qualitative or quantitative) should be decided at the protocol stage of the study, together with the number of rounds to be used.
The sample | The sample or 'expert' panel should be described in terms of the definition of an expert in the context of the study, and the selection and composition of the panel, including how it was formed from a sampling frame and the response rate achieved. | It should be noted that the composition of the panel will affect the results obtained from the Delphi method. Careful thought should be given to the criteria employed to define an expert, the justification of a participant as an 'expert', and the use of non-probability sampling techniques (such as purposive or criterion methods).
Ethical issues | The ethical issues for the expert sample and research community should be described. | The use of the Delphi method should acknowledge the issues of ethical responsibility, anonymity, reliability, and validity in an ongoing manner throughout the data collection and analysis process.
Data analysis | The management of opinions and the analysis and handling of both qualitative and quantitative data should be described. | As with any other survey-based approach, a pre-specified data analysis plan should be prepared. This should include a clear description of the meaning of 'consensus' in relation to the stated aim of the study and of how 'agreement' is defined. It should also take account of the reliability and validity issues identified.
Presentation of results | The results for each round, and the final round, should be presented clearly, taking account of the audience of the study findings. | The response rate for each round should be stated. Careful consideration should be given to how to present the interim (between-round) and final results, in graphical and/or statistical form. For round 1, a summary of the total number of issues generated should be presented. For the final round, the strength of the overall consensus should be summarised. Reporting data from quantitative questions should acknowledge the limitations associated with eliciting point estimates (e.g. no indication of uncertainty).
Interpretation of results | The interpretation of the consensus gained (or not gained) should be presented, together with the meaning of the results and the direction of further research needed. | This should include an explanation of how the reader should interpret the results and how to digest the findings in relation to the emphasis placed upon them. It should be recognised that the composition of the panel will affect the results obtained. The interpretation of results should state whether 'outliers' from the overall consensus were asked for the reasons for their answers.