PEER REVIEW HISTORY
BMJ Open publishes all reviews undertaken for accepted manuscripts. Reviewers are asked to
complete a checklist review form (http://bmjopen.bmj.com/site/about/resources/checklist.pdf) and
are provided with free text boxes to elaborate on their assessment. These free text comments are
reproduced below.
ARTICLE DETAILS
TITLE (PROVISIONAL) A guided and unguided internet- and mobile-based intervention for
chronic pain: Health economic evaluation alongside a randomized
controlled trial
AUTHORS Paganini, Sarah; Lin, Jiaxi; Kählke, Fanny; Buntrock, Claudia; Leiding, Delia; Ebert, David; Baumeister, Harald
VERSION 1 – REVIEW
REVIEWER Dr Tania Gardner Department of Pain Medicine, St Vincent's Hospital Sydney, Australia
REVIEW RETURNED 22-Apr-2018
GENERAL COMMENTS The paper is well written and structured. It is for the most part easy for the reader to understand. Its weakness lies in the statistical methods chosen to test cost effectiveness, which is addressed by the authors in the limitations. This weakness makes it more difficult to accept the findings, and language to reflect this may be appropriate within the discussion and conclusion sections.
REVIEWER Brent Leininger University of Minnesota, Minneapolis, MN, USA
REVIEW RETURNED 11-May-2018
GENERAL COMMENTS The manuscript reports cost-effectiveness results of a randomized trial comparing an online delivered Acceptance and Commitment Therapy intervention either guided by facilitators or unguided, and a waitlist control group for individuals with chronic pain. This is a timely and innovative project given the substantial burden of chronic pain and the amount of healthcare resources devoted to the condition. I have provided comments and requests for clarifications below in an effort to strengthen the manuscript.
Abstract
• The objective for the study is difficult to follow mainly due to the subscripting of guided/unguided ACTonPain. It’s not clear in the objective that this is a three group study.
• Mean costs and effects by group are not provided in the results. This information would be helpful to the reader to understand the magnitude of total costs over the 6-months. It would also aid in the interpretation of the cost-effectiveness results.
• The dominance of unguided ACT over WLC (less costs, greater effects) is important. I suggest this should be highlighted within the abstract and the body of the manuscript. Also, the reporting of the probability of cost-effectiveness at a WTP of 0 for the comparisons of guided ACT and unguided ACT vs waitlist control creates confusion as it gives the impression that guided ACT is not cost-effective relative to a waitlist control. Given the dominance of unguided ACT over WLC, I suggest the results for guided ACT should focus on its cost-effectiveness relative to unguided ACT both within the abstract and body of the manuscript.
• What's the rationale for reporting the WTP where guided ACT has a higher probability of cost-effectiveness over unguided ACT instead of the ICER within the abstract? Was the ICER not the primary cost-effectiveness outcome?
Background
• Pg 4, line 13: I believe the phrase “a particular form of CBT showed to be effective” should read “a particular form of CBT shown to be effective”
• Pg 4, lines 22-25: “IMIs for chronic pain have been shown to effectively improve pain interference (standardized mean difference (SMD)=.4 [17], SMD=−0.50 [18]).” It’s unclear what this improvement in pain is relative to.
Methods
• The treatment response outcome is not listed in the protocol paper. This change will need to be noted along with a rationale (e.g. easier to interpret than a 1-point change on the MPI or PGIC)
• Details on the population used to determine preferences for health states from the AQol-8D and assign QALY values would be helpful for the reader.
• Please clarify if the resource use questionnaires for psychiatric illness also captured healthcare use for chronic pain. If not, the implications should be addressed within the discussion.
• It’s unclear if or how costs from randomization to month 3 were included in the analysis. Please clarify
• Was predictive mean matching for imputing costs and QALYs considered? This approach has been recommended for cost-effectiveness analyses as it ensures only plausible values are imputed. This may not be necessary if the authors are confident implausible values for costs or QALYs were not imputed, but this should be stated. (Faria R, Gomes M, Epstein D, White IR. A guide to handling missing data in cost-effectiveness analysis conducted within randomized controlled trials. Pharmacoeconomics 2014;32:1157–70)
• Pg 11, lines 17-20: “The burden of the interventions on participants was not assessed, but the satisfaction with the intervention.” I believe the end of this sentence should read “but the satisfaction with the interventions was assessed.” Was this information reported in the primary results paper? If so, a reference would be helpful to alert the reader to where this information is located.
• The rationale for not also reporting cost-effectiveness from the healthcare perspective should be noted (e.g. intervention costs for guided and unguided ACT fall outside the healthcare sector).
Results
• The reported QALY measures for 6-months appear low (one-year QALY estimates from 0.50 to 0.54) compared to national EQ-5D estimates for pain and chronic pain from other countries. Please verify the accuracy of the measures and if accurate, discuss this difference.
o Saarni SI, Härkänen T, Sintonen H, Suvisaari J, Koskinen S, Aromaa A, Lönnqvist J. The impact of 29 chronic conditions on health-related quality of life: a general population survey in Finland using 15D and EQ-5D. Quality of Life Research. 2006 Oct 1;15(8):1403-14.
o Sullivan PW, Lawrence WF, Ghushchyan V. A national catalog of preference-based scores for chronic conditions in the United States. Medical care. 2005 Jul 1;43(7):736-49.
• Table 4: Negative ICERs can have two interpretations, the table should clearly specify what treatment was dominant when a negative ICER is present. Interpretation of negative ICERs is also difficult as holding differences in costs favoring ACT fixed, smaller QALY gains lead to a more negative ICER. Recommend removing negative ICERs and highlighting which treatment is dominant.
• Table 4: It’s unclear why the Cost-utility ICER for Guided ACT vs WLC using EQ5D QALYs is negative, as costs were higher and effects were greater for Guided ACT
• Table 4: It’s unclear why the Cost-utility ICER for Unguided ACT vs WLC using EQ5D QALYs is positive, as costs were lower and effects were greater for Unguided ACT
• The results of the sensitivity analysis should include the impact on the ICER.
Discussion
• The discussion is difficult to follow given the three pairwise comparisons. Suggest framing the discussion by noting the dominance of Unguided ACT over waitlist (while describing uncertainty) and then focusing on the cost-effectiveness of guided vs unguided ACT.
• The impact of the sensitivity analysis on cost-effectiveness should be discussed
BMJ Open: first published as 10.1136/bmjopen-2018-023390 on 9 April 2019. Downloaded from http://bmjopen.bmj.com/ on March 10, 2020 by guest. Protected by copyright.
VERSION 1 – AUTHOR RESPONSE
Reviewers' Comments to Author and responses:
We would like to thank both reviewers as we think that the comments were very helpful and
contributed to the quality of the manuscript.
Reviewer: 1
Reviewer Name: Dr Tania Gardner
Institution and Country: Department of Pain Medicine, St Vincent's Hospital Sydney, Australia
Competing Interests: None declared
Comment 1: The paper is well written and structured. It is for the most part easy for the reader to
understand. Its weakness lies in the statistical methods chosen to test cost effectiveness which is
addressed by the authors in the limitations. This weakness makes it more difficult to accept the
findings and language to reflect this may be appropriate within the discussion and conclusion
sections.
Authors‘ response:
Thank you for your evaluation and your comment. We followed your suggestion and included the
following sentence in the section “implications and future research”:
Pg 21, line 55: “Future research should especially focus on conducting methodologically sound
studies that are powered to statistically test health economic differences.”
Reviewer: 2
Reviewer Name: Brent Leininger
Institution and Country: University of Minnesota, Minneapolis, MN, USA
Competing Interests: None declared
The manuscript reports cost-effectiveness results of a randomized trial comparing an online delivered
Acceptance and Commitment Therapy intervention either guided by facilitators or unguided, and a
waitlist control group for individuals with chronic pain. This is a timely and innovative project given the
substantial burden of chronic pain and the amount of healthcare resources devoted to the condition. I
have provided comments and requests for clarifications below in an effort to strengthen the
manuscript.
Abstract
Comment 1:
The objective for the study is difficult to follow mainly due to the subscripting of guided/unguided
ACTonPain. It’s not clear in the objective that this is a three group study.
Authors‘ response:
Thank you for pointing this out. We changed it according to your suggestion:
Pg 2, line 7-11: “This study aims at evaluating the cost-effectiveness and cost-utility of a guided and unguided internet-based intervention for chronic pain patients (ACTonPainguided/ACTonPainunguided) compared to a waitlist control condition (WLC) as well as the comparative cost-effectiveness of the two interventions.”
Comment 2:
Mean costs and effects by group are not provided in the results. This information would be helpful to
the reader to understand the magnitude of total costs over the 6-months. It would also aid in the
interpretation of the cost-effectiveness results.
Authors‘ response:
We added the suggested information in the abstract, but had to delete other information (see section primary and secondary outcome measures) due to the limited number of words in the abstract: Pg 2, line 42-53: “Results: At 6-month follow-up treatment response and QALYs were highest in ACTonPainguided (44% and 0.280; mean costs=€6,945), followed by ACTonPainunguided (28% and 0.266; mean costs=€6,560) and WLC (16% and 0.244; mean costs=€6,908). At a willingness-to-pay of €0 the probability of being cost-effective was 50% for ACTonPainguided and 66% for ACTonPainunguided, respectively, for both treatment response and QALY compared to WLC and in the comparative analysis 35% per treatment response and 31% per QALY gained.”
Comment 3:
The dominance of unguided ACT over WLC (less costs, greater effects) is important. I suggest this
should be highlighted within the abstract and the body of the manuscript. Also, the reporting of the
probability of cost-effectiveness at a WTP of 0 for the comparisons of guided ACT and unguided ACT
vs waitlist control creates confusion as it gives the impression that guided ACT is not cost-effective
relative to a waitlist control. Given the dominance of unguided ACT over WLC, I suggest the results
for guided ACT should focus on its cost-effectiveness relative to unguided ACT both within the
abstract and body of the manuscript.
Authors‘ response:
Thank you for this comment. We gave the dominance of unguided ACT more prominence in the abstract but,
because of the abstract's word limit, discuss it in detail only in the main manuscript.
Pg 3, line 2-12: “Findings indicate that ACTonPain has the potential of being a cost-effective
alternative or adjunct to established pain treatment, with ACTonPainunguided (vs. WLC) even leading
to lower costs at better health outcomes. However, whether the intervention should be delivered
guided or unguided depends on the society´s willingness-to-pay. The direct comparison of the two
interventions indicates a preference for ACTonPainunguided under health economic aspects.”
Comment 4:
What's the rationale for reporting the WTP where guided ACT has a higher probability of cost-
effectiveness over unguided ACT instead of the ICER within the abstract? Was the ICER not the
primary cost-effectiveness outcome?
Authors‘ response:
In interpreting the results we tried not to focus solely on the ICER, as its interpretation is
sometimes difficult (e.g. for the main outcome “treatment response” or in the case of negative
ICERs). As the acceptability curve contains a lot of valuable information (including uncertainty,
in contrast to the ICER value), we focused on reporting these results.
A second reason was pragmatic: the word count of the abstract is limited, and with three
comparison groups the results section would contain a lot of information. However, ICERs can be
found in the main manuscript and we integrated them more fully into the discussion.
Background
Comment 5:
Pg 4, line 13: I believe the phrase “a particular form of CBT showed to be effective” should read “a
particular form of CBT shown to be effective”
Authors‘ response:
Thank you for the correction. We changed the sentence into:
Pg 4, line 9-16: “Treatments based on cognitive-behavioral therapy (CBT) or third-wave therapies, like
the Acceptance and Commitment Therapy (ACT, a particular form of CBT) have shown to be effective
for chronic pain patients [11, 12] and could show acceptable results concerning cost-effectiveness
[13].”
Comment 6:
Pg 4, lines 22-25: “IMIs for chronic pain have been shown to effectively improve pain interference
(standardized mean difference (SMD)=.4 [17], SMD=−0.50 [18]).” It’s unclear what this improvement
in pain is relative to.
Authors‘ response:
We agree and added the following:
Pg 4, line 23-27: “IMIs for chronic pain have been shown to effectively improve pain interference
compared to different control groups such as standard (medical) care, text-based material and mostly
waitlist control condition (standardized mean difference (SMD)=.4 [17], SMD=−0.50 [18]).”
Methods
Comment 7:
The treatment response outcome is not listed in the protocol paper. This change will need to be noted
along with a rationale (e.g. easier to interpret than a 1-point change on the MPI or PGIC)
Authors‘ response:
We added the following sentence in the “treatment response section”:
Pg 6, line 52- 57; pg 7, line 2-5: “This outcome was not defined in the protocol paper. However, it was
chosen to calculate a reliable and meaningful change in pain interference according to the
recommendations of the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials
(IMMPACT) [33] instead of a one-point change on the MPI, that would be difficult to interpret.”
Comment 8:
Details on the population used to determine preferences for health states from the AQol-8D and
assign QALY values would be helpful for the reader.
Authors‘ response:
Thank you for the suggestion. We added the following sentence: Pg 7, line 45: “Utility weights are derived from the Australian adult population (Richardson 2011).”
Comment 9:
Please clarify if the resource use questionnaires for psychiatric illness also captured healthcare use
for chronic pain. If not, the implications should be addressed within the discussion.
Authors‘ response:
For this study, the “Trimbos and iMTA questionnaire for costs associated with psychiatric illness” was
extended/adapted to the healthcare use of chronic pain patients. We added this information in the
text:
Pg 8, line 13-19: “Resource use and costing: The Trimbos and iMTA questionnaire for costs
associated with psychiatric illness (TiC-P) [43, 44] was adapted to the German health care system
and to the healthcare use of individuals with chronic pain. It was used to assess the direct and indirect
costs of the past three month at T0 and T2.”
Comment 10:
It’s unclear if or how costs from randomization to month 3 were included in the analysis. Please clarify
Authors‘ response:
Thank you for the comment. We added this information in the section “Resource use and costing”: Pg 8, line 25-34: “To calculate the 6-month accumulated per-participant costs, the area under curve (AUC) method was used by linearly interpolating 3-month costs (measured at T0 and T2) to cover the full period of six months [42].
\[ AUC = \left( \frac{\frac{Costs_{T0}}{3} + \frac{Costs_{T2}}{3}}{2} \right) \times 3 + Costs_{T2} \]”
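The interpolation quoted above can be sketched in a few lines of Python. This is only an illustration of the formula as stated in the response: the function name and the example amounts are hypothetical and are not taken from the trial data.

```python
def six_month_costs(costs_t0: float, costs_t2: float) -> float:
    """Accumulated 6-month costs via the area-under-the-curve formula.

    costs_t0: costs reported at T0 for the preceding three months
    costs_t2: costs reported at T2 for the preceding three months
    """
    # Convert each 3-month total to a monthly rate, average the two rates,
    # and scale back up to three months to estimate months 0 to 3.
    months_0_to_3 = ((costs_t0 / 3 + costs_t2 / 3) / 2) * 3
    # Months 3 to 6 are the costs observed directly at T2.
    return months_0_to_3 + costs_t2


# Hypothetical amounts in euros, chosen only to keep the arithmetic simple.
print(six_month_costs(3000.0, 3600.0))  # prints 6900.0
```

Algebraically the expression reduces to (Costs_T0 + Costs_T2) / 2 + Costs_T2, so the first three months are an estimate rather than an observation.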
Comment 11:
Was predictive mean matching for imputing costs and QALYs considered? This approach has been
recommended for cost-effectiveness analyses as it ensures only plausible values are imputed. This
may not be necessary if the authors are confident implausible values for costs or QALYs were not
imputed, but this should be stated. (Faria R, Gomes M, Epstein D, White IR. A guide to handling
missing data in cost-effectiveness analysis conducted within randomized controlled trials.
Pharmacoeconomics 2014;32:1157–70)
Authors‘ response:
Thank you for the comment. As single imputation was used in the main analysis, we decided to
use single regression imputation.
It has been shown that complete case analyses do not show outcome differences compared to
reference analyses/ITT analyses (Vroomen, J. M., Eekhout, I., Dijkgraaf, M. G., van Hout, H., de
Rooij, S. E., Heymans, M. W., & Bosmans, J. E. Multiple imputation strategies for zero-inflated cost
data in economic evaluations: which method works best? The European Journal of Health
Economics, 2016, 17(8), 939-950). Therefore, we calculated QALY outcomes and 6-month
accumulated per-participant costs with complete case data and compared them to the results of the
imputed data. As the differences from the reported values are small and outcomes are comparable, we
are confident that no implausible values for costs or QALYs were imputed.
However, as single imputation approaches do not completely reflect missing data uncertainty, we
stated this limitation in the discussion.
Pg 21, line 20-30: “Finally, the usage of multiple imputation techniques is frequently recommended
(e.g. predictive mean matching [72]). We used a single imputation approach as it was done in the main
(effectiveness) analysis [25] that might not truly reflect missing data uncertainty. However, the
comparison with cost and QALY outcomes of complete case analysis revealed only small differences,
indicating that the risk of implausible values due to single imputation in this evaluation is low [73].
Furthermore, by using the non-parametric bootstrapping, sampling was considered.”
Comment 12:
Pg 11, lines 17-20: “The burden of the interventions on participants was not assessed, but the
satisfaction with the intervention.” I believe the end of this sentence should read “but the satisfaction
with the interventions was assessed.” Was this information reported in the primary results paper? If
so, a reference would be helpful to alert the reader to where this information is located.
Authors‘ response:
Thank you for the correction. We changed the beginning of the sentence as well:
Pg 12, line 9-14: “Possible negative effects were assessed as well as the satisfaction with the
intervention (for results, see [25]).”
Additionally, we cited the primary results paper.
Comment 13:
The rationale for not also reporting cost-effectiveness from the healthcare perspective should be
noted (e.g. intervention costs for guided and unguided ACT fall outside the healthcare sector).
Authors‘ response:
We agree that this would be quite interesting. However, several health economic perspectives
would be worth considering (the employer's perspective as well as the healthcare perspective),
and the paper already contains a lot of information because of the three comparison groups.
We therefore decided to conduct the health economic evaluation from the most commonly
used and recommended societal perspective (Mathes T, Jacobs E, Morfeld J-C, Pieper D. Methods of
international health technology assessment agencies for economic evaluations-a comparative
analysis. BMC health services research 2013;13(1):371). This was also stated in the study protocol.
Therefore, we would prefer not to discuss this explicitly in the paper.
Results
Comment 14:
The reported QALY measures for 6-months appear low (one-year QALY estimates from 0.50 to 0.54)
compared to national EQ-5D estimates for pain and chronic pain from other countries. Please verify
the accuracy of the measures and if accurate, discuss this difference.
o Saarni SI, Härkänen T, Sintonen H, Suvisaari J, Koskinen S, Aromaa A, Lönnqvist J. The impact
of 29 chronic conditions on health-related quality of life: a general population survey in Finland using
15D and EQ-5D. Quality of Life Research. 2006 Oct 1;15(8):1403-14.
o Sullivan PW, Lawrence WF, Ghushchyan V. A national catalog of preference-based scores for
chronic conditions in the United States. Medical care. 2005 Jul 1;43(7):736-49.
Authors‘ response:
Thank you for this comment. We checked the QALY measures and discussed the results in the
discussion section:
Pg 19, line 18-34: “Estimated EQ-5D scores for one year ranged from 0.50 to 0.54, what appears rather
low compared to national EQ-5D estimates for (back) pain from other countries (e.g. 0.74-0.79 [56, 57]).
Lower estimates in the current study could have occurred due to the sociodemographic properties of
this study sample, as participants were predominately women (84%), reported comorbid medical or
mental conditions (57% and 39%, respectively) and the back was the most often reported pain location
(34%) [25]. Several studies showed that the mentioned characteristics (female sex, musculoskeletal
and mental disorders) are associated with lower quality of life scores [56-58]. Furthermore, Burström et
al. (2001) reported in their study that participants with low back pain showed quality of life weights of
0.55, what is comparable to the sample in the current study [58].”
Comment 15:
Table 4: Negative ICERs can have two interpretations, the table should clearly specify what treatment
was dominant when a negative ICER is present. Interpretation of negative ICERs is also difficult as
holding differences in costs favoring ACT fixed, smaller QALY gains lead to a more negative ICER.
Recommend removing negative ICERs and highlighting which treatment is dominant.
Authors‘ response:
We followed the suggestion and changed negative ICERs into statements about treatment
dominance. E.g.:
Pg 15, Table 2, line 30-33: “ACTonPain unguided dominates WLC”
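As a rough sketch of why a dominance statement is clearer than a negative ICER, the classification can be written out explicitly. The function below is hypothetical and is not the authors' analysis code. The example inputs are the unadjusted group means quoted for the revised abstract (mean costs and QALYs at 6 months), so they will not reproduce the bootstrapped, baseline-adjusted figures reported in the manuscript.

```python
def classify(delta_cost: float, delta_effect: float) -> str:
    """Describe intervention A relative to comparator B.

    delta_cost:   costs of A minus costs of B
    delta_effect: effects of A minus effects of B
    """
    if delta_cost < 0 and delta_effect > 0:
        return "A dominates B"  # cheaper and more effective
    if delta_cost > 0 and delta_effect < 0:
        return "A is dominated by B"  # costlier and less effective
    if delta_cost > 0 and delta_effect > 0:
        return "A is more costly and more effective"
    if delta_cost < 0 and delta_effect < 0:
        return "A is less costly and less effective"
    return "no difference on at least one axis"


# Group means quoted in the response to Comment 2 (costs in euros, QALYs).
guided = (6945.0, 0.280)
unguided = (6560.0, 0.266)
wlc = (6908.0, 0.244)

print(classify(unguided[0] - wlc[0], unguided[1] - wlc[1]))  # A dominates B
print(classify(guided[0] - wlc[0], guided[1] - wlc[1]))
print(classify(guided[0] - unguided[0], guided[1] - unguided[1]))
```

Both "A dominates B" and "A is dominated by B" produce a negative ratio of incremental costs to incremental effects, which is the ambiguity the reviewer describes.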
Comments 16 and 17:
Table 4: It’s unclear why the Cost-utility ICER for Guided ACT vs WLC using EQ5D QALYs is
negative, as costs were higher and effects were greater for Guided ACT
Table 4: It’s unclear why the Cost-utility ICER for Unguided ACT vs WLC using EQ5D QALYs is
positive, as costs were lower and effects were greater for Unguided ACT
Authors‘ response:
Thank you very much for this comment. We agree that these values were counter-intuitive and
therefore, we checked again all parts of the analysis.
After checking the baseline values of the AQoL-utilities and the EQ5D-utilities (that were not reported
yet), it can be suggested that the baseline values are not “similar”, as small differences in these
values can have a great impact.
AQoL baseline values were:
ACTonPainguided: M=0.496, SD=0.16; ACTonPainunguided: M=0.485, SD=0.17; waitlist: M=0.463,
SD=0.14
EQ5D baseline values were:
ACTonPainguided: M=0.469, SD=0.32; ACTonPainunguided: M=0.436, SD=0.31, waitlist: M=0.494,
SD=0.3
Therefore, we concluded that baseline adjustment seems to be necessary and calculated the main
and the sensitivity analyses again with adjustment for baseline values, respectively.
Pg 11, line 11-15: “At baseline, AQoL-utilities differed between groups (ACTonPainguided: M=0.496,
SD=0.16; ACTonPainunguided: M=0.485, SD=0.17; waitlist: M=0.463, SD=0.14). Therefore, baseline
adjustments were made in further calculations.”
Pg 12, line 9-13: “As baseline EQ5D-utilities differed between groups (ACTonPainguided: M=0.469,
SD=0.32; ACTonPainunguided: M=0.436, SD=0.31; waitlist: M=0.494, SD=0.3), baseline adjustment
was made in the sensitivity analyses.”
After baseline adjustment, the EQ5D no longer produced implausible values and the results
were comparable to the AQoL-8D results (which is discussed in the manuscript). Because values changed
slightly, we adjusted table 4 as well as the discussion. The only changes in interpretation were that
it became clearer that ACTonPain guided is not cost-effective compared to ACTonPain
unguided and that the EQ5D-3L showed slightly higher incremental QALY differences compared to
the AQoL. All other interpretations stayed the same.
Comment 18:
The results of the sensitivity analysis should include the impact on the ICER.
Authors‘ response:
As we do not state the ICERs explicitly in the results of the main analysis and focus more on the ICER
distribution on the cost-effectiveness plane and the WTP, we would like to report the results
consistently.
Furthermore, as soon as the bootstrapped ICERs fall into all four quadrants of the cost-effectiveness
plane, a “simple” interpretation of the mean ICER is hardly possible.
However, we included information about the WTP at €0 (to provide more comparable information) and
added a discussion of the results of the sensitivity analysis in the discussion section.
Pg 17, line 34-44: “After non-parametric bootstrapping, using the EQ-5D-3L resulted in smaller
incremental QALY gains in all comparisons compared to the results using the AQoL-8D (see table 4).
At a WTP of €0 the probability of ACTonPainguided of being cost-effective compared to waitlist was
50%. The probability of ACTonPainunguided of being cost-effective compared to waitlist was 64% at a
WTP of €0. ACTonPainguided vs. ACTonPainunguided resulted in a probability of being cost-effective of
31% at a WTP of €0.”
Discussion
Comment 19:
The discussion is difficult to follow given the three pairwise comparisons. Suggest framing the
discussion by noting the dominance of Unguided ACT over waitlist (while describing uncertainty) and
then focusing on the cost-effectiveness of guided vs unguided ACT.
Authors‘ response:
Thank you for this feedback. We revised the structure of the discussion and highlighted the outcome
of unguided ACT. You can find the changed parts of the discussion highlighted in the manuscript:
Pg 18-21
Comment 20:
The impact of the sensitivity analysis on cost-effectiveness should be discussed
Authors‘ response:
We included the following:
Pg 19, line 14-34: “The results of the sensitivity analyses revealed smaller incremental QALY gains by
using the EQ5D-3L compared to the AQoL-8D but overall conclusions are the same as in the main
analyses. Only the comparison between ACTonPainguided/unguided resulted in a higher ICER
(114,858), so that the guided version would not be judged as cost-effective according to the NICE
threshold. However, the distribution on the cost-effectiveness plane was similar compared to the
results of the main analysis.”
VERSION 2 – REVIEW
REVIEWER Brent Leininger University of Minnesota, Minneapolis, MN, USA
REVIEW RETURNED 01-Aug-2018
GENERAL COMMENTS The authors have addressed most of the concerns raised by reviewers. A few issues remain which are highlighted in the attached document. - The reviewer also provided a marked copy with additional comments. Please contact the publisher for full details.
VERSION 2 – AUTHOR RESPONSE
Minor revisions
Reviewers' Comments to Author and responses:
Reviewer: 2
Reviewer Name: Brent Leininger
Institution and Country: University of Minnesota, Minneapolis, MN, USA
Competing Interests: None declared
Comment 1 and comment 2 (Abstract):
1: Participants receiving guided ACT reported better health outcomes, but at a higher cost. The
potential value of this intervention is lost when just reporting the cost-effectiveness at a WTP of 0, as
it assumes society is unwilling to pay for health. Reporting the ICER would allow the reader to assess
value if society were willing to pay for health.
2: Within the abstract, this statement relies on a WTP of 0. What if a society is willing to pay for
health gains? Reporting the actual ICER would provide support for this statement, if the ICER is
above commonly accepted thresholds.
Authors‘ response:
We included the ICER/ICUR in the abstract, but as we think that only stating the ICER is not enough
to interpret the results correctly, we did not delete the probabilities. Instead we added the 95%-
probability of being cost-effective, to show, that with a high probability the WTP would be quite
high:
“Results: At 6-month follow-up treatment response and QALYs were highest in ACTonPainguided
(44% and 0.280; mean costs=€6,945), followed by ACTonPainunguided (28% and 0.266; mean
costs=€6,560) and WLC (16% and 0.244; mean costs=€6,908). ACTonPainguided vs WLC revealed an
ICER/ICUR of 171 and 3,033, respectively, while ACTonPainunguided dominated WLC. At a
willingness-to-pay of €0 the probability of being cost-effective was 50% for ACTonPainguided and
66% for ACTonPainunguided. This probability rises to 95% when society´s willingness-to-pay is
€91,000 (ACTonPainguided) and €127,000 (ACTonPainunguided) per QALY. ACTonPainguided vs.
ACTonPainunguided revealed an ICER/ICUR of 2,949 and 198,377.”
Comment 3: “Have been shown”
Authors‘ response:
Thank you. We corrected that.
Comment 4:
Please include the use of baseline and month 3 to 6 costs to estimate costs from months 0 to 3 as a
limitation.
Authors‘ response:
Thank you for that comment. We added the following to the limitations section: “Furthermore,
costs between randomization and three months after randomization were calculated with the area
under the curve method. This is just an estimate and not a representation of the actual costs
incurred during this period.”
Comment 5 and comment 6:
5: 6/0.01 = 600, please check the accuracy of reported ICERs and provide an explanation for the
difference.
6: 388/0.008 = 48,500; which is in line with figure 2. The reported ICERs appear inaccurate. Please
check on the accuracy and provide an explanation for the discrepancy.
Authors‘ response:
Thank you for that question. The incremental costs (6 and 388) and effects (0.01 and 0.008) that you
mentioned are the “mean incremental costs” and the “mean incremental effects” (calculated after
bootstrapping).
However, the mean (bootstrapped) ICER is not the result of “mean incremental costs / mean
incremental effects”.
After bootstrapping, 5000 ICERs are calculated and then the mean of those 5000 ICERs is taken
(which results in 3,033 and 198,377, respectively, for the two examples in your comments).
It has to be calculated like that, because otherwise (when “simply” dividing the two means)
information would get lost.
So the values of the mean bootstrapped ICERs/ICURs are all stated correctly in the manuscript.
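The difference between the two calculations discussed under comments 5 and 6 can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names (`ratio_of_means`, `mean_of_ratios`, `bootstrap_increments`) and the toy numbers are assumptions made for illustration, and the resampling shown is a plain non-parametric bootstrap of cost/effect pairs within each arm.

```python
import random
from statistics import fmean


def ratio_of_means(d_costs, d_effects):
    """ICER as mean incremental cost divided by mean incremental effect."""
    return fmean(d_costs) / fmean(d_effects)


def mean_of_ratios(d_costs, d_effects):
    """Mean of the per-replicate ICERs (replicates with a zero effect difference are skipped)."""
    return fmean(c / e for c, e in zip(d_costs, d_effects) if e != 0)


def bootstrap_increments(arm_a, arm_b, n_boot=5000, seed=0):
    """Resample (cost, effect) pairs within each arm; return per-replicate increments (a minus b)."""
    rng = random.Random(seed)
    d_costs, d_effects = [], []
    for _ in range(n_boot):
        sample_a = rng.choices(arm_a, k=len(arm_a))
        sample_b = rng.choices(arm_b, k=len(arm_b))
        d_costs.append(fmean(cost for cost, _eff in sample_a)
                       - fmean(cost for cost, _eff in sample_b))
        d_effects.append(fmean(eff for _cost, eff in sample_a)
                         - fmean(eff for _cost, eff in sample_b))
    return d_costs, d_effects


# Hypothetical replicates: same incremental cost, one negative effect difference.
toy_costs = [10.0, 10.0, 10.0]
toy_effects = [0.5, 0.1, -0.1]
print(ratio_of_means(toy_costs, toy_effects))
print(mean_of_ratios(toy_costs, toy_effects))
```

In the toy replicates, a single negative effect difference is enough to make the mean of the ratios differ strongly from the ratio of the means, which is the kind of gap the reviewer pointed out between 388/0.008 = 48,500 and the reported 198,377.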
Comment 7:
“This metric and CEAC curve appears at odds with an ICER of 198,377. Please check for accuracy.”
Authors‘ response:
Here we report two different values. The value €41,350 is the break-even point, at which the guided
and the unguided version have the same probability (0.5) of being cost-effective.
€198,377 would have to be invested to gain one QALY with the guided version compared to the
unguided version.
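How a probability of being cost-effective is read off the bootstrap replicates, as opposed to a single ICER, can be sketched as follows. This is an illustrative Python sketch only: it assumes the usual net monetary benefit rule (WTP × incremental effect − incremental cost > 0), and the function names and toy values are hypothetical, not taken from the manuscript.

```python
def prob_cost_effective(d_costs, d_effects, wtp):
    """Share of bootstrap replicates with positive net monetary benefit at this WTP."""
    pairs = list(zip(d_costs, d_effects))
    positive = sum(1 for d_cost, d_effect in pairs if wtp * d_effect - d_cost > 0)
    return positive / len(pairs)


def ceac(d_costs, d_effects, wtp_grid):
    """Cost-effectiveness acceptability curve as (WTP, probability) points."""
    return [(wtp, prob_cost_effective(d_costs, d_effects, wtp)) for wtp in wtp_grid]


def min_wtp_reaching(d_costs, d_effects, target, wtp_grid):
    """Smallest WTP on the grid whose probability reaches the target, or None."""
    for wtp, prob in ceac(d_costs, d_effects, sorted(wtp_grid)):
        if prob >= target:
            return wtp
    return None


# Hypothetical incremental costs (EUR) and QALYs from four bootstrap replicates.
toy_costs = [100.0, -50.0, 200.0, 300.0]
toy_qalys = [0.01, 0.02, -0.01, 0.02]
print(ceac(toy_costs, toy_qalys, [0, 5000, 20000]))
print(min_wtp_reaching(toy_costs, toy_qalys, 0.5, [0, 5000, 20000]))
```

Calling `min_wtp_reaching` with a target of 0.5 corresponds to the break-even point described above, while a target of 0.95 corresponds to the 95% values quoted in the abstract.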
Comment 8:
When reviewing Table 4, incremental QALY gains measured with the EQ5D are larger than those
measured with the AQoL.
Authors‘ response:
Thank you very much. We corrected that.
Comment 9:
In addition to the NICE thresholds, reference to how cost-effectiveness is assessed in Germany,
where the study was conducted, would be appropriate.
Authors‘ response:
In Germany there is no official WTP threshold and “…decisions for German Social Health Insurance
(“Gesetzliche Krankenversicherung, GKV”) are made on a case-by-case basis by a decision-making
body called “Gemeinsamer Bundesausschuss” (G-BA) with no obvious or transparent decision
criteria.” (Ahlert, M., Breyer, F., Schwettmann, L., 2013, p. 2. What you ask is what you get:
willingness-to-pay for a QALY in Germany. DIW Berlin Discussion Paper). Furthermore, they state that
“…findings show first that Germans have no higher WTP for health gains than other Europeans.”
(Ahlert et al., 2013, p.1).
This was the reason why we used the only officially stated threshold, that of NICE, as a reference;
based on the aforementioned aspects, we think this is appropriate.
Comment 10:
The response to reviewers‘ comments included more information in this section. “Only the
comparison between ACTonPainguided/unguided resulted in a higher ICER (114,858), so that the
guided version would not be judged as cost-effective according to the NICE threshold. However, the
distribution on the cost-effectiveness plane was similar compared to the results of the main
analysis.” Please include this information, noting the ICER was lower when EQ5D was used (114,858
vs 198,377) but still above recommended thresholds for cost-effectiveness (if that’s the case after
checking the accuracy of reported ICERs).
Authors‘ response:
Thank you for this comment. As the article is already quite comprehensive, we left out this part (and
failed to delete it in the point-by-point reply). We think that the information in these sentences is
not essential and that the most important information is that the overall conclusions are the same as in
the main analyses (as already stated in the text). Therefore, we would prefer to leave the text
as it is at this point.
Comment of the authors:
Comment 1:
At the beginning of the manuscript the address and affiliation of the corresponding author has been
changed:
a Corresponding author: Sarah Paganini, Department of Sports and Sport Science, Sport Psychology,
University of Freiburg, Germany, Schwarzwaldstr. 175, 79117 Freiburg, Phone: +49 (0)761 /
2034514, Email: [email protected]
Comment 2:
Furthermore, we revised the discussion slightly. We wanted to emphasize that although
ACTonPainunguided dominates WLC, the WTP would have to be quite high to reach a 95% probability. Therefore,
the decision whether the intervention is cost-effective or not cannot be made clearly. We thought
that this was not highlighted enough before.
“Comparing both ACTonPain interventions to waitlist and by taking uncertainty into account,
ACTonPainunguided could be judged as a potentially cost-effective intervention as it dominates
WLC by leading to higher QALY gains and more individuals with a treatment response at lower costs.
However, when assuming that an intervention should reach a likelihood of being cost-effective of
95% or greater it has to be considered that the WTP would have to be €13,460 for treatment
response and €127,000 for a QALY gain. Therefore, the judgement of whether the intervention is
cost-effective or not ultimately depends on the society’s WTP for treatment response or a QALY
gain, respectively.”
And
“…when comparing the costs that would have to be invested by using ACTonPainguided (compared to
waitlist) for a QALY gained (€3,033) to the only official WTP threshold stated by the National
Institute for Health and Clinical Excellence (NICE) of £20,000 to £30,000 [54] (~ €22,647 - €33,971;
conversion according to the European Central bank [55]), this intervention would also be categorized
as a potentially cost-effective treatment. Here again, uncertainty has to be considered as well as the
required WTP for a likelihood of being cost-effective of 95% of €6,490 (treatment response) and
€91,000 (QALY gain).”
And
Implications and future research
“…Findings from this health economic evaluation study show that both versions of ACTonPain have
the potential of being cost-effective, with the unguided version even leading to lower costs
(compared to WLC). However, uncertainty should be taken into account.”
Comment 3:
Furthermore, we deleted an example in the discussion because it is not stated in the results:
In terms of QALYs gained, the guided version only reaches a probability of 31% of being cost-
effective at a WTP of €0 and even with rising WTP threshold, the probability does not increase much
(e.g. at a WTP of €50,000 to about 53%).
VERSION 3 – REVIEW
REVIEWER Brent Leininger University of Minnesota, United States of America
REVIEW RETURNED 06-Nov-2018
GENERAL COMMENTS The authors have addressed most of the concerns raised by reviewers during the last revision. I recommend the authors address the following two concerns (one of which is new and arose based on the authors' response to the last round of comments):
#1 - Please add a sentence or two in the discussion on German cost-effectiveness thresholds (e.g. from the response to reviewers: "Germans have no higher WTP for health gains than other Europeans.").
#2 - Negative incremental cost-effectiveness ratios can represent two very different findings (findings in the southeast and northwest
quadrants of the cost-effectiveness plane both result in negative ICERs). In addition, the magnitude of a negative ICER is meaningless. Bootstrapped ICERs reported in the paper crossed multiple quadrants on the cost-effectiveness plane and included negative values (where interpretation is needed and the magnitude is meaningless). Therefore, I strongly recommend the authors report ICER/ICURs based on the mean costs/mean effects instead of mean ICER/ICURs from the bootstrap replicates. Mean ICERs from bootstrap replicates are uninterpretable when negative values are present. (see Briggs AH, Wonderling DE, Mooney CZ. Pulling cost-effectiveness analysis up by its bootstraps: a non-parametric approach to confidence interval estimation. Health Econ. 1997 Jul-Aug;6(4):327-40. and Briggs AH, O'Brien BJ, Blackhouse G. Thinking outside the box: recent advances in the analysis and presentation of uncertainty in cost-effectiveness studies. Annu Rev Public Health. 2002;23:377-401.)
VERSION 3 – AUTHOR RESPONSE
Minor Revisions
Reviewers' Comments to Author and responses:
Reviewer: 2
Reviewer Name: Brent Leininger
Institution and Country: University of Minnesota, United States of America
Please state any competing interests or state ‘None declared’: None declared
Please leave your comments for the authors below
The authors have addressed most of the concerns raised by reviewers during the last revision. I
recommend the authors address the following two concerns (one of which is new and arose based
on the authors' response to the last round of comments):
#1 - Please add a sentence or two in the discussion on German cost-effectiveness thresholds (e.g.
from the response to reviewers: "Germans have no higher WTP for health gains than other
Europeans.").
Authors‘ response:
We would prefer not to state this explicitly, as a wider discussion would be necessary, which does not
seem appropriate here. Ahlert et al. 2014 indeed found that “when the same questionnaire is used,
the WTP values found in Germany are similar if not even lower than the ones in comparable other
European countries.” (Ahlert, M., Breyer, F., & Schwettmann, L. (2014). How You Ask Is What You
Get: Willingness-to-Pay for a QALY in Germany, DIW Berlin. Discussion Papers, 1384.). To argue with
this result it would be necessary to discuss how the NICE threshold has been calculated.
Our discussion is not mainly based on the NICE threshold and the ICER, but on the cost-effectiveness
acceptability curves. The relevant aspect is that this is the only official threshold and that we take it
as a possible reference for that reason. In order to rule out a direct comparison or
misunderstanding, we added the following sentence:
Page 17, line 56-57: This threshold might serve as a reference, but it has to be considered that it
might differ for the German population.
#2 - Negative incremental cost-effectiveness ratios can represent two very different findings (findings
in the southeast and northwest quadrants of the cost-effectiveness plane both result in negative
ICERs). In addition, the magnitude of a negative ICER is meaningless. Bootstrapped ICERs reported in
the paper crossed multiple quadrants on the cost-effectiveness plane and included negative values
(where interpretation is needed and the magnitude is meaningless). Therefore, I strongly recommend
the authors report ICER/ICURs based on the mean costs/mean effects instead of mean ICER/ICURs
from the bootstrap replicates. Mean ICERs from bootstrap replicates are uninterpretable when
negative values are present. (see Briggs AH, Wonderling DE, Mooney CZ. Pulling cost-effectiveness
analysis up by its bootstraps: a non-parametric approach to confidence interval estimation. Health
Econ. 1997 Jul-Aug;6(4):327-40. and Briggs AH, O'Brien BJ, Blackhouse G. Thinking outside the box:
recent advances in the analysis and presentation of uncertainty in cost-effectiveness studies. Annu
Rev Public Health. 2002;23:377-401.)
Authors‘ response:
Thank you for that comment. For calculating the mean ICER we relied on the three-stage process as
described by Briggs et al. 1997 (Briggs AH, Wonderling DE, Mooney CZ. Pulling cost-effectiveness
analysis up by its bootstraps: a non-parametric approach to confidence interval estimation. Health
Econ. 1997 Jul-Aug;6(4):327-40).
(Last step of the three-stage process; B stands for the number of bootstrap replicates)
We considered the two different meanings of negative ICERs according to the papers by Briggs et al.
(1997 and 2002) that you recommended. In Briggs et al. (2002) they state that the “problems”
caused by negative ICERs can be overcome through two possible approaches. One of these is the
appropriate representation of uncertainty on the cost-effectiveness plane. We did that, and it can be
clearly seen in the cost-effectiveness planes (Figure 1) as well as in Table 4 how many of the
cost/effect pairs fall into the SE or the NW quadrant (in percentage terms, respectively).
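The quadrant percentages referred to here, and the reviewer's point that a negative ICER can arise in two opposite quadrants, can be illustrated with a short Python sketch. The function name `quadrant_shares`, the handling of pairs that lie exactly on an axis, and the toy values are assumptions made for illustration; they are not taken from the manuscript.

```python
def quadrant_shares(d_costs, d_effects):
    """Share of cost/effect pairs per quadrant of the cost-effectiveness plane.

    Convention assumed here: pairs lying exactly on an axis count as north
    (cost difference of 0) or east (effect difference of 0).
    """
    counts = {"NE": 0, "SE": 0, "SW": 0, "NW": 0}
    for d_cost, d_effect in zip(d_costs, d_effects):
        vertical = "N" if d_cost >= 0 else "S"
        horizontal = "E" if d_effect >= 0 else "W"
        counts[vertical + horizontal] += 1
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Hypothetical incremental costs (EUR) and effects from five bootstrap replicates.
toy_costs = [100.0, -50.0, 200.0, -10.0, -20.0]
toy_effects = [0.01, 0.02, -0.01, -0.01, 0.03]
print(quadrant_shares(toy_costs, toy_effects))

# A replicate in the SE quadrant (cheaper, more effective) and one in the NW
# quadrant (costlier, less effective) both yield a negative ratio.
se_icer = toy_costs[1] / toy_effects[1]
nw_icer = toy_costs[2] / toy_effects[2]
print(se_icer < 0, nw_icer < 0)
```

Because the sign alone does not distinguish dominance from being dominated, the quadrant shares carry information that an average over negative and positive ratios would hide.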
Briggs et al. (2002, p. 387) further recommend presenting the cost-effectiveness acceptability
curve “as it directly summarizes the evidence in support of the intervention being cost-effective for
all potential values of the decision rule.” We did that as well (see Figure 2).
Thus, we followed the recommendations of Briggs et al. (1997 and 2002), but we know that the
question of “mean of the ratios” versus “ratio of the means” is an often discussed issue. We could find one
research article that strengthens your position: Stinnett, A. A., & Paltiel, A. D. (1997). Estimating CE
ratios under second-order uncertainty: the mean ratio versus the ratio of means. Medical Decision
Making, 17(4), 483-489. Here the authors state some relevant advantages of the “ratio of means”,
and therefore we adjusted our results according to your recommendation. Confidence intervals
already relied on the cost/effect pairs and thus stay the same.
The ICERs were replaced (marked in yellow) in Table 4, in parts of the discussion and in the
abstract, respectively. As the interpretation stayed the same and because we already relied more on
the interpretation of the cost-effectiveness acceptability curve, only a few parts of the discussion had
to be changed.
We deleted one sentence (page 18):
“The direct comparison of ACTonPainguided and ACTonPainunguided shows more treatment responders
and (slightly) higher QALY gains for the guided version, but at higher costs. According to the NICE
guidelines costs are far above the WTP threshold for a QALY gain and would thus be judged as
probably not cost-effective. In terms of QALYs gained, the guided version only reaches a probability
of 31% of being cost-effective at a WTP of €0 and even with rising WTP threshold, the probability
does not increase much.”
Furthermore, we removed the citation of a negative ICER in the discussion (page 45, line: 45-46):
“A further systematic review focused on economic evaluations of third-wave CBT therapies
(including ACT), where available ICERs ranged from negative ICERs indicating dominance over the
control group -€19,300 (National Health Service perspective) to €56,637 (societal perspective) per
QALY gained [13]”
VERSION 4 – REVIEW
REVIEWER Brent Leininger University of Minnesota, United States of America
REVIEW RETURNED 02-Jan-2019
GENERAL COMMENTS The authors have been responsive to my comments and addressed my concerns.