BMJ Open: first published as 10.1136/bmjopen-2018-023390 on 9 April 2019. Downloaded from http://bmjopen.bmj.com/ on March 10, 2020 by guest. Protected by copyright.


PEER REVIEW HISTORY

BMJ Open publishes all reviews undertaken for accepted manuscripts. Reviewers are asked to

complete a checklist review form (http://bmjopen.bmj.com/site/about/resources/checklist.pdf) and

are provided with free text boxes to elaborate on their assessment. These free text comments are

reproduced below.

ARTICLE DETAILS

TITLE (PROVISIONAL) A guided and unguided internet- and mobile-based intervention for

chronic pain: Health economic evaluation alongside a randomized

controlled trial

AUTHORS Paganini, Sarah; Lin, Jiaxi; Kählke, Fanny; Buntrock, Claudia; Leiding, Delia; Ebert, David; Baumeister, Harald

VERSION 1 – REVIEW

REVIEWER Dr Tania Gardner Department of Pain Medicine, St Vincent's Hospital Sydney, Australia

REVIEW RETURNED 22-Apr-2018

GENERAL COMMENTS The paper is well written and structured. It is for the most part easy for the reader to understand. Its weakness lies in the statistical methods chosen to test cost-effectiveness, which is addressed by the authors in the limitations. This weakness makes it more difficult to accept the findings, and language to reflect this may be appropriate within the discussion and conclusion sections.

REVIEWER Brent Leininger University of Minnesota, Minneapolis, MN, USA

REVIEW RETURNED 11-May-2018

GENERAL COMMENTS The manuscript reports cost-effectiveness results of a randomized trial comparing an online delivered Acceptance and Commitment Therapy intervention either guided by facilitators or unguided, and a waitlist control group for individuals with chronic pain. This is a timely and innovative project given the substantial burden of chronic pain and the amount of healthcare resources devoted to the condition. I have provided comments and requests for clarifications below in an effort to strengthen the manuscript. Abstract • The objective for the study is difficult to follow mainly due to the subscripting of guided/unguided ACTonPain. It’s not clear in the objective that this is a three group study. • Mean costs and effects by group are not provided in the results. This information would be helpful to the reader to understand the magnitude of total costs over the 6-months. It would also aid in the interpretation of the cost-effectiveness results. • The dominance of unguided ACT over WLC (less costs, greater effects) is important. I suggest this should be highlighted within the


abstract and the body of the manuscript. Also, the reporting of the probability of cost-effectiveness at a WTP of 0 for the comparisons of guided ACT and unguided ACT vs waitlist control creates confusion as it gives the impression that guided ACT is not cost-effective relative to a waitlist control. Given the dominance of unguided ACT over WLC, I suggest the results for guided ACT should focus on its cost-effectiveness relative to unguided ACT both within the abstract and body of the manuscript. • What's the rationale for reporting the WTP where guided ACT has a higher probability of cost-effectiveness over unguided ACT instead of the ICER within the abstract? Was the ICER not the primary cost-effectiveness outcome? Background • Pg 4, line 13: I believe the phrase “a particular form of CBT showed to be effective” should read “a particular form of CBT shown to be effective” • Pg 4, lines 22-25: “IMIs for chronic pain have been shown to effectively improve pain interference (standardized mean difference (SMD)=.4 [17], SMD=−0.50 [18]).” It’s unclear what this improvement in pain is relative to. Methods • The treatment response outcome is not listed in the protocol paper. This change will need to be noted along with a rationale (e.g. easier to interpret than a 1-point change on the MPI or PGIC) • Details on the population used to determine preferences for health states from the AQol-8D and assign QALY values would be helpful for the reader. • Please clarify if the resource use questionnaires for psychiatric illness also captured healthcare use for chronic pain. If not, the implications should be addressed within the discussion. • It’s unclear if or how costs from randomization to month 3 were included in the analysis. Please clarify • Was predictive mean matching for imputing costs and QALYs considered? This approach has been recommended for cost-effectiveness analyses as it ensures only plausible values are imputed. 
This may not be necessary if the authors are confident implausible values for costs or QALYs were not imputed, but this should be stated. (Faria R, Gomes M, Epstein D, White IR. A guide to handling missing data in cost-effectiveness analysis conducted within randomized controlled trials. Pharmacoeconomics 2014;32:1157–70) • Pg 11, lines 17-20: “The burden of the interventions on participants was not assessed, but the satisfaction with the intervention.” I believe the end of this sentence should read “but the satisfaction with the interventions was assessed.” Was this information reported in the primary results paper? If so, a reference would be helpful to alert the reader to where this information is located.


• The rationale for not also reporting cost-effectiveness from the healthcare perspective should be noted (e.g. intervention costs for guided and unguided ACT fall outside the healthcare sector). Results • The reported QALY measures for 6-months appear low (one-year QALY estimates from 0.50 to 0.54) compared to national EQ-5D estimates for pain and chronic pain from other countries. Please verify the accuracy of the measures and if accurate, discuss this difference. o Saarni SI, Härkänen T, Sintonen H, Suvisaari J, Koskinen S, Aromaa A, Lönnqvist J. The impact of 29 chronic conditions on health-related quality of life: a general population survey in Finland using 15D and EQ-5D. Quality of Life Research. 2006 Oct 1;15(8):1403-14. o Sullivan PW, Lawrence WF, Ghushchyan V. A national catalog of preference-based scores for chronic conditions in the United States. Medical care. 2005 Jul 1;43(7):736-49. • Table 4: Negative ICERs can have two interpretations, the table should clearly specify what treatment was dominant when a negative ICER is present. Interpretation of negative ICERs is also difficult as holding differences in costs favoring ACT fixed, smaller QALY gains lead to a more negative ICER. Recommend removing negative ICERs and highlighting which treatment is dominant. • Table 4: It’s unclear why the Cost-utility ICER for Guided ACT vs WLC using EQ5D QALYs is negative, as costs were higher and effects were greater for Guided ACT • Table 4: It’s unclear why the Cost-utility ICER for Unguided ACT vs WLC using EQ5D QALYs is positive, as costs were lower and effects were greater for Unguided ACT • The results of the sensitivity analysis should include the impact on the ICER. Discussion • The discussion is difficult to follow given the three pairwise comparisons. Suggest framing the discussion by noting the dominance of Unguided ACT over waitlist (while describing uncertainty) and then focusing on the cost-effectiveness of guided vs unguided ACT. 
• The impact of the sensitivity analysis on cost-effectiveness should be discussed

VERSION 1 – AUTHOR RESPONSE

Reviewers' Comments to Author and responses:

We would like to thank both reviewers as we think that the comments were very helpful and

contributed to the quality of the manuscript.


Reviewer: 1

Reviewer Name: Dr Tania Gardner

Institution and Country: Department of Pain Medicine, St Vincent's Hospital Sydney, Australia

Competing Interests: None declared

Comment 1: The paper is well written and structured. It is for the most part easy for the reader to understand. Its weakness lies in the statistical methods chosen to test cost-effectiveness, which is addressed by the authors in the limitations. This weakness makes it more difficult to accept the findings, and language to reflect this may be appropriate within the discussion and conclusion sections.

Authors‘ response:

Thank you for your evaluation and your comment. We followed your suggestion and included the

following sentence in the section “implications and future research”:

Pg 21, line 55: “Future research should especially focus on conducting methodologically sound

studies that are powered to statistically test health economic differences.”

Reviewer: 2

Reviewer Name: Brent Leininger

Institution and Country: University of Minnesota, Minneapolis, MN, USA

Competing Interests: None declared

The manuscript reports cost-effectiveness results of a randomized trial comparing an online delivered

Acceptance and Commitment Therapy intervention either guided by facilitators or unguided, and a

waitlist control group for individuals with chronic pain. This is a timely and innovative project given the

substantial burden of chronic pain and the amount of healthcare resources devoted to the condition. I

have provided comments and requests for clarifications below in an effort to strengthen the

manuscript.

Abstract

Comment 1:

The objective for the study is difficult to follow mainly due to the subscripting of guided/unguided

ACTonPain. It’s not clear in the objective that this is a three group study.

Authors‘ response:

Thank you for pointing this out. We changed it according to your suggestion:


Pg 2, line 7-11: “This study aims at evaluating the cost-effectiveness and cost-utility of a guided and unguided internet-based intervention for chronic pain patients (ACTonPainguided/ACTonPainunguided) compared to a waitlist control condition (WLC) as well as the comparative cost-effectiveness of the two interventions.”

Comment 2:

Mean costs and effects by group are not provided in the results. This information would be helpful to

the reader to understand the magnitude of total costs over the 6-months. It would also aid in the

interpretation of the cost-effectiveness results.

Authors‘ response:

We added the suggested information in the abstract, but had to delete other information (see section primary and secondary outcome measures) due to the limited number of words in the abstract: Pg 2, line 42-53: “Results: At 6-month follow-up treatment response and QALYs were highest in ACTonPainguided (44% and 0.280; mean costs=€6,945), followed by ACTonPainunguided (28% and 0.266; mean costs=€6,560) and WLC (16% and 0.244; mean costs=€6,908). At a willingness-to-pay of €0 the probability of being cost-effective was 50% for ACTonPainguided and 66% for ACTonPainunguided, respectively, for both treatment response and QALY compared to WLC and in the comparative analysis 35% per treatment response and 31% per QALY gained.”

Comment 3:

The dominance of unguided ACT over WLC (less costs, greater effects) is important. I suggest this

should be highlighted within the abstract and the body of the manuscript. Also, the reporting of the

probability of cost-effectiveness at a WTP of 0 for the comparisons of guided ACT and unguided ACT

vs waitlist control creates confusion as it gives the impression that guided ACT is not cost-effective

relative to a waitlist control. Given the dominance of unguided ACT over WLC, I suggest the results

for guided ACT should focus on its cost-effectiveness relative to unguided ACT both within the

abstract and body of the manuscript.

Authors‘ response:

Thank you for this comment. We highlighted the dominance of unguided ACT more, but discussed it in detail in the manuscript due to the limited word count of the abstract.

Pg 3, line 2-12: “Findings indicate that ACTonPain has the potential of being a cost-effective

alternative or adjunct to established pain treatment, with ACTonPainunguided (vs. WLC) even leading

to lower costs at better health outcomes. However, whether the intervention should be delivered

guided or unguided depends on the society´s willingness-to-pay. The direct comparison of the two

interventions indicates a preference for ACTonPainunguided under health economic aspects.”

Comment 4:

What's the rationale for reporting the WTP where guided ACT has a higher probability of cost-

effectiveness over unguided ACT instead of the ICER within the abstract? Was the ICER not the

primary cost-effectiveness outcome?

Authors‘ response:

In terms of interpreting the results, we tried not to focus solely on the ICER, as its interpretation is sometimes difficult (e.g. for the main outcome “treatment response” or the interpretation of


negative ICERs). As the acceptability curve conveys a lot of valuable information (including uncertainty, in contrast to a point ICER), we focused on reporting these results.

A second reason was a pragmatic one: the abstract word count is limited, and with three comparison groups the results section would contain a lot of information. However, ICERs can be found in the main manuscript, and we incorporated them further into the discussion.
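For readers less familiar with acceptability curves: the probability of being cost-effective at a given willingness-to-pay (WTP) is the share of bootstrap replicates with a positive net monetary benefit. A minimal sketch in Python, using invented incremental costs and QALYs rather than the trial data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bootstrapped incremental costs (EUR) and QALYs.
# These are invented numbers for illustration, NOT the trial data.
delta_cost = rng.normal(-350, 900, size=5000)   # negative = intervention cheaper
delta_qaly = rng.normal(0.02, 0.03, size=5000)  # positive = intervention better

def prob_cost_effective(wtp, dc, de):
    """Share of bootstrap replicates with positive net monetary benefit."""
    nmb = wtp * de - dc
    return float(np.mean(nmb > 0))

# At a WTP of 0, only the cost difference matters.
p0 = prob_cost_effective(0, delta_cost, delta_qaly)

# Sweeping the WTP traces out the acceptability curve.
curve = [(w, prob_cost_effective(w, delta_cost, delta_qaly))
         for w in (0, 10_000, 20_000, 50_000)]
```

Plotting `curve` over a WTP grid yields the acceptability curve, which conveys the decision uncertainty that a single point ICER cannot.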

Background

Comment 5:

Pg 4, line 13: I believe the phrase “a particular form of CBT showed to be effective” should read “a

particular form of CBT shown to be effective”

Authors‘ response:

Thank you for the correction. We changed the sentence into:

Pg 4, line 9-16: “Treatments based on cognitive-behavioral therapy (CBT) or third-wave therapies, like

the Acceptance and Commitment Therapy (ACT, a particular form of CBT) have shown to be effective

for chronic pain patients [11, 12] and could show acceptable results concerning cost-effectiveness

[13].”

Comment 6:

Pg 4, lines 22-25: “IMIs for chronic pain have been shown to effectively improve pain interference

(standardized mean difference (SMD)=.4 [17], SMD=−0.50 [18]).” It’s unclear what this improvement

in pain is relative to.

Authors‘ response:

We agree and added the following:

Pg 4, line 23-27: “IMIs for chronic pain have been shown to effectively improve pain interference

compared to different control groups such as standard (medical) care, text-based material and mostly

waitlist control condition (standardized mean difference (SMD)=.4 [17], SMD=−0.50 [18]).”

Methods

Comment 7:

The treatment response outcome is not listed in the protocol paper. This change will need to be noted

along with a rationale (e.g. easier to interpret than a 1-point change on the MPI or PGIC)

Authors‘ response:

We added the following sentence in the “treatment response section”:


Pg 6, line 52- 57; pg 7, line 2-5: “This outcome was not defined in the protocol paper. However, it was

chosen to calculate a reliable and meaningful change in pain interference according to the

recommendations of the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials

(IMMPACT) [33] instead of a one-point change on the MPI, that would be difficult to interpret.”

Comment 8:

Details on the population used to determine preferences for health states from the AQol-8D and

assign QALY values would be helpful for the reader.

Authors‘ response:

Thank you for the suggestion. We added the following sentence: Pg 7, line 45: “Utility weights are derived from the Australian adult population (Richardson 2011).”

Comment 9:

Please clarify if the resource use questionnaires for psychiatric illness also captured healthcare use

for chronic pain. If not, the implications should be addressed within the discussion.

Authors‘ response:

For this study, the “Trimbos and iMTA questionnaire for costs associated with psychiatric illness” was

extended/adapted to the healthcare use of chronic pain patients. We added this information in the

text:

Pg 8, line 13-19: “Resource use and costing: The Trimbos and iMTA questionnaire for costs

associated with psychiatric illness (TiC-P) [43, 44] was adapted to the German health care system

and to the healthcare use of individuals with chronic pain. It was used to assess the direct and indirect

costs of the past three month at T0 and T2.”

Comment 10:

It’s unclear if or how costs from randomization to month 3 were included in the analysis. Please clarify

Authors‘ response:

Thank you for the comment. We added this information in the section “Resource use and costing”: Pg 8, line 25-34: “To calculate the 6-month accumulated per-participant costs, the area under the curve (AUC) method was used by linearly interpolating 3-month costs (measured at T0 and T2) to cover the full period of six months [42].

AUC = ((Costs_T0 / 3 + Costs_T2 / 3) / 2) * 3 + Costs_T2”

Comment 11:

Was predictive mean matching for imputing costs and QALYs considered? This approach has been

recommended for cost-effectiveness analyses as it ensures only plausible values are imputed. This

may not be necessary if the authors are confident implausible values for costs or QALYs were not

imputed, but this should be stated. (Faria R, Gomes M, Epstein D, White IR. A guide to handling


missing data in cost-effectiveness analysis conducted within randomized controlled trials.

Pharmacoeconomics 2014;32:1157–70)

Authors‘ response:

Thank you for the comment. As single imputation was used in the main analysis, we decided to use single regression imputation here as well.

Vroomen et al. showed that complete case analyses do not differ in outcomes from reference/ITT analyses (Vroomen, J. M., Eekhout, I., Dijkgraaf, M. G., van Hout, H., de Rooij, S. E., Heymans, M. W., & Bosmans, J. E. Multiple imputation strategies for zero-inflated cost data in economic evaluations: which method works best? The European Journal of Health Economics, 2016, 17(8), 939-950). Therefore, we calculated QALY outcomes and 6-month accumulated per-participant costs with complete case data and compared them to the results of the imputed data. As the differences to the reported values are small and the outcomes are comparable, we are confident that no implausible values for costs or QALYs were imputed.

However, as single imputation approaches do not completely reflect missing data uncertainty, we

stated this limitation in the discussion.

Pg 21, line 20-30: “Finally, the usage of multiple imputation techniques is frequently recommended (e.g. predictive mean matching [72]). We used a single imputation approach, as was done in the main (effectiveness) analysis [25], which might not truly reflect missing data uncertainty. However, the comparison with cost and QALY outcomes of the complete case analysis revealed only small differences, indicating that the risk of implausible values due to single imputation in this evaluation is low [73]. Furthermore, by using non-parametric bootstrapping, sampling uncertainty was considered.”
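To illustrate the distinction the reviewer raises: unlike plain regression imputation, predictive mean matching fills each gap with an actually observed "donor" value, so implausible (e.g. negative) costs cannot be created. A toy sketch, our own illustration and not the study's code (`pmm_impute` and its arguments are hypothetical names):

```python
import numpy as np

def pmm_impute(y, X, k=5, rng=None):
    """Toy predictive mean matching for a vector y with missing values.

    Fit a linear regression on complete cases, then replace each missing
    y with the observed value of a randomly chosen donor among the k
    cases whose predicted means are closest.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), X])
    obs = ~np.isnan(y)
    beta, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    pred = X @ beta
    out = y.copy()
    for i in np.flatnonzero(~obs):
        # indices of the k observed cases closest in predicted mean
        donors = np.flatnonzero(obs)[np.argsort(np.abs(pred[obs] - pred[i]))[:k]]
        out[i] = y[rng.choice(donors)]  # copy a genuinely observed value
    return out
```

Because imputed entries are copied from observed cases, every imputed cost or QALY necessarily lies within the range of the observed data.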

Comment 12:

Pg 11, lines 17-20: “The burden of the interventions on participants was not assessed, but the

satisfaction with the intervention.” I believe the end of this sentence should read “but the satisfaction

with the interventions was assessed.” Was this information reported in the primary results paper? If

so, a reference would be helpful to alert the reader to where this information is located.

Authors‘ response:

Thank you for the correction. We changed the beginning of the sentence as well:

Pg 12, line 9-14: “Possible negative effects were assessed as well as the satisfaction with the

intervention (for results, see [25]).”

Additionally, we cited the primary results paper.

Comment 13:

The rationale for not also reporting cost-effectiveness from the healthcare perspective should be

noted (e.g. intervention costs for guided and unguided ACT fall outside the healthcare sector).

Authors‘ response:


We agree that this would be quite interesting. However, as different health economic perspectives would be worthwhile to consider (next to the healthcare perspective, also the employer's perspective), and as the paper already contains a lot of information (due to the three comparison groups), we decided to conduct the health economic evaluation from the most commonly used and recommended societal perspective (Mathes T, Jacobs E, Morfeld J-C, Pieper D. Methods of international health technology assessment agencies for economic evaluations-a comparative analysis. BMC health services research 2013;13(1):371). This was also stated in the study protocol. Therefore, we prefer not to discuss this explicitly in the paper.

Results

Comment 14:

The reported QALY measures for 6-months appear low (one-year QALY estimates from 0.50 to 0.54)

compared to national EQ-5D estimates for pain and chronic pain from other countries. Please verify

the accuracy of the measures and if accurate, discuss this difference.

o Saarni SI, Härkänen T, Sintonen H, Suvisaari J, Koskinen S, Aromaa A, Lönnqvist J. The impact

of 29 chronic conditions on health-related quality of life: a general population survey in Finland using

15D and EQ-5D. Quality of Life Research. 2006 Oct 1;15(8):1403-14.

o Sullivan PW, Lawrence WF, Ghushchyan V. A national catalog of preference-based scores for

chronic conditions in the United States. Medical care. 2005 Jul 1;43(7):736-49.

Authors‘ response:

Thank you for this comment. We checked the QALY measures and discussed the results in the

discussion section:

Pg 19, line 18-34: “Estimated EQ-5D scores for one year ranged from 0.50 to 0.54, which appears rather low compared to national EQ-5D estimates for (back) pain from other countries (e.g. 0.74-0.79 [56, 57]). Lower estimates in the current study could have occurred due to the sociodemographic properties of this study sample, as participants were predominantly women (84%), reported comorbid medical or mental conditions (57% and 39%, respectively) and the back was the most often reported pain location (34%) [25]. Several studies showed that the mentioned characteristics (female sex, musculoskeletal and mental disorders) are associated with lower quality of life scores [56-58]. Furthermore, Burström et al. (2001) reported that participants with low back pain showed quality of life weights of 0.55, which is comparable to the sample in the current study [58].”

Comment 15:

Table 4: Negative ICERs can have two interpretations, the table should clearly specify what treatment

was dominant when a negative ICER is present. Interpretation of negative ICERs is also difficult as

holding differences in costs favoring ACT fixed, smaller QALY gains lead to a more negative ICER.

Recommend removing negative ICERs and highlighting which treatment is dominant.

Authors‘ response:

We followed the suggestion and changed negative ICERs into statements about treatment

dominance. E.g.:


Pg 15, Table 2, line 30-33: “ACTonPain unguided dominates WLC”
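The sign ambiguity behind this recommendation is easy to demonstrate with toy numbers (ours, not the trial's):

```python
def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio: delta cost / delta effect."""
    return delta_cost / delta_effect

# Case 1: treatment saves EUR 400 and gains 0.02 QALYs -> it dominates.
dominant = icer(-400, 0.02)
# Case 2: treatment costs EUR 400 more and loses 0.02 QALYs -> it is dominated.
dominated = icer(400, -0.02)
# Both ratios are the same negative number (about -20,000 EUR/QALY),
# yet the conclusions are opposite; hence the recommendation to report
# "dominates"/"is dominated" instead of a raw negative ICER.
```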

Comments 16 and 17:

Table 4: It’s unclear why the Cost-utility ICER for Guided ACT vs WLC using EQ5D QALYs is

negative, as costs were higher and effects were greater for Guided ACT

Table 4: It’s unclear why the Cost-utility ICER for Unguided ACT vs WLC using EQ5D QALYs is

positive, as costs were lower and effects were greater for Unguided ACT

Authors‘ response:

Thank you very much for this comment. We agree that these values were counter-intuitive and therefore re-checked all parts of the analysis. After checking the baseline values of the AQoL utilities and the EQ5D utilities (which had not been reported yet), it became clear that the baseline values are not “similar”; small differences in these values can have a great impact.

AQoL baseline values were:

ACTonPainguided: M=0.496, SD=0.16; ACTonPainunguided: M=0.485, SD=0.17; waitlist: M=0.463,

SD=0.14

EQ5D baseline values were:

ACTonPainguided: M=0.469, SD=0.32; ACTonPainunguided: M=0.436, SD=0.31, waitlist: M=0.494,

SD=0.3

Therefore, we concluded that baseline adjustment was necessary and recalculated the main and sensitivity analyses with adjustment for baseline values.

Pg 11, line 11-15: “At baseline, AQoL-utilities differed between groups (ACTonPainguided: M=0.496,

SD=0.16; ACTonPainunguided: M=0.485, SD=0.17; waitlist: M=0.463, SD=0.14). Therefore, baseline

adjustments were made in further calculations.”

Pg 12, line 9-13: “As baseline EQ5D-utilities differed between groups (ACTonPainguided: M=0.469,

SD=0.32; ACTonPainunguided: M=0.436, SD=0.31; waitlist: M=0.494, SD=0.3), baseline adjustment

was made in the sensitivity analyses.”

After baseline adjustment, no implausible values were generated in the EQ5D anymore, and the results were comparable to the AQoL-8D results (which is discussed in the manuscript). Because values changed slightly, we adjusted table 4 as well as the discussion. The only changes in interpretation were that it became clearer that ACTonPain guided is not cost-effective compared to ACTonPain unguided, and that the EQ5D-3L showed slightly higher incremental QALY differences compared to the AQoL. All other interpretations stayed the same.

Comment 18:

The results of the sensitivity analysis should include the impact on the ICER.

Authors‘ response:


As we do not state the ICERs explicitly in the results of the main analysis, focusing instead on the ICER distribution on the cost-effectiveness plane and on the WTP, we would like to report the results consistently. Furthermore, as soon as the bootstrapped ICERs fall into all four quadrants of the cost-effectiveness plane, a “simple” interpretation of the mean ICER is hardly possible. However, we included information about the WTP at €0 (to provide more comparable information) and added a discussion of the results of the sensitivity analysis in the discussion section.

Pg 17, line 34-44: “After non-parametric bootstrapping, using the EQ-5D-3L resulted in smaller

incremental QALY gains in all comparisons compared to the results using the AQoL-8D (see table 4).

At a WTP of €0 the probability of ACTonPainguided of being cost-effective compared to waitlist was

50%. The probability of ACTonPainunguided of being cost-effective compared to waitlist was 64% at a

WTP of €0. ACTonPainguided vs. ACTonPainunguided resulted in a probability of being cost-effective of

31% at a WTP of €0.”

Discussion

Comment 19:

The discussion is difficult to follow given the three pairwise comparisons. Suggest framing the

discussion by noting the dominance of Unguided ACT over waitlist (while describing uncertainty) and

then focusing on the cost-effectiveness of guided vs unguided ACT.

Authors‘ response:

Thank you for this feedback. We revised the structure of the discussion and highlighted the outcome

of unguided ACT. You can find the changed parts of the discussion highlighted in the manuscript:

Pg 18-21

Comment 20:

The impact of the sensitivity analysis on cost-effectiveness should be discussed

Authors‘ response:

We included the following:

Pg 19, line 14-34: “The results of the sensitivity analyses revealed smaller incremental QALY gains by

using the EQ5D-3L compared to the AQoL-8D but overall conclusions are the same as in the main

analyses. Only the comparison between ACTonPainguided/unguided resulted in a higher ICER

(114,858), so that the guided version would not be judged as cost-effective according to the NICE

threshold. However, the distribution on the cost-effectiveness plane was similar compared to the

results of the main analysis.”


VERSION 2 – REVIEW

REVIEWER Brent Leininger University of Minnesota, Minneapolis, MN, USA

REVIEW RETURNED 01-Aug-2018

GENERAL COMMENTS The authors have addressed most of the concerns raised by reviewers. A few issues remain which are highlighted in the attached document. - The reviewer also provided a marked copy with additional comments. Please contact the publisher for full details.

VERSION 2 – AUTHOR RESPONSE

Minor revisions

Reviewers' Comments to Author and responses:

Reviewer: 2

Reviewer Name: Brent Leininger

Institution and Country: University of Minnesota, Minneapolis, MN, USA

Competing Interests: None declared

Comment 1 and comment 2 (Abstract):

1: Participants receiving guided ACT reported better health outcomes, but at a higher cost. The

potential value of this intervention is lost when just reporting the cost-effectiveness at a WTP of 0, as

it assumes society is unwilling to pay for health. Reporting the ICER would allow the reader to assess

value if society were willing to pay for health.

2: Within the abstract, this statement relies on a WTP of 0. What if a society is willing to pay for

health gains? Reporting the actual ICER would provide support for this statement, if the ICER is

above commonly accepted thresholds.

Authors' response:

We included the ICER/ICUR in the abstract, but as we think that only stating the ICER is not enough to interpret the results correctly, we did not delete the probabilities. Instead, we added the WTP required for a 95% probability of being cost-effective, to show that this WTP would be quite high:

“Results: At 6-month follow-up treatment response and QALYs were highest in ACTonPainguided

(44% and 0.280; mean costs=€6,945), followed by ACTonPainunguided (28% and 0.266; mean

costs=€6,560) and WLC (16% and 0.244; mean costs=€6,908). ACTonPainguided vs WLC revealed an


ICER/ICUR of 171 and 3,033, respectively, while ACTonPainunguided dominated WLC. At a willingness-to-pay of €0 the probability of being cost-effective was 50% for ACTonPainguided and 66% for ACTonPainunguided. This probability rises to 95% when society's willingness-to-pay is €91,000 (ACTonPainguided) and €127,000 (ACTonPainunguided) per QALY. ACTonPainguided vs. ACTonPainunguided revealed an ICER/ICUR of 2,949 and 198,377."

Comment 3: “Have

been shown”

Authors' response:

Thank you. We corrected that.

Comment 4:

Please include the use of baseline and month 3 to 6 costs to estimate costs from months 0 to 3 as a

limitation.

Authors‘ response:

Thank you for that comment. We added the following to the limitations section: "Furthermore, costs between randomization and three months after randomization were calculated with the area under the curve method. This is just an estimate and not a representation of the actual costs incurred during this period."
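The area-under-the-curve estimate described here can be read as linear interpolation between the two measured 3-month cost levels. A minimal sketch with invented numbers (the helper name and the exact interpolation scheme are assumptions, since the manuscript's implementation is not shown in this exchange):

```python
def auc_cost_months_0_to_3(cost_baseline_3m, cost_months_3_to_6):
    """Trapezoidal (area-under-the-curve) estimate of months 0-3 costs
    from the two adjacent measured 3-month cost totals.

    Hypothetical helper; the exact scheme used in the paper may differ.
    """
    return (cost_baseline_3m + cost_months_3_to_6) / 2.0

# Invented example values (EUR per participant over 3 months)
estimate = auc_cost_months_0_to_3(1200.0, 800.0)
print(estimate)  # 1000.0
```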

Comment 5 and comment 6:

5: 6/0.01 = 600, please check the accuracy of reported ICERs and provide an explanation for the

difference.

6: 388/0.008 = 48,500; which is in line with figure 2. The reported ICERs appear inaccurate. Please

check on the accuracy and provide an explanation for the discrepancy.

Authors' response:

Thank you for that question. The incremental costs (6 and 388) and effects (0.01 and 0.008) that you mentioned are the "mean incremental costs" and the "mean incremental effects" (calculated after bootstrapping).

However, the mean (bootstrapped) ICER is not the result of "mean incremental costs / mean incremental effects".

After bootstrapping, 5,000 ICERs are calculated and then the mean of those 5,000 ICERs is taken (which results in 3,033 and 198,377, respectively, for the two examples in your comments).

It has to be calculated like that because otherwise (when "simply" dividing the two means) information would get lost.
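The difference between the two summaries can be illustrated with a toy set of bootstrap replicates (the numbers are invented for illustration, not the trial data):

```python
# Four hypothetical bootstrapped (incremental cost, incremental effect)
# pairs -- invented for illustration only.
pairs = [(400.0, 0.010), (350.0, 0.007), (420.0, 0.009), (380.0, 0.006)]

B = len(pairs)
mean_cost = sum(c for c, _ in pairs) / B      # 387.5
mean_effect = sum(e for _, e in pairs) / B    # 0.008

# "Ratio of means": divide the mean incremental cost by the mean effect.
ratio_of_means = mean_cost / mean_effect      # 48437.5

# "Mean of ratios": compute an ICER per replicate, then average -- the
# approach described in the response above.
mean_of_ratios = sum(c / e for c, e in pairs) / B

print(ratio_of_means, mean_of_ratios)  # the two summaries differ
```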


So the values of the mean bootstrapped ICERs/ICURs are all stated correctly in the manuscript.

Comment 7:

“This metric and CEAC curve appears at odds with an ICER of 198,377. Please check for accuracy.”

Authors' response:

Here we report two different values. The value €41,350 is the break-even point at which the guided and the unguided version have the same probability (0.5) of being cost-effective.

€198,377 would have to be invested for gaining one QALY with the guided version compared to the

unguided version.

Comment 8:

When reviewing table 4, incremental QALY gains measured with the EQ5D are larger than those

measured with the AqoL.

Authors' response:

Thank you very much. We corrected that.

Comment 9:

In addition to the NICE thresholds, reference to how cost-effectiveness is assessed in Germany, where the study was conducted, would be appropriate.

Authors' response:

In Germany there is no official WTP threshold and "…decisions for German Social Health Insurance ('Gesetzliche Krankenversicherung', GKV) are made on a case-by-case basis by a decision-making body called 'Gemeinsamer Bundesausschuss' (G-BA) with no obvious or transparent decision criteria." (Ahlert, M., Breyer, F., Schwettmann, L., 2013, p. 2. What you ask is what you get: willingness-to-pay for a QALY in Germany. DIW Berlin Discussion Paper). Furthermore, they state that "…findings show first that Germans have no higher WTP for health gains than other Europeans." (Ahlert et al., 2013, p. 1).

This was the reason why we used the only officially stated threshold, the one by NICE, as a reference, and based on the aforementioned aspects we think this is appropriate.

Comment 10:


The response to reviewers‘ comments included more information in this section. “Only the

comparison between ACTonPainguided/unguided resulted in a higher ICER (114,858), so that the

guided version would not be judged as cost-effective according to the NICE threshold. However, the

distribution on the cost-effectiveness plane was similar compared to the results of the main

analysis.” Please include this information, noting the ICER was lower when EQ5D was used (114,858

vs 198,377) but still above recommended thresholds for cost-effectiveness (if that’s the case after

checking the accuracy of reported ICERs).

Authors' response:

Thank you for this comment. As the article is already quite comprehensive, we left out this part (and failed to delete it in the point-by-point reply). We think that the information in these sentences is not essential and that the most important information is that the overall conclusions are the same as in the main analyses (as already stated in the text). Therefore, we would prefer to leave the text as it is at this point.

Comment of the authors:

Comment 1:

At the beginning of the manuscript the address and affiliation of the corresponding author have been changed:

a Corresponding author: Sarah Paganini, Department of Sports and Sport Science, Sport Psychology,

University of Freiburg, Germany, Schwarzwaldstr. 175, 79117 Freiburg, Phone: +49 (0)761 /

2034514, Email: [email protected]

Comment 2:

Furthermore, we revised the discussion slightly. We wanted to emphasize that although ACTonPainunguided dominates WLC, the WTP would have to be quite high to reach a 95% probability. Therefore, the decision of whether the intervention is cost-effective or not cannot be made clearly. We thought that this was not highlighted enough before.

"Comparing both ACTonPain interventions to waitlist and by taking uncertainty into account, ACTonPainunguided can be judged as a potentially cost-effective intervention as it dominates WLC by leading to higher QALY gains and more individuals with a treatment response at lower costs.

However, when assuming that an intervention should reach a likelihood of being cost-effective of

95% or greater it has to be considered that the WTP would have to be €13,460 for treatment

response and €127,000 for a QALY gain. Therefore, the judgement of whether the intervention is

cost-effective or not ultimately depends on the society’s WTP for treatment response or a QALY

gain, respectively.”


And

“…when comparing the costs that would have to be invested by using ACTonPainguided (compared to

waitlist) for a QALY gained (€3,033) to the only official WTP threshold stated by the National

Institute for Health and Clinical Excellence (NICE) of £20,000 to £30,000 [54] (~ €22,647 - €33,971;

conversion according to the European Central Bank [55]), this intervention would also be categorized

as a potentially cost-effective treatment. Here again, uncertainty has to be considered as well as the

required WTP for a likelihood of being cost-effective of 95% of €6,490 (treatment response) and

€91,000 (QALY gain).”

And

Implications and future research

“…Findings from this health economic evaluation study show that both versions of ACTonPain have

the potential of being cost-effective, with the unguided version even leading to lower costs

(compared to WLC). However, uncertainty should be taken into account.”

Comment 3:

Furthermore, we deleted an example in the discussion because it is not stated in the results:

In terms of QALYs gained, the guided version only reaches a probability of 31% of being cost-

effective at a WTP of €0 and even with rising WTP threshold, the probability does not increase much

(e.g. at a WTP of €50,000 to about 53%).

VERSION 3 – REVIEW

REVIEWER Brent Leininger University of Minnesota, United States of America

REVIEW RETURNED 06-Nov-2018

GENERAL COMMENTS The authors have addressed most of the concerns raised by reviewers during the last revision. I recommend the authors address the following two concerns (one of which is new and arose based on the authors' response to the last round of comments): #1 - Please add a sentence or two in the discussion on German cost-effectiveness thresholds. (e.g. from the response to reviewers: "Germans have no higher WTP for health gains than other Europeans." #2 - Negative incremental cost-effectiveness ratios can represent two very different findings (findings in the southeast and northwest


quadrants of the cost-effectiveness plane both result in negative ICERs). In addition, the magnitude of a negative ICER is meaningless. Bootstrapped ICERs reported in the paper crossed multiple quadrants on the cost-effectiveness plane and included negative values (where interpretation is needed and the magnitude is meaningless). Therefore, I strongly recommend the authors report ICER/ICURs based on the mean costs/mean effects instead of mean ICER/ICURs from the bootstrap replicates. Mean ICERs from bootstrap replicates are uninterpretable when negative values are present. (see Briggs AH, Wonderling DE, Mooney CZ. Pulling cost-effectiveness analysis up by its bootstraps: a non-parametric approach to confidence interval estimation. Health Econ. 1997 Jul-Aug;6(4):327-40. and Briggs AH, O'Brien BJ, Blackhouse G. Thinking outside the box: recent advances in the analysis and presentation of uncertainty in cost-effectiveness studies. Annu Rev Public Health. 2002;23:377-401.)

VERSION 3 – AUTHOR RESPONSE

Minor Revisions

Reviewers' Comments to Author and responses:

Reviewer: 2

Reviewer Name: Brent Leininger

Institution and Country: University of Minnesota, United States of America

Please state any competing interests or state ‘None declared’: None declared

Please leave your comments for the authors below

The authors have addressed most of the concerns raised by reviewers during the last revision. I

recommend the authors address the following two concerns (one of which is new and arose based

on the authors' response to the last round of comments):

#1 - Please add a sentence or two in the discussion on German cost-effectiveness thresholds (e.g. from the response to reviewers: "Germans have no higher WTP for health gains than other Europeans.").

Authors' response:


We would prefer not to state this explicitly, as a wider discussion would be necessary, which does not seem appropriate here. Ahlert et al. 2014 indeed found that "when the same questionnaire is used, the WTP values found in Germany are similar if not even lower than the ones in comparable other European countries." (Ahlert, M., Breyer, F., & Schwettmann, L. (2014). How You Ask Is What You Get: Willingness-to-Pay for a QALY in Germany. DIW Berlin Discussion Papers, 1384.) To argue with this result it would be necessary to discuss how the NICE threshold has been calculated.

Our discussion is not mainly based on the NICE threshold and the ICER, but on the cost-effectiveness acceptability curves. The relevant aspect is that this is the only official threshold and that we take it as a possible reference for that reason. In order to rule out a direct comparison or misunderstanding, we added the following sentence:

Page 17, line 56-57: "This threshold might serve as a reference, but it has to be considered that it might differ for the German population."

#2 - Negative incremental cost-effectiveness ratios can represent two very different findings (findings

in the southeast and northwest quadrants of the cost-effectiveness plane both result in negative

ICERs). In addition, the magnitude of a negative ICER is meaningless. Bootstrapped ICERs reported in

the paper crossed multiple quadrants on the cost-effectiveness plane and included negative values

(where interpretation is needed and the magnitude is meaningless). Therefore, I strongly recommend

the authors report ICER/ICURs based on the mean costs/mean effects instead of mean ICER/ICURs

from the bootstrap replicates. Mean ICERs from bootstrap replicates are uninterpretable when

negative values are present. (see Briggs AH, Wonderling DE, Mooney CZ. Pulling cost-effectiveness

analysis up by its bootstraps: a non-parametric approach to confidence interval estimation. Health

Econ. 1997 Jul-Aug;6(4):327-40. and Briggs AH, O'Brien BJ, Blackhouse G. Thinking outside the box:

recent advances in the analysis and presentation of uncertainty in cost-effectiveness studies. Annu

Rev Public Health. 2002;23:377-401.)

Authors' response:

Thank you for that comment. For calculating the mean ICER we relied on the three-stage process as described by Briggs et al. 1997 (Briggs AH, Wonderling DE, Mooney CZ. Pulling cost-effectiveness analysis up by its bootstraps: a non-parametric approach to confidence interval estimation. Health Econ. 1997 Jul-Aug;6(4):327-40).


(Last step of the three-stage process; B stands for the number of bootstrap replicates)
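The equation referenced here did not survive extraction; in the notation of Briggs et al. (1997), the last step, averaging the per-replicate ratios over the B bootstrap replicates, is presumably:

```latex
\overline{\mathrm{ICER}} = \frac{1}{B}\sum_{b=1}^{B}\frac{\Delta C_b}{\Delta E_b}
```

where $\Delta C_b$ and $\Delta E_b$ are the incremental costs and effects of replicate $b$ (a reconstruction from the surrounding text, not the original figure).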

We considered the two different meanings of negative ICERs according to the papers that you recommended by Briggs et al., 1997 and 2002. In Briggs et al. (2002) they state that the "problems" caused by negative ICERs can be overcome through two possible approaches. One of these is the appropriate representation of uncertainty on the cost-effectiveness plane. We did that, and it can be clearly seen in the cost-effectiveness planes (Figure 1), as well as in Table 4, how many of the cost/effect pairs fall into the SE or the NW quadrant (in percentage terms, respectively).
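The ambiguity of negative ratios that the reviewer and Briggs et al. describe can be made concrete with two invented cost/effect pairs (not study data):

```python
def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio for one cost/effect pair."""
    return delta_cost / delta_effect

# NW quadrant: more costly AND less effective -- an unfavourable result.
nw = icer(500.0, -0.01)
# SE quadrant: cost saving AND more effective -- a dominant result.
se = icer(-500.0, 0.01)

# Both quadrants yield the identical negative ratio, so the sign and
# magnitude alone cannot distinguish dominance from being dominated.
print(nw, se)
```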

Briggs et al. (2002, p. 387) further recommend demonstrating the cost-effectiveness acceptability curve "as it directly summarizes the evidence in support of the intervention being cost-effective for all potential values of the decision rule." We did that as well (see Figure 2).
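An acceptability curve of the kind shown in Figure 2 summarizes, for each willingness-to-pay value, the share of bootstrap replicates with a positive net monetary benefit. A minimal sketch, with invented replicates rather than the trial data:

```python
def ceac_probability(pairs, wtp):
    """Probability of being cost-effective at a given willingness-to-pay:
    the fraction of bootstrapped (delta_cost, delta_effect) pairs whose
    net monetary benefit, wtp * delta_effect - delta_cost, is positive."""
    favourable = sum(1 for dc, de in pairs if wtp * de - dc > 0)
    return favourable / len(pairs)

# Invented bootstrap pairs, not the trial data.
pairs = [(100.0, 0.010), (-50.0, 0.005), (200.0, -0.002), (30.0, 0.004)]

print(ceac_probability(pairs, 0))      # 0.25: only the cost-saving pair
print(ceac_probability(pairs, 20000))  # 0.75
```

Plotting this probability against a range of WTP values yields the acceptability curve itself.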

Thus, we followed the recommendations of Briggs et al. (1997 and 2002), but we know that the question of "mean of the ratios" versus "ratio of the means" is an often-discussed issue. We could find one research article that strengthens your position: Stinnett, A. A., & Paltiel, A. D. (1997). Estimating CE ratios under second-order uncertainty: the mean ratio versus the ratio of means. Medical Decision Making, 17(4), 483-489. Here the authors state some relevant advantages of the "ratio of means" and therefore we adjusted our results according to your recommendation. Confidence intervals were already based on the cost/effect pairs and thus stay the same.

The ICERs were replaced (marked in yellow) in Table 4, in the relevant parts of the discussion and in the abstract, respectively. As the interpretation stayed the same, and because we already relied more on the interpretation of the cost-effectiveness acceptability curve, only a few parts of the discussion had to be changed.

We deleted one sentence (page 18):

“The direct comparison of ACTonPainguided and ACTonPainunguided shows more treatment responders

and (slightly) higher QALY gains for the guided version, but at higher costs. According to the NICE


guidelines costs are far above the WTP threshold for a QALY gain and would thus be judged as

probably not cost-effective. In terms of QALYs gained, the guided version only reaches a probability

of 31% of being cost-effective at a WTP of €0 and even with rising WTP threshold, the probability

does not increase much.”

Furthermore, we removed the citation of a negative ICER in the discussion (page 45, line: 45-46):

"A further systematic review focused on economic evaluations of third-wave CBT therapies (including ACT), where available ICERs ranged from negative ICERs indicating dominance over the control group (-€19,300; National Health Service perspective) to €56,637 (societal perspective) per QALY gained [13]."

VERSION 4 – REVIEW

REVIEWER Brent Leininger University of Minnesota, United States of America

REVIEW RETURNED 02-Jan-2019

GENERAL COMMENTS The authors have been responsive to my comments and addressed my concerns.
