Original Article

Responsiveness of efficacy endpoints in clinical trials with over the counter analgesics for headache

Bernhard Aicher¹, Hubertus Peil¹, Barbara Peil² and Hans-Christoph Diener³

Abstract

Aim: To quantify and compare the responsiveness within the meaning of clinical relevance of efficacy endpoints in a clinical trial with over the counter (OTC) analgesics for headache. Efficacy endpoints and observed differences in clinical trials need to be clinically meaningful and mirror the change in the clinical status of a patient. This must be demonstrated for the specific disease indication and the particular patient population based on the application of treatments with proven efficacy.

Methods: Patient’s global efficacy assessment during two study phases (pre-phase and treatment phase) was used to classify patients as satisfied or non-satisfied with the efficacy of their medication. The analysis is based on 1734 patients included in the efficacy analysis of a randomized, placebo-controlled, double-blind, multi-centre parallel group trial with six treatment arms. Based on this classification and the pain intensity recorded by the patients on a 100 mm visual analogue scale, group differences by assessment categories and receiver operating characteristic (ROC) curve methods were used to quantify responsiveness of the efficacy endpoints ‘time to 50% pain relief’, ‘time until reduction of pain intensity to 10 mm’, ‘weighted sum of pain intensity difference’ (%SPIDweighted), ‘pain intensity difference (PID) relative to baseline at 2 hours’, and ‘pain-free at 2 hours’.

Results: Clinically relevant differences between patients satisfied and non-satisfied with the treatment were observed for all efficacy endpoints. Patients with the highest rating of efficacy had the fastest and strongest pain relief. In comparison, patients assessing efficacy as ‘less good’ reached a 50% pain relief on average nearly an hour later than those scoring efficacy as at least ‘good’. Simultaneously, their extent of pain relief was only half as great 2 hours after medication intake. Patients scoring efficacy as ‘poor’ experienced practically no pain relief within the 4 hour observation interval. ROC curve calculations confirmed an adequate responsiveness for all continuous endpoints. The following cut-off points for differentiating between satisfied and non-satisfied patients were deduced from the data in the pre- and treatment phase, respectively: ‘time to 50% pain relief’ 1:10 and 1:31 h:min, ‘time until reduction of pain intensity to 10 mm’ 2:40 and 3:00 h:min, ‘%SPIDweighted’ 68 and 64%, ‘PID at 2 hours’ 35 and 35 mm. The sensitivity and specificity based on these cut-off points ranged from 70 to 79%. The binary endpoint ‘pain-free at 2 hours’ showed a clearly higher specificity (80 and 87%) than sensitivity (65 and 61%) in the pre- and treatment phase, respectively.

Conclusions: When global assessment of efficacy by the patient was used as external criterion, ROC curve calculations confirmed a high responsiveness for all efficacy endpoints included in this study. Clinically relevant differences between patients satisfied and non-satisfied with the treatment were observed. The endpoint ‘%SPIDweighted’ proved slightly but consistently superior to the other endpoints. SPID and %SPIDweighted are not easy to interpret and the time course of pain reduction is of high importance for the patients in the treatment of acute pain, including headache. The endpoint ‘pain-free at 2 hours’ showed the expected high specificity, but at the cost of a concurrently low sensitivity and clearly makes less use of the available information than the endpoint ‘time to 50% pain reduction’, which combines the highly relevant aspects of time course and extent of pain reduction. Responsiveness, the ability of an outcome measure to detect clinically important changes in a specific condition of a patient, should be added in future revisions of IHS guidelines for clinical trials in headache disorders.

¹ Boehringer Ingelheim Pharma GmbH & Co. KG, Germany
² Institute of Medical Biometry and Informatics Heidelberg, Germany
³ University Duisburg-Essen, Germany

Corresponding author:

Hans-Christoph Diener, Department of Neurology and Headache Centre, University Hospital Essen, Hufelandstrasse 55, 45147 Essen.

Email: [email protected]

Cephalalgia 32(13) 953–962

© International Headache Society 2012

Reprints and permissions: sagepub.co.uk/journalsPermissions.nav

DOI: 10.1177/0333102412452047

cep.sagepub.com

Keywords

Responsiveness of efficacy endpoints, clinical relevance, over the counter analgesics, headache, ROC curve calculations, randomized clinical trials

Date received: 13 March 2012; revised: 15 May 2012; accepted: 27 May 2012

Introduction

Headache pain is difficult to define and quantify. Consequently, quantification of analgesic efficacy is also difficult, in particular in clinical trials (1). The selection of primary efficacy endpoints in headache studies is a critical factor (2). In the 1970s and 1980s attention was focused on the benefits and drawbacks of the various methods for measuring pain, ranging from yes/no responses through graded scales with variable numbers of categories to the continuous visual analogue scale (3–6), but the debate has since come to centre on the question of adequate primary efficacy endpoints or outcome measures (2,7–10). During this period the discussion on headache study methodology concentrated on the methods used in the clinical development programme of sumatriptan, including the 4-stage Likert scale used as a method of measurement and the efficacy endpoint selected (‘percentage of patients with a reduction in headache severity from moderate or severe to none or mild’) (11–13). The Committee on Clinical Trials in Migraine of the International Headache Society suggested as the primary efficacy parameter in acute treatment trials in migraine the ‘number of migraine attacks resolved within 2 hours’ (14), although this suggestion has received some criticism in comparison with the response criterion referred to above (15). It has been pointed out on several occasions, however, that little research has been conducted to determine which of these endpoints are considered by headache sufferers themselves to be most important (16,17), and it has also been commented that new efficacy endpoints need to be defined for certain patients (18). The issue of which endpoint to use is still vexing (7). There is no consensus regarding how an endpoint translates into patient acceptability, or about the relative importance of each attribute in determining this acceptability (19). It is, however, universally accepted that in the context of clinical trials, outcome variables and observed differences need to be clinically meaningful (20) and mirror the change in the clinical status of a patient.

We used methods described for the assessment of outcome measures (21) to quantify and compare the performance of efficacy endpoints in clinical trials with OTC analgesics in headache. In general, outcome measures have to be reliable, valid and responsive. In a previous paper (22) we reported some results on the validity of various efficacy endpoints in OTC headache trials. In this paper we focus on the responsiveness of these efficacy endpoints.

Responsiveness is defined as the ability of an outcome measure to detect clinically important changes in a specific condition of a patient (21,23,24). An outcome measure with high responsiveness should be able to discriminate between the true condition states of the patient. If we restrict ourselves to two states of a condition (present or absent), this allows evaluation of responsiveness with methods originally used to assess the performance of diagnostic tests. The condition to be ‘diagnosed’ (25) could be, for example, whether the clinical status was improved or non-improved or whether the patient was satisfied or non-satisfied with their treatment. The quantification and comparison of responsiveness of various outcome measures necessitates the definition of an external criterion for the differentiation between patients with the condition present or not. There is currently no gold standard for an external criterion in pain outcome measures. Examples of external criteria used comprise pain assessment and disability rating (23), return to full activities (25), use of additional rescue medication (26), global perceived effect assessed by the patient (24), standardized effect size (27), power of a test or sample size needed to detect a clinically important difference (21) or patient’s global impression of change (28). We used the patient’s global assessment of efficacy as an external criterion to compare the responsiveness of common efficacy endpoints in headache trials.

Methods

Patients, study design and treatments

The data for this analysis were collected as part of a randomized, placebo-controlled, double-blind, multi-centre parallel group trial with six treatment arms, conducted between September 1998 and January 2003 (29). The primary objective of the study was to investigate the efficacy, safety and tolerability of the fixed combination of acetylsalicylic acid + paracetamol + caffeine in comparison with the combination without caffeine, the single preparations, and placebo in patients who were used to treating their episodic tension-type headache or migraine attacks with non-prescription analgesics.

The patients were enrolled by practitioners and specialists in general and internal medicine throughout Germany. Male or female patients (18–65 years) who were not consulting for headache were asked whether they had headaches that they treated with non-prescription analgesics. Usual headaches had to meet International Headache Society criteria (30) for episodic tension-type headache (2.1) and/or migraine with or without aura (1.1, 1.2.1). They must have experienced these headaches for at least 12 months with a minimum of two headache episodes within the previous 3 months.

Patients were excluded if previous or concomitant diseases or medication could interfere with one of the study drugs or influence headache symptoms. Drug overuse connected with the headache and alcohol or drug abuse were also exclusion criteria, as were pregnancy, lactation or participation in another clinical trial within 4 weeks of entering this study.

Before enrolment the patients gave their written informed consent according to paragraphs 40 and 41 of the German Drug Law (AMG) and International Conference on Harmonisation, Guidance for Good Clinical Practice, E6 (ICH GCP) standards. Patients were allowed to terminate participation in the trial at any time, without giving reasons. The study was conducted in accordance with the Declaration of Helsinki, the AMG and ICH GCP standards and did not start before independent ethics committee approval was obtained.

Patients randomly allocated to one of the six treatment groups treated their headache attack with a single dose of the allocated study medication. Before the randomized treatment phase a headache episode treated with the patient’s usual non-prescription medication was recorded (open pre-phase).

Endpoints

Patients recorded pain intensity on a 100 mm visual analogue scale (VAS) before and then 30 min and 1, 2, 3 and 4 hours after drug intake in the patient diary. The calculated time to 50% pain relief was chosen as primary endpoint based on the pain intensity recorded on the VAS.

The secondary endpoints of this study, which comprise both efficacy and tolerability parameters, were:

• calculated time until reduction of pain intensity to 10 mm on the VAS

• percentage of patients with at least 50% pain relief after 30 min, 1, 2, 3 and 4 hours (evaluated on the VAS)

• percentage of patients pain-free, defined as patients with reduction of pain intensity to at least 10 mm on the VAS, after 30 min, 1, 2, 3 and 4 hours

• pain intensity difference after 30 min, 1, 2, 3 and 4 hours (evaluated on the VAS)

• weighted sum of pain intensity difference (SPID) expressed as a percentage of the maximum achievable SPID (%SPIDweighted)

• extent of impairment of daily activities before and after 30 min, 1, 2, 3 and 4 hours of drug administration (4-point verbal rating scale (VRS): 0 = ‘not impaired’, 1 = ‘somewhat impaired’, 2 = ‘greatly impaired’, 3 = ‘usual daily activities impossible’)

• global assessment of efficacy by the patient (4-point VRS: 1 = ‘very good’, 2 = ‘good’, 3 = ‘less good’, 4 = ‘poor’) within 12 hours after administration of the trial medication based on the question ‘How do you assess the efficacy of your tablets?’

• global assessment of tolerability by the patient and investigator (4-point VRS: 1 = ‘very good’, 2 = ‘good’, 3 = ‘less good’, 4 = ‘poor’)

• safety assessment: recording of adverse events (time of onset, duration and intensity of adverse events; relationship between the drug treatment and adverse event determined by the investigator)

Efficacy and safety endpoints were calculated twice, for the assessments at the end of the open pre-phase and for the assessments at the end of the randomized treatment phase.
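The VAS-based endpoints listed in this section can be illustrated for a single patient diary. The sketch below is only a plausible reading of the definitions: the text does not state how the "calculated" times were derived or how the SPID was weighted, so linear interpolation between diary time points and weighting by the time elapsed since the previous measurement are assumptions, and all function names and diary values are hypothetical.

```python
# Illustrative sketch only. Linear interpolation and time weighting are
# assumptions, not the trial's documented algorithm; names are hypothetical.

def time_to_threshold(times_h, vas_mm, threshold_mm):
    """First time (hours) at which the linearly interpolated VAS value
    reaches threshold_mm, or None if this never happens in the diary."""
    if vas_mm[0] <= threshold_mm:
        return times_h[0]
    for i in range(1, len(times_h)):
        t0, v0 = times_h[i - 1], vas_mm[i - 1]
        t1, v1 = times_h[i], vas_mm[i]
        if v1 <= threshold_mm:
            # All earlier readings were above the threshold, so v0 > v1 here.
            return t0 + (v0 - threshold_mm) * (t1 - t0) / (v0 - v1)
    return None  # corresponds to ">4:00" in a 4 hour diary


def percent_spid_weighted(times_h, vas_mm):
    """Time-weighted sum of pain intensity differences (baseline minus
    current VAS) as a percentage of the maximum achievable sum."""
    baseline = vas_mm[0]
    spid = 0.0
    for i in range(1, len(times_h)):
        weight = times_h[i] - times_h[i - 1]  # assumed weight: elapsed hours
        spid += weight * (baseline - vas_mm[i])
    max_spid = baseline * (times_h[-1] - times_h[0])
    return 100.0 * spid / max_spid


# Hypothetical diary: baseline, 30 min, 1, 2, 3 and 4 hours after intake.
times_h = [0, 0.5, 1, 2, 3, 4]
vas_mm = [60, 50, 30, 20, 10, 5]

t50 = time_to_threshold(times_h, vas_mm, 0.5 * vas_mm[0])  # time to 50% relief
t10 = time_to_threshold(times_h, vas_mm, 10)               # time to 10 mm
vas_2h = vas_mm[times_h.index(2)]
pid_2h = vas_mm[0] - vas_2h                                # PID at 2 hours
pain_free_2h = vas_2h <= 10                                # pain-free at 2 hours
pct_spid = percent_spid_weighted(times_h, vas_mm)

print(t50, t10, pid_2h, pain_free_2h, pct_spid)
```

Under these assumptions a patient whose pain worsens has negative pain intensity differences, so %SPIDweighted can fall below zero for individual patients.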

Statistical analysis

The efficacy endpoints were compared by means of descriptive statistics sorted by the categories of the patient’s global assessment of efficacy.

The performance of the endpoints in discriminating between patients satisfied and patients non-satisfied with the efficacy was further evaluated using receiver operating characteristic (ROC) methodology (31). Patients assessing efficacy as very good or good were classified as satisfied, patients scoring efficacy less good or poor as non-satisfied. Subsequently and for each continuous endpoint separately, cut-off points on the measurement scale for the endpoint were deduced from ROC curves using logistic regression analysis. If the patient was classified as satisfied with the efficacy and the endpoint measurement was above the cut-off point, the outcome was called true positive (Figure 1A). If the patient was classified as non-satisfied and the endpoint measurement was below the cut-off point, the outcome was called true negative. The ROC curve displays the true positive (sensitivity) versus the false positive (one minus specificity) rates for the range of possible cut-off points for predicting the global assessment of efficacy by the patient (Figure 1B and C). The area under the ROC curve (AUC) was calculated as a summary measure of its discriminatory ability. The cut-off point was chosen where the sensitivity and specificity were equal assuming equal importance of sensitivity and specificity.

As the endpoint ‘pain-free’ at a pre-specified time point is a binary variable, a ROC curve cannot be calculated. There is only one pair of sensitivity and specificity.

All calculations were performed twice based on the data of the pre-phase and the data of the randomized treatment phase.
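The ROC procedure described in this section can be sketched on a toy data set. The study derived its curves via logistic regression on individual patient data; the sketch below instead uses a plain empirical ROC, which is a simpler stand-in and not the authors' procedure. The scores, labels and function names are hypothetical, and the endpoint is assumed to be oriented so that higher values indicate better relief (for the time-based endpoints the sign would have to be reversed).

```python
# Minimal empirical ROC sketch; hypothetical data and names, not the
# logistic-regression-based procedure used in the study.

def split(scores, satisfied):
    pos = [s for s, y in zip(scores, satisfied) if y]       # satisfied
    neg = [s for s, y in zip(scores, satisfied) if not y]   # non-satisfied
    return pos, neg


def roc_points(scores, satisfied):
    """(cut-off, sensitivity, specificity) for every observed score used
    as cut-off; a patient is predicted 'satisfied' if score >= cut-off."""
    pos, neg = split(scores, satisfied)
    points = []
    for c in sorted(set(scores)):
        sens = sum(s >= c for s in pos) / len(pos)
        spec = sum(s < c for s in neg) / len(neg)
        points.append((c, sens, spec))
    return points


def auc(scores, satisfied):
    """Area under the empirical ROC curve, computed as the probability that
    a satisfied patient scores higher than a non-satisfied one (ties = 1/2)."""
    pos, neg = split(scores, satisfied)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def balanced_cutoff(scores, satisfied):
    """Cut-off at which sensitivity and specificity are closest to equal."""
    return min(roc_points(scores, satisfied), key=lambda p: abs(p[1] - p[2]))


# Hypothetical PID values at 2 hours (mm) and satisfaction labels.
pid_2h = [55, 48, 40, 36, 30, 52, 20, 33, 10, 38, 5, 25]
satisfied = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

cutoff, sens, spec = balanced_cutoff(pid_2h, satisfied)
area = auc(pid_2h, satisfied)
print(cutoff, sens, spec, area)
```

Choosing the cut-off by minimizing the absolute difference between sensitivity and specificity mirrors the equal-importance assumption stated above; a different weighting would simply replace the `key` function.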

Results

Patient characteristics

The full analysis set comprised 1743 patients recruited in 133 centres (29). Of these, 15 patients in the pre-phase and nine patients in the randomized treatment phase did not assess the global efficacy at the end of the respective study phase. The remaining 1728 patients in the pre-phase and 1734 patients in the randomized treatment phase were included in the evaluation (76% women, 24% men; median age: 38 years; range 16–72 years). Without treatment, the usual pain intensity was severe or very severe in 62% and moderate in 37% of patients. The severity of pain was associated with disability of performing usual daily activities. The mean ± SD pain intensity at baseline was 59.1 ± 20.6 mm in the open pre-phase and 64.3 ± 20.3 mm in the randomized treatment phase.

Major efficacy results

The superior efficacy of the triple combination containing acetylsalicylic acid, paracetamol and caffeine could be shown for all efficacy endpoints such as the ‘time to 50% pain relief’ (primary endpoint), ‘time until reduction of pain intensity to 10 mm’, ‘pain intensity difference’, ‘%SPIDweighted’, ‘extent of impairment of daily activities’, and ‘patient’s global efficacy assessment’ (29).

Group differences sorted by the assessment of efficacy

The majority of patients assessed efficacy as very good or good in both study phases (68% in the pre-phase, 64% in the randomized treatment phase, Table 1). Whereas the percentage of patients scoring efficacy as less good was nearly equal in both study phases (25% and 24% in the pre-phase and treatment phase, respectively), a slightly higher percentage of patients recorded poor efficacy in the randomized treatment phase (12% versus 7% in the pre-phase).
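The percentages quoted in this paragraph follow from the group sizes reported in Table 1. As a small arithmetic check, the sketch below recomputes them; the variable and function names are illustrative, while the counts are those given in Table 1.

```python
# Group sizes per category of the global efficacy assessment (Table 1).
counts = {
    "pre-phase": {"very good": 255, "good": 923, "less good": 424, "poor": 126},
    "treatment phase": {"very good": 353, "good": 759, "less good": 412, "poor": 210},
}


def shares(phase_counts):
    """Total number of patients and percentage per assessment category."""
    total = sum(phase_counts.values())
    pct = {k: 100.0 * n / total for k, n in phase_counts.items()}
    # 'Satisfied' pools the categories very good and good.
    pct["satisfied"] = pct["very good"] + pct["good"]
    return total, pct


summary = {phase: shares(c) for phase, c in counts.items()}
for phase, (total, pct) in summary.items():
    print(phase, total, {k: round(v) for k, v in pct.items()})
```

The totals reproduce the 1728 and 1734 evaluated patients, and the rounded shares match the 68% and 64% satisfied, 25% and 24% less good, and 7% and 12% poor stated in the text.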

All efficacy endpoints improved in parallel to the increase of the patient’s efficacy assessment (Table 1). Patients with the highest rating of efficacy had the fastest and strongest pain relief. In comparison, patients assessing efficacy as less good reached a 50% pain relief on average nearly an hour later (median time to 50% pain relief 1:45 h:min) compared with those scoring efficacy as at least good (median time to 50% pain relief 0:51 h:min) in the pre-phase. Simultaneously, the extent of pain relief was only half as great 2 hours after medication intake (mean PID 25.8 mm versus 46.2 mm as assessed on the VAS). Patients scoring efficacy as poor experienced almost no pain relief within the 4 hour observation period (median time to 50% pain relief > 4 hours and mean PID 7.5 mm).

The corresponding values in the randomized treatment phase were overall slightly worse regarding the time to pain relief and slightly better with respect to the extent of pain relief (Table 1). The differences between the categories of the patient’s global assessment of efficacy were qualitatively and quantitatively well comparable to those in the pre-phase.

ROC curves and cut-off points

The ROC curves of all continuous efficacy endpoints were very close together and partly crossing (Figure 2). For all endpoints, the ROC curves, the sensitivity and specificity and, consistent with this, the AUC were more similar for the data from the randomized treatment phase than for those from the pre-phase (Table 2). The AUC in the pre-phase ranged from 0.77 to 0.86 and that in the treatment phase from 0.84 to 0.89. The endpoint ‘%SPIDweighted’ was slightly but consistently superior to the other endpoints.

The optimal cut-off points for differentiating between satisfied and non-satisfied patients were lower in the pre-phase than in the randomized treatment phase for the majority of endpoints. The following cut-off points were deduced from the ROC curves for the pre-phase and treatment phase, respectively: ‘time to 50% pain relief’ 1:10 and 1:31 h:min, ‘time until reduction of pain intensity to 10 mm’ 2:40 and 3:00 h:min, ‘%SPIDweighted’ 68 and 64%, ‘PID at 2 hours’ 35 and 35 mm. Based on these cut-off points the sensitivity and specificity ranged from 70 to 77% in the pre-phase and 76 to 79% in the treatment phase. The binary endpoint ‘pain-free at 2 hours’ showed a clearly higher specificity of correctly predicting a non-satisfied patient (80 and 87% for the pre-phase and treatment phase, respectively) than the sensitivity of correctly predicting a satisfied patient (65 and 61%).
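The single sensitivity and specificity pair for the binary endpoint ‘pain-free at 2 hours’ can be approximately reconstructed from the group sizes and pain-free percentages in Table 1, pooling very good and good as satisfied and less good and poor as non-satisfied. The sketch below is a cross-check rather than the authors' calculation: it works from the rounded percentages in Table 1, so the reconstructed patient counts are fractional, and the names are illustrative.

```python
# (number of patients, percentage pain-free at 2 hours) per category, Table 1.
table1 = {
    "pre-phase": {
        "very good": (255, 78.4), "good": (923, 61.5),
        "less good": (424, 23.2), "poor": (126, 7.1),
    },
    "treatment phase": {
        "very good": (353, 78.4), "good": (759, 53.2),
        "less good": (412, 18.2), "poor": (210, 2.4),
    },
}

SATISFIED = ("very good", "good")
NON_SATISFIED = ("less good", "poor")


def sens_spec(phase):
    """Sensitivity: share of satisfied patients who were pain-free.
    Specificity: share of non-satisfied patients who were not pain-free."""
    n_sat = sum(phase[k][0] for k in SATISFIED)
    pf_sat = sum(phase[k][0] * phase[k][1] / 100 for k in SATISFIED)
    n_non = sum(phase[k][0] for k in NON_SATISFIED)
    pf_non = sum(phase[k][0] * phase[k][1] / 100 for k in NON_SATISFIED)
    return pf_sat / n_sat, 1 - pf_non / n_non


results = {name: sens_spec(phase) for name, phase in table1.items()}
for name, (sens, spec) in results.items():
    print(name, round(100 * sens), round(100 * spec))
```

Rounded to whole percentages, the reconstruction agrees with the reported 65% and 80% for the pre-phase and 61% and 87% for the treatment phase.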

Discussion

Responsiveness must be demonstrated for the specific disease indication and the particular patient population based on the application of treatments of proven efficacy (32). The data used were taken from a clinical study that showed the superior efficacy of the fixed combination containing acetylsalicylic acid, paracetamol and caffeine over the combination without caffeine, the single preparations, and placebo in the treatment of headache for all efficacy endpoints, such as the ‘time to 50% pain relief’ (primary endpoint), ‘time until reduction of pain intensity to 10 mm’, ‘pain intensity difference’, ‘%SPIDweighted’, ‘extent of impairment of daily activities’, and ‘patient’s global efficacy assessment’ (29).

The quantification of the responsiveness always necessitates the choice of a reference criterion that describes the status or change in a status of a patient’s condition. There is currently no gold standard for this criterion in pain outcome measures. The choice of the criterion is always problematic (24). It should be specific to the disease and the patient population studied. In self-medication, the patient’s choice of a particular therapy, which they take for their headache, depends on the subjective perception of efficacy and tolerability of the drug without consulting a doctor. As the global assessment of overall efficacy by the patient aims to summarize the patient’s overall impression about their state or change in their state (33), it is of particular importance in clinical trials with OTC medications for self-medication of headaches and qualifies as a criterion with which to quantify and compare the performance of efficacy endpoints in these trials. It is relevant and sensible to ask the patient to assess their perceived benefit (24) and to use their decision as a reference.

Table 1. Summary of descriptive statistics for primary and secondary efficacy endpoints grouped by patient’s global efficacy assessment.

Pre-phase
Assessment of efficacy                                     Very good        Good             Less good        Poor
                                                           (n = 255)        (n = 923)        (n = 424)        (n = 126)
Time to 50% pain relief [h:min]
  Median                                                   0:33             0:51             1:45             >4:00
  Interquartile range                                      0:19 to 1:00     0:34 to 1:25     0:50 to >4:00    3:28 to >4:00
Time until reduction of pain intensity to 10 mm [h:min]
  Median                                                   0:53             1:44             >4:00            >4:00
  Interquartile range                                      0:28 to 1:49     0:56 to 2:48     2:13 to >4:00    >4:00 to >4:00
%SPIDweighted [%]
  Mean (SD)                                                85.3 (16.4)      75.3 (20.1)      40.2 (43.9)      5.0 (42.4)
PID at 2 hours [mm]
  Mean (SD)                                                50.9 (21.1)      46.2 (20.5)      25.8 (24.6)      7.5 (22.5)
Pain-free at 2 hours [%]
  Percentage                                               78.4             61.5             23.2             7.1

Randomized treatment phase
Assessment of efficacy                                     Very good        Good             Less good        Poor
                                                           (n = 353)        (n = 759)        (n = 412)        (n = 210)
Time to 50% pain relief [h:min]
  Median                                                   0:38             1:00             2:19             >4:00
  Interquartile range                                      0:22 to 1:02     0:38 to 1:40     1:16 to >4:00    >4:00 to >4:00
Time until reduction of pain intensity to 10 mm [h:min]
  Median                                                   0:59             1:56             >4:00            >4:00
  Interquartile range                                      0:44 to 1:52     1:05 to 3:18     2:37 to >4:00    >4:00 to >4:00
%SPIDweighted [%]
  Mean (SD)                                                85.5 (13.2)      73.7 (18.9)      43.7 (32.8)      8.5 (27.2)
PID at 2 hours [mm]
  Mean (SD)                                                56.7 (21.5)      49.6 (21.4)      27.2 (21.9)      6.5 (18.1)
Pain-free at 2 hours [%]
  Percentage                                               78.4             53.2             18.2             2.4

Figure 1, panel (A): decision table.

                                       Classification due to cut-off for the endpoint
Assessment of efficacy by patient      non-improved         improved
  non-satisfied                        true negative        false positive
  satisfied                            false negative       true positive

[Figure 1, panels (B) and (C): graphics not reproduced. Panel (C) plots sensitivity (true positive) [%] against 100 − specificity (false positive) [%], both from 0 to 100, and marks the point (100 − Spec(x), Sens(x)) belonging to a cut-off x.]

Figure 1. ROC method. (A) Decision table with possible outcomes. (B) Distribution of data of the efficacy endpoint for the two groups of patients with efficacy assessment as non-satisfied or satisfied including a possible cut-off point. (C) ROC curve for all possible cut-off points.

[Figure 2: graphic not reproduced. Two panels (pre-phase and randomized treatment phase), each plotting sensitivity against 1 − specificity (both from 0.0 to 1.0) for the endpoints time to 50% pain relief, time to pain intensity of 10 mm, %SPIDweighted, pain intensity difference at 2 hours and pain-free at 2 hours.]

Figure 2. Receiver operating characteristic (ROC) curves for primary and secondary efficacy endpoints based on patient’s global efficacy assessment as external criterion.

The global assessment of efficacy in clinical trials reflects these perceptions best. However, as we do not postulate global assessment of efficacy as the ‘gold standard’, further comparisons of efficacy endpoints against other reference criteria may be helpful.

When global assessment of efficacy by the patient was used as external criterion, we have shown clinically relevant differences between patients satisfied and non-satisfied with the treatment for all efficacy endpoints included in the analysis. Patients satisfied with their medication reported a 50% pain relief within approximately 1–1.5 hours. A pain-free state should be reached by at least 3 hours after intake of the medication for patients to be satisfied with their medication. For these patients, the reduction in pain intensity 2 hours after medication intake needed to be greater than 35 mm as assessed on the 100 mm VAS.

The cut-offs for differentiating between patients satisfied and non-satisfied with their treatment were determined assuming equal sensitivity and specificity. This was possible for all continuous endpoints. The binary endpoint ‘pain-free at 2 hours’ used an a priori fixed cut-off point with clearly higher specificity of correctly classifying non-satisfied patients compared with the sensitivity of correctly classifying satisfied patients. Nearly balanced sensitivity and specificity would result for the binary outcome ‘pain-free at 3 hours’. Higher specificity than sensitivity could be reached for the other endpoints if more stringent cut-off points were chosen. Giving more weight to either specificity or sensitivity might be meaningful for specific study objectives. Without any restrictions it is reasonable to balance both.

Lipton stated ‘The assessment of migraine pain,

associated symptoms, and disability is subjective, in

that clinicians rely on patient rating of the severity of 

migraine symptoms. Patient assessment and corres-

ponding physician evaluation form the basis for

treatment decisions and assessment of the efficacy of a

migraine therapy’ (34). The FDA added, ‘For some

treatment effects, the patient is the only source of 

data. For example, pain intensity and pain relief are

the fundamental measures used in the development of 

analgesic products. Many patient-reported outcome

instruments are able to detect mean changes that are

very small; accordingly it is important to consider

whether such changes are meaningful’ (35). Although

there are some surveys that judge and rank the rele-

vance of possible endpoints from the perspective of 

the patients (36–39), there is no clinical trial in head-

ache to our knowledge in which the specificity and the

sensitivity of the primary endpoint were analysed quantitatively with the patients’ global assessment of efficacy

as reference criterion. This obvious question

has remained unanswered. Corresponding analyses would

be helpful, but need access to individual

patient-related information.

In migraine and tension-type headache, pain

intensity increases over a certain period of time until

fully developed. If the medication is taken very early,

endpoints related to the baseline value are not very

useful. This can be a problem in early intervention stu-

dies. The patients in the Thomapyrin study (29), how-

ever, were not instructed to take their study medication

at any particular time point, and the baseline values and

the development of pain intensity show that in most

patients the headache pain was fully developed.

The ROC curve calculations confirmed a high

responsiveness for all efficacy endpoints included

in this study. The observed differences between end-

points are, in general, small. Therefore the ROC

curves for all endpoints were very close and partly

crossing, although the endpoint ‘%SPIDweighted’

proved slightly but consistently superior to the other

endpoints.

Table 2. Area under the receiver operating characteristic (ROC) curve (AUC) and sensitivity and specificity of cut-off points for primary and secondary efficacy endpoints based on patient’s global efficacy assessment as external criterion.

Endpoint                                                  AUC    Cut-off point   Sensitivity [%]   Specificity [%]

Pre-phase

Time to 50% pain relief [h:min]                           0.77   1:10            70                70

Time until reduction of pain intensity to 10 mm [h:min]   0.81   2:40            75                75

%SPIDweighted [%]                                         0.86   68              77                77

PID at 2 hours [mm]                                       0.78   35              71                71

Pain-free at 2 hours [%]                                  –      –               65                80

Randomized treatment phase

Time to 50% pain relief [h:min]                           0.85   1:31            76                76

Time until reduction of pain intensity to 10 mm [h:min]   0.84   3:00            77                77

%SPIDweighted [%]                                         0.89   64              79                79

PID at 2 hours [mm]                                       0.84   35              76                76

Pain-free at 2 hours [%]                                  –      –               61                87


As recommended in the 2nd edition of the

‘Guidelines for controlled trials of drugs in migraine’

the ‘percentage of patients pain-free at 2 h, before any

rescue medication, should usually be the primary meas-

ure of efficacy’ (40). According to the authors of the

comments on this recommendation, this endpoint has

the advantage that it ‘reflects patients’ expectations, is simple and not affected by rescue medication’ (40). The

value of the 2-hour pain-free measure cannot be over-

emphasized, according to Ramadan (41). Although

migraine patients stated incomplete or inconsistent

pain relief as important issues for their assessment of 

a treatment in a telephone survey (36), the Thomapyrin

study (29) showed that at least for analgesics used for

the self-medication of headaches the patients weighted

‘time to 50% pain reduction’ much higher than ‘time to

pain-free’ for their global evaluation of efficacy (22).

Goadsby stated as a disadvantage that for patients

with slowly settling headache the transition to no pain

is difficult to discern and for some patients the reduc-

tion in headache pain to mild is a substantial and very

beneficial result (7). Tfelt-Hansen and coauthors say in

the guideline that ‘resolution, not alleviation, within 2 h

might seem unrealistic with some drugs’ (40). The end-

point ‘pain-free at 2 hours’ showed the expected high

specificity, but at the cost of a concurrently low sensi-

tivity. It clearly makes less use of the available infor-

mation than, for example, the endpoint ‘time to 50%

pain reduction’ and copes less well with the objective ‘to

choose appropriate endpoints that reflect realistic treat-

ment goals for individual patients’ (42).

It was again recommended in the 2nd edition of the ‘Guidelines for controlled trials of drugs in tension-type

headache’ that ‘pain-free rate at 2 h should

be the primary efficacy measure’ (43). However, other

possible endpoints were also discussed: ‘Sum of pain

intensity differences (SPID) could theoretically be

useful because it has the advantage of summarizing the

benefits of treatment over a clinically relevant period,

e.g. 2 h’ (43). Ramadan pointed to the common use of 

endpoints such as SPID in pain studies (41). The version

of SPID weighted according to the time points of pain

intensity assessment proved to be the endpoint with the

highest responsiveness in the present study. However,

SPID and %SPIDweighted are not easy to interpret.

The time course of pain reduction is of higher import-

ance for the patients in the treatment of acute pain,

including headache, than for example in the treatment

of chronic pain under steady state conditions of the

treatment. Tfelt-Hansen et al., in their review of single

attack data, found that SPID did not appear to add

anything and assumed that SPID usually gives similar

results to other headache relief measures (44). Our ana-

lysis using the ROC method supports this assumption.
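Because SPID-type endpoints are not easy to interpret, a small worked example may help. The exact formula for ‘%SPIDweighted’ is not given in this section, so the sketch below uses one common convention as an explicit assumption: each pain intensity difference (PID) relative to baseline is weighted by the time elapsed since the previous assessment, and the sum is expressed as a percentage of the maximum attainable value (complete pain relief from time zero onwards). The function name `pct_spid_weighted`, the assessment times, and the pain values are invented for illustration.

```python
def pct_spid_weighted(times_h, pain_mm):
    """Time-weighted SPID as a percentage of its maximum attainable value.

    Assumed convention (not taken from the article):
      SPID = sum over assessments i >= 1 of (baseline - pain_i) * (t_i - t_{i-1})
      maximum = baseline * (t_last - t_0)
    times_h[0] is the time of medication intake, pain_mm[0] the baseline VAS value.
    """
    baseline = pain_mm[0]
    spid = 0.0
    for i in range(1, len(times_h)):
        pid = baseline - pain_mm[i]                 # pain intensity difference
        spid += pid * (times_h[i] - times_h[i - 1])  # weight by interval length
    max_spid = baseline * (times_h[-1] - times_h[0])
    return 100.0 * spid / max_spid


# Invented illustrative patient (NOT study data): VAS pain in mm over 4 hours.
times_h = [0, 0.5, 1, 2, 3, 4]
pain_mm = [60, 45, 30, 15, 6, 0]

example = pct_spid_weighted(times_h, pain_mm)
print(example)  # 75.625
```

Under this assumed convention, a patient with no pain relief at any assessment scores 0% and a patient who is pain-free from the first assessment onwards scores 100%, so a value can be read as the share of the maximum possible relief achieved over the observation interval.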

The ROC method allows the quantification and

comparison of the responsiveness between clinical end-

points of very different types as the ROC curve depends

only on the ranks of the observations and is independent

of the scale used to measure the endpoint.
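The rank-based nature of the ROC curve can be illustrated with a minimal sketch. It computes the area under the curve as the proportion of (satisfied, non-satisfied) patient pairs in which the satisfied patient has the larger value, with ties counted as one half, and shows that strictly increasing rescalings of the endpoint leave this area unchanged. The function name `auc` and the data are invented for illustration; for endpoints where smaller values are better (for example, time to 50% pain relief), the values would first be negated.

```python
import math


def auc(satisfied_values, non_satisfied_values):
    """AUC as the share of pairs in which the satisfied patient ranks higher (ties = 1/2)."""
    wins = 0.0
    for s in satisfied_values:
        for n in non_satisfied_values:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(satisfied_values) * len(non_satisfied_values))


# Invented illustrative data (NOT study data): PID at 2 hours in mm.
satisfied_pid = [50, 42, 38, 30, 61]
non_satisfied_pid = [10, 25, 36, 5, 33]

auc_mm = auc(satisfied_pid, non_satisfied_pid)

# Strictly increasing transformations change the scale but not the ranks.
auc_cm = auc([x / 10 for x in satisfied_pid], [x / 10 for x in non_satisfied_pid])
auc_log = auc([math.log(x) for x in satisfied_pid],
              [math.log(x) for x in non_satisfied_pid])

print(auc_mm, auc_cm, auc_log)  # 0.92 0.92 0.92
```

Since only the ordering of the observations enters the calculation, endpoints measured in minutes, millimetres, or percent can be compared on the same footing.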

Responsiveness, the ability of an outcome measure

to detect clinically important changes in a specific condition

of a patient, is not yet sufficiently considered in the discussion of possible endpoints in both IHS guidelines,

even though it is an aspect of great relevance in

clinical trials. This should be added in future revisions

of these guidelines.

Funding

This work was supported by Boehringer Ingelheim Pharma

GmbH & Co. KG, Germany.

Conflict of interests

BA and HP are employees of Boehringer Ingelheim Pharma

GmbH & Co. KG, Germany. BP declares no conflicts of 

interest. HCD received honoraria for participation in clinical

trials, contribution to advisory boards or oral presentations

from: Addex Pharma, Allergan, Almirall, AstraZeneca, Bayer

Vital, Berlin Chemie, Coherex, CoLucid, Boehringer

Ingelheim, Bristol-Myers Squibb, GlaxoSmithKline,

Grünenthal, Janssen-Cilag, Lilly, La Roche, 3M Medica,

Medtronic, Menerini, Minster, MSD, Novartis, Johnson &

Johnson, Pierre Fabre, Pfizer, Schaper and Brümmer,

SanofiAventis, and Weber & Weber. HCD has no ownership

interest and does not own stocks of any pharmaceutical

company.

References

1. Wallenstein SL and Houde RW. The clinical evaluation of

analgesic effectiveness. In: Ehrenpreis S, Neidle A (eds)

Methods in narcotics research. New York: Marcel

Dekker, 1975, pp.127–145.

2. Lipton RB. Methodologic issues in acute migraine clinical

trials. Neurology 2000; 55: 53–97.

3. Huskisson EC. Measurement of pain. Lancet 1974; 2:

1127–1131.

4. Kremer E, Atkinson JH and Ignelzi RJ. Measurement of 

pain: patient preference does not confound pain measure-

ment. Pain 1981; 10: 241–248.

5. Jensen MP, Karoly P and Braver S. The measurement of 

clinical pain intensity: a comparison of six methods. Pain

1986; 27: 117–126.

6. Skovlund E and Flaten O. Response measures in

the acute treatment of migraine. Cephalalgia 1995; 15:

519–522.

7. Goadsby PJ. The scientific basis of medication choice in

symptomatic migraine treatment. Can J Neurol Sci 1999;

26(Suppl. 3): S20–S26.

8. Salonen R. Drug comparisons: why are they so difficult?

Cephalalgia 2000; 20(Suppl. 2): 25–32.

9. Sheftell FD and Fox AW. Acute migraine treatment out-

come measures: a clinician’s view. Cephalalgia 2000;

20(Suppl. 2): 14–24.



10. McCrory DC, Gray RN, Tfelt-Hansen P, Steiner TJ and

Taylor FR. Methodological issues in systematic reviews

of headache trials: adapting historical diagnostic classifi-

cations and outcome measures to present-day standards.

Headache 2005; 45: 459–465.

11. Pilgrim AJ. Methodology of clinical trials of sumatriptan

in migraine and cluster headache. Eur Neurol 1991; 31:

295–299.

12. Pilgrim AJ. Early-phase clinical trials in migraine:

efficacy, safety and dose-finding. Cephalalgia 1995;

(Suppl. 15): 10–13.

13. Pilgrim AJ. Lessons learned from the clinical develop-

ment of sumatriptan. In: Olesen J, Tfelt-Hansen P (eds)

Headache treatment: trial methodology and new drugs.

Philadelphia: Lippincott-Raven Publishers, 1997,

pp.119–124.

14. IHS Committee on Clinical Trials in Migraine: Guide-

lines for controlled trials of drugs in migraine (first edi-

tion). Cephalalgia 1991; 11: 1–12.

15. Tfelt-Hansen P. Complete relief (IHS criterion) or no or

mild pain (Glaxo criterion)? In: Olesen J, Tfelt-Hansen P (eds) Headache treatment: trial methodology and new

drugs. Philadelphia: Lippincott-Raven Publishers, 1997,

pp.157–160.

16. Pietrini U, De Luca M and Del Bene E. Endpoints to

evaluate efficacy of symptomatic drugs in migraine:

what do patients want? Headache 2002; 42: 948–949.

17. Coeytaux RR, Frasier PY and Reid A. Patient-centered

outcomes for frequent headaches. Headache 2007; 47:

480–485.

18. Landy SH, McGinnis JE and McDonald SA. Pilot study

evaluating preference for 3-mg versus 6-mg subcutaneous

sumatriptan. Headache 2005; 45: 346–349.

19. Bigal M, Rapoport A, Aurora S, Sheftell F, Tepper S and

Dahlof C. Satisfaction with current migraine therapy: experience from 3 centers in US and Sweden. Headache

2007; 47: 475–479.

20. Friedman LM, Furberg CD and DeMets DL.

Fundamentals of clinical trials. Boston: John Wright

PSG Inc, 1982, pp.1–224.

21. Kirshner B and Guyatt GH. A methodological frame-

work for assessing health indices. J Chron Dis 1985; 38:

27–36.

22. Pageler L, Diener HC, Pfaffenrath V, Peil H and

Aicher B. Clinical relevance of efficacy endpoints in

OTC headache trials. Headache 2009; 49: 646–654.

23. Bronfort G and Bouter LM. Responsiveness of general

health status in chronic low back pain: a comparison of 

the COOP Charts and the SF-36. Pain 1999; 83:

201–209.

24. Beurskens AJHM, de Vet HCW and Köke AJA. Responsiveness

of functional status in low back pain: a comparison

of different instruments. Pain 1996; 65: 71–76.

25. Deyo RA and Centor RM. Assessing the responsiveness

of functional scales to clinical change: an analogy to diag-

nostic test performance. J Chron Dis 1986; 39: 897–906.

26. Farrar JT, Portenoy RK, Berlin JA, Kinman JL and

Strom BL. Defining the clinically important differ-

ence in pain outcome measures. Pain 2000; 88:

287–294.

27. Buchbinder R, Bombardier C, Yeung M and Tugwell P.

Which outcome measures should be used in rheumatoid

arthritis clinical trials? Arthritis Rheum 1995; 38:

1568–1580.

28. Farrar JT, Young Jr JP, LaMoreaux L, Werth JL and

Poole RM. Clinical importance of changes in chronic

pain intensity measured on an 11-point numerical pain

rating scale. Pain 2001; 94: 149–158.

29. Diener HC, Pfaffenrath V, Pageler L, Peil H and Aicher

B. The fixed combination of acetylsalicylic acid, paraceta-

mol and caffeine is more effective than single substances

and dual combination for the treatment of headache: a

multicentre, randomized, double-blind, single dose, pla-

cebo-controlled parallel group study. Cephalalgia 2005;

25: 776–787.

30. Headache Classification Committee of the International

Headache Society. Classification and diagnostic criteria

for headache disorders, cranial neuralgias and facial pain.

Cephalalgia 1998; 8(Suppl. 7): 1–96.

31. Pepe MS. The statistical evaluation of medical tests for

classification and prediction. Oxford: Oxford University Press, 2003.

32. Revicki DA, Cella D, Hays RD, Sloan JA, Lenderking

WR and Aaronson NK. Responsiveness and minimal

important differences for patient reported outcomes.

Health Qual Life Outcomes 2006; 4: 70–74.

33. ICH Harmonised Tripartite Guideline E9. Statistical

principles for clinical trials. CPMP/ICH/363/96, 1998: 1–37.

34. Lipton RB. Methodologic issues in acute migraine clin-

ical trials. Neurology 2000; 55(Suppl. 2): S3–S7.

35. U.S. Department of Health and Human Services FDA

Center for Drug Evaluation and Research, U.S. Depart-

ment of Health and Human Services FDA Center for

Biologics Evaluation and Research, U.S. Department of 

Health and Human Services FDA Center for Devices and Radiological Health. Guidance for industry: patient-reported

outcome measures: use in medical product

development to support labeling claims: draft guidance.

Health Qual Life Outcomes 2006; 4: 79.

36. Lipton RB and Stewart WF. Migraine therapy: do doc-

tors understand what patients with migraine want from

therapy? Headache 1999; 39(Suppl. 2): S20–S26.

37. Colman SS, Brod MI, Krishnamurthy A, Rowland CR,

Jirgens KJ and Gomez-Mancilla B. Treatment satisfac-

tion, functional status, and health-related quality of life

of migraine patients treated with almotriptan or suma-

triptan. Clin Ther 2001; 23: 127–145.

38. Dodick D. Patient perceptions and treatment preferences

in migraine management. CNS Drugs 2002; 16(Suppl. 1):

19–24.

39. Lipton RB, Hamelsky SW and Dayno JM. What do

patients with migraine want from acute migraine treat-

ment? Headache 2002; 42(Suppl. 1): 3–9.

40. Tfelt-Hansen P, Block G, Dahlöf C, Diener HC, Ferrari

MD, Goadsby PJ, et al. Guidelines for controlled trials of 

drugs in migraine: second edition. Cephalalgia 2000; 20:

765–786.

41. Ramadan NM. Assessing the efficacy of drugs for the

acute treatment of migraine: issues in clinical trial

design. CNS Drugs 2002; 16: 181–196.


42. Antonaci F, Sances G, Guaschino E, De Cillis I, Bono G

and Nappi G. Meeting patient expectations in migraine

treatment: what are the key endpoints? J Headache Pain

2008; 9: 207–213.

43. Bendtsen L, Bigal ME, Cerbo R, Diener HC, Holroyd K,

Lampl C, et al. Guidelines for controlled trials of drugs in

tension-type headache: second edition. Cephalalgia 2010;

30: 1–16.

44. Tfelt-Hansen P, McCaroll K and Lines C. Sum of pain

intensity differences (SPID) in migraine trials. A com-

ment based on four rizatriptan trials. Cephalalgia 2002;

22: 664–666.
