statistical issues in the design of a trial, part 2

Statistical Issues in the Design of a Trial, Part 2. Karen Pieper, MS, Duke Clinical Research Institute

Upload: merrill-bryan

Post on 30-Dec-2015



TRANSCRIPT

Page 1: Statistical Issues in the Design of a Trial, Part 2

Statistical Issues in the Design of a Trial, Part 2

Karen Pieper, MS
Duke Clinical Research Institute

Page 2: Statistical Issues in the Design of a Trial, Part 2

Primary vs. Secondary Hypotheses

In most studies, we determine one comparison (maybe 2 or 3) that will be our primary comparison of interest. All other comparisons are considered secondary.

Page 3: Statistical Issues in the Design of a Trial, Part 2

Stating the Primary vs. Secondary Hypotheses

What does this mean?

■ We are saying that we are designing our study to answer question X. Whatever results are achieved, the study is valid, and so the results are conclusive (either positive or negative). However, all other endpoints evaluated are not necessarily conclusive but rather are important for generating new hypotheses.

Page 4: Statistical Issues in the Design of a Trial, Part 2

The Multiple Comparisons Problem

Why do we need to do this?

Page 5: Statistical Issues in the Design of a Trial, Part 2

Multiple Comparisons

Flip a coin. What is your chance of getting a head (H)?

50%

Now flip a coin 2 times. What is the chance that you get at least one head?

Possible outcomes: HH, HT, TH, TT

¾ or 75%

Page 6: Statistical Issues in the Design of a Trial, Part 2

Multiple Comparisons

Note, the formula for this is:

$$1 - (\text{probability of a tail})^{\text{number of tosses}} = 1 - (0.5)^{2} = 1 - 0.25 = 0.75$$

Page 7: Statistical Issues in the Design of a Trial, Part 2

Multiple Comparisons

When we perform a test for a clinical study, we prespecify that results will be considered significant only if there is no more than a 5% chance that an effect will be found when, in fact, there really is no effect. (Type I error)

Page 8: Statistical Issues in the Design of a Trial, Part 2

Multiple Comparisons

If we perform 2 independent tests, then the probability of erroneously declaring at least one of them statistically significant is:

$$1 - 0.95^{2} = 1 - 0.9025 = 0.0975$$

If we perform 10 independent tests, then the probability of erroneously declaring that at least one of them is statistically significant is:

$$1 - 0.95^{10} = 1 - 0.5987 = 0.401$$
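The arithmetic on this slide can be checked with a few lines of Python. This is only a minimal sketch: the function name `familywise_error_rate` is made up for illustration, and it assumes the tests are independent and that each uses the same Type I error rate.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Same structure as the coin example: 1 - (chance of no "hit") ** (number of tries)
for n_tests in (1, 2, 5, 10):
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_tests:2d} tests at 0.05 -> {rate:.4f}")
```

With a "hit" probability of 0.5 and 2 tries, the same function gives the 75% from the coin-flip slide.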

Page 9: Statistical Issues in the Design of a Trial, Part 2

Multiple Comparisons

To avoid this problem, we can:

Only perform one test

■ In this case, we specify the primary question that we want to answer. Then we specify a list of secondary questions.

■ In doing this, we are claiming that the primary question or hypothesis is the only one about which we will make conclusive statements.

■ The other questions (or secondary hypotheses) are of interest but are not conclusive. We often refer to these secondary questions as “hypothesis generating.”

Page 10: Statistical Issues in the Design of a Trial, Part 2

Multiple Comparisons

Or, we can:

Decrease the probability that we are willing to accept for a Type I error

■ Say we only declare significance if our test probability is < 0.01. In this case, if we do 5 tests, then we have:

$$1 - (0.99)^{5} = 1 - 0.951 = 0.049 \approx 0.05$$

which is the probability of making at least one Type I error in the study.
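Going the other way, one can ask what per-test threshold keeps the study-wide Type I error near 5%. The sketch below is an illustration, not something taken from the slides: dividing the overall rate by the number of tests is commonly called a Bonferroni correction, and the exact solution for independent tests is commonly called a Šidák correction. The slides name neither, and the function names here are hypothetical.

```python
def bonferroni_alpha(overall_alpha: float, n_tests: int) -> float:
    """Per-test threshold obtained by splitting the overall rate evenly."""
    return overall_alpha / n_tests


def sidak_alpha(overall_alpha: float, n_tests: int) -> float:
    """Per-test threshold that gives exactly overall_alpha
    when the tests are independent."""
    return 1.0 - (1.0 - overall_alpha) ** (1.0 / n_tests)


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive over independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


per_test = bonferroni_alpha(0.05, 5)            # the 0.01 used on the slide
print(per_test, familywise_error_rate(per_test, 5))
print(sidak_alpha(0.05, 5))
```

The even split is slightly conservative: five tests at 0.01 give a study-wide rate just under 0.05.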

Page 11: Statistical Issues in the Design of a Trial, Part 2

Calculating Sample Size: Method A

■ Choose a number of dollars

■ Calculate the number of dollars required per patient enrolled

■ Divide one number into the other

Page 12: Statistical Issues in the Design of a Trial, Part 2

Calculating Sample Size: Method B

■ Develop consensus among providers, patients, and payers on the difference in outcomes needed to change clinical practice (a.k.a. minimally important difference or MID)

■ Estimate standard therapy event rate

■ Calculate number of patients required to demonstrate MID with low probability for missing a real difference

Page 13: Statistical Issues in the Design of a Trial, Part 2

Sample Size Estimation

Sample size calculations depend on:

■ Type I error rate

■ Type II error rate

■ Endpoint to be analyzed

■ Statistical method to be used in analyzing the endpoint

■ Estimated value for the endpoint one expects to see in the control arm

■ Estimated improvement one expects to see in the treatment arm

■ Amount of variation in the endpoint measured

Page 14: Statistical Issues in the Design of a Trial, Part 2

Sample Size Estimation: Type I and Type II Errors

Truth (rows) vs. Test Results (columns)

Truth | Test Result: No Tx Effect | Test Result: Tx Has Effect
No Tx Effect | (no error) | Type I Error
Tx Has Effect | Type II Error | Power

Page 15: Statistical Issues in the Design of a Trial, Part 2

Sample Size Estimation

Sample size calculations depend on:

■ Endpoint to be analyzed

  ■ Yes/no responses

  ■ Continuous responses

  ■ Questionnaire data

  ■ Survival from event following a long period of time

  ■ Repeated measures of an outcome over several weeks

  ■ Etc.

Page 16: Statistical Issues in the Design of a Trial, Part 2

Sample Size Estimation

Sample size calculations depend on:

■ The technique used to analyze the endpoint

  ■ The formula used to calculate a sample size is based on the statistical test one plans to use in the final analysis.

  ■ Different tests often involve different assumptions.

  ■ Each formula for calculating sample size will give a somewhat different answer.

Page 17: Statistical Issues in the Design of a Trial, Part 2

Sample Size Estimation

Sample size calculations depend on:

■ The estimated value for the endpoint one expects to see in the control arm

  ■ The rarity of the endpoint. The rarer the endpoint, the more patients it takes to detect a difference. Estimates of control rates come from:

    ■ Previous studies in the literature

    ■ Pilot studies

    ■ “Best clinical guess”

Page 18: Statistical Issues in the Design of a Trial, Part 2

Sample Size Estimation

Sample size calculations depend on:

■ The estimated improvement one expects to see in the treatment arm

  ■ The greater the improvement one expects to see, the fewer the patients required.

■ The amount of variation in the endpoint measure

  ■ If your measure is a continuous measure (for example, volume measures, blood pressures, ejection fractions, percentages, weight), then less variation in the measure means that fewer patients are needed to detect a difference.
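To make these dependencies concrete, here is a rough sketch of one common sample size formula for a yes/no endpoint, the normal-approximation formula for comparing two proportions. It is an assumption for illustration, not the method used in the lecture. As noted above, other tests lead to other formulas and somewhat different answers. The function name and the default values (two-sided Type I error of 0.05, 80% power) are illustrative choices.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(p_control: float, p_treated: float,
                     alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate patients needed in each arm to detect the difference
    between two event rates with a two-sided test at level alpha."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # Type I error term
    z_beta = NormalDist().inv_cdf(power)            # Type II error term (power = 1 - beta)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    difference = p_control - p_treated
    return ceil((z_alpha + z_beta) ** 2 * variance / difference ** 2)


# Bigger expected improvement -> fewer patients
print(patients_per_arm(0.10, 0.08), patients_per_arm(0.10, 0.05))
# Rarer endpoint, same 20% relative reduction -> more patients
print(patients_per_arm(0.05, 0.04), patients_per_arm(0.20, 0.16))
```

Raising the power or lowering the Type I error rate in the same function increases the required number of patients.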

Page 19: Statistical Issues in the Design of a Trial, Part 2

Sample Size: The Effect of Properly Estimating Treatment Effects

Example: Impact II Primary Endpoint in Treated Patients

Changing Low-dose Arm | Low Dose (n = 1300) | Placebo (n = 1285) | p-value
Actual results | 118 (9.1%) | 149 (11.6%) | 0.035
Adding 1 event | 119 (9.2%) | 149 (11.6%) | 0.042
Adding 2 events | 120 (9.2%) | 149 (11.6%) | 0.049
Adding 3 events | 121 (9.3%) | 149 (11.6%) | 0.057
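The fragility shown in this table can be explored with a short sketch. The slide does not say which test produced its p-values. The code below assumes a two-sided, pooled two-proportion z-test (equivalent to a chi-square test without continuity correction), which happens to give values close to those in the table. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events_a: int, n_a: int,
                           events_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two event rates,
    using the pooled normal approximation."""
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Low-dose arm events: actual result, then 1, 2, and 3 extra events
for events in (118, 119, 120, 121):
    p = two_proportion_p_value(events, 1300, 149, 1285)
    print(f"{events} events vs. 149 -> p = {p:.3f}")
```

Three additional events among 1300 patients are enough to move the result across the 0.05 threshold.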

Page 20: Statistical Issues in the Design of a Trial, Part 2

Sample Size: Summary

You can decrease the sample size needed by:

■ Allowing for a bigger Type I error

■ Allowing for a bigger Type II error

■ Increasing the level of improvement one expects to achieve

■ Choosing a more powerful way of testing

■ For a binary endpoint, choosing the one that is closest to 50% in likelihood of being observed in the control arm

■ For survival endpoints, extending the length of follow-up

■ For continuous measures, decreasing the variation in the outcome

Page 21: Statistical Issues in the Design of a Trial, Part 2

Statistical Terms

Page 22: Statistical Issues in the Design of a Trial, Part 2

Point Estimate

The statistic one calculates to estimate the result of interest

■ Examples:

  ■ Percent of patients with the event

  ■ Mean of the outcome

  ■ Kaplan-Meier rate of survival

  ■ Ratio of percentages

  ■ Differences in percentages

Page 23: Statistical Issues in the Design of a Trial, Part 2

Odds Ratio

Example:

PURSUIT trial: ACS patients; primary endpoint of CEC-adjudicated MI or death at 30 days

Group | Total Patients | Events
Eptifibatide | 4722 | 672
Placebo | 4739 | 745

Odds in the Integrilin (eptifibatide) group: 672 / 4050 = 0.166

Odds in the placebo group: 745 / 3994 = 0.187

Odds ratio: 0.166 / 0.187 ≈ 0.89

Page 24: Statistical Issues in the Design of a Trial, Part 2

Risk Ratio

Example:

PURSUIT trial: ACS patients; primary endpoint of CEC-adjudicated MI or death at 30 days

Group | Total Patients | Events
Eptifibatide | 4722 | 672
Placebo | 4739 | 745

Risk in the Integrilin group: 672 / 4722 = 14.2%

Risk in the placebo group: 745 / 4739 = 15.7%

Risk ratio: 14.2 / 15.7 = 0.905

Page 25: Statistical Issues in the Design of a Trial, Part 2

Risk Difference = Risk in Group A - Risk in Group B

Risk in the Integrilin group: 672 / 4722 = 14.2%

Risk in the placebo group: 745 / 4739 = 15.7%

Risk difference = 15.7% - 14.2% = 1.5 percentage points

The treatment has saved 1.5% of those treated from having the event.

Page 26: Statistical Issues in the Design of a Trial, Part 2

Number Needed to Treat

Number of patients who need to be treated to prevent one bad outcome

■ Formula is 1 / absolute risk reduction

■ Risk reduction in the previous slide was 1.5%, so the number needed to treat to prevent one 30-day death or MI is 1 / 0.015 = 67 patients.

Page 27: Statistical Issues in the Design of a Trial, Part 2

Percent Change = (Risk in Treated - Risk in Control) / Risk in Control

■ Risk in the Integrilin group: 672 / 4722 = 14.2%

■ Risk in the placebo group: 745 / 4739 = 15.7%

■ Risk difference / control risk

= (14.2% - 15.7%) / 15.7%

= -1.5% / 15.7%

= -0.09554

= a 9.6% relative reduction
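The point estimates from the last few slides can all be derived from the same four PURSUIT counts. The following is a minimal sketch; the function name and dictionary keys are made up for illustration.

```python
def effect_measures(events_treated: int, n_treated: int,
                    events_control: int, n_control: int) -> dict:
    """Common point estimates for a two-arm trial with a yes/no endpoint."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    # Odds = patients with the event / patients without the event
    odds_treated = events_treated / (n_treated - events_treated)
    odds_control = events_control / (n_control - events_control)
    risk_difference = risk_control - risk_treated
    return {
        "odds_ratio": odds_treated / odds_control,
        "risk_ratio": risk_treated / risk_control,
        "risk_difference": risk_difference,
        "number_needed_to_treat": 1 / risk_difference,
        "percent_change": (risk_treated - risk_control) / risk_control,
    }


pursuit = effect_measures(672, 4722, 745, 4739)
for name, value in pursuit.items():
    print(f"{name}: {value:.3f}")
```

Because the sketch works from unrounded risks, the last digit can differ slightly from the slide arithmetic, which starts from the rounded 14.2% and 15.7%.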

Page 28: Statistical Issues in the Design of a Trial, Part 2

Mean

The average across a group of patients

Median

The 50th percentile. This is the value such that half the group falls below it and half the group falls above it.

Page 29: Statistical Issues in the Design of a Trial, Part 2

Variance

A measure of how far the data fall from the mean

To calculate, take each patient’s data and subtract it from the mean. Square the difference to get rid of the negative signs. Add up all of these squared deviations and divide the total by N to obtain the average squared deviation.

Standard Deviation

The square root of the variance

Variance is on a squared scale. Taking the square root puts it back on the same scale as the original values.

Page 30: Statistical Issues in the Design of a Trial, Part 2

Example

Age | Deviation from Mean | Squared Deviations
55 | 55 - 52.5 = 2.5 | 6.25
60 | 60 - 52.5 = 7.5 | 56.25
72 | 72 - 52.5 = 19.5 | 380.25
23 | 23 - 52.5 = -29.5 | 870.25
Sum = 210 | | Sum = 1313

Mean = 210 / 4 = 52.5

VARIANCE = 1313 / 4 = 328.25
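The worked example can be reproduced step by step. This sketch divides by N, as the slide does (the "population" form of the variance). Many software packages default to dividing by N - 1 instead, so results may not match exactly.

```python
from math import sqrt

ages = [55, 60, 72, 23]

mean_age = sum(ages) / len(ages)                         # 210 / 4
squared_deviations = [(age - mean_age) ** 2 for age in ages]
variance = sum(squared_deviations) / len(ages)           # divide by N, as on the slide
std_dev = sqrt(variance)                                 # back on the scale of years

print(mean_age, squared_deviations, variance, round(std_dev, 1))
```

The standard deviation of roughly 18 years is far easier to interpret next to a mean age of 52.5 than a variance of 328.25 "squared years."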

Page 31: Statistical Issues in the Design of a Trial, Part 2

Box-and-whisker Chart

[Figure: three box-and-whisker panels, each with a vertical axis from 0 to 1000; time points on the horizontal axis: Baseline, Post PTCA, 2 Hours, 4 Hours, 18–24 Hours]

Page 32: Statistical Issues in the Design of a Trial, Part 2

P-values

Just how magical is 0.05?

Page 33: Statistical Issues in the Design of a Trial, Part 2

P-values

A p-value is a probability. It is the probability of obtaining the existing results or even more extreme results if the effect observed is really due to random chance alone.

Page 34: Statistical Issues in the Design of a Trial, Part 2

P-values

■ For example, in PURSUIT, the primary results were:

30-day death/MI with:

Eptifibatide | 14.2%
Placebo | 15.7%
P-value | 0.042

■ This p-value indicates that a difference of at least this much would occur in about 42 out of 1000 similar experiments if eptifibatide had no effect on death or MI out to 30 days.

Page 35: Statistical Issues in the Design of a Trial, Part 2

P-values

■ For the primary hypothesis, we usually use a critical value of 0.05.

■ According to this rule, if we complete the study and:

  ■ Get a p-value of 0.051, then we cannot declare a statistically significant difference between the two groups

  ■ Get a p-value of 0.049, then we can declare a statistically significant difference between the two groups

Page 36: Statistical Issues in the Design of a Trial, Part 2

P-values Summary

■ Many statisticians involved in research consider p-values that are close to 0.05 (on either side) to be “borderline” in significance or they say that there is a “trend” towards a significant difference.

■ The further a p-value is from 0.05, the more one believes that it is a true effect (when smaller than 0.05) or that there is no true difference in the groups (when larger than 0.05).

Page 37: Statistical Issues in the Design of a Trial, Part 2

Confidence Intervals

Definition of a 95% confidence interval (C.I.):

If you were to do the study an infinite number of times and calculate an interval each time, then 95% of those intervals would contain the true effect.

Page 38: Statistical Issues in the Design of a Trial, Part 2

Ratio Plot (“Blobogram”)

[Figure: ratio plot with a horizontal axis from 0.5 to 1.5 centered at 1.0; the two sides are labeled “Tx B Better” and “Tx A Better.” Annotations: point estimate of the effect (size and # patients); 95% C.I.]

Page 39: Statistical Issues in the Design of a Trial, Part 2

Ratio Plot (“Blobogram”)

[Figure: ratio plot with a horizontal axis from 0.5 to 1.5 centered at 1.0; the two sides are labeled “Tx B Better” and “Tx A Better.” Example results annotated: Tx A better than Tx B; uncertain (p > 0.05)]

Page 40: Statistical Issues in the Design of a Trial, Part 2

Ratio Plot (“Blobogram”)

[Figure: ratio plot with a horizontal axis from 0.5 to 1.5 centered at 1.0; the two sides are labeled “Tx B Better” and “Tx A Better.” Example results annotated:]

■ Tx B (new Tx) better than Tx A

■ Tx B worse than Tx A

■ Tx B probably better than Tx A, but may be equivalent

■ Tx B may be worse than Tx A, but may be statistically and clinically equivalent

■ Tx B may be worse than Tx A or may be equivalent (or better!)

Page 41: Statistical Issues in the Design of a Trial, Part 2

Confidence Intervals

Death or MI at 30 Days

[Figure: plot of point estimates with 95% confidence intervals on a ratio axis from 0.5 to 2. Row labels: EE, LA, WE, NA, Overall. Sample sizes shown: n = 4358, n = 4243, n = 585, n = 1762. Estimates shown: 0.89 (0.79, 0.99); 0.75 (0.63, 0.91); 0.92 (0.77, 1.11); 1.03 (0.60, 1.76); 1.09 (0.85, 1.39)]

Page 42: Statistical Issues in the Design of a Trial, Part 2

The Effect of Fewer 30-day Death/MIs on P-values and Confidence Intervals in the Eptifibatide Arm

[Figure: point estimates with confidence intervals on a ratio axis from 0.6 to 1 for three scenarios]

Death/MI rate (%), eptifibatide vs. placebo | Eptifibatide events | p-value
14.2 vs. 15.7 | 672/4722 | p = 0.042
13.8 vs. 15.7 | 652/4722 | p = 0.007
12.7 vs. 15.7 | 600/4722 | p < 0.001

Page 43: Statistical Issues in the Design of a Trial, Part 2

Relationship Between P-values and C.I.

1. Decide how you want to look at the treatment effect (the point estimate).

■ Difference: 15.7 - 14.2 = change of 1.5 in death/MI rates

■ Percent Change: 1.5 / 15.7 = 9.6% decrease in rates

■ Odds Ratio: 0.89, the odds of death/MI for Integrilin vs. placebo

Page 44: Statistical Issues in the Design of a Trial, Part 2

Relationship Between P-values and C.I.

■ The p-value and 95% confidence interval are calculated using exactly the same measures from the data.

■ Usually, they each use the same measure of effect size and the same measure of how much variation existed in the outcome in the study.

■ Each shows the chances that the results are attributable to random chance alone.
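The shared ingredients can be seen in a small sketch that computes both quantities for the PURSUIT odds ratio. It assumes a Wald (normal-approximation) interval on the log odds ratio, which is one common textbook approach and not necessarily the method used in the trial analysis. The function name is hypothetical, and the result may differ slightly from the published interval.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_with_ci(events_t: int, n_t: int, events_c: int, n_c: int,
                       level: float = 0.95):
    """Odds ratio, Wald confidence interval, and two-sided p-value.
    Both the interval and the p-value come from the same two numbers:
    the log odds ratio (effect size) and its standard error (variation)."""
    a, b = events_t, n_t - events_t      # treated: events, non-events
    c, d = events_c, n_c - events_c      # control: events, non-events
    log_or = log((a / b) / (c / d))
    std_error = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = exp(log_or - z_crit * std_error)
    upper = exp(log_or + z_crit * std_error)
    p_value = 2 * (1 - NormalDist().cdf(abs(log_or) / std_error))
    return exp(log_or), lower, upper, p_value


odds_ratio, lower, upper, p_value = odds_ratio_with_ci(672, 4722, 745, 4739)
print(f"OR = {odds_ratio:.2f}, 95% C.I. ({lower:.2f}, {upper:.2f}), p = {p_value:.3f}")
```

With this construction, the 95% interval excludes 1 exactly when the two-sided p-value is below 0.05, which is why the two summaries tell a consistent story.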

Page 45: Statistical Issues in the Design of a Trial, Part 2

Superiority Trials

These trials test for statistically significant and clinically meaningful improvements (or harm!) from the use of the experimental treatment over the results obtained through the use of standard care.

Page 46: Statistical Issues in the Design of a Trial, Part 2

Superiority Trial Results

[Figure: ratio plot with a horizontal axis from 0 to 2 centered at 1; the two sides are labeled “Experimental Tx Better” and “Control Tx Better.” A vertical line marks the MID. Point estimates with confidence intervals are shown for Study A, Study B, Study C, and Study D.]

MID: Minimally important difference

Page 47: Statistical Issues in the Design of a Trial, Part 2

Equivalence

■ Equivalence studies are designed to evaluate whether the difference in outcomes for the new treatment compared to those obtained with standard care falls within the boundaries of the minimally important difference (MID).

■ MID is the largest difference one will accept between the outcomes of 2 groups and still consider them clinically similar.

Page 48: Statistical Issues in the Design of a Trial, Part 2

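Equivalence Trial Results

[Figure: ratio plot with a horizontal axis from 0 to 2 centered at 1; the two sides are labeled “Experimental Tx Better” and “Control Tx Better.” Two vertical lines mark the MID on either side of 1. Point estimates with confidence intervals are shown for Study A, Study B, Study C, and Study D.]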

Page 49: Statistical Issues in the Design of a Trial, Part 2

Equivalency in Cardiovascular Drug Development

■ As mortality decreases, larger sample sizes are needed to demonstrate a relative risk reduction

■ New therapies that may be similar to or slightly better than existing therapies may be important to:

  ■ Improve ease of use

  ■ Reduce cost

  ■ Make small advances

■ Underpowered studies failing to show a difference with a new therapy may miss a “true” worse outcome

Page 50: Statistical Issues in the Design of a Trial, Part 2

Equivalency in Cardiovascular Drug Development

■ Failing to demonstrate a difference is not the same as proving that no difference exists.

■ Absolute equivalence is impossible to prove; there is always some degree of uncertainty.

■ The goal is to refute the hypothesis that the treatments lead to different outcomes by at least the margin of the MID.

Page 51: Statistical Issues in the Design of a Trial, Part 2

Noninferiority Trials

■ The results will be evaluated assuming that the experimental treatment will not be worse than the standard of care by a clinically meaningful amount.

■ Unlike equivalency studies, these studies do not look for small improvements from the experimental drug (one-sided as opposed to two-sided evaluations).

Page 52: Statistical Issues in the Design of a Trial, Part 2

Noninferiority Trials

■ Compliance critical

  ■ Noncompliance improves chance of declaring noninferiority

  ■ Intention-to-treat no longer conservative

■ Precision in outcome measure critical

Page 53: Statistical Issues in the Design of a Trial, Part 2

Noninferiority Trial Results

[Figure: ratio plot with a horizontal axis from 0 to 2 centered at 1; the two sides are labeled “Experimental Tx Better” and “Control Tx Better.” MID boundaries are marked. Point estimates with confidence intervals are shown for Study A, Study B, and Study C.]

Page 54: Statistical Issues in the Design of a Trial, Part 2

Do Tirofiban And ReoPro Give Similar Efficacy Outcomes Trial (TARGET)

N Engl J Med 2001;344:1888-94

Page 55: Statistical Issues in the Design of a Trial, Part 2

Statistical Considerations

Sample size provides 88% power to declare tirofiban noninferior to abciximab, based on the relative efficacy of abciximab to placebo in EPISTENT*

* the upper bound of the 1-sided 95% C.I. for the odds ratio (tirofiban relative to abciximab) must be below 1.47

N Engl J Med 2001;344:1888-94

Page 56: Statistical Issues in the Design of a Trial, Part 2

Primary Endpoint

■ 30-day composite of:

■ Death

■ Myocardial infarction
  - CK-MB > 3 x ULN in two samples
  - New Q waves

■ Urgent TVR
  - PCI or CABG

N Engl J Med 2001;344:1888-94

Page 57: Statistical Issues in the Design of a Trial, Part 2

Primary Endpoint: 30-day Death, MI, Urgent TVR

[Figure: bar chart of 30-day death, MI, or urgent TVR (%), on an axis from 0% to 10%: tirofiban 7.6%, abciximab 6.0%, p = 0.038. Alongside, the relative risk plot: R.R. = 1.26, upper bound of 95% C.I. = 1.51, noninferiority boundary = 1.47, with 1.00 separating "Tirofiban Better" from "Abciximab Better".]

N Engl J Med 2001;344:1888-94
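
The reported numbers can be checked against the noninferiority criterion with a few lines of arithmetic. This is a back-of-the-envelope sketch, not the trial's own analysis: it assumes the reported bound is a Wald limit on the log scale for a 1-sided 95% interval, and it backs out the implied standard error from the point estimate (1.26) and the bound (1.51).

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Values reported on the slides.
POINT_ESTIMATE = 1.26  # tirofiban relative to abciximab
UPPER_BOUND = 1.51     # upper bound of the 1-sided 95% C.I.
MARGIN = 1.47          # noninferiority boundary

# Assumption: the bound equals log(estimate) + z_0.95 * SE on the log scale.
z_one_sided = STD_NORMAL.inv_cdf(0.95)
se_log = (log(UPPER_BOUND) - log(POINT_ESTIMATE)) / z_one_sided

# Noninferiority requires the upper bound to fall below the boundary.
noninferior = UPPER_BOUND < MARGIN

# Two-sided p-value for "no difference" implied by the same standard error.
z_stat = log(POINT_ESTIMATE) / se_log
p_two_sided = 2 * (1 - STD_NORMAL.cdf(z_stat))

print(f"noninferiority shown: {noninferior}")
print(f"implied SE of log ratio: {se_log:.3f}")
print(f"implied two-sided p-value: {p_two_sided:.3f}")
```

Because 1.51 exceeds 1.47, noninferiority is not shown. The implied two-sided p-value comes out close to the reported p = 0.038, which suggests the numbers on the slide are mutually consistent under these assumptions.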

Page 58: Statistical Issues in the Design of a Trial, Part 2

Study Design: GUSTO V

Randomization

N = 16,588 patients: ST elevation, symptoms < 6 hours

■ Standard-Dose Reteplase (10 + 10 U Double Bolus)
  Heparin: 5000 U, 1000 U/hr (800 U/hr for < 70 kg)

■ Abciximab + Low-Dose Reteplase (5 + 5 U Double Bolus)
  Heparin: 60 U/kg (max 5000 U), 7 U/kg-hr

Page 59: Statistical Issues in the Design of a Trial, Part 2

Endpoints: GUSTO V

■ Primary
  ■ Mortality (all-cause) by 30 days

■ Secondary
  ■ Mortality (30-day) or nonfatal disabling stroke (in-hospital or 7-day)
  ■ Hemorrhagic stroke (in-hospital or 7-day)
  ■ Mortality by 1 year
  ■ Reinfarction
  ■ Coronary revascularization
  ■ Other prespecified complications of MI

Page 60: Statistical Issues in the Design of a Trial, Part 2

Statistical Methods: GUSTO V

■ Superiority Testing

  ■ One-sided Type I error < 2.5% for control mortality rates ranging from 5–9%

  ■ Approximately 80% power to detect a 15% reduction if control mortality rate = 7.4%

■ Noninferiority Testing

  ■ Less than 10% relative increase in mortality: the upper bound of the 95% C.I. for the relative risk must be below 1.10

  ■ One-sided Type I error ranges from 2.051–2.627% for control mortality rates ranging from 5–9%
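
The superiority power statement can be roughly reproduced with a normal approximation for two proportions. This is a sketch under stated assumptions, not the trial's actual calculation: it takes the 16,588 patients from the study design slide, assumes equal allocation between the two arms, and uses a simple unpooled standard error.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_proportions(p_control, relative_reduction, n_per_arm, alpha_one_sided):
    """Approximate power of a one-sided z-test comparing two proportions."""
    p_treated = p_control * (1 - relative_reduction)
    se = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_treated * (1 - p_treated) / n_per_arm
    )
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha_one_sided)
    return STD_NORMAL.cdf((p_control - p_treated) / se - z_alpha)


# From the slides: 7.4% control mortality, 15% relative reduction,
# one-sided Type I error of 2.5%. Equal allocation is an assumption.
power = power_two_proportions(0.074, 0.15, 16588 // 2, 0.025)
print(f"approximate power: {power:.2f}")
```

Under these assumptions the result lands near the "approximately 80%" quoted on the slide.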

Page 61: Statistical Issues in the Design of a Trial, Part 2

Primary Endpoint: GUSTO V

[Figure: bar chart of 30-day mortality (%), on an axis from 0 to 8: reteplase (n = 8260) 5.91, abciximab + reteplase (n = 8328) 5.62; odds ratio = 0.948 (0.832–1.081), p = 0.43. Alongside, relative risk and 95% C.I. on an axis from 0.8 to 1.1, with the noninferiority boundary marked and the upper bound of the 95% C.I. = 1.076; "Abciximab + Reteplase Better" on one side and "Reteplase Better" on the other.]
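
The reported odds ratio and relative risk bound can be approximately rebuilt from the mortality percentages and arm sizes. This is an illustrative sketch, not the trial's own analysis: event counts are reconstructed from rounded percentages (so they are approximate), and Wald intervals on the log scale are assumed.

```python
from math import exp, log, sqrt
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # multiplier for a two-sided 95% interval


def wald_ratio_cis(p_new, n_new, p_ctl, n_ctl):
    """Odds ratio and relative risk (new / control) with Wald 95% limits.

    Event counts are rebuilt from rounded percentages, so they are
    approximate and need not be whole numbers.
    """
    a, c = p_new * n_new, p_ctl * n_ctl  # events in each arm
    b, d = n_new - a, n_ctl - c          # non-events in each arm
    odds_ratio = (a / b) / (c / d)
    se_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    rel_risk = p_new / p_ctl
    se_rr = sqrt((1 - p_new) / a + (1 - p_ctl) / c)

    def limits(estimate, se):
        return exp(log(estimate) - Z * se), exp(log(estimate) + Z * se)

    return odds_ratio, limits(odds_ratio, se_or), rel_risk, limits(rel_risk, se_rr)


# Abciximab + reteplase: 5.62% of 8328; reteplase alone: 5.91% of 8260.
odds_ratio, or_ci, rel_risk, rr_ci = wald_ratio_cis(0.0562, 8328, 0.0591, 8260)

noninferior = rr_ci[1] < 1.10  # upper bound below the 1.10 boundary
superior = or_ci[1] < 1.0      # whole interval below 1

print(f"odds ratio {odds_ratio:.3f} ({or_ci[0]:.3f} to {or_ci[1]:.3f})")
print(f"relative risk {rel_risk:.3f}, upper bound {rr_ci[1]:.3f}")
print(f"noninferior: {noninferior}, superior: {superior}")
```

The reconstruction lands close to the reported values and gives the same reading: the upper bound sits below the 1.10 boundary, so noninferiority is shown, while the interval still includes 1, so superiority is not.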