Analysis of Stratified Trials – Challenging the “Standard” Methods Devan V. Mehrotra Clinical Biostatistics Department Seminar Merck Research Laboratories Jan 10, 2008

Post on 19-Dec-2015




Outline

• Part I: binary response variable
> Mantel-Haenszel test
> Minimum risk weights
> Simulation results
> Conclusions

• Part II: continuous non-normal response variable
> Motivating example
> Technical details
> Simulation results
> Conclusions


Part I: Analysis of Binary Data


Stratified Trials with Binary Endpoints

• 2 treatments (A and B), number of strata = s
Binary response (responder/non-responder)

• $p_{ij}$ = true (population) proportion for stratum i, treatment j
$\delta_i = p_{iA} - p_{iB}$ = true difference for stratum i
$f_i$ = true (population) relative frequency for stratum i
$\delta = \sum_i f_i \delta_i$ = true overall difference

• $\hat p_{ij}$ = observed proportion for stratum i, treatment j
$n_{ij}$ = observed number of subjects in stratum i, treatment j
$\hat\delta_i = \hat p_{iA} - \hat p_{iB}$
$w_i$ = weight assigned to stratum i


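Hypothesis Testing: General Framework

Superiority or Non-Inferiority Trials

$H_0: \delta = 0$ vs. $H_1: \delta \ne 0$ (superiority trial)
$H_0: \delta \le -\delta_0$ vs. $H_1: \delta > -\delta_0$ (non-inferiority trial)

$Z_w = \dfrac{\sum_i w_i \hat\delta_i - cc}{\sqrt{\sum_i a_i w_i^2 V(\hat\delta_i)}}$ (for superiority trial)

$Z_w = \dfrac{\sum_i w_i \hat\delta_i + \delta_0 - cc}{\sqrt{\sum_i a_i w_i^2 V(\hat\delta_i)}}$ (for non-inferiority trial)

$a_i$ = finite sample term, cc = continuity correction

IMPORTANT: What to use for $w_i$, $V(\hat\delta_i)$, $a_i$, and cc?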


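Mantel-Haenszel Test (1959): Superiority Trials

$Z^2_{MH} = \dfrac{\left( \left| \sum_{i=1}^{s} w_i \hat\delta_i \right| - cc \right)^2}{\sum_{i=1}^{s} a_i w_i^2 V_i}$

$w_i = w_i^{CMH} = \dfrac{n_{iA} n_{iB}/(n_{iA}+n_{iB})}{\sum_i n_{iA} n_{iB}/(n_{iA}+n_{iB})}$

$a_i = \dfrac{n_{iA}+n_{iB}}{n_{iA}+n_{iB}-1}$

$cc = 0.5 \left[ \sum_i \dfrac{n_{iA} n_{iB}}{n_{iA}+n_{iB}} \right]^{-1}$

$V_i = \bar p_i (1 - \bar p_i) \left( \dfrac{1}{n_{iA}} + \dfrac{1}{n_{iB}} \right)$, where $\bar p_i = \dfrac{n_{iA} \hat p_{iA} + n_{iB} \hat p_{iB}}{n_{iA}+n_{iB}}$

Note: MH test is optimal if and only if the odds ratio $\dfrac{p_{iA}/(1-p_{iA})}{p_{iB}/(1-p_{iB})}$ is constant across strata.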
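As a quick check on the weighted form of the MH statistic, the sketch below computes $Z^2_{MH}$ from CMH weights, null variances, the finite sample term and the 0.5 continuity correction, and compares it with the textbook Mantel-Haenszel chi-square built from the 2x2 tables. The counts are hypothetical and the function names (`mh_weighted`, `mh_classical`) are illustrative, not from the talk.

```python
import math

def mh_weighted(strata):
    """strata: list of (responders A, n A, responders B, n B).
    Returns (weighted difference, Z^2, two-sided p-value)."""
    h = [nA * nB / (nA + nB) for _, nA, _, nB in strata]
    total_h = sum(h)
    weights = [x / total_h for x in h]          # CMH weights
    cc = 0.5 / total_h                          # original MH continuity correction
    num = var = 0.0
    for w, (xA, nA, xB, nB) in zip(weights, strata):
        pbar = (xA + xB) / (nA + nB)            # pooled proportion in the stratum
        v_null = pbar * (1 - pbar) * (1 / nA + 1 / nB)
        a = (nA + nB) / (nA + nB - 1)           # finite sample term
        num += w * (xA / nA - xB / nB)
        var += a * w ** 2 * v_null
    z2 = (abs(num) - cc) ** 2 / var
    return num, z2, math.erfc(math.sqrt(z2 / 2))  # chi-square(1) tail area

def mh_classical(strata):
    """Textbook MH chi-square (hypergeometric mean and variance, 0.5 correction)."""
    diff = var = 0.0
    for xA, nA, xB, nB in strata:
        n, m = nA + nB, xA + xB
        diff += xA - nA * m / n
        var += nA * nB * m * (n - m) / (n ** 2 * (n - 1))
    return (abs(diff) - 0.5) ** 2 / var

strata = [(30, 50, 20, 50), (12, 40, 6, 20)]    # hypothetical counts
delta_w, z2, p = mh_weighted(strata)
print(round(delta_w, 4), round(z2, 4), round(mh_classical(strata), 4), round(p, 4))
```

The two statistics agree because the CMH weights, $a_i$ and cc are exactly the classical quantities divided through by $\sum_i n_{iA} n_{iB}/(n_{iA}+n_{iB})$.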


Choice of Variance

• Null variance [Miettinen & Nurminen, 1985; Farrington & Manning, 1990]

$\tilde V_i = \dfrac{\tilde p_{iA}(1-\tilde p_{iA})}{n_{iA}} + \dfrac{\tilde p_{iB}(1-\tilde p_{iB})}{n_{iB}}$

$\tilde p_{ij}$ = m.l.e. of $p_{ij}$ under the restriction $p_{iA} - p_{iB} = -\delta_0$

Note: MH test uses the null variance.

• Observed (OBS) variance

$\hat V_i = \dfrac{\hat p_{iA}(1-\hat p_{iA})}{n_{iA}} + \dfrac{\hat p_{iB}(1-\hat p_{iB})}{n_{iB}}$

• Note: With 1:1 randomization, $\hat V_i \le \tilde V_i$ always holds for superiority trials, and usually so (but not always) for non-inferiority trials.
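The ordering of the two variances can be checked numerically for the superiority case. The sketch below assumes $\delta_0 = 0$, where the restricted m.l.e. of both proportions is simply the pooled proportion; the function names are illustrative, and the non-inferiority case (which needs the restricted m.l.e. for $\delta_0 > 0$) is not covered.

```python
def observed_variance(pA, nA, pB, nB):
    """Observed (OBS) variance of the difference in proportions."""
    return pA * (1 - pA) / nA + pB * (1 - pB) / nB

def null_variance_superiority(pA, nA, pB, nB):
    """Null variance when delta_0 = 0: both arms share the pooled proportion."""
    pooled = (nA * pA + nB * pB) / (nA + nB)
    return pooled * (1 - pooled) * (1 / nA + 1 / nB)

# Equal allocation: the observed variance never exceeds the null variance.
n = 50
worst = max(
    observed_variance(a / 10, n, b / 10, n) - null_variance_superiority(a / 10, n, b / 10, n)
    for a in range(1, 10)
    for b in range(1, 10)
)
print(worst <= 1e-12)

# Unequal allocation (hypothetical numbers): the ordering can reverse.
print(observed_variance(0.5, 10, 0.9, 100) > null_variance_superiority(0.5, 10, 0.9, 100))
```

With equal group sizes the inequality follows from the concavity of $p(1-p)$; the second check shows why the slide restricts the claim to 1:1 randomization.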


(pA, pB) pairs where Null or Observed Variance is “Better”

Non-Inferiority Margin = 15%

EQUAL ALLOCATION, VR = VARIANCE RATIO (NULL:OBS)
VR < 0.98 (N), 0.98 < VR < 1.02 (=), VR > 1.02 (O)

Rows: P_A (Test). Columns: P_B (Control). Only pairs with P_A minus P_B >= -0.15 are shown.

P_A    0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95  1.00
1.00   O     O     O     O     O     O     O     O     O     O     O
0.95   O     O     O     O     O     O     O     O     O     O     O
0.90   O     O     O     O     O     O     O     O     O     O     O
0.85   O     O     O     O     O     O     O     O     O     O     =
0.80   O     O     O     O     O     O     O     O     O     =
0.75   O     O     O     =     =     =     =     =     =
0.70   O     =     =     =     =     =     =     =
0.65   =     =     =     =     =     =     =
0.60   =     =     =     =     =     =
0.55   =     N     =     =     =
0.50   N     =     =     =
0.45   =     =     =
0.40   =     =
0.35   =


(pA, pB) pairs where Null or Observed Variance is “Better”

Non-Inferiority Margin = 5%

EQUAL ALLOCATION, VR = VARIANCE RATIO (NULL:OBS)
VR < 0.98 (N), 0.98 < VR < 1.02 (=), VR > 1.02 (O)

Rows: P_A (Test). Columns: P_B (Control). Only pairs with P_A minus P_B >= -0.05 are shown.

P_A    0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95  1.00
1.00   O     O     O     O     O     O     O     O     O     O     O
0.95   O     O     O     O     O     O     O     O     O     O     =
0.90   O     O     O     O     O     O     O     O     O     =
0.85   O     O     O     O     O     O     O     =     =
0.80   O     O     O     O     O     =     =     =
0.75   O     O     O     =     =     =     =
0.70   O     O     =     =     =     =
0.65   O     =     =     =     =
0.60   =     =     =     =
0.55   =     =     =
0.50   =     =
0.45   =


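Choice of Weights

• Cochran-Mantel-Haenszel (CMH) weights

$w_i^{CMH} = \dfrac{n_{iA} n_{iB}/(n_{iA}+n_{iB})}{\sum_i n_{iA} n_{iB}/(n_{iA}+n_{iB})}$

Note: $w_i^{CMH} = \hat f_i$ if $n_{iA}/n_{iB}$ is constant.

>> Estimator of $\delta$ is ~ unbiased.

• Minimum Risk (MR) weights [Mehrotra & Railkar, 2000]

>> Estimator of $\delta$ has smallest mean squared error.

>> If $\delta_i$ is constant, $w_i^{MR} = \hat V_i^{-1} / \sum_i \hat V_i^{-1}$ (optimal weights!)

Formula for two strata trial:

$w_1^{MR} = \dfrac{\hat V_1^{-1} + \hat f_1 \hat V_1^{-1} \hat V_2^{-1} (\hat\delta_1 - \hat\delta_2)^2}{\hat V_1^{-1} + \hat V_2^{-1} + \hat V_1^{-1} \hat V_2^{-1} (\hat\delta_1 - \hat\delta_2)^2}$, $\quad w_2^{MR} = 1 - w_1^{MR}$

For general formula (≥ two strata), see Mehrotra & Railkar (2000).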
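A minimal sketch of both weighting schemes for a two-strata trial follows. It uses the algebraically equivalent form $w_1 = (V_2 + f_1 (\delta_1-\delta_2)^2) / (V_1 + V_2 + (\delta_1-\delta_2)^2)$, which is what one obtains by minimizing the mean squared error of $w_1\hat\delta_1 + (1-w_1)\hat\delta_2$ as an estimator of $f_1\delta_1 + f_2\delta_2$. Function names and the numeric inputs are illustrative assumptions.

```python
def cmh_weights(sizes):
    """sizes: list of (n_A, n_B) per stratum. Returns the CMH weights."""
    h = [nA * nB / (nA + nB) for nA, nB in sizes]
    total = sum(h)
    return [x / total for x in h]

def mr_weights_two_strata(d1, d2, v1, v2, f1):
    """Minimum risk weights for two strata.
    d1, d2: estimated stratum differences; v1, v2: their variances;
    f1: relative frequency of stratum 1."""
    gap2 = (d1 - d2) ** 2
    w1 = (v2 + f1 * gap2) / (v1 + v2 + gap2)
    return w1, 1 - w1

# Equal differences: MR weights reduce to inverse-variance weights.
print(mr_weights_two_strata(0.10, 0.10, 0.004, 0.001, 0.5))

# Very different differences: MR weights move toward the stratum frequencies.
print(mr_weights_two_strata(0.10, 0.90, 0.004, 0.001, 0.5))

# CMH weights depend only on the sample sizes.
print(cmh_weights([(50, 50), (25, 25)]))
```

The two limiting cases show the trade-off the MR weights make: precision when the stratum effects agree, and protection against bias when they do not.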


Choice of Finite Sample Term

• With CMH weights (i.e., with MH test):

$a_i = (n_{iA}+n_{iB}) / (n_{iA}+n_{iB}-1)$ is used.

• With MR weights:

$a_i = 1$ is recommended.


Choice of Continuity Correction

• With CMH weights:

$cc = 0.5 \left[ \sum_i \dfrac{n_{iA} n_{iB}}{n_{iA}+n_{iB}} \right]^{-1}$ is used by original MH test.

However, $cc = 0$ is a less conservative choice.

• With MR weights:

$cc = \dfrac{3}{16} \left[ \sum_i \dfrac{n_{iA} n_{iB}}{n_{iA}+n_{iB}} \right]^{-1}$ is recommended.

See Mehrotra & Railkar, Stats in Med, 2000


Motivating Example Revisited: Test for Superiority

Stratum  Vaccine A       Vaccine B       Diff.   OR        Null $\sqrt{\tilde V_i}$   Obs $\sqrt{\hat V_i}$
1        .771 (37/48)    .647 (44/68)    .124    1.835     .086                       .084
2        .156 (5/32)     .000 (0/12)     .156    infinity  .091                       .064

Method               $w_1$   $w_2$   $\hat\delta_w$   $\sqrt{\hat V(\hat\delta_w)}$   2-tailed p-value
MH (original cc)     .763    .237    .131             .069                            .0882
MH (cc=0)            .763    .237    .131             .069                            .0573
MR (null variance)   .400    .600    .143             .064                            .0318*
MR (obs variance)    .400    .600    .143             .051                            .0068*

* establishes superiority at 2-tailed $\alpha$ = .05

MH = Mantel-Haenszel test; MR = test using minimum risk weights
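The "MR (obs variance)" row can be traced step by step from the counts in the first table. The sketch below assumes the choices recommended earlier in the talk for MR weights ($a_i = 1$, $cc = \frac{3}{16}[\sum_i n_{iA}n_{iB}/(n_{iA}+n_{iB})]^{-1}$), the rearranged two-strata weight formula, and a normal reference distribution; variable names are illustrative.

```python
import math

# (responders A, n A, responders B, n B) for strata 1 and 2, from the table above
strata = [(37, 48, 44, 68), (5, 32, 0, 12)]

def summarize(xA, nA, xB, nB):
    """Observed difference in proportions and its observed variance."""
    pA, pB = xA / nA, xB / nB
    return pA - pB, pA * (1 - pA) / nA + pB * (1 - pB) / nB

(d1, v1), (d2, v2) = (summarize(*s) for s in strata)

# CMH weights depend on sample sizes only
h = [nA * nB / (nA + nB) for _, nA, _, nB in strata]
cmh = [x / sum(h) for x in h]

# Minimum risk weights for two strata, with f1 estimated by the stratum's share of subjects
n_total = sum(nA + nB for _, nA, _, nB in strata)
f1 = (strata[0][1] + strata[0][3]) / n_total
gap2 = (d1 - d2) ** 2
w1 = (v2 + f1 * gap2) / (v1 + v2 + gap2)
w2 = 1 - w1

delta_w = w1 * d1 + w2 * d2
se = math.sqrt(w1 ** 2 * v1 + w2 ** 2 * v2)    # a_i = 1 with MR weights
cc = (3 / 16) / sum(h)                          # continuity correction for MR weights
z = (abs(delta_w) - cc) / se
p_two_sided = math.erfc(z / math.sqrt(2))

print([round(x, 3) for x in cmh])               # CMH weights
print(round(w1, 3), round(w2, 3))               # MR weights
print(round(delta_w, 3), round(se, 3), round(p_two_sided, 4))
```

The CMH weights come out near .763 and .237, and the MR weights, weighted difference, standard error and p-value round to the values in the table's last row.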


Simulation Results: Test for Superiority (2 strata)

Type I Error Rate (%) (alpha = 5%)

Strat 1       Strat 2                      Method
p1A    p1B    p2A    p2B    δ       N*     MHorig   MHcc=0   MRnull   MRobs
.83    .83    .37    .37    0       50     2.6      5.0      4.5      5.1
.83    .83    .37    .37    0       150    3.6      5.0      4.7      4.9
.83    .83    .37    .37    0       250    3.9      5.1      4.8      4.9
.83    .83    .37    .37    0       500    4.1      4.9      4.7      4.8

Power (%)

p1A    p1B    p2A    p2B    δ       N*     MHorig   MHcc=0   MRnull   MRobs
.884   .83    .470   .37    .068a   500    75       77       77       77
.898   .83    .438   .37    .068b   500    76       78       81       81
.906   .83    .419   .37    .068c   500    77       79       83       83
.914   .83    .401   .37    .068d   500    78       80       84       84

* per treatment group

(f1 = .7, f2 = .3); No TxS interaction on (a) logit, (b) proportion, (c) square root, and (d) log scales; 100,000 simulations.


Illustrative Example #2: Test for Non-Inferiority

$H_0: \delta \le -.05$ vs. $H_1: \delta > -.05$ ($\delta_0 = .05$)

Stratum i   Test (A) $\hat p_{iA}$   Control (B) $\hat p_{iB}$   A – B $\hat\delta_i$   Null $\sqrt{\tilde V_i}$   Observed $\sqrt{\hat V_i}$
1           .891 (98/110)            .891 (98/110)               .000                   .0427                      .0420
2           .978 (88/90)             .978 (88/90)                .000                   .0283                      .0220

METHOD (Weights_Variance)   $w_1$   $w_2$   $\hat\delta_w = \sum_i w_i \hat\delta_i$   $\sqrt{V(\hat\delta_w)}$   1-tailed p-value
CMH_NULL                    .55     .45     .000                                        .0267                      .0307
CMH_OBS                     .55     .45     .000                                        .0251                      .0234*
MR_OBS                      .21     .79     .000                                        .0195                      .0059*

* establishes non-inferiority at 1-tailed $\alpha$ = .025


Simulation Results: Test for Non-inferiority (2 strata)

Type I Error Rate (nominal $\alpha$ = .025)

$\delta_0$   N      A: CMH_NULL   B: CMH_OBS   C: MR_OBS
.20          74     .026          .026         .023
.15          130    .025          .025         .025
.10          285    .024          .025         .024
.05          1130   .025          .025         .025

N = sample size per treatment group; $n_{1T} = n_{1C} \sim B(N, f_1 = .50)$, $n_{2T} = n_{2C} = N - n_{1T}$; $p_{1C} = .70$, $p_{2C} = .90$, $p_{iT} = p_{iC} - \delta_0$ ($H_0$ true).

Results based on 100,000 simulations.


Simulation Results: Power, Test for Non-inferiority (2 strata)

Power

$\delta_0$   N      A: CMH_NULL   B: CMH_OBS   C: MR_OBS   $$ saved, C vs. A*
.20          74     .871          .886         .900        $80K
.15          130    .870          .881         .900        $150K
.10          285    .865          .871         .900        $330K
.05          1130   .863          .865         .900        $1.42M

* Based on the difference in N required to achieve 90% power with popular method A, and assuming $5,000 per subject. Results based on 100,000 simulations.

N = sample size per treatment group; $n_{1T} = n_{1C} \sim B(N, f_1 = .50)$, $n_{2T} = n_{2C} = N - n_{1T}$; $p_{1C} = .70$, $p_{2C} = .90$, $p_{iT} - p_{iC} = 0$ ($H_1$ true).


Summary (Part I)

For stratified trials with binary responses:

The popular Mantel-Haenszel test uses sample size (CMH) weights with null variances. It has good power properties if and only if the odds ratio is constant across strata.

Using minimum risk (MR) weights with observed (OBS) variances will usually provide notably more power than CMH weights with null variances for both superiority and non-inferiority trials.

Recommendation: consider MR_OBS as a default, but use simulations to quantify power differences between methods when planning a new trial.


Part II: Analysis of Continuous Data Using Ranks


Motivating Example: Hypothetical viral loads of HIV+ subjects (log10 copies/ml)

Stratum: Females
Placebo: 3.90, 3.96
Vaccine: 1.40, 2.80, 2.90

Stratum: Males
Placebo: 3.50, 3.50, 3.56, 3.59, 3.69, 3.85, 4.06, 4.36, 4.36, 4.43, 4.68, 4.69, 4.70, 4.85, 5.06, 5.50
Vaccine: 1.79, 2.32, 2.54, 3.42, 3.59, 3.89, 4.64, 5.23, 5.32


Motivating Example (continued)

• Observed viral load summaries (log10 copies/ml):

Stratum   Summary   Placebo   Vaccine
Females   Mean      3.93      2.37
          Median    3.93      2.80
          SD        0.04      0.84
          n         2         3
Males     Mean      4.27      3.64
          Median    4.36      3.59
          SD        0.62      1.27
          n         16        9

• Compared to placebo, the VLs for vaccine appear to be “shifted” to the left (i.e., are numerically smaller). Is the shift statistically significant?


Motivating Example (continued)

Stratified rank-based analysis: SAS implementation

• PROC FREQ; TABLES gender * trt * vload/CMH SCORES=RANK;

RUN;

• PROC FREQ; TABLES gender * trt * vload/CMH SCORES=MODRIDIT;

RUN;

• PROC TWOSAMPL; [Part of PROC StatXact module]

WI/AS; PO trt; RE vload; ST gender;

RUN;


Motivating Example (continued)

• 2-tailed p-values using the three “methods”:

PROC FREQ (SCORES=RANK)       p = .1506
PROC FREQ (SCORES=MODRIDIT)   p = .0642
PROC TWOSAMPL                 p = .0436*

Different conclusions at $\alpha$ = .05 … why?

• PROC FREQ
> Ranks based on pooled sample within each stratum (“stratum-specific” ranks)
> SCORES = RANK: equal stratum weights
> SCORES = MODRIDIT: unequal stratum weights

• PROC TWOSAMPL: Ranks based on overall pooled sample, ignoring strata (“stratum-invariant” ranks), with equal stratum weights.


Technical Details

Stratified Rank-Based Tests

$Y_{ijk}$ = response for stratum i, treatment j, subject k ($i = 1, \ldots, s$; $j = 1, 2$; $k = 1, \ldots, n_{ij}$)

Assumptions
$Y_{i1k}$ ~ i.i.d. $F(y - R_i)$ [placebo]
$Y_{i2k}$ ~ i.i.d. $F(y - R_i - \Delta_i)$ [vaccine]

$R_i$ is the fixed effect of stratum i

$\Delta_i$ is the treatment effect (“shift”) in stratum i

No T x S interaction: $\Delta_i = \Delta$ for all i (constant shift)

$H_0: \Delta_i = 0$ for all i vs. $H_1: \Delta_i \ne 0$ for at least one i


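Technical Details (continued)

Let $R_{ijk}$ = rank of $Y_{ijk}$ (stratum-specific OR stratum-invariant)

$Z = \dfrac{\sum_i w_i \left[ S_i - E(S_i \mid H_0) \right]}{\sqrt{\sum_i w_i^2 V(S_i \mid H_0)}}$, p-value = $2 P(Z > |Z_{obs}|)$

$S_i = \sum_{k=1}^{n_{i1}} R_{i1k}$, $w_i$ = weight for stratum i

$E(S_i \mid H_0) = \dfrac{n_{i1}}{n_{i1}+n_{i2}} \sum_{j=1}^{2} \sum_{k=1}^{n_{ij}} R_{ijk}$

$V(S_i \mid H_0) = \dfrac{n_{i1} n_{i2}}{(n_{i1}+n_{i2})(n_{i1}+n_{i2}-1)} \sum_{j=1}^{2} \sum_{k=1}^{n_{ij}} \left[ R_{ijk} - \dfrac{E(S_i \mid H_0)}{n_{i1}} \right]^2$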


Technical Details (continued)

Three Popular Rank-Based Tests

Test         Stratum weights           Comments
$T_{EQ}$     $w_i = 1$                 PROC FREQ SCORES = RANK; stratum-specific ranks
$T_{vE}$     $w_i = 1/(n_i + 1)$       PROC FREQ SCORES = MODRIDIT; van Elteren test (1960); stratum-specific ranks
$T^*_{EQ}$   $w_i = 1$                 PROC TWOSAMPL; stratum-invariant ranks

Note: $n_i = n_{i1} + n_{i2}$
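The three tests differ only in the ranks and the stratum weights plugged into the same statistic, which the following sketch makes explicit for the motivating example. It is a plain-Python illustration under stated assumptions (midranks for ties, normal reference distribution, illustrative function names), not the SAS or StatXact implementation.

```python
import math

def midranks(values):
    """1-based ranks; tied values share the average of their positions."""
    s = sorted(values)
    return [(s.index(v) + 1 + len(s) - s[::-1].index(v)) / 2 for v in values]

def stratified_rank_test(strata, weight, stratum_specific=True):
    """strata: list of (treatment-1 values, treatment-2 values).
    weight: function mapping the stratum size n_i to w_i.
    Returns (Z, two-sided p-value)."""
    if stratum_specific:
        ranked = [midranks(g1 + g2) for g1, g2 in strata]
    else:
        pooled = midranks([y for g1, g2 in strata for y in g1 + g2])
        ranked, start = [], 0
        for g1, g2 in strata:
            n = len(g1) + len(g2)
            ranked.append(pooled[start:start + n])
            start += n
    num = var = 0.0
    for (g1, g2), r in zip(strata, ranked):
        n1, n2 = len(g1), len(g2)
        n = n1 + n2
        mean_rank = sum(r) / n
        s_i = sum(r[:n1])                                   # rank sum, treatment 1
        e_i = n1 * mean_rank                                # E(S_i | H0)
        v_i = n1 * n2 / (n * (n - 1)) * sum((x - mean_rank) ** 2 for x in r)
        w = weight(n)
        num += w * (s_i - e_i)
        var += w ** 2 * v_i
    z = num / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2))

females = ([3.90, 3.96], [1.40, 2.80, 2.90])
males = ([3.50, 3.50, 3.56, 3.59, 3.69, 3.85, 4.06, 4.36, 4.36, 4.43, 4.68,
          4.69, 4.70, 4.85, 5.06, 5.50],
         [1.79, 2.32, 2.54, 3.42, 3.59, 3.89, 4.64, 5.23, 5.32])
data = [females, males]

_, p_eq = stratified_rank_test(data, lambda n: 1.0)
_, p_ve = stratified_rank_test(data, lambda n: 1.0 / (n + 1))
_, p_eq_star = stratified_rank_test(data, lambda n: 1.0, stratum_specific=False)
print(round(p_eq, 4), round(p_ve, 4), round(p_eq_star, 4))
```

Up to rounding in the last digit, the three p-values agree with those reported for PROC FREQ (RANK), PROC FREQ (MODRIDIT) and PROC TWOSAMPL in the motivating example.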


Technical Details (continued)

• If there is no true treatment by stratum interaction ($\Delta_i = \Delta$ for all i), the van Elteren test is optimal among all the stratified tests, i.e., $w_i = 1/(n_i + 1)$ are optimal weights.

• However, if interaction exists, the van Elteren test can suffer from a power loss.

• In general, is there an asymptotically optimal test (with optimal weights) that allows for interaction?

YES … we derived it ($T_{opt}$), based on stratum-specific ranks.


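Technical Details (continued)

Weights needed for $T_{opt}$:

$w_i^{opt} = \dfrac{\theta_i - 0.5}{n_i + 1}$, with $\theta_i = P(Y_{i1k} > Y_{i2j})$

Since $\theta_i$ is unknown, we studied a test based on estimated optimal weights. $\theta_i$ can be estimated as

$\hat\theta_i = \sum_{j,k} I(Y_{i1j} > Y_{i2k}) / (n_{i1} n_{i2})$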
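The estimated weights are easy to compute directly. The sketch below assumes the indicator counts pairs in which the treatment-1 (placebo) value strictly exceeds the treatment-2 (vaccine) value, so tied pairs contribute zero; that reading of the indicator, and the function names, are assumptions.

```python
def theta_hat(group1, group2):
    """Share of (group1, group2) pairs in which the group-1 value is larger."""
    wins = sum(1 for y1 in group1 for y2 in group2 if y1 > y2)
    return wins / (len(group1) * len(group2))

def estimated_optimal_weight(group1, group2):
    """(theta_hat - 0.5) / (n_i + 1) for one stratum."""
    n = len(group1) + len(group2)
    return (theta_hat(group1, group2) - 0.5) / (n + 1)

# Motivating example: (placebo, vaccine) within each stratum
females = ([3.90, 3.96], [1.40, 2.80, 2.90])
males = ([3.50, 3.50, 3.56, 3.59, 3.69, 3.85, 4.06, 4.36, 4.36, 4.43, 4.68,
          4.69, 4.70, 4.85, 5.06, 5.50],
         [1.79, 2.32, 2.54, 3.42, 3.59, 3.89, 4.64, 5.23, 5.32])

for name, (placebo, vaccine) in (("Females", females), ("Males", males)):
    print(name, round(theta_hat(placebo, vaccine), 4),
          round(estimated_optimal_weight(placebo, vaccine), 4))
```

Because the weights are themselves estimated from the data, plugging them into the usual variance formula does not by itself give a valid reference distribution; the sketch stops at the weights.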


Technical Details (continued)

We studied two other published tests:

Aligned rank test ($T_{align}$) [Hodges and Lehmann, Annals of Math Stat, 1962]

Step 1: Calculate $Y^{align}_{ijk} = Y_{ijk} - b_i$, where $b_i$ is the Hodges-Lehmann estimate of the stratum "location" (median of all pairwise means of the observations in stratum i)

Step 2: Perform unstratified Wilcoxon rank sum test using $Y^{align}_{ijk}$
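The two steps can be sketched as follows. Assumptions that go beyond the slide: the pairwise means include each observation paired with itself (Walsh averages), ties receive midranks, and the Wilcoxon rank sum test uses the normal approximation without continuity correction. Function names and the small data set are illustrative.

```python
import math
from statistics import median

def midranks(values):
    """1-based ranks; tied values share the average of their positions."""
    s = sorted(values)
    return [(s.index(v) + 1 + len(s) - s[::-1].index(v)) / 2 for v in values]

def hodges_lehmann_location(values):
    """Median of all pairwise means (pairs i <= j) of the observations."""
    n = len(values)
    return median((values[i] + values[j]) / 2 for i in range(n) for j in range(i, n))

def aligned_rank_test(strata):
    """strata: list of (treatment-1 values, treatment-2 values).
    Returns (Z, two-sided p-value)."""
    g1_all, g2_all = [], []
    for g1, g2 in strata:
        b = hodges_lehmann_location(g1 + g2)     # Step 1: stratum location
        g1_all += [y - b for y in g1]
        g2_all += [y - b for y in g2]
    ranks = midranks(g1_all + g2_all)            # Step 2: rank the aligned values together
    n1, n2 = len(g1_all), len(g2_all)
    n = n1 + n2
    mean_rank = (n + 1) / 2
    s = sum(ranks[:n1])
    v = n1 * n2 / (n * (n - 1)) * sum((r - mean_rank) ** 2 for r in ranks)
    z = (s - n1 * mean_rank) / math.sqrt(v)
    return z, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical data: shifting a whole stratum does not change the result
base = [([5.0, 7.0, 9.0], [1.0, 2.0, 4.0]), ([6.0, 8.0], [3.0, 5.0, 7.0])]
shifted = [base[0], ([106.0, 108.0], [103.0, 105.0, 107.0])]
print(aligned_rank_test(base))
print(aligned_rank_test(shifted))
```

The second call illustrates the point of alignment: a pure stratum effect is removed before ranking, so the test is unaffected by it.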


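Technical Details (continued)

Brunner's test ($T_{Brunner}$) [Brunner, Puri and Sun, JASA, 1995]

Define overall treatment effect as:

$p_s = \sum_{i=1}^{s} \left( \hat p_i - \dfrac{1}{2} \right) = \sum_{i=1}^{s} \dfrac{1}{n_{i1}} \left( \bar R_{i2\cdot} - \dfrac{n_{i1}+n_{i2}+1}{2} \right)$, where

$\hat p_i = \dfrac{1}{n_{i1}} \left( \bar R_{i2\cdot} - \dfrac{n_{i2}+1}{2} \right)$, $\quad \bar R_{i2\cdot} = \dfrac{1}{n_{i2}} \sum_{j=1}^{n_{i2}} R_{i2j}$ ($R_{itj}$ is a stratum-specific rank)

Under $H_0: p_s = 0$ (equivalent to $H_0: \Delta_i = 0$ for all i)

$D^2 = \dfrac{\left[ \sum_{i=1}^{s} \dfrac{1}{n_{i1}} \left( \bar R_{i2\cdot} - \dfrac{n_{i1}+n_{i2}+1}{2} \right) \right]^2}{\sum_{i=1}^{s} \hat\sigma_i^2} \sim \chi^2_1$

$\hat\sigma_i^2$ is the stratum variance estimate of Brunner, Puri and Sun (1995), computed from the stratum-specific ranks $R_{itj}$ and the ranks $R^{(t)}_{itj}$, where $R^{(t)}_{itj}$ is the rank within the stratum by treatment cell.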


Technical Details (continued)

We also studied two versions of an adaptive test:

Let $p_{TxS}$ = p-value for treatment by stratum (T x S) interaction, and $\rho(\hat\Delta, n)$ = Spearman’s rank correlation between the (estimated) treatment effect ($\hat\Delta$) and stratum size (n). Our proposed adaptive test:

If $p_{TxS} < 0.10$ and $\rho(\hat\Delta, n) > 0$, $T_{adap} = T_{EQ}$; otherwise, $T_{adap} = T_{align}$.

How to obtain the T x S interaction p-value ($p_{TxS}$)?
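The selection step of the adaptive test is a simple rule once $p_{TxS}$ and the per-stratum effect estimates are available. The sketch below assumes the rule reads "interaction p-value below 0.10 and positive Spearman correlation selects $T_{EQ}$"; the direction of both comparisons is an assumption, as are the function names and the example inputs.

```python
import math

def midranks(values):
    """1-based ranks; tied values share the average of their positions."""
    s = sorted(values)
    return [(s.index(v) + 1 + len(s) - s[::-1].index(v)) / 2 for v in values]

def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the midranks)."""
    rx, ry = midranks(x), midranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return 0.0                      # no variation: treat as no association
    return cov / (sx * sy)

def choose_test(p_txs, effects, sizes, threshold=0.10):
    """Return the name of the test the adaptive rule would apply."""
    if p_txs < threshold and spearman(effects, sizes) > 0:
        return "T_EQ"
    return "T_align"

# Hypothetical inputs: estimated shifts and stratum sizes for three strata
print(choose_test(0.04, [0.2, 0.5, 0.9], [10, 20, 30]))   # effect grows with size
print(choose_test(0.04, [0.9, 0.5, 0.2], [10, 20, 30]))   # effect shrinks with size
print(choose_test(0.50, [0.2, 0.5, 0.9], [10, 20, 30]))   # no sign of interaction
```

Under this reading, equal stratum weights are used only when there is evidence of interaction and the larger strata carry the larger effects; in every other case the rule falls back on the aligned rank test.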


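Technical Details (continued)

• Adaptive test $T_{adap1}$ based on TxS test of Öhrvik [1999]:

Let $Z_{ijk}$ = rank of $Y^{align}_{ijk}$ (stratum-invariant rank)

Test statistic:

$Q_{int} = \dfrac{12}{N(N+1)} \sum_{i=1}^{s} \sum_{j=1}^{2} n_{ij} \left( \bar Z_{ij\cdot} - \dfrac{N+1}{2} \right)^2$

where $N = \sum_{i=1}^{s} \sum_{j=1}^{2} n_{ij}$ and $\bar Z_{ij\cdot} = \dfrac{1}{n_{ij}} \sum_{k=1}^{n_{ij}} Z_{ijk}$

$Q_{int} \sim \chi^2_{s-1}$ under the hypothesis of no TxS interaction

P-value: $p_{TxS} = P(\chi^2_{s-1} > Q_{int})$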


Technical Details (continued)

• Adaptive test $T_{adap2}$ based on TxS test of Brunner et al. [1995]:

Test statistic:

$Q_B = \sum_{i=1}^{s} \dfrac{1}{\hat\sigma_i^2} \left( \hat p_i - \dfrac{\sum_{j=1}^{s} \hat p_j / \hat\sigma_j^2}{\sum_{j=1}^{s} 1/\hat\sigma_j^2} \right)^2$,

where $\hat\sigma_i^2$ and $\hat p_i$ are as described for Brunner's test

$Q_B \sim \chi^2_{K-1}$ under the hypothesis of no TxS interaction

P-value: $p_{TxS} = P(\chi^2_{K-1} > Q_B)$

Note: The two adaptive tests have similar performance in simulations, so $T_{adap} = T_{adap1}$ from here on.


Technical Details (continued)

Estimate and 100(1-$\alpha$)% CI for $\Delta$ Obtained by Inverting the Given Test

• Let $\tilde Y^{c}_{ijk} = Y_{ijk}$ if j = 1, and $\tilde Y^{c}_{ijk} = Y_{ijk} - c$ if j = 2

Let p(c) = 1-tailed p-value for test applied to $\tilde Y^{c}_{ijk}$

Point estimate ($\hat\Delta$): c for which p(c) = .50
Lower limit ($\Delta_L$): c for which p(c) = $\alpha/2$
Upper limit ($\Delta_U$): c for which p(c) = $1 - \alpha/2$

Obtained via a numerical search.


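Motivating Example Revisited: 2-tailed p-values

Test              Description                                                p-value
$T_{EQ}$          $w_i = 1$                                                  .1505
$T^*_{EQ}$        $w_i = 1$ [with stratum-invariant ranks]                   .0435*
$T_{vE}$          van Elteren test, $w_i = 1/(n_i + 1)$                      .0643
$T_{Brunner}$     Brunner's test (described above)                           .0250*
$\hat T_{opt}$    $\hat w_i = (\hat\theta_i - 0.5)/(n_i + 1)$                .0990
$T_{adap}$        adaptive test (rule below)                                 .0654
$T_{align}$       Aligned rank test                                          .0654

$T_{adap}$: If $p_{TxS} < 0.10$ and $\rho(\hat\Delta, n) > 0$, $T_{adap} = T_{EQ}$; otherwise, $T_{adap} = T_{align}$.

Note: All methods except $T^*_{EQ}$ use stratum-specific ranks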


Motivating Example Revisited: Estimates and 95% CIs for $\Delta$ (selected methods)

Method:     $T_{EQ}$         $T^*_{EQ}$     $T_{vE}$        $T_{align}$
p-value     .1506            .0435          .0643           .0654
Estimate    .80              0.94           1.00            .84
95% CI      (-0.28, 1.61)    (.01, 1.71)    (-.04, 1.61)    (-.09, 1.53)

Stratum   Summary   Placebo   Vaccine   P - V
Females   Median    3.93      2.80      1.13
          n         2         3
Males     Median    4.36      3.59      0.77
          n         16        9


Simulation Study

• 2 treatments, 1:1 randomization per stratum

• Number of strata = 2, 4, 6, 8, 10, and 12

• Stratum size ($n_i$): 10*i for stratum i

• Different choices of $\Delta_i$:

– constant for each stratum (no TxS interaction)

– positively or negatively associated with stratum size (TxS interaction, with 50% power to detect it)

• Four different distributions for Y:

– Normal

– Log Normal

– Mixture of Normals: 0.9N(m,v) + 0.1N(m*,v*)

– t3


Simulation Results: Type I Error Rate (nominal = 5%)

Normal Distribution

Note: 5.00% + 3 std. errors = 5.92% (5000 simulations)

             Number of Strata
Test         2      4      6      8      10     12
T_EQ         4.6    4.7    4.8    5.0    4.7    5.3
T*_EQ        5.0    4.8    4.7    5.0    5.0    4.9
T_vE         5.0    4.8    4.7    5.3    4.6    5.3
T_opt        4.7    4.1    4.8    4.4    4.5    4.7
T_align      5.5    5.3    5.0    5.7    5.1    5.3
T_adap1      5.7    5.3    5.1    5.7    5.1    5.5
T_Brunner    11.3   10.7   12.1   11.7   11.1   10.9
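The 5.92% threshold quoted in the note is the nominal level plus three binomial standard errors for 5000 simulated trials, which can be verified in a few lines (variable names are illustrative):

```python
import math

nominal = 0.05      # nominal type I error rate
n_sims = 5000       # number of simulated trials

std_error = math.sqrt(nominal * (1 - nominal) / n_sims)
upper = nominal + 3 * std_error
print(f"{100 * std_error:.3f}% per standard error, threshold {100 * upper:.2f}%")
```

Simulated rates above this threshold (as for Brunner's test in the table) are unlikely to be explained by simulation noise alone.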


Simulation Results: Type I Error Rate (nominal = 5%)

Lognormal Distribution

Note: 5.00% + 3 std. errors = 5.92% (5000 simulations)

             Number of Strata
Test         2      4      6      8      10     12
T_EQ         4.1    4.9    5.0    5.2    5.3    5.3
T*_EQ        4.9    4.8    5.5    5.3    5.4    4.6
T_vE         4.6    4.4    5.2    5.4    4.9    5.3
T_opt        4.4    3.8    5.0    4.6    5.0    5.0
T_align      5.0    5.1    5.4    5.4    5.0    5.5
T_adap1      5.0    5.3    5.5    5.4    5.0    5.5
T_Brunner    11.0   11.3   12.6   12.1   11.6   11.0


Simulation Results: Type I Error Rate (nominal = 5%)

Mixture of Normals Distribution

Note: 5.00% + 3 std. errors = 5.92% (5000 simulations)

             Number of Strata
Test         2      4      6      8      10     12
T_EQ         4.3    4.8    5.2    5.1    4.9    5.0
T*_EQ        4.5    4.9    5.4    4.8    4.9    4.7
T_vE         4.7    4.8    4.8    4.8    5.3    4.9
T_opt        4.9    4.1    4.4    4.0    4.5    5.2
T_align      5.3    5.2    5.5    5.2    5.1    5.0
T_adap1      5.3    5.3    5.4    5.3    5.1    5.0
T_Brunner    11.1   11.2   11.4   11.1   10.7   11.4


Simulation Results: Type I Error Rate (nominal = 5%)

t3 Distribution

Note: 5.00% + 3 std. errors = 5.92% (5000 simulations)

             Number of Strata
Test         2      4      6      8      10     12
T_EQ         3.9    4.7    4.4    5.0    5.3    4.8
T*_EQ        4.4    4.7    4.6    4.7    4.6    5.1
T_vE         4.4    4.6    4.4    4.8    5.1    4.6
T_opt        4.3    3.6    4.5    4.7    4.9    5.1
T_align      4.9    5.0    4.8    4.8    5.3    5.0
T_adap1      4.9    5.1    4.9    4.9    5.5    5.1
T_Brunner    11.4   11.2   11.9   12.0   11.6   11.7


Simulation Results: Power (%), No T x S interaction (constant $\Delta$)

             Normal, No. of Strata          Lognormal, No. of Strata
Test         2      4      6      8         2      4      6      8
T_EQ         75.8   78.4   77.1   78.4      78.2   77.8   78.5   76.9
T*_EQ        80.0   82.7   81.1   81.4      82.2   80.6   81.0   77.4
T_vE         79.0   82.1   81.5   82.5      81.5   81.3   83.2   81.2
T_opt        67.1   59.3   51.8   49.6      70.3   59.2   53.5   47.6
T_align      82.1   84.0   83.2   83.9      84.3   83.2   84.6   82.5
T_adap1      82.4   84.2   83.4   84.1      84.6   83.3   84.9   82.6

Note: if there is no T x S interaction, $T_{vE} = T_{opt}$


Simulation Results: Power (%), No T x S interaction

             Mixture of Normals, No. of Strata    t3 Distribution, No. of Strata
Test         2      4      6      8               2      4      6      8
T_EQ         75.6   79.3   76.6   77.5            79.4   78.0   76.6   78.0
T*_EQ        79.7   80.3   76.2   75.6            82.7   79.0   74.1   72.8
T_vE         79.1   82.5   80.4   82.4            82.5   81.9   80.9   82.8
T_opt        67.7   60.6   50.7   48.8            72.5   59.2   50.7   49.2
T_align      81.9   84.2   82.2   83.3            83.6   83.3   81.9   83.6
T_adap1      82.7   84.4   82.3   83.4            84.1   83.5   82.0   83.6

Note: if there is no T x S interaction, $T_{vE} = T_{opt}$


Simulation Results: Power (%), Normal Distribution

[Figure: two panels, Positive Association ($\rho(\Delta, n) > 0$) and Negative Association ($\rho(\Delta, n) < 0$). Vertical axis: Power (%), 50 to 90. Horizontal axis: 2, 4, 6, 8 strata. Plotting symbols: 1 = $T_{EQ}$, 2 = $T^*_{EQ}$, 3 = $T_{vE}$, 4 = $\hat T_{opt}$, 5 = $T_{align}$, 6 = $T_{adap}$.]


Simulation Results: Power (%), Lognormal Distribution

[Figure: two panels, Positive Association ($\rho(\Delta, n) > 0$) and Negative Association ($\rho(\Delta, n) < 0$). Vertical axis: Power (%), 50 to 90. Horizontal axis: 2, 4, 6, 8 strata. Plotting symbols: 1 = $T_{EQ}$, 2 = $T^*_{EQ}$, 3 = $T_{vE}$, 4 = $\hat T_{opt}$, 5 = $T_{align}$, 6 = $T_{adap}$.]


Simulation Results: Power (%), Mixture of Normals

[Figure: two panels, Positive Association ($\rho(\Delta, n) > 0$) and Negative Association ($\rho(\Delta, n) < 0$). Vertical axis: Power (%), 50 to 90. Horizontal axis: 2, 4, 6, 8 strata. Plotting symbols: 1 = $T_{EQ}$, 2 = $T^*_{EQ}$, 3 = $T_{vE}$, 4 = $\hat T_{opt}$, 5 = $T_{align}$, 6 = $T_{adap}$.]


Simulation Results: Power (%), t3 Distribution

[Figure: two panels, Positive Association ($\rho(\Delta, n) > 0$) and Negative Association ($\rho(\Delta, n) < 0$). Vertical axis: Power (%), 50 to 90. Horizontal axis: 2, 4, 6, 8 strata. Plotting symbols: 1 = $T_{EQ}$, 2 = $T^*_{EQ}$, 3 = $T_{vE}$, 4 = $\hat T_{opt}$, 5 = $T_{align}$, 6 = $T_{adap}$.]


Conclusions (Part II)

For rank-based analyses of stratified trials:

> No single method is uniformly the best

Recommendation: use the aligned rank test (Talign) or either of the proposed adaptive tests (Tadap1 or Tadap2). Both tests were more powerful than the van Elteren test (TvE) in every case studied, notably so when there was a true (but hard to detect) T x S interaction.

> It is time to retire the popular van Elteren test!


References

• Brunner, E., Puri, M. L., and Sun, S. (1995). Nonparametric Methods for Stratified Two-Sample Designs with Application to Multiclinic Trials. Journal of the American Statistical Association, 90, 1004-1014.

• Hodges, J. L. and Lehmann, E. L. (1962). Rank Methods for Combination of Independent Experiments in the Analysis of Variance. Annals of Mathematical Statistics, 33, 482-497.

• Mehrotra, D.V. and Railkar, R. (2000). Minimum Risk Weights for Comparing Treatments in Stratified Binomial Trials. Statistics in Medicine, 19, 811-825.

• Wang, W., Mehrotra, D.V., Chan, I.S.F. and Heyse, J.F. (2006). Non-Inferiority/Equivalence Trials in Vaccine Development. Journal of Biopharmaceutical Statistics, 16, 429-441.

• Öhrvik, J. (1999). Aligned Ranks: A Method of Gaining Efficiency in Rank Tests. http://www.stat.fi/isi99/proceedings/arkisto/varasto/hrvi0423.pdf

• van Elteren, P. H. (1960). On the Combination of Independent Two Sample Tests of Wilcoxon. Bulletin of the International Statistical Institute, 37, 351-361.