
Analysis of Stratified Trials – Challenging the “Standard” Methods

Devan V. Mehrotra

Clinical Biostatistics, Merck Research Laboratories

Department Seminar

Jan 10, 2008

Outline

• Part I: binary response variable

> Mantel-Haenszel test

> Minimum risk weights

> Simulation results

> Conclusions

• Part II: continuous non-normal response variable

> Motivating example

> Technical details

> Simulation results

> Conclusions

Part I: Analysis of Binary Data

Stratified Trials with Binary Endpoints

• 2 treatments (A and B); number of strata = $s$; binary response (responder/non-responder)

• $p_{ij}$ = true (population) proportion for stratum $i$, treatment $j$

$\delta_i = p_{iA} - p_{iB}$ = true difference for stratum $i$

$f_i$ = true (population) relative frequency for stratum $i$

$\delta = \sum_i f_i \delta_i$ = true overall difference

• $\hat p_{ij}$ = observed proportion for stratum $i$, treatment $j$

$n_{ij}$ = observed number of subjects in stratum $i$, treatment $j$

$\hat\delta_i = \hat p_{iA} - \hat p_{iB}$

$w_i$ = weight assigned to stratum $i$

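Hypothesis Testing: General Framework

Superiority or Non-Inferiority Trials

$H_0: \delta = 0$ vs. $H_1: \delta \neq 0$ (superiority trial)

$H_0: \delta \le -\delta_0$ vs. $H_1: \delta > -\delta_0$ (non-inferiority trial)

$$Z_w = \frac{\left|\sum_i w_i \hat\delta_i\right| - cc}{\sqrt{\sum_i w_i^2\, a_i\, V(\hat\delta_i)}} \quad \text{(for superiority trial)}$$

$$Z_w = \frac{\sum_i w_i \hat\delta_i + \delta_0 - cc}{\sqrt{\sum_i w_i^2\, a_i\, V(\hat\delta_i)}} \quad \text{(for non-inferiority trial)}$$

$a_i$ = finite sample term, $cc$ = continuity correction

IMPORTANT: What to use for $w_i$, $V(\hat\delta_i)$, $a_i$, and $cc$?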

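Mantel-Haenszel Test (1959)

Superiority Trials

$$w_i^{CMH} = \frac{n_{iA} n_{iB} / (n_{iA} + n_{iB})}{\sum_i n_{iA} n_{iB} / (n_{iA} + n_{iB})}, \qquad a_i = \frac{n_{iA} + n_{iB}}{n_{iA} + n_{iB} - 1}, \qquad cc = 0.5 \left[\sum_i \frac{n_{iA} n_{iB}}{n_{iA} + n_{iB}}\right]^{-1}$$

$$Z_{MH}^2 = \frac{\left(\left|\sum_i w_i \hat\delta_i\right| - cc\right)^2}{\sum_{i=1}^{s} w_i^2\, a_i\, \tilde V_i}$$

where $\tilde V_i = \bar p_i (1 - \bar p_i)\left(\frac{1}{n_{iA}} + \frac{1}{n_{iB}}\right)$ and $\bar p_i = \frac{n_{iA} \hat p_{iA} + n_{iB} \hat p_{iB}}{n_{iA} + n_{iB}}$.

Note: MH test is optimal if and only if the odds ratio $\frac{p_{iA}/(1 - p_{iA})}{p_{iB}/(1 - p_{iB})}$ is constant across strata.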
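The weighted-difference form of the MH statistic can be checked against the textbook form built from hypergeometric means and variances. The following is a minimal sketch, not code from the talk: the function names and the counts in `example` are hypothetical, and the normal-approximation p-value uses only the standard library.

```python
import math

def mh_weighted_form(strata, cc_factor=0.5):
    """Z^2_MH written as a weighted difference of proportions.

    strata: list of (xA, nA, xB, nB) = responders and sample size per arm.
    """
    big_w = [nA * nB / (nA + nB) for (_, nA, _, nB) in strata]
    total = sum(big_w)
    num = 0.0
    var = 0.0
    for (xA, nA, xB, nB), Wi in zip(strata, big_w):
        w = Wi / total                        # CMH weight w_i
        delta_hat = xA / nA - xB / nB         # observed difference
        p_bar = (xA + xB) / (nA + nB)         # pooled proportion
        v_null = p_bar * (1 - p_bar) * (1 / nA + 1 / nB)
        a = (nA + nB) / (nA + nB - 1)         # finite sample term a_i
        num += w * delta_hat
        var += w ** 2 * a * v_null
    cc = cc_factor / total                    # continuity correction
    return (abs(num) - cc) ** 2 / var

def mh_classical_form(strata, cc=0.5):
    """Textbook MH chi-square from hypergeometric means and variances."""
    diff = 0.0
    var = 0.0
    for xA, nA, xB, nB in strata:
        n, m1 = nA + nB, xA + xB
        diff += xA - nA * m1 / n
        var += nA * nB * m1 * (n - m1) / (n ** 2 * (n - 1))
    return (abs(diff) - cc) ** 2 / var

# Hypothetical counts, not taken from the talk.
example = [(30, 50, 20, 50), (12, 40, 8, 45)]
z2_weighted = mh_weighted_form(example)
z2_classical = mh_classical_form(example)
p_two_tailed = math.erfc(math.sqrt(z2_weighted / 2))  # 2 * (1 - Phi(|Z|))
print(z2_weighted, z2_classical, p_two_tailed)
```

The two functions agree because $w_i \hat\delta_i$ is proportional to the observed-minus-expected count in stratum $i$, and $w_i^2 a_i \tilde V_i$ is proportional to the hypergeometric variance. Passing `cc_factor=0` gives the version without continuity correction.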

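Choice of Variance

• Null variance [Miettinen & Nurminen, 1985; Farrington & Manning, 1990]

$$V(\hat\delta_i) = \tilde V_i = \frac{\tilde p_{iA}(1 - \tilde p_{iA})}{n_{iA}} + \frac{\tilde p_{iB}(1 - \tilde p_{iB})}{n_{iB}}$$

$\tilde p_{ij}$ = m.l.e. of $p_{ij}$ under the restriction $p_{iT} - p_{iC} = -\delta_0$ (T = test treatment A, C = control B; $\delta_0 = 0$ for superiority trials)

Note: MH test uses the null variance.

• Observed (OBS) variance

$$V(\hat\delta_i) = \hat V_i = \frac{\hat p_{iA}(1 - \hat p_{iA})}{n_{iA}} + \frac{\hat p_{iB}(1 - \hat p_{iB})}{n_{iB}}$$

• Note: With 1:1 randomization, $\hat V_i \le \tilde V_i$ is always true for superiority trials, and usually so (but not always) for non-inferiority trials.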
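For superiority trials with equal allocation ($n_{iA} = n_{iB} = n$), the null variance uses the pooled proportion, and a short calculation gives $\tilde V_i - \hat V_i = (\hat p_{iA} - \hat p_{iB})^2 / (2n) \ge 0$. The sketch below checks this on a grid and shows that the ordering can fail under unequal allocation. The function names and the numbers are illustrative assumptions, not values from the talk.

```python
def observed_variance(pA, pB, nA, nB):
    """Observed (OBS) variance of the difference in proportions."""
    return pA * (1 - pA) / nA + pB * (1 - pB) / nB

def null_variance_superiority(pA, pB, nA, nB):
    """Null variance when delta_0 = 0: both arms use the pooled proportion."""
    p_bar = (nA * pA + nB * pB) / (nA + nB)
    return p_bar * (1 - p_bar) * (1 / nA + 1 / nB)

# Equal allocation: scan a grid of (pA, pB) pairs for any violation.
n = 100
grid = [k / 20 for k in range(21)]
violations = [
    (pA, pB)
    for pA in grid
    for pB in grid
    if observed_variance(pA, pB, n, n)
    > null_variance_superiority(pA, pB, n, n) + 1e-12
]

# Unequal allocation (hypothetical 3:1 split): the ordering can reverse.
unequal_obs = observed_variance(0.1, 0.5, 150, 50)
unequal_null = null_variance_superiority(0.1, 0.5, 150, 50)
print(violations, unequal_obs, unequal_null)
```

The non-inferiority case needs the restricted m.l.e. under $p_{iT} - p_{iC} = -\delta_0$, which is not implemented here.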

(pA, pB) pairs where Null or Observed Variance is “Better”

Non-Inferiority Margin = 15%

Equal allocation; VR = variance ratio (null:obs). N: VR < 0.98; =: 0.98 < VR < 1.02; O: VR > 1.02.

Rows: P_A (Test); columns: P_B (Control). Only pairs with P_A minus P_B >= -0.15 are shown.

P_A    0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95  1.00

1.00    O     O     O     O     O     O     O     O     O     O     O

0.95    O     O     O     O     O     O     O     O     O     O     O

0.90    O     O     O     O     O     O     O     O     O     O     O

0.85    O     O     O     O     O     O     O     O     O     O     =

0.80    O     O     O     O     O     O     O     O     O     =

0.75    O     O     O     =     =     =     =     =     =

0.70    O     =     =     =     =     =     =     =

0.65    =     =     =     =     =     =     =

0.60    =     =     =     =     =     =

0.55    =     N     =     =     =

0.50    N     =     =     =

0.45    =     =     =

0.40    =     =

0.35    =

(pA, pB) pairs where Null or Observed Variance is “Better”

Non-Inferiority Margin = 5%

Equal allocation; VR = variance ratio (null:obs). N: VR < 0.98; =: 0.98 < VR < 1.02; O: VR > 1.02.

Rows: P_A (Test); columns: P_B (Control). Only pairs with P_A minus P_B >= -0.05 are shown.

P_A    0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95  1.00

1.00    O     O     O     O     O     O     O     O     O     O     O

0.95    O     O     O     O     O     O     O     O     O     O     =

0.90    O     O     O     O     O     O     O     O     O     =

0.85    O     O     O     O     O     O     O     =     =

0.80    O     O     O     O     O     =     =     =

0.75    O     O     O     =     =     =     =

0.70    O     O     =     =     =     =

0.65    O     =     =     =     =

0.60    =     =     =     =

0.55    =     =     =

0.50    =     =

0.45    =

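Choice of Weights

• Cochran-Mantel-Haenszel (CMH) weights

$$w_i^{CMH} = \frac{n_{iA} n_{iB} / (n_{iA} + n_{iB})}{\sum_i n_{iA} n_{iB} / (n_{iA} + n_{iB})}$$

>> Estimator of $\delta$ is approximately unbiased if $f_i / (n_{iA} + n_{iB})$ is constant across strata.

• Minimum Risk (MR) weights [Mehrotra & Railkar, 2000]

>> Estimator of $\delta$ has smallest mean squared error.

Formula for two strata trial:

$$w_1^{MR} = \frac{\hat V_1^{-1} + f_1 \hat V_1^{-1} \hat V_2^{-1} (\hat\delta_1 - \hat\delta_2)^2}{\hat V_1^{-1} + \hat V_2^{-1} + \hat V_1^{-1} \hat V_2^{-1} (\hat\delta_1 - \hat\delta_2)^2}, \qquad w_2^{MR} = 1 - w_1^{MR}$$

For general formula (> two strata), see Mehrotra & Railkar (2000).

>> If $\hat\delta_i$ is constant, $w_i^{MR} = \hat V_i^{-1} / \sum_i \hat V_i^{-1}$ (optimal weights!)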
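The two-strata MR weight can be sketched in a few lines. This is an illustrative implementation, not the authors' code: it multiplies the numerator and denominator of the formula by $\hat V_1 \hat V_2$, which is algebraically the same expression and avoids dividing by a zero variance. The function name and the input values are hypothetical.

```python
def mr_weights_two_strata(d1, d2, v1, v2, f1):
    """Minimum risk weights for two strata.

    d1, d2: estimated stratum differences; v1, v2: their variances;
    f1: relative frequency of stratum 1.
    """
    gap2 = (d1 - d2) ** 2
    w1 = (v2 + f1 * gap2) / (v1 + v2 + gap2)
    return w1, 1.0 - w1

# Equal stratum estimates: MR weights reduce to inverse-variance weights.
w_equal = mr_weights_two_strata(0.10, 0.10, 0.004, 0.001, 0.5)
inverse_variance_w1 = (1 / 0.004) / (1 / 0.004 + 1 / 0.001)

# Very different estimates with tiny variances: w1 moves toward f1,
# which protects against bias in estimating the overall difference.
w_far = mr_weights_two_strata(0.50, 0.00, 1e-9, 1e-9, 0.7)
print(w_equal, inverse_variance_w1, w_far)
```

The two limiting cases show the trade-off the weights make: variance dominates when the stratum estimates agree, and bias dominates when they differ widely.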

Choice of Finite Sample Term

• With CMH weights (i.e., with MH test):

$a_i = (n_{iA} + n_{iB}) / (n_{iA} + n_{iB} - 1)$ is used.

• With MR weights:

$a_i = 1$ is recommended.

Choice of Continuity Correction

• With CMH weights:

$cc = 0.5 \left[\sum_i \frac{n_{iA} n_{iB}}{n_{iA} + n_{iB}}\right]^{-1}$ is used by original MH test.

However, $cc = 0$ is a less conservative choice.

• With MR weights:

$cc = \frac{3}{16} \left[\sum_i \frac{n_{iA} n_{iB}}{n_{iA} + n_{iB}}\right]^{-1}$ is recommended.

See Mehrotra & Railkar, Stats in Med, 2000

Motivating Example Revisited

Test for Superiority

Stratum   Vaccine A       Vaccine B       Diff.   OR         Null ($\tilde V_i$)   Obs ($\hat V_i$)

1         .771 (37/48)    .647 (44/68)    .124    1.835      .086                  .084

2         .156 (5/32)     .000 (0/12)     .156    infinity   .091                  .064

Method               w1     w2     $\hat\delta_w$   $V_w$   2-tailed p-value

MH (original cc)     .763   .237   .131             .069    .0882

MH (cc=0)            .763   .237   .131             .069    .0573

MR (null variance)   .400   .600   .143             .064    .0318*

MR (obs variance)    .400   .600   .143             .051    .0068*

* establishes superiority at 2-tailed $\alpha$ = .05

MH = Mantel-Haenszel test; MR = test using minimum risk weights
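The "MR (obs variance)" row can be retraced from the counts in the table. The sketch below is an illustration under two assumptions that are not stated on the slide: $f_1$ is estimated by the stratum's share of all subjects (116/160), and the p-value comes from the normal approximation. With those choices the numbers land on the tabulated .400/.600, .143, .051 and .0068. The variance columns of the table appear to be on the standard-error scale, since .084 and .064 are the square roots of the observed variances computed here.

```python
import math

# (responders A, n A, responders B, n B) per stratum, from the example table.
strata = [(37, 48, 44, 68), (5, 32, 0, 12)]

def diff_and_obs_var(xA, nA, xB, nB):
    pA, pB = xA / nA, xB / nB
    return pA - pB, pA * (1 - pA) / nA + pB * (1 - pB) / nB

(d1, v1), (d2, v2) = [diff_and_obs_var(*s) for s in strata]

# Assumption: f_1 estimated by the share of subjects in stratum 1.
n_total = sum(nA + nB for (_, nA, _, nB) in strata)
f1 = (strata[0][1] + strata[0][3]) / n_total

# Two-strata minimum risk weights (numerator and denominator of the
# slide formula multiplied by V1 * V2).
gap2 = (d1 - d2) ** 2
w1 = (v2 + f1 * gap2) / (v1 + v2 + gap2)
w2 = 1.0 - w1

estimate = w1 * d1 + w2 * d2
se = math.sqrt(w1 ** 2 * v1 + w2 ** 2 * v2)   # a_i = 1 with MR weights

# Continuity correction recommended with MR weights.
cc = (3 / 16) / sum(nA * nB / (nA + nB) for (_, nA, _, nB) in strata)
z = (abs(estimate) - cc) / se
p_two_tailed = math.erfc(z / math.sqrt(2))    # 2 * (1 - Phi(z))

print(round(w1, 3), round(w2, 3), round(estimate, 3), round(se, 3),
      round(p_two_tailed, 4))
```

Stratum 2 has a larger estimated difference but fewer subjects. CMH weights give it about a quarter of the weight, while MR weights give it 60%, which is where the gain in this example comes from.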

Simulation Results

Test for Superiority (2 strata)

Type I Error Rate (alpha = 5%)

Strat 1        Strat 2                       Method

p1A    p1B     p2A    p2B     $\delta$   N*    MHorig   MHcc=0   MRnull   MRobs

.83    .83     .37    .37     0          50    2.6      5.0      4.5      5.1

.83    .83     .37    .37     0          150   3.6      5.0      4.7      4.9

.83    .83     .37    .37     0          250   3.9      5.1      4.8      4.9

.83    .83     .37    .37     0          500   4.1      4.9      4.7      4.8

Power (%)

.884   .83     .470   .37     .068 (a)   500   75       77       77       77

.898   .83     .438   .37     .068 (b)   500   76       78       81       81

.906   .83     .419   .37     .068 (c)   500   77       79       83       83

.914   .83     .401   .37     .068 (d)   500   78       80       84       84

* per treatment group

(f1 = .7, f2 = .3); No TxS interaction on (a) logit, (b) proportion, (c) square root, and (d) log scales; 100,000 simulations.

Illustrative Example #2

Test for Non-Inferiority

$H_0: \delta \le -0.05$ vs. $H_1: \delta > -0.05$ ($\delta_0 = 0.05$)

Stratum $i$   Test (A) $\hat p_{iA}$   Control (B) $\hat p_{iB}$   A – B $\hat\delta_i$   Null $\tilde V_i$   Observed $\hat V_i$

1             .891 (98/110)            .891 (98/110)               .000                   .0427               .0420

2             .978 (88/90)             .978 (88/90)                .000                   .0283               .0220

Method (Weights_Variance)   $w_1$   $w_2$   $\hat\delta_w = \sum_i w_i \hat\delta_i$   $V(\hat\delta_w)$   1-tailed p-value

CMH_NULL                    .55     .45     .000                                       .0267               .0307

CMH_OBS                     .55     .45     .000                                       .0251               .0234*

MR_OBS                      .21     .79     .000                                       .0195               .0059*

* establishes non-inferiority at 1-tailed $\alpha$ = .025

Simulation Results

Test for Non-inferiority (2 strata)

Type I Error Rate (nominal $\alpha$ = .025)

$\delta_0$   N      A: CMH_NULL   B: CMH_OBS   C: MR_OBS

.20          74     .026          .026         .023

.15          130    .025          .025         .025

.10          285    .024          .025         .024

.05          1130   .025          .025         .025

N = sample size per treatment group; $n_{1T} = n_{1C} \sim B(N, f_1 = .50)$, $n_{2T} = n_{2C} = N - n_{1T}$; $p_{1C} = .70$, $p_{2C} = .90$, $p_{iT} = p_{iC} - \delta_0$ ($H_0$ true).

Results based on 100,000 simulations.

Simulation Results: Power

Test for Non-inferiority (2 strata)

Power

$\delta_0$   N      A: CMH_NULL   B: CMH_OBS   C: MR_OBS   $$ saved, C vs. A*

.20          74     .871          .886         .900        $80K

.15          130    .870          .881         .900        $150K

.10          285    .865          .871         .900        $330K

.05          1130   .863          .865         .900        $1.42M

* Based on the difference in N required to achieve 90% power with popular method A, and assuming $5,000 per subject. Results based on 100,000 simulations.

N = sample size per treatment group; $n_{1T} = n_{1C} \sim B(N, f_1 = .50)$, $n_{2T} = n_{2C} = N - n_{1T}$; $p_{1C} = .70$, $p_{2C} = .90$, $p_{iT} = p_{iC}$ ($\delta = 0$, $H_1$ true).

Summary (Part I)

For stratified trials with binary responses:

The popular Mantel-Haenszel test uses sample size (CMH) weights with null variances. It has good power properties if and only if the odds ratio is constant across strata.

Using minimum risk (MR) weights with observed (OBS) variances will usually provide notably more power than CMH weights with null variances for both superiority and non-inferiority trials.

Recommendation: consider MR_OBS as a default, but use simulations to quantify power differences between methods when planning a new trial.

Part II: Analysis of Continuous Data Using Ranks

Motivating Example

Hypothetical viral loads of HIV+ subjects (log10 copies/ml)

Stratum: Females

Placebo: 3.90, 3.96

Vaccine: 1.40, 2.80, 2.90

Stratum: Males

Placebo: 3.50, 3.50, 3.56, 3.59, 3.69, 3.85, 4.06, 4.36, 4.36, 4.43, 4.68, 4.69, 4.70, 4.85, 5.06, 5.50

Vaccine: 1.79, 2.32, 2.54, 3.42, 3.59, 3.89, 4.64, 5.23, 5.32

Motivating Example (continued)

• Observed viral load summaries (log10 copies/ml):

Stratum   Summary   Placebo   Vaccine

Females   Mean      3.93      2.37

Females   Median    3.93      2.80

Females   SD        0.04      0.84

Females   n         2         3

Males     Mean      4.27      3.64

Males     Median    4.36      3.59

Males     SD        0.62      1.27

Males     n         16        9

• Compared to placebo, the VLs for vaccine appear to be “shifted” to the left (i.e., are numerically smaller). Is the shift statistically significant?

Stratified rank-based analysis: SAS implementation

• PROC FREQ;

TABLES gender * trt * vload / CMH SCORES=RANK;

RUN;

• PROC FREQ;

TABLES gender * trt * vload / CMH SCORES=MODRIDIT;

RUN;

• PROC TWOSAMPL; [Part of PROC StatXact module]

WI/AS; PO trt; RE vload; ST gender;

RUN;

• 2-tailed p-values using the three “methods”:

PROC FREQ (SCORES=RANK): p = .1506

PROC FREQ (SCORES=MODRIDIT): p = .0642

PROC TWOSAMPL: p = .0436*

Different conclusions at $\alpha$ = .05 … why?

• PROC FREQ

> Ranks based on pooled sample within each stratum (“stratum-specific” ranks)

> SCORES = RANK: equal stratum weights

> SCORES = MODRIDIT: unequal stratum weights

• PROC TWOSAMPL: Ranks based on overall pooled sample, ignoring strata (“stratum-invariant” ranks), with equal stratum weights.

Technical Details

Stratified Rank-Based Tests

$Y_{ijk}$ = response for stratum $i$, treatment $j$, subject $k$ ($i = 1, \dots, s$; $j = 1, 2$; $k = 1, \dots, n_{ij}$)

Assumptions:

$Y_{i1k}$ ~ i.i.d. $F(y - \beta_i)$ [placebo]

$Y_{i2k}$ ~ i.i.d. $F(y - \beta_i + \Delta_i)$ [vaccine]

$\beta_i \in R$ is the fixed effect of stratum $i$

$\Delta_i$ is the treatment effect (“shift”) in stratum $i$

No T x S interaction: $\Delta_i = \Delta$ (constant shift)

$H_0: \Delta_i = 0\ \forall i$ vs. $H_1: \Delta_i \neq 0$ for at least one $i$

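Technical Details (continued)

Let $R_{ijk}$ = rank of $Y_{ijk}$ (stratum-specific OR stratum-invariant)

$S_i = \sum_{k=1}^{n_{i1}} R_{i1k}$, $w_i$ = weight for stratum $i$

$$Z_{obs} = \frac{\sum_i w_i \left[S_i - E(S_i \mid H_0)\right]}{\sqrt{\sum_i w_i^2\, V(S_i \mid H_0)}}, \qquad p\text{-value} = 2 P(Z > |Z_{obs}|)$$

$$E(S_i \mid H_0) = \frac{n_{i1}}{n_{i1} + n_{i2}} \sum_{j=1}^{2} \sum_{k=1}^{n_{ij}} R_{ijk}$$

$$V(S_i \mid H_0) = \frac{n_{i1} n_{i2}}{(n_{i1} + n_{i2})(n_{i1} + n_{i2} - 1)} \sum_{j=1}^{2} \sum_{k=1}^{n_{ij}} \left(R_{ijk} - \frac{E(S_i \mid H_0)}{n_{i1}}\right)^2$$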

Three Popular Rank-Based Tests

Test         Stratum weights         Comments

$T_{EQ}$     $w_i = 1$               PROC FREQ SCORES = RANK; stratum-specific ranks

$T_{vE}$     $w_i = 1/(n_i + 1)$     PROC FREQ SCORES = MODRIDIT; van Elteren test (1960); stratum-specific ranks

$T^*_{EQ}$   $w_i = 1$               PROC TWOSAMPL; stratum-invariant ranks

Note: $n_i = n_{i1} + n_{i2}$
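The three tests differ only in the ranking scheme and the stratum weights, so one function can cover all of them. The sketch below is an illustrative Python version, not the SAS or StatXact code: it uses midranks for ties and the normal approximation for the p-value, and all function names are hypothetical. On the motivating viral load data it gives p-values close to the three reported in the talk (.1506, .0642, .0436).

```python
import math

# Motivating example: (placebo values, vaccine values) per stratum.
females = ([3.90, 3.96], [1.40, 2.80, 2.90])
males = ([3.50, 3.50, 3.56, 3.59, 3.69, 3.85, 4.06, 4.36, 4.36, 4.43,
          4.68, 4.69, 4.70, 4.85, 5.06, 5.50],
         [1.79, 2.32, 2.54, 3.42, 3.59, 3.89, 4.64, 5.23, 5.32])
strata = [females, males]

def midranks(values):
    """Ranks 1..n with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def stratified_rank_test(strata, weight, stratum_invariant=False):
    """Return (Z_obs, 2-tailed p). weight maps stratum size n_i to w_i."""
    if stratum_invariant:
        pooled = midranks([y for g1, g2 in strata for y in g1 + g2])
    num = var = 0.0
    start = 0
    for g1, g2 in strata:
        n1, n2 = len(g1), len(g2)
        n = n1 + n2
        if stratum_invariant:
            r = pooled[start:start + n]   # ranks from the overall pooled sample
            start += n
        else:
            r = midranks(g1 + g2)         # ranks within this stratum only
        s = sum(r[:n1])                   # S_i: rank sum of treatment 1
        e = n1 * sum(r) / n               # E(S_i | H0)
        mean_rank = sum(r) / n            # equals E(S_i | H0) / n_i1
        v = n1 * n2 / (n * (n - 1)) * sum((x - mean_rank) ** 2 for x in r)
        w = weight(n)
        num += w * (s - e)
        var += w ** 2 * v
    z = num / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2))

z_eq, p_eq = stratified_rank_test(strata, lambda n: 1.0)
z_ve, p_ve = stratified_rank_test(strata, lambda n: 1.0 / (n + 1))
z_eq_star, p_eq_star = stratified_rank_test(strata, lambda n: 1.0,
                                            stratum_invariant=True)
print(round(p_eq, 4), round(p_ve, 4), round(p_eq_star, 4))
```

Passing the weight as a function of $n_i$ keeps the three tests in one code path. A test with data-dependent weights would need a different interface.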

• If there is no true treatment by stratum interaction ($\Delta_i = \Delta$ for all $i$), the van Elteren test is optimal among all the stratified tests, i.e., $w_i = 1/(n_i + 1)$ are optimal weights.

• However, if interaction exists, the van Elteren test can suffer from a power loss.

• In general, is there an asymptotically optimal test (with optimal weights) that allows for interaction?

YES … we derived it ($T_{opt}$), based on stratum-specific ranks.

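Weights needed for $T_{opt}$:

$$w_i^{opt} = \frac{\theta_i - 0.5}{n_i + 1}, \quad \text{with } \theta_i = P(Y_{i1j} > Y_{i2k})$$

Since $\theta_i$ is unknown, we studied a test based on estimated optimal weights. $\theta_i$ can be estimated as

$$\hat\theta_i = \sum_{j,k} I(Y_{i1j} > Y_{i2k}) / (n_{i1} n_{i2})$$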
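The estimate $\hat\theta_i$ is the fraction of (placebo, vaccine) pairs within a stratum in which the placebo value is larger. A minimal sketch follows, using the motivating data. It assumes a strict inequality in the indicator, so tied pairs count as zero, and the function names are hypothetical. It computes only the estimated weights and does not attempt to reproduce any p-value from the talk.

```python
def theta_hat(group1, group2):
    """Fraction of pairs with a group-1 value strictly above a group-2 value."""
    wins = sum(1 for a in group1 for b in group2 if a > b)
    return wins / (len(group1) * len(group2))

def estimated_opt_weight(group1, group2):
    """Estimated optimal stratum weight (theta_hat - 0.5) / (n_i + 1)."""
    n_i = len(group1) + len(group2)
    return (theta_hat(group1, group2) - 0.5) / (n_i + 1)

females_placebo = [3.90, 3.96]
females_vaccine = [1.40, 2.80, 2.90]
males_placebo = [3.50, 3.50, 3.56, 3.59, 3.69, 3.85, 4.06, 4.36, 4.36, 4.43,
                 4.68, 4.69, 4.70, 4.85, 5.06, 5.50]
males_vaccine = [1.79, 2.32, 2.54, 3.42, 3.59, 3.89, 4.64, 5.23, 5.32]

theta_f = theta_hat(females_placebo, females_vaccine)
theta_m = theta_hat(males_placebo, males_vaccine)
w_f = estimated_opt_weight(females_placebo, females_vaccine)
w_m = estimated_opt_weight(males_placebo, males_vaccine)
print(theta_f, theta_m, w_f, w_m)
```

With only two placebo subjects among the females, $\hat\theta$ equals 1 there, which illustrates how noisy the estimated weights can be in small strata.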

We studied two other published tests:

Aligned rank test ($T_{align}$) [Hodges and Lehmann, Annals of Math. Stat., 1962]

Step 1: Calculate $Y_{ijk}^{align} = Y_{ijk} - b_i$, where $b_i$ is the Hodges-Lehmann estimate of the stratum "location" (median of all pairwise means of the observations in stratum $i$)

Step 2: Perform unstratified Wilcoxon rank sum test using $Y_{ijk}^{align}$

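Brunner's test ($T_{Brunner}$) [Brunner, Puri and Sun, JASA, 1995]

Define the overall treatment effect by combining the stratum-level effect estimates $\hat p_i$ over the $s$ strata, where

$$\hat p_i = \frac{1}{n_{i1}} \left(\bar R_{i2\cdot} - \frac{n_{i2} + 1}{2}\right), \qquad \bar R_{i2\cdot} = \frac{1}{n_{i2}} \sum_{j=1}^{n_{i2}} R_{i2j}$$

($R_{itj}$ is a stratum-specific rank)

Under $H_0$ of no overall treatment effect (equivalent to $H_0: \Delta_i = 0\ \forall i$), the squared overall estimate divided by its estimated variance is approximately $\chi^2_1$.

The stratum-level variance estimates $\hat\sigma_i^2$ are computed from the cell sizes $n_{it}$ and the differences $R_{itj} - R_{itj}^{(t)}$, where $R_{itj}^{(t)}$ is the rank within the stratum by treatment cell; see Brunner, Puri and Sun (1995) for the full expressions.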

We also studied two versions of an adaptive test:

Let $p_{TxS}$ = p-value for treatment by stratum (T x S) interaction, and $\rho(\hat\Delta, n)$ = Spearman’s rank correlation between the (estimated) treatment effect ($\hat\Delta$) and stratum size ($n$). Our proposed adaptive test:

If $p_{TxS} < 0.10$ and $\rho(\hat\Delta, n) > 0$, $T_{adap} = T_{EQ}$; otherwise, $T_{adap} = T_{align}$.

How to obtain the T x S interaction p-value ($p_{TxS}$)?

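• Adaptive test $T_{adap1}$ based on TxS test of Öhrvik [1999]:

Let $Z_{ijk}$ = rank of $Y_{ijk}^{align}$ (stratum-invariant rank)

Test statistic:

$$Q_{int} = \frac{12}{N(N+1)} \sum_{i=1}^{s} \sum_{j=1}^{2} n_{ij} \left(\bar Z_{ij\cdot} - \frac{N+1}{2}\right)^2$$

where $N = \sum_{i=1}^{s} \sum_{j=1}^{2} n_{ij}$ and $\bar Z_{ij\cdot} = \frac{1}{n_{ij}} \sum_{k=1}^{n_{ij}} Z_{ijk}$

$Q_{int} \sim \chi^2_{s-1}$ under the hypothesis of no TxS interaction

P-value: $p_{TxS} = P(\chi^2_{s-1} > Q_{int})$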

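• Adaptive test $T_{adap2}$ based on TxS test of Brunner et al. [1995]:

Test statistic:

$$Q_B = \sum_{i=1}^{s} \frac{1}{\hat\sigma_i^2} \left(\hat p_i - \frac{\sum_{j=1}^{s} \hat p_j / \hat\sigma_j^2}{\sum_{j=1}^{s} 1 / \hat\sigma_j^2}\right)^2$$

where $\hat\sigma_i^2$ and $\hat p_i$ are as described for Brunner's test

$Q_B \sim \chi^2_{K-1}$ under the hypothesis of no TxS interaction ($K$ = number of strata)

P-value: $p_{TxS} = P(\chi^2_{K-1} > Q_B)$

Note: The two adaptive tests have similar performance in simulations, so $T_{adap} = T_{adap1}$ from here on.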

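Estimate and 100(1-$\alpha$)% CI for $\Delta$ Obtained by Inverting the Given Test

• Let $\tilde Y_{ijk}^{c} = Y_{ijk}$ if $j = 1$, and $\tilde Y_{ijk}^{c} = Y_{ijk} + c$ if $j = 2$

Let p(c) = 1-tailed p-value for test applied to $\tilde Y_{ijk}^{c}$

• Point estimate ($\hat\Delta$) = $c$ for which $p(c) = .50$

Lower limit ($\Delta_L$) = $c$ for which $p(c) = \alpha/2$

Upper limit ($\Delta_U$) = $c$ for which $p(c) = 1 - \alpha/2$

Obtained via a numerical search.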
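The numerical search can be sketched with bisection, since the 1-tailed p-value does not decrease as $c$ increases. The example below inverts the equal-weights test with stratum-specific ranks on the motivating data. It is an illustration under stated assumptions, not the authors' procedure: the shift is added to the vaccine values, the p-value uses the normal approximation, p(c) is a step function so each solution is taken as the smallest $c$ at which p(c) reaches the target, and the search interval of -10 to 10 is arbitrary.

```python
import math

placebo = {"F": [3.90, 3.96],
           "M": [3.50, 3.50, 3.56, 3.59, 3.69, 3.85, 4.06, 4.36, 4.36, 4.43,
                 4.68, 4.69, 4.70, 4.85, 5.06, 5.50]}
vaccine = {"F": [1.40, 2.80, 2.90],
           "M": [1.79, 2.32, 2.54, 3.42, 3.59, 3.89, 4.64, 5.23, 5.32]}

def midranks(values):
    """Ranks 1..n with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def p_one_tailed(c):
    """Upper-tail p-value of the equal-weights test after adding c to vaccine."""
    num = var = 0.0
    for key in placebo:
        g1 = placebo[key]
        g2 = [y + c for y in vaccine[key]]
        n1, n2 = len(g1), len(g2)
        n = n1 + n2
        r = midranks(g1 + g2)
        mean_rank = sum(r) / n
        num += sum(r[:n1]) - n1 * mean_rank
        var += n1 * n2 / (n * (n - 1)) * sum((x - mean_rank) ** 2 for x in r)
    z = num / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2))

def solve(target, lo=-10.0, hi=10.0, tol=1e-6):
    """Smallest c (within tol) at which p(c) reaches the target."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p_one_tailed(mid) >= target - 1e-9:   # small slack for rounding
            hi = mid
        else:
            lo = mid
    return hi

alpha = 0.05
estimate = solve(0.50)
lower = solve(alpha / 2)
upper = solve(1 - alpha / 2)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

On these data the search lands near the values the talk reports for $T_{EQ}$: an estimate of about .80 with limits of about -0.28 and 1.61. Other tests can be inverted the same way by swapping the body of `p_one_tailed`.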

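Motivating Example Revisited

2-tailed p-values

Test            Description                                                 p-value

$T_{EQ}$        $w_i = 1$                                                   .1505

$T^*_{EQ}$      $w_i = 1$ [with stratum-invariant ranks]                    .0435*

$T_{vE}$        van Elteren test, $w_i = 1/(n_i + 1)$                       .0643

$T_{Brunner}$   Brunner's test, described earlier                           .0250*

$T_{opt}$       $\hat w_i = (\hat\theta_i - 0.5)/(n_i + 1)$                 .0990

$T_{adap}$      If $p_{TxS} < 0.10$ and $\rho(\hat\Delta, n) > 0$, $T_{adap} = T_{EQ}$; otherwise, $T_{adap} = T_{align}$   .0654

$T_{align}$     Aligned rank test                                           .0654

Note: All methods except $T^*_{EQ}$ use stratum-specific ranks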

Motivating Example Revisited

Estimates and 95% CIs for $\Delta$ (selected methods)

Method:     $T_{EQ}$         $T^*_{EQ}$     $T_{vE}$        $T_{align}$

p-value     .1506            .0435          .0643           .0654

Estimate    .80              0.94           1.00            .84

95% CI      (-0.28, 1.61)    (.01, 1.71)    (-.04, 1.61)    (-.09, 1.53)

Stratum   Summary   Placebo   Vaccine   P - V

Females   Median    3.93      2.80      1.13

Females   n         2         3

Males     Median    4.36      3.59      0.77

Males     n         16        9

Simulation Study

• 2 treatments, 1:1 randomization per stratum

• Number of strata = 2, 4, 6, 8, 10, and 12

• Stratum size ($n_i$): 10*i for stratum i

• Different choices of $\Delta_i$:

– constant for each stratum (no TxS interaction)

– positively or negatively associated with stratum size (TxS interaction, with 50% power to detect it)

• Four different distributions for Y:

– Normal

– Log Normal

– Mixture of Normals: 0.9N(m,v) + 0.1N(m*,v*)

– $t_3$

Simulation Results

Type I Error Rate (nominal $\alpha$ = 5%)

Normal Distribution

Note: 5.00% + 3 std. errors = 5.92% (5000 simulations)

              Number of Strata

Test          2      4      6      8      10     12

T_EQ          4.6    4.7    4.8    5.0    4.7    5.3

T*_EQ         5.0    4.8    4.7    5.0    5.0    4.9

T_vE          5.0    4.8    4.7    5.3    4.6    5.3

T_opt         4.7    4.1    4.8    4.4    4.5    4.7

T_align       5.5    5.3    5.0    5.7    5.1    5.3

T_adap1       5.7    5.3    5.1    5.7    5.1    5.5

T_Brunner     11.3   10.7   12.1   11.7   11.1   10.9

Simulation Results

Type I Error Rate (nominal $\alpha$ = 5%)

Lognormal Distribution

              Number of Strata

Test          2      4      6      8      10     12

T_EQ          4.1    4.9    5.0    5.2    5.3    5.3

T*_EQ         4.9    4.8    5.5    5.3    5.4    4.6

T_vE          4.6    4.4    5.2    5.4    4.9    5.3

T_opt         4.4    3.8    5.0    4.6    5.0    5.0

T_align       5.0    5.1    5.4    5.4    5.0    5.5

T_adap1       5.0    5.3    5.5    5.4    5.0    5.5

T_Brunner     11.0   11.3   12.6   12.1   11.6   11.0

Simulation Results

Type I Error Rate (nominal $\alpha$ = 5%)

Mixture of Normals Distribution

              Number of Strata

Test          2      4      6      8      10     12

T_EQ          4.3    4.8    5.2    5.1    4.9    5.0

T*_EQ         4.5    4.9    5.4    4.8    4.9    4.7

T_vE          4.7    4.8    4.8    4.8    5.3    4.9

T_opt         4.9    4.1    4.4    4.0    4.5    5.2

T_align       5.3    5.2    5.5    5.2    5.1    5.0

T_adap1       5.3    5.3    5.4    5.3    5.1    5.0

T_Brunner     11.1   11.2   11.4   11.1   10.7   11.4

Simulation Results

Type I Error Rate (nominal $\alpha$ = 5%)

$t_3$ Distribution

              Number of Strata

Test          2      4      6      8      10     12

T_EQ          3.9    4.7    4.4    5.0    5.3    4.8

T*_EQ         4.4    4.7    4.6    4.7    4.6    5.1

T_vE          4.4    4.6    4.4    4.8    5.1    4.6

T_opt         4.3    3.6    4.5    4.7    4.9    5.1

T_align       4.9    5.0    4.8    4.8    5.3    5.0

T_adap1       4.9    5.1    4.9    4.9    5.5    5.1

T_Brunner     11.4   11.2   11.9   12.0   11.6   11.7

Simulation Results: Power (%)

No T x S interaction (constant $\Delta$)

              Normal, No. of Strata          Lognormal, No. of Strata

Test          2      4      6      8         2      4      6      8

T_EQ          75.8   78.4   77.1   78.4      78.2   77.8   78.5   76.9

T*_EQ         80.0   82.7   81.1   81.4      82.2   80.6   81.0   77.4

T_vE          79.0   82.1   81.5   82.5      81.5   81.3   83.2   81.2

T_opt         67.1   59.3   51.8   49.6      70.3   59.2   53.5   47.6

T_align       82.1   84.0   83.2   83.9      84.3   83.2   84.6   82.5

T_adap1       82.4   84.2   83.4   84.1      84.6   83.3   84.9   82.6

Note: if there is no T x S interaction, $T_{vE} = T_{opt}$

Simulation Results: Power (%)

No T x S interaction

              Mixture of Normals, No. of Strata   $t_3$ Distribution, No. of Strata

Test          2      4      6      8              2      4      6      8

T_EQ          75.6   79.3   76.6   77.5           79.4   78.0   76.6   78.0

T*_EQ         79.7   80.3   76.2   75.6           82.7   79.0   74.1   72.8

T_vE          79.1   82.5   80.4   82.4           82.5   81.9   80.9   82.8

T_opt         67.7   60.6   50.7   48.8           72.5   59.2   50.7   49.2

T_align       81.9   84.2   82.2   83.3           83.6   83.3   81.9   83.6

T_adap1       82.7   84.4   82.3   83.4           84.1   83.5   82.0   83.6

Note: if there is no T x S interaction, $T_{vE} = T_{opt}$

Simulation Results: Power (%)

Normal Distribution

[Figure: power (%) of six tests, numbered 1 = $T_{EQ}$, 2 = $T^*_{EQ}$, 3 = $T_{vE}$, 4 = $\hat T_{opt}$, 5 = $T_{align}$, 6 = $T_{adap}$, plotted for 2, 4, 6, and 8 strata. Two panels: Positive Association, $\rho(\Delta, n) > 0$, and Negative Association, $\rho(\Delta, n) < 0$. Vertical axis: Power (%), 50 to 90.]

Simulation Results: Power (%)

Lognormal Distribution

[Figure: power (%) of tests 1 to 6, in two panels. Vertical axis: Power (%), 50 to 90.]

Simulation Results: Power (%)

Mixture of Normals

[Figure: power (%) of tests 1 to 6, in two panels. Vertical axis: Power (%), 50 to 90.]

Simulation Results: Power (%)

$t_3$ Distribution

[Figure: power (%) of tests 1 to 6, in two panels. Vertical axis: Power (%), 50 to 90.]

Conclusions (Part II)

For rank-based analyses of stratified trials:

> No single method is uniformly the best

Recommendation: use the aligned rank test ($T_{align}$) or either of the proposed adaptive tests ($T_{adap1}$ or $T_{adap2}$). Both tests were more powerful than the van Elteren test ($T_{vE}$) in every case studied, notably so when there was a true (but hard to detect) T x S interaction.

> It is time to retire the popular van Elteren test!

References

• Brunner, E., Puri, M. L., and Sun, S. (1995). Nonparametric Methods for Stratified Two-Sample Designs with Application to Multiclinic Trials. Journal of the American Statistical Association, 90, 1004-1014.

• Hodges, J. L. and Lehmann, E. L. (1962). Rank Methods for Combination of Independent Experiments in the Analysis of Variance. Annals of Mathematical Statistics, 33, 482-497.

• Mehrotra, D.V. and Railkar, R. (2000). Minimum Risk Weights for Comparing Treatments in Stratified Binomial Trials. Statistics in Medicine, 19, 811-825.

• Wang, W., Mehrotra, D.V., Chan, I.S.F. and Heyse, J.F. (2006). Non-Inferiority/Equivalence Trials in Vaccine Development. Journal of Biopharmaceutical Statistics, 16, 429-441.

• Öhrvik, J. (1999). Aligned Ranks: A Method of Gaining Efficiency in Rank Tests. http://www.stat.fi/isi99/proceedings/arkisto/varasto/hrvi0423.pdf

• van Elteren, P. H. (1960). On the Combination of Independent Two Sample Tests of Wilcoxon. Bulletin of the International Statistical Institute, 37, 351-361.
