2014 eugm - blinded adaptations, permutations tests and t tests
Blinded Adaptations, Permutation Tests & T-Tests
Michael Proschan (NIAID)
Introduction
• Joint work with Ekkehard Glimm and Martin Posch 2014, Stat. in Med. online
• See also Posch & Proschan 2012, Stat. in Med. 31, 4146-4153
Introduction
• Clinical trials are pre-meditated!
• We pre-specify everything
– Superiority/noninferiority
– Population (inclusion/exclusion criteria)
– Primary endpoint
– Secondary endpoints
– Analysis methods
– Sample size/power
Introduction
• Changes made after seeing data are rightly questioned: are investigators trying to get an unfair advantage?
– Changing primary endpoint because another endpoint has a bigger treatment effect
– Increasing sample size because the p-value is close
– Changing primary analysis because “assumptions are violated”
– Changing population because of promising subgroup results
Introduction
• What’s the harm? 0.05 is arbitrary anyway
• Problem: if unlimited freedom to change anything, the real error rate could be huge
• Reminiscent of Bible code controversy
– Clairvoyant messages such as “Bin Laden” and “twin towers” by skipping letters in Old Testament
– Similar messages can be found by skipping letters in any large book (Brendan McKay)
Introduction
• But changes made before unblinding are different
• Under strong null hypothesis that treatment has NO effect, blinded data give no info about treatment effect
– Impossible to cheat even if it seems like cheating
• E.g., even if blinded data show bimodal distribution, it is not caused by treatment if strong null is true
Permutation Tests
• Permutation tests condition on all data other than treatment labels
• Under strong null, (D,Z) are independent, where Z are ±1 treatment indicators & D are data
– Observed data D would have been observed regardless of the treatment given
– It is as if we observed D FIRST, then made the treatment assignments Z
Permutation Tests
• Peeking at data changes nothing because permutation tests already condition on D
• Conditional distribution of test statistic T(Z,Y) given D is that of T(Z,y) where y is fixed
• Distribution of Z depends on randomization method
– Simple
– Permuted block, etc.
Block        1          2          3          4
Assignment   T T C C    C T C T    C C T T    C T T C
Data         4 8 4 0    1 3 0 4    4 0 2 5    0 2 1 0
T-C          4.0        3.0        1.5        1.5
Overall T-C: 2.5
Permutation Tests
Block        1          2          3          4
Assignment   T C C T    C T C T    T T C C    C T C T
Data         4 8 4 0    1 3 0 4    4 0 2 5    0 2 1 0
T-C          -4.0       3.0        -1.5       0.5
Overall T-C: -0.5
Permutation Tests
[Figure: Rerandomization Distribution (permutation distribution). Histogram of frequency (0 to 100) against T-C mean (-3 to 3).]
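As a minimal sketch of how a rerandomization distribution like the one pictured could be enumerated, the Python below reuses the 16 outcomes and the two label sequences from the example. It assumes permuted blocks of size 4 with two T and two C per block, which is read off the example rather than stated in the talk. The function and variable names are illustrative only.

```python
from itertools import combinations, product

BLOCK = 4  # assumed block size: 2 T and 2 C per block
data = [4, 8, 4, 0, 1, 3, 0, 4, 4, 0, 2, 5, 0, 2, 1, 0]
observed = "TTCC" "CTCT" "CCTT" "CTTC"  # first label sequence in the example


def overall_diff(labels, y):
    """Average over blocks of (mean of T outcomes) - (mean of C outcomes)."""
    diffs = []
    for start in range(0, len(y), BLOCK):
        pairs = list(zip(labels[start:start + BLOCK], y[start:start + BLOCK]))
        t = [v for lab, v in pairs if lab == "T"]
        c = [v for lab, v in pairs if lab == "C"]
        diffs.append(sum(t) / len(t) - sum(c) / len(c))
    return sum(diffs) / len(diffs)


# All 6 ways to place 2 T's in a block of 4.
patterns = ["".join("T" if i in idx else "C" for i in range(BLOCK))
            for idx in combinations(range(BLOCK), BLOCK // 2)]

n_blocks = len(data) // BLOCK
# The outcomes stay fixed; only the labels are rerandomized.
dist = [overall_diff("".join(p), data)
        for p in product(patterns, repeat=n_blocks)]

obs = overall_diff(observed, data)
p_upper = sum(d >= obs for d in dist) / len(dist)  # one-sided p-value
print(obs, len(dist), p_upper)
```

Because the outcomes are held fixed and only the labels vary, a blinded look at the outcomes beforehand does not change this reference distribution.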
Blinded 2-Stage Procedures
• Blinded 2-stage adaptive procedures use 1st stage to make design changes
– Sample size (Gould, 1992, Stat. in Med. 11, 55-66; Gould & Shih, 1992, Commun. in Stat. 21, 2833-2853)
– Primary endpoint (e.g., diastolic versus systolic blood pressure)
• Previous argument shows that if adaptation is made before unblinding, a permutation test on 1st stage data is still valid
Blinded 2-Stage Procedures
• Careful! Subtle errors are possible
• E.g., in adaptive regression, which of the following is (are) valid?
1. From ANCOVAs $Y=\beta_0\mathbf{1}+\beta z+\beta_i x_i$, $i=1,\dots,k$, pick $x_i$ that minimizes MSE; do permutation test on winner
2. From ANCOVAs $Y=\beta_0\mathbf{1}+\beta_i x_i$, $i=1,\dots,k$, pick $x_i$ that minimizes MSE; do permutation test on $Y=\beta_0\mathbf{1}+\beta z+\beta^* x^*$, where $x^*$ is winner
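A rough sketch of option 2, in which the covariate is selected from models that do not contain z, so the selection is a function of blinded data only. The data are simulated, the helper names are hypothetical, and the permutation step assumes labels are exchangeable under the randomization scheme (fixed group sizes, simple shuffling). The treatment coefficient is obtained by residualizing both Y and z on the chosen covariate, which gives the same coefficient as fitting Y on 1, z and x*.

```python
import random


def residuals(v, x):
    """Residuals from least-squares regression of v on an intercept and x."""
    n = len(v)
    mx, mv = sum(x) / n, sum(v) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - mv) for a, b in zip(x, v)) / sxx
    return [b - mv - slope * (a - mx) for a, b in zip(x, v)]


def pick_covariate(y, covariates):
    """Blinded selection: smallest residual MSE from Y ~ 1 + x_i (z not used)."""
    n = len(y)
    mses = [sum(r * r for r in residuals(y, x)) / (n - 2) for x in covariates]
    return mses.index(min(mses))


def treatment_coef(y, z, x):
    """Coefficient of z in Y ~ 1 + z + x, via residualizing y and z on (1, x)."""
    ry, rz = residuals(y, x), residuals(z, x)
    return sum(a * b for a, b in zip(rz, ry)) / sum(a * a for a in rz)


def adaptive_permutation_pvalue(y, z, covariates, n_perm=2000, seed=0):
    """One-sided Monte Carlo permutation p-value after blinded selection."""
    rng = random.Random(seed)
    x_star = covariates[pick_covariate(y, covariates)]  # chosen without z
    obs = treatment_coef(y, z, x_star)
    z_perm = list(z)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(z_perm)  # re-assign labels; y and x_star stay fixed
        if treatment_coef(y, z_perm, x_star) >= obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Simulated null data (hypothetical): the outcome depends on covariate 1 only.
rng = random.Random(1)
n = 40
covs = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(3)]
y = [2.0 * covs[1][i] + rng.gauss(0, 1) for i in range(n)]
z = [1] * (n // 2) + [-1] * (n // 2)
rng.shuffle(z)
chosen = pick_covariate(y, covs)
p = adaptive_permutation_pvalue(y, z, covs)
print(chosen, p)
```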
Blinded 2-Stage Procedures
• Unblinding and apparent α-inflation also possible if strong null is false
• E.g., change primary endpoint based on “blinded” data (X,Y1,Y2), where Y1 and Y2 are potential primaries and X=level of study drug in blood
– X completely unblinds
– Can then pick Y1 or Y2 with biggest z-score
– Clearly inflates α
– Problem: strong null requires no effect on ANY variable examined (including X=level of study drug)
Blinded 2-Stage Procedures
• Claim: the following procedure is valid
– After viewing 1st stage data D1, choose test statistic T1(Y1,Z1) and second stage data to collect
– After observing D2, choose T2(Y2,Z2) and method of combining T1 and T2, f(T1,T2)
– Conditional distribution of f(T1,T2) given (D1,D2) is its stratified permutation distribution
– Stratified permutation test controls conditional, & therefore unconditional type I error rate
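A minimal sketch of a stratified permutation test of this kind, assuming a difference in means as the stage-wise statistic and a sample-size weighted average as the combination rule f. Both choices are hypothetical stand-ins for whatever is chosen after the blinded looks, and the data are simulated. Labels are shuffled within each stage separately, never across stages.

```python
import random


def mean_diff(y, z):
    """Mean of outcomes labelled +1 minus mean of outcomes labelled -1."""
    t = [v for v, lab in zip(y, z) if lab == 1]
    c = [v for v, lab in zip(y, z) if lab == -1]
    return sum(t) / len(t) - sum(c) / len(c)


def combine(t1, t2, n1, n2):
    """Hypothetical combination rule f(T1, T2): sample-size weighted average."""
    return (n1 * t1 + n2 * t2) / (n1 + n2)


def stratified_permutation_pvalue(y1, z1, y2, z2, n_perm=2000, seed=0):
    """One-sided Monte Carlo p-value, permuting labels within each stage."""
    rng = random.Random(seed)
    n1, n2 = len(y1), len(y2)
    obs = combine(mean_diff(y1, z1), mean_diff(y2, z2), n1, n2)
    p1, p2 = list(z1), list(z2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(p1)  # stage 1 labels only
        rng.shuffle(p2)  # stage 2 labels only
        if combine(mean_diff(y1, p1), mean_diff(y2, p2), n1, n2) >= obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical null data; the stage 2 size stands in for a blinded adaptation.
rng = random.Random(2)
y1 = [rng.gauss(0, 1) for _ in range(20)]
z1 = [1, -1] * 10
y2 = [rng.gauss(0, 1) for _ in range(30)]
z2 = [1, -1] * 15
p_null = stratified_permutation_pvalue(y1, z1, y2, z2)
print(p_null)
```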
Focus of Rest of Talk
• Permutation tests are asymptotically equivalent to t-tests
• Suggests that adaptive t-tests might be valid if adaptive permutation tests are
• We consider connections between permutation and t-tests, and validity of adaptive t-tests from adaptive permutation tests
One-Sample Case
• Community randomized trials sometimes pair match & randomize within pairs
• E.g., COMMIT trial used community intervention to help people quit smoking—11 matched pairs
• D=difference in quit rates between treatment (T) & control (C)
          T      C      D=T-C
Pair i    0.30   0.25   +0.05
One-Sample Case
          C      T      D=T-C
Pair i    0.30   0.25   -0.05
One-Sample Case
• Permuting labels changes only sign of D
• Permutation test conditions on $|D_i| = d_i^+$; $-d_i^+$ and $d_i^+$ are equally likely
• The permutation distribution of $\sum D_i$ is dist. of $\sum Z_i d_i^+$, where $Z_i = +1$ w.p. 1/2 and $Z_i = -1$ w.p. 1/2
One-Sample Case
• In 1st stage, adapt based on $|D_1|,\dots,|D_n|$ (blinded)
– E.g., increase stage 2 sample size because $|D_i|$ is very large
• What is conditional distribution of 1st stage sum $\sum D_i$ given $|D_1|=d_1^+,\dots,|D_n|=d_n^+$ and the adaptation?
– The adaptation is a function of $|D_1|,\dots,|D_n|$
– The null distribution of $\sum D_i$ given $|D_1|=d_1^+,\dots,|D_n|=d_n^+$ IS its permutation distribution
– Conclusion: permutation test on stage 1 data still valid
One-Sample Case
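• Mean and variance of permutation distribution are
$E\left(\sum Z_i d_i^+\right) = \sum d_i^+ E(Z_i) = 0$
$\operatorname{var}\left(\sum Z_i d_i^+\right) = \sum (d_i^+)^2 E(Z_i^2) = \sum (d_i^+)^2$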
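For small n the sign-flip permutation distribution can be enumerated exactly, which gives a quick check that its mean is 0 and its variance is the sum of squared differences. The sketch below uses made-up paired differences (not the COMMIT data) and hypothetical function names.

```python
from itertools import product


def sign_flip_distribution(abs_d):
    """All 2^n equally likely values of sum(Z_i * d_i+), with Z_i = +1 or -1."""
    return [sum(s * d for s, d in zip(signs, abs_d))
            for signs in product((1, -1), repeat=len(abs_d))]


def sign_flip_pvalue(d):
    """One-sided (upper) permutation p-value for the observed sum of d."""
    dist = sign_flip_distribution([abs(v) for v in d])
    obs = sum(d)
    # Small tolerance guards against floating-point ties with the observed sum.
    return sum(v >= obs - 1e-12 for v in dist) / len(dist)


# Hypothetical paired differences D = T - C.
d = [0.05, 0.02, -0.01, 0.04, 0.03, -0.02, 0.06, 0.01]
dist = sign_flip_distribution([abs(v) for v in d])
mean = sum(dist) / len(dist)
var = sum(v * v for v in dist) / len(dist) - mean ** 2
sum_sq = sum(v * v for v in d)
print(sign_flip_pvalue(d), mean, var, sum_sq)
```

Only the absolute values enter the reference distribution, so an adaptation based on them leaves the test unchanged.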
One-Sample Case
• Asymptotically, permutation distribution is normal with this mean and variance (Lindeberg-Feller CLT)
• I.e., conditional distribution of $\sum D_i$ given $|D_1|=d_1^+,\dots,|D_n|=d_n^+$ is asymptotically $N\left(0,\sum (d_i^+)^2\right)$
• Depends on $|D_1|=d_1^+,\dots,|D_n|=d_n^+$ only through $L^2=\sum (d_i^+)^2$
One-Sample Case
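• Asymptotically, permutation distribution of
$T' = \dfrac{\sum D_i}{\sqrt{\sum D_i^2}}$ is $\dfrac{N\left(0,\sum (d_i^+)^2\right)}{\sqrt{\sum (d_i^+)^2}} = N(0,1)$
• Like t-test with variance estimate $s_0^2$ instead of usual sample variance $s^2$:
$T' = \dfrac{\sum D_i}{\sqrt{n s_0^2}}$; $s_0^2 = (1/n)\sum D_i^2 = L^2/n$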
One-Sample Case
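• Recap: Permutation distribution of T’ is dist. of
$T' = \sum D_i/\sqrt{\sum D_i^2}$ given $|D_1|,\dots,|D_n|$, which is asymptotically $N(0,1)$
• So the distribution of $T'$ given $L^2=\sum D_i^2$ is asymptotically $N(0,1)$, which doesn't depend on $L^2$
• Conclusion: T’ is asymptotically indep of $L^2$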
One-Sample Case
• Begs question, is this true for all sample sizes under normality assumption?
• If $D_i$ are iid $N(0,\sigma^2)$, then can $T' = \sum D_i/\sqrt{\sum D_i^2}$ be independent of $\sum D_i^2$?
• Seems crazy, but it’s true!
One-Sample Case
• One way to see that T’ is independent of $\sum D_i^2$ uses Basu’s theorem:
• Recall S is sufficient for θ if F(y|s) does not depend on θ; it is complete if E{g(S)}=0 for all θ implies g(S)≡0 with probability 1
• A is ancillary if its distribution does not depend on θ
• Basu, 1955, Sankhya 15, 377-380: If S is a complete, sufficient statistic and A is ancillary, then S and A are independent
One-Sample Case
• Consider $D_i$ iid $N(0,\sigma^2)$ with $\sigma^2$ unknown
– $\sum D_i^2$ is complete and sufficient
– $T' = \sum D_i/(\sum D_i^2)^{1/2}$ is ancillary because it is scale-invariant
– By Basu’s theorem, T’ and $\sum D_i^2$ are independent
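A small Monte Carlo sketch can make the independence claim less surprising. It simulates iid normal differences with mean 0 and looks at the sample correlation between functions of T’ and the sum of squares. Zero correlation does not prove independence, so this is only a sanity check. The sample size, seed and helper names are arbitrary choices.

```python
import math
import random


def simulate(n=5, sigma=1.0, reps=20000, seed=0):
    """Draw D_1..D_n iid N(0, sigma^2); return lists of T' and sum(D_i^2)."""
    rng = random.Random(seed)
    t_primes, sums_sq = [], []
    for _ in range(reps):
        d = [rng.gauss(0.0, sigma) for _ in range(n)]
        ss = sum(v * v for v in d)
        t_primes.append(sum(d) / math.sqrt(ss))
        sums_sq.append(ss)
    return t_primes, sums_sq


def corr(a, b):
    """Pearson correlation, written out to keep the sketch dependency-free."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


t_primes, sums_sq = simulate()
# |T'| and T'^2 are used because T' itself is uncorrelated with sum(D^2)
# by symmetry alone, which would make the check uninformative.
corr_abs = corr([abs(t) for t in t_primes], sums_sq)
corr_sq = corr([t * t for t in t_primes], sums_sq)
print(corr_abs, corr_sq)
```

Rerunning `simulate` with a different `sigma` and the same seed should give essentially the same T’ values, which is the scale invariance behind the ancillarity argument.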
One-Sample Case
• Same argument shows that the usual t-statistic is independent of $\sum D_i^2$
• Under $D_i$ iid $N(0,\sigma^2)$ with $\sigma^2$ unknown
– $\sum D_i^2$ is complete and sufficient
– Usual t-statistic $T = \sum D_i/(n s^2)^{1/2}$ is ancillary
– By Basu’s theorem, T and $\sum D_i^2$ are independent
(Shao (2003): Mathematical Statistics, Springer)
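One step that is not on the slides but follows from the definitions may help connect the two statistics. Taking $s^2$ to be the usual sample variance with divisor $n-1$ (an assumption about the notation), T is a monotone function of T’, so independence of T’ from $\sum D_i^2$ carries over to T:

```latex
\begin{aligned}
n s^2 &= \frac{n\sum D_i^2 - \left(\sum D_i\right)^2}{n-1}, \\
T^2 &= \frac{\left(\sum D_i\right)^2}{n s^2}
     = \frac{(n-1)\,T'^2}{n - T'^2}, \\
T &= T'\sqrt{\frac{n-1}{n - T'^2}} .
\end{aligned}
```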
One-Sample Case
• This result is important for adaptive sample size calculations
– Stage 1 with $n_1$ = half of original sample size: change second stage sample size to $n_2=n_2(\sum D_i^2)$
– Conditioned on $\sum D_i^2$:
• Test statistic $T_1$ has exact t-distribution with $n_1-1$ d.f.
• Test statistic $T_2$ has exact t-distribution with $n_2-1$ d.f. and is independent of $T_1$
• P-values $P_1$ and $P_2$ are independent U(0,1)
• $Y=\{n_1^{1/2}\Phi^{-1}(P_1)+n_2^{1/2}\Phi^{-1}(P_2)\}/(n_1+n_2)^{1/2}$ is N(0,1) under H0
One-Sample Case
• Reject if $Y>z_\alpha$
• Conditioned on $\sum D_i^2$, type I error rate is α
• Unconditional type I error rate is α as well
• Most other two-stage procedures are only approximate
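A minimal sketch of the inverse-normal combination, taking the two stage-wise p-values as given (computing them from the t-distributions is left out). The sketch assumes one-sided p-values that are small when the evidence favors treatment. It therefore applies $\Phi^{-1}$ to $1-P$ so that large Y leads to rejection. This sign convention, the function names, and the example numbers are assumptions, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def inverse_normal_combination(p1, p2, n1, n2):
    """Weighted inverse-normal combination of two independent p-values.

    Uses Phi^{-1}(1 - p) so that small p-values give large Y
    (assumed convention for one-sided p-values).
    """
    z1 = STD_NORMAL.inv_cdf(1.0 - p1)
    z2 = STD_NORMAL.inv_cdf(1.0 - p2)
    return (sqrt(n1) * z1 + sqrt(n2) * z2) / sqrt(n1 + n2)


def reject(p1, p2, n1, n2, alpha=0.025):
    """Reject H0 when Y exceeds the upper-alpha standard normal quantile."""
    y = inverse_normal_combination(p1, p2, n1, n2)
    return y > STD_NORMAL.inv_cdf(1.0 - alpha)


# Hypothetical p-values; n2 would be set from the blinded stage 1 data.
y = inverse_normal_combination(0.04, 0.03, n1=50, n2=80)
print(y, reject(0.04, 0.03, n1=50, n2=80))
```

The weights may depend on $n_2$ because the argument conditions on the blinded quantity that determined it.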
One-Sample Case
• Could even make other adaptations like changing primary endpoint
• Look at $\sum D_i^2$ for each endpoint and determine which one is primary
– E.g., pick endpoint with smallest $\sum D_i^2$
• Slight generalization of our result shows that conditional distribution of T given adaptation is still exact t
One-Sample Case
• Shows that conditional type I error rate given adaptation is controlled at level α
• Unconditional type I error rate must also be controlled at level α
• Derivation assumes multivariate normality with variance/covariance not depending on mean
Two-Sample Case
• Can use same reasoning in 2-sample setting
• With equal sample sizes, the numerator is $\sum Y_{Ti} - \sum Y_{Ci} = \sum Z_i Y_i$
• Permutation distribution is distribution of $\sum Z_i y_i$, each $Z_i=\pm 1$, $\sum Z_i = 0$
• Let $s_L^2$ be “lumped” variance of all data (treatment and control)
Two-Sample Case
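• Mean and variance of permutation distribution are
$E\left(\sum Z_i y_i\right) = \sum y_i E(Z_i) = 0$
$\operatorname{var}\left(\sum Z_i y_i\right) = \dfrac{n}{n-1}\sum (y_i-\bar{y})^2 = n s_L^2$, where $n$ is the total number of observations and $s_L^2 = \sum (y_i-\bar{y})^2/(n-1)$
• Basu’s theorem shows usual 2-sample T is independent of $s_L^2$ under null hypothesis of common mean
• Conditional distribution of T given $s_L^2$ is still t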
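For a small balanced design these moments can be checked by brute force. The sketch below takes n as the total number of observations and the lumped variance with divisor n-1, both of which are assumptions about the slide's notation. It reuses the first eight outcomes of the earlier example purely as illustrative data.

```python
from itertools import combinations


def balanced_permutation_sums(y):
    """All values of sum(Z_i * y_i) with Z_i = +1 or -1 and sum(Z_i) = 0."""
    n = len(y)
    total = sum(y)
    sums = []
    for treated in combinations(range(n), n // 2):
        t_sum = sum(y[i] for i in treated)
        sums.append(t_sum - (total - t_sum))  # treated sum minus control sum
    return sums


def lumped_variance(y):
    """Sample variance of all data pooled, ignoring treatment labels."""
    n = len(y)
    ybar = sum(y) / n
    return sum((v - ybar) ** 2 for v in y) / (n - 1)


# Illustrative outcomes for n = 8 subjects (4 per arm).
y = [4.0, 8.0, 4.0, 0.0, 1.0, 3.0, 0.0, 4.0]
sums = balanced_permutation_sums(y)
perm_mean = sum(sums) / len(sums)
perm_var = sum(s * s for s in sums) / len(sums) - perm_mean ** 2
print(perm_mean, perm_var, len(y) * lumped_variance(y))
```

The permutation variance depends on the data only through the lumped variance, which is why a blinded look at that quantity is harmless here.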
Two-Sample Case
• Two-stage procedure
– Stage 1: look at lumped variance and change stage 2 sample size
– Conditioned on 1st stage lumped variance & H0
• $T_1$ has t-distribution with $n_1-2$ d.f.
• $T_2$ has t-distribution with $n_2-2$ d.f. & independent of $T_1$
• P-values $P_1$ and $P_2$ are independent uniforms
• $\{n_1^{1/2}\Phi^{-1}(P_1)+n_2^{1/2}\Phi^{-1}(P_2)\}/(n_1+n_2)^{1/2}$ is N(0,1) under H0
– Controls type I error rate conditionally and unconditionally
Summary
• Permutation tests are often valid even in adaptive settings if blind is maintained
• There is a close connection between permutation tests and t-tests
• Can deduce validity of adaptive t-tests from validity of adaptive permutation tests