
STATISTICS IN MEDICINE
Statist. Med. 2002; 21:1377–1386 (DOI: 10.1002/sim.1103)

A note on ANOVA assumptions and robust analysis for a cross-over study

Xun Chen∗,†, Peng-Liang Zhao and Ji Zhang

Clinical Biostatistics, Merck Research Laboratories, RY34-A316, Rahway, NJ 07065, U.S.A.

SUMMARY

Analysis of variance (ANOVA) methods are usually applied to analyse continuous data from cross-over studies. The analysis, however, may not have appropriate type I error when certain assumptions are violated. In this paper, we first clarify a conventionally minimum set of assumptions that validate the F-tests of ANOVA models for cross-over studies. We then provide a practical verification/remedy procedure based upon the theoretical developments. By applying the verification/remedy procedure, more robust analysis results can be expected from the ANOVA models. Copyright © 2002 John Wiley & Sons, Ltd.

KEY WORDS: cross-over trial; OLS; MVN; robust analysis; type H structure

1. INTRODUCTION

Cross-over designs have been used extensively in clinical trials, for a variety of experiments including phase I and II pharmacokinetic trials and all phases of trials in chronic diseases. In a cross-over trial, each subject receives a series of (different) treatments in different periods. To avoid the confounding of period and treatment effects, a cross-over trial contains at least two different sequences, where treatments are assigned in different orders and the allocation of subjects to each sequence is randomized. As a special kind of repeated measurement trial, one major advantage of a cross-over trial is that it performs tests for treatment effects using within-subject variation, which usually leads to more powerful analysis than parallel designs performing tests based on (larger) between-subject variation [1].

Analysis of variance (ANOVA) methods based on ordinary least squares (OLS) estimates are usually applied to analyse continuous data from cross-over trials [2, 3]. The procedure is available in SAS™ Proc GLM and other equivalent software. It is known that the analysis is valid under certain assumptions, and the assumptions should be verified. However, in practice, we find that practitioners very often do not know exactly what the required assumptions are for the ANOVA of a cross-over study to be valid, or they know the assumptions but do not know how to verify them using the available software package, or they are unsure of what to do if a departure from the assumptions is detected.

This paper addresses each of the above concerns by giving practical guidance. Section 2 formulates the problem and states explicitly the theoretical assumptions that validate ANOVA F-tests for cross-over studies. Section 3 provides a practical verification/remedy procedure. The verification procedure is developed based on the residuals of the ANOVA model. Formulae are provided to adjust the results of the ANOVA model when significant departures from certain assumptions are detected. A real data example is given in Section 4 to illustrate the application of the robust analysis procedure for a cross-over trial.

∗Correspondence to: Xun Chen, Clinical Biostatistics, Merck Research Laboratories, RY34-A316, Rahway, NJ 07065, U.S.A.

†E-mail: Xun [email protected]

Received May 2001
Accepted July 2001

2. MODEL AND ASSUMPTIONS

Suppose we have an $s$-sequence, $p$-period cross-over trial comparing $t$ treatments. Assume there are $n_i$ subjects within sequence $i$ ($i = 1, \ldots, s$), and $\sum_{i=1}^{s} n_i = n$. Let $y_{ijk}$ represent the response of subject $ij$ (the $j$th subject within sequence $i$) at its $k$th period. The usual linear model for the cross-over trial can be written as

$$y_{ijk} = \mu + s_{ij} + \pi_k + \tau_{d[i,k]} + \lambda_{d[i,k-1]} + \varepsilon_{ijk} \tag{1}$$

where $\mu$, $s_{ij}$, $\pi_k$, $\tau_{d[i,k]}$ and $\lambda_{d[i,k-1]}$ represent the overall mean, subject effect, period effect, treatment effect and carry-over effect, respectively; $\varepsilon_{ijk}$ represents the random experimental error of subject $ij$ at period $k$. Let $y = (y_{111}, \ldots, y_{11p}, \ldots, y_{sn_s1}, \ldots, y_{sn_sp})'$, $s = (s_{11}, \ldots, s_{sn_s})'$, $\pi = (\pi_1, \ldots, \pi_p)'$, $\tau = (\tau_1, \ldots, \tau_t)'$, $\lambda = (\lambda_1, \ldots, \lambda_t)'$ and $\varepsilon = (\varepsilon_{111}, \ldots, \varepsilon_{11p}, \ldots, \varepsilon_{sn_sp})'$. Model (1) in matrix notation is as follows:

$$y = 1_{np}\,\mu + (1_p \otimes I_n)\,s + X_\pi \pi + X_\tau \tau + X_\lambda \lambda + \varepsilon \tag{1'}$$

where $\otimes$ denotes the Kronecker product, $I_n$ denotes an $n \times n$ identity matrix, and $1_p$ denotes a $p$-dimensional vector of ones. $X_\pi = I_p \otimes 1_n$ denotes the $np \times p$ design matrix for period effects $\pi$. $X_\tau$ denotes the $np \times t$ design matrix for treatment effects $\tau$; it is determined by the particular design. $X_\lambda$ is the $np \times t$ design matrix for carry-over effects $\lambda$, and is directly determined by $X_\tau$. Note that model (1) (and (1′)) only considers first-order carry-over effects, and assumes there are no second- or higher-order carry-over effects. In fact, we recognize that in practice many data analysts do not even include the first-order carry-over effect $\lambda$ in the model. The results and conclusions of the paper remain the same, however.

The standard analysis of variance (ANOVA) method can be applied to analyse model (1′) when the response variable $y$ is continuous. Consider the two hypotheses $H_{0\tau}$: $\tau_1 = \cdots = \tau_t$ and $H_{0\lambda}$: $\lambda_1 = \cdots = \lambda_t$, which are of interest most of the time. In ANOVA models, the F-ratio used for the hypothesis test is given by

$$f = \frac{\mathrm{MSR}}{\mathrm{MSE}} \tag{2}$$

where $\mathrm{MSE} = y'Ey / \mathrm{rank}(E)$, $E = I - X(X'X)^{-}X'$, and $X = (1_{np} \mid 1_p \otimes I_n \mid X_\pi \mid X_\tau \mid X_\lambda)$ represents the full design matrix; $\mathrm{MSR} = y'Ry / \mathrm{rank}(R)$, $R = X(X'X)^{-}X' - M(M'M)^{-}M'$, with $M = (1_{np} \mid 1_p \otimes I_n \mid X_\pi \mid X_\lambda)$ when testing the hypothesis of no treatment effects ($H_{0\tau}$), and $M = (1_{np} \mid 1_p \otimes I_n \mid X_\pi \mid X_\tau)$ when testing the hypothesis of no carry-over effects ($H_{0\lambda}$). ANOVA procedures, such as Proc GLM in SAS™, conduct F-tests following the distribution assumption

$$f \sim F(\mathrm{rank}(R), \mathrm{rank}(E)) \tag{3}$$

(under the corresponding null hypothesis). The distribution assumption (3) is valid when $y$ follows a multi-normal distribution with a so-called 'compound symmetry' variance-covariance matrix (that is, with equal variances and equal covariances) [4]:

$$y \sim \mathrm{MVN}\bigl(X\beta,\ (\sigma_1^2 I_p + \sigma_2^2 J_p) \otimes I_n\bigr) \tag{4}$$

where $\beta = (\mu, s', \pi', \tau', \lambda')'$ denotes the parameter vector of model (1′), MVN denotes the multi-normal distribution and $J_p$ denotes a $p \times p$ matrix of ones. It has been known for some time that the assumptions in (4) are unnecessarily stringent for deriving valid F-tests in most circumstances [4, 5]. We next demonstrate briefly how these assumptions can be weakened.

It can be shown that model (1′) is equivalent to the model

$$Dy = DX\beta + D\varepsilon \tag{5}$$

in terms of the OLS estimates for period, treatment and carry-over effects, where $D = D_p \otimes I_n$ and $D_p = I_p - \frac{1}{p} J_p$. The $D$ transformation standardizes the response $y$ over different periods for each subject by eliminating the nuisance parameters $\mu$ and $s$. Letting $\beta^* = (\pi', \tau', \lambda')'$, (5) can be rewritten as

$$Dy = \tilde{X}\beta^* + D\varepsilon \tag{5'}$$

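where $\tilde{X}$ represents the transformed design matrix $DX$ after truncating its (zero) columns for $\mu$ and $s$. The corresponding matrices $E$ and $R$ in (2) and (3) can be rewritten as $\tilde{E} = D - \tilde{X}(\tilde{X}'\tilde{X})^{-}\tilde{X}'$ and $\tilde{R} = \tilde{X}(\tilde{X}'\tilde{X})^{-}\tilde{X}' - \tilde{M}(\tilde{M}'\tilde{M})^{-}\tilde{M}'$, respectively, where $\tilde{M}$ denotes the matrix $DM$ after truncating its (zero) columns for $\mu$ and $s$. It can be shown that the OLS residuals of model (1′) are identical to those of model (5); $\mathrm{rank}(\tilde{E}) = \mathrm{rank}(E)$, $\mathrm{rank}(\tilde{R}) = \mathrm{rank}(R)$, $\widetilde{\mathrm{MSE}} = (Dy)'\tilde{E}(Dy) / \mathrm{rank}(\tilde{E}) = \mathrm{MSE}$, and $\widetilde{\mathrm{MSR}} = (Dy)'\tilde{R}(Dy) / \mathrm{rank}(\tilde{R}) = \mathrm{MSR}$.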
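The way $D$ removes the nuisance parameters, and the idempotency of $D$, both follow from the mixed-product property of the Kronecker product. The short derivation below is a sketch that assumes $1_{np} = 1_p \otimes 1_n$, which is the ordering implied by the Kronecker forms in (1′).

```latex
\begin{align*}
D_p 1_p &= 1_p - \tfrac{1}{p} J_p 1_p = 1_p - 1_p = 0, \\
D\,1_{np} &= (D_p \otimes I_n)(1_p \otimes 1_n) = (D_p 1_p) \otimes 1_n = 0, \\
D\,(1_p \otimes I_n) &= (D_p 1_p) \otimes I_n = 0, \\
D_p^2 &= I_p - \tfrac{2}{p} J_p + \tfrac{1}{p^2} J_p^2 = I_p - \tfrac{1}{p} J_p = D_p
  \quad (\text{since } J_p^2 = p J_p), \\
D^2 &= D_p^2 \otimes I_n = D .
\end{align*}
```

The second and third lines show that the columns of $DX$ belonging to $\mu$ and $s$ are zero, which is why they can be truncated without changing the fit.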

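Noting that $D^2 = D$ and $D\tilde{X} = \tilde{X}$, the F-tests for period, treatment and carry-over effects (based on (3)) are thus valid as long as

$$D\varepsilon \sim \mathrm{MVN}(0, \sigma^2 D)$$

or equivalently

$$Dy \sim \mathrm{MVN}(\tilde{X}\beta^*, \sigma^2 D) \tag{6}$$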

where $\sigma^2 = \tfrac{1}{2}\operatorname{var}(y_{ijk} - y_{ijl})$ ($k \neq l$) is an unknown constant, and again $D = (I_p - \frac{1}{p} J_p) \otimes I_n$. Condition (6) is weaker than condition (4) in two respects. First, the MVN requirement on $Dy$ is weaker than that on $y$: assuming the distribution of two variables to be bivariate normal is always stronger than assuming the difference of the two variables to be normal. Secondly, the required structure for $\operatorname{var}(y)$ is more general in (6). It can be shown that $\operatorname{var}(Dy) = \sigma^2 D$ is equivalent to $\operatorname{var}(y) = \mathrm{DIAG}(H_{ij}(\sigma^2))_{n \times n}$, where DIAG represents the diagonal matrix pattern and $H_{ij}(\sigma^2)$ denotes the variance-covariance matrix of subject $ij$, satisfying the following constraints:

$$H_{ij}(\sigma^2) = \begin{pmatrix}
\sigma^2 + 2\gamma_{1(ij)} & \gamma_{1(ij)} + \gamma_{2(ij)} & \cdots & \gamma_{1(ij)} + \gamma_{p(ij)} \\
\gamma_{1(ij)} + \gamma_{2(ij)} & \sigma^2 + 2\gamma_{2(ij)} & \cdots & \gamma_{2(ij)} + \gamma_{p(ij)} \\
\cdots & \cdots & \cdots & \cdots \\
\gamma_{1(ij)} + \gamma_{p(ij)} & \gamma_{2(ij)} + \gamma_{p(ij)} & \cdots & \sigma^2 + 2\gamma_{p(ij)}
\end{pmatrix} \tag{7}$$

(7) is referred to as a type H matrix with constant $\sigma^2$ [4]. The type H structure involves $p + 1$ parameters and allows heteroscedasticity across periods. It is clearly more general than the 'compound symmetry' structure, which only contains two parameters; $H_{ij}(\sigma^2)$ reduces to a compound symmetry matrix if and only if $\gamma_{1(ij)} = \cdots = \gamma_{p(ij)}$. Furthermore, note that the $\gamma_{k(ij)}$ ($k = 1, \ldots, p$) are arbitrary values satisfying $2\gamma_{k(ij)} > -\sigma^2$ and can differ between subjects $ij$. It is thus not necessary for different subjects to have the same variance-covariance matrix in a cross-over study to validate the F-tests in ANOVA models: $\operatorname{var}(y) = \mathrm{DIAG}(H_{ij}(\sigma^2))$ rather than $\operatorname{var}(y) = H(\sigma^2) \otimes I_n$ (although $\operatorname{var}(Dy) = \sigma^2 D_p \otimes I_n$). For example,

$$H(1) = \begin{pmatrix} 1.50 & 0.75 & 0.50 \\ 0.75 & 2.00 & 0.75 \\ 0.50 & 0.75 & 1.50 \end{pmatrix}$$

and

$$H(1) = \begin{pmatrix} 2.00 & 0.75 & 1.00 \\ 0.75 & 1.50 & 0.75 \\ 1.00 & 0.75 & 2.00 \end{pmatrix}$$

are both type H matrices with $\sigma^2 = 1$, and both matrices can serve as the variance-covariance matrix for different subjects of a three-period cross-over study.

Condition (6) is more than sufficient to validate the F-tests following (3). It can be proved that, given the MVN distribution of $Dy$, $\operatorname{var}(Dy) = \sigma^2 D$ is also necessary to validate the F-distribution claimed in (3) [4]. Hence, we claim (6) as a conventionally minimum set of assumptions that validate the F-tests (on period, treatment and carry-over effects) of ANOVA models for cross-over studies.
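The type H examples given earlier in this section can be checked numerically. Under the structure in (7), every pairwise difference between periods has variance $h_{kk} + h_{ll} - 2h_{kl} = 2\sigma^2$, so a matrix is type H exactly when this quantity is the same for all pairs. The following Python sketch uses that characterization; the function name `type_h_sigma2` and the counter-example matrix are illustrative assumptions, not part of the paper's SAS macros.

```python
from itertools import combinations


def type_h_sigma2(h, tol=1e-9):
    """Return sigma^2 if the symmetric matrix h is type H, otherwise None.

    Uses the equal-variance-of-differences characterization:
    (h[k][k] + h[l][l] - 2 * h[k][l]) / 2 must be constant over all k < l.
    """
    p = len(h)
    halves = [
        (h[k][k] + h[l][l] - 2.0 * h[k][l]) / 2.0
        for k, l in combinations(range(p), 2)
    ]
    if max(halves) - min(halves) > tol:
        return None
    return halves[0]


# The two example matrices from the text.
h_first = [[1.50, 0.75, 0.50], [0.75, 2.00, 0.75], [0.50, 0.75, 1.50]]
h_second = [[2.00, 0.75, 1.00], [0.75, 1.50, 0.75], [1.00, 0.75, 2.00]]

# Hypothetical counter-example with a serial-correlation-like pattern.
h_not_type_h = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.9], [0.1, 0.9, 1.0]]

print(type_h_sigma2(h_first))       # sigma^2 for the first example
print(type_h_sigma2(h_second))      # sigma^2 for the second example
print(type_h_sigma2(h_not_type_h))  # None: differences have unequal variances
```

This check only examines the pattern of variances and covariances; it does not verify that the matrix is a valid (positive definite) covariance matrix.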

3. ASSUMPTION VERIFICATION AND VIOLATION REMEDY

Let $y_{ij}$ denote the responses of subject $ij$ at its $p$ different periods, $y_{ij} = (y_{ij1}, \ldots, y_{ijp})'$, and let $z_{ij}$ denote the standardized responses of subject $ij$ (over different periods), with $z_{ij} = D_p y_{ij} = y_{ij} - \frac{1}{p}\sum_{k=1}^{p} y_{ijk}$. Condition (6) is thus equivalent to

$$\operatorname{cov}(z_{ij}, z_{i'j'}) = 0 \tag{8}$$

and

$$z_{ij} \sim \mathrm{MVN}(\mu_i, \sigma^2 D_p) \tag{9}$$



for all $i, i' = 1, \ldots, s$, $j = 1, \ldots, n_i$ and $j' = 1, \ldots, n_{i'}$ with $i' \neq i$ or $j' \neq j$, where $\mu_i$ denotes the unknown mean of $z_{ij}$ within sequence $i$.

Conditions (8) and (9) explicitly describe the theoretical requirements on observations of cross-over studies in order to validate the F-tests of the corresponding ANOVA models. In practice, however, we can barely test these assumptions exactly as they are stated. Usually we omit the verification of assumption (8), as it is reasonable to assume that randomly selected different subjects are independent of each other, which implies $\operatorname{cov}(z_{ij}, z_{i'j'}) = 0$ directly. The verification of assumption (9) can be conducted to different degrees in practice, mostly depending on the sample size of the cross-over study. A practical procedure is discussed next.

3.1. The multi-normality of zij

Under the assumption of $z_{ij} \sim \mathrm{MVN}$, the OLS residuals of model (1′) (or equivalently the OLS residuals of model (5)), say $\hat{\varepsilon}_{ij} = (\hat{\varepsilon}_{ij1}, \ldots, \hat{\varepsilon}_{ijp})'$ ($i = 1, \ldots, s$ and $j = 1, \ldots, n_i$), should also be multi-normally distributed, as

$$\hat{\varepsilon}_{ij} = z_{ij} - f_i(Dy) \tag{10}$$

where $f_i$ denotes a linear function for sequence $i$. The form of $f_i$ may differ with different cross-over designs. (For a 2 × 2 cross-over study, it is known that $f_i(Dy) = \frac{1}{n_i}\sum_{j=1}^{n_i} z_{ij}$.)

Any significant non-MVN features of $\hat{\varepsilon}_{ij}$ may imply either non-MVN of $z_{ij}$ within one or more sequences or heteroscedasticity of $z_{ij}$ across different sequences. Furthermore, it can be shown that $\hat{\varepsilon}_{ij} \sim \mathrm{MVN}$ is equivalent to $\hat{\varepsilon}_{ij}^{*} = (\hat{\varepsilon}_{ij1}, \ldots, \hat{\varepsilon}_{ij(p-1)})' \sim \mathrm{MVN}$, due to the fact that $\sum_{k=1}^{p} \hat{\varepsilon}_{ijk} = 0$. Specifically, for a 2 × 2 cross-over trial, the MVN verification reduces to verifying the univariate normality (UVN) of the residuals from period 1 (or 2).

There is a large amount of literature that discusses methods of assessing the MVN of a random vector. Readers can refer to Looney [6] for a review of MVN tests based on UVN test results. Royston's H-test [7] is recommended as it is fairly easy to apply (it is developed from the results of the Shapiro–Wilk UVN test) while maintaining adequate power in small sample cases (see reference [7] for details).
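For the 2 × 2 case, where $f_i(Dy)$ in (10) is the sequence mean of the $z_{ij}$, the residuals can be sketched in a few lines of Python. The responses below are hypothetical values chosen for illustration, and the function names are assumptions rather than anything defined in the paper.

```python
def standardize(y):
    """z_ij = D_p y_ij: centre one subject's responses over its p periods."""
    mean = sum(y) / len(y)
    return [value - mean for value in y]


def residuals_2x2(responses_by_sequence):
    """Residuals per (10) for a 2 x 2 design: z_ij minus the sequence mean of z."""
    residuals = {}
    for sequence, subjects in responses_by_sequence.items():
        z = [standardize(y) for y in subjects]
        n_i = len(z)
        p = len(z[0])
        z_bar = [sum(z_j[k] for z_j in z) / n_i for k in range(p)]
        residuals[sequence] = [
            [z_j[k] - z_bar[k] for k in range(p)] for z_j in z
        ]
    return residuals


# Hypothetical responses (period 1, period 2) for sequences AB and BA.
data = {
    "AB": [[10.0, 12.0], [11.0, 15.0], [9.0, 10.0]],
    "BA": [[14.0, 11.0], [13.0, 12.0]],
}
res = residuals_2x2(data)

# Period-1 residuals pooled over sequences: the input to a UVN test.
period1 = [r[0] for sequence in res.values() for r in sequence]
print(period1)
```

Because the two residuals of each subject sum to zero, only one period carries information, which is why a single UVN test suffices here. The `period1` list could then be passed to a Shapiro–Wilk routine such as `scipy.stats.shapiro`, if SciPy is available.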

3.2. The structure of var(zij)

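Denote $\operatorname{var}(z_{ij})$ by $V$. It can be shown that $V = \sigma^2 D_p$ is equivalent to $V$ being a type H matrix with constant $\sigma^2$, as $V = D_p' \operatorname{var}(y_{ij}) D_p$. The hypothesis of $V = \sigma^2 D_p$ can thus be tested by using Mauchly's sphericity test [8]. Mauchly's sphericity test is available in SAS™ Proc GLM, where $V$ is estimated (pooled across all sequences) by

$$\hat{V} = \frac{1}{n - s} \sum_{i=1}^{s} \sum_{j=1}^{n_i} \Bigl(z_{ij} - \frac{1}{n_i}\sum_{l=1}^{n_i} z_{il}\Bigr) \Bigl(z_{ij} - \frac{1}{n_i}\sum_{l=1}^{n_i} z_{il}\Bigr)' \tag{11}$$

which is identical to

$$\hat{V} = \frac{1}{n - s} \sum_{i=1}^{s} \sum_{j=1}^{n_i} \Bigl(\hat{\varepsilon}_{ij} - \frac{1}{n_i}\sum_{l=1}^{n_i} \hat{\varepsilon}_{il}\Bigr) \Bigl(\hat{\varepsilon}_{ij} - \frac{1}{n_i}\sum_{l=1}^{n_i} \hat{\varepsilon}_{il}\Bigr)' \tag{11'}$$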



due to (10). Rejecting the type H structure of $V$ could imply either a non-type H variance-covariance matrix within one or more sequences or heteroscedasticity (different $\sigma^2$) across different sequences.

When the sample size ($n_i$) is large enough, one may further verify the type H assumption for $\operatorname{var}(z_{ij})$ within each sequence and then test homoscedasticity of $\sigma^2$ across different sequences: it is possible that $\operatorname{var}(z_{ij})$ within one sequence is not type H, or differs between sequences, yet $V$ demonstrates perfect H structure. The variance-covariance matrix of $z_{ij}$ within sequence $i$ can then be estimated by

$$\widehat{\operatorname{var}}(z_{ij}) = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} \Bigl(\hat{\varepsilon}_{ij} - \frac{1}{n_i}\sum_{l=1}^{n_i} \hat{\varepsilon}_{il}\Bigr) \Bigl(\hat{\varepsilon}_{ij} - \frac{1}{n_i}\sum_{l=1}^{n_i} \hat{\varepsilon}_{il}\Bigr)' \tag{12}$$

Specifically, a 2 × 2 variance-covariance matrix is always a type H matrix. Thus for a 2 × 2 cross-over study, there is no need to verify the structure of its variance-covariance matrix. (For a large 2 × 2 cross-over study, one may verify whether the $\sigma^2$'s within the two different sequences are the same.)
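The pooled estimate in (11) is straightforward to compute directly from the raw responses. The Python sketch below centres each subject over periods, subtracts the sequence mean, and divides the summed outer products by $n - s$. The data layout (a dictionary from sequence label to a list of per-subject response lists) and the numbers themselves are assumptions made for illustration only.

```python
def centre(y):
    """z_ij = D_p y_ij: subtract the subject's mean over periods."""
    mean = sum(y) / len(y)
    return [value - mean for value in y]


def pooled_v(responses_by_sequence):
    """Pooled estimate of var(z_ij) following (11), with divisor n - s."""
    z_by_sequence = {
        sequence: [centre(y) for y in subjects]
        for sequence, subjects in responses_by_sequence.items()
    }
    p = len(next(iter(z_by_sequence.values()))[0])
    n = sum(len(z) for z in z_by_sequence.values())
    s = len(z_by_sequence)

    total = [[0.0] * p for _ in range(p)]
    for z in z_by_sequence.values():
        z_bar = [sum(z_j[k] for z_j in z) / len(z) for k in range(p)]
        for z_j in z:
            d = [z_j[k] - z_bar[k] for k in range(p)]
            for a in range(p):
                for b in range(p):
                    total[a][b] += d[a] * d[b]
    return [[value / (n - s) for value in row] for row in total]


# Hypothetical three-period responses, two subjects per sequence.
example = {
    "ABC": [[120.0, 115.0, 110.0], [130.0, 118.0, 121.0]],
    "BCA": [[125.0, 119.0, 117.0], [118.0, 120.0, 111.0]],
}
v_hat = pooled_v(example)
for row in v_hat:
    print([round(value, 2) for value in row])
```

Since every $z_{ij}$ sums to zero over periods, each row of the resulting matrix also sums to zero, which gives a quick sanity check on any reported estimate of $V$.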

3.3. Violation remedy

Studies have shown that OLS-based F-tests are quite robust against light or moderate departure from the normality assumption of $z_{ij}$ [9]. Only when heavy departure from MVN occurs may the user need to consider proper transformations or non-parametric methods [9, 10]. However, the OLS-based tests are not robust against a non-type H structure of $\operatorname{var}(z_{ij})$, and proper adjustment is needed when departure is detected [11, 12]. Bellavance et al. [12] found that the intuitive empirical generalized least squares method, which applies the generalized least squares method based on an empirical estimate of $\operatorname{var}(y_{ij})$, tends to be too liberal most of the time. They recommended using the modified F-test instead; that is, given the MVN distribution of $z_{ij}$, the distribution of the F-ratio in (2) can be generally approximated by $bF(d_1, d_2)$, with

$$b = \frac{\mathrm{rank}(\tilde{E})\,\mathrm{Tr}(\tilde{R}\Sigma)}{\mathrm{rank}(\tilde{R})\,\mathrm{Tr}(\tilde{E}\Sigma)}, \qquad d_1 = \frac{[\mathrm{Tr}(\tilde{R}\Sigma)]^2}{\mathrm{Tr}[(\tilde{R}\Sigma)^2]}, \qquad d_2 = \frac{[\mathrm{Tr}(\tilde{E}\Sigma)]^2}{\mathrm{Tr}[(\tilde{E}\Sigma)^2]} \tag{13}$$

where $\Sigma = \operatorname{var}(Dy)$ and $\mathrm{Tr}(\tilde{R}\Sigma)$ denotes the trace of the matrix $\tilde{R}\Sigma$. (When $\operatorname{var}(z_{ij}) = \sigma^2 D_p$ for all $ij$, that is, $\Sigma = \sigma^2 D$, (13) reduces to (3).) Bellavance et al. [12] found by Monte Carlo simulation that the modified F-test based on (13) gave very good control of the type I error over a variety of variance-covariance structures. We should note that (13) depends not only on the variance-covariance matrix $\Sigma$, but also on the specific design matrix of the cross-over study. The adjustment factors provided by SAS™ Proc GLM, such as the Greenhouse and Geisser factor and the Huynh and Feldt factor, are not reliable in adjusting the F-tests for cross-over studies, as those factors are derived from a simple one-way repeated measurement study [13].

Similarly, we can also derive robust interval estimates for any contrast regardless of the structure of $\operatorname{var}(Dy)$. Let $c$ be an arbitrary contrast vector for parameter $\beta^*$, and let $c'\beta^*$ denote the parameter of interest. For model (1′), the OLS estimate for $c'\beta^*$ is

$$c'\hat{\beta}^* = c'(\tilde{X}'\tilde{X})^{-}\tilde{X}'Dy = h'Dy$$

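where $h' = c'(\tilde{X}'\tilde{X})^{-}\tilde{X}'$. Assuming $\operatorname{var}(Dy) = \sigma^2 D$, SAS™ Proc GLM estimates $\sigma^2$ by $\mathrm{MSE} = y'Ey / \mathrm{rank}(E) = (Dy)'\tilde{E}(Dy) / \mathrm{rank}(\tilde{E}) = \widetilde{\mathrm{MSE}}$, and $\operatorname{var}(c'\hat{\beta}^*) = h' \operatorname{var}(Dy)\, h$ by

$$\mathrm{MSE} \times c'(\tilde{X}'\tilde{X})^{-}c \tag{14}$$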

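since $\tilde{X}'D\tilde{X} = \tilde{X}'\tilde{X}$ and thus $h'Dh = c'(\tilde{X}'\tilde{X})^{-}c$. For a general variance-covariance matrix $\operatorname{var}(Dy) = \Sigma$, it can be shown that

$$\frac{c'\hat{\beta}^* - c'\beta^*}{\sqrt{(14)}} \sim t(d_2)\,\sqrt{\frac{\mathrm{rank}(\tilde{E})\, h'\Sigma h}{\mathrm{Tr}[\tilde{E}\Sigma]\; c'(\tilde{X}'\tilde{X})^{-}c}} \tag{15}$$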

When $\Sigma = \sigma^2 D$, (15) reduces to the inference applied in SAS™ Proc GLM, that is,

$$\frac{c'\hat{\beta}^* - c'\beta^*}{\sqrt{(14)}} \sim t(\mathrm{rank}(\tilde{E}))$$

Note that in formulae (13) and (15), the matrices $\tilde{E}$, $\tilde{R}$ and $\tilde{X}$ can all be directly derived from the design matrix $X$. Only $\Sigma = \operatorname{var}(Dy)$ is unknown. For the calculations in (13) and (15), it suffices to estimate $\Sigma$ by $\hat{V} \otimes I_n$.

So far we have provided a verification/remedy procedure in a practical manner. The procedure is summarized in Figure 1. We have developed SAS™ macros to conduct the full verification/remedy procedure. The code is available from the first author upon request. In the next section, we illustrate the verification/remedy procedure through a real-life data set.
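The adjustment quantities in (13) need only matrix products and traces. The Python sketch below is not the authors' SAS macro; it is a minimal illustration in which the matrices `r_toy`, `e_toy` and the two choices of $\Sigma$ are hypothetical placeholders (simple diagonal projections rather than matrices derived from a real cross-over design), chosen so the arithmetic is easy to follow.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def modified_f_factors(r, e, sigma, rank_r, rank_e):
    """Return (b, d1, d2) as defined in formula (13)."""
    r_sigma = matmul(r, sigma)
    e_sigma = matmul(e, sigma)
    b = rank_e * trace(r_sigma) / (rank_r * trace(e_sigma))
    d1 = trace(r_sigma) ** 2 / trace(matmul(r_sigma, r_sigma))
    d2 = trace(e_sigma) ** 2 / trace(matmul(e_sigma, e_sigma))
    return b, d1, d2


# Hypothetical toy projections: rank(R) = 1, rank(E) = 2.
r_toy = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
e_toy = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Spherical case: the factors fall back to b = 1, d1 = rank(R), d2 = rank(E).
sigma_spherical = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(modified_f_factors(r_toy, e_toy, sigma_spherical, 1, 2))

# Unequal variances: b moves away from 1 and d2 drops below rank(E).
sigma_unequal = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 3.0]]
print(modified_f_factors(r_toy, e_toy, sigma_unequal, 1, 2))
```

With `b`, `d1` and `d2` in hand, an adjusted p-value would be the upper tail of $F(d_1, d_2)$ evaluated at $f/b$, for instance via `scipy.stats.f.sf(f / b, d1, d2)` if SciPy is available. The same traces supply the scaling factor and the degrees of freedom $d_2$ needed in (15).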

4. EXAMPLE

Here we use a real data set to illustrate how to use the procedure developed in Section 3 to achieve a robust analysis for a cross-over study. The data can be found in reference [1]. In this data set, three drugs, referred to as A, B and C, were compared in terms of their effect on systolic blood pressure. In the trial, 12 subjects were randomly allocated to one of the six possible treatment sequences, ABC, ACB, BCA, BAC, CAB and CBA, such that each sequence has 2 subjects. The patients' systolic blood pressures were measured multiple times pre- and post-treatment with A, B or C. Only the data for the time period 60 minutes post-dose are analysed here. Preliminary analysis from SAS™ Proc GLM suggests that there is a significant treatment effect (P = 0.013) and no evidence for a carry-over effect (P = 0.680). To validate the results, we need to verify the underlying assumptions:

(i) Test for multi-normality. The Shapiro–Wilk W test shows no significant departure from UVN for the residuals from periods 1 and 2 (P = 0.361 and P = 0.230, respectively). Proceeding with Royston's H test [7] for MVN evaluation, no significant departure from MVN is found for the residuals (P = 0.256). We thus conclude that the MVN assumption of $z_{ij}$ is well satisfied.



[Figure 1 (flow chart, not reproduced). Boxes, in order: fit the data with model (1), with or without carry-over effect, and obtain the residuals; test the MVN of the residuals by Royston's H test (reduces to a UVN test for a 2 × 2 study); if MVN is strongly rejected, use a transformation or a non-parametric method; if the number of periods p > 2, estimate V by (11′) and test for type H structure; if type H is rejected, adjust the results by formulae (13) and (15); for a large sample study, test type H within each sequence and test σ² across sequences; otherwise conclude with the original ANOVA outputs.]

Figure 1. A flow chart of the proposed verification/remedy procedure.

(ii) Test for variance-covariance structure. The average variance-covariance matrix of the standardized response variable ($z_{ij}$) can be estimated by applying formula (11′), that is,

$$\hat{V} = \begin{pmatrix} 28.33 & 9.17 & -37.50 \\ 9.17 & 7.42 & -16.58 \\ -37.50 & -16.58 & 58.08 \end{pmatrix}$$

The variance-covariance matrix demonstrates obvious departure from the expected pattern of $\sigma^2 D_3$. The departure is confirmed by Mauchly's sphericity test, with p-value = 0.015. Adjustment of the original F-tests is recommended.
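For three periods, Mauchly's test can be sketched with the standard library alone, because the reference chi-square distribution then has 2 degrees of freedom and its upper tail is $\exp(-x/2)$. The sketch below rests on two assumptions that are not stated in the text: the error degrees of freedom are taken as $n - s = 6$, and the (3,3) entry of the matrix is entered as 54.08 rather than the printed 58.08, so that the third row sums to zero as it must for a covariance matrix of period-centred responses. The function name `mauchly_p3` is hypothetical.

```python
import math


def mauchly_p3(v, error_df):
    """Mauchly's W and approximate p-value for a 3 x 3 estimate of var(z_ij)."""
    # Orthonormal contrasts spanning the space orthogonal to the vector of ones.
    c1 = [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]
    c2 = [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)]

    def quad(a, b):
        return sum(a[i] * v[i][j] * b[j] for i in range(3) for j in range(3))

    u11, u12, u22 = quad(c1, c1), quad(c1, c2), quad(c2, c2)
    # W = det(U) / (trace(U) / d)^d with d = p - 1 = 2 contrasts.
    w = (u11 * u22 - u12 ** 2) / ((u11 + u22) / 2) ** 2
    # Small-sample correction term (2d^2 + d + 2) / (6d) equals 1 when d = 2.
    chi2 = -(error_df - 1) * math.log(w)
    # Chi-square with d(d + 1)/2 - 1 = 2 df: P(X > x) = exp(-x / 2).
    p_value = math.exp(-chi2 / 2)
    return w, p_value


# Estimate from the example; V[2][2] = 54.08 is an assumption (see lead-in).
v_example = [
    [28.33, 9.17, -37.50],
    [9.17, 7.42, -16.58],
    [-37.50, -16.58, 54.08],
]
w, p_value = mauchly_p3(v_example, error_df=6)
print(round(w, 4), round(p_value, 4))  # p-value is close to the reported 0.015
```

Under these assumptions the sketch gives a p-value near 0.015, in line with the value quoted above; with a different choice of error degrees of freedom the result would change.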



Table I. P-values of F-tests before/after adjustment.

F-test        Treatment effect    Carry-over effect
Unadjusted    0.013               0.680
Adjusted      0.057               0.774

Table II. Ninety-five per cent confidence intervals for treatment effects.

              A − C           B − C          A + B − 2C
Unadjusted    (−1.5, 10.8)    (3.6, 15.9)    (5.9, 23.0)
Adjusted      (−3.2, 12.5)    (1.9, 17.6)    (3.5, 25.4)

Applying formula (13), Table I provides the adjusted F-test results for the hypotheses on treatment effect and carry-over effect (also see reference [12]). In contrast to the unadjusted results, a somewhat different interpretation emerges: the treatment effect is no longer significant at the 5 per cent level after adjustment (P = 0.057 versus P = 0.013).

Applying formula (15), Table II gives the adjusted confidence intervals for some contrasts on treatment effects.

5. CONCLUSIONS AND DISCUSSIONS

In this paper we show that expressions (8) and (9) can be considered as a minimal set of assumptions that are theoretically needed to validate the OLS-based F-tests of ANOVA models for cross-over studies. Based on the theoretical developments, a procedure from verification to remedy is provided in a practical manner. We focus our discussion on small sample cases, which are very often seen in the early phases of clinical trials. By applying this verification/remedy procedure, more robust analysis results can be expected from the ANOVA models.

Finally, we need to mention that model (1′) for a cross-over study can also be analysed by the mixed-model approach, available in Proc Mixed in SAS™. Considered a more contemporary tool, the mixed-model approach has the major advantage that a bigger family of variance-covariance structures is allowed. When the variance-covariance structure clearly departs from type H and suggests a different pattern, say AR(1), the mixed-model approach can be applied instead of an adjustment to achieve a more powerful analysis. However, a major drawback of the mixed-model approach, as implemented in SAS™ Proc Mixed, is that all hypothesis tests on the $\beta$ parameters are developed based on large sample theory. When the sample size is small, which is usually the case for early phase cross-over studies, inference based on asymptotic theory is quite tenuous. This is also the reason that the ANOVA method, as conducted by SAS™ Proc GLM, has continued to be used in analysing cross-over studies, especially in small sample cases.



ACKNOWLEDGEMENT

The authors are grateful to Professor Bellavance for his help in developing the SAS macro.

REFERENCES

1. Jones B, Kenward MG. Design and Analysis of Cross-over Trials. Chapman and Hall: London, 1998.
2. Ratkowsky DA, Evans MA, Alldredge JR. Cross-over Experiments. Marcel Dekker, Inc.: New York, 1993.
3. Scheffé H. The Analysis of Variance. Wiley: New York, 1959.
4. Huynh H, Feldt L. Conditions under which mean square ratios in repeated measurements designs have exact F-distributions. Journal of the American Statistical Association 1970; 65:1582–1589.
5. Rouanet H, Lepine D. Comparison between treatments in a repeated-measurement design: ANOVA and multivariate methods. British Journal of Mathematical and Statistical Psychology 1970; 23(2):147–163.
6. Looney S. How to use tests for univariate normality to assess multivariate normality. American Statistician 1995; 49:64–70.
7. Royston JP. Some techniques for assessing multivariate normality based on the Shapiro–Wilk W. Applied Statistics 1983; 32:121–133.
8. Mauchly JW. Significance test for sphericity of a normal n-variate distribution. Annals of Mathematical Statistics 1940; 11:204–209.
9. Neter J, Wasserman W, Kutner M. Applied Linear Statistical Models. 3rd edn. IRWIN: 1990.
10. Bellavance F, Tardif S. A nonparametric approach to the analysis of three-treatment three-period cross-over designs. Biometrika 1995; 82:865–875.
11. Jan SL, Randles RH. Interdirection tests for simple repeated-measures designs. Journal of American Statistical Association 1996; 91:1611–1618.
12. Bellavance F, Tardif S, Stephens MA. Tests for the analysis of variance of cross-over designs with correlated errors. Biometrics 1996; 52:607–612.
13. Goad CL, Johnson DE. Crossover experiments: a comparison of ANOVA tests and alternative analyses. Journal of Agricultural, Biological, and Environmental Statistics 2000; 5:69–473.

