
Tensor Analysis of ANOVA Decomposition

AKIMICHI TAKEMURA*

Journal of the American Statistical Association, December 1983, Volume 78, Number 384 (Theory and Methods Section), pp. 894-900

The analogy between ANOVA for general n-factor crossed layouts and the ANOVA-type decomposition of square integrable statistics is demonstrated using the notions and notations of tensor analysis and multilinear algebra. A theory of tensors is developed in such a way that (a) it can be immediately applied in computer programs, and (b) it can be easily generalized to $L^2$ spaces.

KEY WORDS: ANOVA; Crossed layout; ANOVA-type decomposition; Tensor product; Direct product; Kronecker product; U statistics.

1. INTRODUCTION

The purpose of this paper is to demonstrate the almost complete analogy between ANOVA for general n-factor crossed layouts and the ANOVA-type decomposition of square integrable statistics used in the literature on U statistics and in connection with the jackknife estimate of variance (Efron and Stein 1981, Bhargava 1980, Karlin and Rinott 1982). This will be done using notions and notations of tensor analysis and multilinear algebra. It will be clear that the latter is just an infinite dimensional generalization of the former. Usually the analogy between the two is understood in an operational sense, namely, how the higher-order interaction terms are defined by an inclusion-exclusion argument. Various subclass means are added and subtracted in the usual ANOVA; various conditional expectations in the second case. Then the orthogonality of the interaction terms is proved. By using tensor analysis we can in a sense reverse the argument. Orthogonal subspaces of an appropriate vector space can be directly described. Only the dimensionality is different in the two cases. The inclusion-exclusion pattern then follows from the form of the orthogonal projectors onto these subspaces.

In ANOVA and experimental design, the tensor approach has been employed by a number of people. It provides a natural and powerful tool for treating general n-factor crossed layouts and other designs. Unfortunately, terminology and notation were not standardized; in particular, what is essentially the same notion has been called tensor, Kronecker, direct, or outer product. Approaches employed were sometimes elementary, sometimes more abstract. This is one of the reasons why this approach has not been very often taught.

In the field of experimental design, Kurkjian and Zelen (1962) introduced a "calculus for factorial arrangements." Following this work there have been many papers using direct product notation for the construction and analysis of various designs, including Kurkjian and Zelen (1963), Zelen and Federer (1964, 1965), Federer and Zelen (1966), Bock (1963), Paik and Federer (1974), Cotter, John, and Smith (1973), Cotter (1974, 1975), and John and Dean (1975a,b). The terminology and notational conventions introduced by Kurkjian and Zelen (1962) seem rather arbitrary. The connection between their "calculus" and standard tensor analysis or multilinear algebra is not made clear. Another drawback is that they confine their theory to the usual matrix theory, and multilinear aspects tend to be lost. For example, they define the direct product of matrices as a partitioned matrix of larger dimensionality (this is still a common practice in statistics today). But this introduces an unpleasant ordering of indices, and the symmetry inherent in the problem becomes obscured.

* Akimichi Takemura is a graduate student at the Department of Statistics, Stanford University, Stanford, CA 94305. Research was supported in part by Office of Naval Research Contract N00014-75-C-0442. The author wishes to thank M.L. Eaton, C. Hirotsu, M. Steele, B. Efron, and a referee for helpful suggestions.

Another group of people employing this technique are found in the coordinate-free approach to linear models, for example Jacobsen (1968), Eaton (1970), and Haberman (1975). Jacobsen (1968) seems to have presented the first systematic treatment of ANOVA from the viewpoint of multilinear algebra. In addition to the new viewpoint, his treatment of the nested model and of the missing observation method is interesting. Unfortunately, his results do not seem to have been published in a more widely available form and have been almost forgotten in the later literature. Furthermore, his treatment suffers from excessive mathematical formalism and arbitrary notational conventions. Later Haberman (1975) gave a thorough treatment that so far can be regarded as a standard reference. One problem with these mathematical treatments is that both the essentially elementary nature of the approach and the practical computational aspects are often difficult to grasp.

In Section 2 we define tensors as multidimensional arrays, as in the usual tensor analysis (Sokolnikoff 1964, Ch. 2). By doing this the unpleasant ordering of indices mentioned above is avoided. Operations on these arrays are explicitly described. In any high-level computer language multidimensional arrays can be used as easily as matrices, so this approach can be immediately incorporated in computer programs. Standard terminology of tensor analysis and of multilinear algebra will be employed. Furthermore, we develop the theory in such a way that it can be easily generalized to $L^2$ spaces.




In Section 3 we briefly look at the general n-factor crossed layout.

In Section 4 we treat the ANOVA-type decomposition of a statistic with finite second moment by generalizing the results of Sections 2 and 3 to $L^2$ spaces. The decomposition was first introduced by Hoeffding (1948) in connection with U statistics. Often the linear terms of this decomposition (corresponding to the main effects in ANOVA) are called the Hajek projection after Hajek (1968) and are used extensively to prove asymptotic normality of various statistics. See Serfling (1980) for further references. Recently, more attention has been paid to the full decomposition. Rubin and Vitale (1980) developed a general asymptotic theory of U statistics using the full decomposition. Efron and Stein (1981) used the decomposition in their study of the jackknife estimate of variance. Further results and generalizations are given in Bhargava (1980) and Karlin and Rinott (1982). Instead of citing a standard reference for the decomposition, each of them gave a definition using variations of an inclusion-exclusion argument. Our approach is different from these, as mentioned at the beginning.

2. TENSOR PRODUCTS OF VECTORS, MATRICES, VECTOR SPACES, AND SUBSPACES

In this section we develop a theory of tensors. Particular references used in this section are Greub (1978, Ch. 1) and Sokolnikoff (1964, Ch. 2). A full abstract treatment can be found in Chapter 1 of Greub (1978). In a purely mathematical treatment tensors are developed in a coordinate-free way. This is elegant but not desirable from the viewpoint of computational applicability in statistics. On the other hand, the traditional tensor analysis (Sokolnikoff 1964) is more practical but is too closely tied to physics, and much emphasis is placed on curvilinear coordinates, which we do not need here. We take the appropriate notions and notations from both of them. Proofs can be found in the various references given here and hence are omitted, with a few exceptions.

Let $R^m$ be the set of all column vectors $x = (x^1, \ldots, x^m)'$ with $m$ real components. To denote the components of a column (or contravariant) vector we use superscripts, following the traditional notation in tensor analysis. Vector addition and scalar multiplication are defined in the usual componentwise way. Now the tensor (Kronecker, direct, outer) product $x \otimes y$ of $x \in R^m$ and $y \in R^n$ is a two-dimensional array defined by componentwise multiplication:

$$(x \otimes y)^{ij} = x^i y^j. \tag{2.1}$$

Namely, $x \otimes y$ is a two-dimensional array of dimensions $m$ and $n$ whose $(i, j)$ element is $x^i y^j$ ($i = 1, \ldots, m$, $j = 1, \ldots, n$). Now we define the addition of tensor products in a componentwise way:

$$(a\,x \otimes y + b\,\tilde{x} \otimes \tilde{y})^{ij} = a x^i y^j + b \tilde{x}^i \tilde{y}^j, \tag{2.2}$$

where $a$, $b$ are scalars. This leads to a vector space generated by $\{x \otimes y,\ x \in R^m,\ y \in R^n\}$, which we denote by $R^m \otimes R^n$. Namely

$$R^m \otimes R^n = \operatorname{span}\{x \otimes y,\ x \in R^m,\ y \in R^n\} = \Bigl\{\sum_{i=1}^k a_i\,({}_i x \otimes {}_i y),\ k \text{ finite}\Bigr\}. \tag{2.3}$$

Here the index $i$ is written to the left of the corresponding vectors $x$ and $y$ because usual subscripts are used as covariant indices in tensor analysis. This point will be discussed later in this section in connection with linear transformations. $R^m \otimes R^n$ is called the tensor product of $R^m$ and $R^n$. A general element $u \in R^m \otimes R^n$ is called simply a tensor.
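As noted in the Introduction, this construction maps directly onto multidimensional arrays in any high-level language. A minimal NumPy sketch of (2.1) (our illustration, not part of the original paper; all names are ours):

```python
import numpy as np

m, n = 3, 4
x = np.arange(1.0, m + 1)        # a vector in R^m
y = np.arange(1.0, n + 1)        # a vector in R^n

# (2.1): the tensor product is the m x n array of componentwise products.
t = np.multiply.outer(x, y)      # t[i, j] = x[i] * y[j]
assert t.shape == (m, n)
assert np.allclose(t, x[:, None] * y[None, :])
```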

As one might expect, $R^m \otimes R^n$ is just the set of all two-dimensional arrays of dimensions $m$ and $n$. We will make this point clear in two propositions.

Lemma 2.1. $x \otimes y$ is bilinear in $x$ and $y$. Namely,

$$(ax + b\tilde{x}) \otimes y = a(x \otimes y) + b(\tilde{x} \otimes y),$$
$$x \otimes (cy + d\tilde{y}) = c(x \otimes y) + d(x \otimes \tilde{y}), \tag{2.4}$$

where $a$, $b$, $c$, $d$ are scalars.

Let ${}^m_i e$ denote the vector in $R^m$ whose $i$th element is one and whose other elements are zero. $\{{}^m_i e,\ i = 1, \ldots, m\}$ forms an obvious basis of $R^m$. Now consider ${}^m_i e \otimes {}^n_j e$, which has 1 in the $(i, j)$ position and 0 everywhere else. Then

Proposition 2.1. $\{{}^m_i e \otimes {}^n_j e,\ i = 1, \ldots, m,\ j = 1, \ldots, n\}$ is a basis of $R^m \otimes R^n$.

Hence the following corollary.

Corollary 2.1. $\dim(R^m \otimes R^n) = mn$, and $R^m \otimes R^n$ coincides with the set of all two-dimensional arrays of dimensions $m$ and $n$.

Remark 2.1. An element $u \in R^m \otimes R^n$ which can be written as $u = x \otimes y$ for some $x \in R^m$, $y \in R^n$ is called decomposable. $R^m \otimes R^n$ does not consist only of decomposable elements. This is easily seen by noting that $x \otimes y$ is of rank 1 in the terminology of the usual matrix theory.

In $R^m$ we have the usual inner product. In $R^m \otimes R^n$ a natural inner product is defined in an analogous way. Let $u, v \in R^m \otimes R^n$. Then we define

$$(u, v) = \sum_{i=1}^m \sum_{j=1}^n u^{ij} v^{ij}. \tag{2.5}$$

Proposition 2.2. For two decomposable elements $x \otimes \tilde{x}$, $y \otimes \tilde{y}$ of $R^m \otimes R^n$ we have

$$(x \otimes \tilde{x},\ y \otimes \tilde{y}) = (x, y)\,(\tilde{x}, \tilde{y}). \tag{2.6}$$

Proof.

$$(x \otimes \tilde{x},\ y \otimes \tilde{y}) = \sum_{i,j} (x \otimes \tilde{x})^{ij} (y \otimes \tilde{y})^{ij} = \sum_i x^i y^i \sum_j \tilde{x}^j \tilde{y}^j = (x, y)\,(\tilde{x}, \tilde{y}).$$
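A numerical check of (2.5) and Proposition 2.2, in the same illustrative NumPy style (our code, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4
x, y = rng.standard_normal(m), rng.standard_normal(m)    # x, y in R^m
xt, yt = rng.standard_normal(n), rng.standard_normal(n)  # x~, y~ in R^n

# (2.5): the natural inner product sums componentwise products.
def inner(u, v):
    return np.sum(u * v)

# (2.6): (x ⊗ x~, y ⊗ y~) = (x, y)(x~, y~) for decomposable elements.
lhs = inner(np.multiply.outer(x, xt), np.multiply.outer(y, yt))
rhs = inner(x, y) * inner(xt, yt)
assert np.isclose(lhs, rhs)
```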



Now we proceed to define tensor products of more than two vectors. Let ${}_ix \in R^{m_i}$, $i = 1, \ldots, k$. Then ${}_1x \otimes \cdots \otimes {}_kx$ is defined to be a $k$-dimensional array of dimensions $m_1, \ldots, m_k$ such that

$$({}_1x \otimes \cdots \otimes {}_kx)^{i_1 \cdots i_k} = {}_1x^{i_1} \cdots {}_kx^{i_k}. \tag{2.7}$$

Addition is defined componentwise, and the space generated by $\{{}_1x \otimes \cdots \otimes {}_kx,\ {}_ix \in R^{m_i},\ i = 1, \ldots, k\}$ is called the tensor product of $R^{m_1}, \ldots, R^{m_k}$ and denoted by $R^{m_1} \otimes \cdots \otimes R^{m_k}$ or $\bigotimes_{i=1}^k R^{m_i}$. Lemma 2.1, Proposition 2.1, and Corollary 2.1 hold for $k > 2$ with obvious modifications. Now for general elements $u$, $v$ of $\bigotimes_{i=1}^k R^{m_i}$ we define the natural inner product by

$$(u, v) = \sum_{i_1=1}^{m_1} \cdots \sum_{i_k=1}^{m_k} u^{i_1 \cdots i_k} v^{i_1 \cdots i_k}. \tag{2.8}$$

Then, analogous to (2.6), for two decomposable elements ${}_1x \otimes \cdots \otimes {}_kx$, ${}_1y \otimes \cdots \otimes {}_ky$ of $\bigotimes_{i=1}^k R^{m_i}$ we have

$$({}_1x \otimes \cdots \otimes {}_kx,\ {}_1y \otimes \cdots \otimes {}_ky) = ({}_1x, {}_1y)\,({}_2x, {}_2y) \cdots ({}_kx, {}_ky). \tag{2.9}$$

Remark 2.2. If ${}_ix$ and ${}_iy$ are orthogonal for some $i$, then ${}_1x \otimes \cdots \otimes {}_kx$ and ${}_1y \otimes \cdots \otimes {}_ky$ are orthogonal.
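The k-fold definitions (2.7)-(2.9) translate just as directly into k-dimensional arrays; a sketch using np.einsum (our choice of tool, with made-up dimensions):

```python
import numpy as np

rng = np.random.default_rng(1)
dims = (2, 3, 4)                         # m_1, m_2, m_3
xs = [rng.standard_normal(m) for m in dims]
ys = [rng.standard_normal(m) for m in dims]

# (2.7): a k-fold decomposable tensor, here k = 3.
tx = np.einsum('i,j,k->ijk', *xs)
ty = np.einsum('i,j,k->ijk', *ys)

# (2.8), (2.9): the inner product of decomposable tensors factors.
lhs = np.sum(tx * ty)
rhs = np.prod([x @ y for x, y in zip(xs, ys)])
assert np.isclose(lhs, rhs)
```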

Next we consider subspaces and their orthogonal complements. Let $U_1, \ldots, U_k$ be subspaces of $R^{m_1}, \ldots, R^{m_k}$, respectively. Then a subspace $U_1 \otimes \cdots \otimes U_k$ of $R^{m_1} \otimes \cdots \otimes R^{m_k}$ is defined to be the subspace generated by $\{{}_1x \otimes \cdots \otimes {}_kx,\ {}_ix \in U_i,\ i = 1, \ldots, k\}$. Namely

$$U_1 \otimes \cdots \otimes U_k = \operatorname{span}\{{}_1x \otimes \cdots \otimes {}_kx,\ {}_ix \in U_i,\ i = 1, \ldots, k\}. \tag{2.10}$$

Let $U_i^\perp$ denote the orthogonal complement of $U_i$ in $R^{m_i}$. For convenience we define $U_i^0 = U_i$, $U_i^1 = U_i^\perp$. Then we have

Theorem 2.1. The $2^k$ subspaces $\{\bigotimes_{i=1}^k U_i^{\epsilon_i},\ \epsilon_i = 0, 1,\ i = 1, \ldots, k\}$ form a decomposition of $\bigotimes_{i=1}^k R^{m_i}$ into mutually orthogonal subspaces.

This is clear by taking appropriate orthonormal bases of $R^{m_i}$, $i = 1, \ldots, k$, and applying Remark 2.2.

Corollary 2.2.

$$(U_1 \otimes \cdots \otimes U_k)^\perp = \operatorname{span}\{\bigotimes_{i=1}^k U_i^{\epsilon_i},\ \epsilon_i = 1 \text{ for some } i\}. \tag{2.11}$$
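Theorem 2.1 and Corollary 2.2 can be checked numerically. The sketch below (ours) takes $U_i = \operatorname{span}\{1\}$, anticipating Section 3, and splits a random two-way array into the $2^2 = 4$ mutually orthogonal pieces; the projectors are applied by the rule (2.13) introduced below:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)
m, n = 3, 4
u = rng.standard_normal((m, n))

def avg(k):
    # Orthogonal projector onto span{1} in R^k (cf. (3.2) below).
    return np.full((k, k), 1.0 / k)

P = {0: [avg(m), avg(n)],                          # epsilon_i = 0: onto U_i
     1: [np.eye(m) - avg(m), np.eye(n) - avg(n)]}  # epsilon_i = 1: onto U_i-perp

pieces = {}
for e1, e2 in product((0, 1), repeat=2):
    # Apply the tensor product of the two projectors, cf. (2.13).
    pieces[e1, e2] = np.einsum('ij,kl,jl->ik', P[e1][0], P[e2][1], u)

# The four pieces are mutually orthogonal and sum back to u.
for a in pieces:
    for b in pieces:
        if a != b:
            assert np.isclose(np.sum(pieces[a] * pieces[b]), 0.0)
assert np.allclose(sum(pieces.values()), u)
```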

Now we define the tensor product of matrices. An $n \times m$ matrix $A$ is considered to represent a linear transformation from $R^m$ to $R^n$. In this sense we want to distinguish matrices from two-dimensional arrays (elements of $R^n \otimes R^m$). In tensor analysis this is done by writing the second index as a subscript. Namely, the $(i, j)$ element of a matrix $A$ is denoted by $A^i_j$. Superscripts are called contravariant indices and subscripts are called covariant indices. The reason behind this is discussed in Remark 2.4. Now let ${}_iA$ be $n_i \times m_i$ matrices, $i = 1, \ldots, k$. We want to define a tensor product of ${}_1A, \ldots, {}_kA$ in a meaningful way.

For notational convenience we first do this for the case $k = 2$. For matrices $A$ ($n_1 \times m_1$) and $B$ ($n_2 \times m_2$) we define $A \otimes B$ as a four-dimensional array with two contravariant indices $i_1, i_2$ and two covariant indices $j_1, j_2$ such that

$$(A \otimes B)^{i_1 i_2}_{j_1 j_2} = A^{i_1}_{j_1}\, B^{i_2}_{j_2}. \tag{2.12}$$

This is again a componentwise multiplication as in (2.1). Now $A \otimes B$ defines a linear transformation from $R^{m_1} \otimes R^{m_2}$ to $R^{n_1} \otimes R^{n_2}$ as follows. Let $u \in R^{m_1} \otimes R^{m_2}$; then $v = (A \otimes B)u \in R^{n_1} \otimes R^{n_2}$ is defined by

$$v^{i_1 i_2} = \sum_{j_1=1}^{m_1} \sum_{j_2=1}^{m_2} (A \otimes B)^{i_1 i_2}_{j_1 j_2}\, u^{j_1 j_2} = \sum_{j_1=1}^{m_1} \sum_{j_2=1}^{m_2} A^{i_1}_{j_1}\, B^{i_2}_{j_2}\, u^{j_1 j_2}. \tag{2.13}$$

As in (2.6) we have the following proposition.

Proposition 2.3. For a decomposable element $x \otimes y$ of $R^{m_1} \otimes R^{m_2}$,

$$(A \otimes B)(x \otimes y) = Ax \otimes By. \tag{2.14}$$

Remark 2.3. We could have used (2.14) as a definition of $A \otimes B$. It shows how $A \otimes B$ maps decomposable elements of $R^{m_1} \otimes R^{m_2}$. Since the decomposable elements generate $R^{m_1} \otimes R^{m_2}$, $A \otimes B$ for general elements can be defined by linearity. This is more elegant mathematically, but for practical applications formula (2.13) will be useful. The same remark applies to Proposition 2.2.

Generalization of the previous argument to the tensor product of more than two matrices is immediate. Instead of (2.13) and (2.14) we have

$$\bigl(({}_1A \otimes \cdots \otimes {}_kA)u\bigr)^{i_1 \cdots i_k} = \sum_{j_1, \ldots, j_k} {}_1A^{i_1}_{j_1} \cdots {}_kA^{i_k}_{j_k}\, u^{j_1 \cdots j_k} \tag{2.15}$$

and

$$({}_1A \otimes \cdots \otimes {}_kA)({}_1x \otimes \cdots \otimes {}_kx) = {}_1A\,{}_1x \otimes \cdots \otimes {}_kA\,{}_kx, \tag{2.16}$$

respectively.
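An illustration of (2.13), (2.14), and (2.16) in the same sketch style (sizes and data made up):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))      # an n1 x m1 matrix
B = rng.standard_normal((6, 4))      # an n2 x m2 matrix
x = rng.standard_normal(3)
y = rng.standard_normal(4)
u = rng.standard_normal((3, 4))      # a general 2-tensor

# (2.13): apply A ⊗ B to u; one summation per covariant index.
v = np.einsum('ij,kl,jl->ik', A, B, u)
assert v.shape == (5, 6)

# (2.14): on decomposable elements, (A ⊗ B)(x ⊗ y) = Ax ⊗ By.
lhs = np.einsum('ij,kl,jl->ik', A, B, np.multiply.outer(x, y))
rhs = np.multiply.outer(A @ x, B @ y)
assert np.allclose(lhs, rhs)
```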

Remark 2.4. $R^m$ was defined as the set of column or contravariant vectors. The dual space $R^{m*}$ can be defined as the set of row or covariant vectors, whose components are denoted with subscripts: $x = (x_1, \ldots, x_m)$. Then $R^{m*} \otimes R^{n*}$, $R^m \otimes R^{n*}$, and so on, can be defined in a similar way as $R^m \otimes R^n$ is defined by (2.1)-(2.3). By the natural isomorphism between $R^m \otimes R^{n*}$ and the space of all linear transformations from $R^n$ to $R^m$ (see Greub 1978, Sec. 1.28), a linear transformation $A$ can be identified with an element of $R^m \otimes R^{n*}$, and it has one contravariant index and one covariant index.

Our last item in this section is a discussion of orthogonal projectors. Let $V$ be a vector space and $U \subset V$ be a subspace. A linear transformation $P_U$ from $V$ to itself is called the orthogonal projector onto $U$ if

$$P_U x = x \quad \text{for } x \in U, \qquad P_U x = 0 \quad \text{for } x \in U^\perp. \tag{2.17}$$

Theorem 2.2. Let $U_i \subset R^{m_i}$, $i = 1, \ldots, k$ be subspaces and $P_{U_i}$ be the orthogonal projectors onto $U_i$ in $R^{m_i}$, $i = 1, \ldots, k$. Then the orthogonal projector onto $\bigotimes_{i=1}^k U_i \subset \bigotimes_{i=1}^k R^{m_i}$ is given by $P_{U_1} \otimes \cdots \otimes P_{U_k}$.

Proof. Let ${}_ix \in U_i$, $i = 1, \ldots, k$. Then by (2.16)

$$(P_{U_1} \otimes \cdots \otimes P_{U_k})({}_1x \otimes \cdots \otimes {}_kx) = P_{U_1}\,{}_1x \otimes \cdots \otimes P_{U_k}\,{}_kx = {}_1x \otimes \cdots \otimes {}_kx. \tag{2.18}$$

Hence for general elements $u$ of $U_1 \otimes \cdots \otimes U_k$ we have $(P_{U_1} \otimes \cdots \otimes P_{U_k})u = u$ by linearity. Now by Corollary 2.2, $(U_1 \otimes \cdots \otimes U_k)^\perp$ is generated by $\{{}_1x \otimes \cdots \otimes {}_kx,\ {}_ix \in U_i^\perp \text{ for some } i\}$. For such ${}_1x \otimes \cdots \otimes {}_kx$,

$$(P_{U_1} \otimes \cdots \otimes P_{U_k})({}_1x \otimes \cdots \otimes {}_kx) = P_{U_1}\,{}_1x \otimes \cdots \otimes P_{U_k}\,{}_kx = 0. \tag{2.19}$$

Hence by linearity

$$(P_{U_1} \otimes \cdots \otimes P_{U_k})\bigl((U_1 \otimes \cdots \otimes U_k)^\perp\bigr) = \{0\}.$$

See Haberman (1975, Lemma 8) for an alternative proof using the fact that $P_{U_1} \otimes \cdots \otimes P_{U_k}$ is idempotent and self-adjoint.
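The idempotent-and-self-adjoint characterization is also easy to verify numerically. The sketch below flattens the tensor product of two averaging projectors into an ordinary matrix purely for the purpose of the check (the paper's point is that one need not flatten in actual computation):

```python
import numpy as np

m, n = 3, 4
F1 = np.full((m, m), 1.0 / m)    # projector onto span{1} in R^m
F2 = np.full((n, n), 1.0 / n)    # projector onto span{1} in R^n

# Flatten F1 ⊗ F2 into an (mn) x (mn) matrix (the Kronecker product).
P = np.einsum('ij,kl->ikjl', F1, F2).reshape(m * n, m * n)

# Idempotent and self-adjoint, hence an orthogonal projector.
assert np.allclose(P @ P, P)
assert np.allclose(P, P.T)
```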

3. ANOVA FOR CROSSED LAYOUTS

Now we take a brief look at ANOVA for an $n$-factor crossed layout with a single observation per cell. For more detailed treatments of various designs see the references given in Section 1. For each combination of factor levels $(i_1, \ldots, i_n)$ we have an observation $x^{i_1 \cdots i_n}$. Therefore the set of observations $\{x^{i_1 \cdots i_n}\}$ can be considered as a (random) tensor $x \in \bigotimes_{i=1}^n R^{m_i}$. ANOVA is essentially a decomposition of $\bigotimes_{i=1}^n R^{m_i}$ into mutually orthogonal subspaces. When all interactions are considered it is decomposed into $2^n$ subspaces. Usually this is done by an inclusion-exclusion argument. Here we give the desired decomposition directly as follows. Let ${}^{m_i}1$ be the vector in $R^{m_i}$ with all components equal to 1. Let $U_i = \operatorname{span}\{{}^{m_i}1\}$ and consider the decomposition of $\bigotimes_{i=1}^n R^{m_i}$ in Theorem 2.1. We use the notational convention of Theorem 2.1. Following Scheffé (1959, Sec. 4.6) let $\mathcal{T}_{i_1 \cdots i_k}$ denote the $(i_1, \ldots, i_k)$-interaction subspace for $1 \le i_1 < \cdots < i_k \le n$, $k = 0, \ldots, n$. We claim that

$$\mathcal{T}_{i_1 \cdots i_k} = U_1^{\epsilon_1} \otimes \cdots \otimes U_n^{\epsilon_n}, \tag{3.1}$$

where

$$\epsilon_i = 1 \text{ if } i \in \{i_1, \ldots, i_k\}, \qquad \epsilon_i = 0 \text{ otherwise}.$$

This can be shown by considering the orthogonal projector onto the right side of (3.1). Note that the orthogonal projector onto $U_i = \operatorname{span}\{{}^{m_i}1\}$ is given (in matrix form) by

$${}_iF = \frac{1}{m_i}\, {}^{m_i}1\, ({}^{m_i}1)'. \tag{3.2}$$

For $x = (x^1, \ldots, x^{m_i})'$ we have ${}_iFx = \bar{x}\, {}^{m_i}1$, where $\bar{x} = (1/m_i)\sum_j x^j$. Furthermore, ${}^{m_i}I - {}_iF$ is the orthogonal projector onto $U_i^\perp$, where ${}^{m_i}I$ denotes the $m_i \times m_i$ identity matrix. Now by Theorem 2.2 the orthogonal projector onto the right side of (3.1) is given by

$$P_{i_1 \cdots i_k} = {}_1Q \otimes \cdots \otimes {}_nQ, \tag{3.3}$$

where

$${}_iQ = {}^{m_i}I - {}_iF \text{ if } i \in \{i_1, \ldots, i_k\}, \qquad {}_iQ = {}_iF \text{ otherwise}.$$

For example, consider $P_{12}$:

$$P_{12} = ({}^{m_1}I - {}_1F) \otimes ({}^{m_2}I - {}_2F) \otimes {}_3F \otimes \cdots \otimes {}_nF$$
$$= {}^{m_1}I \otimes {}^{m_2}I \otimes {}_3F \otimes \cdots \otimes {}_nF - {}_1F \otimes {}^{m_2}I \otimes {}_3F \otimes \cdots \otimes {}_nF$$
$$- {}^{m_1}I \otimes {}_2F \otimes {}_3F \otimes \cdots \otimes {}_nF + {}_1F \otimes {}_2F \otimes {}_3F \otimes \cdots \otimes {}_nF. \tag{3.4}$$

Applying $P_{12}$ to $x$ we obtain the usual expression:

$$(P_{12}x)^{i_1 \cdots i_n} = \bar{x}^{\,i_1 i_2 \cdot \cdots \cdot} - \bar{x}^{\,i_1 \cdot \cdots \cdot} - \bar{x}^{\,\cdot i_2 \cdot \cdots \cdot} + \bar{x}^{\,\cdot \cdot \cdots \cdot}, \tag{3.5}$$

where a dot denotes averaging over the corresponding index. Note that the expansion of the expression for the projector leads to the inclusion-exclusion. This pattern should be clear for general $P_{i_1 \cdots i_k}$. This proves (3.1).

The sum of squares due to the $(i_1, \ldots, i_k)$-interaction for an observed tensor $x$ is given by

$$S_{i_1 \cdots i_k} = (P_{i_1 \cdots i_k}x,\ P_{i_1 \cdots i_k}x). \tag{3.6}$$

Note that the actual computation of (3.6) can be done using (3.3), (2.15), and (2.8).

The degrees of freedom (df) of the $(i_1, \ldots, i_k)$-interaction are given by $\dim(\mathcal{T}_{i_1 \cdots i_k})$. Noting that $\dim(U_i) = 1$ and $\dim(U_i^\perp) = m_i - 1$, we obtain by Corollary 2.1

$$\text{df of } (i_1, \ldots, i_k)\text{-interaction} = \prod_{j=1}^k (m_{i_j} - 1). \tag{3.7}$$
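Putting Section 3 together: the sketch below (our code, with a made-up three-factor layout) forms every interaction projection by (3.3), computes its sum of squares by (3.6) via (2.15) and (2.8), and its df by (3.7), and confirms that the $2^3$ pieces exhaust the total:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(4)
dims = (2, 3, 4)                       # m_1, m_2, m_3
x = rng.standard_normal(dims)          # one observation per cell

labels = 'abc'                         # one einsum index per factor
ss_total, df_total = 0.0, 0
for eps in product((0, 1), repeat=len(dims)):
    # (3.3): Q_i = I - F_i on the interaction factors, F_i elsewhere.
    Qs = [np.eye(m) - np.full((m, m), 1.0 / m) if e else np.full((m, m), 1.0 / m)
          for e, m in zip(eps, dims)]
    # (2.15): apply the tensor product of the Q_i, e.g. 'Aa,Bb,Cc,abc->ABC'.
    spec = ','.join(c.upper() + c for c in labels) + ',' + labels + '->' + labels.upper()
    px = np.einsum(spec, *Qs, x)
    ss_total += np.sum(px * px)        # (3.6), via the inner product (2.8)
    df_total += np.prod([m - 1 if e else 1 for e, m in zip(eps, dims)])

assert np.isclose(ss_total, np.sum(x * x))   # the sums of squares add up
assert df_total == np.prod(dims)             # the df add up to m_1 m_2 m_3
```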

4. ANOVA DECOMPOSITION OF A STATISTIC

In this section we study the ANOVA-type decomposition of a square integrable statistic $S(x_1, \ldots, x_n)$. For this purpose we extend the results of the previous sections to $L^2$ spaces. Particular references used here are Maurin (1967, Sec. 3.10) and Murray and von Neumann (1936, Ch. 2). Let $(X_1, \mu_1), \ldots, (X_n, \mu_n)$ be probability spaces. We consider the $L^2$ space of the product probability space $(X_{i_1}, \mu_{i_1}) \times \cdots \times (X_{i_k}, \mu_{i_k})$:

$$L^2(X_{i_1}, \ldots, X_{i_k}) = \Bigl\{\phi(x_{i_1}, \ldots, x_{i_k}) \ \Big|\ \int \phi^2\, \mu_{i_1}(dx_{i_1}) \cdots \mu_{i_k}(dx_{i_k}) < \infty \Bigr\}. \tag{4.1}$$

This content downloaded from 128.104.1.219 on Fri, 9 Aug 2013 14:36:14 PMAll use subject to JSTOR Terms and Conditions

898 Journal of the American Statistical Association, December 1983

Note that $L^2(X_{i_1}, \ldots, X_{i_k}) \subset L^2(X_{j_1}, \ldots, X_{j_l})$ if $\{i_1, \ldots, i_k\} \subset \{j_1, \ldots, j_l\}$. For simplicity we assume that $X_1, \ldots, X_n$ are locally compact, separable, metrizable spaces so that the $L^2$ spaces in (4.1) are separable. See Dieudonne (1976, Ch. 13).

For notational convenience let $n = 2$, the general case being an obvious modification. Let $\phi(x_1) \in L^2(X_1)$, $\psi(x_2) \in L^2(X_2)$. Intuitively we can think of $\phi$, $\psi$ as having continuous indices $x_1$, $x_2$. Sums of squares are replaced by integrals of squares. Now define $\phi \otimes \psi$ by a componentwise multiplication:

$$(\phi \otimes \psi)(x_1, x_2) = \phi(x_1)\,\psi(x_2). \tag{4.2}$$

Note that

$$\int (\phi(x_1)\psi(x_2))^2\, \mu_1(dx_1)\,\mu_2(dx_2) = \int \phi(x_1)^2\, \mu_1(dx_1) \int \psi(x_2)^2\, \mu_2(dx_2) < \infty. \tag{4.3}$$

Hence $\phi \otimes \psi \in L^2(X_1, X_2)$. Now let $L^2(X_1) \otimes L^2(X_2) \subset L^2(X_1, X_2)$ be defined as in (2.3); namely,

$$L^2(X_1) \otimes L^2(X_2) = \operatorname{span}\{\phi \otimes \psi,\ \phi \in L^2(X_1),\ \psi \in L^2(X_2)\} = \text{closure of } \Bigl\{\sum_{i=1}^k a_i\, \phi_i \otimes \psi_i,\ k \text{ finite}\Bigr\}. \tag{4.4}$$

Proposition 4.1. $L^2(X_1) \otimes L^2(X_2) = L^2(X_1, X_2)$.

This is a standard construction (Maurin 1967, example of Sec. 3.10) and a simple consequence of the following well-known result.

Lemma 4.1. Let $\{\phi_1, \phi_2, \ldots\}$ and $\{\psi_1, \psi_2, \ldots\}$ be complete orthonormal systems of $L^2(X_1)$ and $L^2(X_2)$, respectively. Then $\{\phi_i \otimes \psi_j,\ i = 1, 2, \ldots,\ j = 1, 2, \ldots\}$ is a complete orthonormal system of $L^2(X_1, X_2)$.

For a proof of this see Murray and von Neumann (1936, Lemma 2.2.1) or Courant and Hilbert (1953, Sec. II.1.6). Lemma 4.1 shows that, as in Lemma 2.1 and Corollary 2.1, decomposable elements of the form $\phi_i \otimes \psi_j$ generate the whole $L^2(X_1, X_2)$ space.

Now let us take a look at the inner product. Let $\phi_1, \phi_2 \in L^2(X_1)$ and $\psi_1, \psi_2 \in L^2(X_2)$. Then

$$(\phi_1 \otimes \psi_1,\ \phi_2 \otimes \psi_2) = \int \phi_1(x_1)\psi_1(x_2)\phi_2(x_1)\psi_2(x_2)\, \mu_1(dx_1)\,\mu_2(dx_2)$$
$$= \int \phi_1(x_1)\phi_2(x_1)\, \mu_1(dx_1) \int \psi_1(x_2)\psi_2(x_2)\, \mu_2(dx_2) = (\phi_1, \phi_2)\,(\psi_1, \psi_2). \tag{4.5}$$

This is the same relation as in Proposition 2.2. We see that the inner product of $L^2(X_1, X_2)$ corresponds to the inner product introduced on $R^m \otimes R^n$ in Section 2. Therefore all orthogonality relations of Section 2 can be translated here. In particular, Theorem 2.1 can be generalized as follows.

Theorem 4.1. Let $U_i$ be a closed subspace of $L^2(X_i)$, $i = 1, \ldots, n$. Let $U_1 \otimes \cdots \otimes U_n$ be defined as the closed subspace generated by $\{\phi_1 \otimes \cdots \otimes \phi_n,\ \phi_i \in U_i,\ i = 1, \ldots, n\}$. Let $U_i^0 = U_i$, $U_i^1 = U_i^\perp$ for convenience. Then the $2^n$ subspaces $\{\bigotimes_{i=1}^n U_i^{\epsilon_i},\ \epsilon_i = 0, 1,\ i = 1, \ldots, n\}$ form a decomposition of $L^2(X_1, \ldots, X_n)$ into mutually orthogonal closed subspaces.

Now let $F_i : L^2(X_i) \to L^2(X_i)$, $i = 1, 2$, be bounded linear transformations. We define a linear operator (easily seen to be bounded) $F_1 \otimes F_2 : L^2(X_1, X_2) \to L^2(X_1, X_2)$ by

$$(F_1 \otimes F_2)(\phi_1 \otimes \phi_2) = F_1\phi_1 \otimes F_2\phi_2 \tag{4.6}$$

for decomposable elements, and extend by linearity. See Remark 2.3. For a further justification of this see Murray and von Neumann (1936).

Next we consider orthogonal projections. Note that the definition of the orthogonal projector in (2.17) is independent of the dimensionality. Therefore, with the same proof as for Theorem 2.2, we have the following.

Theorem 4.2. Let $U_i \subset L^2(X_i)$, $i = 1, \ldots, n$ be closed subspaces and $P_{U_i}$ be the orthogonal projectors onto $U_i$ in $L^2(X_i)$, $i = 1, \ldots, n$. Then the orthogonal projector onto $\bigotimes_{i=1}^n U_i \subset L^2(X_1, \ldots, X_n)$ is given by $P_{U_1} \otimes \cdots \otimes P_{U_n}$.

Now let us come back to the ANOVA-type decomposition. Let $1_i(x_i) \equiv 1 \in L^2(X_i)$ and $U_i = \operatorname{span}\{1_i\} \subset L^2(X_i)$. Let $F_i$ be the linear transformation corresponding to taking the mean: for $\phi \in L^2(X_i)$,

$$F_i\phi = \Bigl(\int \phi(x_i)\, \mu_i(dx_i)\Bigr)\, 1_i = (E\phi)\, 1_i \in L^2(X_i). \tag{4.7}$$

Then $F_i 1_i = 1_i$, and $F_i\phi = 0$ for $\phi \in L^2(X_i)$ such that $(1_i, \phi) = 0$. Therefore $F_i$ is the orthogonal projector onto $U_i$ and $P_{U_i} = F_i$. Denoting the identity map of $L^2(X_i)$ by $I_i$, we have $P_{U_i^\perp} = I_i - F_i$. Now we define $\mathcal{T}_{i_1 \cdots i_k}$ by (3.1) and $P_{i_1 \cdots i_k}$ by (3.3), with $I_i$, $F_i$ replacing ${}^{m_i}I$, ${}_iF$.

To see how $P_{i_1 \cdots i_k}$ behaves, we fix complete orthonormal systems $\{\phi^i_1, \phi^i_2, \ldots\}$ of $L^2(X_i)$, $i = 1, \ldots, n$, such that $\phi^i_1 = 1_i$, $i = 1, \ldots, n$. Note that $P_{U_i}\phi^i_1 = \phi^i_1$ and $P_{U_i}\phi^i_j = 0$ for $j \ge 2$. Also $(I_i - P_{U_i})\phi^i_1 = 0$ and $(I_i - P_{U_i})\phi^i_j = \phi^i_j$ for $j \ge 2$. Using these relations we obtain

$$P_{i_1 \cdots i_k}(\phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}) = \phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n} \quad \text{if } j_l \ge 2 \text{ for } l \in \{i_1, \ldots, i_k\} \text{ and } j_l = 1 \text{ for } l \notin \{i_1, \ldots, i_k\},$$
$$= 0 \quad \text{otherwise}. \tag{4.8}$$

Now consider $S \in L^2(X_1, \ldots, X_n)$. By Lemma 4.1, $\{\phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}\}$ forms a basis of $L^2(X_1, \ldots, X_n)$. Hence we can write

$$S = \sum_{j_1, \ldots, j_n} a_{j_1 \cdots j_n}\, \phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}, \tag{4.9}$$

where

$$a_{j_1 \cdots j_n} = \int S(x_1, \ldots, x_n)\, \phi^1_{j_1}(x_1) \cdots \phi^n_{j_n}(x_n)\, \mu_1(dx_1) \cdots \mu_n(dx_n).$$

Using (4.8) we obtain the following theorem.

Theorem 4.3.

$$P_{i_1 \cdots i_k} S = \sum_{j_{i_1}=2}^{\infty} \cdots \sum_{j_{i_k}=2}^{\infty} b_{j_{i_1} \cdots j_{i_k}}\, \phi^{i_1}_{j_{i_1}} \otimes \cdots \otimes \phi^{i_k}_{j_{i_k}}, \tag{4.10}$$

where

$$b_{j_{i_1} \cdots j_{i_k}} = \int S(x_1, \ldots, x_n)\, \phi^{i_1}_{j_{i_1}}(x_{i_1}) \cdots \phi^{i_k}_{j_{i_k}}(x_{i_k})\, \mu_1(dx_1) \cdots \mu_n(dx_n).$$

Remark 4.1. Actually (4.10) does not cover the case $k = 0$. In this case $P_\emptyset S = a_{1 \cdots 1} = ES$.

Theorem 4.3 gives a "coordinatewise" description of $P_{i_1 \cdots i_k}$, given complete orthonormal systems.

In Efron and Stein (1981), Bhargava (1980), and Karlin and Rinott (1982) these projections are given using conditional expectations. We will show that the two definitions are the same. Let $E_i : L^2(X_1, \ldots, X_n) \to L^2(X_1, \ldots, X_n)$ be defined by

$$E_i\phi = \int \phi(x_1, \ldots, x_n)\, \mu_i(dx_i) = E(\phi \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n). \tag{4.11}$$

Let $I$ denote the identity map in $L^2(X_1, \ldots, X_n)$. Then

$$(I - E_i)\phi = \phi - E(\phi \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n). \tag{4.12}$$

Let

$$H_{i_1 \cdots i_k} = G_1 \circ \cdots \circ G_n, \tag{4.13}$$

where $\circ$ denotes composition of maps and

$$G_i = I - E_i \text{ if } i \in \{i_1, \ldots, i_k\}, \qquad G_i = E_i \text{ otherwise}.$$

Then we get the following theorem.

Theorem 4.4.

$$H_{i_1 \cdots i_k} = P_{i_1 \cdots i_k}. \tag{4.14}$$

Proof. Since $\{\phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}\}$ forms a basis, it suffices to prove that

$$H_{i_1 \cdots i_k}(\phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}) = P_{i_1 \cdots i_k}(\phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n})$$

for all $(j_1, \ldots, j_n)$. Let $\psi = \phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}$. Then

$$E_i\psi = \int \phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}\, \mu_i(dx_i) = \Bigl(\int \phi^i_{j_i}\, d\mu_i\Bigr)\, \phi^1_{j_1}(x_1) \cdots \phi^{i-1}_{j_{i-1}}(x_{i-1})\, \phi^{i+1}_{j_{i+1}}(x_{i+1}) \cdots \phi^n_{j_n}(x_n).$$

Hence

$$E_i\psi = 0 \text{ if } j_i \ge 2, \qquad E_i\psi = \psi \text{ if } j_i = 1.$$

From this it follows that

$$H_{i_1 \cdots i_k}(\phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}) = \phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n} \quad \text{if } j_l \ge 2 \text{ for } l \in \{i_1, \ldots, i_k\} \text{ and } j_l = 1 \text{ for } l \notin \{i_1, \ldots, i_k\},$$
$$= 0 \quad \text{otherwise}. \tag{4.15}$$

This is identical to (4.8).

If we expand the right side of (4.13) we obtain the inclusion-exclusion pattern of conditional expectations. For example,

$$H_1 S = (I - E_1) \circ E_2 \circ \cdots \circ E_n S = \int S\, \mu_2(dx_2) \cdots \mu_n(dx_n) - \int S\, \mu_1(dx_1) \cdots \mu_n(dx_n) = E(S \mid x_1) - E(S), \tag{4.16}$$

$$H_{12} S = (I - E_1) \circ (I - E_2) \circ E_3 \circ \cdots \circ E_n S$$
$$= E_3 \circ \cdots \circ E_n S - E_1 \circ E_3 \circ \cdots \circ E_n S - E_2 \circ E_3 \circ \cdots \circ E_n S + E_1 \circ E_2 \circ \cdots \circ E_n S$$
$$= E(S \mid x_1, x_2) - E(S \mid x_1) - E(S \mid x_2) + E(S). \tag{4.17}$$

These expressions are used as definitions of the terms of the ANOVA-type decomposition in Efron and Stein (1981), Bhargava (1980), and Karlin and Rinott (1982).
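When the $X_i$ are finite sets, the conditional expectations $E_i$ are weighted sums along one axis of an array, so (4.13) and Theorem 4.4 can be checked exactly; a sketch with made-up uniform marginals and a random statistic (all names are ours):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(5)
dims = (2, 3, 4)
mus = [np.full(m, 1.0 / m) for m in dims]   # marginal probabilities mu_i
S = rng.standard_normal(dims)               # a statistic S(x1, x2, x3)

def E(i, f):
    # (4.11): integrate out coordinate i against mu_i, keeping the shape.
    g = np.tensordot(f, mus[i], axes=([i], [0]))
    return np.broadcast_to(np.expand_dims(g, i), f.shape)

pieces = {}
for eps in product((0, 1), repeat=len(dims)):
    # (4.13): compose G_i = I - E_i on the chosen factors, E_i elsewhere.
    h = S
    for i, e in enumerate(eps):
        h = h - E(i, h) if e else E(i, h)
    pieces[eps] = h

# The 2^n terms sum to S and are orthogonal in L2 of the product measure.
w = np.einsum('a,b,c->abc', *mus)           # product probability weights
assert np.allclose(sum(pieces.values()), S)
for a in pieces:
    for b in pieces:
        if a != b:
            assert abs(np.sum(w * pieces[a] * pieces[b])) < 1e-10
```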

[Received March 1982. Revised September 1982.]

REFERENCES

BHARGAVA, R.P. (1980), "A Property of the Jackknife Estimation of the Variance When More Than One Observation is Omitted," Technical Report 140, Stanford University, Dept. of Statistics.

BOCK, R.D. (1963), "Programming Univariate and Multivariate Analysis of Variance," Technometrics, 5, 95-117.

COTTER, S.C. (1974), "A General Method of Confounding for Symmetrical Factorial Experiments," Journal of the Royal Statistical Society, Ser. B, 36, 267-276.

COTTER, S.C. (1975), "Partial Confounding in Symmetrical Factorial Experiments," Journal of the Royal Statistical Society, Ser. B, 37, 54-62.

COTTER, S.C., JOHN, J.A., and SMITH, T.M.F. (1973), "Multi-Factor Experiments in Nonorthogonal Designs," Journal of the Royal Statistical Society, Ser. B, 35, 361-367.

COURANT, R., and HILBERT, D. (1953), Methods of Mathematical Physics (Vol. I), New York: John Wiley.

DIEUDONNE, J. (1976), Treatise on Analysis (Vol. II), New York: Academic Press.

EATON, M.L. (1970), "Gauss-Markov Estimation for Multivariate Linear Models: A Coordinate Free Approach," Annals of Mathematical Statistics, 41, 528-538.

EFRON, B., and STEIN, C. (1981), "The Jackknife Estimate of Variance," Annals of Statistics, 9, 586-596.

FEDERER, W.T., and ZELEN, M. (1966), "Analysis of Multifactor Classifications With Unequal Number of Observations," Biometrics, 22, 525-552.

GREUB, W. (1978), Multilinear Algebra (2nd ed.), New York: Springer-Verlag.

HABERMAN, S.J. (1975), "Direct Products and Linear Models for Complete Factorial Tables," Annals of Statistics, 3, 314-333.


HAJEK, J. (1968), "Asymptotic Normality of Simple Linear Rank Statistics Under Alternatives," Annals of Mathematical Statistics, 39, 325-346.

HOEFFDING, W. (1948), "A Class of Statistics With Asymptotically Normal Distributions," Annals of Mathematical Statistics, 19, 293-325.

JACOBSEN, R.L. (1968), "Linear Algebra and ANOVA," Ph.D. Thesis, Cornell University.

JOHN, J.A., and DEAN, A.M. (1975a), "Single Replicate Factorial Experiments in Generalized Cyclic Designs: I. Symmetrical Arrangements," Journal of the Royal Statistical Society, Ser. B, 37, 63-71.

JOHN, J.A., and DEAN, A.M. (1975b), "Single Replicate Factorial Experiments in Generalized Cyclic Designs: II. Asymmetrical Arrangements," Journal of the Royal Statistical Society, Ser. B, 37, 72-76.

KARLIN, S., and RINOTT, Y. (1982), "Applications of ANOVA-Type Decompositions for Comparisons of Conditional Variance Statistics Including Jackknife Estimates," Annals of Statistics, 10, 485-501.

KURKJIAN, B., and ZELEN, M. (1962), "A Calculus for Factorial Arrangements," Annals of Mathematical Statistics, 33, 600-619.

KURKJIAN, B., and ZELEN, M. (1963), "Applications of the Calculus for Factorial Arrangements, I. Block and Direct Product Designs," Biometrika, 50, 63-73.

MAURIN, K. (1967), Methods of Hilbert Spaces, Warsaw: Polish Scientific Publishers.

MURRAY, F.J., and NEUMANN, J.v. (1936), "On Rings of Operators," Annals of Mathematics, 37, 116-229.

PAIK, U.B., and FEDERER, W.T. (1974), "Analysis of Nonorthogonal n-Way Classifications," Annals of Statistics, 2, 1000-1021.

RUBIN, H., and VITALE, R.A. (1980), "Asymptotic Distributions of Symmetric Statistics," Annals of Statistics, 8, 165-170.

SCHEFFE, H. (1959), The Analysis of Variance, New York: John Wiley.

SERFLING, R.J. (1980), Approximation Theorems of Mathematical Statistics, New York: John Wiley.

SOKOLNIKOFF, I.S. (1964), Tensor Analysis, Theory and Application to Geometry and Mechanics of Continua (2nd ed.), New York: John Wiley.

ZELEN, M., and FEDERER, W.T. (1964), "Applications of the Calculus for Factorial Arrangements, II. Designs With Two-Way Elimination of Heterogeneity," Annals of Mathematical Statistics, 35, 658-672.

ZELEN, M., and FEDERER, W.T. (1965), "Applications of the Calculus for Factorial Arrangements, III. Analysis of Factorials With Unequal Numbers of Observations," Sankhya, A, 27, 383-400.
