
Group Invariance in Statistical Inference

Narayan C. Giri
University of Montreal, Canada

World Scientific
Singapore • New Jersey • London • Hong Kong


Published by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

GROUP INVARIANCE IN STATISTICAL INFERENCE

Copyright © 1996 by World Scientific Publishing Co. Pte. Ltd.

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 981-02-1875-3

Printed in Singapore.


To
NILIMA
NABANITA
NANDIAN


CONTENTS

Chapter 0. GROUP INVARIANCE
0.0. Introduction
0.1. Examples

Chapter 1. MATRICES, GROUPS AND JACOBIANS
1.0. Introduction
1.1. Matrices
1.1.1. Characteristic Roots and Vectors
1.1.2. Factorization of Matrices
1.1.3. Partitioned Matrices
1.2. Groups
1.3. Homomorphism, Isomorphism and Direct Product
1.4. Topological Transitive Groups
1.5. Jacobians

Chapter 2. INVARIANCE
2.0. Introduction
2.1. Invariance of Distributions
2.1.1. Transformation of Variable in Abstract Integral
2.2. Invariance of Testing Problems
2.3. Invariance of Statistical Tests and Maximal Invariant
2.4. Some Examples of Maximal Invariants
2.5. Distribution of Maximal Invariant
2.5.1. Existence of an Invariant Probability Measure on O(p) (Group of p × p Orthogonal Matrices)
2.6. Applications
2.7. The Distribution of a Maximal Invariant in the General Case
2.8. An Important Multivariate Distribution
2.9. Almost Invariance, Sufficiency and Invariance
2.10. Invariance, Type D and E Regions

Chapter 3. EQUIVARIANT ESTIMATION IN CURVED MODELS
3.1. Best Equivariant Estimation of μ with Σ Known
3.1.1. Maximum Likelihood Estimators
3.2. A Particular Case
3.2.1. An Application
3.2.2. Maximum Likelihood Estimator
3.3. Best Equivariant Estimation in Curved Covariance Models
3.3.1. Characterization of Equivariant Estimators of Σ
3.3.2. Characterization of Equivariant Estimators of θ

Chapter 4. SOME BEST INVARIANT TESTS IN MULTINORMALS
4.0. Introduction
4.1. Tests of Mean Vector
4.2. The Classification Problem (Two Populations)
4.3. Tests of Multiple Correlation
4.4. Tests of Multiple Correlation with Partial Information

Chapter 5. SOME MINIMAX TESTS IN MULTINORMALS
5.0. Introduction
5.1. Locally Minimax Tests
5.2. Asymptotically Minimax Tests
5.3. Minimax Tests
5.3.1. Hotelling's T² Test
5.3.2. R²-Test
5.3.3. ε-Minimax Test (Linnik, 1966)

Chapter 6. LOCALLY MINIMAX TESTS IN SYMMETRICAL DISTRIBUTIONS
6.0. Introduction
6.1. Elliptically Symmetric Distributions
6.2. Locally Minimax Tests in E_p(μ, Σ)
6.3. Examples

Chapter 7. TYPE D AND E REGIONS


Chapter 0

GROUP INVARIANCE

0.0. Introduction

One of the unpleasant facts of statistical problems is that they are often too big or too difficult to admit of practical solutions. Statistical decisions are made on the basis of sample observations. Sample observations often contain information which is not relevant to the making of the statistical decision. Some simplification is introduced by characterizing the decision rules in terms of the (minimal) sufficient statistic, which discards that part of the sample observations which is of no value for any decision making concerning the parameter, thereby reducing the dimension of the sample space to that of the minimal sufficient statistic. This, however, does not reduce the dimension of the parametric space. By introducing the group invariance principle and restricting attention to invariant decision rules, a reduction of the dimension of the parametric space is possible. In view of the fact that sufficiency and group invariance are both successful in reducing the dimension of statistical problems, one is naturally interested in knowing whether both principles can be used simultaneously and, if so, in what order. Hall, Wijsman and Ghosh (1965) have shown that under certain conditions this reduction can be carried out by using both principles simultaneously, and that the order in which they are used is immaterial in such cases. However, one can avoid verifying these conditions by replacing the sample space by the space of the sufficient statistic and then using group invariance on the space of the sufficient statistic. In this monograph we treat multivariate problems only, where the reduction in dimension is very significant. In what follows we use the term invariance to indicate group invariance.

In statistics the term invariance is used in the mathematical sense to denote a property that remains unchanged (invariant) under a group of transformations. In actual practice many statistical problems possess such a property. As in other branches of the applied sciences, it is a generally accepted principle in statistics that if a problem with a unique solution is invariant under a group of transformations, then the solution should also be invariant under it. This notion has an old origin in the statistical sciences. Apart from this natural justification for the use of invariant decision rules, the unpublished work of Hunt and Stein towards the end of the Second World War has given this principle strong support as to its applicability and meaningfulness for proving various optimum properties like minimaxity, admissibility, etc. of statistical decision rules.

Although a great deal has been written concerning this principle in statistical inference, no great amount of literature exists concerning the problem of discerning whether or not a given statistical problem is actually invariant under a certain group of transformations. Brillinger (1963) gave necessary and sufficient conditions that a statistical problem must satisfy in order that it be invariant under a fairly large class of groups of transformations including Lie groups.

In our treatment in this monograph we treat invariance in the framework of statistical decision rules only. De Finetti (1964) in his theory of exchangeability treats invariance of the distribution of sample observations under finite permutations. It provides a crucial link between his theory of subjective probability and the frequency approach of probability. The classical statistical methods take as basic a family of distributions; the true distribution of the sample observations is an unknown member of this family about which the statistical inference is required. According to De Finetti's approach no probability is unknown. If x_1, x_2, ... are the outcomes of a sequence of experiments conducted under similar conditions, subjective uncertainty is expressed directly by ascribing to the corresponding random variables X_1, X_2, ... a known joint distribution. When some of the X's are observed, predictive inference about the others is made by conditioning the original distributions on the observations. De Finetti has shown that these approaches are equivalent when the subjectivist's joint distribution is invariant under finite permutations.

Two other related principles, known in the literature, are the weak invariance and the strong invariance principles. The weak invariance principle is used to demonstrate the sufficiency of the classical assumptions associated with the weak convergence of stable laws (Billingsley, 1968). This is popularly known as Donsker's theorem (Donsker, 1951). Let X_1, X_2, ... be independently distributed random variables with the same mean zero and the same variance σ², and let

$$S_j = \sum_{i=1}^{j} X_i\,, \qquad X_n(t) = \frac{S_{[nt]}}{\sigma\sqrt{n}}\,, \quad 0 \le t \le 1\,.$$

Donsker proved that {X_n(t)} converges weakly to Brownian motion. The strong invariance principle has been introduced to prove strong convergence results (Tusnady, 1977). Here the term invariance is used in the sense that if X_1, X_2, ... are independently distributed random variables with the same mean 0 and the same variance σ², and if h is a continuous function on [0,1], then the limiting distribution of h(X_n(·)) does not depend on any other property of the X_i.

0.1. Examples

We now give an example to show how the solution to a statistical problem can be obtained through direct application of group theoretic results.

Example 0.1.1. Let X_α = (X_{α1}, ..., X_{αp})', α = 1, ..., N (> p), be independently and identically distributed p-variate normal vectors with the same mean μ = (μ_1, ..., μ_p)' and the same positive definite covariance matrix Σ. The parametric space Ω is the space of all (μ, Σ). The problem of testing H_0 : μ = 0 against the alternatives H_1 : μ ≠ 0 remains unchanged (invariant) under the full linear group G_l(p) of p × p nonsingular matrices g transforming each X_i → gX_i, i = 1, ..., N. Let

$$\bar{X} = \frac{1}{N}\sum_{\alpha=1}^{N} X_\alpha\,, \qquad S = \sum_{\alpha=1}^{N} (X_\alpha - \bar{X})(X_\alpha - \bar{X})'\,. \tag{0.1}$$

It is well-known (Giri, 1977) that (X̄, S) is sufficient for (μ, Σ). The induced transformation on the space of (X̄, S) is given by

$$(\bar{X}, S) \to (g\bar{X},\, gSg')\,, \quad g \in G_l(p)\,. \tag{0.2}$$

Since this transformation permits arbitrary changes of X̄, S, and any reasonable statistical test procedure should not depend on any such arbitrary change by g, we conclude that a reasonable statistical test procedure should depend on (X̄, S) only through

$$T^2 = N(N-1)\,\bar{X}'S^{-1}\bar{X}\,. \tag{0.3}$$

It is well-known (Giri, 1977) that the distribution of T² is given by

$$f_{T^2}(t^2 \mid \delta^2) = e^{-\delta^2/2}\sum_{j=0}^{\infty}\frac{(\tfrac{1}{2}\delta^2)^j}{j!}\;\frac{\Gamma(\tfrac{1}{2}N+j)}{(N-1)\,\Gamma(\tfrac{1}{2}(N-p))\,\Gamma(\tfrac{1}{2}p+j)}\left(\frac{t^2}{N-1}\right)^{\frac{1}{2}p+j-1}\left(1+\frac{t^2}{N-1}\right)^{-(\frac{1}{2}N+j)}, \quad t^2 > 0\,, \tag{0.4}$$

where δ² = Nμ'Σ⁻¹μ. Under H_0, δ² = 0, and under H_1, δ² > 0. Applying Neyman and Pearson's lemma we conclude from (0.4) that the uniformly most powerful test based on T² of H_0 against H_1 is the well-known Hotelling's T² test, which rejects H_0 for large values of T².

Note. In this problem the dimension of Ω is p + p(p+1)/2 = p(p+3)/2, which is also the dimension of the space of (X̄, S). For the distribution of T² the parameter δ² is a scalar quantity.
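As an illustration (our sketch, not part of the original text; numpy assumed), the following computes T² from a sample and checks its invariance under a random g ∈ G_l(p):

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 20, 3
X = rng.standard_normal((N, p))          # rows are the observations X_alpha

def hotelling_T2(X):
    N = X.shape[0]
    xbar = X.mean(axis=0)
    S = (X - xbar).T @ (X - xbar)        # sum-of-products matrix, as in (0.1)
    return N * (N - 1) * xbar @ np.linalg.solve(S, xbar)   # (0.3)

g = rng.standard_normal((p, p))          # almost surely nonsingular, i.e. in G_l(p)
print(hotelling_T2(X))                   # T^2 of the original sample
print(hotelling_T2(X @ g.T))             # identical: T^2 is invariant under X -> gX
```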

One main reason for the intuitive appeal that, for an invariant problem with a unique solution, the solution should be invariant, is probably the belief that there should be a unique way of analysing a collection of statistical data. As a word of caution we should point out that in cases where the use of an invariant decision rule conflicts violently with the desire to make a correct decision or to have a smaller risk, it must be abandoned. We give below one such example which is due to Charles Stein, as reported by Lehmann (1959, p. 338).

Example 0.1.2. Let X = (X_1, ..., X_p)', Y = (Y_1, ..., Y_p)' be independently distributed normal p-vectors with the same mean 0 and positive definite covariance matrices Σ, δΣ respectively, where δ is an unknown scalar constant. Consider the problem of testing H_0 : δ = 1 against H_1 : δ > 1. This problem is invariant under G_l(p) transforming X → gX, Y → gY, g ∈ G_l(p). Since this group is transitive (see Chapter 1) on the space of values of (X, Y) with probability one, the uniformly most powerful invariant test of level α under G_l(p) is the trivial test φ(X, Y) = α which rejects H_0 with constant probability α for all values (x, y) of (X, Y). Hence the maximum power that can be achieved over the alternatives H_1 by any invariant test under G_l(p) is also α. But the test which rejects H_0 whenever

$$\frac{\sum_{i=1}^{p} Y_i^2}{\sum_{i=1}^{p} X_i^2} \ge C\,, \tag{0.5}$$

where the constant C depends on the level α, has strictly increasing power β(δ) whose minimum over the set δ ≥ δ_1 > 1 is β(δ_1) > β(1) = α. For more discussion and results refer to Giri (1983a, 1983b).
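A quick Monte Carlo sketch (our addition; numpy assumed, Σ = I taken for the illustration, cutoff C calibrated under H_0) shows the gap: the ratio test (0.5) has power above α for δ > 1, while every invariant test is stuck at α.

```python
import numpy as np

rng = np.random.default_rng(2)
p, alpha, reps = 5, 0.05, 200_000

def ratio(delta, n):
    X = rng.standard_normal((n, p))                  # Sigma = I for the illustration
    Y = np.sqrt(delta) * rng.standard_normal((n, p))
    return (Y**2).sum(axis=1) / (X**2).sum(axis=1)

C = np.quantile(ratio(1.0, reps), 1 - alpha)         # level-alpha cutoff under H0
print((ratio(2.0, reps) >= C).mean())                # power at delta = 2, clearly > alpha
```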


Exercises

1. Let X_1, ..., X_n be independently and identically distributed normal random variables with the same mean θ and the same unknown variance σ², and let H_0 : θ = 0 and H_1 : θ ≠ 0.

(a) Find the largest group of transformations which leaves the problem of testing H_0 against H_1 invariant.

(b) Using the group theoretic notion show that the two-sided Student t-test is uniformly most powerful among all tests based on t.

2. Univariate General Linear Hypothesis. Let X_1, ..., X_n be independently distributed normal random variables with E(X_i) = θ_i, Var(X_i) = σ², i = 1, ..., n. Let Ω be the linear coordinate space of dimension n and let Π_ω and Π_Ω be two linear subspaces of Ω such that dim Π_ω = k and dim Π_Ω = l, l > k. Consider the problem of testing H_0 : θ = (θ_1, ..., θ_n)' ∈ Π_ω against the alternatives H_1 : θ ∈ Π_Ω.

(a) Find the largest groups of transformations which leave the problem invariant.

(b) Using the group theoretic notions show that the usual F-test is uniformly most powerful for testing H_0 against H_1.

3. Let X_1, ..., X_n be independently distributed normal random variables with the same mean θ and the same variance σ². For testing H_0 : σ² = σ_0² against H_1 : σ² = σ_1² < σ_0², where σ_0², σ_1² are known, find the most powerful invariant test using the group theoretic notions.

References

1. P. Billingsley, Convergence of Probability Measures, Wiley, New York, 1968.
2. D. Brillinger, Necessary and sufficient conditions for a statistical problem to be invariant under a Lie group, Ann. Math. Statist. 34 (1963) 492-500.
3. B. De Finetti, in Studies in Subjective Probability, H. E. Kyburg and H. E. Smokler, eds., Wiley, New York, (1964) 93-158.
4. M. Donsker, An invariance principle for certain probability limit theorems, Mem. Amer. Math. Soc. 6 (1951).
5. N. Giri, Hunt-Stein theorem, Encyclopedia of Statistical Sciences, Vol. 3, Kotz-Johnson, eds., Wiley, New York, 1983a, 686-689.
6. N. Giri, Invariance concepts in statistics, Encyclopedia of Statistical Sciences, Vol. 4, Kotz-Johnson, eds., Wiley, New York, 1983b, 219-224.
7. N. Giri, Multivariate Statistical Inference, Academic Press, New York, 1977.
8. W. J. Hall, R. A. Wijsman and J. K. Ghosh, The relationship between sufficiency and invariance with application in sequential analysis, Ann. Math. Statist. 36 (1965) 575-614.
9. E. L. Lehmann, Testing Statistical Hypotheses, Wiley, New York, 1959.
10. G. Tusnady, in Recent Developments in Statistics, J. R. Barra, F. Brodeau, G. Romier and B. Van Cutsem, eds., North-Holland, Amsterdam, (1977) 289-300.


Chapter 1

MATRICES, GROUPS AND JACOBIANS

1 . 0 . I n tr o d u c t i o n

The study of  group invariance requires knowledge of matrix algebra and

group theory. We present here some basic results on matrices, groups and

Jacobians without proofs.

Proofs can be obtained from Gir i (1993, 1996) or any textbook on these

topics.

1.1. Matrices

A p × q matrix C = (c_ij) is a rectangular array of real numbers c_ij, written as

$$C = \begin{pmatrix} c_{11} & \cdots & c_{1q} \\ \vdots & & \vdots \\ c_{p1} & \cdots & c_{pq} \end{pmatrix}, \tag{1.1}$$

where c_ij is the element in the ith row and the jth column. The transpose C' of C is the q × p matrix obtained by interchanging the rows and the columns of C. If q = p, C is called a square matrix of order p. If q = 1, C is a p × 1 column vector, and if p = 1, C is a 1 × q row vector.

A square matrix C is symmetric if C' = C.

A square matrix C = (c_ij) of order p is a diagonal matrix D(c_11, ..., c_pp) with diagonal elements c_11, ..., c_pp if all off-diagonal elements of C are zero.

A diagonal matrix with unit diagonal elements is an identity matrix and is denoted by I. Sometimes it will be necessary to write it as I_k to denote an identity matrix of order k.


A square matrix C = (c_ij) of order p is a lower triangular matrix if c_ij = 0 for j > i. The determinant of a lower triangular matrix is det C = Π_{i=1}^p c_ii. We shall also write det C = |C| for convenience.

A square matrix C = (c_ij) of order p is an upper triangular matrix if c_ij = 0 for i > j, and det C = Π_{i=1}^p c_ii.

A square matrix C of order p is nonsingular if det C ≠ 0. If det C = 0 then C is a singular matrix.

A nonsingular matrix C of order p is orthogonal if CC' = C'C = I.

The inverse of a nonsingular matrix C of order p is the unique matrix C⁻¹ such that CC⁻¹ = C⁻¹C = I. From this it follows that det C⁻¹ = (det C)⁻¹.

A square matrix C = (c_ij) of order p, or the associated quadratic form x'Cx = Σ_i Σ_j c_ij x_i x_j, is positive definite if x'Cx > 0 for all x = (x_1, ..., x_p)' ≠ 0. If C is positive definite, then C⁻¹ is positive definite, and for any nonsingular matrix A of order p, ACA' is also positive definite.

1.1.1. Characteristic roots and vectors

The characteristic roots of a square matrix C of order p are given by the roots of the characteristic equation

$$\det(C - \lambda I) = 0\,, \tag{1.2}$$

where λ is real. As

$$\det(\theta C\theta' - \lambda I) = \det(C - \lambda I)$$

for any orthogonal matrix θ of order p, the characteristic roots of C remain invariant under the transformation C → θCθ'. A vector x = (x_1, ..., x_p)' ≠ 0 satisfying

$$(C - \lambda I)x = 0 \tag{1.3}$$

is a characteristic vector of C corresponding to its characteristic root λ. If x is a characteristic vector of C corresponding to its characteristic root λ, then any scalar multiple ax, a ≠ 0, is also a characteristic vector of C corresponding to λ.

Some Results on Characteristic Roots and Vectors

1. The characteristic roots of a real symmetric matrix are real.

2. The characteristic vectors corresponding to distinct characteristic roots of a symmetric matrix are orthogonal.


3. The characteristic roots of a symmetric positive definite matrix C are all positive.

4. Given any real square symmetric matrix C of order p, there exists an orthogonal matrix θ of order p such that θCθ' is a diagonal matrix D(λ_1, ..., λ_p), where λ_1, ..., λ_p are the characteristic roots of C. Hence det C = Π_{i=1}^p λ_i and tr C = Σ_{i=1}^p λ_i. Note that tr C is the sum of the diagonal elements of C.
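A small numerical illustration of result 4 (our sketch, assuming numpy): an orthogonal matrix diagonalizing a symmetric C, with det C and tr C recovered from the characteristic roots.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
C = A + A.T                           # a real symmetric matrix

lam, theta = np.linalg.eigh(C)        # roots lam, orthonormal columns of theta
print(np.allclose(theta.T @ C @ theta, np.diag(lam)))   # diagonalization
print(np.isclose(np.linalg.det(C), lam.prod()))          # det C = product of roots
print(np.isclose(np.trace(C), lam.sum()))                # tr C = sum of roots
```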

1.1.2. Factorization of matrices

In the sequel we shall frequently use the following factorizations of matrices.

For every positive definite matrix C of order p there exists a nonsingular matrix A of order p such that C = AA', and hence there exists a nonsingular matrix B (B = A⁻¹) of order p such that BCB' = I.

Given a symmetric nonsingular matrix C of order p, there exists a nonsingular matrix A of order p such that

$$ACA' = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \tag{1.4}$$

where the order of I is equal to the number of positive characteristic roots of C and the order of −I is equal to the number of negative characteristic roots of C.

Given a symmetric positive definite matrix C of order p, there exists a nonsingular lower triangular matrix A (an upper triangular matrix B) of the same order p such that

$$C = AA' = B'B\,. \tag{1.5}$$

Cholesky Decomposition. For every positive definite matrix C there exists a unique lower triangular matrix D with positive diagonal elements such that C = DD'.
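The Cholesky factor is available directly in most linear algebra libraries; a minimal numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
C = A @ A.T + 4 * np.eye(4)           # positive definite

D = np.linalg.cholesky(C)             # unique lower triangular, positive diagonal
print(np.allclose(D @ D.T, C))        # C = DD'
print(np.all(np.diag(D) > 0))
```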

1.1.3. Partitioned matrices

Let C = (c_ij) be a p × q matrix and let C be partitioned into submatrices C_ij as

$$C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix},$$

where C_11 = (c_ij) (i = 1, ..., m; j = 1, ..., n); C_12 = (c_ij) (i = 1, ..., m; j = n+1, ..., q); C_21 = (c_ij) (i = m+1, ..., p; j = 1, ..., n); C_22 = (c_ij) (i = m+1, ..., p; j = n+1, ..., q). For any square matrix

$$C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}$$

where C_11, C_22 are square matrices and C_22 is nonsingular:

(1) det(C) = det(C_22) det(C_11 − C_12 C_22⁻¹ C_21);

(2) C is positive definite if and only if C_11, C_22 − C_21 C_11⁻¹ C_12, or equivalently C_22, C_11 − C_12 C_22⁻¹ C_21, are positive definite.

Let C⁻¹ = B be similarly partitioned into submatrices B_ij, i, j = 1, 2. Then

$$C_{11}^{-1} = B_{11} - B_{12}B_{22}^{-1}B_{21}\,, \quad C_{22}^{-1} = B_{22} - B_{21}B_{11}^{-1}B_{12}\,, \quad C_{11}^{-1}C_{12} = -B_{12}B_{22}^{-1}\,. \tag{1.6}$$
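These identities are easy to sanity-check numerically (our sketch, numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
p, m = 6, 2
A = rng.standard_normal((p, p))
C = A @ A.T + p * np.eye(p)           # positive definite, so all blocks behave

C11, C12 = C[:m, :m], C[:m, m:]
C21, C22 = C[m:, :m], C[m:, m:]
B = np.linalg.inv(C)
B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]

schur = C11 - C12 @ np.linalg.solve(C22, C21)
print(np.isclose(np.linalg.det(C), np.linalg.det(C22) * np.linalg.det(schur)))  # (1)
print(np.allclose(np.linalg.inv(C11), B11 - B12 @ np.linalg.solve(B22, B21)))   # (1.6)
print(np.allclose(np.linalg.solve(C11, C12), -B12 @ np.linalg.inv(B22)))        # (1.6)
```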

1.2. Groups

A group G is a set with an operation τ satisfying the following axioms.

A_1. For any two elements a, b ∈ G, aτb ∈ G.
A_2. For any three elements a, b, c ∈ G, (aτb)τc = aτ(bτc).
A_3. There exists an element e (identity element) such that for all a ∈ G, aτe = a.
A_4. For any a ∈ G there exists an element a⁻¹ (inverse element) such that aτa⁻¹ = e.

In what follows we will write, for convenience of notation, aτb = ab. In such writing the reader should not confuse it with the arithmetic product ab.

A group G is abelian if for any pair of elements a, b belonging to G, ab = ba.

A non-empty subset H of G is a subgroup if the restriction of the group operation τ to H satisfies the axioms A_1, ..., A_4.

Examples of Groups

Example 1.1. Let X be a set and G be the set of all one-to-one mappings g : X → X; that is, g(x) = g(y), x, y ∈ X, implies x = y, and for each y ∈ X there exists x ∈ X such that y = g(x). With the group operation defined by

g_1 τ g_2(x) = g_1(g_2(x)); g_1, g_2 ∈ G,

G forms a permutation group.


Example 1.2. The additive group of real numbers is the set of all reals with the group operation ab = a + b. The multiplicative group of all nonzero reals has the group operation ab = a multiplied by b.

Example 1.3. Let X be a linear space of dimension n. Define, for x_0 ∈ X, g_{x_0}(x) = x + x_0, x ∈ X. The set of all {g_{x_0}} forms an additive abelian group and is called the translation group.

Example 1.4. Let X be a linear space of dimension n and let G_l(n) be the set of all nonsingular linear transformations of X onto X. G_l(n) with matrix multiplication as the group operation is called the full linear group.

Example 1.5. The affine group is the set of pairs (g, x), x ∈ X, g ∈ G_l(n), with the group operation defined by (g_1, x_1)(g_2, x_2) = (g_1g_2, g_1x_2 + x_1), with g_1, g_2 ∈ G_l(n) and x_1, x_2 ∈ X. The identity element and the inverses are (I, 0) and (g, x)⁻¹ = (g⁻¹, −g⁻¹x).
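A tiny sketch of the affine composition rule (our addition; numpy assumed) makes the identity and inverse formulas concrete:

```python
import numpy as np

def compose(a, b):
    g1, x1 = a
    g2, x2 = b
    return (g1 @ g2, g1 @ x2 + x1)     # (g1,x1)(g2,x2) = (g1 g2, g1 x2 + x1)

def inverse(a):
    g, x = a
    gi = np.linalg.inv(g)
    return (gi, -gi @ x)               # (g,x)^{-1} = (g^{-1}, -g^{-1} x)

rng = np.random.default_rng(6)
a = (rng.standard_normal((3, 3)), rng.standard_normal(3))
g, x = compose(a, inverse(a))          # should be the identity element (I, 0)
print(np.allclose(g, np.eye(3)), np.allclose(x, 0))
```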

Example 1.6. The set of all nonsingular lower triangular matrices of order p, with the usual matrix multiplication as the group operation, forms a group G_T(p) (with the identity matrix as the unit element). The set of all upper triangular nonsingular matrices of order p forms the group G_UT(p), with the identity matrix as the unit element.

Example 1.7. The set of all orthogonal matrices θ of order p, with matrix multiplication as the group operation, forms a group O(p).

Example 1.8. Let 0 ⊂ X_1 ⊂ X_2 ⊂ ⋯ ⊂ X_k = X be a strictly increasing sequence of linear subspaces of X and let G be the subgroup of G_l(p) such that g ∈ G if and only if gX_i = X_i, i = 1, ..., k. The group G is the group of nonsingular linear transformations on X to X which leave each X_i invariant. Choose a basis x_1^{(i)}, ..., x_{n_i}^{(i)}, i = 1, ..., k, for X_i − X_{i−1}, with n_i = dim(X_i) − dim(X_{i−1}) and X_0 = φ (null set). Then g ∈ G can be written as

$$g = \begin{pmatrix} g_{(11)} & g_{(12)} & \cdots & g_{(1k)} \\ 0 & g_{(22)} & \cdots & g_{(2k)} \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & g_{(kk)} \end{pmatrix} \tag{1.7}$$


where g_{(ij)} is a block of n_i rows and n_j columns for i ≤ j. If n_i = 1 for all i, G = G_UT(p).

1.3. Homomorphism, Isomorphism and Direct Product

Let G, H be groups. A mapping f of G into H is called a homomorphism if, for g_1, g_2 ∈ G,

$$f(g_1g_2) = f(g_1) * f(g_2)\,, \tag{1.8}$$

where * is the group operation in H. If e is the identity element in G, then f(e) is the identity element in H and f(g⁻¹) = [f(g)]⁻¹. If, in addition, f is a one-to-one mapping, then f is called an isomorphism.

The Cartesian product G × H of groups G and H, with the operation (g_1, h_1)(g_2, h_2) = (g_1g_2, h_1h_2); g_1, g_2 ∈ G, h_1, h_2 ∈ H, is called the direct product of G and H.

A subgroup H of G is a normal subgroup of G if for all h ∈ H and g ∈ G, ghg⁻¹ ∈ H, or equivalently if gHg⁻¹ = H.

Example 1.9. The group G_T⁺(p) of lower triangular matrices of order p with positive diagonal elements is a normal subgroup of G_T(p). Any subgroup of an abelian group is normal.

Let G be a group and let H be a subgroup of G. The set G/H of all sets of the form gH, g ∈ G, is called the quotient group of G modulo H, with group operation defined by: for g_1, g_2 ∈ G, (g_1H)(g_2H) is the set of elements obtained by multiplying all elements of g_1H with all elements of g_2H. The identity element is H and (gH)⁻¹ = g⁻¹H.

1.4. Topological Transitive Groups

Let X be a set and A a collection of subsets of X. (X, A) is called a topological space if A satisfies the following axioms:

TA_1: X ∈ A.
TA_2: The union of an arbitrary subfamily of A belongs to A.
TA_3: The intersection of a finite subfamily of A belongs to A.

The elements of A are called open sets. If, in addition, the topological space (X, A) satisfies: for x, y ∈ X, x ≠ y, there exist two open sets A, B belonging to A such that A ∩ B = φ with x ∈ A, y ∈ B; then it is called a Hausdorff space.

A collection of open sets is a base for a topology if every open set in A is the union of a sub-collection of sets in the base.


If a group G has a topology τ defined on it such that under τ the mappings (a, b) → ab from G × G into G and a → a⁻¹ from G into G (a, b ∈ G) are continuous, then (G, τ) is a topological group.

A compact group is a topological group which is a compact space. A locally compact group is a topological group which is a locally compact space. A compact space is a Hausdorff space X such that every open covering of X can be reduced to a finite subcovering. A locally compact space is a Hausdorff space such that every point has at least one compact neighborhood.

Let X be a set. The group G operates from the left on X if there exists a function on G × X into X, whose value at (g, x) is denoted by gx, such that (i) ex = x for all x ∈ X, with e the identity element of G; (ii) g_2(g_1x) = (g_2g_1)x. This implies that each g ∈ G is one-to-one on X into X.

Let G operate from the left on X. G operates transitively on X if for every x_1, x_2 ∈ X there exists a g ∈ G such that gx_1 = x_2.

Example 1.10. Let X be a linear space. The full linear group operates transitively on X − {0}.

Example 1.11. The group O(n) of n × n orthogonal matrices operates transitively on the set of all n × p (n ≥ p) real matrices x satisfying x'x = I_p.

Example 1.12. The linear group G_l(p) acts transitively on the set of all p × p positive definite matrices s, transferring s to gsg', g ∈ G_l(p).

1.5. Jacobians

Let X_1, ..., X_n be a sequence of continuous random variables with pdf f_{X_1,...,X_n}(x_1, ..., x_n), and let Y_i = g_i(X_1, ..., X_n), i = 1, ..., n, be a set of continuous one-to-one transformations of X_1, ..., X_n. Assume that the functions g_1, ..., g_n have continuous partial derivatives with respect to x_1, ..., x_n. Denote the inverse functions by X_i = h_i(Y_1, ..., Y_n), i = 1, ..., n. Let J⁻¹ be the determinant of the n × n matrix of partial derivatives

$$J^{-1} = \det\begin{pmatrix} \dfrac{\partial Y_1}{\partial X_1} & \cdots & \dfrac{\partial Y_1}{\partial X_n} \\ \vdots & & \vdots \\ \dfrac{\partial Y_n}{\partial X_1} & \cdots & \dfrac{\partial Y_n}{\partial X_n} \end{pmatrix}. \tag{1.9}$$

J is called the Jacobian of the transformation X_1, ..., X_n to Y_1, ..., Y_n. The pdf of Y_1, ..., Y_n is given by

$$f_{Y_1,...,Y_n}(y_1, \ldots, y_n) = f_{X_1,...,X_n}\big(h_1(y_1, \ldots, y_n), \ldots, h_n(y_1, \ldots, y_n)\big)\,|J|\,,$$

where |J| is the absolute value of J. We now state some results on Jacobians without proof. We refer to Giri (1993), Olkin (1953) and Roy (1957) for further results and proofs.

Some Results on Jacobians

Let X = (X_1, ..., X_p)', Y = (Y_1, ..., Y_p)' ∈ E^p. The Jacobian of the transformation X → Y = gX, g ∈ G_l(p), is (det g)⁻¹.

Let X, Y be p × n matrices. The Jacobian of the transformation X → Y = gX, g ∈ G_l(p), is (det g)⁻ⁿ.

Let X, Y be p × n matrices and let A ∈ G_l(p), B ∈ G_l(n). The Jacobian of the transformation X → Y = AXB is (det A⁻¹)ⁿ (det B⁻¹)ᵖ.

Let g, h ∈ G_T(p). The Jacobian of the transformation g → hg is Π_{i=1}^p (h_ii)⁻ⁱ, where h = (h_ij), and the Jacobian of the transformation g → gh is Π_{i=1}^p (h_ii)^{i−p−1}.

Let g, h ∈ G_UT(p). The Jacobian of the transformation g → hg is Π_{i=1}^p (h_ii)^{i−p−1}, where h = (h_ij), and the Jacobian of the transformation g → gh is Π_{i=1}^p (h_ii)⁻ⁱ.

Let G_BT(p) be the multiplicative group of p × p lower triangular nonsingular matrices in the block form, i.e. g ∈ G_BT(p),

$$g = \begin{pmatrix} g_{(11)} & 0 & \cdots & 0 \\ g_{(21)} & g_{(22)} & \cdots & 0 \\ \vdots & & & \vdots \\ g_{(k1)} & g_{(k2)} & \cdots & g_{(kk)} \end{pmatrix}$$

where the g_{(ii)} are submatrices of order p_i × p_i such that Σ_{i=1}^k p_i = p. For g, h ∈ G_BT(p) the Jacobian of the transformation g → hg is Π_{i=1}^k [det h_{(ii)}]^{−σ_i}, where σ_i = Σ_{j=1}^i p_j, σ_0 = 0. The Jacobian of the transformation g → gh is Π_{i=1}^k [det h_{(ii)}]^{σ_{i−1}−p}.

Let G_BUT(p) be the group of nonsingular upper triangular matrices of order p in the block form, i.e. g ∈ G_BUT(p),

$$g = \begin{pmatrix} g_{(11)} & g_{(12)} & \cdots & g_{(1k)} \\ 0 & g_{(22)} & \cdots & g_{(2k)} \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & g_{(kk)} \end{pmatrix}$$

where the g_{(ii)} are submatrices of order p_i satisfying Σ_{i=1}^k p_i = p. For g, h ∈ G_BUT(p), the Jacobian of the transformation g → gh is Π_{i=1}^k [det h_{(ii)}]^{−σ_i} and that of g → hg is Π_{i=1}^k [det h_{(ii)}]^{σ_{i−1}−p}.

Let S be a symmetric positive definite matrix of order p. The Jacobian of the transformation S → gSg', g ∈ G_l(p), is [det g]^{−(p+1)}, and that of S → S⁻¹ is (det S)^{p+1}.

Let S be a symmetric positive definite matrix of order p and let g = (g_ij) be the unique lower triangular matrix with positive diagonal elements such that S = gg'. The Jacobian of the transformation g → S = gg' is 2^{−p} Π_{i=1}^p (g_ii)^{−(p+1−i)}. The Jacobian of the transformation S → gSg', g ∈ G_T(p), is [det g]^{−(p+1)}.
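To make the S → gSg' result concrete, here is a numerical check (our sketch, numpy assumed): the map is linear on the p(p+1)/2 free coordinates of S, and the absolute determinant of that forward linear map is |det g|^{p+1}, so the Jacobian in the convention above is [det g]^{−(p+1)}.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 4
g = rng.standard_normal((p, p))            # almost surely nonsingular

idx = np.tril_indices(p)                   # free coordinates: lower triangle

def vech(S):
    return S[idx]

def unvech(v):
    S = np.zeros((p, p))
    S[idx] = v
    return S + S.T - np.diag(np.diag(S))

d = p * (p + 1) // 2
M = np.empty((d, d))
for k in range(d):
    e = np.zeros(d); e[k] = 1.0
    M[:, k] = vech(g @ unvech(e) @ g.T)    # the linear map S -> gSg' in vech coordinates

print(np.isclose(abs(np.linalg.det(M)), abs(np.linalg.det(g)) ** (p + 1)))
```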

Exercises

1. Show that for any nonsingular symmetric matrix A of order p and non-null p-vectors x, y:

(a) $(A + xy')^{-1} = A^{-1} - \dfrac{A^{-1}xy'A^{-1}}{1 + y'A^{-1}x}$,

(b) $x'(A + xx')^{-1}x = \dfrac{x'A^{-1}x}{1 + x'A^{-1}x}$,

(c) $|A + xx'| = |A|(1 + x'A^{-1}x)$.

2. Let A, B be p × q and q × p matrices respectively. Show that |I_p + AB| = |I_q + BA|.

3. Show that for any lower triangular matrix C the diagonal elements are its characteristic roots.

4. Let X be the set of all n × p real matrices X satisfying X'X = I_p. Show that the group O(n) of orthogonal matrices θ of order n, which transform X → θX, acts transitively on X.

5. Let S be the set of all symmetric positive definite matrices s of order p. Show that G_l(p), which transforms s → gsg', g ∈ G_l(p), acts transitively on S.

References

1. N. Giri, Multivariate Statistical Inference, Academic Press, New York, 1977.
2. N. Giri, Introduction to Probability and Statistics, 2nd Edition (Expanded), Marcel Dekker, New York, 1993.
3. N. Giri, Multivariate Statistical Analysis, Marcel Dekker, New York, 1996.
4. I. Olkin, Note on the Jacobians of certain matrix transformations useful in multivariate analysis, Biometrika 40 (1953) 43-46.
5. S. N. Roy, Some Aspects of Multivariate Analysis, Wiley, New York, 1957.



Chapter 2

INVARIANCE

2.0. Introduction

Invariance is a mathematical term for symmetry. In practice many statistical problems involving testing of hypotheses exhibit symmetries, which impose additional restrictions on the choice of appropriate statistical tests: the statistical tests must also exhibit the same kind of symmetry as is present in the problem. In this chapter we shall consider the principle of invariance of statistical testing problems only. For additional references the reader is referred to Lehmann (1959), Ferguson (1967) and Nachbin (1965).

2.1. Invariance of Distributions

Let (X, A) be a measure space and Ω the set of points θ, that is, Ω = {θ}. Consider a function P on Ω to the set of probability measures on (X, A) whose value at θ is P_θ. Let G be a group of transformations operating from the left on X such that, for g ∈ G,

g : X → X is (A, A) measurable, (2.1)

and the induced mapping

ḡ : Ω → Ω (2.2)

is such that if X has distribution P_θ, then gX, X ∈ X, has distribution P_{θ*}, where θ* = ḡθ ∈ Ω. All transformations considered in connection with invariance will be taken for granted as one-to-one from X onto X. An equivalent way of stating (2.2) is as follows:


$$P_\theta(gX \in A) = P_{\bar{g}\theta}(X \in A)\,, \quad A \in A\,, \tag{2.2a}$$

or

$$P_\theta(g^{-1}A) = P_{\bar{g}\theta}(A)\,. \tag{2.2b}$$

This can also be written as

$$P_\theta(B) = P_{\bar{g}\theta}(gB)\,, \quad B \in A\,.$$

If all P_θ are distinct, that is, θ_1 ≠ θ_2 implies P_{θ_1} ≠ P_{θ_2}, then the mapping g → ḡ is a homomorphism. The condition (2.2a) is often known as the condition of invariance of the distribution.

Very often in statistical problems there exists a measure λ such that P_θ is absolutely continuous with respect to λ for all θ ∈ Ω, so that p_θ is the corresponding probability density function with respect to the measure λ. In other words, we can write

$$P_\theta(A) = \int_A p_\theta(x)\, d\lambda(x)\,. \tag{2.3}$$

Also, in a great many cases of interest, it is possible to choose the measure λ such that it is invariant under G, viz.

$$\lambda(A) = \lambda(gA) \ \text{ for all } A \in A\,,\ g \in G\,. \tag{2.4}$$

Definition 2.1.1. A measure λ satisfying (2.4) is left invariant.

In such cases the condition of invariance of distribution under G reduces to

$$p_{\bar{g}\theta}(gx) = p_\theta(x) \ \text{ for all } x \in X\,,\ g \in G\,. \tag{2.5}$$

The general theory of invariant measures in a large class of topological groups was first given by Haar (1933). However, the results we need were all known by the end of the nineteenth century. In the terminology of Haar, P_θ(A) is called a positive integral of p_θ, not necessarily a probability density function. The basic result is that for a large class of topological groups there is a left invariant measure, positive on open sets and finite on compact sets, and to within multiplication by a positive constant this measure is unique. Haar proved this result for locally compact topological groups. The definition of right invariant positive integrals, viz. P_θ(A) = P_{ḡθ}(Ag), is analogous to that for left invariant positive integrals.


Because of the pioneering work of Haar such invariant measures are called invariant (left or right) Haar measures. A rigorous presentation of Haar measure will not be attempted here; the reader is referred to Nachbin (1965). It has been shown that such invariant measures do exist, and from a left invariant Haar measure one can construct the right invariant Haar measure.

We need also the following measure-theoretic notions for further developments. Let G be a locally compact group and let B be the σ-algebra of compact subsets of G.

Definition 2.1.2. (Relatively left invariant measure). A measure ν on (G, B) is relatively left invariant with left multiplier χ(g) if it satisfies

$$\nu(gB) = \chi(g)\,\nu(B)\,, \ \text{ for all } B \in B\,. \tag{2.5a}$$

The multiplier χ(g) is a continuous homomorphism from G to R⁺. If ν is relatively left invariant with left multiplier χ(g), then χ(g⁻¹) = 1/χ(g) and χ(g⁻¹) dν(g) is a left invariant measure.

Definition 2.1.3. A group G acts topologically on the left of X if the mapping T(g, x) : G × X → X is continuous.

Definition 2.1.4. (Proper G-space). Let the group G act topologically on the left of X and let h be the mapping of G × X → X × X given by

h(g, x) = (gx, x)

for g ∈ G, x ∈ X. The group G acts properly on X if for every compact C ⊂ X × X, h⁻¹(C) is compact.

The space X is called a proper G-space if G acts properly on X. An equivalent concept to the proper G-space is the Cartan G-space.

Definition 2.1.5. (Cartan G-space). Let G act topologically on X. Then X is called a Cartan G-space if for every x ∈ X there exists a neighborhood V of x such that (V, V) = {g ∈ G : (gV) ∩ V ≠ φ} has a compact closure.

Wijsman (1967, 1978, 1985, 1986) demonstrates the verification of the condition of Cartan G-space and proper action in a number of multivariate testing problems on covariance structures.

Example 2.1.1. Let G be a subgroup of the permutation group of a finite set X. For any subset A of X define λ(A) = number of points in A. The measure λ is an invariant measure under G and is unique up to a positive constant. It is known as the counting measure.

Example 2.1.2. Let X be the Euclidean n-space and G the group of translations defined by, for x_1 ∈ X, g_{x_1} ∈ G with

g_{x_1}(x) = x + x_1.

Clearly G acts transitively on X. The n-dimensional Lebesgue measure λ is invariant under G. Again λ is unique up to a positive multiplicative constant.

Example 2.1.3. Consider the group G_l(n) operating on itself from the left. Since an element g operating from the left on G_l(n) is linear, it has a Jacobian with respect to this linear operation. Obviously, G_l(n) sits inside a vector space of dimension n². Let us make the convention to display a matrix of dimension n × n by column vectors of dimension n × 1, writing the first column horizontally, then the second likewise, and so forth. In this manner a matrix of dimension n × n represents a point in R^{n²}. Let g, x, y ∈ G_l(n) and let g = (g_ij), x = (x_ij), y = (y_ij). If we display these matrices by their column vectors, the relationship y = gx gives us the coordinates of the point

$$y_{ij} = \sum_{k=1}^{n} g_{ik}\, x_{kj}$$

in terms of the coordinates (x_11, ..., x_1n, ..., x_n1, ..., x_nn). The Jacobian is thus the determinant of the n² × n² matrix

$$\begin{pmatrix} g & 0 & \cdots & 0 \\ 0 & g & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & g \end{pmatrix}$$

where each 0 indicates a null matrix of dimension n × n. Hence the Jacobian of the transformation y = gx is (det g)ⁿ. Let us now define, for y ∈ G_l(n),

$$d\lambda(y) = \frac{\prod_{i,j} dy_{ij}}{(\det y)^n}\,.$$

Then

$$d\lambda(gy) = \frac{(\det g)^n \prod_{i,j} dy_{ij}}{(\det(gy))^n} = \frac{\prod_{i,j} dy_{ij}}{(\det y)^n} = d\lambda(y)\,.$$

Hence λ is an invariant measure on G_l(n) under G_l(n). It is also unique up to a positive multiplicative constant.
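Numerically, the linear map x ↦ gx on the n² coordinates is exactly the block-diagonal matrix above (I_n ⊗ g in Kronecker form under the column-display convention), so its determinant is (det g)ⁿ; a short check with numpy (our addition):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3
g = rng.standard_normal((n, n))

M = np.kron(np.eye(n), g)       # the n^2 x n^2 matrix acting on the stacked columns
print(np.isclose(np.linalg.det(M), np.linalg.det(g) ** n))
```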

Example 2.1.4. Let G be the group of n × n nonsingular lower triangular matrices with positive diagonal elements and let G operate from the left on itself. We will verify that the Jacobian of the transformation

g → hg

for g ∈ G is Π_{i=1}^n (h_ii)ⁱ. Thus an invariant measure on G under G, which is unique up to a positive multiplicative constant, is

$$d\lambda(g) = \frac{\prod_{i \ge j} dg_{ij}}{\prod_{i=1}^{n} (g_{ii})^i}\,.$$

To show that the Jacobian is Π_{i=1}^n (h_ii)ⁱ, let g = (g_ij), h = (h_ij) with g_ij = h_ij = 0 for i < j. Then

$$C_{ij} = (hg)_{ij} = \sum_{k=j}^{i} h_{ik}\, g_{kj}\,, \quad i \ge j\,.$$

This defines a transformation

(g_ij) → (C_ij).

We can write the Jacobian of this transformation as the determinant of the matrix

$$\begin{pmatrix} \dfrac{\partial C_{11}}{\partial g_{11}} & \dfrac{\partial C_{11}}{\partial g_{21}} & \cdots & \dfrac{\partial C_{11}}{\partial g_{nn}} \\ \dfrac{\partial C_{21}}{\partial g_{11}} & \dfrac{\partial C_{21}}{\partial g_{21}} & \cdots & \dfrac{\partial C_{21}}{\partial g_{nn}} \\ \vdots & & & \vdots \\ \dfrac{\partial C_{nn}}{\partial g_{11}} & \dfrac{\partial C_{nn}}{\partial g_{21}} & \cdots & \dfrac{\partial C_{nn}}{\partial g_{nn}} \end{pmatrix}$$

which is an [n(n+1)/2] × [n(n+1)/2] lower triangular matrix with diagonal elements

∂C_km/∂g_km = h_kk, k ≥ m.

In other words, h_11 will appear once as a diagonal element, h_22 will appear twice as a diagonal element, and finally h_nn will appear n times as a diagonal element. Hence the Jacobian is Π_{i=1}^n (h_ii)ⁱ.
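The exponent pattern is easy to confirm numerically (our sketch, numpy assumed): build the linear map g ↦ hg on the n(n+1)/2 lower triangular coordinates and compare its determinant with Π (h_ii)ⁱ.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 4
h = np.tril(rng.standard_normal((n, n)))
np.fill_diagonal(h, np.abs(np.diag(h)) + 1)      # positive diagonal elements

idx = np.tril_indices(n)
d = len(idx[0])
M = np.empty((d, d))
for k in range(d):
    g = np.zeros((n, n)); g[idx[0][k], idx[1][k]] = 1.0
    M[:, k] = (h @ g)[idx]                        # the map g -> hg on triangular coords

print(np.isclose(np.linalg.det(M), np.prod(np.diag(h) ** np.arange(1, n + 1))))
```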

Example 2.1.5. Let S be the space of all positive definite symmetric matrices S of dimension n × n, and let G_l(n) operate on S as

$$S \to gSg'\,, \quad S \in \mathcal{S}\,,\ g \in G_l(n)\,. \tag{2.6}$$

To show that the Jacobian of this transformation is (det g)^{n+1}, let E_ij be the matrix obtained from the n × n identity matrix by interchanging the ith and the jth rows; let M_i(C) be the matrix obtained from the n × n identity matrix by multiplying the ith row by a non-zero constant C; and let A_ij be the matrix obtained from the n × n identity matrix by adding the jth row to the ith. Since every nonsingular g is a product of such elementary matrices, we need only show this for the matrices E_ij, M_i(C) and A_ij. This can be easily verified by the reader; for example, det(M_i(C)) = C, and M_i(C)SM_i(C)' is obtained from S = (S_ij) by multiplying S_ii by C² and S_ij for i ≠ j by C, so that the Jacobian is C² · C^{n−1} = C^{n+1}.

Let us define the measure λ on S by

$$d\lambda(S) = \frac{\prod_{i \le j} dS_{ij}}{(\det S)^{(n+1)/2}}\,.$$

Clearly,

$$d\lambda(S) = d\lambda(gSg')\,,$$

so that λ is an invariant measure on S under the transformation (2.6). Here also λ is unique up to a positive multiplicative constant. The group G_l(n) acts transitively on S.

Example 2.1.6. Let H be a subgroup of G. The group G acts transitively on the space X = G/H (the quotient space of G modulo H), where the group action satisfies

g_1(gH) = g_1gH, for g_1 ∈ G, g ∈ G.

When the group G acts transitively on X, there is a one-to-one correspondence between X and the quotient space G/H for a suitable subgroup H of G.


2.1.1. Transformation of variable in abstract integral

Let (X, A, μ) be a measure space and let (Y, C) be a measurable space. Suppose φ is a measurable function on X into Y. Let us define the measure ν on (Y, C) by

$$\nu(C) = \mu(\varphi^{-1}(C))\,, \quad C \in \mathcal{C}\,. \tag{2.7}$$

Theorem 2.1.1. If f is a real measurable function on X such that f(x) = g(φ(x)) and if f is integrable, then

$$\int_X f(x)\, d\mu(x) = \int_X g(\varphi(x))\, d\mu(x) = \int_Y g(y)\, d\nu(y)\,. \tag{2.8}$$

For a proof of this theorem the reader is referred to Lehmann [(1959), p. 38].

Example 2.1.7. (Wishart distribution). Let X_1, ..., X_n be n independent and identically distributed normal p-vectors with mean 0 and covariance matrix Σ (positive definite). Assume n ≥ p, so that

$$S = \sum_{i=1}^{n} X_i X_i'$$

is positive definite almost everywhere. The joint probability density function of X_1, ..., X_n is given by

$$p(x_1, \ldots, x_n) = (2\pi)^{-np/2} (\det \Sigma^{-1})^{n/2} \exp\left\{-\frac{1}{2}\operatorname{tr}\,\Sigma^{-1}\sum_{i=1}^{n} x_i x_i'\right\}. \tag{2.9}$$

Using Theorem 2.1.1 we get, for any measurable set A in the space of S,

$$P(S \in A) = (2\pi)^{-np/2} \int_{s=\sum x_i x_i' \in A} (\det \Sigma^{-1})^{n/2} \exp\left\{-\frac{1}{2}\operatorname{tr}\,\Sigma^{-1}\sum_{i=1}^{n} x_i x_i'\right\} \prod_{i=1}^{n} dx_i = (2\pi)^{-np/2} \int_A (\det \Sigma^{-1})^{n/2} \exp\left\{-\frac{1}{2}\operatorname{tr}\,\Sigma^{-1}s\right\} dm(s)\,, \tag{2.10}$$


where m is the measure corresponding to the measure ν of Theorem 2.1.1. Let us define the measure m' by

$$dm'(s) = (\det s)^{-n/2}\, dm(s)\,. \tag{2.11}$$

To find the distribution of S, which is popularly called the Wishart distribution W(Σ, n) with parameter Σ and n degrees of freedom, it is sufficient to find dm'. To do this let us first observe the following:

(i) Since Σ is positive definite symmetric, there exists a ∈ G_l(p) such that Σ = aa'.

(ii) Let Y_i = a⁻¹X_i, i = 1, ..., n. As the Y_i are independently normally distributed with mean vector 0 and covariance matrix I (identity matrix), a⁻¹S(a⁻¹)' is distributed as W(I, n). Thus

$$P_{aa'}(S \in A) = (2\pi)^{-np/2} \int_{A} \left(\det((aa')^{-1}s)\right)^{n/2} \exp\left\{-\frac{1}{2}\operatorname{tr}\,(aa')^{-1}s\right\} dm'(s)\,;$$

$$P_I(aSa' \in A) = (2\pi)^{-np/2} \int_{A} \left(\det((aa')^{-1}s)\right)^{n/2} \exp\left\{-\frac{1}{2}\operatorname{tr}\,(aa')^{-1}s\right\} dm'(a^{-1}s(a')^{-1})\,.$$

Since P_{aa'}(S ∈ A) = P_I(aSa' ∈ A) for all measurable sets A in the space of S, we must have, for all g ∈ G_l(p),

$$dm'(s) = dm'(gsg')\,. \tag{2.12}$$

By Example 2.1.5 we get

$$dm'(s) = C\,\frac{ds}{(\det s)^{(p+1)/2}}\,,$$

where C is a positive constant.


Thus the probability density function of the Wishart random variable is given by

$$w(\Sigma, n) = C (\det \Sigma^{-1})^{n/2} (\det s)^{(n-p-1)/2} \exp\left\{-\frac{1}{2}\operatorname{tr}\,\Sigma^{-1}s\right\}, \tag{2.13}$$

where C is a universal constant not depending on Σ. To evaluate C, let us first observe that C is a function of n and p. Let us write it as C = C_{n,p}. Since C_{n,p} is the same for all values of Σ, we can, in particular, take Σ = I to evaluate C_{n,p}. Write

$$S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}$$

where S_11 is (p−1) × (p−1). Make the transformation

$$S_{11} = T_{11}\,, \quad S_{12} = T_{12}\,, \quad S_{22} - S_{21}S_{11}^{-1}S_{12} = T_{22} = Z \ (\text{say})\,,$$

which has Jacobian equal to unity. Using the fact that

$$|S| = |S_{11}|\,|S_{22} - S_{21}S_{11}^{-1}S_{12}|$$

and

$$\int C_{n,p}\,|s|^{(n-p-1)/2} \exp\left\{-\frac{1}{2}\operatorname{tr} s\right\} ds = 1 \ \text{ for all } p, n\,,$$

we get

$$\int C_{n,p}\,|s|^{(n-p-1)/2} \exp\left\{-\frac{1}{2}\operatorname{tr} s\right\} ds = C_{n,p} \int_0^\infty z^{(n-p-1)/2} e^{-z/2}\, dz \int |s_{11}|^{(n-p-1)/2} \exp\left\{-\frac{1}{2}\operatorname{tr} s_{11}\right\} \left[\int \exp\left\{-\frac{1}{2}s_{12}'s_{11}^{-1}s_{12}\right\} ds_{12}\right] ds_{11}\,.$$

Since ∫ exp{−½ s_12' s_11⁻¹ s_12} ds_12 = (2π)^{(p−1)/2} |s_11|^{1/2} and ∫_0^∞ z^{(n−p−1)/2} e^{−z/2} dz = 2^{(n−p+1)/2} Γ((n−p+1)/2), and since |s_11|^{(n−p−1)/2+1/2} = |s_11|^{(n−(p−1)−1)/2}, the remaining s_11 integral is 1/C_{n,p−1}, so that

$$1 = \frac{C_{n,p}}{C_{n,p-1}}\,(2\pi)^{(p-1)/2}\, 2^{(n-p+1)/2}\, \Gamma\!\left(\frac{n-p+1}{2}\right).$$

Since C_{n,1} = 2^{−n/2}/Γ(n/2), we have

$$C_{n,p} = \frac{2^{-np/2}\,\pi^{-p(p-1)/4}}{\prod_{i=1}^{p} \Gamma\!\left(\frac{n-i+1}{2}\right)}\,.$$
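As a sanity check on (2.13) and the constant C_{n,p} (our addition, not part of the text), one can compare the reconstructed log density against scipy.stats.wishart, which implements the same distribution:

```python
import numpy as np
from scipy.stats import wishart
from scipy.special import gammaln

def log_wishart_pdf(s, sigma, n):
    p = s.shape[0]
    log_C = (-(n * p / 2) * np.log(2) - (p * (p - 1) / 4) * np.log(np.pi)
             - gammaln((n - np.arange(p)) / 2).sum())        # log C_{n,p}
    return (log_C - (n / 2) * np.log(np.linalg.det(sigma))
            + ((n - p - 1) / 2) * np.log(np.linalg.det(s))
            - 0.5 * np.trace(np.linalg.solve(sigma, s)))

p, n = 3, 7
rng = np.random.default_rng(9)
a = rng.standard_normal((p, p)); sigma = a @ a.T + p * np.eye(p)
s = wishart.rvs(df=n, scale=sigma, random_state=1)

print(log_wishart_pdf(s, sigma, n))
print(wishart.logpdf(s, df=n, scale=sigma))   # should agree
```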

2.2. Invariance of Testing Problems

We have seen in the previous section that for the invariance of the statistical distribution, the transformation ḡ induced by g ∈ G must satisfy ḡθ ∈ Ω for θ ∈ Ω.

Definition 2.2.1. (Invariance of parametric space). The parametric space Ω remains invariant under a group of transformations G : X → X if

(i) for g ∈ G, the induced mapping ḡ on Ω satisfies ḡθ ∈ Ω for all θ ∈ Ω;

(ii) for any θ' ∈ Ω there exists θ ∈ Ω such that ḡθ = θ'.

An equivalent way of writing (i) and (ii) is

$$\bar{g}\Omega = \Omega\,. \tag{2.14}$$

It may be remarked that if the P_θ for different values of θ ∈ Ω are distinct, then ḡ is one-to-one. The following theorem asserts that the set of all induced transformations ḡ, g ∈ G, also forms a group.

Theorem 2.2.1. Let g_1, g_2 be two transformations which leave Ω invariant. The transformations g_2g_1 and g_1⁻¹, defined by

$$g_2g_1(x) = g_2(g_1(x))\,, \qquad g_1^{-1}g_1(x) = x \tag{2.15}$$


for all x ∈ X, leave Ω invariant, and the induced transformations satisfy $\overline{g_2g_1} = \bar{g}_2\bar{g}_1$ and $\overline{g_1^{-1}} = (\bar{g}_1)^{-1}$.

Proof. We know that if the distribution of X is P_θ, θ ∈ Ω, the distribution of g_1X is P_{ḡ_1θ}, ḡ_1θ ∈ Ω. Hence the distribution of g_2g_1(X) is P_{ḡ_2ḡ_1θ}; as ḡ_1θ ∈ Ω and ḡ_2 leaves Ω invariant, ḡ_2ḡ_1θ ∈ Ω. Thus g_2g_1 leaves Ω invariant and, obviously, $\overline{g_2g_1} = \bar{g}_2\bar{g}_1$. Similarly the reader may find it instructive to verify the other assertion. □

Let us now consider the problem of testing the hypothesis H_0 : θ ∈ Ω_{H_0} against the alternatives H_1 : θ ∈ Ω_{H_1}, where Ω_{H_0} and Ω_{H_1} are two disjoint subsets of Ω. Let G be a group of transformations which operate from the left on X, satisfying conditions (i) and (ii) above and (2.14). In what follows we will assume that g ∈ G is measurable, which ensures that whenever X is a random variable then gX is also a random variable.

Definition 2.2.2. (Invariance of statistical problem). The problem of testing H_0 : θ ∈ Ω_{H_0} against the alternative H_1 : θ ∈ Ω_{H_1} remains invariant under a group of transformations G operating from the left on X if

(i) for g ∈ G, A ∈ A, P_θ(A) = P_{ḡθ}(gA);

(ii) ḡΩ_{H_0} = Ω_{H_0}, ḡΩ_{H_1} = Ω_{H_1} for all g ∈ G.

2.3. Invariance of Statistical Tests and Maximal Invariant

Let (X, A, λ) be a measure space. Suppose p_θ, θ ∈ Ω, is the probability density function of the random variable X ∈ X with respect to the measure λ. Consider a group G operating on X from the left and suppose that the measure λ is invariant with respect to G, that is,

λ(gA) = λ(A) for all A ∈ A, g ∈ G.

Let (Y, C) be a measurable space and suppose T(X) is a measurable mapping from X into Y.

Definition 2.3.1. (Invariant function). T(X) is an invariant function on X under G operating from the left on X if T(X) = T(gX) for all g ∈ G, X ∈ X.


Definition 2.3.2. (Maximal invariant). T(X) is a maximal invariant function on X under G if T(X) is invariant under G and T(X) = T(Y) for X, Y ∈ X implies that there exists g ∈ G such that Y = gX.

If a problem of testing H_0 : θ ∈ Ω_{H_0} against the alternatives H_1 : θ ∈ Ω_{H_1} remains invariant under the group of transformations G, it is natural to restrict attention to statistical tests which are also invariant under G. Let φ(X), X ∈ X, be a statistical test for testing H_0 against H_1; that is, φ(x) is the probability of rejecting H_0 when x is the observed sample point. If φ is invariant under G then φ must satisfy

$$\varphi(x) = \varphi(gx)\,, \quad x \in X\,,\ g \in G\,. \tag{2.16}$$

A useful characterization of an invariant test φ(X) in terms of the maximal invariant T(X) on X is given in the following theorem.

Theorem 2.3.1. Let T(X) be a maximal invariant on X under G. A test φ(X) is invariant under the group G if and only if there exists a function h for which φ(X) = h(T(X)).

Proof. Let φ(X) = h(T(X)). Obviously,

φ(x) = φ(g(x)), g ∈ G, x ∈ X.

Conversely, if φ(x) is invariant and if T(x) = T(y), x, y ∈ X, then there exists a g ∈ G such that y = gx and therefore φ(x) = φ(y). □

In general, if φ is invariant and measurable, then φ(x) = h(T(x)), but h may not be a Borel measurable function. However, if the range of the maximal invariant T(X) is Euclidean and T is Borel measurable, then A_I = {A : A ∈ A and gA = A for all g ∈ G} is the smallest σ-field with respect to which T is measurable, and φ is invariant if and only if φ(x) = h(T(x)) where h is a Borel measurable function. This result is essentially due to Blackwell (1956).

Let Ḡ be the group of transformations on Ω induced by the group G. Let ν(θ), θ ∈ Ω, be a maximal invariant on Ω under Ḡ.

T h e o r e m 2.3.2 . The distribution of T(X) [or any function of T(X)) depends on f! only through i/{9).

Proof. Suppose v{9i) = v(8' 2 ),8\,8i £  fi. Then there exists a g £ G such

that f?2 — § # i . Now for any measurable set C 

Page 37: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 37/176

28 Group Invariance in Statistical Inference

 P tl (T(X) £ C) = P 9i{T{gX)) G C)

= P- g9l (T(X) E C )

2 .4 . S o m e E x a m p l e s o f   M a x i m a l I n v a r i a n t s

E x a m p l e 2.4.1. Let X = {Xu • - • . * n ) ' £ * and let G be the group of 

translations gc{X) = (Xi + C,.. .,X  a + C)',-oo < C < oo.

A maximal invariant on X  under G is T{X) = (X\,—X  n ,.. — X  n ).

Obviously, it is invariant. Suppose x — (x\,... , £ „ ) ' , y = ( j / i , . . . ,y n )' and let

T(x) = T(y). Writing C  — y n—

x n we get yi = Xi + C  for i—

1 , . . . , n.

Another maxim al invariant is the redundant (X\ — X,..., X  n — X) where

 X  = i 51™ X j . It may be further observed that if n — 1 there is no maximal

invariant. The whole space forms a single orbit in the sense that given any two

points in the space there exists C  G G transforming one point into the other.

In other words G acts transitively on X.

E x a m p l e 2.4. 2. Le t A

1

be a Euclidean n-space, and let G be the group of scale changes, that is, j? a G G,

E x a m p l e 2.4.3. Let X = X Y  x X  2 where Xi is the p-dimensional Euclideanspace and X  2 is the set of all p x p symmetric positive definite matrices. Let

 XeX uS€X  2 . Write

where X  [t) is d; x 1 and 5 { i i ) is a\ x d; such that £ i U < * i = P- Let G be

the group of all nonsingular lower triangular pxp matrices in the block form,

S G G

 9 a(X) = (aX 1 ,...,aX  n )', a > 0.

 X  - ( X 1 i ) , . . . , X ( t ) ) ' ,

Page 38: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 38/176

 Invariance 29

9 =

(9(H) 0

3(21) 3(22)

0 >

0

operating as

Let

where

 \9(ki) 9(k2) ••• 9{kk)/ 

(X,S)->(gX,gSg').

 f{X,S) = {Rl...,Rl),

£ ^ - % S ® % * = 1 . - . * (2-17)

Obviously, R* > 0. Let us show that Rj,..., if £ is a maximal invariant on X 

under G. We shall prove it for the case k —2, The general case can be proved

in a similar fashion.

Let us first observe the following:

(i ) I f  X -* gX, S - » <?5g' then - * 9 ( n ) X ( i ) , 5(u) - t 9 ( i i )S (n)0( U ) .

Thus ( i f j . f i f   + R%) is invariant under G.

(ii) A " ' 5  _ 1 X - X J ' J J S J " ! , ^ ! ) + ( X ( 2 ) - 5(ai)5(n)^"(i))'

( 5 ( 2 2 ) - S ( 2 1 ) 5 ( ^ ) 5 ( i 2 ) ) " 1 ( X ( 2 ) - S(21)S l̂  \ ) X( 1) ) .

Now, suppose that for X,Y  e X\\ S,T  G A*2

( x ( ' 1 1 5 - ! ) x ( 1 ) , x ' 5 - 1 x ) - ( r ( ' 1 1 T - ; l y [ 1 ) , y ' r - 1 y ) , (2.18)

where K, T are similarly partitioned as X, S. We must now show that there

exists a.g eG such that X = gY, S = gTg'. Let us first choose

 g = '0

with

3(21) 3(22)

3(11) =

3(22) ~ (S(22) ~ 5 ( 2 1 ) S ( ^ ) ) S ( 1 2 ) ) " " 2

9(12) = 9(22)5'(21)S (" 11 ) .

Page 39: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 39/176

30 Group Invariance in Statistical  Inference

Th e n gSg' — I. Similarly choose

 h = ( h { 1 1 ) 0

 \ f y 2 l ) /l(22)

such that hTh! = I. Since

implies

we conclude that there exists an orthogonal matrix 0\ or order <ii x di suchthat

Since

X ' S - ' X = Y'T~l 

Y,

 s - 1

= s's,

T- 1 = h%

(fl(ii)-^(i))'(»(ii)^(i)) = (fc(u)%))'(tyii)%))>

then

I|S(21)-^(1) +9(22 |AT (2 ) | |2 = 11/1(211^(1) + ^(22)^(2) IP

so that there exists an orthogonal matrix 0 2 of order d  2 x d 2 such that

(9(21)X(i) + 9 ( 2 2 ) X ( 2 ) ) = 02( /l ( 2i)y"(i) +/l(22)V(2)) .

Now let

=f 0, 0 \ 

I o o 2 Jr -

Then, obviously,

 X  = g^ThY 

= aY,

where a £ G and

 gSg' = rhTh'V,

Page 40: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 40/176

 Invariance 31

so that

S = aTo'

Hence (R J . f l J + i i j ) and equivalently (J?J, Ji^) * s a maxima! invariant on x

under G.

Note that if  G is the group of  p x p nonsingular upper triangular matrices

in the block form, that is, if g € G

 9 =

( 9{ii) S(i2) ' • ' 9(i*) \  

0 9(22) • ' • 9(2*)

V 0 0 •• • 9{kk)/ 

where is di x d{, then a maximal invariant on X  under G is

( i r , . . . , n )

defined by

 /  % ) • • •  % , ]- l

Y^T* = (Xti ) ,,..,X  m }' (X(i),... ,X w )i = 1, . . . ,k

2 .5 . D i s t r i b u t i o n o f M a x i m a l I n v a r i a n t

Let (X,A,X) be a measure space and suppose pg,9 £ 0, is a probability

density function w ith respect to A. Co nsider a group G operating from the left

on X  and suppose that A is a left invariant measure,

 \{A) = X(gA), g £ G, A £ A.

Let [yX) be a measurable space and suppose that T{X) is a measurable

mapping such that T  is a maximal invariant on X  under G. Let A* be a

measure on {y,C) indu ced by A, such that

 X'(C) = X(T-1(C)), CeC 

Also let Pi be the probability measure on (y,C) defined by

 Pt{C) = j Pe(x)dX(x)!

 JT-i(c)

Page 41: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 41/176

32 Group /n variance in Statistical  Inference

so that P 2 has a density function p" with respect A*. Hence

 /  p'{T)d\'{T) = j p(x)dX{x),C€C 

A s s u m p t i o n . There exists an invariant probability measure p on the

group G under G.

L e m m a 2.5.1. The density function p* of the maximal invariant T  with

 respect to A* is given by

 P*(T) = J  P(gx)dp(g). (2.19)

Proof . Le t / be any measurable function on (y,C}- Then

 j f(T)p'(T)d\-(T) = J  f(T(x))p(x)d\'x).

However,

 j f(T(x)) J  p{g(x))dft(g)dX(x)

= j j HT(x))p(g(x))d\(x)du(g)

= J J  nnx^pigix^dHgix^g)

= j f(T{y)) P{y)d\{y).

Then , since the function

 /  p(g(x))dfi{g) JG

is invariant, we have

 p*{T) = / p(g(x))M9) • Ja

Page 42: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 42/176

 Invariance 33

2.5.1. Existence of an invaraint probability measure on O(p)

(group of p x p orthogonal matrices).

Let X = ( X i , . . . , X P ) be a p x p random matrix where the column vec

tors Xi (p-vectors) are independent and are identically distributed nor mal

(p-vectors) with mean vector 0 and covarian ce matrix / (identity). Ap ply theGram-Schmidt orthogonalization process to the columns of X to obtain the

random orthogonal matrix Y  = (¥%,...., Y  p ), where

Yi- g , i = i  t . . . , p .

Define Y = T(X) and let G = 0(p). Obviously,

 gY  = (gY u ...,gY v )

= T(gX u ...,gX  p )

=  T(X;,...,X;),

where

 X;=gXi, i = l , . . . . P .

Since X ; , X " have the same distribution and Xi,...,X* are independently

normally distributed with mean 0 and covariance matrix / , we obtain Y  and

 gY  have the same distribution for each g £ 0(p). Thus the probability measure

of  Y  defined by

 P[Y  eC) = P(X  eT~ L(C))

is invariant under 0{p),

2 . 6 . A p p l i c a t i o n s

E x a m p l e 2.6.1. ( N o n - c e n t r a l W i s h a r t d i s t r i b u t i o n ) . L e t X =

( X ! , . . . , X n ) (each X* is a p-vector) be a p x n matrix (n > p) where each

column Xi is norm ally distributed with mean vector / i ; and covariance matr ix

 /  and is independent of the remaining columns of X . T h e probability density

function of X with respect to the pn-dimensional Lebesgue measure is

i&'t$ = ( 2 n - ) n p / 2 e x p { ~ i t t x x ' - | * i W * ' + t *a v i' j (2-19)

Page 43: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 43/176

34 Group Invariance in Statistical In/erence

where

 fi = (ft , . , . . . , Up).

It is well known that

 n

iml 

is a maximal invariant under the transformation

 X  - XT, r e 0{n),

where O (n ) is the group of orthogonal matrices of dimension n x n . Le t p(s, p.)

be the probability density function of 5 at the parameter point p. with respect to the induced measure dX'{S) as defined in Lemma 2.5.1. T o find the

probability density function of 5 with respect to d  we first find dX'{s). From

(2.13)

 p(s,0  )dX'(s) = C(det s ) ( " - p - , ) / 2 e x p | - | t r s J da, (2.20)

By (2.18)

 P(s,o)= f  P ( 2 r r , o ) d ^ ( r )

 Join)

- fH/ 0„.,d

*( r )

r e x p i - ^ t r s l ,(2ff)»f/  a P  \ 2

where /i is the invariant probability measure on O( n ) . H ence,

dA*(s) = (2T) n " / 2 C ( d e t »f*+-W*it .

Again by (2.18)

p{a,u) = exp | -1 tr (s 4- J V ) J jf  ^ exp j - j t r acT**' J dp{T)

= ^ P | - g tr(s+ W 0 j p ) ,

Page 44: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 44/176

 Invariance 35

where

 h(x,fi)= j exp J — - t r a d ? a ' \ du(T) Jot,n) I 2 J

It is evident tha t

 h(x,u) = h(xT,p.<B),

where T, * e 0 {n). Thus we can write h(x,p) as

The probability density function of  S with respect to the Lebesgue measure

 ds is given by

p( s ,^)dA*(s ) =p( s ,^) (27r ) n p / 2 C(det s ) ( " - ' ' - 1 >  / 2 d s .

This is called the probability density function of the noncentral Wishart dis

tribution with non-centrality parameter fia'.

E x a m p l e 2 . 6 .2 . ( N o n - c e n t r a l d i s t r i b u t i o n o f t h e c h a r a c t e r i s t i c

roots) . Let S of dimension p x p be distributed as central Wishart random

variable with parameter £ and degrees of freedom n > p. Let Ri > R 2 >

••• > R p > 0 be the roots of characteristic equation de t(S — XI) — 0, and

let P denote a diagonal matrix with diagonal elements R i , . . . ,R P . Denote by

p(i?, S) the probability density function of  R at the parameter point E with

respect to induce measure dX'(R) of  Lemma 2.5.1. We know [see, for example,

Giri(1977)]

 / p v ( n - p - l ) / 2 . p .

 p(R,I)dX-(R) = C 1 {U^J  e x p | 4 X > j *ll(&-Rj)dR

where dR stands for the Lebesgue measure and C\ is the universal constant.

It may be checked that R is a ma xima l invariant under the transformation

s  r s r , r e o (p).

By (2.18)

 p(R, I) = C  2 (j[ R^j exp | - \ J £ Ri | ^ dp(T).

Page 45: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 45/176

36 Group Invariance in Statistical  Inference

So

 dX'(R) = ^-'[[(R1-R ]  )dR.

Again by (2.18)

 /  p \ ( n - p - l ) / 2

p ( R , E ) = (det S " 1 ) " / 2 (n )̂

( n - p - l ) / 2

 /  exp J --tr 0 -l 

TRr'\ da(T) Jotv) I  2 J

x'O(p)

where t9 is a diagonal matrix whose diagonal elements are the characterist ic

roots of £ . The non-central probability density function of  R depends on £

only through its characteristic roots.

2 .7. T h e Di s t r i bu t i on of a M a x i m a l Invar iant in the

G e n e r a l C a s e

Invariant tests depend on the observations only through the maximal invari

ant. To find the optimum test we need to find the explicit form of the maximal

invariant statist ic and its distribut ion. For many multivariate testing problems

it is not always convenient to find the explicit form of the m axi ma l invaraint .

Stein (1956) gave the following representation of the ratio of probability den

sities of a maximal invariant with respect to a group G of transformations g

leaving the testing problem invariant.

T h e o r e m 2.7.1. Let G be a group operating from the left on a topological 

 space (X, A) and \ a measure in x which is left invariant under G (G is

 not necessarily transitive on X). Assume that there are two given probability

 densities pi,p 2 with respect to A such that

Pi(A)  =  J AP2(x  )d\(  x ),

Page 46: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 46/176

 Invariance 37

 for A € A and Pi and  P% are absolutely continuous. Suppose T : X  —* X is a

where p, is a left invariant Haar measure on G. An often used  alternative form

 of  (2.21) is given by

where f is the probability density with respect to a relatively invariant measure

with multiplier X(g).

Stein (1956) gave the statements of Th eo re m 2.7.1. without stating explic

itly the conditions under which (2.21) holds. However this representation was

used by Giri (1961, 1964, 1965, 1971) and Schwartz (1967). Schwartz (1967)

also gave a set of conditions (ra ther complicated) which must be satisfied for

Stein's representation to be valid. Wijsman (1967, 1969) gave a sufficient con

dition for (2.21) using the concept of  Cartan G-space. K oe hn (1970) gave

a generalization of the results of  Wijsman (1967). Bonde r (1976) gave some

conditions for (2.21) through topological arguments. Anderson (1982) has ob

tained some results for the validity of (2.21) in terms of the proper action of the

group. Wijsman (1985) studied the properness of several groups of transfor

mations used in several multivariate testing problems. Subsequently Wijsman

(1986) investigated the global cross-section technique for factorization of mea

sures and applied it to the representation of the ratio of densities of a maximal

invariant.

2. 8. A n I m p o r t a n t M u l t i v a r i a t e D i s t r i b u t i o n

Let Xi,..., XN  be N  independently, identically distributed p-dimensional

normal random variables with mean £ = ( £ , . . . , £p) ' and covariance matrix £

(positive definite). Write

 maximat invariant and  p" is the distribution of T(X} when X has distribution

 Pi. Then under certain conditions

(2.21)

(2.21a)

N  N 

x =

w Ex

' - s = jyi x i -x%x i -xf.

Page 47: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 47/176

38 Group Invariance in Statistical  Inference

It is well-known that X,S are independent in distribution and X  has

p-dimensional normal distribution with mean £ and covariance matrix (1/jV)

E and S has central Wishart distribution with parameter E and degrees of 

freedom n — N  — 1. Throughout this section we will assume that N  > p so

that S is positive definite almost everywhere. Write for any p-vector b,

where fofl is the d; x 1 subvector of  b such that £j d* — p and for any p x p

matrix A

A ( n ) ••• A ( l f c ) N

, A ( A i ) • • • A(kk) j

 Am =

' A n ) ••• A

(n)

where is a di x dj submatrix of A. Let us define Ri,... ,R k and ft,...,5*

by

t

i

E ^ = ^w2r7,̂ W, t = l,2,...,fc.

T h e o r e m 2.8.1. TTte joint probability density function of R% t . ..,R k is

 given by (for Ri > 0, £ * = i fl^ < y

 m « w - r ( w / 2 )

* / k \  [ ( W - P ) / 2 ] - l

x T j £ ( *  / 2 W l - E ^ J

n / J V - <5,_i dj JL& \ 

Page 48: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 48/176

 Invariance 39

where a — S i = i dj w ^ a o —0 and tp(a, b : x) is the confluent kypergeometric

 series given by

 a a{a + 1 ) x 2 a(a + l )(a + 2) x 3

+ b X + b(b + 1) 2! + 6(6 + l)(b + 2) 3!" + " ' '

Furthermore, the marg ina l probability density function of  Ri,...,Rj,j < k

can be obtained from (2.22) by replacing A; by j and p by Y J[=I  hi

proof . We will first prove the theorem for the case k — 2 and then use

this result for the case k — 3. T h e proof for the general case will follow from

these cases. Fo r the case k = 2 consider the random variables

W = 5(22) - 5( 21 )5 ^, 5( 12 ) ,

f  = % , ) .

It is well-known [see, for example, Stein (1956), or Giri( l977) , p. 154] that

(i ) V is distributed as Wishart W(N  - 1, E ( U ) ) ;

(ii) W is distributed as Wishart W{N - 1 - di, E [ 2 2 ) - S ^ i j S m j E ^ ) ) ;

(hi) the conditional distribution of L given V  is multivariate normal with mean

£ ( 2 1 ) 2 ^ , V - 1 /  2 and covariance matrix

( £ ( 2 2 , - E ( 21) S ^ 1 i ) S ( 12 ) } ®/  ( ( 1 ,

where ® is the tensor product of two matrices £ (22 )— ^(21)^(11)^(12) and the

 di x d, identity matrix I  a

 ,•

(iv) W  is independent of  (L,V).

Let us now define W\, W  2 by

W  2(l + W  L ) = N(X {2) - S  m S ^  t i ) T ( S  m - 5 ( 3 i ) 5 ( - 1 ) 5 ( i 2 ) ) - 1 (2.24)

x ( X ( 2 ) - 5 ( 2 i , 5 ^ , A " ( 1 ) ) .

Since X  is independent of 5, the conditional distribution oi\fNX ̂ 2)  given 5 ( n j

and X(i) is normal with mean \/~N {£ (2)  + ^<2i )^( i i )£ ( i ) ) and covariance matrix

Page 49: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 49/176

40 Group Invariance in Statistical  Inference

£{22> ~ E{8 i>Er i i jE ( i 4> 1 1 thereupon follows that the conditional distribution

of  y/NXt2) given 5 ( l l , and y/NX^f  is independent of the conditional distri

bution of 

given Sni) and Xay. Further more, the conditional distribution of  VWS(i2)

5(~!)^{ i ) give 1 1 s ( n ) a n ( i i s normal with mean V W E ( a i ) S ^ j j ^ i ) and

covariance matrix

Hence the conditional distribution of 

Viv" ( x < 2 ) - ( i + W M _ 1 / 2

given 5( H ) a n d -^(i) ' s normal with mean

(C(2) - S ( 2 i ) S ^ i ^ ( i ) ) ( i + Wi)-1? 2

and covariance matrix

£ ( 22 ) - ^(21) £ ( i i ) £ ( 1 2 ) -

From Giri (1977) it follows that the conditional distribution of  W% given X^

and S(u) is

 p(W  2 \X (1) ,S tn) ) = xl [Hi + W i ) " 1 ) /xl-dl-dl , (2.25)

where Xn($) denotes the noncentral chisquare random variable with noncen-

trality parameter 6  and with n degrees of freedom and Xn denotes the central

chisquare random variable with n degrees of fredom. Since this conditional

distribution depends on Xh-t and Sti\) only through Wi, we get the joint

probability density of  WuWz as

 p{W  2

 ,Wi) =X 

2

2

(6  2

(l  + W i ) -

l

) / x N 

.d l

. ^ (2.26)

and furthermore it is well known that the marginal probability density function

of  Wi is

 P(Wi) = X  2 dl (Si)/x N -ai • (2.27)

Page 50: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 50/176

 Invariance 41

Hence, we have

= 0 jSI f l + i f f t ^ ^ H *

r(* + /?)r(V) 1 2 JJ3 j !

From Lemma 2.8.1. below,

W  2 = {(Ri + R2  ){1  - R i - R a ) " 1

~H i (l- Ri)~1 }

x (1 + Hj.{l - H i ) " 1 ) " 1

= fi2(l-fli-fi2)"1.

(2.28)

(2.29)

From (2.28) and (2.29) the probability density function of  Ri and R 2 is

given by

x R$ 4I -1 BI*'-1(I-RT - nriW-Jh*

x exp | - - ( f t +6 2) + rfefiji

2

x n ^ i ( J V - f f l _ 0 , f ; i i U , ) (2.30)

We now consider the case k = 3. Write

W 3 - ( N X ' S "1

: ? - N%S^X{  3 y) ( l +

It is easy to conclude that

• 1

5(33, - 5[32jS [ 2 2 ] S l 2 3 j

Page 51: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 51/176

42 Group /n variance in Statistical Inference

is distributed as Wishart

w(N-l-d-!- da,E(33) - E[33]E^jE(23]j ,

and is independent of S( 22) and S[ S2-• Furthermore, the conditional distribution

of  \ / iVX(3) given S[  22 ]<X [2]  is normal with mean

(fo) - E|32]S|2 2 J(-?[2] - £[2]))

and covariance matrix

E(33) - E[32]Ep2] S[23] .

and is independent of the conditional distribution of 

^S

 [M  ] S 

\ ~22) X 

given S[Z2] and X p ] , which is normal with mean

and covariance matrix

NX 

[~2 

2) 

m  (

S

[33] -

S

(32]

S

[22j

S

[23]) -Thus the conditional distribution of  W  3 given Sp2j and X[ 2 j or, equivalently

given (Wi,W  2 ), is

viW^W^Wy) = X %$41 + WiftWi + W  2 + WiWsT 1 

 ) 

 Ixfu-dy-it-di- (2-31)

Hence the joint probability density function of , W  2 , W  3 is given by

 p(Wi,W  2 ,W  3 ) - — ^ 5 '-

 XN-di-di-dz

 X.N-d,-d? Aiv-fl,

From Le mma 2.8.1. below,

W 1=R1 (1 -Rl  )-

1 ,

W  a = R i ( l - R l  - R  2  ) -1

 ,

W  3 = {{R, +R 2 + R 3 )(l  - R , - R  2 - ^ p 1 - (R, + fl2)

 x (1 - R, - R^-'Xl  - R, - R 2 )

= R 3  ( . 1 - R l - R  3 - R  a ) - 1 . (2.33)

Page 52: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 52/176

 Invariance 43

From (2.32)-(2.33) the joint probability density function of i f , , i f 2 , # 3 is

given by

i=l V i=l

x n ^ i t i V - ^ f ; ^ ) . (2.34)

Proceeding exactly in the above fashion we get (2.22) for general k. Since

the marginal distribution of JEui is normal with mean &a and covariance matrix

S[iij it follows that given the joint probability density function of i f i f ^ ,

the marginal distribution of i f , , . . . , i f j , 1 < j < p can be obtained from (2.22)

by replacing k by j . •

The following lemma will show that there is one-to-one correspondence

between ( f i i , . . . , i ft ) and (if J , . . . , if£) as defined earlier in this chapter.

L e m m a 2.8.1.

 NX'S^X  NX'(S + NXX')-1 X =

1 + NX'S-i-X 

Proof. Let

{S + NXX 1

 )-1 = S - ' +A.

Hence,

 I  + NXX'S-1 +(S + NXX')A = I.

So we get

(S + NXX')-1 = S " 1 - N(S + NXX'^XX'S'1

Now

 NX'(S + NXX')-^ = NX'S^X - NX'(S + NXXT'XiX'S^X).

Page 53: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 53/176

44 Group Invariance in Statistical  Inference

Therefore,

 NX' S~1 X  NX><a+ NXXT*X)=1+NX  , s_lx

Note that

5 = i j=i V J = I

2 .9 . Almost Invar iance , Suff ic i ency and In va r i an ce

Def in i t ion 2.9.1. Let (A" ,A ,P ) be a measure space. A test ih{X),X  G

A" is said to be equivalent to an invariant test with respect to the group of 

transformations G, if there exists an invariant test ip(x) with respect to G

such that

ib(x) = y{x),

for all x G X  except possibly for a subset TV of  A" of probability measure zero.

Def in i t ion 2 .9 .2 . Al mo st inva r ian ce . A test ip{x), x G X, is said to be

almost invariant with respect to the group of transformations G, if 

<p(x) = tp(gx) for all x G X  — J\f  g ,

where N  g is a subset of  x depending on g of probability measure zero.

It is tempting to conjecture that if tp(x) is almost invariant then it is equiv

alent to an invariant test i>{x). Obviously, any ifi(x) which is equivaleant to an

invariant test is almost invariant. For, take N  g — N  U g-1 N. If x £ N  g , then

 x & N, and gx $ N  so that

 tp(x) = y(gx) = ih(gx) = ii>{x).

Conversely, if G is countable or finite then given an almost invariant test <p(x)

we define

 N  = 1J N  g

,

where N  g = {x G x '• f{x) # f{gx)} and AT has probability measure zero.

The uncountable G presents difficulties. Such examples are not rare where

an almost invariant test can be different from an invariant test.

Page 54: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 54/176

 Invariance 45

Let G be a group of transformations operating on X  and let, A,B, be the

d-field of subsets of  X  and G respectively such that for any A £ A the set of 

pairs (x, g) for which gx £ A is measurable Ax B. Suppose further that there

exists a tr-finite measure v on G such that for all B £  B,v(B) = 0 implies

v{Bg) = 0 for all g £ G . Then any measurable almost invariant test functionwith respect to G is equivalent to an invariant test under G.

For a proof of this fact the reader is referred to Lehmann (1959). The

requirement that for all g £ G and B £  B,v(B) — 0 imply v(Bg) = 0 is

satisfied in par tic ula r when f{B) — f{Bg) for all g £ G and B £ B. Such a

right invariant measure exists for a large number of groups.

Let A\ = {A : A £ A and gA —A for all g £ G } and let .4 S be the sufficient

o--field of  X. Le t G be a group, leaving a testing problem invariant , be suchthat gA„ = A s for each g £ G . If we first reduce the problem by sufficiency,

getting the measure space (X, A s , P ) , and subsequently by invariance then we

arrive at the measure space (X,A si, P) where A si is the c lass of all invariant

 A s measurable sets. T he following theorem which is essentially due to Ste in

and Cox provides an answer to our questions under certain conditions. T h e

proof is given by Hall , Wijsman and Ghosh (1965).

T h e o r e m 2.9.1. Let [X, A, P ) be a measure space and let G be a group of 

 measurable transformations leaving P invariant. Let A s be a sufficient subfield 

 for P on (X, A) such that gA s — A s for each g £ G . Suppose that for each

 almost invariant A 3 measurable function <p — tb except possibly on a set N of 

 probability measure zero. Then A s is sufficient for P on (X,A).

Thus under conditions of the theorem if we reduce by sufficiency and in

variance then the order in which it is done is immateria l.

2 .1 0. I n v a r i a n c e , T y p e D a n d E R e g i o n s .

The notion of a type D or type E region is due to Issacson (1951). Kiefer

(1958) showed that the P- te st of the univariate general linear hypothesis pos

sesses this property. Suppose, for a parametric space ft — {(#,i?) : 6  £ ft',n €

 H]  with associated distributions, with ft' a Euclidean set, that every test func

tion <fr has a power function 0${$,n} which, for each n, is twice continuouslydifferentiable in the components of  0 at 0 = 0, an interior point of ft'. Let Q a

be the class of locally strictly unbiased level a tests of Ho : $ — 0 against the

alternatives H\:0^O. Our assumption on implies that all tests in Q a are

similar and that

Page 55: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 55/176

46 Group Invariance in Statistical Inference

for all 4> in Q a .

Let be the matrix of second derivates of 0$(9, v) with respect to the

components of $ (which is also popularly known as the Gaussian curvature ) at6 = 0. Define

 AM = d e t ( / V " ) ) -

Assumption. A^ (T J ) > 0 for all v E H  and for at least one d>' 6 Q a .

D e f i n i t i o n 2 .10 .1. ( T y p e E an d T y p e D Te s t s ) . A test 4>' in Q a is

said to be of type E if 

A ^ ( n ) - m a x ^ e Q = A ^ ( n )

for all V € H. If i f is a single point, d>~ is said to be of type D.

Lehmann (1959a) showed that, in finding regions which are of type D,

invariance could be invoked in the manner of Hu nt -S te in theorem. (Theorem

5.OA); and that this could also be done for type E regions (if they exist)

provided that one works with a group which operates on H  as the identity.

Suppose that our problem is invariant under a group of transformations for

G which Hunt-Stein theorem holds and which acts trivially on fi', such that

for g e G

If  (fig is the test function defined by

 tpg(x) = <j>(gx),

then a trivial computation shows that

A * 9 ( n ) = A ^ o n )

and hence

A ( n ) = A{gr))

where

 A(n) = m a x ^ ^ A ^ n ) .

Also, if 0 is better than 4>' in the sense of either type D or type E, then tpg is

clearly better than <p'g.

Page 56: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 56/176

 Invariance 47

E x e r c i s e s

1. Let Xi,..., X  n be independently and identically distributed normal random

variables with mean a and variance (unknown) I T 2 

. Using Theorem 2.7.1.

find the U M P invariant test of Ho : p, = 0 against the alternative Hi : u ^ 0

with respect to the group of  transformations which transform each Xi —*

 cXi,c±Q.

2. Let {Pg,8 e 11} be a family of  distributions on (X,A) such that P$ has a

density p(-j8) with respect to some a-finite measure u. For testing Ho • 0 € 

flrj against Hj : 9 G Hi with 120

 n P i = 0 ( n u ' l s e t ) . t n e likelihood ratio test

rejects Ho whenever X(x) is less than a constant, where,

sup p(x\9)

 X(x) = 9€.n0

sup p(x|S)

 9 e flj u Hi

Let the problem of  testing r/o against i i ] be invariant under a group G of 

transformations g transforming x —* g( x), g 6 G, i G A". Assume that

 p(x\9)=p(9x\g8)X(9)

for some multiplier X. Show that A( x) is an invariant function.

3. Prove (2.8) and (2.25).

4. Prove Theorem 2.9.1.

References

1. S. A, Anderson, Distribution of maximal invariants

usingquotient

measures,Ann. Statist. 10, pp. 955-961 (1982).

2. D. Blacltwell, On a class of probability spaces, Proc. Berkeley Symp. Math.

Statist. Probability. 3rd, Univ. of California Press, Berkeley, California, 1956.

3. J. V. Bonder, Borel cross-sections and  maximal  invariants, Ann . Statist. 4,

pp. 866-877 (1976).

4. T. Ferguson, Mathematical  Statisties, Academic Press, 1969.

5. N. Giri, On the likelihood  ratio test of a normal  multivariate testing problems;

Ph.D. Thesis, Stanford University, California, 1961.

6. N. Giri, On the likelihood  ratio test of a normal multivariate testing problems,Ann. Math. Statist. 35, pp. 181-189 (1964).

7. N. Giri, On the likelihood  ratio test of a normal  multivariate testing problem-II,

Ann. Math. Statist. 36, pp. 1061-1065 (1965).

8. N. Giri, On the distribution of a multivariate statistic, Sankhya, 33, pp. 207-210

(1971).

Page 57: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 57/176

48 Group Invariance in Statistical  Inference

9. N. Giri, Multivariate Statistical  Inference, Academic Press, N.Y., 1977.

10. S, L , Isaacson, On the theory of unbiased  tests of simple statistical hypotheses

 specifying the values of two or more parameters, Ann, Math. Statist. 22, pp. 217¬

234 (1951).

11. A. Haar, Der Massbegriff in der theorie der kontinvjierchen gruppen, Annals of Mathematics, 34 (1933), pp. 147-169.

12. W. J . Hail, R. A. Wijsman, and J . K. Ghosh, The relationship between sufficiency

 and  invariance with application in sequential analysis, Ann. Math. Statist. 36,

pp. 575-614 (1965).

13. P. R. Halmos, Finite Dimensional Vector Space, D. Van Nostrand Company,

Inc. Princeton, 1958.

14. J . Kiefer, On the nonrandomized  optimatity and randomized nonoptimality of 

 symmetrical designs, Ann. Math. Statist. 29, pp. 675-699 (1958).

15. U. Koehn, Global cross sections and the densities of maximal invariants, Ann.

Math. Statist. 41, pp. 2045-2056 (1970).

16. E . L . Lehmann, Testing of Statistical  Hypotheseses, John Wiley, N.Y., 1959.

17. E . L. Lehmann, Optimum invariant tests, Ann. Math. Statist. 30, pp. 881-884

(1959a).

18. L, Loomis, An Introduction to Abstract Hermonic Analysis, D. Van Nostrand

Company, Inc., Princeton, 1948.

19. L. Nachbin, The Haar Integral., D, Van Nostrand Company, Inc., Princeton,

1965.

20. R. Schwartz, Locally mimimax tests, Ann. Math. Statist., 38, pp 340-359, 1967.

21. C . Stein, Multivariate Analysis-I, Technical Report 42, Department of Statistics,

Standford University, 1956.

22. R. A. Wijsman, Cross-Section of orbits and their application to densities of max

imal invariants, Proc. Fifth. Berk. Symp. Math. Statist. Prob. 1. pp. 389-400

(1967) University of California, Berkeley.

23. R. A. Wijsman, General proof  of termination with probability one of invariant

 sequential probability ratio test based on multivariate observation, Ann. Math.

Statist. 38, pp. 8-24 (1969).

24. R. A. Wijsman, Distribution of maximal invariants, using global and local  cross-

 sections, Preprint, Univ. of Illinois at Urbana-Champaign, 1978.

25. R. A. Wijsman, Proper action in steps, with application to density ratios of max

imal invariants, Ann. Statist. 13, pp. 395-402 (1985).

26. R. A. Wijsman, Global cross sections as a tool for factorization of measures and 

 distribution of maximal invariants, Sankhya A, 48, pp. 1-42 (1986).

Page 58: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 58/176

Chapter 3

EQUIVARIANT ESTIMATION INCURVED MODELS

Let ( x , A ) be a measure space and f) be the set of points 8. Denote by

{Ps,8 £ Si], the set of probability distributions on (x, A ) . Let G be a group

of transformations operating from the left on \ such that p £  G , g ; x —* X 

is one to one onto (bijective) . Le t G be the corresponding group of induced

transformations g on ft.

Assume

(a) For 0i e 0, i = 1,2; 9i ? t9 2 ,P e , £ P B „

(b) P f l { A ) - P- ge(gA), A £  A,p € G,g e G.

Let A(0) be a max im al invariant on ft under G , and let

f!" = {0\9 £  fi.with A(<?) = A 0 } (3.1)

where Ao is known. We assume x to be the space of minimal sufficient statis tic

for 8.

Defini t ion 3 .1 . ( E q u i v a r i a n t Estimator) A point estimator 8(X),X  £ 

 X, a mapping from x —' ft, is equivariant if 8(gX) = g8(X),g £  G .

Note 1. For notational convenience we are taking G to the group of transformations on

e(X).

Note 2. The unique max imum likelihood estimator is equivariant.

Page 59: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 59/176

50 Group Invariance in Statistical  Inference

L et T(X),X  £ X' be a maximal invariant on x under G. Since the distri

bution of  T(X) depends on 0 € Q only through A(0) , given X($) = A 0 T ( X )

is an anc illary statistic . A n anc illary stat ist ic is defined to be a part of the

minimal sufficient statistic X  whose marginal distribution is parameter free.

In this chapter we consider the approach of finding the best equivariant es

timator in models admitting an anci llary statis tic. Such models are assumed

to be generated as an orbit under the induced group G on fi. Th e ancillary

statistic is realized as the maximal invariant on x under G. A model which

admits an ancil lary statist ic is referred to as a curved model. Models of this

nature are not uncommon in statistics. Fisher (1925) considered the problem

of estimating the mean of a Normal population with variance a 2 when the

coefficient of variation is known. The motivation behind this was based on theempirically observed fact that a standard deviation a often becomes large pro

portionally to a corresponding mean p so that the coefficient of variation a/p

remains constant. This fact is often present in mutually correlated multivariate

data. In the case of multivariate data , no well-accepted measure of variat ion

between a mean vector p and a covariance matrix E is available. Kariya , Giri

and Perron(1988) suggested the following multivariate version of the coefficent

of variation, (i) A = fl'SS-1

 /! (ii) v = S " 1 ^ where E = E ^ E 1 ' 2 with S 1 / 2

as the unique lower triangular matrix with positive diagonal elements.

In recent years Cox and Hinkley (1977), Efron (1978), Amari (1982 a,b)

among other have reconsidered the problem of estimating p, when the coefficent

of variation is known in the context of curved models.

3 .1 . Best E q u i v a r i a n t E s t i m a t i o n o f   fi wi t h A K n o w n

Let Xi,...,X  n (n > p) be independently and identically distr ibuted JVp

(p, E ) . We want to estimate a under the loss function

 L{n,d) = (d-n)'-L-1{d-p) (3.2)

when A = a'T,~l  p, — A 0 (known). Let

 n 7i

 nx = Y  ,x>,s = - x)(Xi -xy.

Then (X,S) is minimal sufficient for {u, E ) , JUX  is dis tributed independently

of  S as N  p(y/nu, S ) and S is distributed as Wishart Wp(n - 1, E ) with n — 1

degrees of freedom and parameter E . Under the loss function (3.2) this problem

remains invariant under the group of transformations Gj(p) of pxp nonsingular

Page 60: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 60/176

 Equivariant Estimation in Curved Models 51

matrices g transforming (X,S) —» (gX ,gSg'). The corresponding group G of 

induced transformations g on the parametric space ft transforms 8 = (p., £) —*

 g8 — (pp, o £ p ' ) . A maximal invariant invariant on the space of  (\/nX, S) is

T  2

= n(n- l J X ' S - ' X (3.3)

and the corresponding maximal invariant on the parametric space ft is A =

(i'T,~1u. Since the distribution of  T  2 depends on the parameters only through

A, given A — A o T 2 is an ancillary statistic . Since the loss function L[p,d) is

invariant under the group Gi(p) of transformations g transforming {X,S) —»

{gX,gSg') and ( « , S ) —• (gp,gT,g') we get for any equivariant estimator p. of 

 R(8,( l) = Ee(p-p)'^- ,(p-p,))

= Ee(gjl  - ga)'{g , £,g')~ 1{gii. - gp))

= Ee(p(gX) - 9${gWT\mm - gp)

= Ege(a{X) - g n)'{gZ^y\^{gX) - gp)

= * ( 5 M ) , (3-4)

for g € Gt{p),g is the induced transformation on E corresponding to g on x-

Since G|(p) acts transitively on E we conclude from (3.4) that the risk R(8,fi)

of any equivariant estimator ji is a constant for all 8 € ft.

Taking A 0 — 1, which we can do without any loss of generality and using

the fact that R(8, ii) is a constant we can choose p..= e = ( 1,0, . • . , Q) , E = I.

To find the best equivariant estimator which minimizes R(8,p) among all

equivariant estimators p. satisfying

 ji(gX,gSg')=gp(X,S)

we need to characterize p.. Let Gr{p) be the subgroup of G;(p( containing all

 p x p lower triangular matrices with positive diagonal elements. Since 5 is

positive definite with probability one we can write, S — W W ' , W £ Grip)-

Let

v = w~lY 

'Q = WT\

where Y = •JnX, \\ V \\ 2= V'V = T  2 /(n - 1).

T h e o r e m 3.1. If p, is an equivariant estimator of p, under Gi(p) then

 p(Y,S) = k(U)WQ (3.5)

Page 61: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 61/176

52 Group Invariance in Statistical Inference

where k is a measurable ofU = T  2 j(n — 1).

Proof. Since p is equivariant under Gi (p ) we get for g € Oi(p)

 gp(Y,S) = p.(gY,gSg'). (3.6)

Replace Y  by W _1Y, g by W  and S by I  in (3.6) to obtain

 p(Y,S) = Wp(V,I). (3.7)

Let O be an pxp orthogonal matrix with Q = V/ || V \\ as its first column.

Then fi(V,I) = p(00'V,00') = OH{VUeJ).

Since the columns of  O except the first one are arbitrary as far as they are

orthogonal to Q it is easy to claim that the components of  p.(\fUe,I) except

the first component p L(VU  e, / } are zero. Hence

 p{V,I) = Qji1{y/UeJ). •

The following theorem gives the best equivariant estimator (BEE) of p.

T h e o r e m 3.2 . Under the loss function (3.2) the unique BEE of p isp = k

(If) WQ where

 k(U) = E(Q'W'e\U)/E(Q'W'WQ\U). (3.8)

P r o o f . B y Theo rem 3.1 the risk function of an equivariant estimator p,

given Ao — 1, is

 R(6,p)^R((e1 I),,l)

= E(k(U)WQ - e)'(k(U)WQ - e).

Using the fact that U  is ancillary, a unique BEE is obtained as p = k(u)WQ

where k(u) minimizes the conditional risk given U 

 E{((k(U)WQ - e)(k{U)WQ - e)')\U\.

This implies k(U) is given by (3.8). •

Page 62: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 62/176

 Equivariant Estimation in Curved  Models 53

3.1.1. Maximum likelihood estimators

The maximum likelihood estimators ft and £ of  p and £ respectively under

the restriction An = I are obtained by maximizing

- | l o g (det £ ) - | tr 5 S "1

- | j £ - M J ' E - 1 ^ - M ) - (3.9)

with respect to p, £ where 7 is the Lagrange multiplier. Maximizing (3.9) we

obtain

 p—~ —, S - - 5 + 7 M ' ( 7 + C I . (310)n + 7 71

Using the condition J t ' E - l / i = 1 we get

 . (U-^U(4 + 5U)\ ^

(3.11)

The maximum likelihood estimator (mle) p is clearly equivariant and hence it

is dominated by jl  of Theorem 3.2 for any p. In the univariate case the mle is

71 4(3.12)

Thus (3.12) is the natural extension of (3.11). Hinkley (1977) investigated

some properties of this model associated with the Fisher information. Amari

(1982 a, 6) proposed through a geometric approach what he called the dual mle

which is also equivariant.

3.2. A Particular Case

As a particular case of A constant we consider now the case £ = J when

C  2 is known. Let Y = y/nX,W  = tr S. T h en (Y,W) is a sufficient statistic

for this problem, and i s distributed independently of  Y  as X ( n - i ) p -

We are interested here to estimate /t with respect to the loss function

(3.13)

Page 63: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 63/176

54 Group Invariance in Statistical  Inference

This problem remains invariant with respect to the group G = R+ x 0(p),R+

being the multiplicative group of positive reals and O(p) being the multiplica

tive group of p x p orthogonal matrices transforming

 Xi —t bTXi , t = 1 , . . . , n

 d -4. bTd,

^gf) a ~ n ,

where (6,T) € G with b e R+,T  g O(p). The transformation induced by G on

(Y, W) is given by

(Y,W) — (!>rY, f>2

W).

The theorem below gives the representation of an equivariant estimator

under G.

T h e o r e m 3.3. An estimator d(Y, W) is equivariant if and  only if there

exists a measurable function h ; R+ —* R such that

 d{Y,W) = h{^p}Y 

 for all  [Y,W)£R? x R+ .

Proof . I f   h is a measurable function from if+ —* R and d[Y, W) —

 h (^w~j" Y  t n e n clearly d  is equivariant under G . Converse ly if  d  is equiv

ariant under G , then

(VY  W\ d(Y,W) = b r d { — , ¥   ) (3.14)

for all T £ 0{p),Y € R p ,b > 0 , W  > 0. We may assume without any loss of 

generality that Y'Y  > 0.

L e t Y, W  be fixed and

- f t )where di is the first component of the p-vector d. Let A € O(p) be a fixed

p x p matrix such that

Y'A = (\\Y  | | , 0 , . . . , 0 )

Page 64: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 64/176

 Equivariant Estimation in Curved  Models 55

Let A be partitioned as A = ( A J . A J ) where A, = \\Y\\~1Y. Now choose

T  = {Ai,A 2 B) with B e 0(p-l) and b = \\Y\\. From (3.14) we get

 d(Y,W) = d 1 ^ ( l , 0 . . . 0 ) . ^ ) r .

+ \\Y\\A 3 Bd  a((l  t0...Q) t~) . (3.15)

Since (3.15) holds for any choice of  B € 0(p — 1) we must have then

 d(Y,w) = d l  ( ( i , o . . . o ) . ^ ) r . •

It may be easily verified that a maximal invariant under G in the space of 

sufficient statistic V = W~l Y'Y  and a corresponding maximal invariant in

the parametric space is

As the group acts transitively on the parametric space the risk function (using

(3.13))

fl(M) = £ „ ( £ ( M ) )

of any equivariant estimator d  is constant. Hence we can take p = po =

 [C, 0.. . 0)' . Thus the risk of any equivariant estimator d  can be written as

fl(po,d) = (L (p0 ,k y ) )

= Eu0(E(L(p n ,h(^-} r ) ) V  = *) •

To find a B E E we need to find the function ho satisfying

 EpQ(L(p0 ,h0(V)Y\V  = v)< Ep0{L{po,h{v)y)\V = v)

for all h : R+ —* R measurable functions and for all values v of  V. Since

 EpampoM^YW  = v)

= h 2

(v)Epo(Y'Y\V  =v)~ 2h(v)Epo(Y l  \V = v) + l 

where K = ( y l t . . . , y p ) \ we get

 EMY i \V  = v) f 3  . 6 1

 k o { v ) - Ep0(Y'Y\V  = v)- ( 3 ' 1 6 )

Page 65: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 65/176

56 Group Invariance in Statistical Inference

T h e o r e m 3.4. The BEE da(xlt .. .,X„;C) = d 0(Y,W) is given by

 d 0{Y,W)

' ™ rfjny+i+l) (n& ^

 nC  2 hr i^+i+l )v\ 2

V r t i " P + j + i ) /   / 

B Ct

V

 h hir+W \ 2 J .

(3.17)

where t — y ( l + ti)—1

Proof . The joint probability density function of  Y  and W  under the as

sumption fi — fi0 is

[ e x p { - ( C2

 / 2 ) ( j / y - 2 y 5 g i + n + I K ) } ^ ' " -1

' " -1

 / ( y > ) = { 2 " p /  2 ( C 2 ) " " p / 2 ( r ( i ) ) p r ( ( n - l )p /2 ) , if w > 0

0, otherwise.

Hence the joint probability density function of  Y, V  is given by

(• e x p{ - ( C 2 / 2 [ ( ( l + i Q W i , + w ] }

 / y „ ( „ „ j = 2 ^  2 ( C   2  ) - ^ \ r ^ m ( n - l )p /2 )

I 0, otherwise.

Using (3.16) we get (3.17).

C o r o l l a r y . If m - p(n - l ) / 2 is an integer, the B E E is given by

 n <" 2

 do(Y,W)=T ^h0(v) = g(t)X,

with <j(t) = p(t)/w(t) where

m+l

u(t) = y ,

 m + l 

 m+ 1

m + l \  / ' n C3

 \ 3 ...

Idr ( | p + i )

(3.18)

Page 66: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 66/176

 Equivariant Estimation in Curved Models 57

Proof. Let Y  k be distributed as %%• Then

c . / y n ^ _ o t t r ( f c / 2 + a ) -kE

^ ~2

r(fc/2)l f Q >

^ " '

Hence with m as integer

 t E i Y z ^ M - n C H m ^ y t

 h0(v) = v ^ C 2 ^

£ ) e x p { - n C 2 ( / 2 } ( ^ ) J 43=0

= v

 /  S C 2 £ ( V 1

m ) / £ ( V 2

m + 1 ) , (3.19)

where Vj is distributed as noncentral X  2 ,+2(  n( ~' 2i)^ and ^2 is distributed as

noncentral Xp(  nC  2 t). Hence letting V = ^ ( t f  2 ) a ° d taking r as an integer we

get

Prom (3.19) and (3.20) we get (3.18). •

Note 1. g(t) is a continuous function of f, and

lirn S (f) = T i C2

 / p ,(-.0+

 hmg(t) = g { l ) < l ,

and g(t) > 0 for all t. Thus when Y'Y  is large the B E E is less than X.

Note 2. We can also write d Q = (1 - ^ ) X where r(t))/t) = 1 - g(t). This

form is very popular in the literature. Perron and Giri (1990) have shown that

 g{t) is a strictly decreasing function of t and T(V) is strictly increasing in v. T h e

result that g(t) is strictly decreasing in f tells what one may intuitively do if he

has an idea of the true value of C  and observe many large values concentrated.

Normally one is suspicious of their effects on the sample mean and they havethe tendency to shrink the sample mean towards the origin. That is what our

estimator does. The result that T ( V) is strictly increasing in v relates the B B E

of the mean for C  known with the class of minimax estimators of the mean for

C  unknown. Efron and Morris (1977) have shown that a necessary condition

Page 67: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 67/176

58 Group Invariance in Statistical  Inference

for an equivariant estimator of the form g{t)X  to be minimax is g(t) —* 1 as

 t —* 1, SO our estimator fails to be minimax if we do not know the value of 

C. On the other hand Efron and Morris (1977) have shown that an estimator

of the form d  = (1 — Z^p-)X  is minimax if (i) r is an increasing function, (ii)

0 < T ( V  ) <(p- 2 ) / (n - 1) + 2 for all v e (0, oo). Thus our estimator satisfies

(i) but fails to satisfy (ii). So a truncated version of our estimator could be a

compromise solution between the best, when one knows the values of  C  and

the worst, one can do by using an incorrect value of  C.

3.2.1. An application

The following interesting application of this model is given by Kent, Briden,

and Mardia (1983). The natural remanent magnetization ( N R M ) in rocks is

known to have, in general, originated in one or more relatively short time inter

vals dur ing rock-forming or metamorphic events during which N R M is frozen in

by falling temperature, grain growth, etc. The N R M acquired dur ing each such

event is a single vector magnetization parallel to the then-prevailing geometric

field and is called a component of N R M . By thermal, alternating fields or chem

ical demagnetization in stages these components can be identified. Resistance

to these treatments is known as "stability of remanence". At each stage of 

the demagnetization treatment one measures the remanent magnetization as a

vector in 3-dimensional space. These observations are represented by vectors

 Xi,...,X  n in fi3. They considered the model given by j j j — a , +0i+ei where

 c»i denotes the true magnetization at the ith step, 0i represents the model er

ror, and ei represents the measurement error. They assumed that 0, and e<

are independent, p\ is distributed as N  3{0,T  2{c>i)I), and e z is distributed as

AM0>cr 2 (a,) / ) . T h e Q ; are assumed to possess some specific structures, like

collinearity etc., which one attemps to determine. Sometimes the magnitudeof model error is harder to ascertain and one reasonably assumes r 2(a) = 0.

In practice <J  2{a) is allowed to depend on a; and plausible model for c 2(a)

which fits many data reasonably well is cr 2(a) — a{a'o) + b with a > 0, b > 0.

When a'a large, b is essentially 0 and a is unknown.

3.2.2. Maximum likelihood  estimator

The likelihood function of xi,... x n with C  known is given by

 L(xi,...,x n \p)

= C ^ ) - -  / 2 e x p | - ^ - ( ( , + j  /  ' J , - 2 ^ ' M + V M ) } •

Page 68: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 68/176

 Equivariant Estimation in Curved Models 59

Thus the mle p of p (if it exists) is given by

 [{np/C  2 )(p'p) -w-y'y + 2 \ / n i / A ] £ = Vnp'py • (3.21)

If  the E q . (3.21) in ft has a solution it must be colinear with y. Let p = ky be

a solution. From (3.21) we obtain

 k[(np/C  2 )y'y)k 2 + Jn~y'yk - {y'y +w)] = 0.

Two nonzero solutions of A; are

- 1 - ( 1 + ^ ( ^ ) )1 / 2 - 1 + (1 + ^ ( ^ ) )

1 / 2

1 _

iJkpIC  2 ' 2 ~ 2^lp/C  2

To find the value of  k which maximizes the likelihood we compute the

matrix of mixed derivatives

a 2 ( - logL)

 dp'dp  kHy'y)'

and assert that this matrix should be positive definite. T h e characteristic roots

of this matrix are given by

 k 3 y'y 'A 2 -

 x/nC  2 + 2npk

 k 2 y'y

If  k = ki, then Ai < 0 and A 2 < 0. B ut if k = k 2 , then A, > 0, A 2 > 0.

Hence the mle p. — d-i(xi,... ,x n ,C) is given by (corresponding to positive

characteristic roots)

 d 1{x1 ,...x n ,C) =(1 + 4 p / C

2

( )1 / 2 - 1

 2pC  2 X. (3.22)

Since the maximum likelihood estimator is equivariant and it differs from

the B E E do the mle d\ is inadmissible. The risk function of do depends on C .

Perron and Giri (1990) computed the relative efficiency of do when compared

with d]_, the James-Stein estimator (d  2 ), the positive part of the James-Stein

estimator (d  3

 ), and the sample mean X ( d j ) for different values of C, n, and p.They have concluded that when the sample size n increases for a given p and C 

the relative efficiency of do when compared with di, i = 1,.. , 4 does not change

significantly. This phenomenon changes markedly when C  varies. When C  is

small, dp is markedly superior to others. On the other hand, when C  is large

Page 69: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 69/176

60 Group Invariance in Statistical Inference

all five estimators are more or less similar. These conclusions are not exact as

the risk of  d 0 ,di are evaluated by simulation. Nevertheless, i t give us sufficient

indication that for small values of  C  the use of BEE is clearly advantageous.

3 .3 . Bes t E q u i v a r i a n t E s t i m a t i o n i n C u r v e d C o v a r i a n c e M o d e l s

Let Xi,... ,X  n(n > p > 2) be independently and identical ly distributed

p-dimensional normal vectors with mean vector p and positive definite conva-

riance matrix S.

Let E and S = (X  a - X)(X  a - Xf  be partitioned as

 / i P-I V  ( i p - i V

£ — 1 £ l l ^12 I . 5 — 1 Sn S\ 2  I

We are interested to find the BEE of  0 — E ^ ^ i on the basis of   n sample

observations Xi,...,x„ when one knows the value of the multiple correlation

coefficent p 2 — E ^ E ^ sE ^ E21. If the value of  p 2 is significant, one would

naturally be interested to estimate 0 for the prediction purpose and also to

estimate £22 to ascertain the variabili ty of the predict ion variables.

Let Gi(p) be the full linear group of  p x p nonsingular matrices and let

0{p) be the multiplicative group of  p x p orthogonal matrices. Let Hi be the

subgroup of  Gi(p) defined by

 /  ' P - 1

 Hi = {ft € Gi(p): ft = A n 0

 \  0 h22

and let H  2 be the additive group in R p

 . Define G = Hi © H  2 the direct sum

of H i and H  2 . The transformation g = (hi,k 2 ) € G transforms H  2 . The

transformation g — (fti,fta) € G transforms

 X{  - t fti^i + h 3 , t = l , . . . , n ,

( M , E ) - * (ftiM + ha .f ti Sf ti ) .

The corresponding transformation on the sufficient statistic is given by (X,S)

- * (fti-A + / i 2 , ftiSftJ). A max im al invariant in the space of  (X, S) under G is

 R? = S12S22S21 

and a corresponding maximal invariant in the parametric space of  (p, E ) is p 2 .

Page 70: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 70/176

 Equivariant Estimation in Curved Models 61

3.3.1. Characterization of  equivariant estimators of £ 

Let S p be the space of all p x p positive definite matrices and let G £ ( p ) be

the group of pxp lower triangular matri ces wi th positive diagonal elements. A n

equivariant estimator d{X, S) of £ with respect to the group of transformations

G is a measurable function d(X, S) on S p xR p to S p satisfying

 d(hX  + £, hSh') = hd(X, S)ti

for all S £ S p ,h £ Hi, and X,£  € R p' From this definition it is easy to

remark that if  d  is equivariant with respect to G then d(X,S) = d(0,S) for

all X  £ R P ,S £ S p . Thus without any loss of generality we can assume that d 

is a mapping from S p to S p . Furthermore, if  u is a function mapping S p into

another space Y  (say) then d* is an equivariant estimator of u ( £ ) if and only

if  d* — u • d  for some equivariant estimator d  of £ .

Let

and let G be the group of induced transformations corresponding to G on 0 P .

L e m m a 3.1. G acts transitively on 0 P .

Pr oof . It is sufficient to show that there exists a g — (A ,£ ) £ G wi th

ft £ Hi, ^ £ RP such that

1 1 P - 2  \ j N

(/i/t + £> A£A') =

1 p 0

 p 1 0

.0 0 / / 

1p - 2

(3.23)

/

If  p = 0, i.e., £ 1 2 = 0, we take ha = £ , , 1 / 2 , f = - ft p to obtain (3.23 ). I f 

 p # 0, choose A n = E u

1 / 2

, A 2 2 = F £^3 , where T is a (p - 1) x (p - 1)

orthogonal matrix such that

£ n

1 / 2

E1 2

£2

-2

1 / a

r = 0 , 0, 0 ) ,

and £ = - A / / to obtain (3.23).

The following theorem gives a characterization of the equivariant estimator

 d(S) of S.

Page 71: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 71/176

62 Group Invariance in Statistical  Inference

T h e o r e m 3.5. An estimator d ofT. is equivariant if and  only if it admits

 the decomposition

 a { S ,

~ {a^WR-iSu R~ 2 a 22 tR)S 21S,-1

1S1 2 + C(R){S22-S 2l S-1Sl2 )

(3.24)

where C{R) > 0 and 

 a /f,\ _ ( a-ii{R) a12(R) A [ K )

- \a 2l (R) a 22(R)

is a 2 x 2 positive definite matrix. Furthermore

 d Y i di 2 d  2 2 <ki — p 2 — S ^ E i j i E j " ^ S a i

if  and  only if  a n

l a i  2 a  2 2 a  2 \ = f> 2 •

N o t e . Th e d;j are submatrices of  d  as partitioned in (3.24) and aij =

 aij(R).

P r oo f . T h e sufficiency part of the proof is computational. I t consists

in verifying d(hsh') — hd(S)h' for all h g H , S e Sp and d^,1 d^d^o1 d  2i —

<iii^i2a 22 a 2i. It can be obtained in a straightforward way from the compu

tations presented in the necessary part.

To prove the necessary part we observe that

 P - ( R *)• FI >°<

and d  satisfies

for all T € 0(p - 2). Th i s implies that

 Ii 0

o r

 P o \ _ /  M R ) o0 I  P- 2 ) {  0 C(R)I  P_ 2

with C(R) > 0.

In general, S has a unique decomposit ion of the form

S =Sn Sl2 \ _(Ti 0 \ / l V \(T[  0

Sn S 22 ) \ 0 T j \U  V » J U T  2

Page 72: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 72/176

Equina Hani Estimation in Curved Models 63

with 7\ G G £ ( 1 ) , T 2 e G+(p - 1}, and U = T  2-1S 21T,-> . Without any loss of 

generality we may assume that U  ^ 0. Corresponding to U  there exists a B G

0 ( p - 1) such that U'B = {R, 0, . . .0 ) with R = \\U\\= ( S ^1 S^S^1 5 2 1 ) = >

0. For p > 2, 5 is not uniquely determined but its first column is R~l U.

Using such a B we have the decomposition

1 W \ _ ( \ O W P 0 \   (l  0

U  !„.,) ~ {0 Bj \ 0 J P  _2  )  \0 B' 

and

« - ( ? i ) - ( J S ) ( T £ ) ( ! * ° ) ( »:

i )

(n O f an(f i ) R - ' a u t R J l / ' V TJ 0

 p - i - W ) A o 13

fi-'a^ffi)^! R- 2

 a 2 2(R}S 2-l S-l 'Sl  2 + C(R){S 22 -S 2iS^Sl2 )

which proves the necessary part of the theorem. •

3.3.2. Characterization of  equivariant estimators of  0

The following theorem gives a characterization of the equivariant estimator

of  0.

T h e o r e m 3.6 . If d* is an equivariant estimator of 0 then d*(S) has the

 form

d*(S) = R'1 a(R)S 22

1S 2 , ,

where a : R+ —* R.

Proof. Define u:S p ^ RP'1 by

« ( £ ) = £ = £ - % ! .

If  a" is equivariant then

 d*(S) = (R-^WSnS^Su + C(R)(S 22 - S^S^S12 ))~l 

S^R-'a^R)

= (T  2(R- 2

 a 22(R)UU' + C f f l H C V ! - UU  r

 )T^S 2l  R-v

 a 21{R)

= R~l (a 22(R) + (1 - R 2

 )(C(R))-1

 a 21(R)S 2- 2

1

S 21

= R-1 a{R)S 2- 2

l S 2i • •

Page 73: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 73/176

64 Group Invariance in Statistical Inference

The risk function of an equivariant estimator d" of  0 is given by

 R(0,d*) = Ex(L((3,d'))

= E L{S^(R-1 a(R)S nS 2-.i

- 0}'S 22{R~l  a(R)S 22

l  S 2l  - 0)}

= E z{a?{R) - 2R-1

 a{R)S-[ l l 

S120 + S^0'S 220} . (3.25)

T h e o r e m . 3. 7. T h e best equivariant estimator of  0 given p, under the

loss function L is given by

 R-l  a*(R)S 2- 2

l S 2 , (3.26)

where

£ r ( n ± i + l ) r ( ^ - M ) ( V V ) 7 I T +

 a'(R) = r p 2 ^ _ . (3.27)

£ r2 (2=1 + m  ) (  p 2 r

 2 ) m / m ! r (2=1 + m)

P r o o f . From (3.25), the minimum of  R(0,d*) is attained when a(R) =

 a*(R) — E^S^1

Si20R~1

 \R). Since the problem is invariant and d" is equiv

ariant we may assume, without any loss of generality, that

- M f  ,1)

with C(p) = \ \ £ j . To evaluate o*( fi) we write

5 2 2 = T T ' , T e G + ( p - l ) , r = (%)

5 2 1 = RTW, 0 < B < l , B

r

6 fl""

1

,S u = W W .

The joint probability density function of  (R, W,T) is (Gir i (1977))

1 P - i i

i = 2 j = l

1 P _ l

* e x p {

~ 2 ( i - P

2

)( W i

"+

* » -2 T 

P ^ W 

^  n fe) •

i=i (3.28)

A straightforward computation gives (3.26). •

Page 74: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 74/176

 Equivariant Estimation in Curved Models 65

L e m m a 3.2. Let 0 > 0, 0 < 7 < l , m be an integer.

Then

^Vja + m + l) T(0 + l)^

1=0T{a +1)

Proof .

^ r ( a + m + l ) r ( ^ + l ) j

i=0

 d"

(=1

- 0

= (1 - 7 ) ^ ) ^ ( 1 + „ ) • + « - » ( l - ^

T h e o r e m 3.8. / / m — —p) is an integer then

a ' M ^ - ^ - l p y

i=0

m,

1 - r V

1 - r V

Pro of. Follows trivial ly from Lemma (3.2).

Note. If p 2 is such that terms of order (rp) 2 and higher can be ne

glected, the best equivariant estimator of  0 is approximately equal to p 2(n — 1)

(p-irl S 22

1S 2i. The mle of 0 is S 22S 2S .

Page 75: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 75/176

66 Group Invariance in Statistical  Inference

E x e r c i s e s

1. Let X  a = (Xai,- -,X  aj ,)',a = 1,...,N(  > p) be independently and iden

tically distributed p-variate normal random vectors with common mean fi

and positive definite common covariance matrix £ and let

show that if  & = A V E - 1  / * is known, then NX'(S + NXX')~l  X  is an

ancillary statistic.

2. Let X  a = (X  ai,...,X  ap y,a = 1,... ,2V( > p) be independently and identi

cally distributed p-variate normal random vectors with mean 0 and positive

definite covariance matrix £ and let

E _ (S

(12) \  

 \ £ ( 2 1 )

with £ ( l l ) : l x l . Given the value of  p 2 - £ ( 1 2 , £ ( ~ 2 ) £ ( 2 1 1 / £ i i find the

ancillary statistic.

3. (B as u (1955)). If T  is a boundedly complete sufficient statistic for the family

of distributions {Pg,9 g ft}, then any ancillary statis tic is independent of 

T.

4. Find the conditions under which the maximum likelihood estimator is equiv

ariant.

Re fe r e n c e s

1. S. Amari, Differential  geometry of curved  exponential families - curvatures and 

information loss, Ann. Statist. 10, 357-385, (1982a).

2. S. Amari, Geometrical  theory of  asymptotic ancillary and  conditional  inference,

Biometrika, 69, 1-17 (1982b).

3. D. Basu, On statistics independent of sufficient statistic, Sankhya 20, 223-226

(1955).

4. D. R. Cox and D. V, Hinkley, Theoretical  Statistics, Champman and Hill, London,

(1977).

5. B. Efron, The geometry of exponential  families, Ann. Statist., 6, 262-376, (1978).

6. B. Efron and C, Morris, Stein's estimation rule and its competitors, An empirical 

 approach, J . Amer. Statist. Assoc. 68, 117-130, (1973).

7. B. Efron and C. Morris, Stein's paradox in statistics, Sci. An n. , 239, 119-127,

(1977).

Page 76: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 76/176

 Equivariant Estimation in Curved  Models 67

8. R. A. Fisher, Theory of statistical estimation, Proc. Cambridge Phil. Soc. 22,

700-725, (1925).

9. N. Giri, Multivariate Statistical  Inference, Academic Press, N.Y., 1977.

10. D. V. Hinkley, Conditional  inference about a normal mean with known coefficent

 of  variation, Biometrika, 64, 105-108, (1977).

11. T. Kariya, N. Giri, and F . Perron, Equivariant estimator of a mean vector /i of  Npifi, E ) with u"Z~l u = 1 or £"

V

V = c or £ = a 2(u'fi)I, J. Multivariate

 Anal. 27, 270-283 (1988).

12. J . Kent, J . Briden, and K. Mardia, Linear and planer structure in ordered mul

 tivariate data as applied to progressive demagnetization of palaemagnetic rema-

 nence, Geophys. J. Roy. Astronom. Soc. 75, 593-662, (1983).

13. F . Perron and N. Giri, On the best equivariant estimator of mean of a multivariate

 normal  population, 3. Multivariate Analysis, 32, 1-16 (1990).

14. F . Perron and N. Giri, Best equivariant estimation in Curved  Covariance Models,

40, 46-55 (1992).

Page 77: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 77/176

Chapter 4

SOME BEST INVARIANT TESTSIN MULTINORMALS

4 .0 . I n t r o d u c t i o n

In Ch apte r 3 we have dealt with some applications of invariance in statisti

cal estimat ion. We discuss in this chapter various testing problems concerning

means of multivariate normal distributions. Testing problems concerning dis

criminant coefficents, as they are somewhat related to mean problems will

also be cousidered. This chapter will also include testing problems concerning

multiple correlation coefficient and a rela ted problem concerning mult iple cor

relation with partial informations. We will be concerned with invariant tests

only and we will take a different approach to derive tests for these problems.

Rather than deriving the likelihood ratio tests and studying their optimum

properties we will look for a group under which the testing problem remain

invariant and then find tests based on the maximal invariant under the group.A justifi cat ion of this approach is as follows: If a testing problem is invariant

under a group, the likelihood ratio test, under a mild regularity condition, de

pends on the observations only through the maximal invariant in the sample

space under the group {Lehmann, 1959 p. 252). We find then the optimum

invariant test using the above approach, the likelihood ratio test can be no

better, since it is an invariant test.

4 .1 . T e s t s o f M e a n Ve c t o r

Let X = (Xi,...,X  p ) be normally distributed with mean £ — E(X) =

(£i> — l i p ) 1 and positive definite covariance matrix £ — E(X  — u) (X  — p).

Its pdf  is given by

68

Page 78: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 78/176

Some Best Invariant Tests in Mattinormals 69

 fx(x) = {2 7 r ) -" /  2 (det exp j - i t r E "1

^ - 0 & - tf  } • (4-1)

In what follows we denote 3: a p-dimensional linear vector space, and X', the

dual space of  X. The uniqueness of nonmal distribution follows from the fol

lowing two facts:

(a) The distribution of  X  is completely determined by the family of distribu

tions olB'X, 9 £ X'.

(b) T he mean and the variance of a normal variable completely determines its

distribution.

For relevant results of univariate and multivariate normal distribution we refer

to Giri (1996). We shall denote a p-variate normal with pdf  (4.1) as JVp

( f , S ) .

On the basis of observations x" — (x ai ,..., x ap ) , a — I,. ..,N  from

 N  p{l, E ) we are interested in testing the following stat ist ica l problems.

P r o b l e m 1, To test the null hypothesis HQ:  £ = 0 against the alternatives

 Hi: £ j£ 0 when g, S are unk now n. This is known as Hotelling's T 2 problem.

Problem 2 . To test the null hypothesis HQ: £i — • • — £ p , — against the

alternatives H i : £ # 0 when £, £ are unknow n and p > p i .

Prob lem 3 . To test the null hypothesis Ha : £i — * * > — £ P l against the

alternatives H i : & = • - * = ^ = 0 when p, £ are unknown and pi + p 2 < p.

Let X - jj^X", S = E^(X  a - X)(X  a - X)'. (X,S) is sufficient

(minimal) for (£ , £ ) and \/NX  is distributed independently of  S as N  p(y/ntl, E )

and S has Wishart distribution Wj,(JV—1, E ) with JI = J V - 1 degrees of freedom

and parameter E . Th e pd/ of  S is given by

f  X { d e t E ) ' n / 2 ( d e t s ) : i : : F 1 e x p { - | t r E - 1 a }

W  p(n, £ } = < if s i s positive definite, (4.2)

I 0 otherwise.

Problem 1 remains invariant under the general linear group G;(p) transfer-

ing (X, S) -* (gX, gSg'), g G Gi(p). From Ex am pl e 2.4.3 with di = p

 R =NX'(S + nXX')-1

 X 

 NX'S-l 

 X 

l+NX'S-tX 

is a maximal invariant in the space of  (X, S) under G ( (p) and 0 < R < 1.

(4.3)

Page 79: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 79/176

70 Group Invariance in Statistical Inference

The likelihood ratio test of  HQ depends on the statistic (Gir i (1996))

T  2 = N(N  -^X'S^X  (4-4)

which is known as Hotelling's T  2 statistic. Since R is a one-to-one function of 

T  2 we will use the test based on R. From Th eo re m 2.8.1 the pdf  of i i depends

only on 6 — i V ^ ' E '1

^ and is given by

M > r ( p / 2 ) r ( ( 7 v - P ) / 2 ) 1 '

It may be verified that 8i is a maximal invariant under the induced group

(5j(p) = Gi(p) in the parametric space of (£ , E ) . Since £ is positive definite

tfi = 0 if and only if p = 0 and 6 > 0 for p ^ 0. F or invariant tests with respect

to G;(p) Problem 1 reduces to testing HQ:  6 = 0 against the alternatives Hi-

6 > 0. From (4.5) the pdf  of  R under HQ  is given by

 Mn) =

r ( a

J—

r M ( i - T j t l * - * & - *r ( f ) r ( V ) ( 4 . 6 }

if  0 < r < 1, 1 '

0 otherwise,

which is a central beta with parameters (|, ^j^)- From (4.5) the distribution

of  R has monotone likelihood ratio in i i (see Lehmann, 1959). Thus the test

which rejects HQ  for large values of  R is uniformly most powerful invariant

(UMPI) under the full linear group for testing HQ  against H i .

Since R ^ constant, implies that T 2 ^ constant we get the following theo

rem.

T h e o r e m 4.1 .1 . For Problem 1, Hotelling's T 2  test which rejects HQ 

whenever T  2 ^ C, the constant C depends on the level a of the test, is UMPI 

 for testing HQ  against Hi.

The group Gt{p) induces on the space of  {X, S) the transformations

(X,S)^(gX,gSg'), 9 € G , ( p ) .

It is easy to conclude that any statistical test <p(X,S), a function of  {X,S}

whose power function depends on ( / i , £ ) only through 6, is almost invariant

under Gj(p) . Since

Page 80: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 80/176

Some Best Invariant Tests in Multinomials 71

 Ell ̂ (X <S) = E g .1)l _ g .lEg .l  .ct>(X,S)

= Elti>2<KgX,gSg'),

thus

 E^(<f>(X,S)-<P(gX,gSg')) = 0 

for all ( / / , £ ) . Using the fact that the joint probability density function of 

( X , S) is complete, we conclude that

<p(X 1S) = 4'(gX,gSg•}

for all g 6 <3J(P)> except possibly for a set of measure zero. As there exists a

left invariant measure (Example 2.1.3) on G ( ( p ) , which is also right invariant,

any almost invariant function is inva rian t (L eh man n, 1959). Hence we obtain

the following Theorem for Problem 1.

T h e o r e m 4 . 1 . 2 . Among all  tests of  HQ: £ — 0 with power function de

 pending on 6, Hotelling's T  2 test is UMP:

P r o b l e m 2 . L e t Ti be the group of translations such that ti,g Ti trans

lates the last p — pi components of each X", a — 1 , . . . , N  and let Gi be the

subgroup of  Gi{p) such that g 6 G\ has the following form

 g=(m o

 \9{21) 9(22)

where g^U  j is the p x x p x upper left-hand corner submatrix of g. This problem

is invariant under the affine group ( G ^ T i ) such that, for g £ Gi,tj € I i ,

(g, ( i ) transform

 X  a^gX  a + tu a = l,..,,N.

A maximal invariant in the space of  (X,S) under ( 0 i , T i ) is Bi, where

 R ^ N X ^ S ^ + N X ^ X ' ^ r ' X ^ (4.7)

as defined in Example 2.4.3 with di == Jh- A corresponding maximal invariant

in the par ametric space of (£ , E ) under ( G i . T i ) is #i = N ^ E ^ . ^ J J . For in

variant test this problem reduces to testing H 0 : §i = 0 against the alternatives

 Hi: &i > 0. From Theorem 2.8.1 the pdf  of  R t is given by

 X 

Page 81: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 81/176

72 Group Invariance in Statistical  Inference

From (4.8) it follows that the distribution of  R\ possesses a monotone likelihood

ratio in i%. Hence we get the following Theorem.

T h e o r e m 4 .1 .3 . For problem 2 the test which rejects Ho for large values

 of  Ri is UMPI  under (Gi,Ti) for testing Ho against H\.

N o t e : As in Problem 1 this U M P I test is also the likelihood ratio test

for Problem 2. Using the arguments of Theorem 4.1.2 we can prove from

Theorem 4.1.3 that among all tests of  Ho with power function depending on

Si, the test which rejects Ho for large values of  Ri is U M P for Probl em 2.

Pr ob le m 3 . Let T  2

be the translation group which translates the last

 p — pi — P2 components of each X  a , a = 1 , . . . , N  and let G 2 be the subgroup

of  Gi(p) such that g e G 2 has the form

(9iu) 0 0 \ 

 9 = 9(21) 3(22) 0 1

 \9(31) 3(32) ff(33] / 

where is pi x p t and <7(22] is p2 x p 2 . This problem is invariant under the

affine group (G2,T  2 ) transforming

 X  a

^gX  a + t 2l  a = l,..,,N 

where g £  G 2 and t 2 € T 2 . A maximal invariant in the space of  (X, S) under

(G 2 ,T  2 ) is C f l M b ) ) where

 R l + R  2 - NXf2](Si22] + N J ? [ 2 ] A ' [ 2 ] ) "1

X [ 2 |

as defined in Ex am pl e 2.4.3 with di = pi, d 2 = p 2 .

A corresponding maximal invariant in the parametric space of (£,E) is

(61,62), defined by,

^ = ^ ( 1 ) ^ - ^ ( 1 ) .

Under the hypothesis H 6  ,$1 = 0, 6  2 = 0 and under the alternat ives Hi, fij =

0, 6  2 > 0. From Theorem 2.3.1 the joint pdf  of  Ri, R 2 is given by

Page 82: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 82/176

Some Best Invariant Tests in Multinomials 73

m f  , _ n | N )

l

'2 > r t l p i j r c ^ r t ^ p - p ! - ^ ) )

From Giri (1977) the likelihood ratio test of this problem rejects Ho whenever

where the constant c depends on the level a of the test and under Ho, Z is

distributed as central beta with parameters (|(iV — pi — p 2 ) , \p?)-

From (4.9) it follows that the likelihood ratio test is not U M P I . However for

fixed p, the likelihood ratio test is approximately optimum as the sample size

 N  is large (Wald, 1943). Thus if p is not large, it seems likely that the sample

size commonly occurring in practice will be fairly large enough for this result

to hold. However, if the dimension p is large, it might be that the sample size

 N  must be extremely large for this result to apply.We shall now show that the likelihood ratio test is not locally best invariant

( L B I ) as S 2 - * 0. Th e L B I test rejects H 0 whenever

 Ri + N  ~  P l   R  2 > c (4.11)

where the constant c depends on the level a of the test. Hotelling's T2

test

which rejects Ho whenever R, + R 2 ^ c, the constant c depends on the level O:

of the test, does not coincide with the L B I test, and hence it is locally worse

for Problem 3.

Definit ion 4.1.1. ( L B I test) . For testing H  a : 9 €  £ I H„, an invariant test

 4>* of level a is L B I if there exists an open neighborhood fi i of fl/jo such that

 Eebt?) > Ee{4>\ 0 e & - f i j i o , (4.12)

where 0 is any other invariant test of level a.

T h e o r e m 4.1.4. For testing H 0 : h*i = 0, tf 2 = 0 against H\ : S i = 0,6  2 >

0, the test which rejects H Q  whenever R\ + N ~?' R 2 ^ c, where the constant c

 depends on the level a of the test, is LBI as 6  2 —* 0.

Page 83: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 83/176

74 Group Invariance in Statistical Inference

P r o o f . Since (R\,R2) is a maximal invariant under the affine group

( G 2 , r 2 ) , the ratio of densities of  (R1 ,R 2 ) under Hi, to that under HQ, is

given by,

m , ^ ) = 1 + 4 ( ' _ 1 + f i + i v ^ f 2 > ) + o ( , 2 )

 f(fi,f  2 \0) 2 V P2

as 6  2 —* 0. Hence the power of any invariant test d> is given by

 ES2($)=a-rE s  ̂D $6  2 (-l + R i + ^ ^ R ^ j + o(6  2 )

as 62 —* 0, which is maximized by taking $ = 1 whenever Ri + N ~ Pl  R 2

^ c. n

A s y m t o t i c a l l y B e s t Invar iant ( A B I ) T es t

Let # be an invariant test for testing H 0 : $1 = 0, 6  2 = 0 against Hi : S\ = 0,

6 2 = A with associated critical region fi = {(fi.fa) ! U{f\,ft) ^ c „ } satisfying

P « 0 ( f i ) = a;

 P Hl (R) = 1 - e x p { - H ( A ) ( l + o < l ) ) } ;

f ! - " - 2 l f f ? = e x p W A J l G f A i + fifA)^,^)]}

+ fi(f l7f 2;A); (4.12)

where J B ( f i , f  2 ;A) — o(H(A)) as A —> 00 and 0 < ci < R(X) < c 2 < 00. Then

$ is an A B I test for testing HQ  against H 1 as A —1 00.

T h e o r e m 4.1.5 . For testing HQ  : S\ = 62 = 0 against H i : t>\ — 0 , £ 2 —

A > 0, Hotelling's test which rejects H 0 whenever Ri + R 2 ^ c, the constant c

 depending on the level a of the test, is ABI as A —* 00 .

P r o o f . Since

_, , . , a a(a +1) x 2

= exp{ar(l + o( l) } as x —< 0 0 ,

we get from (4.9)

T ^ j ^ j = e x ^ " f  i " M ( l + B(f u f2-,\))}; (4.13)

where S ( f i , f  2 ; A ) - o ( l ) as A -» 00 . It follows from (4.2) that the pdf of 

U  = R\ + R 2 satisfies

Page 84: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 84/176

Some Best Invariant Testa in Multinomials 75

 P(U(Rl  ,R 2 ) = R 1 + R  2 < c  a  )

= exp { ^ ( c « - 1 ) 0 + o ( l ) ) } • (4-14)

Thus, from (4.12) with H(X) = ~{c„ - 1), we conclude that Hotelling's T  2 test

is A B I . •

4 .2 . T h e C las s i f i c a t ion Pr ob le m ( T w o Popul a t ions )

Given two different p-dimensional populations Pi,P 2 characterized by the

probability density functions pi,p 2 respectively; and an observation x — [x\,

• . ., x PY  the classification problem deals with classifying it to one of the two

populations. We assume for certainty that it belongs to one of the two pop

ulations. Denote by Oj, t = 1,2 that x comes from Pi and let us denote, for

simplicity, the states of nature Pi simply by i. Let L ( j , be the loss incurred

by taking the action a; when j is the true state of nature having property that

L(i ,a<) - 0, L(i,a,j) = ij > 0, j j£ i = 1,2,

Let (x, 1—T) be the a priori probability distribution on the states of nature.

The posterior probability distribution (£,1 — £), given the observation x, is

given by

_ rrp^x) ( ,5 «Pi(*J + { l

1

• '

The posterior expected loss of taking action al  is given by

(1 - 7T)Ci P 2(l)

 Kpi(x) + (1 - JT)P2(Z)

and for taking action a 2 is given by

irpi{x) + (1 - iv)p 2{x) '

From (4.16)-(4.17) it follows that we take action a 2 if 

(4.16)

(4.17)

TC  2 Pl(x) , ( 1 - ^ ) C 1 P 2 ( X )

x p i ( x ) + (1 - 7 r ) p 2 ( i ) npi(x) + (1 - w)p 2(x)

and action aj if 

7rc 2 pi(a;) > {1 - ir)clP2(x) ^

 npi(x) + (1 - n)p 2(x) xp^x) + (1 - ?r)p2(i)

and randomize when equality holds.

Page 85: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 85/176

76 Group Invariance in Statistical  Inference

(4.18) and (4.19) can be simplified as,

take action a 2 if > . , . — = K  ( say) ,

take action a , if < JiT (4.20)

and randomize when — K.

Le t us now specialize the case of two p-variate normal populations wi th

different means but the same covariance matrix. Assume

Pi : W „ ( « , E ) ,

p 2 : A . p ( M , S ) ,

where a = . . . , ^ p ) ' , £ - (f t, . , - , £ p ) ' e i i p and £ is positive definite.

Hence

pa(«)= exp [f> - O'S"

1

^ HYX-H* - f*)] }

- exp ( s ' E ^ O - 0 4- | ( { ' E _ 1 £ - pE 'V )

Using (4.20) we take action a 2 if 

and take action a x , otherwise. Th e linear functional T — £ - 1 ( p . — 4) is called

Fisher's discriminant function. In practice E , £, ft are usually unknown, and

we consider estimation and testing problems for T  — ( r i , . . . , T p ) .

Let X  a , a — 1 , . . . , Ni be a random sample of size A r i from N  p(£, E) and

let Y  a , a — 1 , . . . , A72 be a sample of size A72 from A r

p (p , E ) . Invariance and

sifficiency considerations lead us to consider the statistic

s = Y,{ xa - - + J2( Y ° - Y  )( ya - y ) ' . a=l  a=l

where N,X = X a , A ^ y = J T ^ Y° for the statistical inferences of  V.

We consider two testing problems about T.

Problem 4 . To test the hypothesis H 0 : r p i + i = • • - — r p = 0 against the

alternatives Hi T ^ O .

Page 86: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 86/176

Some Best Invariant Testa in Multinomials 77

P r o b l e m 5. To test the hypothesis H 0 : T  P 1 + 1 = • • • = T  p — 0 against the

alternatives Hi : r p i + P 2 + i = • • • = T  p = 0.

Without any loss of generality we restate our setup in the following canon

ical form, where X = {Xi,... ,X  P ) is distributed as JV p (7 j ,E) and S is dis

tributed, independently of  X, as Wishart W  p

(n, E ) and T—

E

 _  1

J J . Since E ispositive definite, T = 0 if and only if  n - 0. Thus the problem of testing the

hypothesis T = 0 against T ^ 0 is equivalent to Problem 1 considered earlier

in this chapter.

P r o b l e m 4, It remains invariant under the group G of p x p nonsingular

matrices g of the form

 9=(m 0 ) (4 .21)

 \9{21) 9(22) }where 9{\\) is of order pi, operating as

(X  }S;T  }S) ^ (gX^gSg'^g')-1?^^')

where X, jf  are the mean and the sample covariance matrix based on N  ob

servations on X. By Example 2.4 .3 a maximal invariant in the space of the

sufficient statistic (X, S) under G is given by (Ri^R^), with di = pi, d% = P2

and p — pi +P2 - The joint pdf  of  (r? i , R 2

 ) is given by (4 .4 ) where the expressions for 6i, 6 

2 in terms of  T, E are given by

Si + s 2 = A T ' s r ,

S 2 = Nr'(2)(x 22 rl  r{2)

with £2 2

= ( E ( 2 2 ) - E ( 2 i ) E j_

n ) E ( 1 2 ) )_ 1

. Under the hypothesis H 0 61 > 0,

02 — 0 and under the alternatives HiSi > 0, i = 1,2. From (4 .4) it follows that

the probability density function of  Ri under Ho is given by

» M r (4-22)

 m\$i) =r ( i p , ) r ( i ( A / - p , ) )

x ( f i ) ^ p , " 1 ( l - f i ) ^ W " p l ) " 1

x exp | - ^ ! } 4> QjT, | M J I f i ^ l , (4.23)

and Ri is sufficient for t\.

Giri (1964) has shown that the likelihood ratio test for this problem rejects

 Ho whenever

* = i T T 7 r ^ ( 4

-2 4 )

Page 87: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 87/176

78 Group Invariance in Statistical  Inference

where the constant c depends on the level a of the test and under H 0 Z has a

central beta distribution with parameter {^{N  - p ) , \p2) and is independent

 oiRi.

To prove that the likelihood ratio test is uniformly most powerful similar

we first check that the family of  pdf 

UXH%)-h ^ 0 } (4.25)

is boundedly complete.

De fi ni ti on 4.2.1. ( B o u n d e d ly co mp le te ). A family of distributions

{Ps(x) '• 6  E fi} of a random variable X  or the corresponding family of pdfs

{ps(x) : 6 e fi} is boundedly complete if 

 E6  h(X) = J k(x)dP s(x)

= j h{x)p s{x)dx

= 0

for all 6  6 fi and for any real valued bounded function h(X), implies that

 h(X) = 0 almost everywhere with respect to each of the measures Pf.We also say AT is a complete statistic or X is complete.

L e m m a 4.2.1 . The family of  pdfs { / ( f i ^ i ) : 6\ > 0} is boundedly com

 plete.

Proof . For any real-valued bounded function h(R]} of  Ri we get (using

(4.23))

 E-Sl (h{Ri)) = j Hnmr^dfi

= e x P { - i * - 1 }

 j - Jox

= exp <f  l - 1 ^ I ^ 1 ) rl 

Page 88: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 88/176

Some Best Invariant Tests in MultxnoTmals 79

where

 h*(r1 )=h{r\)fi™-1(l-rl  )u N -Ky-\

a g ( £ { P - P i ) , 5 ( * - P i ) + j )3

B ( i ( J V - p, ) , i P l + j ) B ( i ( J V - p), i ( p - P l ) )

and B(p , g) is the be ta function. Hen ce E g h(Ri) = 0 implies that

E - I / ^ C n ) ( n ) J ^ i = 0 - (4.26)

Since the right-hand side of (4.26) is a polynomial in 6\, (4.26) implies thatall its coefncents must be zero. In other words

 /  h*(f\)f{dfi = 0 , j = 0 , 1 , 2 , . . . . (4.27) Jo

Let

where h"+

and ft*- denote the positive and negative parts of  h'. Hence we get

from (4.27)

 Jo Jo

for j= 0 ,1 , .. . ; which implies that

 h*+(f 1 ) = h-(f l  )

for all ^ except possibly on a set of measure 0. Hence we conclude that

 h*(Ri) = 0 almost everywhere {Pg1 (Ri) : Si ^ 0} which, in turn, implies that

h(f i i ) = 0 almost everywhere. •

Since Ri is comple te, it follows; from Lehman n (1959) , p. 34; that any

invariant test 4>(Ri,R 2 ) of level a has Ne ym an Struct ure with respect to Ri,

i.e.,

 ElIll (<p(Rll  R 2 )\Ri ) = cl  .

To find the uniformly most powerful test among all similar invariant tests

we need the ratio R

 fH l (fl,f  2 \Rl  = f i ) R =

 }H  a{T\,T  2 \Ri = f%)

Page 89: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 89/176

80 Group Invariance in Statistical  Inference

where /^( f i f f^- f i i = r i ) denotes the conditional pdf of  (RuRg) given Ri =

f i under Hi,i = 0,1. From (4.9) we get

H - exp | - i | 2 ( l - fx)} <t> ~ P i ) , \(P ~ P i ) ; - (4-28)

From (4.28), using the fact that Z is independent of  R± when H 0 is true, we

get the following theorem:

T h e o r e m 4.2.1. For Problem 4, the likelihood  ratio test given in (4.24) is

uniformly most powerful invariant similar.

P r o b l e m 5. It remains invariant under the group G of p x p nonsingular

matrices g of the form

 / V >0 0 \ 

9 = 3(21) 3(22) 0

 \3(31) 3(32) 3(33)/ 

with 3(H) : pi x p2, 9(22) : P2 x p2 submatrices of g. A maximal invariant in

the space of  (X,S) (as in Problem 4) under G is (Ri, R 2 , R3) as defined in

Example 2.4.3 with d\ = p i , d  2 = p 2 and d$ = p - pi - p2. B y Theorem 2.8.1

the joint probability density function of Ri, R 2 , Ri is given by

x f  i " " - 1 ^ i » - l

F i ( r ^ - W - » J - l

x f l - n - ^ - f a j i ^ - p ' "1

l, '=1 J = l >>J J

where

and

"» - EpJ' CTQ = ° ' P3 = p - p i - p a •

Page 90: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 90/176

Some Beat Invariant Tests in Multinomials 81

&1 = i V f E j i i j l ^ i ) + S ( 1 2 | r i 2 ) + £(18)1(3) ) 'Sj~, j

* ( s ( i o r ( i ) + S ( i 2 ) r f ! ) + £ ( i 3 ] r ( 3 ) ) ,

 A + 6  2 = f  s ( n ) r ( i > + s c i 2 ) r ( 2 > + s ( i 3 ) r ( 3 ) V s - i

 \ S ( 2 1 ) r ( l ) + £(22)^(2) + £(231^(3) / ' 2 2 '

x j / 

E ( i i ) r ( 1 ) + £ ( 1 3 | r ( 3 ) - i - £  j l 3 ) r ( 3 ) \  ^

V E ( 2 i ) r ( 1 ) + £ ( 2 2 ] r ( 2 j + £ ( 2 3 ) r ( 3 ) J 

S , = W r ; , 1 ( E (  3 3 1 - ( ^ ; ) ' E r i . ( ^ ) , r ( 3 ,

Under H 0 , 61 > 0, 6  2 = 0, 6  3 = 0, and under H i , Si > 0, S 2 > 0, 6  3 = 0. From

(4.29) it follows that Ri is sufficient for S\ under H 0 . From Giri (1964) the

likelihood ratio test of this problem rejects H0

whenever

 Z = 1

~ R i

 ;S a $ c (4.30)

1 - tii

where the constant c depends on the level a of the test and under H 0 Z is

distributed independently of H i as central beta with parameter (^(N  — p x —

J>2). \V2)-

Le t <p(Ri,R 2 ,R 3 ) be any invariant level a test of  H 0 against H i . From

Lemma 4.2.1 H i is boundedly complete. Thus <j> has Neyman Structure with

respect to R x . Since the conditional distribution of (Hi, R 2 , R3) given Ri does

not depend on c\, the condition that the level a test <j> has Neyman structure,

that is,

H H D ( 0 ( H 1 , H 2 , H 3 ) | H I ) - Q

reduces the problem to that of testing the simple hypothesis 6  2 = 0 against the

alternatives 6% > 0 on each surface Ri — f\. In this conditional situation the

most powerful level a invariant test of  S 2 = 0 against the simple alternativeS 2 — 6  2>0\s given by,

r e j e c t H 0 whenever <h Q(JV - P l ) , i p z ; ^f  26%j ^ c, (4.31)

where the constant c depends on a. Since

tf.2 = ( l - H i ) ( l - Z )

and Z is independent of H i , we get the following theorem.

T h e o r e m 4 .2 .2 . For Problem 5 the likelihood  ratio test given in (4.30) is

uniformly most powerful invariant similar.

Page 91: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 91/176

82 Group Invariance in Statistical  Inference

4 . 3. T e s t o f M u l t i p l e C o r r e l a t i o n

Let X  a , a = 1, ,N beN  independent normal p-vectors with com mon mean

 fi and common positive definite covariance matrix E and let NX = E%-,X  a

 ,

S = £%=1(X  a -X)(X  a

-X)'. As usual we assume JV > pso that 5 is positive

definite with probability one. Partition E and S as

 \ E ( 2 i ] E(22) / ' \ 5(21) S ( 2 2 ) / 

where E ( 2 2 j , 5 ( 2 2 ] are of order p— 1. Let

2 _ s

( i 3 ) E ( 3 2 ) E ( 2 1 )

(u)"

= E

the positive square root of  p 2 is called the population multiple correlation

coefficent.

P r o b l e m 6. To test H 0 ; p 2 = 0 against the alternative Hi : p 2 > 0. This

problem is invariant under the affine group (G,T), operating as

(X, 5; p, E ) - » ( g X + t, gSg'; gp +1, gXg')

where g g G i s a p x p nonsingular matrix of the form

 g=f «")0 9(22)

where p( 2 2 ) is of order p - 1 and T  is the group of translations t of components

of each X  a . A maximal invariant in the space of  {X,S) under the affine group

(G,T) is

r 2 _ ^ 2 2 > 5 ( 22)5(21)

5 ( H )

which is popularly called the square of sample multiple correla tion coefficent.

D i s t r ibut ion o f R 2 . To find the distribution of  R 2 we first observe that

 R 2 5 (12)S[! 2 )

S (21)

1 R

 2

5 (U

) -

5

( i 2 ) £( Z 2 )

S( 2 1

)

From Giri (1977)

5(u) - 5 ( i 2 ) 5 , 2 L £ ( 2 1 )

Page 92: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 92/176

Some Best Invariant Tests in Multinomials 83

is distributed as X 2 N- P w ' t n

N - p degrees of  freedom and is independent of 

S(i2). 5( 22). Thus R 2 /(l - R 2 ) is distributed as

1 •S

( 1 2)5 { 22 |5(22)

* N - p 2(11) - E(12) S[ 22] S(21)

But the conditional distribution of 

S ( 12) 5(22)2

 \ / E ( i i } - S { 1 2 ) S ( 2 2 ) S ( 2 l )

given 5( 2 2 ) , is (p — l)-variate normal with mean

^{22)^(22)^2)

^ E ( u ) - E(i2)S ( 22)S(21)

and covariance / . Hence R 2 /(l - R 2 ) is distributed as

1 2 | / E ( 1 2 ) S ( 2 2 | 5 ( 2 2 ) E ( 2 2 ) S ( 2 1 )

 X%- p

X p * v

E ( 1 1 ) - E d a j E ^ E f a i )

where Xj.(A) denotes a noncentral chisquare random variable with noncentrality

parameter A and k degrees of freedom. Also

s ( i 2 ) E ( 2 2 ) 5 ( 2 2 ) S | 2 2 | S ( 1 2 )

E(12)S(22)S(21>

is distributed as chisquare Xw—v Since

S(12)S(~22)S(21) p 2

E ( u ) - E ( I 2 ) E ( - 2 1 E ( 2 1 ) 1 -P 2 '

we conclude that R 2 /(l - R 2 ) is distributed as

,2

pl2 1 ^( 2 2 )^2 2 )^ ( 2 2 )^ (2 in

1 2 I _P_ 2 \ 

Using the fact that a noncentral chisquare random variable XmW  can berepresented as xt1+2K> w n e r e K  is a Poisson random variable with parameter

^, we conclude that

^ ( r r ^ - i )

Page 93: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 93/176

84 Group Invariance in Statistical  Inference

is distributed as X - p _ 1 + 2 K - , where the conditional distribution of  K  given x ? v - i

is Poisson with parameter \[p 2l[l  - P 2 )]XN-I -  Let A/2 - i p 2 / ( l - p 2 ) . Then

 P(K  = k) =

fc!r<±(iv-i))

with lb = 0, 1 , . . . . Thus

25<w

-1

>r(i(Ar_i))A;!

_ r(§(iV-i)+*)A*

r ( | ( N - l ) + fc) , k  _ 2 ) i ( N _ i , { 4

is distributed as X p

-1 + 2 K   (4.34)

—T- l o U 1 M I 1 U U . c u d o t —

1 - B 2

N- 

where K  is a negative binomial with pdf given in (4.33). Simple calculations

yield

(N  - p)R 2

(p~l)(l-R 2 )

has central F distribution with parameter (p - 1, N - p) when p 2 — 0. From

(4.33)-(4.34 ) the noncentral distribution of  R 2 is given by

( j - r 2 ) l ( N - P - 3 ) ( r 2 ) l ( P - 3 ) ( 1 _ ftHtr-l)f R ' { r > - r ( i ( N - i ) ) r ( i ( i v - p ) )

, f . W W ( j ( f f - i ) + i) . . . . .x f e ^ ( i ( P - D + l)-

( 4 '  3 5 )

It may be checked that a corresponding maximal invariant in the parametric

space is p 2 .

T h e o r e m 4.3.1. For problem 6 the test which rejects H 0 whenever R 2 ^ C,

 the constant c depends on the level  a of the test, is uniformly most powerful 

invariant.

Page 94: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 94/176

Some Beat Invariant Tests in Multinomials 85

P r o o f . From (4.36)

~ ( A 2 ) T ^ ( J v - i ^ ) r ( j ( p - i ) )

^ i ! r ( i ( p - i ) + i ) r2

( i ( 7 v - i ) ) '

which for a given value of  p 2 is an increasing function of  r 2 . Using Neyman-

Pearson Lemma we get the theorem. •

As in Theorem 4.1.2 we can show that among all tests of  Ho : p 2 = 0

against Hi : p 2 > 0 with power function depending only on p 2 the fi2-test is

UMP.

4 .4 . T es t of Mu l t ip l e Co rr e l at io n wi th P a r t i a l I n f o r m a t i o n

Let X  be a normally distributed p-dimensional random column vector with

mean p and positive definite covariance matrix S, and let X  a , a — 1,... ,7V

 [N  > p) be a random sample of size N  from this distribution. We partition X 

as X = (Xi,X^,X'^Y  where X\ is one-dimensional, X^ is pi-dimensional

and X ( 2 ) is p2-dimensional and 1 + p i + P 2 = p. Le t p\ and p denote the multi

ple correlatron coefncents of  X\ with X[  2 ) and with (X'^X'^y, respectively.

Denote by p\ = p

 2

— p\. We consider here the following two testing problems:

P r o b l e m 7. To test H m: p 2 = 0 against the alternatives Hi\; p\ — 0,

 p\ = X > 0.

P r o b l e m 8. To test H  2 o- p 2 = 0 against the alternatives H  2 \: p\ = 0,

 pl = X>0.

Let NX = X^,X 0 , S = S ^ ^ A - " - X ) ( X ° - X ) ' , 6 ( f J denote the i-vector

consisting of the first i components of a vector b and C[;j denote the i x i upper

left submatrix of a matrix c. Partitions 5 and E as

(5 n 5 ( 1 2 ) S ( 1 3 ) \  / E n E( i 2 j E ( 1 3 )

5(21] 5 | 2 2 ] 5 [ 2 3 ) I , E —I S(2i ) E ( 2 2 ) E ( 2 3)

5(31| 5(32 ) 5(3 3)/ \^(31) E ( 3 2 ) E ( 3 3 )

where 5(2 2

) and E(2

2 ) are each of dimension pi x pi ; 5 (3 3

) and E (3 3

 j are eachof dimension P2 x p 2 . Define

Pi = E ( i 2 ) E ^ 2

1

2 ) E ( 2 1 1 / E 1 1 ,

p 2 = ^ + p I = { E , 1 2 ) E , 1 3 ) ) f g 2 2 ; ^ ^ " ' ( E d ^ ^ l ' / S n ,

 \  (32) ^(33) / 

Page 95: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 95/176

86 Group Invariance in Statistical  Inference

 Rl  - £ ( i 2 ) S ( 2 2 ) S < 2 I ) / \ S i i ,

4 + fl2 = c% 3 )^ }) ( f p jjgj )

 _ 1

It is easy to verify that the translation group transforming

( X , S ; i i , E ) ^ ( X + &,S;p + 6 , E )

leaves the present problem invariant and, along with the full linear group G of 

 p x p nonsingular matrices g

(gu o o \ 

9 = 0 ff(23) 0 (4.36) \  0 9(32) 9(33) / 

where ffii : 1 X 1, 9 { 2 2 ) : pi x pu gl33): p 2 x p 2 , generates a group which

leaves the present problem invariant. The action of these transformations is to

reduce the problem to that where p, = 0 and S = £%=1 X  a X  a' is sufficient for

E , where N  has been reduced by one from what it was originally. We now treat

later formulation considering X", a = 1 , . . . , TV ^ p > 2, to have a common

zero mean vector. We therefore consider the group G of  transformations g

operating as

(S,Z)^(gSg\gU)

for the invariance of the problem. A maximal invariant in the sample space

under G is {Ri,R 2 ) as defined in (4.11). Since S is positive definite with

probability 1 (as N  ^ p ^ 2 by assumption): > 0, R 2 > 0 and Ri + R 2 =

 R 2 , the squared sample multiple correlation coefficent between the first and the

remaining p - 1 components of the random vector X. A corresponding maximal

invariant in the parametric space under G is (p\,p\). T h e joint probability

density function of  (R^R 2 ) is given by (see Giri (1979))

 / a ( n , f 2 ) - m - P 2

 r N/2

(l  ~ fi - f  2 )^ N 

- p

~V  T J f o ) * * - *

x

" A S S « w t * + A > — 1 9 , 1 • < 4

'3 7 )

Page 96: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 96/176

Some Best Invariant Tests in Multinormals 87

where

% = 1 - E P)w i t h = 1

'

 ff 

i = Ep

J'a

?=

~ f)/Wi-i.

.2

7 .

and i f is the normalizing constant.

By straightforward computations the likelihood ratio test of Hio when fl —

{ ( « , E ) : E13 —0} rejects J?io whenever

 f ^ c , (4.38)

where the constant c depends on the size a of the test and under HM , RI has a

central beta distribution with parameter (i p i , ± ( N - p i ) ) , and the likelihood

ratio test of H  20 when fl = { (p , S) : S 1 2 = 0} rejects H 2 Q  whenever

^ 1 ' ^ ~ ^ 4 C , (4-39)1 - r j

where the constant c depends on the size a of the test and under H  2 o the

corresponding random variable Z is distributed independently of R\ as central

beta with parameter (^(JV — pi — ps) , \p 2 )-

T h e o r e m 4.4 .1 . For problem 7 the likelihood  ratio test given in (4.38) is

UMP invariant.

Proof . Under H i0 f  t = 1 , i = 0,1,2. Hence a2

= 0, §i = 0, i = 1,2.

Under ff u P2

. = A, p\ = 0, % = 1, 71 = 1 - \ a  2 = 1 - A, 7 2 - 1 - A, al = 0,

81 = 4fiA and t?2 = 0. Thus

 / f l n ( n , r 2 ) _ ^ _ A j - w / 2

(801i=0

Page 97: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 97/176

88 Group /ntrarinnce in Statistical  Inference

Using the Neyman-Pearson Lemma and (4.40) we get the theorem. •

T h e o r e m 4.4 .2 . For Problem 8 the likelihood  ratio test is UMP invariant

 among all  tests <j>(Ru R.2) b^sed  on R\,R\ satisfying

Proof. Under p\ = 0, p\'= A, % = 1 , % 1, 7a = 1 - A, &{ - 0,

8 a\ = A, h = 0 and 0 2 = 4 f 2 A(l - f i A ) - 1 .

Hence

 fn 3 A f  2\T l)/fH  20( 

 f  2\n) =  fH 1 A rl,?2)/fnio(  rl,r2)

0 0 . r a ( J V - p 1 ) + t ) /   /  4f 2 A

— (2i)! \ l - f  1 A  /  ( 4 4 1 )

From (4.41) / j y a j ( f 2 | r i ) has a monotone likelihood ratio in f 2 = (1 — z)(l  —

 fi). Now using Le hm ann (1959) we get the theorem. •

E x e r c i s e s

1. Prove Equations (4.3) and (4.5).

2. Prove (4.10).

3. Show that JrJ^(A) is distributed as X  m+2K  w n e r e i f is a Poisson random

variable with parameter jA.

4. Prove (4.31).

5. Let 7ri , ?r2

be two p-variate normal populations with means p\ and / i2

andthe same positive definite covariance matrix S . Let X — {Xi,... ,X  P )' be

distributed according to ir\ or ir 2 and let b = ( & i , . . . ,b p )' be a real vector.

Show that

 [E1lb'X)-E2(b , X)\ 2

var(b'X)

is maximum for all b if 6 — E- 1 (pi — p 2 ) where Ei denote the expectation

under Tij.

6. (Giri, 1994a). Let X  a = {X  al  ,..., X  op )', a = 1 , . . . , N(> p) be indepen

dently identically distributed p = 2p!-variate normal random vectors with

mean p = (/*!..- • tUp)' a n < * c o m m o n covariance matr ix E (positive definite).

Let X = jjX^X^ S = ^ , ( X   a - X)(X  a - X)'. Write

Page 98: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 98/176

Some Best Invariant Tests m Multinomials 89

E _ / S ( u ) £ ( i a ) \   s = ( 5 ( 1 1 ) % 2 } V

X = ( X i , . • • X p ) — ,

where S/yj and Sufy are pi x pi for all t, j and X( i) = ( X i , • • • , Xp

, ) '

(a) Show that the likelihood ratio test of  H  a : M(i) = p ( 2 j , with p =

{fi'w ,p[  2) )\ p ( 1 } = ( M I I - - I M P I ) ' rejects rY 0 for large values of 

T2 = N ( X ( 1 ) - X ( 2 ) ) ' ( 5 ( 1 1 ) + 5(22) - 5(i2) - 5(2i )) _ 1(-A"[i) - X ( 2 | )

where T  2 is distributed as x P 1 ( S 2 ) / ' x % - P 1 with 6  2 - /V (M ( I ) ~ r*(2])'

• (E ( i i ) + E (22) - E ( 2 1 1 - E ( 1 2 ) )_

' ( M ( I ) - "(2)) and X p , ( - ) . X w - P l

a r e

independent.

(b) Show that the above test is U M P I .

7. (Gir i , 19946). In problem 5 let T - E " V - ( T i r 2 p J ' = (T^j,Tfaf 

with r ( 1 ) = ( r 1 [ . . . , r  j „ ) ' .

(a) Show that the likelihood ratio test of  HQ  : T ( i ) = r ( 2 ) rejects Ho for

small values of  Z

_ 1 + iV(X( i ) + X ( 2 ) ) , ( S ( n ) + 5(22) + 5(21) + 5 ( i 2 ) ) ~ 1 ( X ( i ) + X(2))

l  + NX'S^X 

where Z is distributed as central beta with parameter {^{N  — pi), \pi)

under Hn.

(b) Show that it is U M P I similar.

(c) Find the likelihood ratio test of  H 0 : T ^ , = AF( 2 ) when A is known.

8. (Gir i , 1994c). In problem 6 write

(£ ( i i ) E ( 1 2 j E ( 1 3 ) \ / S ( i t ) 5 ( 1 2 ) S ( 1 3 |

E(2i) E { 2 2 ) E(23) I , 5 = I 5(2 i) 5( 2 2) S ( 2 3 |

E(3l) E ( 3 3 ) E ( 3 2 ) /  \ -S"<31) 5(32) 5(33)

where E ( 3 1 ) and 5(U

) are 1 x 1, E ( 2 2 ) and 5(2

2) are pi x pi and E ( 3 3

 j and5(33) are pi x pi w it h 2pi = p — 1.

(a) Show that the likelihood ratio test of  H 0 : E ( 1 2 ) = E ( 1 3 ) rejects H  a for

large values of 

 R\ = ( 5 ( 1 2 ) - 5 ( i 3 ) ) ( 5 ( 2 2 ) + S ( 3 3 ) - S ( 3 2 ) - 5 ( 23 ) ) 1 (5(21) ~ 5 ( 3 ! ) ) / 5 ( U ) ,

Page 99: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 99/176

90 Group Invariance in Statistical  Inference

the probability density function of  R\ is given by

_ ( 1 - ^ - 0 ( 1 - ^ ( ^ - 3 )

r ( i ( i v - i ) ) r ( i ( i v - P l ) )

 j = 0

( ^ ) J

( ^ ? >P l + J

" -1 )

r 3

( ^ ( J v - i ) + j) mhpi+j)

where

p 2 = i V ( £ ( 1 2 ) - 2(13)1(2(22) -I- £ { 3 3 ) - £ ( 3 2 ) - E ( 2 3 ] ) ~l

x ( £ ( 2 i ) - £ ( 3 i ) ) / £ ( 1 1 ) .

(b) Show that no optimum invariant test exists for this problem.

R e f e r e n c e s

1. N, Giri, Locally Minimax Tests of Multiple Correlations, Canad. J . Statist. 7,

53-60 (1979).

2. N. Giri, Multivariate Statistical Inference, Academic Press, N.Y., 1977.

3. N.Giri, On the Likelihood Ratio Test of a Normal Multivariate Testing Problem,Ann. Math. Statist. 35, 181-189 (1964).

4. N. Giri, On a Multivariate Testing Problem, Calcutta Stat. Assoc. Bull. 11, 55-60

(1962).

5. N.Giri, On the Likelihood Ratio Test of A Multivariate Testing Problem II, Ann.

Math. Statist. 36, 1061-1065 (1965).

6. N. Giri, Admissibility of Some Minimax Tests of Multivariate Mean, Rapport de

Research, Mathematics and Statistics, University of Montreal, 1994a.

7. N. Giri, On a Test of  Discriminant Coefficients, Rapport de Research, Mathe

matics and Statistics, University of Montreal, 1994b.

8. N. Giri, On Tests of Multiple Correlations with Partial Informations, Rapport

de Research, Mathematics and Statistics, Univerity of Montreal, 1994c.

9. N. Giri, Multivariate Statistical Analysis, Marcel Dekker, N. Y . , 1996.

10. J . K . Ghosh, Invariance in Testing and  Estimation, Tech. report no Math-Stat

2/672. Indian Statistical Institute, Calcutta, India, 1967.

11. E . L . Lehmann, Testing Statistical Hypotheses, Wiley, N.Y., 1959.

12. A. Wald, Tests of Statistical Hypotheses Concerning Parameters when the Num

 ber of  Observations is Large, Trans. Amer. Math. Soc, 54, 426-482 (1959).

Page 100: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 100/176

Chapter 5

SOME MINIMAX TESTS INMULTINORMALES

5 .0 . I n t r o d u c t i o n

The invariance principle, restric ting its attention to invariant tests only,

allows us to consider a subclass of the class of all available tests. Nat ur al ly

a question arises, under what conditions, an optimum invariant test is alsooptimum among the class of all tests if such can at all be achieved. A power

ful support for this comes from the celebrated unpublished work of Hunt and

Stein, popularly known as the Hunt-Stein theorem, who towards the end of 

Second World Wa r proved that under certain conditions on the transforma

tion group G , there exists an invariant test of level a which is also minimax,

i.e. minimizes the maximum error of second kind (1-power) among all tests.

Though many proofs of this theorem have now appeared in the literature, the

version of this theorem which appeared in Lehmann (1959) is probably closein spirit to that originally developed by Hu nt and Ste in. P i tt man (1939) gave

intuitive reasons for the use of best invariant procedure in hypothesis test ing

problems concerning location and scale parameters. Wald (1939) had the idea

that for certain nonsequential location parameter estimation problems under

certain restrictions on the group there exists an invariant estimator which is

minimax. Peisakoff (1950) in his P h . D . thesis pointed out that there seems to

be a locuna in Wa ld 's proof and he gave a general development of the theory

of minimax decision procedures invariant under transformation group. Kiefer

(1957) proved an analogue of the Hunt-Stein theorem for the continuous and

91

Page 101: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 101/176

92 Group Invariance in Statistical Inference

discrete sequential decision problems and extended this theorem to other de

cision problems. Wesler (1959) generalized for modified min im ax tests based

on slices of the parametric space.

It is well-known that for statistical inference problems we can, without any

loss of generality, characterize sta tis tic al tests as functions of sufficient statis ticinstead of sample observations. Such a characterization introduces considerable

simplifications to the sample space without loosing any information concerning

the problem at hand. Though such a characterization in terms of maximal

invariant is too strong a result to expect, the Hunt -Stein theorem has made

considerable contribution towards that direction. T h e Hunt-Ste in theorem

gives conditions on the transformation groups such that given any test <f> for

the problem of testing H : 8 €  fl// against the alternatives K : $ € fl/f, with

fi / /   n fi/f a null set, there exists an invariant test such that

sup ES(f>> sup Ee yj, (5.1)eefi„ eer2„

inf  Ee d> < inf  Eeil>. (5.2)fe n* ~ eefi K

In other words, V behaves at least as good as any <f> in the worst possible cases.

We shall present only the statements of this theorem. Fo r a detai led discussion

and a proof the reader is referred to Lehmann (1959) or Ghosh (1967).

L e t V — {Pg,0 e f!} be a dominated family of distr ibutions on the E u

clidean space (X,A), dominated by a cr-finite measure p. Let G be the group

of transformations, operating from the left on X, leave fi invariant.

T h e o r e m 5.0.1. (Hunt-Stein Theorem). Let B be a a-field  of  subsets of 

G such that for any A E A, the set of pairs (x,g) with gx £ A is in A x B

 and for any B € B, g G G , Bg e B. Suppose that there exists a sequence of 

 distribution functions v n on ( G , B) which is asymptotically right invariant in

 the sense that for any g € G , B € B

 \im^\v n(Bg) - v»(B)\ = 0. (5.3)

Then, given any test <j>, there exists a test V1 which is almost invariant and

satisfies conditions (5.1) and (5.2).

It is a remarkable feature of this theorem that its assumptions have nothing

to do with the statistical aspects of the problem and they involve only the group

G. However, for the problem of admissibility of statistical tests the situation is

more complicated. If G is a finite or a local ly compact group the best invariant

test is admissible . For other groups the nature of  V  plays a dominant role.

Page 102: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 102/176

Some Minimax Test in Muttinormates 93

T h e proof of Th eo re m 5.0.1 is straightforward if G is a finite group. Let m

denote the number of  elements of  G. We define

As observed in Chapter 2, invariant measures exist for many groups and they

are essentially unique. But frequently they are not finite and as a result they

cannot be taken as a probability measure.

We have shown in Chapter 2 that on the group 0{p) of orthogonal matr ices

of order p an invariant probability measure exists an d this group satisfies the

conditions of the Hunt -St ein theorem. T h e group GT(P)  of nonsingular lower

triangular matrices of order p also satisfies the condit ions of this theorem (see

Lehmann, 1959, p. 345).

5.1. L o c a l l y M i n i m a x T e s t s

Let (X,A) be a measurable space. Fo r each point (5, JJ) in the parametric

space f!, where 6 > 0 and ij is of  arbitrary dimension and its range may depend

on i5, suppose that p{-; 6  ,n) is a probability density function on (3Z  ,A) with

respect to some u-finite measure p . We are interested in testing at level a

(0 < a < 1) the hypothesis HQ : 6 — 0 against the alternative Hi : 6 = A,

where A is a positive specified canst ant and in giving a sufficient condition for

a test to be locally minimax in the sense of (5.7) below. This is a local theory

in sense that p(x;A, JJ) is close to p(x;o, 7?) when A is small. Obv iously, the n,

every test of level a would be locally minimax in the sense of  trivial criterion

obtained by not substracting a in the numerator and the denominator of (5.7).

It may be remarked that our method of proof of (5.7) consists merely

of considering local power behaviour with sufficient accuracy to obtain anapproximate version of the classical result that a Bayes procedure with constant

risk is mi nima x. A result of this type can be proved under various possible

types of conditions of which we choose a form which is more convenient in m any

applications and stating other possible generalizations and simplifications as

remarks. Throughout this section expressions like o(A), o(/i(A)) are to be

interpreted as A — >  0.

For fixed a , 0 < a < 1 we shall consider critical regions of the form

(5.4)

where V  is bounded and positive and has a continuous distribution function

for each {S,Tj), equicontinuous in (6  ,7)} for 6 <  some 6$ and which satisfies

Page 103: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 103/176

94 Group Invariance in Statistical Inference

(5.5) P x ,v(R) = a + k(X)+q(X,V  )

where q(X,n) = o{h(X)) uniformly in v with h(X) > 0 for A > 0 and h(X) =

o( l ) . We shall be concerned with a priori probability density functions £<J,Aand t]\,x on sets 6  = 0, 8 = A respectively, for which

{ f r o i f ^ = 1 + WW) + r(Xmm + B(  X  , A) (5.6)

where 0 < Ci < r(A) < c 2 < oo for A sufficiently sm all an d p(A ) = o ( l ) ,

 B(x,X) = o(h(X)) uniformly in x.

T h e o r e m 5.1.1. (Locally Minimax Tests) If the region R satisfies (5.5)

 and for sufficiently small X  there exist £o,i. and fa^ satisfying (5.6) then R is

locally minimax of  level  a for testing Ho '• 8 = 0 against Hi : 8 = X as A —* 0,

 that is to say,

lim inf , PUR)-o = t _ « -

i -o Sup0 A e Q a inf , Px, n{4>\ rejects i f 0 } - o

tu/iere Q Q is t/ie class of all  level  a tests <p\.

Proof . Let

 r\ = [2 + h(X){g(X) + c a r ( A ) } ] - 1 .

Then

 —p- = 1 + h{X) \g(X) + c a r(X)}. (5.8)

Using (5.7) and (5.8) the Bayes critical region relative to the a priori distribu

tion

£\ = (1 - T A ) £ O , X + n £   M 

and ( 0 — 1 ) loss is given by

Page 104: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 104/176

Some Minimal Teat in Multino-rmales 95

Define for any subset A

 Pi,*(A) = J  Px,vW) 6 , A W ,

Vx=R — B\,

W   X  = B  X  - R . (5.10)

Using the fact that

sup B(x,X)

 h{\)- o ( A )

and the continuity assumption on the distribution function of  U  we get

 Plx(Vx + W  x )=o(X). (5.11)

Also for U  x = V  k or

 Pi\x(Ux) = PS,x(Ux)[l  + Pim)}]  - (5-12)

Let

 r* x(A) = {1 - n) ^,A

£4 + n ( l - pfc(jf)).

Using (5.9), (5.10) and (5.11) the integrated Bayes risk relative to $\ is given

by

 r'x(Bx) = rl(R) + (1 - n ) [ P 0 ' , x ( ^ ) - P 0 * A ( V A ) ]

+ n [ f , r , A ( ^ ) - F 1 * , A ( - y - A ) ]

= r A ( R ) 4- (1 - 2 r A ) [ P 0 * , ( ^ ) - P Q * A (V A ) ]

+ ^ O % ( V X + W A ) O ( M A ) )

= r A ( R ) + o( /  l (A)). (5.13)

If  (5.7) were false one could by (5.5) find a family of  tests {0>J of level a such

that d> x has power function cv + g(X, ij) on the set 6  — A with

lim sup [inf g(A, JJ) - h(\)]/h(\) > 0 .

The integrated risk r' x of ^> with respect to £\ would then satisfy

contradicting (5.13).

l im s u P K ( i i ) - r , ] / h ( A ) > 0 ,A—0

Page 105: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 105/176

96 Group invariance in Statistical  Inference

R e m a r k s .

(1) Let the set {6 = 0} be a single point and the set {8 = A } be a convex

finite dimensional Euclidean set where in each component $ of  TJ is

00(A))- If 

 P1

? ' ^ = l + h(X) U(x) + T, 6i(*) OiifA) m + B{x, A, TJ) (5.14)p(ar;0,Jj)

where sif  ay are bounded and sup X i , J3(X,A,JJ) = o ( / i (A) ) , and if there

exists any satisfying (5.6 ), then the degenerate measure f ? A which

assigns all its measure to the mean of  £I,A also satisfies (5.6).

(2) T h e assumption on B can be weakened to Px,„{\B(x, A ) | < e k(X)} -» 0as A - * 0 uniformly in n for each e > 0. If the i are independent of 

A the uniformity of the last condition is unnecessary. T he boundedness

of  U  and the equicontinuity of the distribution of  U  can be similarly

weakened.

(3) T he conclusion of Theorem 5.1.1 also holds if Q„ is modified to include

every family {<j>x} of  tests of level a + o(h{X)). One can similarly

consider the optimality of the family {Ux} rather than single U  by

replacing R by Rx - {x : U  x{x) > c a ,x}, where PQt „{Rx} = a - qx(v)

with qxiv) — o(h{X)).

(4) We can refine (5.7) by including one or more error terms. Specifically

one may be interested to know if a level a critical region R which

satisfies (5.7) with

inf  Px, n{R) = a + ci A + o(A), as A - > 0

also satisfies

l i m mUPxMRl-a-dX  = ^

o sup^ e Q O  inf , Px,v{<j> rejects H 0 } - a - c,X  x

In the setting of (5.14) this involves two moments of the a priori distribution

£ I ,A rather than just one in Theorem 5.1.1. As further refinements are invoked

more moments are brought in.

The theory of locally minimax test as developed above and the theory of 

asymptotically minimax test (far in distance from the null hypothesis) to be

developed later in this chapter serve two purposes. F i r s t the obvious point of 

demonstrating such properties for their own sake. But well known valid doubts

Page 106: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 106/176

Some Minimax Test in Multinormales 97

have been raised as to meaningfulness of such properties. Secondly, then, and

in our opinion more important, these properties can give an indication of what

to look for in the way of genuine minimax or admissibility property of certain

tests, even though the later do not follow from the local or the as ym ptot ic

properties.

E x a m p l e 5.1.1. (T 2 -t es t) . Conside r Prob lem 1 of Cha pte r 4. L et X i , . . . ,

XN  be independently and identically distributed J V p ( p , E ) random vectors.

Write NX  = X „ 5 - £ f   ( X ; - - X)'. Let 6 = > 0.

For testing H 0 : 8 = 0 against the alternatives Hi : 8 > 0 the Hotelling's T  2

test which rejects H Q whenever R - NX'{S + NXX')~l 

 X  > c, where c is

chosen to yield level a, is U M P I (Theorem 4.1.1).

Note. For notational convenience we are writing R\ as R.

L et 6  — A > 0 (specified). We are concerned here to find the local ly

minimax test of  Hn against Hj : S = A as A -> 0 in the sense of (5.7 ). We

assume that N > p, since it is easily shown that the denominator of (5.7) is

zero in the degenerate case N < p. In our search for locally minimax test as

A—> 0 we may restrict attention to the space of sufficient s tat isti c ( X , S). T he

general linear group G)(p) of nousingular matrix of order p operating as

( £ , s ; p , E ) — . (gx,gsg';gp, gT,g')

leaves this problem invariant. However, as discussed earlier (see also James

and Ste in, 1960, p. 376), the Hu nt -Ste in theorem cannot be applied to the

group G ( (p ) , p > 2. However this theorem does apply to the subgroup Gj- (p)

of nonsingular lower triangular matrices of order p. Thus, for each A, there

is a level a test which is almost invariant and hence for this problem which

is invariant under Grip) (see Lehmann, 1959, p. 225) and which minimizes,

among all level a tests, the minimu m power under H i . From the local point

of view, the denominator of (5.7) remains unchanged by the restriction to GT 

invariant tests and for any level a test d> there is a G j invariant level a test

 4>' for which the expression in f, Px , , [4>' rejects H  a ) is at least as large, so

that a procedure which is locally minimax among GT  invariant level a tests,

is locally minimax among all level a tests.

In the place of one-dimensional maximal invariant R under G j( p) we obta in

a p-dimensional maximal invariant (Ri,...,R p ) as defined in Theorem 2.8.1

with k = p and di — 1 for all i, satisfying

Page 107: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 107/176

98 Group Invariance in Statistical  Inference

 J2 Rj = NX{  A{S M  + N X ^ r ' X ^ , i = 1, .. - ,p (5.16)

i=]

with R, > 0, £ ] ? = l flj - J? < 1 and the corresponding maxi mal invariant on

the parametric space of  ( /* ,£) is (^ 6 P ) (Theorem 2.8.1) where

(5.17)

with Sf_i — 5. The nuisance parameter in this reduced setup is n =

(rn,...,%)', with jji = 6,/6  > 0, V _ ^ = 1 TJJ = 1 and under H 0 i) = 0. From

Theorem 2.8.1 the Lebesgue density of  Ri,... ,R P on the set

i n , . . . , r p ) : r , > 0, J P f l < 1

is given by

 / ( r i , . . . , r „ ) =

x exp

y j , / J V - « + 1 i nSi

2' 2(5.18)

We now verify the validity of Theorem 2.8.1 for U  — Yl^=i ^ - s i n c e those

preceding (5.5) for the locally minimax tests are obvious. In (5.5) we take

fi(A) = b\ with 6 a positive constant. Of course P\,„(R) does not depend on

77. From (5.18) we get with r = ( r i , . . . r p ) ' ,

 / o , , (r ) 2+ B(J,X,V) (5-19)

where B(r,\,-n) — o(A) uniformly in r , n as A —* 0. The equation (5.6) issatisfied by letting £ 0 | A give measure one to the single point TJ — 0 while

gives measure one to the single point rj = TP* = (TJ", . . . where

T?; - (N - j)~ L(N - j + l r v ^ o v - p ) , j = 1 , . . . , p

Page 108: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 108/176

Some Minimax Tesi in MultinoTmales 99

so that Y,i>3 n* + (N - j + l }rj; = £ for al l j. From Theorem 5.1.1 we get

the following theorem.

T h e o r e m 5.1.2. For every p,N,a{N > p) Hotelling's T  2 test based  on

T  2 or equivalently on R is locally minimax for testing HQ : 6  — 0 against the

 alternatives Hi : 6 = X as X —* 0.

E x a m p l e 5.1.3. Consider Problem 2 of  Chapter 4. From Example 5.1.2

it follows that the maximal invariant in the sample space is ( J ? i , . . . , R P1 ) with

 Ri = a n d t n e corresponding maximal invariant in the parametri c space

is ( f i i , . . . , (J P , ) with 8\ = Yii' 6j. Under Ho, Si = 0 and under Hi 6i — X.

Now following E xample 5.1.2 we prove Theorem 5.1.3.

T h e o r e m 5.1.3. For every pi,N,a the UMPi test which rejects Ho for

large values of R\ is locally minimax for testing Ho : &i — 0 against the

 alternative Hi : 6\ — X as X —* 0.

E x a m p l e 5.1.4. Consider Problem 3 of Chapter 4. We are concerned here

to find the locally minimax test of  Ho : Si = 0, S 2 = 0 against the alternatives

 Hi : 8i = 0, 6  2 — X >  0. As usual we assume that N > p the determinator

of (5.7) is not zero. To find the locally minimax test we restrict attention

to the space of sufficient statistic (Xi S). The group (G2,T  2 ) operating as

(X, S) —* (gX + t 2 , gSg'), g € G 2 , t 2 € T 2 , which leaves the problem invariant,

does not satisfy the condition of the Hunt-Ste in theorem. However this theorem

holds for the subgroup (GT {p\ + p 2 ) , T  2 ). A maximal invariant in the sample

space under this subgroup is Ri...., R Pl  + P I such that

^R^NX'u (SM + NXMX {i]  )-l  X  [{l  , i = l , . . . , p i + P 2 (5.20)

with Ri >0,Ri= Rj, Ri+R 2 = E ^ L i " 2 R 3 , and the corresponding

maximal invariant in the parametric space is 61,..., 6  Pl + P 2 such that

with Si > 0,6 = E j P L i 6j, h + 6  2 = 5,. Under H 0 Si = 0 for all i and

under Hi Si — 0, i — 1 , . . . , p i , fij = 6 = \ > 0. The nuisance parameter in this

(5.21)

Page 109: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 109/176

100 Croup Invariance in Statistical  Inference

reduced set up is fj = ( r f i , . . . , Vp,+piY  w ' * h if, = ^ . Under H 0 n - 0 and under

 Hi, ifc = 0 i = l . . . - , p i , ifc > 0, « = pi + 1 P i + Pa with E f ^ + i * = L

Equation (5.19) in this case reduces to

 / o , „ ( r ) 2- l + f,-!- M ^ r j . + f / V - . j + l ) ^ + £ ( r , n , A )

(5.22)

as A —» 0 with B(r, JJ, A) = o(A) uniformly in r, n.

The set £ 2 = 0, 6i = 0 is a single point n = 0, so £O,A assigns measure

one to the single point ij = 0. Since the set {Si = 0, 6 2 — A (fixed)} Is a

convex p 2-dimensional Eucl idean set wherein each component jj ; is o(/ i(A)),

any probabi lity measure can be replaced by the degenerate measure ££ A

which assigns measure one to the mean (0,0,*7p,+i>-• • < n p,+ P2 °f &,*)•

Hence from (5.22)

/  fx, n(r) £ux(dv)

 I  fo.n(r)

= I  + ^ - i + f i + "E *(l>+Cw-i+i>%J

+ B ( r , A , 7 j )   \iiAdn) 

= 1 + - - l + f , + £ ^ ( ^ i t f + t W - j + l h n + B ( r , A )

L J = P 1 +

' V , > J / J (5.23)

where B{r, A) — o(/i (A)} uniformly in r .

Let il* be the rejection region defined by

 Rk = {X : U(X) = J * , + kR 2 > C Q } (5.24)

where fc is chosen such that (5.23) is reduced to yield (5.6) and the constant

C  a depends on the level a of the test for the chosen fc. Choose

( j y - j - i ) . . . ( / V - P l - p 3 ) ( A 7 - P i )

P2(iV - p , - P 2 + 1)

Page 110: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 110/176

Some Minimal Test in Muttinormales 101

 j = pi + 1, . . . ,p i + p 2 - 1 ,

n _ ( ^ - P i ) ( 5 2 5 1

so that

N - p ,V M ^ - J + l l ^ ^—^-, J = Pi + 1, - - ,pl + P 2 •

The test 0* with rejection region

J T = j * : U ( X ) = % + > C „ | (5.26)

with Pa,\{R*) — a satisfies (5.6) as A —> 0. Furthermore any region R& of 

structure (5.24) must have k — to satisfy (5.6) for some From

Theorem 4.1 .4, for any invariant region R*, Px tV (R*) depends only on A and

hence from (5.5) q{\,r)) — 0 and thus tp* is L B I for this problem as A —* 0.

Hotelling's T  2 test based on Ri + R 2 does not coincide with 4>' and hence it

is locally worse. It is easy to verify that the power function of Hotelling's test

which depends only on 03, 6\ being zero, has positive derivative everywhere,

in particular, at 8 2 — 0. Thus from (5.5), with R — R*, h(A) > 0. Hence we

get the following theorem.

Theorem 5.1 .4. For Problem 3 the L B I test 0* is locally minimax as A —* 0.

R e m a r k s . Hotelling's T  2 test and the likelihood ratio test are also invari

ant under (G 2 , T  2 ) and therefore thier power functions are functions of 6 2 only.

From the above theorem it follows that neither of  these two tests maximises

the derivative of the power function at S 2 = 0. So Hotelling's T  2 test and the

likelihood ratio test are not locally minimax for this problem.

E x a m p l e 5.1.5. Consider Problem 6 of  Chapter 4. The affine group

( G , T), transforming

( X , S; p, S ) (gX + t, gSg'; ga + t, gBg')

where g € G, t € T  leaves the problem of testing H^p 2

— 0 again H\ : p 2

=

A > 0 (specified) invariant. However the subgroup ( G T ( P ) , T ) , where Gj-(p)

is the multiplicative group of nonsingular lower triangular matrices of order

p whose first column contains only zeros expect for the first element, satisfies

Hunt-Stein conditions. The action of the translation group T  is to reduce the

Page 111: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 111/176

102 Group Invariance in Statistical  Inference

mean a to zero and S — E a = i X  a(X°)' is sufficient where N  has been reduced

by one from what it was originally. We treat the latter formulation considering

A T 1 , . . . , A T W to have zero mean and positive definite covariance matrix E and

consider only the subgroup Gx{p) for invariance. A maximal invariant in the

sample space under Gx{p) is R= (R 2 , • - -, Rp)', where

_ S [l2 )[i]  5 ( ~ 2 ) [ i ] S ( 2 ] ) [ i ]

5 ( H )i = 2 , . . .,p (5.27)

i=i

where C[i] denotes the upper left-hand corner submatrix of  C  of order i and

tyi] denotes the i-vector consisting of the first i components of  a vector 6 and

 \ 

A

( 2 1 ) 3(22)7where 5(22) is a square matr ix of order (p— 1). As usual we assum e tha t N >p

which implies that 5 is positive definite with probability one. Hence Ri > 0,

 R 2 = U = ^2*- 2 Rj S 1- (R 2 is the squared sample multiple correlation

coefficient). T h e corresponding ma xim al invarian t in the par ametr ic space is

A = (6  2 ,...,S PY, where

5 > -i-2

.2 _ V P

£('2)['] S (22)[i1 S<31)[i]

E ( U )

(5.28)

with 6, > 0, p 2 = E ? = 2 5 > - L e t

The joint distribution of  J ? 2 , . . . , R p is (see Giri and Kiefer, 1964)

( l - X ) ^ ( l - T , ^ r i ) i l N  -  p - l )

 nHr-Dr(l(N-p+ I ) ) ( l + E P

= 2 r j i C l - A ) /  7 i - 1])

x f i + i)^ N -+2]  r(l{N -i +2))J= 2

ft,=D /3P=D \j=2

3=2 L

r ( i ( j v - i + 2) + ft; 4 T - J ( l - A ) /  7 j ( l + 7 r -1 0

(5.29)

Page 112: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 112/176

Some Minimax Test in Multinormates 103

where 7_,- = 1 - A J j ^ l f t , ttj = Sjfy. Th e expression 1 / ( 1 + TT̂ -

6j  -  0. From (5.29) we obtain

means 0 if 

 /o,o(r) 2 3=2 \ i>J

+ B ( r , A , »  /  ) , (5.30)

where £J(j", A,T?) —o (A) uniformly in r, 77.

From Th eo re m 4.3.1 it follows th at the assumptio n of Th eore m 5.1.1 are

satisfied with U  = £ P

= 2 -^j = R 2 a i m M-M = bX, b > 0. As in exam ple 5.1.3

letting £ i | A give measure one to n" = ( ) )§ , . . . . ,1(2} where j j j — (JV— j + l j - ' ( W —

 j + 2 ) - 1 ( p - l ) _ 1 J V ( i V - p + 1), j = 2 , . . . , p - l we get (5.6) . Hence we prove

the following theorem.

T h e o r e m 5.1 .5 . For ever?/  p,N and a the R? test which rejects H 0 for

large values of the squared  sample multiple correlation coefficient R 2 is locally

 minimax for testing HQ  against Hi : p 2 = A —> 0.

R e m a r k s . It is not a coincidence that (5.19) and (5.29) have the same

form. (5.18) involves the ratio of noncentral to central chisquare distribution

while (5.29) involves similar ratio with random noncentrality parameter. The

first order terms in the expressions which involve only mathematical expecta

tions of  these quantities correspond each other.

E x a m p l e 5. 1. 5 . Consid er problems 7 and 8 of Sec. 4.3. T h e group G 

as defined there does not satisfy the condition of the Hunt-Stein theorem.

However this theorem does apply to the solvable subgroup GT  of p x p lower

triangular matrices g of the form

9 = 

(9u 

0

V 0

0

322

92p 

0 \ 0

Spp/ 

(5.31)

From Example 5.1.4 a maximal invariant in the sample space under GT  is

 R — ( i ? 2 , . . . , R p ) ' and the corresponding maximal invariant in the parametric

space is A = (62,...,6 P  )'. From (5.29) R has a single distribution under

H IQ  and H  2 o and it has a distribution which depends continuously on the pi-

dimension parameter Fix under Hi\ and on the p-dimension paramete r r 2 A

under H  2X  , where

Page 113: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 113/176

104 Group Invariance in Statistical  Inference

T  IX  = { A : tf< > 0, t = 2 , . . . , p i + l , Si=0,i = pi

PI+I 

+ 2 , . . . , p , = pf = A } ,

;=2

r 2 > = { A : tfi = 0, t = 2 , . . . , p i + l , 6i>0, i — pi

+ 2 , . . . , p , g ft = 4 = A>. (5.32)

i= P 1 +2

Let

with tfi/p2, if  p 2 > 0

0, if p 2 = 0 .

Because of the compactness of the reduced parameter space {0 } and r1A

and the continuity of  fx,n(  r ) in n we conclude from Wald (1950) that every

minimax test for the reduced problem in terms of the maximal invariant R is

Bayes. Thus any test based on R with constant power on TJX  is minimax for

problem 7 if and only if it is Bayes. From (5.29) as A ~* 0 we get (for testing

 H 10 against Hi\)

P I + I

 j=2 , i > j

+ B ( r , A , i j ) (5.33)

where S ( r , A, 17) — o(A) uniformly in r and r>. It is obvious that the assumptionof Theorem 5.1.1 is satisfied with

P I + I

U= J^Rj = Ri, h(\) = b\

 j=2

where b > 0. The set p 2 — 0 is a single point n = 0, so £rjj assigns measure one

to the point T? — 0. The set Fix is a convex p\-dimensional set wherein each

component r>i— o(h(\)). So for a Bayes solution any probability measure t\\,x

can be replaced by the degenerate measure £* A which assigns the probability

one tomeanri 0 - (J? 0,. . . ,J7°, , + 1 ) o f£ i* . Choos ing£ * A which assigns probability

one to the point whose j t h coordinate is = (N-j + l)~1

(N-j+2)-1

 p21

(N-

Page 114: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 114/176

Some Mintmax Test in Multinormates 105

 p + l){N-pi), j =2 , . . . , p i + l we get (5.6 ) from (5.33) . T h e condit ion (5.5)

follows from The or em 4.4.1. Hence we prove the following theorem.

T h e o r e m 5 .1 .5 . For Problem 7 the likelihood  ratio test of Hit defined  in

(4.38), is locally minimax against the alternatives H \ \ as X —> 0.

In problem 8 as X —> 0 (for testing H 2 Q against H  2 x)

A , „ ( r ) = l .V A- i + n + Y, r i ( S ' K + ( i V - i + 2 H.. 1 + 5 ( r , A , 7 i ) ,

 /o,T,(r)

(5.34)

where B ( r , A, JJ) — o(/i(A)) uniformly in r, TJ. Since the set I ^ A is a p2-dimen-

sional Euclidean set wherein each component r/, — o(h(X)), using the same

argument as in problem 7 we can write

/ I 

 h,n(  r )^o,x(dv)

- 1 + f, + YI r>(£ w + ( w - 3 + 2)%

 j = p i + i

JVA= 1 + - i + f,+ 2 ^(Sf+c^-j'+iM + B ( r , A )

(5.35)

where B ( r , A ) = o(/i (A) ) uniformly in r and £ I A assigns measures one to

« + 2 < ) ' •

Let iifc be the rejection region, given by,

 Rk = {x: U{x) = n + kf  2 > c a } (5.36)

where k is chosen such that (5.36) is reduced to yield (5.5) and c Q depends on

the level of significance a of the test for the chosen k. Now choosing

o (N-p + 2)(N-p+l) „

p ( N - p + 2)p 2

Page 115: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 115/176

106 Group Invariance in Statistical Inference

we conclude that the test <p* with the rejection region

 B* = U(x) = ft + ^ — - ' 2 > c a jwith PO ,A (W* ) = Q satisfies (5.6) as A -» 0. Moreover any region R k of the

form (5.36) must have k = in order to satisfy (5.36) for some From

(4.37) it is easy to conclude that the test 4>' is L B I for problem 8 as A—» 0. T h e

iE 2-test which rejects H20 whenever r 2 — f i +f  2 > c a does not coincide with 0*

and is locally worse. From Sec. 4.3 it is evident that the power function of  R?

test depends only on p 2 and has positive derivatives everywhere in particu lar

at p 2 = 0. From (5.5) with R — R*, h{\) > 0. So we get the following theorem.

T h e o r e m 5.1.6. For Problem 8 the LBI  test <p* which rejects H  2 a whenever

f 1 + N ~f l  f  2 > c a where c a depends on the size a of the test, is locally minimax

 against the alternatives as A —* 0.

R e m a r k s . In order for an invariant test of  H  2 n against H  2 x to be minimax,

it has to be minimax among all invariant tests also. However, since for an

invariant test the power function is constant on each contour p 2 = p\ = A,

"minimax" simply means "most powerful". T h e rejection region of the most

powerful test is obtained from (5.34), from which it is easy to conclude that

[/A,7j('")//o,n(r)] depends non-trivia lly on A so that no test can be most powerful

for every value of  a.

5 .2 . A s y m p t o t i c a l l y M i n i m a x T e s t s

We treat here the setting of (5.1) when A — > 00. Expressions like o(l),

 o(H(\)) are to be interpreted as A - * 00. We shall be concerned here in maximizing a probability of  error which tends to zero. T h e reader, familiar with

the large sample theory, may recall that in this setting it is difficult to com

pare direc tly approximations to such sm al l probabilities for different families

of  tests and one instead compares their logar ithms. Whi le our considerations

are asymptotic in a sense not involving sample sizes, we encounter the same

difficulty which accounts for the form (5.39) below.

Assume that the region

 R={x:U(x)>c a } (5.37)

satisfies in place of (5.5)

Page 116: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 116/176

Some Minimal Test in Mu Kin urinates 107

= (5.38) P x>n(R) = 1 - e x p { - i r ( A ) (1 + o( l) ) }

where #(A) —i co as A —» co and o(l) term is uniform in TJ. Suppose that

p(x,A,7j)^ l i A (d7i)

/p{x,0 ,7j )^ 0 , A (dj ; )

= exp{ff(A)[G(A) + R(\)U{x)} + B(af ,A)} (5.39)

where sup^ |B(a: ,A) | = o{H(X)), and 0 < cj < R(X) < c 2 < oo. One other

regularity assumption is that c Q is a point of increase from the left of the

distribution function of  U  when 6 — 0 uniformly in n, i.e.

i n fP o „ (E /   > c  a - e ) > a (5.40)

for every e > 0.

T h e o r e m 5.2.1. / /   U  satisfies (5.38)-(5.40) and for sufficiently large A

 there exist £ 0 , A and  £ i ( A satisfying (5.39), then the rejection region R is asymp

 totically logarithmically minimax of level a for testing Ho -6 = 0 against the

 alternatives Hi : 6 — A as A -» oo, i.e.,

l i m i n f , [ - l o g ( l - P A , , { * } ) ] = 1

A - » sup^ A e Q a inf ,[- log(l - P a , , ( 0 A rejects /Jo))]

Proof . Assume that (5.41) does not hold. Then there exists an e > 0 and

an unbounded sequence V of values A with corresponding tests d>\ in Q a whose

critical region satisfies

 P X  ,V {R} > 1 - e x p { - # (A) (1 + 5e)} (5.42)

for all IJ . There are two possible cases, (5.43) and (5.46). If A € T and

- 1 - ( 7 ( A) < R(X) c a + 2e, (5.43)

consider the a priori distribution £A (see Theorem 5.1.1) given by and r A

satisfying

r A / ( l - n ) = exp {if (A) (1 + 4e)} . (5.44)

Using (5.42) and (5.44) the integrated risk of any Bayes procedure £f A

mustsatisfy

 rl(Bx) < rl(M  < (1 - r  x)a + r A e x p { - J / ( A ) (1 + 5e)}

- (1 - Tx)\a + e x p { - e H ( A ) } ] . (5.45)

Page 117: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 117/176

Page 118: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 118/176

Some Minimal Test in Multinormo/es 109

with Sup \B(r,T), A)I — o(l) as A —• 0 0 . From (4.5) putting n p = 1, 7)i — 0,

i < p, in (5.49) the density of  U  being independent of  n we see that

 P X  „{U  < c a } = e x p { ( C a - 1)[1 + o ( l ) ] } (5.50)

as A —• oo. Hence (5.38) is satisfied with H(X) = | ( 1 — c a ). Now letting

£ i i A assign measure one to the point Vi = " — Vp-i — 0 5 Vp — 1 A N A £O,A

assign measure one to (0,0) we get (5.39). Finally (5.40) is trivial. From

Theorem 5.2.1 we get Theorem 5.2.2.

T h e o r e m 5.2.2. For every p,N,a, Hotelling's T 7  test is asymptotically

 minimax for testing HQ  .6  — 0 against H\ : 6 — A as A —> co.

From Sec. (5.1) we conclude that no critical region of the form £ ai R{ > cother than Hotelling's would have been locally minimax, many regions of this

form are asymptotically minimax.

T h e o r e m 5.2 .3 . c < 1 and  1 — a x < a 2 < •• • < a p , then the critical region

{ £ tti Ri > c} is asymptotically minimax among tests of  same size as A —> 0 0 .

Pr oo f. Th e maximum of £ V f j 2~2i>j Vi subject to £ ; Ojr< < c is achieved

at r\ — c, T2 — • • • — r p — 0. Hence the integration />,,()") over a smal l

region near that point yields (5.50) wi th c„ replaced by c. Since a },'s are

nondecreasing in j it follows from (5.49) that £I ,A can be chosen to yield (5.39)

with U  — Yl{aiRi. Again (5.40) is trivial. •

E x a m p l e 5. 2. 2. Consider again Problem 2 of Chapt er 4. From E x a m

ple 5.2.1 with p = pi, R = (Ru ..., R Pl )', r, = (n,,... ,n Pl  )' we get

= e x P S ~ -^E^E* ( l + B(r,A,>))) (5.51)

with sup,.^ \B(r,\,n)\ — o(l ) as A -> 00. Using the same argument of the

above theorem we now prove the following.

T h e o r e m 5.2.3. For every p i , N, a the UMPI  test is asymptotically min

imax for testing HQ  : 61 — 0 against Hi :6 = \as\—* 0 0 .

E x a m p l e 5.2 .3 . In Problem 3 of Ch apter 4 we want to find the asy mp

totically minimax test of  H Q : 6~i — 62 — 0 against the alternatives H  x : 61 — 0,

Page 119: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 119/176

110 Croup Invariance in Statistical  Inference

62 = A as A —» co. In the notations of Ex amp le s 5.1.4 and 5.2.1 we get,

 r

i - E 1 1 r J> f  i +*a = E V  + P 2 T  v T  = (n, -,Tn+toY* v = ( * h , • • • . * , + » ) ' .

 ni = 0, i = 1,. . . , pi + P 2 under H 0 and rj, = 0, i = 1 ,. . , ,p i under H\. Hence

/o,n(r)

PI+ PJ

- l + f i + £ r i £ » K

where sup,.,, \B(r,A,n)| = o (l ) as A - t 00 . From (4.9) when 5a

61 = 0 we get

= A

(5.52)

00,

/ ( r , , f 2 | A )

/Cl.falO)= exp j^ [ - l + f i + f  2 ] ( l + B ( f i , f 2 . A ) ) j (5.53)

where sup, .^ \B(f  t ,f  2 ,\)\ = o(l) as A -* 0. Letting fa in (5.39) assign

measure one to the single point n — 0 we get from (5.52)

/ A , , ( rKi .x(di j)

- l + f , + £ r , 2 *>=Pi + l i > j

. - ? .A ) ) |

(5.54)

Let Rk be the rejection region of size a, based on R\,R 2 given by

 R k = {x : U(x) = R t+kR 2> c a } (5.55)

where k is chosen such that (5.54) is reduced to yield (5.39) and R k for the

chosen k satisfies (5.38) and (5.40). Now letting 1/1 = • • • = T ) P l + p l _ i = 0,

- 7 P l + P ) = 1 we see that (5.54) is reduced to yie ld (5.39) with U{x) = Ri + R 2 .

From (5.53)

 P(U(x) <c'-)= exp j fa - 1)(1 + o ( l ) ) j . (5.56)

Hence Hotelling's test which rejects fi0 for large values of  R\ + R 2 satisfies

(5.38) with H(X) = f  (1 - c' a ). The fact that Hotelling's test satisfies (3.39) is

trivial . Since the coefficient of r p i + i in the expression inside the brackets in the

exponent of (5.54) is one, any rejection region R k must have k = 1 to satisfy

(5.39) for some £ \ t \ . From Theorem 5.2.1 a region R k with k / 1 and which

Page 120: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 120/176

Some Minimax Teat in Multinormales 111

satisfies (5.38) and (5.40) cannot be minima x as A —• co. Us ing Th eo re m 4.1.5

we now prove the following:

T h e o r e m 5.2.4. Hotelling's test which is asymptotically best invariant is

 asymptotically minimax for testing HQ  I fli = &2 = 0 against H± : 6 T —

0,

oj = A as A —* oo.

R e m a r k s . From Theorem 5.2.3 it is obvious that there are other asymp

totically minimax tests, not of the form Rt, for this problem. It is easy to see

that

 P Xin{R1 + (l-c)-

 R 2

> 1) = l - e xP

{ - ( l o g A

2

- l o g ( ( l - c ) ( l + c ( l ) ) ) } ,

P a

' " ( A i + ^ r 1 * 2 - C o ) =1 ~ e x p { ~ ~C o ) ( 1 + 0 ( 1 ) )

Thus the fact that the likelihood ratio test which rejects HQ whenever1~/l*fi1

ft *' 5 c and the locally minimax test satisfy (5.40) and (5.38) is obvious

in both cases. But from Theorem 5.2.4 these two tests are not assymptotically

minimax for this problem.

5 .3 . M i n i m a x T e s t s

In this section we discuss the genuine minimax charac ter of Hotelling's T 2

and the test based on the square of the sample multiple correlation coefficient

 R 2

 . These problems have remained elusive even in the simplest case of p — 2 or

3 dimensions. In the case of Hotelling's T  2 test Semika (1940) proved the U M P

character of  T  2

test among all level a tests with power functions depending

only on & — Np'H~1 p. In 1956 Ste in proved the admissib ility chara cter of  T  2

test by a method which could not be used to prove the admissibi lity character

of R 2 - tes t . Gi r i , Kiefer and Stein (1964a) attacked the problem of minimax

property of Hotelling's T 2 -test among all level a tests and proved its mini

max propery for the very special case of  p — 2, N  — 3 by solving a Fredhol m

integral equation of first kind which is transformed into an "overdetermined"

linear differential equation of first order. Linnik, Pliss and Salaeveski (1969)

extended this result to N  — 4 and p — 2 using a slightly more compl ica ted

argument to construct an overdetermined boundary problem with linear dif

ferential operator. Later Salavesk i (1969) extended this result to the general

case of p = 2.

Page 121: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 121/176

112 Group /nvuriaiice in Statistical  Inference

5.3.1. Hotelling's T  2 test

In the sett ing of Exa mp le s 5.1.1 and 5.2.1 we give first the details of 

the proof of the minimax property of Hotelling's T  2 test as developed by

Giri , Kiefer and Stein (1964a), then give the method of proof by Lunnik,

Pliss and Salaeveski (1966). In this setting a maximal invariant under G j f p )

is R = (R1 ,... } R P )' with = NX'(S + NXXy^X  and the corre

sponding max imal invariant in the parametr ic space is A = (6i,..., S p )' with

 £   p St = N p ' £ - 1 u = S. From (5.18) R has a single distribution under H 0 and

its distribution under Hi : 6  = X  (fixed) depends continuously on a p - 1

dimensional parameter

T = | A = (6 1 6  P )': « 5 , > 0 , =

Thus there is no U M P I test under Gr(p) as it was under Gt{p). Let us write

 / A ( r ) as the pdf of as given in (5.18) . Because of the compactness of the

reduced parametric spaces {0} and T  and the continuity of  f&(  r ) i n A we

conclude from Wald (1950) that every minimax test for the reduced problem

in terms of  R is Bayes. In particular Hotelling's T  2 test which rejects He,

whenever U  = £ J R\ > c, which is also G r - i n variant and has a constant power

function on each contour Np'T,~1 p = X, maximizes the minimum power over

 H\ if and only if there is a probability measure £ on T  such that for some

constant Ci

' r / o t » \m 1 = ] *t (5.57)

according as

<

= > c

1 K < J 

except possibly for a set of measure zero where c depends on the specified level

of significance a and ci may depend on c and the specified X. An examination

of the integrand of (5.57) allows us to replace it by its equivalent

« d A ) = c, if Y > - e . (5.58)

r Jo\ r

 ) ,

Obviously (5.57) implies (5.58). On the other hand, if there are a £ and a

constant C[ for which (5.58) is satisfied and if r* = ( r j , . . . , r ' ) ' is such that

£ i r j = c' > c, then writing / = ^ and r" = ft* we conclude that

Page 122: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 122/176

Some Minimax Test in Multinormales 113

because of the form of / an d the fact that c'/c > 1 and £ p r** = f  E i r t = c-

This and similar argument for the case c' < c show that (5.58) implies (5.57).

Of  course we do not assert that the left-hand side of (5.58) still depends only

on 2%n if £ p n / c.

The computation is somewhat simplified by the fact that for fixed c and

A we can at this point compute the unique value of ci for which (5.58) can

possibly be satisfied.

Let R = (Ri,..., Rp-i}' and let f&{f/u) be the conditional pdf of  R given

£ p = i Ri — it wi th respect to the Lebesgue measure , which is continuous in

 f  and u with > 0 and1 r

« < it < 1. Denote by / " ( u ) the pdf of U  = E i which is continuous for 0 < u < 1 and vanishes elsewhere a nd

which depends on A only through 8. Then (5.58) can be written as

 j /A t fMtfdA) =1 /;*<<=)

 fo(f\c) (5.59)

if  Ti > 0, E i_ 1 T 

i< c

- The fian

^ integral in (5.59), being a probabilitymixture of probability densities is itself a probability density in f  as is fo(f\c).

Hence the expression inside square brakets of (5.59) is equal to one. From (4.5)

r ( f ) r ( V ) i 2 '

. C ^ ) ( a _ c ) ^ - P - , 0 ^ , E ; | ) . { 5 . 6 0 )

Using (5.60) we rewrite (5.58) as

J r I 1 i>3 > >=1

J V - i + 1 1 r $ - \ * , . A * , / J V /j cA

for all r with rj > 0, E i ri = c - Write T i for the unit (p - l)- sim plex

2 '2' 2 i M ' ' 1 2 ' ! ' 2

(5.61)

r ! = { ( A , . . . , / 3 p ) : f t > o , X > = i }

Page 123: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 123/176

114 Group Invariance in Statistical Inference

where /3; — 6i / X. Let U — 7 — cA and £* be the probabili ty measure on

T i associated with £ on T. Since — t]{6A), (5.61) reduces to

J r

> I )=1 i>j i fe=lv

'

= * ( f , f ; i , ) ( - >

for all ( i i , . . . , f p ) with ' i = 7 a i m ii > 0 and hence, by analyticity for all

(h t P ) with gjfft = 7-

From (5.62) it is evident that such £*, if it exists, depends on c and A only

through their product 7 . Note that for p — 1, T i is a single point but the

dependence on 7 in other cases is genuine.

C a s e p = 2, N = 3 . Sol ut io n of  G i r i , K i e f e r a n d S t e i n

Since 0 ( | , — (1 + x)exp(^x), from (5.62) we obtain

 jf  [i + ( 7 - - ft)]<TO) = ^ " ^ ( f , i ; |-

(5.63)

with fi = 7 - f 2 , A = 1 - / 3 2 . We could now try to solve (5.63) for by using

1 . 1 .21 2'the theory of Meijer transform with kernal ^(1 ,4; hx). Instead we expand both

sides of (5.63) as power series in f 2. Let

Jo

be the ith moment of /3 2. From (5.63) we obtain

(a) l+t-*m=B,

(b) -(2r - 1) p r _ ! + (2r + 7 W - 7^+1 = BT ( - - ) r ( i ) J v

where B = e~^ y d>(^, 1; £ 7 ) . We could now try to show that the sequence {p,}

given by (5.64) satisfies the classical necessary and sufficient condition for it

to be the moment sequence of a probability measure on [0,1] or, equivalently,that the Laplace transform

G O

 3= 0

Page 124: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 124/176

Some Minimal Test in Muttinormales 115

is completely monotone on [0, oo), but we have been unable to proceed suc

cessfully in this way Instead, we shall obtain a function m1(x) which we

prove below to be the Lebesgue density dt]*{x)jdx of an absolutely continuous

probability measure f* satisfying (5.64) and hence (5.63).

The generating function

C O

of the sequence {fa} satisfies a differential equation which is obtained by mul

tiplying (5.64) (b) by f _ 1 and summing from 1 to oo:

2 t 2 ( l - t ) V J ' ( t ) - ( f  2 - 7 t + 7 M t )

= B t [ ( l - t r 1 / 2 - l ] + 7 [ t ( l - / 0 - l ]

= S ( ( l - t ) " 1 / 2 - ( - 7 . (5.65)

This is solved by treatment of the corresponding homogeneous equation

and by variation of parameter to yield

 mf j M zl   2 i

B

] d T

n ' ( l - i ) l / a

J 0 [3T#-Ift* 2 T 2 ( l - T ) 1 / 2 2 T ( 1 - T ) J '

(5.66)

the integration being understood to start from the origin along the negative

real axis of the complex plane. The constant of integration has been chosen

to make VJ continuous at 0 with VJ (0) = I , and (5.66) defines a single-valued

function on the complex plane minus a cut along the real axis from 1 to oo.The analyticity of  ip on this region can easily be demonstrated by considering

the integral of  ip on a closed curve about 0 avoiding 0 and the cut, making

the inversion w = | , shrinking the path down to the cut 0 < w < 1 and using

(5.67) below. Now, if there existed an absolutely continuous £* whose sui tably

regular derivative m 7 satisfied

 j m7

(a; ) / ( l - tx)dx = i>(t), (5.67)Jo

we could obtain m y by using the inversion formula

 m^(x) = ( 2 i r i x ) _ 1 l i m t y ( x - 1 + it) - T / > (X - 1 - ie)]. (5.68)

Page 125: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 125/176

116 Group Invariance in Statistical  Inference

However, there is nothing in the theory of the Stieltjes transform which tells us

that an m y(x) satisfying (5.68) does satisfy (5.67) and, hence (5.63), so we use

(5.68) as a formal device to obtain an m 7 which we shall then prove satisfied

(5.63).

From (5.66) and (5.68) we obtain, for 0 < x < 1,

 m-,(x) —2rr x^

 2

(l - x)1

' 2

\ J 0 | l + u (1 + u) 3

' 2

 B v>l  2

+ B

 Ja 1 - « J

(5.69)

In order to prove that a%*{x) — m 7 ( x ) dx satisfies (5.63) with £" a probability

measure we must prove that

(a) m T ( i ) > 0 for almost all x, 0 <x < 1 ,

(b) f 01

 m1{x)dx = l  (5.70)

(c) fii = xm-f(x) dx satisfies (5.64) (a)

(d) p r — j 0 xT 

 m1{x) dx satisfies (5.64) (b) for all r > 1.

The first condition follows from (5.69) and the fact that B > 1 and u 3 / 2 ( l +

u) < (1 + w) 3-' 2 for u > 0. The condition (d) follows from the fact that m 7 ( x )

as defined in (5.69) satisfies the differential equation

w! (x) + m 7 ( i ) | + ( l - 2 x ) / 2 x ( l - x ) = B/2irx1/2(l-x) 3< 2 , (5.71)

so that an integration by parts yields, for T >  1,

( r + l K - r ^ r _ , = / [(r + l ) x r - rx r

-l 

 \m y(x) dx

 Jo

= / dx Jo

= uT (l  - 7 / 2 ) + 7 iiT+l  /2 - « r _ i / 2

+ B r ^ + ^ / 2 T r ^ r !

which is (5.64) (b).

To prove (5.70) (b) and (e) we need the following identities involving con

fluent hypergeometric function <p(a,b;x). The materials presented here can

Page 126: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 126/176

Some Minimax Test in Multinormates 117

be found, for example, in Erde ly i (1953), Chapter 6 or Stater (1960). T h e

confluent hypergeometric function has the following integral representation

0 K 6; x) = T(a + j)T(b)/r{a)T(b + j) j \ JS>

5=0

when 6 > a > 0. T h e associated solution ip to the hypergeometric equation

has the representation

if  a > 0. We shall use the fact the general definition of  ip, as used in what

follows when a — 0, satisfies

iP(0,b;x) = 1. (5.74)

The functions T/J and <f> satisfy the following differential properties, identities

and integrals.

~<t>(a,b;x)= (j-l)b+l;x) + 0(o ,6 ; i ) , (5 .75 )

^ / > ( a , 6; x ) - $ ( a , 6; x) - ^ ( a , 6 + 1; x), (5.76)

^ ( a , b ; x ) = i - ' o K a - H l W a + l . i ; ! ) - ^ , ! ; ! ) ] , (5.77)

(a - b + l)0(o, 6; x) - a0(a + 1,6; x) + (b - l ) 0 ( a , b~l;x) = 0, (5.78)

60(a ,6 ; i ) - bd>(a - 1 , 6 ; a ; ) - x 0 ( o , 6 + l ; i ) = 0 , (5.79)

ih(a, 6; x) - ai/>(a + 1, b; x) - ip{a, b - 1; x) = 0, (5.80)

( 6 - a K > ( a , 6 ; x ) -xih(a,b + l;ar) + ^ - ( o - l , fc;«) = 0 , (5.81)

rJo

e~ x(x + y)-l <f>[  - , l ; i <tt = * r , l ; y for y > 0 ,

(5.82)

Page 127: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 127/176

118 Group Invariance in Statistical  Inference

 J g e-*(s + y)-vQ,l;z)<te

(5.83)

Using (5.73) , in terms of  hypergeometric functions, for 0 < x < 1, the m 7 ,

defined in (5.69), can be written as

= h [ , '^ ./.{^ / '<l- 1 ; ^)*' 1 - l i T / 2 »- r (l)*(i l i ^)+ </>(^,1 ,7/2j jf  v - l  e - < " t  2  d v - v ^ e ^ ^ d v J

° r r ? r „ . ( e " ^ f i i : T / 2 ) ^ i , i ; 7 ( i - » ) / 2 )2TT[X(1 - ( \ 2 / 

E/2r(|)v-(|,l;7/2)}. (5.84)

We now prove (5.70) (b) to establish that m-,(x) is an honest probability

density function. From (5.73), (5.72) and (5.82) we get

 / VTTt—^*(i.i;7(i-»)/?)#Jo 2ir x f l - i ) J

= f  VT7T~ v r ^ ( l , 1 ; 7 * / 2 ) tteJo 2ir[x(l - l ) ]a

7o 2 , r [ * ( l - * ) ] » 7o 1 J

 dt

(5.85)

Page 128: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 128/176

Some Minimal Test in Mull  in orm ales 119

From (5.85), (5.84) and (5.72) we get

„i r i4 e z f 

r(|) h dx

= 2 ^ , l ; ^ ( i , l ; , ) - 0 ( I , l ; ^ ( ^ , l ; S

where 2z = 7 .

We now show that

tf'(z)-#(*)=0, (5.86)

from which it follows that

 H{z)=cel 

for some constant c.

B y direct evaluation in terms of  elementary integrals when 7 — 0 we get

(using (5.85))

 /  m a(x) dx = 1; Jo

hence, (5.70) (b) follows from (5.S6). To prove (5.86), we use (5.75) and (5.76),

which yield

 H\z)-H(z) = <b{^,2 ]  z^i>^\-,z^ + # / | , l ; » \ ^ Q 1 i 5 ^

-2Kii;^G'2;s)+^G'2;^G ,M)1 1

To this expression we now add the following four left-hand side expressions,each of which equals zero

1 1 . ^3i> U . 1 ; * ~ n H  ^ 2 ; 2 - -<P[  - , 2 ; z + 0 - , l ; z

 2 T  \ 2

2* U

( N1 . / 3

2 " V 2

^ - , 2 ; * - - t f - , 2 ; * - t f - , l ; z1

- 0 1 2 - 2 ^1 . (Z

2 T \ 2

to obtain H' — H  — 0, as desired.

Page 129: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 129/176

120 Group Invariance in Statistical Inference

We now verify (5.70) (c). Prom (5.72) and (5.79), with a = §, b = 1 we

obtain

1 (1+7?/) e™' 2

 dy = M 7 -,l;1 /2 + 7 0 - , 2 ; 7 / 2 / 4

(5.88)

Using (5.70) (6), which we have just proved, and (5.85) and (5.88) we rewrite

(5.70) (c) as

1-1 AI

1 - |, mt/aJJ jf u + 7(1 - $}] M * ) *

( l + 7» )

2 T [ » ( 1 - V ) ] I / 21 ; 7  / 2 W M , ?

- e ^ r U 1 1 ; 7 / 2 )\dy

1

7 g ^ ( l , l ; 7g / 2 ) , , f l  0(1,1; Tg/g)

2 ^ ( 1 - f  + 7 0 2 ^ ^ ( 1 - 2 / ) ] ! ^ »

r ( | ) f ( f . 1,7/2) y 1 ( l + 73 / ) e ^ 2

0 ( | , l ; 7 / 2 ) I 2ff [v(l - J/)]V2

= f 1 7 ^ ( 1 , 1 ; 7J//2)

Jo 2T[J /(1 — J f) ] 1 ' 2

 dy

1_ A l „ / 3

+ 2 r ( - ] ^ ( - , l ; 7 / 2 ) - - r [ - ] ^ [ - , l ; 7 / 22 V2(5.89)

Using (5.81), with a = 1, 6 = 0 and (5 .72) , (5.73), (5.83) we get

 f 1 7 g ^ ( l , l ; 7 » / 2 ) , _ f 1 M 0 , 0 ; 7 y / 2 ) - » ( l , O ; 7 y / 2 ) ] ,

 / „ 2 ^ ( 1 - , ) ] ^ ^ - y 0

d »

= 1 - f 

1 _ dy

 Jo * Jo[3 /(1 - » ) ] ^ a

= l - j T ( l + r V Q ^ 7 * / 2 ) e - ^

= 1 + r(|) ^(I^^A-^d-HT/s)  / 2 .

(5.90)

Page 130: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 130/176

Some Afinimai Test in  Mv.lttnorma.les  121

Thus (5.89) and (5.90) imply (5.31) and, hence (5.70) (c).

C a s e p = 2, N  = 4. Solution of  L i n n i k , Pl i s s and Salaevski

From (5.62) the problem consists of showing (with 0i — /J) on the interval

0 < 8 < 1 there exists a probability density function £{/3) for which

= 0 ( 2 , l ; 7 / 2 ) . (5.91)

With a view to solving this problem we now consider the equation

 j * e

" < ' - « /  J

^ 2 , | ; M / 2 ^ { | | i (7 - t l) (l - 8)/2)mW 

= l  4l <\M ̂ -0)/2jmdf3. (5.92)

Using the fact that

we rewrite (5.92) as

1 6 - ^ 0 ( 2 , 1 ; ^ ) [ 1 + 7 -10 - d d - / 3 ) K ( W

- /  1

e - ^ / 2 [ 1 + 7 _ 7 ^ K ( A ) ( f / J ( 5 9 3 )

Jo

From (5.93) we get

 /" e - ^ 2 [ ( r - 2r 2W~l  + (1 + 7 + 7 r + 2r 2

 )8 r

Jo

- (7 + irW +1

 }t:(0)d0 - 0, for r = 1 , 2 , . . . . (5.94)

Let

r = l

be an arbitrary function represented by a series with radius of convergence

greater than one. Multip lying the rth equation in (5.94) by o r _ i and taking

the sum over r from 1 to 00 we get

Page 131: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 131/176

122 Group Invariance in Statistical  Inference

f1

s:(2/33 - 2/3 2 )/" + (-5/3 + 60 2 - f& 2 + 7 / 3 3 ) / '

! - l + o 0 - ^ + f 0  2  ) ,

= J  LUMP) d0 Jo

= 0 (5.95)

where L{f) denotes the differential form inside the square brakets. Applying

the green formula to L(f) and choosing a small e > 0 we get

where the adjoint form L*(£) is given by

L * ( 0 = 20 2(0 - 1)C + 0{-Z + 60 + 7 /3 - 7 /3 2 K'

and the bilinear form L[f, £] is given by

 m% =2

^2

( ! - ^ ) ( ^ - / ' o - \0+7^2(i - mi n •

Using the thoery of Frobenius (see Ince, 1926} we conclude that

C C C O

(5.96)

(5.97)

(5.98)

(5.99)

£=0

with tto / 0, Co / 0; are two fundamental solution of L*(Q — 0 on the interval

(0,1)- Similarly

£ n = £ 96(1 - /J)f c , j i a - (1 - /3)"5 £ fefcd - 0)1

(5.100)

 k=0  k=0

with 3o ^ 0, fto # 0 are also two fundamental solutions of the same equation

on the interval (0,1). Consequently any arbitrary solution of the equation

L*(£) — 0 is integrable on [0,1] and we assert that there also exists a solution

of the equation L*(£) — 0 for which

h m L [ M ] | ; - < = 0. (5.101)

Page 132: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 132/176

Some Minimax Test in Mullinormales 123

It can also be shown that any linear combination of the solutions in (5 .99)

(similarly for solutions in (5.100)) satisfies (5.101) provided that £(• ) = £ i 2 ( - ) .

Thus £ - £12 is a solution of (5.95) and hence of (5.91).

In the following paragra ph we show that £ i 2 does not vanish in the interval

(0,1). Write

0 = 1 - * , * ( a r ) = A 5 * y f  & i ( l - 4 - (5.102)

The equation L"(t\) — 0 becomes

 2x{\ - x)8" + ( - 1 + 4x--,x + -yx 2 )S' + ^(1 + jx)$ = 0 . (5.103)z

Substituting u(x) = 6'(x)j8(x) we obtain the Riccati equation

l - 4 s + 7 a - 7 s ' _ 3 + ^ 7 *

 2x{l-x) 4 x ( l - x ) v '

Let us recall that 8(x) — E t ^ o ^ ) 1 * ' • > 1*1 < !• Ass um e that 9{x) vanishes

at some point in the interval (0,1). Since 0(0) = 1, among the roots of the

equation ${x) = 0 there will be a least root. L et us denote it by XQ. Then

the function 0(x) is analytic on the circle \x\ < x$. From (5.103) we get

£?'(0) — | which implies that u(0) = —| . Hence u(x) cannot vanish in the

interval (0, Xrj), since at any point where it vanishes we would necessarily have

u'(x) > 0 whereas (5.104) shows that at this point u'(x) < 0. T he fact that

the function u(x) has a negative value implies that it decreases unboundedly

as we approach the point XQ from the left (u' / 0 since which in turn

contradicts (5.104). As a result £ = £ i 2 does not vanish in the interval (0,1).

L e t us now return to (5.91), which, as we have seen, is satisfied by £ — £ 1 2 .

By the method of  Laplace transforms it is easy to obtain the relation

 /  x b

-l 

{l  - x y - ^ e i *tt>(a,b;px) <p(a - b,c - b; q(l - x))Jo

r(t)r(c - b)

r(c)-4>(a,c;p + q), 0<b<c. (5.105)

Letting t\ = 7 T , and multi ply ing both sides of (5.91 ) by r 5 (1 —r ) ' and

integrating with respect to r from 0 to 1 we obtain with a = 2, 6 = | , c = 1,p = 7£72,? = 7 ( l - W 2 ,

| y 2 , l ; ^ W = p ( H ; ^ ) ? ( W . (5.106)

Page 133: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 133/176

124 Group Invariance in Statistical  Inference

Now to obtain (5.91) it is sufficient to make use of the homogeneity of the

above relation and to normalize the function £12.

5.3.2. R?-test

Consider the setting of Example 5.1.5 for the problem of testing H 0 : p 2

— 0

against the alternatives Hi : p 2 = A > 0. We have shown in Chapter 4

that among all tests based on sufficient statistic (X,S) of  HQ against the

alternatives p 2

> 0 the R 2

-test is UM P invariant under the affine group ( G , T )

of transformations (9,t) , g e G , t e T  operating as (X,S) —• (gX + t, gSg')

where g is a nousingular matrix of order p of the form

 a = (%m 0

with 3(H) of order 1 and t is a p-vector.

For p > 2 this result does not imply the minimax property of  R 2

-test

as the group ( G , T) does not satisfy the condition of the Hunt-Stein theo

rem. As discussed in Example 5.1.5 we assume the mean to be zero and

consider only the subgroup Gr ( p ) for invariance. This subgroup satisfies the

conditions of the Hunt-Stein theorem. A maximal invariant under GT-(P)

is R = (Rg,... ,Rp)' with a single distribution under Ho but with a dis

tribution which depends continuously on a (p — 2)-dimensional parameter

A — (6  2 , • • • ,S PY  with £ J = A 6 j — p 2 — A under St- The Lebesgue density

function f\,7){  r

 ) °^ ̂ under Hi is given in (5.29). Because of the compactness

of the reduced parameter spaces {0} and

r = 6  p y, 6i>o, £ 4 = A ] (5.107)

and the continuity of  fx,v{  r ) m 1" '* follows that every minimax test for the

reduced problem in terms of  R is Bayes. In particular, the Jf 2-test which has

a constant power on the contour p 2 = A and which is also GT (p) invariant,

maximizes the minimum power over H i if and only if there is a probability

measure on Y  such that for some constant cj

according as

Page 134: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 134/176

Some Minimax Test in Multinormales 125

except possibly for a set of measure zero. An examination of the integrand in

(5.108} allows us to replace it by the equivalent

/|McktfA3««i i f X > , = c- (5-109)

Obviously (5.108) implies (5.109). On the other hand if there is a £ and a

constant c\ for wh ich (5.109) is satisfied and if f = (r~2>---.*p)' s u c n t n a t

Y % f i = c 7 > c, then wr it ing / = and r' = cf/c' we conclude that

 / ( f ) = / ( c V / c ) > / ( r - ) = C l

because of the form of / and the fact that c'/c > 1 and ^ j j r * = c - Note that

in (5.29) 7 i "1 ( l — A) — 1 — - J2j>i 6  jfai a n d  t h a t ti > °- T n i s ^ similar

argument for the case c' < c show that (5.108) implies (5.109). T h e remaining

computations in this section is somewhat simplified by the fact that for fixed

 c and A we can at this point compute the unique value of C\ for which (5.109)

can possibly be satisfied.

L et R = (R2, ..., Rp-i)' and write /x,Tj(r|i/) for the version of conditional

Lebesgue density of  R given that E 2 Ri = U = u which is continuous in f  and

for Ti > 0 and E i _ 1 r« < w < 1 and zero elsewhere. Also write (u ) for the

Lebesgue density of  R 2

— E 2 Ri = U  which is continuous for 0 < u < 1 and

vanishes elsewhere and depends on A only through S - £ 2 ^ , - Then (5.109)

can be written as

/ / A,,(r|c)£(dA) - c i /c*(0

75(c)J

for fi > 0 and E S P 1 r> < c - The integral in (5.110) , being a probability mix tur e

of probability densities, is itself a probability density in f , as is /o o (r| c ). Hence

the expression in square brackets equals one. From (4.35) with 0 < c < 1

= q - A ) ^ - » r ( ^ - i » , ( p _ 3 ) _ i { N _ p _ 2 )

M ) r ( ( j v - P ) / 2 ) r ( ( p - i ) / 2 ) 1 '

 \(N-l), 1 ( J V - 1 ) ; ± ( P - 1 ) ; C A ) (5.111)

where F(a, 6; c; x ) is the ordinary 2̂ 1 hypergeometric series given by

Page 135: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 135/176

126 Group Invariance in Statistical  Inference

 F(a,b;c;x) = 2^ - -- —*

r=0r ( « ) r ( 6 ) r ( c + r )

E(g ) r (fc)r _ r

. ( c ) r

(5.112)

where we write (a) r = T ( a + r)/T(a). From (5.110) the value of  a which

satisfies (5.109) is given by

( i - A ) t t " - ' > r < i ( j y - i ) ) , , „ _ 3 , . ( N - P _ „

C l " r ( | ( / v - P ) ) r ( i ( p - i ) ) 1 J

FQ(JV- 1), i ( J V - 1); | ( p - 1);

CA). (5.113)

From (5.113) and (5.29) the condition (5.109) becomes

£ [ i + £ r i ( i - A ) / 7 i - t )

03=o 0 p=a r c i ( N - U )

* r ( i ( J v - j + l ) + ft-) r 4 r j ( l - A ) /  7 j ( l + 7r-') if t

i l r ( | ( i v - i + i))(2/3 j )! U + £ ^ ( ( 1 - * ) / • > ; - i ) J " ;

= F Q ( / V - 1 ) , l -(N-l); i ( p - l ) ; c A ) (5.114)

for all T with r, > 0 and £ J ri = c.

C a s e p — 3, N — 4 ( N — 3 if me an kn ow n) . Soluti on of G i r i and Kiefer

In this case (5.113) can be written as

. + 2n) >*3 *3

 V(l-4)(i-^A)

r a g a ( l - A ) ( 2 n + l ) ( 2 n + 3)

( l - « a ) ( l - r a A ) l+

Ta 83 

( l - f c ) ( l - r 2 A )

(5.115)

One could presumably solve for £ by using the theory Meijer transforms

with kernal F ( § , §; \\x). Instead they proceeded as for Hotel ling's T  2 problem,

Page 136: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 136/176

Page 137: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 137/176

128 Group Invariance in Statistical  Inference

measure on [0,1], or, equivalently that the Laplace transform £ ™ ltj(-t) slj\ is

completely monotone on [0, oo}. Unfortunately we have been unable to proceed

successfully in this way. We now obtain a function m t{x), which we then prove

to be the Lebesgue density d £ * ( x ) /d x of an absolutely continuous probability

measure C  satisfying (5.118) and hence (5.115). That proof  does not rely on

the somewhat heuristic development which follows, but we nevertheless sketch

that development to give an idea of where the 111,(1) of (5.123) comes from.

T h e generating function

C C

 3=0

of the sequence {fij} satisfies a differential equation, which is obtained from(5.118) by multiplying it by t

 n

~l  and summing with respect to n from 1 to 00,

2(1 - t)(t - z)<p'{t) - t~l {t 2

- 2zt + z)<f>(t)

= tf,(l-t)-i - 1 - z f " 1 . (5.119)

Solving (5.119) by treatment of the corresponding homogeneous equation and

by variation of parameter, we get

 m=W=T)\ k U l - r ) ( r -

+1

C r - * ) l ( 1 - * ) * 2 [ T ( 1 - T ) ( T - Z ) ] 1d r .

(5.120)

T h e constant of integration has been chosen to make <p(t) continuous at t = 0

with 0( 0) — 1 and (5.120) defines a single valued analytic function on thecomplex plane cut from 0 to z and from 1 to 00. I f there did exist an absolutely

continuous £* whose suitably regular derivative m z satisfied

 L

1

m f (x) dx = 0( f ) ,

lo (1— tx)

we could obtain m c by using the simple inversion formula

 m z(x) = - . lim27TIX ElO

<p(x^ +i e ) - 0 ( x - 1 - ie)

(5.121)

(5.122)

Since there is nothing in the theory of Stieltjes transforms which tells us that

an m z satisfying (5.122) does satisfy (5.121), we use (5.122) a formal device to

Page 138: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 138/176

Some Minimax Test in Multinormales 129

obtain m E which we shall prove satisfies (5.115). From (5.120) and (5.122) we

obtain, for 0 < x < 1,

( 1 - * * ) * f n f  X 

du« • ( « ) = ( 1 ; M ) , k / 

- 2

(1 + u)(z + u p [u(l + u) (z + i*)]*

(1 + 1 i ) i ( z + «)

2jr(a:(l - as))i { B , Q , ( z ) + c , } (s ay). (5.123)

We can evaluate c c by making the change of variables V = (1 + u) 1 and using

(5.126) below.

We obtain

and

Q , - 2 ( l - z } - 1 [ l - ( l - ^ ) - ^

' ( l - z ^ i + f l - z ) ^ l - ( l - z ) i+ ( l - z ) " t log

. ( 1 - « ) * - ( ! - * ) * l + ( l - z ) * . '

Now to show that £*(da;) = m I (x) dx satisfies (5.115) with £* a probability

measure we must show that

(a) m , ( i ) > 0 for allmost all x, 0 < x < 1;

(b) fi m z(x) dx = l; (5.124)

(c) /ti = 1111,(1) da; satisfies (5.118a);

(d) u n = x n m,(a;) da; satisfies (5.118b) for TI > 1.

Page 139: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 139/176

130 Group Invariance in Statistical  Inference

T h e first condition will follow from (5.123) and the positivity of  B z and c,

for 0 < z < 1. The former is obvious. To prove the positivi ty of  c z , we first

note that

this is seen by comparing the two power series, the coefficient of  z j being

[ ( f j j / j f ] 3 and (j + 1), and the ratio of the former to the latter being n";=i(» +

I ) 2 / j { j + i ) > i. Thus we have B M  > l - 2 . Substituting this lower bound into

the expression for c z and writing u = 1 — 2 the resulting lower bound for c z has

a power series in u (convergent for \u\ < 1) whose constant term is zero and

whose coefficent of v> for j > 1 is [(j + i ) " 1 - T  2

(j + ± ) / r ( j ) r ( j + +1)].

Using the logarithmic convexity of the T-function, i.e. V  2

(j+^) < !T(j)r(i + l ) ,

the coefficient of  u> for j > 1 is > (j + | )- 1

- (j + l ) "1

> 0 Hence c z > 0 for

0 < z < 1.

To prove (5.124)(d) we note that m , ( i ) defined by (5.118) satisfies the

differential equation

1 - 2x + zx 2 1

 m' x(x) + 7ro , ( i ) = B z /2nxHl  - x)%{\ - zx), (5.125) x(l  - x)(l  — zx)

so that an integration by parts yields, n > 1

- z(n + 2)M-v+I + (1 + z)(n + l ) p« - n ^ n - i

= / {-z(n + 2)x n+1 +(1 + z)(n + l)x n -nx n~1 } m z{x) dx

 Jo

= [  x n(l-(l  + z)x + zx 2

 )m' t(x)dx

 Jo

= - ) -  f i n _ l +  2 l l n - z u  n + 1 + B t — ^ | ,

which is (5.118)(b).

T h e proofs of (5.124) (b) and (c) depends on the following identities in

volving hypergeometric functions F{a, b; c; x) which has the following integral

representation when Re(c) > Re(b) > 0:

 F(a,b;c;x) = r ( & ) ^ _ 6 ) f  - t ) M 

( \ - tx)~*dt. (5.126)

We will also use the representation

2 ^ ( i l ; ^ ; x ) = 1 o g ( l ± f ) (5.127)

Page 140: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 140/176

Some Minimax Test in Multinormales 131

and the identities

 F(a,b;c;x) = F(b,a;c;x),

( c - o - 1 ) F(a,b;c\x) +aF(a + l,b;c;x)

- ( c - 1) F{a,b;c - l;x) = 0,

lim [TicT'F^b-c^x)

= (a) + l  ( f t U i ^ M A *-(„ + „ + i ) 6 + B + 1 ; „ + 2 ; x )

(JI + 1)!

(5.128)

(5.129)

(5.130)

for 7i = 0 , 1 , 2 , . . . ,

r ( l + A + n ) r ( l + y + /Q

= F \ l + A, ~ - v; 1 + X + - A, 1 + V;1 + J/ + ii; 1 - * j

+ F\ \ + X, \ - u; \ + X -t- fi\x \ f [ - i - A, ~ +1/; 1 + « + ./*;! — as

- j Q + A , | - i / ; l + A + Was) F ( | - A , | + + y + - ;£.j;

F(ffl ,6;c;x) = (1 - x ) C - " - 6 F ( c - a ,c - b;c; x). (5.131)

We now prove (5.124) (b) and (c). From (5.123), using (5.126) and (5.127) we

obtain

- ( I - 4 8 F ^ , i ; l ; ^ + ( l r / 2 ) | ; 1; 1 - ^

3 1

2 2

— F - , - ; 2; 1 - z] x F\ -1 1

2 2

(1 -— log

3 X 1 6 1 i 1

2ir(l - j ) t i i ( l - i ] l \ 1 — - z i ) - ?

l  + (i-z)5(i-zx) I

(5.132)

Page 141: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 141/176

132 Group Invariance in Statistical  Inference

The first expression in square brackets in ( 5 . 1 3 2 ) varishes, as is easily seen

from the power series of  F in ( 5 . 1 1 1 ) . Using ( 5 . 1 2 7 ) , the integral in ( 5 . 1 3 2 }

can be written as

_ J y . (1 - zT f 1

dx

^ ( l " 3 ) ^ 2n + l Jo (l-x)i(l-zx)"

71=0 V

1 °° , m ( l \  r a oo°° z

m

( ^ ) m ^ (n + m ) ! < l - J t ) "2 * H ( m! ) 2 Z - B ! ( n + | )

m=0 v

' n=0v

3 '

m=0

r l - l

( i - f ) m +

= ( 1 _ z ) - 1 + l ( 1 _ z ) - i r ' _ — i i ^ _1

' 2V ;

; 0 ( i - 2 - o *

= ( 1 - z } "1

+ (TT /4) (1 - i r * * ( j , | ; 2 ; 1 - z ) . ( 5 . 1 3 3 )

From ( 5 . 1 3 2 ) and ( 5 . 1 3 3 ) we get

" 2 ' 2 " " /   T V2'2| m , ( « ) ( f c = ( 7 r / 2 J F f - i , i ! l ; 4 r i F , r i i ; l

- P

1.1,*.-.)]

Hence, from ( 5 . 1 2 9 ) and ( 5 . 1 3 3 ) we obtain from ( 5 . 1 3 4 )

- F ( - ^ - ^ I ^ ) ] } . ( 5 . 1 3 5 )

Page 142: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 142/176

Some Minimax Test in Multinormales 133

Using (5.129) with a = b = c = 1 + e and (5.130) with n = 0 and

c = e —* 0, (5.132) reduces to

1 1 1 1

2 ' 2 ' mF[  - - , - ; l ; z ] F ( - , - ; 2 ; l - z ) + ( E  / 2 ) F ( - , 2; 1-z )F[  - ; 2 ; *

3 1

2 ' 2 '

1 1

2 ' 2 '

(5.136)

Now, by (5.131), the expression (5.133) equals one if we have

- / • ( - i , 4 ; l ; Z ) + ( 2  / 2 ) F ( i , i ; 2 ; . = 0.

(5.134)

The expression inside the square brackets is easily seen to be zero by computing

the coefficent of  z  n . Thus we prove (5.124)(b).

We now verify (5.124) (c). The integrand in (5.132) is unaltered by multi

plication by x, and in place of (5.133) we obtain (1 — Z  )~1 /2 — Z ~1

 /3+[ K  /4Z (1 —

 z)i]  F ( - | , | ; 2 ;1 - z). The analogue of (5.134) is

 pi — I x m 2(x) dx Jo

; 2 ; * )

+ ( 7 T / 4 ) ( 1 - Z 2 ) / Z k i n -- [ ( l - z ) * / 3 . ] F |

'3 3

( 2 ' 2 : 1

^i n ^ l ^ i - z

1 3

2 ; 2 ;

(5.135)

To verify (5.124) (c) we then have to prove the following identity (by (5.118)

 ril  ,1:1:1-*

3 3

2 ' 2 '

+ < x / 4 ) [ ( l - * )2

 / * ]

1 3x F F - - , - ; 2 ; l - z .

2 ' 2 '(5.136)

Page 143: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 143/176

134 Group Invariance in Statistical  Inference

From (5.131) with c = 1, a = b = §, then (5.129) wi th a = ±, 6 = - J , c = 1,

and then (5.130) wi th A = /i = 0, f = 1 we can rewrite (5.136) as

(4/3ir){l + 2»)

+ ( 3 z / 2 ) F ( i , - | ; 2 i * ) | , | ; 2 ; 1 - * ) + (4/3TT)(1 - z)

i i2' 2'

1 3

2 ' 2

(5.137)

Using (5.129) with a = - f , b = §, c = 1 + «r, and then (5.130) with n = 0,

c — « -* 0, the expression inside the square brackets in the last term of (5.137)

can be simplified to \ z F ( — I , ^ ; 2 ; r ) . Thus we are faced with the problem of 

establishing the following identity

1 3

4/ TT = Fl  — - , - ; 2; z j F\ - , - ; 1; 1 — z2 2

1 1

2 ' 2 '

(5.138)

which finally by (5.126) reduces to

-^FG'-^;2;1-2)+((1-£)/2-1)F(^;2;1-3) •

(5.139)

The expression inside the square brackets in (5.139) has a power series in 1 - z,

the value of which is seen to be zero by computing the coeificents of various

power of 1 - z. Hence we prove (5.124) (c).

Page 144: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 144/176

Some Minimal Test in Mul (in ormales 135

5.3.3. E-Minimax test (Linnik, 1966)

Consider the setting of Sec. 5.3.1. for Hotelling's T  2 lest for testing H 0 :

6 = 0 against alternatives Hj : 8 2 > 0 at level Q(T(0,1). The Hotelling T 

 2 test

0jv is £-minimax if 

sup inf  E8(<b)- inf  Ee(<p N  )<£  (5.139)

for all JV > /V 0 (e) where 8 = (ft, E ) and 0 runs through all tests of level < a.

Fo r N  - * oo, A = ^ , 0 < tp < 2(log N ) * , (5.26) has the approximate

solution

which is a pdf on the simplex I V

If  we substitute (5.140) in the left side of (5.62) we obtain a discrepancy

with its right side which is of the order of  0 £ ( J V - 1 + £ ) , for e > 0. Thus, if 

 a = ON satisfies the condition

0(exp{-logN)V*) < a < 1 - Ofex pf- l ogfV ) 1 ^) (5.141)

and we have

e x p ( - ( l o g 7 V } ^ £ ) < 8 < (logfiV)) 1 /* (5.142)

then

sup inf  Ee(<t>) - inf  Eg(4> N  ) = O ^ l / i V 1 " 5 )

for e > 0 and hence the Hotelling T  2 test is e-minimax for testing H  a against

i f , : 8 > 0.

E x e r c i s e s

1. Prove in details (5.65).

2. Prove (5.72) and (5.73).

3. Prove (5.119) and (5.120).

4. Prove (5.127) and (5.130).

5. Prove that in Problem 8 of  Chapter 4 no invariant test under ( G T - , ! ^ ) is

minimax for testing Ho against Hi for every choice of A.

Page 145: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 145/176

136 Group Invariance in Statistical Inference

R e f e r e n c e s

1. M. Behara, and N. Giri, Locally and  asymptotically minimax test of some multi

variate decision problems, Archiv der Mathmatik, 4, 436-441, (1971).

2. A. Erdelyi, Higher Transcedenial Functions, McGraw-Hill, N.Y ., 1953.

3. J. K. Ghosh, fnvariance in Testing and Estimation, Publication no SM 67/2,Indian Statistical Institute, Calculta India, (1967).

4. N. Giri, Multivariate Statistical Inference, Academic Press, N .Y . , (1977).

5. N. Giri, Locally and  asymptotically minimax tests of a multivariate problem.,

Ann. Math. Statist. 39, 171-178 (1968).

6. N. Giri, and J . Kiefer, Local  and  asymptotic minimax properties of  multivariate

 test, Ann. Math. Statist. 39, 21-35 (1964).

7. N. Giri, and J. Kiefer, Minimal character of R 2-test in the simplest case, Ann,

Math. Statist, 34, 1475-1490 (1964).

8. N. Giri, J . Kiefer, and C. Stein, Minimax properties of T  2

-test in the simplest case, Ann. Math. Statist. 34, 1524-1535 (1963).

9. E . L. Ince, Ordinary Differential Equations, Longmans, Green and Co., London

(1926).

10. W, James, and C . Stein, Estimation with quadratic loss., Proc. Fourth Berkeley

Symp. Math. Statist. Prob. 1, 361-379 (1960).

11. E. L . Lehmann, Testing Statistical Hypotheses, Wiley, N .Y. (1959).

12. Ju. V. Linnik, V. A. Pliss and Salaevski, On Hotelling's Test, Doklady, 168,

719-722 (1966).

13. J . Kiefer, Invariance, minimax sequential estimation, and continuous time pro

 cesses, Ann. Math. Statist. 28, 573-601, (1957).

14. Ju . V. Linnik, Appoximately minimax detection of a vector signal on a Gaussian

 background, Soviet Math. Doklady, 7, 966-968, (1966).

15. M. P. Peisakoff, Transformation Parameters, Ph. D. thesis, Princeton University

(1950).

16. E . J . G . Pittman, Tests of hypotheses concerning location and scale parameters,

Biometrika 31, 20G-215 (1939).

17. O. V. Salaevski, Minimal character of Hotelling's T  2 test, Soviet Math. Doklady

3, 733-735 (1968).

18. J . B. Semika, An optimum poroperty of two statistical  test, Biometrika 33, 70-80

(1941).

19. L . J . Slater, Confluent Hypergeometric Functions, Cambridge University Press(1960).

20. C. Stein, The admissibility of Hotelling's T  2 test, Ann. Math. Statist, 27, 616-623

(1956).

21. A. Wald, Contributions to the theory of statistical estimation and testing hypothe

 ses, Ann. Math. Statist 10, 299-320, (1939).

22. A. Wald, Statistical Decision Functions, Wiley, N.Y. (1950).

23. O. Wesler, fnvariance theory and a modified principle, Ann. Math. Statist. 30,

1-20 (1959).

Page 146: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 146/176

Page 147: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 147/176

138 Group Invariance in Statistical  Inference

where q is a function on [0,oo) satisfying

 J  o(y'y)dy = 1

for y e E p .

We shall denote a family of elliptically symmetric distributions by E p(n, £).

It may be verified that

 E{X) = u, cov(X) = o S

where a — p - 1 E{{X  - n)'%-l (X  - f*)). In other words al l distr ibutions in

the class E p{u t S) have the same mean and the same correlation matrix. The

family E p(ti, S ) contains a class of probability densities whose contours of equal

density have the same elliptical shape as the multivariate normal but it contains

also long-tailed and short-tailed distributions relative to multivariate normal.

This family of distributions satisfies most properties of the multivariate normal.

We refer to Giri (1996) for these results.

De f in i t ion 6.1.2. (Multivariate Elliptically Symme tric Distr ibuti on). A

 n x p random matrix

 x = (xij ) = (xl  ,---,x n y

where Xi — (Xn,..., X ip )' is said to have a multivariate elliptically symmetric

distribution with the same location parameter p = ( p i , . . . , p p } ' and the same

scale matr ix S (positive definite) if its probability density function is given by

 f  x(x) = \L\-^ 2

q r g f o - tf^fa-tM , (6.2)

with Xi e E p ,i = 1,... ,n and q is a function on [0, oo) of the sum of quadratic

forms {ii — p ) ' S - 1 ( i i —p) , i — 1,.... ,H.

We have modified the definition 6.1.2. for statistical applications. Ell ip

tically symmetric distributions are becoming increasingly popular because of 

their frequent use in filtering and stochastic control, random signal input, stock

market data analysis and robustness studies of multivariate normal statistical

procedures. The family E p(u, S ) includes, among others, the muiltivar iate nor

mal, the multivariate Cauchy, the Pearsonian type I I and type I V , the Laplance

distributions. T h e family of spherically symmetr ic distributions which includes

the multivariate-t distribut ion, the contaminated normal and the compound

normal distributions is a subfamily of  i £ p ( p , £ ) .

Page 148: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 148/176

 Locally Minimax Tests in Symmetrical Distributions 139

Write f  x(x) in (6.2) as

(6.3)

where 8 = {p, E ) e Q,x e x and * is a measu rable function from \ o n t o 3̂ -

Let us assume that y is a nonempty open subset of  R m

and is independent of 

0 and q is fixed integrable function from y to [0, oo) independently of  8.

Let G be a group of transformations which leaves the problem of testing

 H 0 :8 £  i l / /  0 against Bi : 8 £  flff, invariant.

Assume that \ is a Cartan G-space and the function i/j satisfies

for ai l i € x> ff £ G , # G SI an d g € G where (5 is the induced group of transformations on * corresponding to g £ G on x a r | d G acts transitively on

the range space y of *£.

Using (2.21a) the ratio i i of the prob ability density function of the maxima l

invariant T(X)  under G on x; with 8\ 6 QH , , #o £  QH 0 >  >s given by

where 6 (g) is the Jacobian of the inverse transformation x —> gx.

T h e o r e m 6.1.1. / /  G acts transitively on the range space y of  then R

is independent of q for all 8\ 6 tiff,, do £ fln0 -

P r o o f . Since G acts transitively on y , for any jo £ J there exists an

element k(x,8j : y 0 ) of G such that

Using the invariance property of  p and replacing g by gh(x,8i : j / 0 ) we get

*(gx\8) = gV(x\8) (6.4)

(6.6)

Page 149: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 149/176

140 Group Invariance in Statistical  Inference

where A r (-) is the right modular function of  ft. Hence

Q{80 )8{h{x,80 : y0 )-i)A r(h(x,6>0 : y0 ))

is independent of  g. •

6.2. L o c a l l y M i n i m a x T e s t s i n E p(fi, E )

In recent years considerable attentions are being focussed on the study of 

robustness of commonly used test procedures concerning the parameters of 

multivariate normal populations in the family of symmetric distributions. For

an up-to-date reference we refer to K a r i y a and Sinha (1988). T h e criterion

mostly used in such studies is the locally best invariant ( L B I ) property of the

test procedure. Giri (1988) gave the following formulation of the locally min

imax test in E p(ii, £) . Le t F be the space of values of the maximal invariant

 Z — T(X) and let the distribution of  T(X) depend on the parameter (<5,77),

with 6  > 0 and 77 g W. For each (u, 77) in the parametric space of the distribu

tion of  Z suppose that f(z : $,tf) is the probability density function of  Z with

respect to some r/-finite measure. Suppose the problem of testing Hg : 8 e f l / /  0

against Hi .8 e fi//, reduces to that of testing H 0 : 8 - 0 against Hi := A > 0,

in the presence of the nuisance parameter IJ , in terms of  Z. We are concernedwith the local theory in the sense that f(z : A, n) is close to f{z : 0,17) when A

is small for all q in (6.2). Thro ugho ut this chapter notations like o ( l ) , o(h(X))

are to be interpreted a A —* 0 for an q in (6.2).

For each a, 0 < a < 1 we now consider a critical region of the form

 R' - [U{x) = U(T(x)) > C  a ]  (6.7)

where U  is bounded, positive and has a continous distribution for (8, r/), equi-

continuous in (S,ri) with 6 < S0 for any q (6.2) and R' satisfies

P 0 , „ ( f l * ) = cv, P M ( f T ) = a + h{\) + r fA, / , ) (6.8)

for any q in (6.2) where r(A,ri) - o(/ i(A)) uniformly in 77 with h{\) > 0 and

 h(X) = o(l). Without any loss of generality we shall assume h(X) — bX  with

£>> 0.

R e m a r k s .

(1) The assumption P0t „(R*) = a for any q implies that the distribution

of  U  under H 0 is independent of the choice of  q. T h e test with critical

region R' satisfying the assumption is called null robust.

Page 150: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 150/176

 Locally Minimax Teats in Symmetrical Distributions 141

(2) The assumption P\^(R") — a + h(X) + r(X,ri) for any q implies that

the distribution of  U  under Hi is independent of the choice of  q as

 X —> 0. The test with critical region R" satisfying the assumption is

called locally non-null as X —• 0.

Let fo, £x be the probability measures on the sets {S = 0}, {6  —  X},

respectively, satisfying

//

 f(z : X,T,)Udv)

= 1 + h(X)[g{X) + r{X)u]  + B{z, X) (6.9)

 f(z:Q, n )( 0(dri)

for any q in (6.2) where 0 < Ci < r(X) < c 2 < oo for X  sufficiently small,

 g(X) — o(l) and B(z,X) = o(X) uniformly in z.

Note. If the set 8 = 0 is a single point, £ 0 assigns measure 1 to that point. In

this case we obtain

 j = t ( 6 1 0 )

Since G act s trans itively on the range space y of the right-hand side of 

(6.10) is independent of  q.

T h e o r e m 6.2 .1 . If R* satisfies (6.8) and for sufficiently small X, there

exist £\ and £o satisfying (6.10), then R* is locally minimax for testing HQ : 

5 = 0 against the alternative Hi : 6 = X for any q in (6.2) as X —* 0, i.e.

l i m iaU,Px,v(R*) - a = j , f i ^

A - . o s u p ^ i e Q a iaf„Px,v{4x rejects H 0 } - a

 for any q in (6.2), where Q a is the class of  test d>\ of level a.

P r o o f . Let

n = (2 + /i(A)[9(A) + C*Q7-(A)])- 1 .

A Bayes critical region for (0, 1) losses with respect to the a priori £ J —

T\ Z\ + (1 - TX)£O is given by

Page 151: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 151/176

142 Group Invariance in Statistical  Inference

B y (6.10) B x(z) holds for any q in (6.2). Write B x(z) = B x , V  x = R * - B x

and W  x = B\ — R". Since supJ5 A (z )/ 7 i ( A ) | = o( l) and the distribution of  U 

is continuous, letting

 KM) = / PxMUdv)

we get

Since, with A — Vj, or

p & t ^ ) = % f  , 4 ! ( i + o ( f t ( A ) )

writing

r J ( A ) - (1 - r A ) P J i A ( A ) + r A ( l - f & f A ) )

we get the integrated Bayes risk with respect to fj as

rJ(B>) - (1 - r O P o , x ( S , ) + r A ( l - P & p j J ) .

 r x(R*) =(1 - r A ) P 0 % ( J ? ) + rx(l - P ; , A ( P ' ) ) •

Hence for any q in (6.2)

r A ( B A ) = rl(R-) + (1 - n ) t * f o T O "

+ T a ( P 1 ' I A ( V A ) - P 1 - I , ( W , ) )

= rl(R') + (1 - 2 n ) ( P 0 * A ( H M - P 0 * A ( V A ) )

= r i ( i l * ) + 0 ( M A ) ) . (6.13)

If  (6.11) is false for all q in (6.2), then by (6.8) we can find a family of  tests

{<px} of level a such that d> x has power function cv + r(A, n) on the set {6  = A}

for all q in (6.2) satisfying

lim sup (inf[r(A,jj) -h(X)]/h(X) > 0.

A —> 0 "

Hence, the integrated Bayes risk rA

of  d>\ with respect to then satisfies

lim sup ( r ; ( P ' ) - r A ) / M A ) > 0

A — 0

for all q in (6.2) contradicting (6.13) •

Page 152: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 152/176

 Locally Minimax Tests in Symmetrical Distributions 143

6 . 3 . E x a m p l e s

E x a m p l e s 6.3 .1 . Let X = (Xij) = (X[,. .. t X' n ), X[ = (X' tl  ,... ,X ip ),

i — 1,... ,n be a n x p (TI > p) matrix with probability density function

 f  x(x) = m-^ 2

q(Y (xi ~ rf'E-^-JK)) (6.14)

i-i

where g is from [0, co) to [0, oo), ft = (pi,... ,p p )' € E p and E is a p x p positive

definite matrix. We shall assume that q is thrice contunuously differentiable.

Write for any b = (h,...,b p )' = {b'{1) ,b'(2) )', with b{ i) - (f>i,... , 6 P l ) ' , b [2 ) =

(lfa+1,.. .»6p)',6p] = (bi,...,bi)' and for any pxp matrix

 A=(aiS )=(f*

 3 \ A  2 1 A 22

where A{  3 are pi x p } submatrices of  A with pi -I- p 2 — p- We shall denote by

 / a n , . . . , a i , \ 

 \aa,..., On/ 

We are interested here in the following three problems.

P r o b l e m 1. To test H IQ  : p = 0 against H \ \ ; p ^ 0 when E is unknown.

P r o b l e m 2. To test H 2I) : M(i) = 0 against H 2 \ • P(i) ^ 0 when p( 2 ) , E

are unknown.

P r o b l e m 3. To test H Z Q : p — 0 against H 31 : p; ij = 0, p( 2 ) / 0 when E

is unknown. The normal analogues of  these problems have been considered in

Chapter 4.

P r o b l e m 1.

Let Gi(p) be the mul tipl icative group of p >c p nonsingular matrices g.

Problem 1 remains invariant under G;(p) transforming

(X,S;p,-i:)^(gX,gSg';gp,gT l  g')

where X = ^ = l   X i , S = Z?=l (Xi-X)(Xi-X}'. A m axi ma l invariant in the

space of ( X , S) is T 2 = nX'S^X' or equivalently R = nX'(S + nXX')~1

 X  =

Page 153: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 153/176

144 Group Invariance in Statistical  Inference

T   2  / ( l + T   2  ) . A corresponding maximal invariant in the parametric space under

Gi(p) is 6 = T t y t ' E ~ V - If Q is convex and nonincreasing, then the Hotelling T  2

test which rejects H\o whenever i i > C  is uniformly most powerful invariant

with respect to G ; (p) for testing i f 1 0 against H\i (Kar iya (1981}}. As stated in

Chapter 5, the group Gi(p) does not satisfy the conditions of the Hunt-Stein

theorm. However this theorem does apply for the subgroup GT(P)  o f p x p

nonsingular lower triangular matrices with p > 2. From Example 5.1.1. the

maximal invariant under GT{P)  is [R\,... , i?p) ' , whose distribution depends

on A — (6\,...,Sf,)'. The ratio R of probability densities of  (Ri,...,Rp)'

under Hu : S = A and H   L 0 : 6 = 0 is given by (with g = (gij) €  GT =  GT{P)) 

 j Pi(9z)fl(9 2

i)( 

"-i)/2

 d9

 R = ~ — ^ (6.15)

/ P o ( s x ) I ] ( ^ ) ( n - 0 / 2 d S

where

 dg = Jldgij,

 Pi(x) = q(tr x'x - 2nxp + A),

 Po(x) = ?(tr x'x).

Since x'x > 0, there exists a g E GT  such that

 gx'xg' = I, Jn~gx = y = (VR U  ..., v ^ p ) ' .

Hence

 R =

 f  q(ti{  9 g' - 2A'gy + tyf[(fi)f**&*B

/ 9 ( t r ( s s ' ) ) n ( ^ ) ( n - i ) / 2 d f f 

(6.16)

Under the assumption that q is thrice differentiable, we expand q as

q(ti(gg' - 2A'gy + A))

- 1(^(99')) + ( -2tr(A'gy) + \)q™(ti gg1 )

+ ^ 2 t r ( A ' g y ) + A) 2 ( tr (gg'))

+ h-2ti(A'g y ) + X) 3q 3(z) (6.17)

Page 154: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 154/176

 Locally Minimal Tests in Symmetrical Distributions 145

where z = ti{gg') + (1 - a ) ( - 2 t r ( A ' 9 y ) + A) , 0 < c < 1 an d -

Since the measures

9%*<^)>jl^^^dfl , fc = 1.2,3i=i

are invariant under the change of sign of  g to -g, we conclude that

 /  t r ( A W " ( t r ( S 9 ' ) ) fiftfe**-*** = ° '

 /  9 i W t ) ( t r ( 9 9 ' ) ) n ( 4 ) ( ' 1 " i l / 2 r f  ^ 0 (618)

for i j£ i , j m . Now letting

D = /  g ( t r ( f f 0 ' ) ) n ( s ? i ) ( u - °  / 2 ^ ,

from (6.15)-(6.18) we get

 R=l  + ± ! g^(gg'))f[(gir-^dg

+ 4 / M A ' 0 i , ) ) 2

9 ( 2 > ( t r ( 5 f f  ' ) ) n ( ^ ) | n - ' ) / 2 d S

+ 4 /  ( - 2 t r ( A '? s / 

) + A ) V

3 1

(

3

) n ( S

2

, )

| n

-

l ) / 3

^6 / ? i i (6 .19 )

The first integral in (6.19) is a finite constant Q\. To evaluate the second

integral in (6.19) we first note that

 tr{L\'gy) = Y  d  r)12 (6.20)

From (6.17) and (6.19) the second integral in (6.18) reduces to

 / | 5 > £ M - I ^(gg'))f[(gl^-^dg. (6.21)

Page 155: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 155/176

Page 156: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 156/176

 Locally Minimax Teats in Symmetrical Distributions 147

where

 /  P M  = E[  T J ( E I ) ( « - 0 / a

 \ i = l

C  = p(p + l ) / 4 + 1/2 £ ( n - i ) . (6.26)

Let JJ = ( i j i , . . . ,%,) ' , r?i — ^i/A. From (6.18) we get

» = X-+ A2? j&+ | J |j> + ( n - j +

L

)%]j j +(6.27)

where B(y,ri,\) — o(A) uniformly in and 77. Thus, from (6.27), with

the equation (6.9) is satisfied by letting £0 give measure one to the single point

?7 = 0 while £A

gives measure one to the single point 77—

 n" — ( j j j , . . • , t j j )whose j t h corrdinate Wj satisfies

V'j = (" " j ) " V " 3 + i r V ^ f f t " P). / = 1, - (6-28)

so that

+ l ) n | - njp. (6.29)

Since G;(p) and GT (p) satisfy the condition of Theorem 6.1.1., R does not

depend on q. The first equation in (6.8) follows from the following lemma.

L e m m a 6.3.1. Let Y = (Y u ... ,Y  n )', Y; = {Yn,..., Y ip )' f  i = 1 , . . . , « be

 annxp random matrix with spherically symmetric probability density function

 f Y (y) = 9(tr yy') = q ( g tr ] , (6.30)

a n d Jet

, i = l

Page 157: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 157/176

148 Group Invariance in Statistical  Inference

where Y ( L ) , Y( 2 ) are of dimensions k x p, (n - k) x v respectively and n - k >

 p>k. The distribution of  Y ( 1 | ( Y ( ' 2 ) Y ( 2 ) ) Y (1) does not depend on q.

Proof. Since n > p, Y'Y  is positive definite with probabili ty one. Hence

there exists a p x p symmetric positive definite matrix A such that Y'Y  — AA.

Transform Y  to U, given by Y = UA. As the Jacobian of the transformation

Y  — U is \A\ n

 ,

 / r / ,A(« ,«) = «(tr a a ' ) M n .

Thus (7 has the uniform dis tribution and 17 is independent of  A. Partition

U  as

where t/ ( 1 ) is k x p and (7(2) is (n - k) x p. From Y = C M we get Y ( { ) = t7#j>4,

i = 1, 2. Thus

Since Y ( 1 1 ( Y 1 ' 2 ) Y ( 2 ) ) _ 1

Y { j j is a function of 17 alone, it has a completely specified

distribution. •

C o r o l l a r y 6. 3.1. If fc- is distributed as Hotelling's

T  2 (central i.e. 6 = 0).

Proof. Since Y( 1 j (Y j ' 2 j Y (2 ] )- 1

Yj ' 1 j has a completely specified distribution

for all q in (6.30), we can, without any loss of generality, assume that

Y i , . . . , Y„ are independently distributed N  p(0,I). This implies that Y / 2 j l ( 3 ) =

E J L 2 Y / Y / has Wishart distribution W  p(n - 1,1) independently of Y i . Hence

(i)( Y 

(2)Y 

 m)~lY 

{i)h a s

Hotelling's T  2

distribution. •

From this it follows that when 6 = 0, T  2

= nY'S'1?, with Y = l/n £"=1

Yi , 5 = ~Y 

 )( Y 

i ~Y 

Y h a s Hotelling's T 

 2 distribution for all q in

(6.30).

To establish the second equation in (6.8) we proceed as follows. Us ing

(2.21) we get for any g £ t?/(p), (which we will write as Gi)

 fT*{t 2 \6) J Gi g(tT(  9 g'-2a'gy + S))\gg'\^' 2 dg

 fT2(t 2

 \0) L <j(tr gg')\g0«-»)/  2 dg( J 

Page 158: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 158/176

 Locally Minimax Testa in Symmetrical Distributions 149

where y = ( < /r ,0 \ . . . , 0 ) ' , a - ( \ A , 0 , . . . ,0) ' are p-vectors and T =  f 2 / ( l + f 2 ) .

After a Taylor expansion of  q, the numerator of (6.31) can be written as

 /  q(ti99')\99'\iin- p, dg + 6 j <,<»>(tr 99' ) \99 ' \ { n - p ) n dg JG, JG,

+ 26r f  q^(tT  gg , )gl\gg'\^->'y 2 dg + o(6) JG,

where g — (jJy). Using (6.18) we get the ratio in (6.13) as

l  + 6(k + CT) + B(y,t),6) (6.32)

where B(y,r),6) — o(6) uniformly in y,v and 77 — ( i j i , . . . , i7 p ) ', with r/i —

Si/S,i = 1,.. . ,p and k,c are positive constants. Hence for testing H l0 : 6  = 0

against the alternative H' n : S — A the Hotelling's T2

test with critical region.

 R* = {U(X) = R>c a } (6.33)

where c a is a positive constant so that R* has the size a, satisfies

 P^tRn^a + bX + giKr,) (6.34)

with b > 0 and <7(A,r/) = o(Z>A). Thus we get the following theorem.

T h e o r e m 6.3.1 . For testing Hm : n = 0 against the alternative H' n : 6 —

A specified > 0 Hotelling's T  2 test is locally minimax as X —* 0 for the family

 of  distribution (6.14).

R e m a r k 1, Since Gj(p) satisfies the condition of Theorem 6.1.1., the ratio

(6.31) is independent of  q. From Corollary 6.3.1. and equations (6.31) and

(6.34) it follows that Hotelling's T 2 test is locally best invariant for testing

 Hia against : 6 — X as A —* 0 for the family of distributions (6.14). I f qis a nonincreasing convex function then Hotelling's T  2 test is uniformly most

powerful invariant.

P r o b l e m 2. Write X = ( X ( i j , X ( 2 ) ) with : n x p\,X(  2 ) : n x P2, and

 p,+P2= p.

Since the problem is invariant under translations of the last pa components

of  each Xj, j — 1,... ,n, this can be reduced to that of testing if 2

o : p ^ j — 0against H21 • P( i) 0 in terms of X^ only whose marg inal probability density

function is given by

 / ^ ^ ( ^ D i - I S i i r ^ g f t r S r / ^ i j - e p ' ^ y ^ ^ - e p ^ ) )

Page 159: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 159/176

150 Group Invariance in Statistical Inference

where E i j is pi x P l left-hand cornered sub mat rix of £ , e — (1 , . . . ,1)

JI-vector and

g(tr v) = /  q(tr(v + ww'))dw,

 J R "F1

and q is a function on [0, oo) to [0, oo) satisfying (6.1a). Now, us ing the resul ts

of problem 1 with p = p L , R = Ri - R i w e h a v e t h e following theorem.

T h e o r m 6 .3 .2 . For testing H20 • P(i) = 0 against H' 21 : 6\ — np'^y

S n V ( i ) = X {specified) > 0 the test which rejects H 20 for large values of 

 Rj — nX'(1)(S{n) +r$( l  y£fa)~1 Jt(i)

= nXl l)S( l  \ ) X {l) /{l  + nXl 1)S-\ ) X l  , ) )

is locally minimax as X —t 0 for the family of  distributions (6.34).

R e m a r k 2. T h e locally best invariant property and the uniform ly most

powerful invariant property under the assumption of nonincreasing convex

property of  q follows from the results of problem 1.

P r o b l e m 3. T h e invariance property of this problem has been discussed

in Ch ap te r 4. T h e problem of testing i f 3 0

against H  3

i remains invariant underthe multiplicative group of transformations G of p x p nonsingular matrices g,

 9 = (  9 m ) ° \9{2i) 9(32)

where gin) is pi X P i , transforming (X, S) -» (gX, gSg'). A ma xim al invariant

in the sample space under G is (Hi,.S3) where

pi p

 Ri=2^Ri, Ri+R~ 2 = R = Y, Ri

-fc=l i = l

A corresponding maxim al invariant in the para metri c space is S\, 62 where

i = l

P

i - l

For invariant tests under G the problem is reduced to testing i i 3 0 ; 5 2 = 0

against the alternatives H  3l  : b\ > 0, when it is given that 61 = 0, in terms of 

 Ri and R 2-

Page 160: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 160/176

 Locally Minimax Tests in Symmetrical Distributions 151

As stated in Chapter 5 the group G does not satisfy the conditions of 

the Hu nt -S te in theorem. However this theorem does apply for the subgroup

GT ( P)  of  p x p nonsingular lower triangular matrices. From problem 1 the

maximal invariant under GT{P)  is (Ri,...,Rp)' whose distribution depends

continuously on A = , • - •, VS P

 )' with 6i > 0 for all i. UnderH 

30 

: 6i — 0,i = l , . . .p and under i/31 : 6i — 0, i — 1 , . . . , pi- The ratio R of the probability

densities of  (Ri,..., R p )' under H ' 31  : 6  2 = A > 0 and under H 3 Q '•  $2 = 0, when

it is given that 6\ — 0, is given by

 f 

 /  q(tv(g 9' - 2A{  2j(g{2im) + S ( 3 2 ) J , ( 2 ) + A)} ]\[gl){n- ,)l2 dg

 R = — — 1 (6-35)

 /  ?(tr gg')\{{gir-^dg

J  G T  1 

where g e G T (p) — GT  , y — ( T/R~I  , • • • ,VRy)' and

• - * • > - ( a a )with (» i i j ff (22) both lower triangular matrices .

As before we expand 9 in the numerator of (6.35) as

9(tr Sff') + (~2v + \)qM(ti gg') + {-2v + A ) 2

9 | 2 ) ( t r 3 0 ' ) + ( -2 f + A ) V 3 , ( ^ ) (6-36)

where

2 = tr gg' + (1 - a ) ( - 2 f + A), 0 < a < 1,

" = tr A {2 )

( 9[ 2 1

) 3 /( 1

) + 3(22)2/(2))-

Using (6.18) the integration of the second term in (6.36) with respect to

the measure

n ^ - ^ ( 6 - 3 7 >1

gives AQI where

0 1 = /   qW (K99')f[(9 2 ,)in-iy2 dg.JG T  1

To integrate the third term in (6.36) with respect to the measure given in

(6.37) we first observe that (by (6.18))

Page 161: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 161/176

152 Group Invariance in Statistical  Inference

and

Denote

 /  ( t r ( A ; 2 ) S ( 2 u y o ) ) % ' 2 H t r f f s 0 n ( 5 2 J l n - , ) / 2 ^

= / ( E 9 ( 2 t ( t r S f f  ' ) n ^ ) ( n " i ) / a d f 

 /  ( t t ( A ; 2 ) f f ( 2 2 ) i ( ( 2 ) ) % ( 2 > ( t r S 3 ' ) n ( s 2

1 } ( n - 0 / 2 r f sJ G r

v i

= / E r >Ei f f 4 + - + V i I )T

[ )=Pi + l j > i

1

K=  f  q^[tigg')dg,JG T 

£ = tr gg',

 D= f 9(tr gg')f[{glY-^dg,

where e-s are defined in (6.23). From (6.35)-(6.40) we get

 D

+ B{y,r,, A)

(  2MN  ^ (  n+ E r i

1>J

(6.38)

(6.39)

(6.40)

(6.41)

where B(y,n, A) — o(A) uniformly in j / , T> (independently of  q by Theorem

6.1.1). The set {A = 0} is a simple point 17 = 0. So the £ 0 in Theorem 6.2.1.

assigns measure 1 to the single point rj = 0. The set {<52 — A} is a convex

 P2 = (p — pi) dimensional Eucl idean set wherein each component 77; = 0(h(X}).

Page 162: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 162/176

 Locally Minimax Tests in Symmetrical Distributions 153

Any probability measure £ \ can be replaced by degenerate measure which

assigns measure 1 to the mean 77*, i — pi + 1 , . . . , p of  £ j . Hence

 p

+ B(y,ri,\) f g 4 2  j

where B(y,Tj, A) = o(h{\)) uniformly in y,n. Consider the rejection region

RK  = {X : U(X  ) = fi + Kf  2 > c0 ) (6.43)

where A" is a constant such that (6.42) is reduced to yield (6.9) and C„ depends

on the level of significance a of the test for the chosen K, independently of  q

(by Lemma 6.3.1.) Now choose

 „ _ (n-j-!)••• (n-p) I n-pi \ , _

V j (n-j + l) --(n-p + 2) {(n-p+Vpi) ' 3 P l 

so that

Hence we can conclude that the test with rejection region

R'  = ^x :U(x)=f 1 + 7  ^ ^ - r  2 > C 0 J (6.45)

with P(,TTI (R')  = a satisfies (6.9) as A — > 0. Furthermore any region RK  of the

form (6.43) must have K = to satisfy (6.9) as A - * 0 for some t\.

It can be shown that (Giri (1988))

 ffi, R 2 (?i, f  21 J?M )//fi, (fi, f  2 \H  3a )

 Dl\P2 \ \ 

P2 (6.46)

where Dl7  ati, a 2 are positive constants and B{f '1,^2,62) — 0(62) uniformly in

f j , f 2 . From this and Theo rm 6.1.1 it follows tha t for testing H  3D against Jf 3 1

the test with critical region R'  is locally best invariant as 6  2 —> 0. Hotelling's

Page 163: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 163/176

154 Group Invariance in Statistical  Inference

test which rejects H  3Q whenever f  i + f 2 > constant does not coincide with the

L B I test and hence it is locally worse. From (6.32) it follows that Hotelling's

test whose power depends only on 6 = 6  2 , has positive derivative at 6  2 = 0.

Thus, for the critical region of Hotelling's test the value of  h(\) with 6% = A in

(6.8) is positive. The first equation in (6.8) follows from Lemma 6.3.1. Hence

we get the following theorem.

T h e o r e m 6.2.2. For testing H  30 against H  3l  : £ ( 2 ) = A the test, which

 rejects H  30 whenever fi + ^jf-rt > constant, is locally minimax as A —* 0 for

 the family of distributions (6.2).

E x a m p l e 6.3.2. Let X = (X  I} ) = {X\ X  n )' X[ = (X* X ip )',i =

1,..., TI be an n x p random matrix with probability density function given in

(6.2). Let X = J Xt, S = J J ( X i - X)(X  t - X)'. For any p x p matrix A

we shall write

C-4(l l | -4(12) 4 ( i 3 ) \ 

Ajai) A(2 2) -4(23) I (6-47)

4(31) 4(32) .4(33) / 

with 4 ( n ) : 1 x 1, -4{2

2) I Pi xp i , 4 (3 3

, : p2 x p; so that pi +pa = p—

1. Define

Pi =S ( 1 2 ) S ( 2 2 ) S ( 2 1 ) /  S ( I 1 ) '

p2

= p2

+ p 2 - ( E , i 2 ) , E , i 3 , ) ' ( ^ ( E a a , , E ( 1 3 ) ) ' / E ( 1 1 ) .

We shall consider the following three testing problems: (see Sec. 4.4)

P r o b l e m 4. To test H 0 : p 2

= 0 against Hi : p 2 > 0 when p, E are

unkown.

P r o b l e m 5. To test H 0 : (? = 0 against H  2 : pi > 0, p\ = 0 when p , E

are unknown.

P r o b l e m 6. To test H 0 : p2

= 0 against H  3 : p\ — 0, p\ > 0 when p, E

are unknown.

These problems remain invariant under the group of translations trans

forming

( * , S ; i » , S ) - » ( * + 6 , S ; / £ + t , E ) . 6 e Rp

.

Page 164: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 164/176

 Locally Minimax Testa in Symmetrical Distributions 155

The effiect of  these transformations is to reduce these problems to correspond

ing ones where p = Q and 5 = ^27 =1 XiX[. Thus we reduce n by 1. We treat

now the latter formulation with p = 0 and assume that n > p > 2 to assert

that S is positive definite with probabil ity one. Let G be the full linear group

of  p x p nonsingular matrices g of the form

 f s m 0 0 \ 9 = \ 0 9(22) 0 . (6.48)

 \  0 9(32] 5 ( 3 3 ] /  

For the first problem p 2 = 0, p\ =p — 1 and for the other two problems

 Pi> 0,p2 > 0 satisfying p\+p2 — p— 1- T h e problems above remain invarian t

under G operating as

( S ; E ) -> (gSg'^Zg1

 )

with g G G. Fo r Pro ble m 6 a ma xima l invariant in the space of  S under G

(with p2 = 0) is

_o 5

( 1 2 )5

( 2 2 )5

[ 2 1 i

" = 

and a corresponding maximal invariant in the parametric space of E under the

induced group (with p2 = 0) is

 p* = S ( 1 2 ) S

^S f 2 1 ) . (6.49)

s

( n )

For Problems 5, 6 a maximal invariant under G is (fi 2 , fi2), where

 R\ = 5 " 2 , y 2 ^ ( 2 1 ) , (6.50)6

( H )

fij+fi| = ( 5 ( 1 2 ) , 5 ( 1 3 1 ) ( ^ l ( % a ) , •

(6.51)

A corresponding maximal invariant in the parametric space of E is {p\,p 2 ),

where

 Pi =

., _ E ( i 2 ) E ( 2 2 ) E ( 2 i ]

E ( n )

( E ( 1 2 ) , E ( 1 3 ) ) ( ' ^2 2

) ^3

) ) ( S ( 1 2 ) 1 E ( 1 3 ) ) '

L ( u )

Page 165: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 165/176

156 Group Invariance in Statistical  Inference

P r o b l e m 4. T h e invariance of this problem has been discussed in Sec. 4.4.

The group G does not satisfy the conditions of the H unt -S te in theorem . T h e

subgroup GT  of G , given by,

f  ( a m0 0

GT  =  < 9 - 0 0(22) 0

( \ 0 9(32) 9(33)

where 3( 22)(Pi x Pi)> 9(33)(P2 x Pi) are nonsingular lower triangular matrices,

satisfies the conditions of the theorm. It may be rem arked that we are using

the same notation g for lower triangular matrices as g in G . In the place of fi2,

the m ax imal invariant in the sample space under GT  is a (p — 1)-dimensional

vector R = ( R 2

 , R p

 ) ' denned in Exa mpl e 5.1.5. wi th

PI + I P

J=2 j = 2

The corresponding maximal invariant in the parametric space under GT  is

A — {6^ 2 , • • • ,t>)J 2 

 )'  as defined in Ex amp le 5.1.5. with

P I + I p

 p\ = P 2 =P 2i+pl  = J26i

i=2 i - 2

For testing H 0 : p 2 = 0 against the specified alternative Hi\ : p 2 = A the rat io

fi of the probability density function of  R under H \ \ to that under HQ  (using

Theorem 2.7.1.) is

 R = fR(r\H lx

 )/f  R

(r\H 0

 )(1 - A ) " " /  2

 /  $((1 - A) hT(g 2

u) - 2g(11) Ay'g'(22) + g{  22) g(  2 2)]v{dg)

(6.53)

where

•=2

= / g(tr gg')v(dg). JGT 

i=2

lGT 

 y = ( v ^ 2 y/Tp)'

Page 166: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 166/176

 Locally Minimax Testa in Symmetrical Distribntxona 157

Since the (p - 1) x (p - 1) matrix

C = ( l - p 2 ) " 1 ( / - A A ' )

is positive definite there exists a (p— 1} x (p— 1) lower triangular nonsingularmatrix T  such that TCT' = I. Define 7i,a<,2 < i < p by

i

7i = 1 - E '̂' ^ = &7j>/7<7i-i-

 j = 2

Obviously 7 P = 1 - p 2 . Le t a = ( a 2 ) . . . , a f  ) ' . The n a = T A . Since C A = A

and TCT' = I  we get

a = T A - r C A = { r ) _ 1 A ,

 a'ot = A ' ( T ' T )- 1

A = A C A '

= (1 - p 2 ) - 1 ( A ' A ) - ( A ' A ) { A ' A ) = p 2 .

Furthermore with = (a 2 , • • •, on)'

i

Since

| C | - ( l - p 2 } 2 - "

we can write

(1 — A ) " ' " /  2 ' f 

 M=- g g(tT(9 2

U  )^ 2 9(ii)ay'9(22)+m2)9{22)))l/ (  d  9)- (6-54)

Expanding the integrand in (6.45) as

9(tt 90') + o ( 1 ) ( tr S<7 ' ) ( - 2t r ( a ( 1 1 ) C ! i , ' o ; 2 2 ) ) )

+ ^ ^ ( - 2 t r ( f f l l l ) Q ^ 2 2 ) ) ) 2

+ ^ ^ { - 2 t r ( g , n | a y ' g ( 2 2 | ) )3

where

 z = tr go' + (1 - ff)(-2tr 9 ( i i | a y ' S ( 2 2 ) ) '

with 0 < S < 1.

Page 167: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 167/176

158 Group Invariance in Statistical  Inference

Since, with g - (gtj),

trg (ii,QJ/'<i(22] = tf(ii> E f o ) I / 2 ( + a

*9 i

 j ) j=2 i>j

and the measure g (1> (tr gg') f(do) is invariant under the change of sign of g

to — g we conclude as in (6.18) that

 /  [tr <J(i i )Q!/ ' i7(22)]9 ( 1 ) ( t r 99'Hdg) = 0 JGT 

 j 9i j 9ihQ(1)(^gg')u(dg) = 0! i / /, J j t f e . (6.55)

Let

 L = tr gg', L e i = gj u) , L e , = g2

, i = 2 , . . . ,p,

L e 2 p - i + i = G2

+ 2 , i , i = 2, . . . , p ,

i e i + P ( p - i ) / 2 = ffp, •

Now proceeding as in Problem 3 we get

 j = 2 • \ !>j

+ £ ( > / , A, r>)

where

I f  =6j/X, j = 2 , . . . , p ,

 M  = E

 j=2

e ( n - i ) / a T T J " - * *1

) / 2

(6.56)

S ( j / i , A , n ) = o(A) uniformly in 3/ and TJ. Furthermore from Theorem 6.1.1 R

does not depend on q.

Since A = 0 is a single point n = 0, £0 assigns measure 1 to the point 17 =

0. The set p2

= A is a convex (p —1)-dimensional Euclidean set wherein each

Page 168: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 168/176

 Locally Minimal Tests in Symmetrical Distributions 159

component is 0(A) . Thus any probability measure & can be replaced by the

degenerate measure which assigns measure 1 to the mean TJ- , i = 2 , . . . , p of  t)y •

Choosing

V* = (n-j + l ) - ' ( n - j + 2 ) ~ 1 ( p - l)~l  n(n - P + 1 ) ,

f  = 2 , . . . »J> . (6-57)

we see that (6.9) is satisfied with U = YZ =2 R 3 = R 2 - F l o m G i r i ( 1 9 8 8 ) o r

using Theorem 2.7.1. we get

 / ^ ( r 2 | 7 J i ) / / f i 2 ( r 2 | f f 0 )

(1 - p 2 )-' 2

 B„2i

y G

+ ( i - p 2 ) ( / - A A ' ) 9 ( 2 2 | 0 ; 2 2 ) ] ) p ( r f  9 )

where g e G , as defined in (6.48) , wit h p 2 = 0,

^ S ) - ( 9 ( 2 n , ) ( " - , l / 2 | f f ( 2 2 ) 9 f 2 2 ) |n ( p - 1 , / 2 ^ ,

5 = / g(tr gg')p{dg),

and u - (tii> - - • > T p - i ) ' satisfying u'u — r 2 . I f  p 2 is small the power of the

test against the alternative H\ which rejects Ho whenever R 2 > C  a is given

by a + h{p 2 ) + o(p 2 ), where h{p 2 ) > 0 for p 2 > 0 and h(p 2 } = o( l ) . The

null robustness of the test follows from the fact that T(X) = R 2 satisfies

T(X) = T(Xh) where h 6 GT  (Kariya (1981b)). Hence we get the following

theorem:

T h e o r e m 6.2 .3 . For testing H  a against H[: p 2 = A, the level-a test which

 rejects Hn whenever R 2 > C is locally minimax as A —> 0.

Problem 5 . H ere

 m = 0, j = pi + 2 , . . . , p .

For testing H Q against H  2 - p\ = A we can write (using 6.55)

- TlA R = l  + Y 

•pi+i

 3=2 M>j / .

+ B(r,X,n). (6.58)

Page 169: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 169/176

160 Group Invariance in Statistical Inference

Now letting give measure one to n = 0 and t\x give measure one to

Vh---< n p,+i'tlie m e a n ot*£> where

 j ) | = ( n - p 1 ) 7 I ( 7 i - i + l ) - 1 ( n - J - l - 2 ) - 1 p r 1

J = l , . . . , P i . (6.59)

we conclude that (6.9) is satisfied with U = Rf = £,,'=2R j a n d ( 6 - 8 ) follows

from Theorem 6.2.3.

T h e o r e m 6.2.4. For testing Ha against H' -p\ — A, the level-a test which

 rejects Ho whenever Rf > C  a , where the constant C 0 depends on level a of 

 the test, is locally minimax as X  —* 0.

Problem 6 . Here

T ) , = 0 , j = 2 , . . . , p i + l .

For testing Ho against H  3 ; p\ = X we can write R as A —< 0 as

)=P i+2 V i »2 B

+ B ( r , A , n ) (6.60)

where B(r, X, n) = o(A) uniformly in r and rj. Let us now consider the rejection

region of the form

RK ={? ; u(y) = f\+Kf  2 2 > c  a ]  (6.61)

where K  is chosen such that (6.61) is reduced to yield (6.9) and C„ depends

on the level a of the test for the chosen K. Let Co give measure one to n = 0

and t]x give measure one to a single point ( 0 , . . . , 0, r v * i + 2 , . . . , JJp) where

Tl'j=(n-p+ l)n(n - j + l ) " l ( n - j + 2 ) - \ ( p - 1 ) " \   }=Pi+2,...,p

and let

 R=! [  y:U(y) = fl + ^ f l > C   a  }

with pi + P2 = p - 1. Observe that the rejection region R with P0> „R = a

satisfies (6.9) . Fu rthe rm ore, for the invar iant region R, PxiTI {R) depends only

on A. Hence, from (6.8) r ( A , r ) = 0. Since R is locally best invariant as A -*

Page 170: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 170/176

 Locally Minimax Testa in Symmetrical Distributions 161

0. the test, which rejects HQ  whenever R 2 > C, does not coincide with R and

hence is locally worse. From Problem 4 the power of the test (which rejects

H 0  whenever R 

2  > C) which depends only on A, has positive derivative at A

= 0. Thus from (6.8) with R* = Rwe conclude that h(X) > 0. Hence we have

T h e o r e m 6 .2 ,5 . For testing HQ against H ' 3  : p\ = X > 0, the test with

 critical region R is locally minimax as A —> 0.

E x e r c i s e s

1. Let X = (Xi,.... X  p )' be a random vector with peobability density function

(6.1). Show that E(X) = p, c o v ( X ) = Q s with a = p~l  E(X  - / i J ' E "1 (X-

!*))•

 2. Let X  be a n x p random matrix wi th probability density function (6.2) .

Find the maximum likelihood estimator of  p and E .

R e f e r e n c e s

1. N. Giri, Locally minimax tests for multiple correlations, Canad. J. Statist., pp. 53¬

60 (1979).

2. N. Giri, Locally minimax tests m symmetrical distributions, Ann. Inst. Stat. Math.,pp. 381-394 (1988).

3. N. Giri, Multivariate Statistical  Analysis, Marcel Dekker, N.Y. (1996).

4. T. Kariya and M. Eaton, Robust test of spherical  symmetry, Ann. Statist., pp. 208¬

215 (1977).

5. T . Kar iya, A robustness property of Hotelling's T  2-problem, Ann. Statist., pp. 206¬

215 (1981).

6. T. Kariya, Robustness of  multivariate tests, Ann. Statist., pp. 1267-1275 (1981b),

7. T . Kari ya and B . Sinha, Non-null  and optimality robustness of some tests, Ann,

Statist., pp. 1182-1197 (1985).8. T. Kariya and B. Sinha, Robustness of  Statistical  Tests, Academic Press, N.Y.

(1988).

Page 171: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 171/176

Chapter 7

TYPE D AND E REGIONS

The notion of a type D or E region is due to Isaacson (1951). Kiefer

(1958) showed that the usual F-test of the univarial general linear hypotheses

possesses this property. Lehmann (1959) showed that, in finding type D region,

invariance could be invoked in the manner of the Hunt-stein theorem; and this

could also be done for type E regions (if they exist) provided that one works

with a group which operates as the identity on the nuisance parameter set H 

of the testing problem.

Suppose, for a parameter set fi = {( 8  ,-n) '• 8 € 0 , J J e H  with associated

distributions, with 0 a Euclidean set, that every test function d> has a power

function fy(9,IJ) which, for each n, is twice continuously differentiable in the

components of 6 at 8 = 0, an interior point of 0. L e t Q a be the class of locally

strictly unbiased level a test of  HQ : 8 =  0 against Hi : 0 ^ 0. Our assumption

on 0$ implies that all tests in Q„ are similar and tha t d0^/d&i\g  ̂o — 0 for all d>

in Q a . Let A^(JJ) be the determinant of the matrix B<j,(rj) of second derivalives

of  0${8,TI)  with respect to the components of  8  (the Gaussian curvature) at

8 — 0. Suppose the param etrization be such that A # ( f i ) > 0 for all n for at

least one <(>' in Q a .

D e f i n i t i o n 7 .1 . T y p e . E t e s t . A test d>' is said to be of type E if <p' e Q a

and A ^ . ( n ) = max^gp,, A ^ ( n ) for all 77.

D e f i n i t i o n 7 .2 . T y p e D test . A type E test d>' is said to be of type D if the nuisance parameter set H  is a single point. I n the problems treated

earlier in Chapter 4 it seems doubtful that type E regions exist (in terms of 

162

Page 172: Group Invariance in Statistical Inference (Narayan)

7/28/2019 Group Invariance in Statistical Inference (Narayan)

http://slidepdf.com/reader/full/group-invariance-in-statistical-inference-narayan 172/176

Type D and E Regions 163

Lehmann's development, H  is not left invariant by many transform ations ). We

introduce here two possible optimality criteria in the same sprit as the type

D  and E  criteria which will always be fulfilled by some test under minimum

regularity assumptions. Let

A ( n ) = max A^(TJ) . (7.1)

D e f i n i t i o n 7 .3 . T y p e D A test . A test d>' is of type D A if 

maxi

 A( V  ) - A^(77)] - min ma x[A (r, ) - A*(r/) (7.2)

D e f i n i t i o n 7 .4 . T y p e D M test . A test d<" is of type DM  if 

m a x ^ M / A d - f a ) ] = min max[A(rj)/A,, , (j))] . (7.3)

These two criteria resemble stringency and regret criteria employed elsewhere

in statistics; the subscripts "A"  and " M " stand for "additive" and "multip lica

tive" regret principles. T h e possession of  these properties is invariant under

the product of any transformations on 9 (acting tr ivia lly on H )  of the same

general type as those for which type D  regions retain their property, and an

arbitrary 1-1 transformation on H  (acting triv ial ly on 0 ) , but, of course, not

under more general transformations on ft. Obviously, a type E  test automat

ically satisfies these weaker criteria.

Let us now assum e that a testing problem is invariant under a group of 

transformations G for which the Hunt-Stein theorem holds and which acts

trivially on 0; that is

 g(0,n) = (6,gr,), 9 e G . (7.4)

If  d>g is the test function, defined by 4>g{x) = d>(gx), then a trivial compulation

shows that

 A tg( V  ) =/Xpigv) • (7.5)

It is easy to verify that $\Delta_{\phi g}(\eta) = \Delta_\phi(g\eta)$ implies $\Delta(\eta) = \Delta(g\eta)$. Furthermore, if $\phi$ is better than $\phi'$ in the sense of either of the above criteria, then $\phi g$ is better than $\phi' g$. All of the requirements of Lehmann (1959) are easily seen to be satisfied, so that we can conclude that there is an almost invariant (hence, in our problems, invariant) test which is of type $D_A$ or $D_M$. This differs from


the way in which invariance is used on page 883 of Lehmann (1959), where it is used to reduce $\Theta$ rather than $H$ as here.

If the group $G$ is transitive on $H$, then $\Delta(\eta)$ is constant, as is $\Delta_\phi(\eta)$ for an invariant $\phi$, which we therefore write simply as $\Delta_\phi$. In this case we conclude that if $\phi^*$ is invariant and if $\phi^*$ is of type D among invariant $\phi$ (i.e., if $\Delta_{\phi^*}$ maximizes $\Delta_\phi$ over all invariant $\phi$), then $\phi^*$ is of type $D_A$ or $D_M$ among all $\phi$.

To verify these optimality properties we need the following lemma.

Lemma 7.1. Let $L$ be a class of non-negative definite symmetric matrices of order $m$ and suppose $J$ is a fixed nonsingular member of $L$. If $\mathrm{tr}\,J^{-1}B$ is maximized (over $B$ in $L$) by $B = J$, then $\det B$ is also maximized by $B = J$. Conversely, if $L$ is convex and $J$ maximizes $\det B$, then $\mathrm{tr}\,J^{-1}B$ is maximized by $B = J$.

Proof. Write $J^{-1}B = A$. If $A = I$ maximizes $\mathrm{tr}\,A$, we get $(\det A)^{1/m} \le \mathrm{tr}\,A/m \le 1 = \det I$. Conversely, if $A = I$ maximizes $\det A$, it also maximizes $\mathrm{tr}\,A$, since $\mathrm{tr}\,B > \mathrm{tr}\,I$ implies $\det(\alpha B + (1-\alpha)I) > 1$ for $\alpha$ small and positive, contradicting the maximality of $\det A$ at $A = I$. □
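The following numerical sketch (ours, not part of the text) illustrates the converse half of the lemma on a randomly generated convex class; the class itself and all constants are illustrative only.

```python
import numpy as np

# A small numerical sketch of Lemma 7.1 (converse half): if L is convex and
# J maximizes det B over L, then tr(J^{-1}B) <= m = tr(J^{-1}J) on L.
rng = np.random.default_rng(0)
m, n_gen, n_samp = 4, 5, 100_000

# Generators of the convex class: random symmetric positive definite matrices.
gens = np.empty((n_gen, m, m))
for k in range(n_gen):
    G = rng.standard_normal((m, m))
    gens[k] = G @ G.T + 0.1 * np.eye(m)

# Random members B of L: convex combinations of the generators.
w = rng.dirichlet(np.ones(n_gen), size=n_samp)
B = np.einsum('sk,kij->sij', w, gens)

# Take J to be the sampled member with the largest determinant (an
# approximation to the true det-maximizer over L).
J = B[np.argmax(np.linalg.det(B))]

# The trace inequality should now hold up to sampling error in J.
tr = np.trace(np.linalg.solve(J, B), axis1=1, axis2=2)
print(f"m = {m},  max tr(J^-1 B) over sampled B = {tr.max():.4f}")
```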

Remarks. The usefulness of this lemma lies in the fact that the generalized Neyman-Pearson lemma allows us to maximize $\mathrm{tr}\,QB_\phi$ for fixed $Q$ more easily than to maximize $\Delta_\phi = \det B_\phi$ among similar level $\alpha$ tests. We can find, for each $Q$, a $\phi_Q$ which maximizes $\mathrm{tr}\,QB_\phi$; a $\phi^*$ which maximizes $\Delta_\phi$ is then obtained by finding a $\phi_Q$ for which $B_{\phi_Q} = Q^{-1}$.

In problems of the type which we consider here the reduction by invariance under a group $G$ which is transitive on $H$ often results in a reduced problem wherein the maximal invariant is a vector $R = (R_1,\ldots,R_m)'$ whose distribution depends only on $\Delta = (\delta_1,\ldots,\delta_m)'$, where $\delta_i = \theta_i^2$, $\theta = (\theta_1,\ldots,\theta_m)'$, and such that the density $f_\Delta$ of $R$ with respect to a $\sigma$-finite measure $\mu$ is of the form
$$f_\Delta(r) = f_0(r)\Big[1 + \sum_{i=1}^m \delta_i\Big(h_i + \sum_{j=1}^m a_{ij}\,r_j\Big) + q(r,\Delta)\Big]\,, \qquad (7.6)$$
where the $h_i$ and $a_{ij}$ are constants and $q(r,\Delta) = o(\sum_i \delta_i)$ as $\Delta \to 0$, and we can differentiate under the integral sign to obtain, for an invariant test $\phi$ of level $\alpha$,
$$\beta_\phi(\theta,\eta) = \alpha + \sum_{i=1}^m \delta_i\Big[h_i\alpha + \sum_{j=1}^m a_{ij}\int r_j\,\phi(r)\,f_0(r)\,\mu(dr)\Big] + o\Big(\sum_i \delta_i\Big) \qquad (7.7)$$


as $\Delta \to 0$. According to Giri and Kiefer (1964) this is called the symmetric reduced regular case (SRR).
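(As a reader's note, not part of the original text: the step from (7.6) to (7.7) is immediate, since a level $\alpha$ invariant test has $\int \phi(r)\,f_0(r)\,\mu(dr) = \alpha$, so that
$$\beta_\phi(\theta,\eta) = \int \phi(r)\,f_\Delta(r)\,\mu(dr) = \alpha + \sum_i \delta_i\Big[h_i\alpha + \sum_j a_{ij}\int r_j\,\phi(r)\,f_0(r)\,\mu(dr)\Big] + o\Big(\sum_i \delta_i\Big)\,.$$
Moreover, because $\beta_\phi$ depends on $\theta$ only through $\delta_i = \theta_i^2$, the mixed derivatives $\partial^2\beta_\phi/\partial\theta_i\partial\theta_j$ vanish at $\theta = 0$ for $i \neq j$, while $\partial^2\beta_\phi/\partial\theta_i^2|_{\theta=0} = 2[h_i\alpha + E_0\{\sum_j a_{ij}R_j\phi(R)\}]$; this is why $B_\phi$ is diagonal for every invariant $\phi$, a fact used in the proof of Theorem 7.1 below.)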

Theorem 7.1. In the SRR case, an invariant test $\phi^*$ of level $\alpha$ is of type D among invariant $\phi$ (and, hence, of type $D_A$ and $D_M$ among all $\phi$) if and only if $\phi^*$ is of the form
$$\phi^*(r) = \begin{cases} 1\,, & \text{if } \sum_i q_i \sum_j a_{ij}\,r_j > c\,,\\ 0\,, & \text{otherwise}\,,\end{cases} \qquad (7.8)$$
where $c$ is a constant and
$$q_i^{-1} = \mathrm{const}\Big[h_i\alpha + E_0\Big\{\sum_j a_{ij}\,R_j\,\phi^*(R)\Big\}\Big]\,.$$

Proof. In the SRR case, every invariant $\phi$ has a diagonal $B_\phi$ whose $i$th diagonal entry, by (7.7), is $2[h_i\alpha + E_0\{\sum_j a_{ij}R_j\phi(R)\}]$. By the generalized Neyman-Pearson lemma, $\mathrm{tr}\,QB_\phi$ is maximized over such $\phi$ by a $\phi^*$ of the form (7.8). Hence, by Lemma 7.1, we get the theorem. □
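(An expository aside, not in the original: the form (7.8) arises as follows. For $Q = \mathrm{diag}(q_1,\ldots,q_m)$, $q_i > 0$,
$$\mathrm{tr}\,QB_\phi = 2\sum_i q_i h_i\alpha + 2\,E_0\Big\{\phi(R)\,\sum_i q_i \sum_j a_{ij}R_j\Big\}\,,$$
and among tests with $E_0\phi = \alpha$ this is maximized, by the Neyman-Pearson lemma, by rejecting exactly when $\sum_i q_i \sum_j a_{ij}r_j > c$. The side condition on the $q_i$ is the fixed-point requirement $B_{\phi^*} = \mathrm{const}\,Q^{-1}$ of Lemma 7.1, since the $i$th diagonal entry of $B_{\phi^*}$ is $2[h_i\alpha + E_0\{\sum_j a_{ij}R_j\phi^*(R)\}]$.)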

Example 7.1. ($T^2$-test) Consider once again Problem 1 of Chapter 4. Let $\theta = N^{1/2}F^{-1}\xi$, where $F$ is the unique member of $G_T$ with positive diagonal elements satisfying $FF' = \Sigma$. Then $\Theta$ is Euclidean $p$-space, $G_T$ operates transitively on $H = \{\text{positive definite } \Sigma\}$ but trivially on $\Theta$, and we have the SRR case. We thus have (7.6) with $h_i = -1$ and
$$a_{ij} = \begin{cases} 1\,, & \text{if } i > j\,,\\ 0\,, & \text{if } i < j\,,\\ N - j + 1\,, & \text{if } i = j\,. \end{cases} \qquad (7.9)$$

Hotelling's $T^2$ test has a power function which depends only on $\sum_i \delta_i$, so that, with the above parametrization for $\theta$, $B_{T^2}$ is a multiple of the identity. Hotelling's critical region is of the form $\sum_j r_j > c$. But, when all the $q_i$ are equal, the critical region corresponding to (7.8) is of the form $\sum_j (N + p + 1 - 2j)\,r_j > c$, which is not Hotelling's region if $p > 1$. Hence we get the following theorem.

Theorem 7.2. For $0 < \alpha < 1 < p < N$, Hotelling's $T^2$ test is not of type D among $G_T$-invariant tests, and hence is not of type $D_A$ or $D_M$ (nor of type E) among all tests.
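The following Monte Carlo sketch (ours, not from the text) illustrates how $\Delta_\phi$ can be computed from null expectations alone in this example, for $p = 2$. It assumes that under $H_0$ the vector $(R_1, R_2, 1 - R_1 - R_2)$ has the Dirichlet$(1/2,\,1/2,\,(N-2)/2)$ distribution (an assumption consistent with $E_0(R_i) = 1/N$ used in the Remarks below, and derivable from Chapter 4); the choices $N = 10$, $\alpha = 0.05$ and the two comparison tests are ours as well.

```python
import numpy as np

# Monte Carlo sketch for Example 7.1 with p = 2: compute the diagonal of
# B_phi, namely B_ii = 2[-alpha + sum_j a_ij E0{R_j phi(R)}], for two level-
# alpha critical regions, and compare Delta_phi = B_11 * B_22.
# Assumption (ours): under H0, (R1, R2, 1-R1-R2) ~ Dirichlet(1/2, 1/2, (N-2)/2).
rng = np.random.default_rng(1)
N, alpha, n = 10, 0.05, 2_000_000

d = rng.dirichlet([0.5, 0.5, (N - 2) / 2], size=n)
r1, r2 = d[:, 0], d[:, 1]

# Constants of (7.9) for p = 2: h_i = -1 and a = [[N, 0], [1, N-1]].
a = np.array([[N, 0.0], [1.0, N - 1.0]])

def curvature_product(w1, w2):
    """Delta_phi for the test rejecting when w1*R1 + w2*R2 exceeds its null
    (1 - alpha)-quantile (so the test has level alpha by construction)."""
    s = w1 * r1 + w2 * r2
    phi = s > np.quantile(s, 1 - alpha)
    e1, e2 = (r1 * phi).mean(), (r2 * phi).mean()   # E0{R_j phi(R)}
    b11 = 2 * (-alpha + a[0, 0] * e1 + a[0, 1] * e2)
    b22 = 2 * (-alpha + a[1, 0] * e1 + a[1, 1] * e2)
    return b11 * b22

# Hotelling's region R1 + R2 > c has B proportional to the identity (here
# b11 and b22 agree by symmetry); the weights (N+1, N-1) are those of (7.8)
# with all q_i equal.
print("Hotelling (1, 1):         ", curvature_product(1.0, 1.0))
print("(7.8) weights (N+1, N-1): ", curvature_product(N + 1.0, N - 1.0))
```

Theorem 7.2 says that Hotelling's value cannot be the maximum of $\Delta_\phi$ over $G_T$-invariant level $\alpha$ tests; the weighted region is only one natural competitor, not necessarily the type D test itself, which requires solving the fixed-point condition of Theorem 7.1 for the $q_i$.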


Remarks. The actual computation of a $\phi^*$ of type D appears difficult due to the fact that we need to evaluate integrals of the form
$$\int_{\{\sum_i c_i r_i \le c\}} r_j^h\, f_0(r)\,\mu(dr) \qquad (7.10)$$
for $h = 0$ or $1$ for various choices of the $c_i$'s and $c$. When $\alpha$ is close to zero or 1, we can carry out approximate computations as follows. As $\alpha \to 1$ the complement $\bar{R}$ of the critical region becomes a simplex with one corner at 0. When $p = 2$, if we write $\rho = 1 - \alpha$ and consider critical regions of level $\alpha$ of the form $by_1 + y_2 > c$ where $0 < L^{-1} < b < L$, $L$ being fixed but large (this keeps $\bar{R}$ close to the origin), we get from (7.10) that
$$\rho = E_0\{1 - \phi(R)\} = (N-2)c/2b^{1/2} + o(c)$$
as $c \to 0$. Similarly,
$$E_0\{(1 - \phi(R))R_1\} = (N-2)c^2 b^{-3/2}/8 + o(c^2) = \rho^2 b^{-1/2}/2(N-2) + o(\rho^2)$$
as $\rho \to 0$, while $E_0(R_i) = 1/N$. From (7.7) we obtain for the power near $H_0$
$$\beta_\phi = \alpha + \delta_1\rho\Big[1 - \frac{N\rho + o(\rho)}{2(N-2)b^{1/2}}\Big] + \delta_2\rho\Big[1 - \frac{\rho + o(\rho)}{2(N-2)b^{1/2}} - \frac{(N-1)\rho b^{1/2} + o(\rho)}{2(N-2)}\Big] + o\Big(\sum_i \delta_i\Big)\,,$$
where the $o(\rho)$ and $o(\sum_i \delta_i)$ terms are uniform in $\Delta$ and $\rho$, respectively. The product $\Delta_\phi$ of the coefficients of $\delta_1$ and $\delta_2$ is maximized when $b = (N+1)/(N-1) + o(1)$ as $\rho \to 0$; with more care, one can obtain further terms in an expansion in $\rho$ for the type D choice of $b$. The argument is completed by showing that $b < L^{-1}$ implies that $\bar{R}$ lies in a strip so close to the $r_1$-axis as to make $E_0\{(1 - \phi(R))R_1\}$ too large and $E_0\{(1 - \phi(R))R_2\}$ too small to yield a $\phi$ as good as that with $b = (N+1)/(N-1)$, with a similar argument if $b > L$. When $\rho$ is very close to 0, all choices of $b > 0$ give substantially the same power, $\alpha + \rho(\delta_1 + \delta_2) + o(\rho)$, so that the relative departure from being of type D, of Hotelling's $T^2$ test or any other critical region of the form $bR_1 + R_2 > c$ with $b$ fixed and positive, approaches 0 as $\alpha \to 1$. However, we do not know how great the departure of $\Delta_{T^2}$ from $\Delta$ can be for arbitrary $\alpha$. We can similarly treat the case $p > 2$ and also the case $\alpha \to 0$.
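As a further numerical aside (ours, under the same assumed Dirichlet null law as in the sketch above), one can check the small-$c$ approximation $\rho \approx (N-2)c/2b^{1/2}$ and the limiting type D slope $b = (N+1)/(N-1)$; the latter follows because maximizing the product of the $\delta_1$- and $\delta_2$-coefficients displayed above amounts, at first order in $\rho$, to minimizing $(N+1)/b^{1/2} + (N-1)b^{1/2}$ over $b > 0$.

```python
import numpy as np

# Numerical check (ours) of two claims in the Remarks, under the assumed
# Dirichlet(1/2, 1/2, (N-2)/2) null law of (R1, R2) for p = 2.
rng = np.random.default_rng(2)
N, b, c = 10, 1.3, 0.01

# 1) rho = P0{b*R1 + R2 <= c} ~ (N-2) c / (2 sqrt(b)) as c -> 0.
d = rng.dirichlet([0.5, 0.5, (N - 2) / 2], size=4_000_000)
rho_mc = np.mean(b * d[:, 0] + d[:, 1] <= c)
rho_approx = (N - 2) * c / (2 * np.sqrt(b))
print(f"Monte Carlo rho = {rho_mc:.5f},  (N-2)c/(2 b^(1/2)) = {rho_approx:.5f}")

# 2) The limiting type D slope minimizes (N+1)/sqrt(b) + (N-1)*sqrt(b);
#    the minimizer is b = (N+1)/(N-1).
bs = np.linspace(0.05, 10.0, 200_001)
b_star = bs[np.argmin((N + 1) / np.sqrt(bs) + (N - 1) * np.sqrt(bs))]
print(f"grid argmin b = {b_star:.4f},  (N+1)/(N-1) = {(N + 1) / (N - 1):.4f}")
```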

One can similarly treat the other problems considered in Chapter 4.

References

1. N. Giri and J. Kiefer, Local and Asymptotic Minimax Properties of Multivariate Tests, Ann. Math. Statist. 35, 21-35 (1964).


2. S. L. Isaacson, On the Theory of Unbiased Tests of Simple Statistical Hypotheses Specifying the Values of Two or More Parameters, Ann. Math. Statist. 22, 217-234 (1951).
3. J. Kiefer, On the Nonrandomized Optimality and Randomized Nonoptimality of Symmetrical Designs, Ann. Math. Statist. 29, 675-699 (1958).
4. E. L. Lehmann, Optimum Invariant Tests, Ann. Math. Statist. 30, 881-884 (1959).