small-sample comparisons of exact levels for chi-squared goodness-of-fit statistics


Journal of the American Statistical Association
Publisher: Taylor & Francis
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/uasa20

Small-Sample Comparisons of Exact Levels for Chi-Squared Goodness-of-Fit Statistics
Kinley Larntz, School of Statistics, University of Minnesota, St. Paul, MN 55108, USA
Published online: 05 Apr 2012.

To cite this article: Kinley Larntz (1978) Small-Sample Comparisons of Exact Levels for Chi-Squared Goodness-of-Fit Statistics, Journal of the American Statistical Association, 73:362, 253-263

To link to this article: http://dx.doi.org/10.1080/01621459.1978.10481567


Small-Sample Comparisons of Exact Levels for Chi-Squared Goodness-of-Fit Statistics

KINLEY LARNTZ*

* Kinley Larntz is Associate Professor, School of Statistics, University of Minnesota, St. Paul, MN 55108. Research for this article was supported in part by a Grant-in-Aid from the Graduate School, University of Minnesota and by Grant NIE-G-76-0094 from the National Institute of Education, U.S. Department of Health, Education, and Welfare. This article is based in part on the author's doctoral dissertation written under Professor Stephen E. Fienberg at the University of Chicago. Support for this research at the University of Chicago was provided by a National Defense Education Act Fellowship and a Warner-Lambert Research Institute Fellowship, with additional support from the Shell Companies Foundation, Inc. The author is indebted to Professor Fienberg for guidance and to Professors R.R. Bahadur, S.J. Haberman, D.L. Wallace, and the referees for valuable comments.

© Journal of the American Statistical Association, June 1978, Volume 73, Number 362, Applications Section

The small-sample properties of three goodness-of-fit statistics for the analysis of categorical data are examined with respect to the adequacy of the asymptotic chi-squared approximation. The approximate tests based on the likelihood ratio and Freeman-Tukey statistics yield exact levels that are typically in excess of the nominal levels for moderate expected values. In contrast, the Pearson statistic attains exact levels that are quite close to the nominal values. The reason for the large number of rejections for the likelihood ratio and Freeman-Tukey statistics is related to their handling of small observed counts.

KEY WORDS: Pearson chi-square; Likelihood ratio statistic; Freeman-Tukey statistic; Goodness of fit; Small sample; Categorical data.

1. INTRODUCTION

Several statistics are commonly used to judge the goodness of fit for counted data models. In this article three of these statistics will be compared with respect to their small-sample properties under the null hypothesis. The usual chi-squared statistic (Pearson statistic) is defined by

$$X^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}}. \tag{1.1}$$

A suggested alternative statistic that has some asymptotically optimal properties (Bahadur 1967; Hoeffding 1965) is the likelihood ratio statistic

$$G^2 = 2 \sum_{\text{all cells}} \text{observed} \, \log_e(\text{observed}/\text{expected}). \tag{1.2}$$

Another alternative is the Freeman-Tukey chi-squared statistic (Bishop 1969; Bishop, Fienberg, and Holland 1975; Freeman and Tukey 1950)

$$T^2 = \sum_{\text{all cells}} \left[ (\text{observed})^{1/2} + (\text{observed} + 1)^{1/2} - (4\,\text{expected} + 1)^{1/2} \right]^2. \tag{1.3}$$

Many statisticians prefer the use of $G^2$ or $T^2$, although among everyday users the Pearson statistic is by far the most popular. Also, some statisticians follow the practice of reporting two or more statistics (see, e.g., Goodman 1973), but little guidance is available concerning the meaning of large discrepancies between the statistics.

In Section 2 we introduce five models used for comparison. One model is the standard multinomial goodness of fit of a given probability vector. The second is a parametric model that arises naturally from data in a problem-solving experiment. The remaining three models were selected to check the generalizability of the results for the first two. Section 3 provides the small-sample comparison of the statistics, and in Section 4 particular properties of the statistics are examined in more detail. Conclusions are given in the final section.

2. THE MODELS

Comparisons among the statistics are made for five diverse models. Since the results are similar for all five models, we believe that the conclusions will apply to the entire range of counted data problems.

2.1 Multinomial Goodness of Fit

We consider first the simplest chi-squared test, which arises in testing the null hypothesis that the observed frequency vector $(n_1, n_2, \ldots, n_k)$ follows a multinomial distribution with specified probability vector $(p_1, p_2, \ldots, p_k)$. If the total sample size is $N = \sum_{i=1}^{k} n_i$, the expected cell frequencies are simply $Np_i$ and the chi-squared statistics may be calculated using (1.1)-(1.3).

There are several reasons for selecting this model to make comparisons:

1. This is the simplest counted data model, with the expecteds depending only on a prespecified probability vector. Comparisons are, therefore, made for a simple null hypothesis.

2. The amount of computation is not excessive. For $k = 2$ or 3 and $N \le 100$, exact computations are easily carried out. For larger numbers of cells, Monte Carlo procedures can be easily programmed.

3. By varying the probability vector $(p_1, p_2, \ldots, p_k)$, the entire range from skewed to equal cell probabilities can be considered. Previous studies (Cochran 1952; Tate and Hyer 1973; Yarnold 1970) have indicated that for small expected values the Pearson statistic does not follow the chi-squared distribution well, while some suggestion has been indicated (cf. Bliss 1967, p. 38) that the likelihood ratio statistic would be better in such situations. Recently, it has been suggested that for certain models the Freeman-Tukey statistic has true significance levels closer to the nominal levels than does either the Pearson or likelihood ratio statistic (Sylwester 1974).
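The three statistics (1.1)-(1.3) can be sketched in a few lines for the multinomial model of Section 2.1, where the expected counts are $Np_i$. This is a minimal illustration, not the author's FORTRAN code; the function names are hypothetical, and the handling of zero counts (a zero observed count adds nothing to $G^2$, and a cell with zero expected value is skipped) is an assumption of the sketch.

```python
import math


def pearson_x2(observed, expected):
    """Pearson statistic (1.1); cells with zero expected value are skipped (assumption)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def likelihood_ratio_g2(observed, expected):
    """Likelihood ratio statistic (1.2); a zero observed count contributes 0."""
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0 and e > 0
    )


def freeman_tukey_t2(observed, expected):
    """Freeman-Tukey statistic (1.3)."""
    return sum(
        (math.sqrt(o) + math.sqrt(o + 1) - math.sqrt(4 * e + 1)) ** 2
        for o, e in zip(observed, expected)
        if e > 0
    )


def multinomial_expected(n_total, probs):
    """Expected cell frequencies N * p_i for a specified probability vector."""
    return [n_total * p for p in probs]


if __name__ == "__main__":
    obs = [3, 7]
    exp = multinomial_expected(10, [0.5, 0.5])
    print(pearson_x2(obs, exp), likelihood_ratio_g2(obs, exp), freeman_tukey_t2(obs, exp))
```

Note that $T^2$ is not exactly zero even when observed and expected counts agree, because of the $(\text{observed} + 1)^{1/2}$ and $(4\,\text{expected} + 1)^{1/2}$ terms.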

2.2 Group Helping Model

Suppose that individuals or groups are given the opportunity to help an individual in distress. The degree of help is graded I, II, or III: I for not helping, III for actively helping, and II for an intermediate action. Staub (1970) describes such an experiment, for which Fienberg and Larntz (1971) proposed the following model. (Similar models are also used in component testing problems (Easterling and Prairie 1971).)

Consider the case where data were gathered for individuals and for groups of size two. Let $p_1$, $p_2$, and $p_3$ be the probabilities of observing an individual with help graded I, II, and III, respectively. Then if the individuals in a group act independently, and if only the higher grade of help is scored, $p_1^2$, $p_2^2 + 2p_1p_2$, and $p_3^2 + 2p_1p_3 + 2p_2p_3$ are the respective probabilities of observing I, II, and III for groups of size two.

1. Observed Table for $N_1$ Individuals and $N_2$ Groups

| Help grade | Individuals | Groups | Total |
|---|---|---|---|
| I | $n_{11}$ | $n_{12}$ | $R_1$ |
| II | $n_{21}$ | $n_{22}$ | $R_2$ |
| III | $n_{31}$ | $n_{32}$ | $R_3$ |
| Total | $N_1$ | $N_2$ | |

Suppose $N_1$ individuals and $N_2$ groups are tested. The results can be summarized in a 3 × 2 contingency table with column totals fixed as in Table 1. Under the preceding assumptions, $(n_{11}, n_{21}, n_{31})$ follows a multinomial distribution with probability vector $(p_1, p_2, p_3)$, and $(n_{12}, n_{22}, n_{32})$ follows a multinomial distribution with probability vector $(g_1, g_2, g_3)$ where

$$g_1 = p_1^2, \qquad g_2 = p_2^2 + 2p_1p_2, \qquad g_3 = p_3^2 + 2p_1p_3 + 2p_2p_3. \tag{2.1}$$

For this case the unique maximum likelihood estimates for $(p_1, p_2, p_3)$ can be written directly (Fienberg and Larntz 1971) as

$$\hat{p}_1 = \left(-n_{31} + [n_{31}^2 + 4ac]^{1/2}\right)/2a, \qquad \hat{p}_2 = r\hat{p}_1, \qquad \hat{p}_3 = 1 - (1 + r)\hat{p}_1, \tag{2.2}$$

where

$$r = \frac{n_{21} - 2n_{11} - 4n_{12} + \left[(2n_{11} + 4n_{12} - n_{21})^2 + 8(n_{21} + n_{22})(n_{11} + 2n_{12})\right]^{1/2}}{2(n_{11} + 2n_{12})}, \tag{2.3}$$

$$a = (1 + r)\left[(n_{11} + 2n_{12})(1 + r) + (n_{31} + 2n_{32}) + 2n_{22}(1 + r)/(2 + r)\right], \tag{2.4}$$

and

$$c = n_{11} + 2n_{12} + 2n_{22}/(2 + r). \tag{2.5}$$

(If the $i$th row total ($R_i$ in Table 1) is zero, the maximum likelihood estimates are derived conditional on the zero total. The estimates then become just the extension by continuity of the estimates given by (2.2) through (2.5).)
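The closed-form estimates (2.2) through (2.5) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, the relation $\hat{p}_2 = r\hat{p}_1$ is taken from the constraint that the three estimates sum to one, and the zero-row-total cases that the text handles "by continuity" are not implemented (the function simply raises an error when $n_{11} + 2n_{12} = 0$).

```python
import math


def group_probs(p1, p2, p3):
    """Group probabilities (g1, g2, g3) from (2.1)."""
    return (p1 ** 2, p2 ** 2 + 2 * p1 * p2, p3 ** 2 + 2 * p1 * p3 + 2 * p2 * p3)


def group_helping_mle(n11, n21, n31, n12, n22, n32):
    """Closed-form MLEs (2.2)-(2.5) for the group helping model.

    Assumption of this sketch: n11 + 2*n12 > 0. The extension by
    continuity for zero row totals is not handled here.
    """
    s = n11 + 2 * n12
    if s == 0:
        raise ValueError("n11 + 2*n12 must be positive in this sketch")
    b = n21 - 2 * n11 - 4 * n12
    r = (b + math.sqrt(b * b + 8 * (n21 + n22) * s)) / (2 * s)          # (2.3)
    a = (1 + r) * (s * (1 + r) + (n31 + 2 * n32)
                   + 2 * n22 * (1 + r) / (2 + r))                       # (2.4)
    c = s + 2 * n22 / (2 + r)                                           # (2.5)
    p1 = (-n31 + math.sqrt(n31 ** 2 + 4 * a * c)) / (2 * a)             # (2.2)
    return p1, r * p1, 1 - (1 + r) * p1


if __name__ == "__main__":
    # Counts that match the model exactly for p = (.2, .3, .5), N1 = N2 = 100.
    print(group_helping_mle(20, 30, 50, 4, 21, 75))
```

A useful sanity check is that counts proportional to the model probabilities return the generating probability vector, as in the usage line above.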

The selection of this model for comparing the chi-squared statistics provides several advantages:

1. The model depends upon two parameters, $p_1$ and $p_2$, and thus the goodness-of-fit test for the null hypothesis involves the estimation of these parameters. Comparisons can, therefore, be made for a composite null hypothesis.

2. Since the maximum likelihood estimates can be written down in closed form, iteration is not necessary for finding the estimates. This is important when considering the feasibility of doing large amounts of computation.

3. Examining (2.1), note that the probability of Help Grade I for groups is $p_1^2$. When $p_1$ is small, $p_1^2$ is quite small. Thus the selection of this model allows for comparisons of very skewed multinomials, which means comparisons can be made for small as well as moderate minimum cell expectations.

2.3 Three Additional Models

To validate the results derived from careful examination of the first two models, three additional models are considered. The first model involves the usual test for no association in an R × C contingency table with both margins fixed, i.e., hypergeometric sampling. The observed counts are denoted by $\{n_{ij};\ i = 1, \ldots, R;\ j = 1, \ldots, C\}$ and the estimated expected counts are taken as $n_{i+}n_{+j}/n_{++}$. Thus we will be examining rejection rates conditional on the margins as advocated by some statisticians (e.g., Bartlett 1935).
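As a small illustration of the estimated expected counts $n_{i+}n_{+j}/n_{++}$ for the no-association model, the following sketch computes them from a table given as a list of rows. The function name is hypothetical.

```python
def independence_expected(table):
    """Expected counts n_{i+} * n_{+j} / n_{++} for an R x C table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


if __name__ == "__main__":
    for row in independence_expected([[2, 4], [6, 8]]):
        print(row)
```

The fitted table reproduces both sets of margins, which is why the rejection rates for this model are conditional on the margins.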

The next model is that for quasi-independence in an incomplete two-way table. Specifically, we consider an observed data structure which consists of an I × I upper left triangular table, i.e., $\{n_{ij};\ i = 1, \ldots, I;\ j = 1, \ldots, I + 1 - i\}$. Estimated cell frequencies for the quasi-independence model can be computed directly from formulas given by Bishop and Fienberg (1969).

For the final model we consider no three-factor interaction in an I × J × K contingency table (Bartlett 1935). Cell expected values are computed via an iterative maximum likelihood procedure (iterative proportional fitting), as closed-form estimates are not possible. Computations for 2 × 2 × 2 and 3 × 2 × 2 tables have been reported by Odoroff (1970). Here we will consider 3 × 3 × 3 tables.

3. SMALL SAMPLE PROPERTIES UNDER THE NULL HYPOTHESIS

Under the null hypothesis, the goodness-of-fit statistics, $X^2$, $G^2$, and $T^2$, have asymptotic chi-squared distributions; however, for small samples the chi-squared approximation in many cases does not agree well with the exact distribution. Several studies (Cochran 1952; Fisher 1958; Roscoe and Byars 1971; Yarnold 1970) have given conflicting points of view as to the point at which the approximation is "reasonable" for the Pearson chi-squared statistic. "Standard rules" specify that the minimum cell expectation should be five, with a few cells possibly smaller. The emphasis here will not be on finding such a rule but in comparing the Pearson, likelihood ratio, and Freeman-Tukey statistics with regard to the approximation. In other words we ask, for small samples, which of the three statistics is best approximated by the asymptotic chi-squared distribution.

3.1 Results for the Multinomial Goodness-of-Fit Model

Both exact and Monte Carlo computations were carried out in the study of the multinomial goodness-of-fit model. For number of cells $k = 2$ and 3, exact enumeration was made of the small-sample distributions of the chi-squared statistics. For $k = 5$ and 10 cells, Monte Carlo methods (Hammersley and Handscomb 1964) were used to simulate the small-sample distributions.¹ One thousand trials were run for each null hypothesis and sample size. Table 2 lists the probability vectors used as null hypotheses in this part of the study. For each $k$, hypotheses ranging from symmetry through varying degrees of skewness were selected. For each null hypothesis, computations were carried out for sample sizes N = 10 (1) 100. The upper bound of 100 was picked as an appropriate "small sample" size for some of the more skewed cases, although it would not usually be considered as a small sample for symmetric hypotheses.

2. Cell Probabilities for Multinomial Goodness-of-Fit Computations

| Case | k = 2 | k = 3 | k = 5 | k = 10 |
|---|---|---|---|---|
| 1 | (.50,.50) | (1/3,1/3,1/3) | (.20,.20,.20,.20,.20) | (.10,.10,.10,.10,.10,.10,.10,.10,.10,.10) |
| 2 | (.35,.65) | (.30,.30,.40) | (.10,.20,.20,.20,.30) | (.05,.10,.10,.10,.10,.10,.10,.10,.10,.15) |
| 3 | (.25,.75) | (.20,.40,.40) | (.10,.10,.20,.30,.30) | (.05,.05,.10,.10,.10,.10,.10,.10,.15,.15) |
| 4 | (.15,.85) | (.20,.30,.50) | (.10,.10,.20,.20,.40) | (.05,.05,.05,.10,.10,.10,.10,.15,.15,.15) |
| 5 | (.10,.90) | (.20,.20,.60) | (.10,.10,.10,.30,.40) | (.05,.05,.05,.05,.10,.10,.15,.15,.15,.15) |
| 6 | (.05,.95) | (.10,.40,.50) | (.10,.10,.10,.20,.50) | (.05,.05,.05,.05,.05,.15,.15,.15,.15,.15) |
| 7 | (.04,.96) | (.10,.30,.60) | (.10,.10,.10,.10,.60) | (.05,.05,.05,.05,.05,.05,.15,.15,.20,.20) |
| 8 | (.03,.97) | (.10,.20,.70) | (.05,.2375,.2375,.2375,.2375) | (.05,.05,.05,.05,.05,.05,.05,.20,.20,.25) |
| 9 | (.02,.98) | (.10,.10,.80) | (.05,.05,.30,.30,.30) | (.05,.05,.05,.05,.05,.05,.05,.05,.30,.30) |
| 10 | (.01,.99) | (.05,.45,.50) | (.05,.05,.05,.425,.425) | (.05,.05,.05,.05,.05,.05,.05,.05,.05,.55) |
| 11 | | (.05,.05,.90) | (.05,.05,.05,.05,.80) | |
| 12 | | (.02,.49,.49) | | |
| 13 | | (.02,.02,.96) | | |

A. Exact Rejection Rates for Multinomial Null Hypotheses, 2 Cells. (Panels: Pearson, Likelihood, Freeman-Tukey; rows: hypotheses 1-10; columns: sample size, 20-100.)


¹ Computations were performed using FORTRAN programs on a CDC 6400 computer. Multinomials were generated from uniform random numbers by classifying the uniforms into one of $k$ categories using the boundaries $(0, p_1, p_1 + p_2, \ldots, 1)$. In the cases of tables with fixed margins, a University of Minnesota library subroutine PERMUTE was used to simulate hypergeometric sampling. The uniform random numbers used by PERMUTE and those used in generating the multinomials were produced by a multiplicative congruential generator using modulus 2^[exponent illegible] and multiplier 5^[exponent illegible].
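The classification scheme described in the footnote, in which each uniform random number is assigned to a cell by the boundaries $(0, p_1, p_1 + p_2, \ldots, 1)$, can be sketched as follows. This is an illustration under stated assumptions: it uses Python's built-in generator rather than the multiplicative congruential generator of the original study, and the function name is hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def multinomial_sample(n, probs, rng=random):
    """Draw one multinomial vector of size n by classifying n uniforms.

    Cell i receives a uniform u when it falls in
    [p_1 + ... + p_{i-1}, p_1 + ... + p_i).
    """
    bounds = list(accumulate(probs))          # p1, p1+p2, ..., (approximately) 1
    counts = [0] * len(probs)
    for _ in range(n):
        u = rng.random()                      # uniform on [0, 1)
        # The min() guards against the cumulative sum falling just short of 1.
        cell = min(bisect_right(bounds, u), len(probs) - 1)
        counts[cell] += 1
    return counts


if __name__ == "__main__":
    print(multinomial_sample(50, [0.2, 0.3, 0.5], random.Random(1)))
```

Repeating the draw many times (1,000 in the study) and counting how often a statistic exceeds its critical value gives the estimated rejection rate for one null hypothesis and sample size.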

Figures A, B, C, and D summarize the results for the multinomial model. For instance, Figure A contains plots of the exact level of the nominal 0.05 test as a function of sample size and hypothesis number (ordered, as in Table 2, by the probabilities of the least likely cells; no. 1 is symmetry). To compress the display only the higher exact level for each pair of sample sizes is recorded here.
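For two cells the exact level of a nominal .05 test can be obtained by summing binomial probabilities over the outcomes whose statistic exceeds the chi-squared critical value. The sketch below is a minimal illustration of that enumeration, not the original program; the function names are hypothetical and the critical value 3.841 (one degree of freedom) is the standard table value.

```python
import math

CRIT_05_1DF = 3.841  # upper .05 point of chi-squared with 1 degree of freedom


def pearson_x2(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def likelihood_ratio_g2(observed, expected):
    # A zero observed count contributes 0 to the sum.
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def exact_level_two_cells(n, p, statistic, crit=CRIT_05_1DF):
    """P(statistic > crit) under the binomial null with cell probabilities (p, 1 - p)."""
    expected = (n * p, n * (1 - p))
    level = 0.0
    for x in range(n + 1):
        if statistic((x, n - x), expected) > crit:
            level += math.comb(n, x) * p ** x * (1 - p) ** (n - x)
    return level


if __name__ == "__main__":
    for name, stat in (("X2", pearson_x2), ("G2", likelihood_ratio_g2)):
        print(name, exact_level_two_cells(10, 0.5, stat))
```

For N = 10 and the symmetric hypothesis (.50, .50), this enumeration gives 22/1024 ≈ .021 for $X^2$ and 112/1024 ≈ .109 for $G^2$, because $G^2$ also rejects at 2 and 8 successes.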


B. Exact Rejection Rates for Multinomial Null Hypotheses, 3 Cells. (Panels: Pearson, Likelihood, Freeman-Tukey; rows: hypotheses 1-13; columns: sample size, 20-100.)

C. Estimated Rejection Rates for Multinomial Null Hypotheses, 5 Cells. (Panels: Pearson, Likelihood, Freeman-Tukey; rows: hypotheses 1-11; columns: sample size, 20-100.)

D. Estimated Rejection Rates for Multinomial Null Hypotheses, 10 Cells. (Panels: Pearson, Likelihood, Freeman-Tukey; rows: hypotheses 1-10; columns: sample size, 20-100.)

The symbols in the Figures for each combination of sample size and hypothesis have the following interpretation:

| Symbol | Exact level range |
|---|---|
| 0 | 0-1 percent |
| 1 | 1-2 percent |
| 2 | 2-3 percent |
| (blank) | 3-7 percent |
| 7 | 7-8 percent |
| 8 | 8-9 percent |
| 9 | 9-10 percent |
| * | 10 or greater percent |

Thus a "7" indicates that the exact level (or estimated exact level for $k = 5$ and 10) is in the range 7 to 8 percent. A blank is good: the exact level is .05 ± .02.
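The symbol coding used in the figures can be written as a small lookup. This is only a sketch: the function name is hypothetical, the treatment of values that fall exactly on a class boundary (assigned to the upper class) is an assumption, and the asterisk for levels of 10 percent or more is read from the plotted characters rather than from a clean legend.

```python
def level_symbol(level):
    """Map an exact level (a proportion, e.g. 0.075) to its figure symbol."""
    pct = level * 100
    if pct >= 10:
        return "*"          # 10 percent or greater
    if pct < 1:
        return "0"
    if pct < 2:
        return "1"
    if pct < 3:
        return "2"
    if pct < 7:
        return " "          # blank: within .05 +/- .02
    return str(int(pct))    # "7", "8", or "9"


if __name__ == "__main__":
    print([level_symbol(x) for x in (0.005, 0.025, 0.05, 0.075, 0.12)])
```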

Looking at the four figures, one can see that the Pearson plots all have more blanks, indicating that the statistic is most often closest to the nominal level. In fact, the Pearson statistic appears to achieve the desired level in general when all cell expected values are greater than 1.0. The pictures for the likelihood ratio statistic are consistent in pattern, if not degree, for all four dimensions. When there are small cell expectations, $G^2$ does not reject often enough; for a middle range (specifically, cases with a number of cell expecteds in the interval 1.5-4.0), $G^2$ rejects much too often; and for larger sample sizes and less skewed cases, the exact level is close to the nominal level. The Freeman-Tukey statistic follows the general likelihood ratio pattern, but is much more conservative for cases with small expected values.

The minimum cell expectation (MCE) for each case can be computed as

$$\text{MCE} = N \times \min\{p_1, \ldots, p_k\}. \tag{3.1}$$

Some evidence has been given that the MCE governs the closeness of the small-sample distribution to asymptotic theory for several chi-squared problems (see, e.g., Cochran 1952, Cramer 1945, Odoroff 1970, Yarnold 1970). For each value of $k$, the cases were divided into classes according to their minimum cell expectations. Table 3 gives the range of actual small-sample rejection rates when the nominal .05 level test is used. For example, for $k = 3$ the statistics are asymptotically distributed as chi-squared variables with two degrees of freedom and the asymptotic .05 critical value is 5.991. The values in Table 3 for $k = 3$ give the minimum and maximum of

$$\alpha_{.05}(p_1, p_2, p_3) = P(\text{statistic} > 5.991) \tag{3.2}$$

for all cases with minimum cell expecteds within the class boundaries. Although the spread is quite large for many classes, it is clear that for smaller minimum cell expected values ($1 \le \text{MCE} \le 4$) the Pearson statistic has an exact size closer to the nominal .05 than either the likelihood ratio or Freeman-Tukey statistic. In fact, looking again at Figures A, B, C, and D, we see that when there are several expecteds equal to the MCE, the Pearson statistic looks even better relative to the likelihood ratio and Freeman-Tukey statistics. For larger MCEs, as noted before, there does not appear to be a great difference between the three statistics.

3. Minimum and Maximum Significance Levels for Nominal .05 Tests for Multinomial Goodness-of-Fit Model

| Number of cells | Min. cell expectation | $X^2$ | $G^2$ | $T^2$ |
|---|---|---|---|---|
| 2 | .50-1.49 | (.018,.105) | (.007,.043) | (.002,.026) |
| | 1.50-1.99 | (.020,.071) | (.011,.170) | (.007,.155) |
| | 2.00-2.49 | (.022,.065) | (.091,.148) | (.091,.148) |
| | 2.50-2.99 | (.020,.060) | (.062,.105) | (.059,.095) |
| | 3.00-3.99 | (.022,.086) | (.030,.079) | (.021,.068) |
| | 4.00-4.99 | (.025,.076) | (.026,.095) | (.023,.085) |
| | 5.00-6.99 | (.021,.071) | (.038,.109) | (.022,.079) |
| | 7.00-9.99 | (.031,.077) | (.031,.077) | (.031,.077) |
| | 10.00- | (.031,.078) | (.035,.078) | (.034,.078) |
| 3 | .50-.99 | (.027,.090) | (.012,.056) | (.001,.037) |
| | 1.00-1.99 | (.034,.074) | (.022,.090) | (.006,.068) |
| | 2.00-2.49 | (.030,.062) | (.040,.086) | (.033,.111) |
| | 2.50-3.49 | (.034,.060) | (.053,.114) | (.057,.139) |
| | 3.50-4.99 | (.038,.070) | (.044,.078) | (.031,.072) |
| | 5.00-5.99 | (.041,.061) | (.047,.074) | (.044,.072) |
| | 6.00-9.99 | (.041,.060) | (.039,.066) | (.039,.059) |
| | 10.00- | (.044,.060) | (.046,.062) | (.041,.058) |
| 5* | .50-.99 | (.035,.095) | (.031,.060) | (.008,.043) |
| | 1.00-1.49 | (.033,.074) | (.021,.080) | (.005,.075) |
| | 1.50-2.49 | (.031,.074) | (.038,.105) | (.038,.095) |
| | 2.50-2.99 | (.036,.071) | (.044,.082) | (.045,.085) |
| | 3.00-3.99 | (.031,.073) | (.042,.098) | (.045,.112) |
| | 4.00-4.99 | (.033,.068) | (.041,.098) | (.032,.092) |
| | 5.00- | (.033,.068) | (.037,.074) | (.031,.071) |
| 10* | .50-.99 | (.036,.088) | (.012,.086) | (.000,.058) |
| | 1.00-1.49 | (.028,.072) | (.025,.101) | (.000,.084) |
| | 1.50-1.99 | (.028,.071) | (.058,.107) | (.027,.093) |
| | 2.00-2.99 | (.029,.074) | (.051,.108) | (.045,.100) |
| | 3.00-3.99 | (.033,.071) | (.047,.094) | (.043,.091) |
| | 4.00-4.99 | (.030,.071) | (.034,.096) | (.028,.099) |
| | 5.00- | (.033,.063) | (.041,.075) | (.037,.074) |

* Monte Carlo results based on 1,000 trials per case.

The basic conclusion from these computations is that a P value based on the asymptotic chi-squared approximation is "on average" about right for the Pearson statistic, but is understated for the likelihood ratio and Freeman-Tukey statistics when there are cell expectations in the 1.5-4.0 range. Further exploration of the problems associated with the likelihood ratio and Freeman-Tukey statistics is given in Section 4. It should be noted that results similar to the above are found if other nominal levels (e.g., .10 or .01) are examined. Also, these results are consistent with those obtained by Good, Gover, and Mitchell (1970) for the special case of symmetric multinomial distributions.

3.2 Results for the Parametric Model

The model presented in Section 2.2 assumes that the data consist of two independent trinomials: The individuals have probability vector $(p_1, p_2, p_3)$ and the pairs have probability vector $(g_1, g_2, g_3)$, where the $g$'s are given by (2.1). The maximum likelihood estimates for sample sizes $N_1$ and $N_2$ are given by (2.2) through (2.5). The number of possible outcomes for two trinomials with sample sizes $N_1$ and $N_2$ is given (see Feller 1957) by

$$\text{Outcomes} = \binom{N_1 + 2}{2}\binom{N_2 + 2}{2}. \tag{3.3}$$

For $N_1 = N_2 = 8$, the number of possible outcomes is 2,025. For the 36 values of $(p_1, p_2, p_3)$ listed in Table 4 and for $N_1 = N_2 = 4$, 6, 8, 12, and 16, the distributions of $X^2$, $G^2$, and $T^2$ were determined by enumeration on the computer.

4. Individual Probabilities for Group Helping Model Computations

| Individual probabilities | | | |
|---|---|---|---|
| (.1,.1,.8) | (.1,.2,.7) | (.1,.3,.6) | (.1,.4,.5) |
| (.1,.5,.4) | (.1,.6,.3) | (.1,.7,.2) | (.1,.8,.1) |
| (.2,.1,.7) | (.2,.2,.6) | (.2,.3,.5) | (.2,.4,.4) |
| (.2,.5,.3) | (.2,.6,.2) | (.2,.7,.1) | (.3,.1,.6) |
| (.3,.2,.5) | (.3,.3,.4) | (.3,.4,.3) | (.3,.5,.2) |
| (.3,.6,.1) | (.4,.1,.5) | (.4,.2,.4) | (.4,.3,.3) |
| (.4,.4,.2) | (.4,.5,.1) | (.5,.1,.4) | (.5,.2,.3) |
| (.5,.3,.2) | (.5,.4,.1) | (.6,.1,.3) | (.6,.2,.2) |
| (.6,.3,.1) | (.7,.1,.2) | (.7,.2,.1) | (.8,.1,.1) |

One question that arises in the use of this method is how to deal with zero cell counts and zero expected values. As indicated previously, the maximum likelihood estimates were extended by continuity to provide well-defined procedures. In the same manner, when a cell had a zero expected value, it contributed zero to the chi-squared statistics.

For $N_1 = N_2 = 8$, Figure E gives a contour plot of the exact size for the Pearson statistic when the nominal .05 level test is used. Barycentric coordinates were chosen to represent the three probabilities (see Mosteller and Tukey 1968). Each corner of the right triangle represents one of the probability vectors (1, 0, 0), (0, 1, 0), and (0, 0, 1), while a general point in the triangle corresponds to the probability vector $(p_1, p_2, p_3)$. Figures F and G give similar plots for the likelihood ratio and Freeman-Tukey statistics. Figures F and G show that for $N_1 = N_2 = 8$, both $G^2$ and $T^2$ reject the null hypothesis more often than the nominal .05 level. Figure E for $X^2$ shows that the Pearson statistic does not reject too often and, in fact, taking the size for a composite null hypothesis as the maximum over the possible parameter values of the probability of rejection, the size is about .048. Table 5 gives the corresponding maxima for all statistics for $N_1 = N_2 = 4$, 6, 8, 12, and 16 and for nominal levels .01, .05, and .10.

E. Level of Significance of Pearson Chi-Squared .05 Test for $N_1 = N_2 = 8$, Group Helping Model.ᵃ (Contour plot in barycentric coordinates; corners (1,0,0), (0,0,1), and (0,1,0).)

F. Level of Significance of Likelihood Ratio Chi-Squared .05 Test for $N_1 = N_2 = 8$, Group Helping Model.ᵃ (Contour plot in barycentric coordinates; corners (1,0,0), (0,0,1), and (0,1,0).)

G. Level of Significance of Freeman-Tukey Chi-Squared .05 Test for $N_1 = N_2 = 8$, Group Helping Model.ᵃ (Contour plot in barycentric coordinates; corners (1,0,0), (0,0,1), and (0,1,0).)

ᵃ Contour plots drawn by computer program written by Daniel Laliberte under the supervision of Christopher Bingham, School of Statistics, University of Minnesota.

5. Maximum Rejection Rates for the Group Helping Model

| Nominal level | $N_1 = N_2$ | $X^2$ | $G^2$ | $T^2$ |
|---|---|---|---|---|
| .10 | 4 | .0999 | .1723 | .1091 |
| | 6 | .1159 | .1711 | .1269 |
| | 8 | .1083 | .1613 | .1302 |
| | 12 | .1063 | .1528 | .1278 |
| | 16 | .1053 | .1567 | .1254 |
| .05 | 4 | .0351 | .0802 | .0675 |
| | 6 | .0412 | .0961 | .0679 |
| | 8 | .0483 | .0907 | .0695 |
| | 12 | .0491 | .0804 | .0682 |
| | 16 | .0494 | .0729 | .0636 |
| .01 | 4 | .0036 | .0112 | .0077 |
| | 6 | .0054 | .0170 | .0145 |
| | 8 | .0077 | .0194 | .0171 |
| | 12 | .0088 | .0186 | .0172 |
| | 16 | .0092 | .0174 | .0163 |

The results from Figures E, F, and G as well as Table 5 show that the Pearson statistic has exact sizes that are close to the nominal values for $N_1 = N_2 \ge 8$ and is conservative for smaller sample sizes. For the sample sizes considered the likelihood ratio statistic has exact sizes that are 46-92 percent higher than the nominal .05; for the Freeman-Tukey statistic the exact sizes are 27-39 percent too high. A similar picture is found for other levels with the exception that the Freeman-Tukey statistic is conservative for small sample sizes at the nominal .01 level.

Comparing the preceding results with those given previously for the single multinomial, similar patterns are noted for each of the three statistics although the two situations are quite different, and in fact, the minimum cell expectations for some of the parametric model cases are as low as $N_2 p_1^2$. It must be noted, however, that the maximum rejection rates in the present case were attained in the middle of the probability triangle, as seen in Figures E, F, and G for $N_1 = N_2 = 8$.
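The outcome count for two independent trinomials in Section 3.2, the product of $\binom{N_1+2}{2}$ and $\binom{N_2+2}{2}$, can be checked against a direct listing of the possible tables. The sketch below is illustrative; the function names are hypothetical.

```python
from math import comb


def n_outcomes(n1, n2):
    """Number of possible outcomes for two trinomials with sizes n1 and n2."""
    return comb(n1 + 2, 2) * comb(n2 + 2, 2)


def trinomial_outcomes(n):
    """All count vectors (a, b, c) with a + b + c = n and nonnegative entries."""
    return [(a, b, n - a - b) for a in range(n + 1) for b in range(n + 1 - a)]


if __name__ == "__main__":
    print(n_outcomes(8, 8), len(trinomial_outcomes(8)) ** 2)
```

Pairing every individual table with every group table in this way is the enumeration over which the exact distributions of the statistics would be accumulated, each outcome weighted by the product of its two trinomial probabilities.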

3.3 Results for Three Additional Models

To validate the results of the previous two sections, Monte Carlo experiments were conducted for three additional models. The first case considered was the model of no association in a 3 × 5 table with both margins fixed. Ten thousand random trials were run for each combination of three factors:

(a) row margin in proportion to (2, 3, 5) or (3, 3, 4);
(b) column margin in proportion to (2, 3, 4, 5, 6) or (1, 1, 1, 1, 1); and
(c) total sample size 20, 40, 60, 80, 100.

The results for nominal .05 tests are given in Table 6 and confirm the previous conclusions. The Pearson statistic is remarkably stable with rejection rates close to or slightly below .05. Both the likelihood ratio and Freeman-Tukey statistics reject more often than desired, with the likelihood ratio rejection rates for the smaller sample sizes running 10-14 percent.

6. Rejection Rates for 3 × 5 Fixed Margins Model

| Row margins proportional to: | Column margins proportional to: | Statistic | N = 20 | N = 40 | N = 60 | N = 80 | N = 100 |
|---|---|---|---|---|---|---|---|
| 2:3:5 | 2:3:4:5:6 | $X^2$ | .0369 | .0428 | .0456 | .0465 | .0486 |
| | | $G^2$ | .1189 | .1016 | .0878 | .0809 | .0734 |
| | | $T^2$ | .0405 | .0746 | .0717 | .0677 | .0643 |
| 2:3:5 | 1:1:1:1:1 | $X^2$ | .0425 | .0444 | .0460 | .0497 | .0476 |
| | | $G^2$ | .1383 | .1121 | .0828 | .0781 | .0682 |
| | | $T^2$ | .0652 | .0847 | .0729 | .0698 | .0630 |
| 3:3:4 | 2:3:4:5:6 | $X^2$ | .0369 | .0423 | .0453 | .0490 | .0497 |
| | | $G^2$ | .1187 | .1046 | .0886 | .0734 | .0690 |
| | | $T^2$ | .0628 | .0799 | .0748 | .0637 | .0598 |
| 3:3:4 | 1:1:1:1:1 | $X^2$ | .0409 | .0506 | .0472 | .0543 | .0490 |
| | | $G^2$ | .1449 | .1042 | .0799 | .0731 | .0653 |
| | | $T^2$ | .0724 | .0953 | .0728 | .0608 | .0539 |

NOTE: Values are based on 10,000 trials with the same trials used for $X^2$, $G^2$, and $T^2$. Approximate standard error (based on true level of .05) for each value is .0022.
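Sampling a table with both margins fixed can be sketched by randomly permuting column labels and dealing them out to the rows, which is one standard way to simulate hypergeometric sampling. This is an assumption-laden illustration: it is not the PERMUTE subroutine used in the study, and the function names are hypothetical. The second function reproduces the standard error quoted in the note to Table 6.

```python
import math
import random


def fixed_margin_table(row_totals, col_totals, rng=random):
    """Random R x C table with the given margins, via a random permutation."""
    if sum(row_totals) != sum(col_totals):
        raise ValueError("row and column totals must have the same sum")
    # One column label per observation, shuffled, then dealt to rows in order.
    labels = [j for j, c in enumerate(col_totals) for _ in range(c)]
    rng.shuffle(labels)
    table = [[0] * len(col_totals) for _ in row_totals]
    pos = 0
    for i, r in enumerate(row_totals):
        for j in labels[pos:pos + r]:
            table[i][j] += 1
        pos += r
    return table


def binomial_se(level, trials):
    """Standard error of an estimated rejection rate with the given true level."""
    return math.sqrt(level * (1 - level) / trials)


if __name__ == "__main__":
    # N = 20 with row margin 2:3:5 and column margin 1:1:1:1:1.
    print(fixed_margin_table([4, 6, 10], [4, 4, 4, 4, 4], random.Random(0)))
    print(round(binomial_se(0.05, 10000), 4))
```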

For further validation, 5 × 5 incomplete contingency tables were generated under the hypothesis of quasi-independence. The cases considered included combinations of row and column marginal parameters taken proportional to (1, 1, 1, 1, 1), (1, 2, 3, 2, 2), (1, 1, 3, 3, 2), (1, 1, 5, 2, 1), or (1, 1, 6, 1, 1). The samples were drawn from single multinomials with probabilities following the quasi-independence model with the above row and column parameters. Ten thousand random trials were generated for each row-column combination for total sample sizes 15, 22, 32, 47, 68, and 100. The results for nominal .05 tests are presented in Table 7. Again the Pearson statistic rejection rates are stable at or slightly below .05 except for the smallest sample sizes where the level sometimes goes below three percent. The likelihood ratio and Freeman-Tukey behavior is quite similar to that found previously; namely, conservative when there are many small expecteds and liberal rejection when there is a preponderance of expected values in the 1.0-4.0 range. Again the likelihood ratio statistic overstates the level somewhat more than does the Freeman-Tukey statistic, while $T^2$ is more conservative for very small expected values.

Finally, we consider the model of no three-factor interaction in a 3 × 3 × 3 contingency table. Results for four cases are given in Table 8. For each case and sample size, 2,000 random trials were generated satisfying the model of no association between the three fixed margins; from these data, tests of no three-factor interaction were calculated. The conclusions from this experiment are basically the same as the others with the relative good standing of the Pearson statistic evident in all but the sparsest situations (i.e., N = 20 for a 3 × 3 × 3 = 27 cell table).
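Expected values under no three-factor interaction have no closed form and are obtained by iterative proportional fitting, as noted in Section 2.3. The following is a minimal sketch of that procedure: it repeatedly rescales the fitted table to match each observed two-way margin in turn. The function name is hypothetical, and running a fixed number of cycles instead of testing for convergence is a simplification of this sketch.

```python
def ipf_no_three_factor(n, cycles=200):
    """Fitted values for the no-three-factor-interaction model.

    n is an I x J x K table given as nested lists; the fit matches all
    three observed two-way margins after convergence.
    """
    I, J, K = len(n), len(n[0]), len(n[0][0])
    m = [[[1.0] * K for _ in range(J)] for _ in range(I)]

    def scale(groups):
        # Each group lists the cells that share one two-way margin total.
        for cells in groups:
            obs = sum(n[i][j][k] for i, j, k in cells)
            fit = sum(m[i][j][k] for i, j, k in cells)
            if fit > 0:
                factor = obs / fit
                for i, j, k in cells:
                    m[i][j][k] *= factor

    ij = [[(i, j, k) for k in range(K)] for i in range(I) for j in range(J)]
    ik = [[(i, j, k) for j in range(J)] for i in range(I) for k in range(K)]
    jk = [[(i, j, k) for i in range(I)] for j in range(J) for k in range(K)]
    for _ in range(cycles):
        scale(ij)
        scale(ik)
        scale(jk)
    return m


if __name__ == "__main__":
    table = [[[10, 5], [4, 8]], [[6, 9], [7, 11]]]
    for layer in ipf_no_three_factor(table):
        print(layer)
```

The fitted values returned here would then be used as the expected counts in $X^2$, $G^2$, and $T^2$.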

3.4 Summary of Computation Results

Comparing the results for the five models, similar patterns are noted for each of the three statistics. The Pearson statistic attains exact levels that are close to the nominal levels for a wide range of sample sizes and parameter values. The Type I error rates of the approximate chi-squared test based on the likelihood ratio statistic are too high for both simple and composite null hypotheses with moderate cell expectations. This result is identical to that presented by Margolin and Light (1974) for two-way tables. For the Freeman-Tukey statistic it appears that the chi-squared approximation works slightly better for composite null cases than for simple multinomials. This is true at least in comparing T² with the likelihood ratio statistic G². However, T² does indeed have higher than nominal Type I error rates for both simple and composite null hypotheses.

In summary, the Pearson chi-square is the best statistic in terms of having Type I error rates that are closest to the nominal levels based on the asymptotic chi-squared approximation. This result holds for simple multinomial null hypotheses as well as for a variety of composite null hypotheses. The next section will offer an explanation of the results for the likelihood ratio and Freeman-Tukey statistics.
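The kind of comparison summarized here can be sketched for the simplest case, a completely specified multinomial null hypothesis. The code below is a minimal illustration, not the program used for the paper: it assumes the standard forms of the three statistics (with T² = Σ(√x + √(x+1) − √(4e+1))², which agrees with the values in Table 9), five equiprobable cells, N = 20, and 2,000 trials. The function names and the seed are arbitrary choices.

```python
import math
import random

CRIT_05_DF4 = 9.48773  # upper .05 point of chi-squared with 4 degrees of freedom


def pearson(obs, exp):
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


def likelihood_ratio(obs, exp):
    # A zero count contributes 0, the limit of o*log(o/e) as o -> 0.
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(obs, exp) if o > 0)


def freeman_tukey(obs, exp):
    return sum((math.sqrt(o) + math.sqrt(o + 1) - math.sqrt(4 * e + 1)) ** 2
               for o, e in zip(obs, exp))


def rejection_rates(probs, n, trials, crit, seed=0):
    """Monte Carlo rejection rates, using the same samples for all three tests."""
    rng = random.Random(seed)
    exp = [n * p for p in probs]
    cells = list(range(len(probs)))
    rejects = {"X2": 0, "G2": 0, "T2": 0}
    for _ in range(trials):
        obs = [0] * len(probs)
        for c in rng.choices(cells, weights=probs, k=n):
            obs[c] += 1
        rejects["X2"] += pearson(obs, exp) > crit
        rejects["G2"] += likelihood_ratio(obs, exp) > crit
        rejects["T2"] += freeman_tukey(obs, exp) > crit
    return {name: count / trials for name, count in rejects.items()}


rates = rejection_rates([0.2] * 5, n=20, trials=2000, crit=CRIT_05_DF4)
print(rates)
```

Using the same simulated samples for all three statistics mirrors the design described in the table notes, so differences between the rates reflect the statistics rather than sampling noise between runs.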



7. Rejection Rates for Incomplete 5 × 5 Quasi-Independence Model

Row parameters    Column parameters             Sample size
proportional to:  proportional to:   Statistic  15     22     32     47     68     100

1:1:1:1:1         1:1:1:1:1          X²         .0276  .0327  .0385  .0480  .0448  .0476
                                     G²         .0700  .0994  .1037  .0983  .0755  .0602
                                     T²         .0202  .0476  .0705  .0800  .0667  .0522

1:2:3:2:2         1:1:1:1:1          X²         .0275  .0381  .0409  .0458  .0474  .0464
                                     G²         .0510  .0800  .0902  .0907  .0806  .0692
                                     T²         .0165  .0427  .0581  .0678  .0661  .0623

1:1:3:3:2         1:1:1:1:1          X²         .0281  .0367  .0405  .0434  .0468  .0481
                                     G²         .0413  .0692  .0843  .0871  .0820  .0761
                                     T²         .0110  .0290  .0483  .0609  .0693  .0693

1:1:5:2:1         1:1:1:1:1          X²         .0235  .0291  .0361  .0443  .0435  .0477
                                     G²         .0267  .0534  .0746  .0872  .0835  .0801
                                     T²         .0065  .0187  .0342  .0561  .0669  .0738

1:1:6:1:1         1:1:1:1:1          X²         .0235  .0245  .0319  .0389  .0428  .0418
                                     G²         .0206  .0385  .0627  .0882  .0924  .0788
                                     T²         .0047  .0110  .0243  .0544  .0743  .0711

1:2:3:2:2         1:2:3:2:2          X²         .0312  .0420  .0425  .0437  .0476  .0462
                                     G²         .0433  .0715  .0795  .0847  .0801  .0716
                                     T²         .0113  .0322  .0465  .0595  .0650  .0612

1:1:3:3:2         1:2:3:2:2          X²         .0344  .0393  .0440  .0463  .0467  .0471
                                     G²         .0352  .0605  .0788  .0835  .0794  .0755
                                     T²         .0085  .0217  .0447  .0574  .0625  .0656

1:1:5:2:1         1:2:3:2:2          X²         .0319  .0398  .0448  .0473  .0469  .0472
                                     G²         .0228  .0470  .0695  .0828  .0835  .0745
                                     T²         .0046  .0152  .0318  .0510  .0624  .0643

1:1:6:1:1         1:2:3:2:2          X²         .0306  .0341  .0392  .0403  .0471  .0437
                                     G²         .0164  .0320  .0526  .0713  .0831  .0768
                                     T²         .0039  .0078  .0203  .0384  .0594  .0614

1:1:3:3:2         1:1:3:3:2          X²         .0464  .0524  .0514  .0497  .0505  .0501
                                     G²         .0346  .0673  .0791  .0817  .0807  .0745
                                     T²         .0068  .0249  .0416  .0524  .0623  .0619

1:1:5:2:1         1:1:3:3:2          X²         .0422  .0510  .0523  .0483  .0502  .0450
                                     G²         .0255  .0465  .0657  .0740  .0811  .0713
                                     T²         .0048  .0134  .0292  .0395  .0571  .0609

1:1:6:1:1         1:1:3:3:2          X²         .0375  .0392  .0460  .0430  .0477  .0485
                                     G²         .0194  .0315  .0528  .0631  .0771  .0804
                                     T²         .0034  .0088  .0223  .0338  .0502  .0628

1:1:5:2:1         1:1:5:2:1          X²         .0444  .0503  .0511  .0481  .0466  .0481
                                     G²         .0165  .0310  .0531  .0646  .0734  .0791
                                     T²         .0027  .0092  .0183  .0307  .0403  .0570

1:1:6:1:1         1:1:5:2:1          X²         .0385  .0494  .0468  .0435  .0433  .0459
                                     G²         .0103  .0242  .0319  .0480  .0634  .0773
                                     T²         .0014  .0063  .0095  .0205  .0324  .0521

1:1:6:1:1         1:1:6:1:1          X²         .0379  .0454  .0495  .0422  .0432  .0419
                                     G²         .0072  .0160  .0282  .0358  .0548  .0701
                                     T²         .0009  .0038  .0112  .0116  .0205  .0427

NOTE: Values are based on 10,000 trials with the same trials used for X², G², and T². Approximate standard error (based on true level of .05) for each value is .0022.

8. Rejection Rates for 3 × 3 × 3 No Three-Factor Interaction Model

Row margins       Column margins    Layer margins                Sample size
proportional to:  proportional to:  proportional to:  Statistic  20     40     60     80     100

2:3:5             2:3:5             2:3:5             X²         .0175  .0550  .0585  .0845  .0690
                                                      G²         .0190  .0885  .1125  .1160  .1265
                                                      T²         .0010  .0175  .0475  .0800  .0775

2:3:5             2:3:5             6:6:7             X²         .0435  .0820  .0650  .0740  .0485
                                                      G²         .0335  .1220  .1375  .1340  .0925
                                                      T²         .0025  .0370  .0635  .0835  .0630

2:3:5             6:6:7             6:6:7             X²         .0440  .0710  .0675  .0700  .0585
                                                      G²         .0575  .1410  .1475  .1275  .1035
                                                      T²         .0025  .0400  .0665  .0875  .0725

6:6:7             6:6:7             6:6:7             X²         .0825  .0870  .0855  .0625  .0680
                                                      G²         .0950  .1740  .1660  .1140  .1120
                                                      T²         .0045  .0820  .1085  .0805  .0840

NOTE: Values are based on 2,000 trials with the same trials used for X², G², and T². Approximate standard error (based on true level of .05) for each value is .0049.

4. EFFECTS OF VERY SMALL COUNTS ON THE CHI-SQUARED STATISTICS

In the last section it was concluded that for moderate expected values, the Pearson statistic attained an exact level closer to the nominal than did either the likelihood ratio or Freeman-Tukey statistics. The principal thesis of this section is that this discrepancy in behavior is due to the differing influence given to very small observed counts by the statistics.

Consider the case where a table has a cell with an observed value of 0. If the expected value is positive, then each of the three statistics X², G², and T² will give some weight to this discrepancy between the observed and expected cell frequencies. The minimum contributions to the three statistics caused by a zero count can be calculated; these minimum contributions are given in Table 9 for a range of expected cell frequencies. Thus when a zero count appears with an expected cell frequency of 2.0, X² is at least 2.00, and G² and T² are at least 4.00. The last column in Table 9 gives the probabilities of a zero observed value for a Poisson random variable with mean equal to the corresponding expected cell frequency. Under multinomial sampling with a large sample size, this is a good approximation to the probability of getting an observed zero in that cell. Table 10 gives similar minimum contributions and Poisson probabilities for the influence of an observed count of 1.0.

For the likelihood ratio statistic, the minimum contribution for a zero or one count occurs when the remaining counts are spread over the other cells in exact proportion to the expected values of those cells. Exact formulas are given in the footnotes to Tables 9 and 10.

9. Minimum Contribution to Chi-Square for an Observed Count of Zero

Cell          Minimum contributions     Prob (zero count)
expectation   X²      G²      T²        under Poisson

1.0           1.00    2.00    1.53      .36788
1.5           1.50    3.00    2.71      .22313
2.0           2.00    4.00    4.00      .13534
2.5           2.50    5.00    5.37      .08208
3.0           3.00    6.00    6.79      .04979
3.5           3.50    7.00    8.25      .03020
4.0           4.00    8.00    9.75      .01832
4.5           4.50    9.00    11.28     .01111
5.0           5.00    10.00   12.83     .00674

NOTE: Minimum contribution for G² is $\lim_{n \to \infty} 2n \log\bigl(n/(n - \mathrm{exp})\bigr) = 2\,\mathrm{exp}$.

10. Minimum Contributions to Chi-Square for an Observed Count of One

Cell          Minimum contributions     P(Zero or one count)
expectation   X²      G²      T²        under Poisson

2.0           0.50    0.61    0.34      .40601
2.5           0.90    1.17    0.81      .28730
3.0           1.33    1.80    1.42      .19915
3.5           1.79    2.49    2.13      .13589
4.0           2.25    3.23    2.92      .09158
4.5           2.72    3.99    3.78      .06110
5.0           3.20    4.78    4.70      .04043
5.5           3.68    5.59    5.67      .02657
6.0           4.17    6.42    6.69      .01735
6.5           4.65    7.26    7.74      .01127
7.0           5.14    8.11    8.83      .00730

NOTE: Minimum contribution for G² is $\lim_{n \to \infty} \bigl[ 2\log(1/\mathrm{exp}) + 2(n - 1)\log\bigl((n - 1)/(n - \mathrm{exp})\bigr) \bigr] = -2\log(\mathrm{exp}) + 2(\mathrm{exp} - 1)$.

Looking at the case of a zero observed count with expected cell frequency of 2.0, the corresponding Poisson probability of a zero count is 0.13534. In the special case of a chi-squared test with one degree of freedom, the nominal .05 level test would reject for statistic values in excess of 3.84146. Thus both G² and T² would have exact levels somewhat in excess of 0.13534 compared to a nominal value of .05. For chi-squared tests with more degrees of freedom, the effects are not as strong, but nonetheless, Tables 9 and 10 illustrate the general pattern that very small observed counts increase G² and T² to a much greater extent than X². In addition, for expected cell frequencies in the range of 2.0-5.0, zero and one counts are very common occurrences as shown by the Poisson approximation.

H. Level of Significance for Pearson Chi-Squared .05 Test, Binomial H0: p = .05.
[Figure H: significance level (vertical axis) plotted against sample size, 0 to 100 (horizontal axis).]

To illustrate the preceding statements, Figures H, I, and J give the exact levels for the nominal .05 tests based on X², G², and T², respectively, for the case of a binomial null hypothesis with cell probabilities of (.05, .95). In each figure exact levels were computed for N = 10 (1) 100; thus minimum cell expectations were .50 (.05) 5.00. Figure H for the Pearson statistic X² shows that the exact level fluctuates in a fairly regular pattern around the nominal value of .05. In contrast, an examination of Figures I and J reveals that G² has two fairly large increases in exact level at N = 38 and N = 87, while T² has large increases at N = 38 and N = 89.
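The minimum contributions and the binomial exact levels discussed in this section can be recomputed with a few lines of code. The sketch below is an illustration under stated assumptions rather than the original program: it uses the limiting formulas from the table notes, the Freeman-Tukey form T² = Σ(√x + √(x+1) − √(4e+1))², the one-degree-of-freedom critical value 3.84146 quoted in the text, and direct enumeration of the binomial distribution. All function names are hypothetical.

```python
import math

CRIT_05_DF1 = 3.84146  # critical value quoted in the text for one degree of freedom


def min_contributions(count, e):
    """Limiting minimum contributions (X2, G2, T2) for an observed count of
    0 or 1 in a cell with expectation e, following the table notes."""
    x2 = (count - e) ** 2 / e
    g2 = 2 * e if count == 0 else -2 * math.log(e) + 2 * (e - 1)
    t2 = (math.sqrt(count) + math.sqrt(count + 1) - math.sqrt(4 * e + 1)) ** 2
    return x2, g2, t2


def statistics(x, n, p):
    """X2, G2, T2 for x observations in the small cell out of n, H0: p."""
    obs, exp = (x, n - x), (n * p, n * (1 - p))
    x2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    g2 = 2 * sum(o * math.log(o / e) for o, e in zip(obs, exp) if o > 0)
    t2 = sum((math.sqrt(o) + math.sqrt(o + 1) - math.sqrt(4 * e + 1)) ** 2
             for o, e in zip(obs, exp))
    return x2, g2, t2


def exact_levels(n, p=0.05, crit=CRIT_05_DF1):
    """Exact rejection probabilities (X2, G2, T2) under H0, by enumeration."""
    levels = [0.0, 0.0, 0.0]
    for x in range(n + 1):
        prob = math.comb(n, x) * p ** x * (1 - p) ** (n - x)
        for i, stat in enumerate(statistics(x, n, p)):
            if stat > crit:
                levels[i] += prob
    return tuple(levels)


def first_rejecting_n(count, index, p=0.05, crit=CRIT_05_DF1):
    """Smallest N in 10..100 at which the given small-cell count is rejected."""
    for n in range(10, 101):
        if statistics(count, n, p)[index] > crit:
            return n
    return None


print([round(v, 2) for v in min_contributions(0, 2.0)])
for name, index in (("G2", 1), ("T2", 2)):
    print(name, first_rejecting_n(0, index), first_rejecting_n(1, index))
print([round(v, 4) for v in exact_levels(38)])
```

Under these assumptions the sketch finds that a zero count is first rejected at N = 38 by both G² and T², and that a count of one is first rejected at N = 87 by G² and at N = 89 by T².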


These are precisely the points at which zero and one counts become "significant at .05" for these statistics.

I. Level of Significance for Likelihood Ratio Chi-Squared .05 Test, Binomial H0: p = .05.
[Figure I: significance level (vertical axis) plotted against sample size, 0 to 100 (horizontal axis).]

J. Level of Significance for Freeman-Tukey Chi-Squared .05 Test, Binomial H0: p = .05.
[Figure J: significance level (vertical axis) plotted against sample size, 0 to 100 (horizontal axis).]

5. RECOMMENDATIONS AND CONCLUSIONS

For the case of a specified null hypothesis in the multinomial goodness-of-fit problem, the small-sample distributions of three chi-squared statistics were examined. Based on a criterion of the closeness of small sample distribution to the asymptotic chi-squared approximation, the Pearson chi-squared statistic is by far the most desirable. Both the likelihood ratio and Freeman-Tukey statistics yield too many rejections under the null distribution. The conclusions for several composite models requiring estimates of parameters were identical to those for the completely specified multinomial case. Given the diversity of the models studied, it is clear that the statistic of choice, as far as null hypothesis behavior is concerned, is the Pearson chi-squared statistic.

The high Type I error rates for the likelihood ratio and Freeman-Tukey statistics result from the large contributions to the chi-squared value for very small counts in cells with moderate expected values. If a user desires to use one of these statistics in a case with moderate cell expectations, then it must be understood that the P values based upon the nominal tests will, on the average, be somewhat understated.

[Received September 1973. Revised October 1977.]

REFERENCES

Bahadur, R.R. (1967), "An Optimal Property of the Likelihood Ratio Statistic," Proceedings of Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1, 13-26.

Bartlett, M.S. (1935), "Contingency Table Interactions," Journal of the Royal Statistical Society, Supplement, 2, 248-252.

Bishop, Yvonne M.M. (1969), "Calculating Smoothed Contingency Tables," in The National Halothane Study, ed. John P. Bunker, William H. Forrest, Jr., Frederick Mosteller, and Leroy D. Vandam, National Institutes of Health, Washington, D.C.: U.S. Government Printing Office, 273-286.

---, and Fienberg, Stephen E. (1969), "Incomplete Two-Dimensional Contingency Tables," Biometrics, 27, 545-562.

---, Fienberg, Stephen E., and Holland, Paul W. (1975), Discrete Multivariate Analysis, Cambridge, Mass.: M.I.T. Press.

Bliss, C.I. (1967), Statistics in Biology, Volume I, New York: McGraw-Hill Book Co.

Cochran, William G. (1952), "The χ² Test of Goodness of Fit," Annals of Mathematical Statistics, 23, 315-346.

Cramer, Harald (1945), Mathematical Methods of Statistics, Princeton, N.J.: Princeton University Press.

Easterling, R.G., and Prairie, R.R. (1971), "Combining Component and System Information," Technometrics, 13, 271-280.

Feller, William (1957), An Introduction to Probability Theory and Its Applications, Volume I, 2nd ed., New York: John Wiley & Sons.

Fienberg, Stephen E., and Larntz, Kinley (1971), "Some Models for Individual-Group Comparisons and Group Behavior," Psychometrika, 36, 349-367.

Fisher, Ronald A. (1958), Statistical Methods for Research Workers, 13th ed., New York: Hafner Publishing Co.

Freeman, M.F., and Tukey, John W. (1950), "Transformations Related to the Angular and the Square Root," Annals of Mathematical Statistics, 27, 607-611.

Good, I.J., Gover, T.N., and Mitchell, G.J. (1970), "Exact Distributions for χ² and for the Likelihood-Ratio Statistic for the Equiprobable Multinomial Distributions," Journal of the American Statistical Association, 65, 267-283.

Goodman, Leo A. (1973), "Guided and Unguided Methods for Selecting Models for a Set of T Multidimensional Contingency Tables," Journal of the American Statistical Association, 68, 165-175.

Hammersley, J.M., and Handscomb, D.C. (1964), Monte Carlo Methods, London: Methuen & Co.

Hoeffding, Wassily (1965), "Asymptotically Optimal Tests for the Multinomial Distribution," Annals of Mathematical Statistics, 36, 369-401.

Margolin, Barry H., and Light, Richard J. (1974), "An Analysis of Variance for Categorical Data, II: Small Sample Comparisons with Chi Square and Other Competitors," Journal of the American Statistical Association, 69, 755-764.

Mosteller, Frederick, and Tukey, John W. (1968), "Data Analysis, Including Statistics," in Handbook of Social Psychology, Volume II, ed. G. Lindsey and E. Aronson, Reading, Mass.: Addison-Wesley.


Odoroff, Charles L. (1970), "A Comparison of Minimum Logit Chi-Square Estimation and Maximum Likelihood Estimation in 2 × 2 × 2 and 3 × 2 × 2 Contingency Tables: Tests for Interactions," Journal of the American Statistical Association, 65, 1617-1631.

Roscoe, John T., and Byars, Jackson A. (1971), "Sample Size Restraints Commonly Imposed on the Use of the Chi-Square Statistics," Journal of the American Statistical Association, 66, 755-759.

Staub, Ervin (1970), "A Child in Distress: The Influence of Age and Number of Witnesses on Children's Attempts to Help," Journal of Personality and Social Psychology, 14, 130-140.

Sylwester, David (1974), Abstract: "A Monte Carlo Study of Multidimensional Contingency Table Analysis," Biometrics, 30, 386.

Tate, Merle W., and Hyer, Leon A. (1973), "Inaccuracy of the X² Test of Goodness of Fit When Expected Frequencies Are Small," Journal of the American Statistical Association, 68, 836-841.

Yarnold, James K. (1970), "The Minimum Expectation in X² Goodness of Fit Tests and the Accuracy of Approximations for the Null Distribution," Journal of the American Statistical Association, 65, 864-886.
