test statistics for simple markov chains. a monte carlo study

Biom. J., vol. 20, 1978, no. 1, pp. 41-49

Staatsinstitut für Bildungsforschung und Bildungsplanung, München

Test Statistics for Simple MARKOV Chains. A Monte Carlo Study

FRITZ K. BEDALL

Abstract

The statistical method for the analysis of frequency tables which has been published by GRIZZLE, STARMER, and KOCH (1969) is outlined for the case of linear functions of relative frequencies. Its use for the analysis of aggregate time series data under the model of simple MARKOV chains is suggested. Weighty objections against the resulting test statistics were grounds for the planning and performance of a Monte Carlo study. The results of the study demonstrated that the test statistic $SS(Cb = 0)$, which is used for the comparison of different MARKOV chains, is in fact $\chi^2$-distributed. The test statistic $SS(F(p) = \tilde{X}b)$ for evaluation of the lack of fit of the model also follows a $\chi^2$-distribution when a simple MARKOV chain $\tilde{X}$ is used instead of the observed time series $X$. $\tilde{X}$ can be determined by simply iterating the original GSK algorithm.

Key words: Aggregate time series data, $\chi^2$-distribution, linear model, Monte Carlo study, simple MARKOV chains, test statistic, weighted least squares.

Introduction and presentation of models and tests

GRIZZLE, STARMER, and KOCH (1969) presented a statistical procedure (the “GSK procedure”) which made possible the analysis of frequency tables within the framework of the linear model; in the meantime many articles based on that model have appeared in various American journals and demonstrated the merits of the procedure in varying areas of empirical research work.

Recently BEDALL (1974) suggested extension of the GSK procedure to data which are, in effect, aggregate time series and which can be looked upon as realizations of simple MARKOV chains.

This paper presents a development of the proposed extension and highlights its departures from the original procedure.

The starting point for the development is the data of Table 1. With the MARKOV model in mind we define the following symbols (dimensions in parentheses):

$$x_j' = [p_{j1}, p_{j2}, \ldots, p_{j,r+1}] \qquad (1,\ r+1)$$

$$p_j' = [p_{j1}, p_{j2}, \ldots, p_{jr}] \qquad (1,\ r)$$

$$F(p)' = [p_2', p_3', \ldots, p_s'] \qquad (1,\ r(s-1))$$


Table 1
The data matrix has entries for s time points and r + 1 different states; the cells p_ij are observed probabilities. The estimation of the MARKOV transition matrix is based upon this matrix.

time     states
point    1      2      ...   r+1       sum
1        p_11   p_12   ...   p_1,r+1   1
2        p_21   p_22   ...   p_2,r+1   1
...      ...    ...    ...   ...       ...
s        p_s1   p_s2   ...   p_s,r+1   1

These definitions enable us to express the fundamental relations of the MARKOV model in simple terms:

$$\underset{(r(s-1),\,1)}{F(p)} = \underset{(r(s-1),\,r(r+1))}{X}\ \underset{(r(r+1),\,1)}{b} + \underset{(r(s-1),\,1)}{\text{error}}.$$
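As an illustration, $F(p)$ and $X$ could be assembled from a table like Table 1 along the following lines. This is a minimal sketch: the block layout of $X$ (one block $I_r \otimes x_{j-1}'$ per time point, with $b$ holding the first $r$ columns of the transition matrix stacked column by column) is an assumption that agrees with the stated dimensions but is not spelled out in the text, and `build_design` and the sample proportions are hypothetical.

```python
import numpy as np

def build_design(p):
    """Return F(p) and X for one chain from an s x (r+1) table of proportions.

    Assumed layout (not given explicitly in the text): the rows of X for
    time point j are I_r (x) x_{j-1}', so that p_{jk} = sum_i p_{j-1,i} * b_{ik}.
    """
    p = np.asarray(p, dtype=float)
    s, k = p.shape              # k = r + 1 states
    r = k - 1
    # F(p): first r proportions of time points 2..s, stacked time point by time point
    F = p[1:, :r].reshape(-1)
    # One (r x r(r+1)) block per time point 2..s, built from the previous row
    X = np.vstack([np.kron(np.eye(r), p[j - 1:j]) for j in range(1, s)])
    return F, X

# Hypothetical data: s = 4 time points, r + 1 = 2 states
p = [[0.60, 0.40],
     [0.55, 0.45],
     [0.52, 0.48],
     [0.51, 0.49]]
F, X = build_design(p)
print(F.shape, X.shape)   # r(s-1) = 3 observations, r(r+1) = 2 parameters
```

With $r = 1$ each row of $X$ is simply the previous time point's pair of proportions, and $b$ holds the first column of the transition matrix.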

Such models can be analyzed by the GSK procedure. As the $F(p)$ are linear functions of the original probabilities $p$, the variance covariance matrix $S$ of the $F(p)$ can be estimated exactly by $A\,\mathrm{Var}(p)\,A'$, where $A$ is the $(r(s-1),\ s(r+1))$ indicator matrix which selects the $F(p)$ from the probabilities of Table 1.
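The estimate of $S$ might be computed as sketched here. The sketch assumes that each time point is an independent multinomial sample of $n$ units, so that $\mathrm{Var}(p)$ is block diagonal with blocks $(\mathrm{diag}(p_j) - p_j p_j')/n$; that sampling assumption and the names `multinomial_cov` and `cov_of_F` are illustrative and not taken from the text.

```python
import numpy as np

def multinomial_cov(p_row, n):
    """Covariance of the observed proportions at one time point with n units."""
    p_row = np.asarray(p_row, dtype=float)
    return (np.diag(p_row) - np.outer(p_row, p_row)) / n

def cov_of_F(p, n):
    """S = A Var(p) A' for one chain (time points assumed independent)."""
    p = np.asarray(p, dtype=float)
    s, k = p.shape              # k = r + 1 states
    r = k - 1
    # Block-diagonal Var(p), one multinomial block per time point
    V = np.zeros((s * k, s * k))
    for j in range(s):
        V[j * k:(j + 1) * k, j * k:(j + 1) * k] = multinomial_cov(p[j], n)
    # Indicator matrix A: picks p_{j1}..p_{jr} for time points 2..s
    A = np.zeros((r * (s - 1), s * k))
    for j in range(1, s):
        for c in range(r):
            A[(j - 1) * r + c, j * k + c] = 1.0
    return A @ V @ A.T

# Hypothetical proportions, n = 100 units per time point
p = np.array([[0.60, 0.40], [0.55, 0.45], [0.52, 0.48]])
S = cov_of_F(p, n=100)
print(np.round(S, 6))
```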

The weighted least squares estimates $b$ are now found by minimizing the quadratic form $(F(p) - Xb)'\,S^{-1}\,(F(p) - Xb)$ with respect to $b$; this yields the well-known estimates $b = (X'S^{-1}X)^{-1}X'S^{-1}F(p)$, which have variances $\mathrm{Var}(b) = (X'S^{-1}X)^{-1}$.
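The two formulas translate directly into a few lines of linear algebra. The following is a minimal sketch; the function name `wls` and the sample values for $X$, $S$, and $b$ are hypothetical.

```python
import numpy as np

def wls(F, X, S):
    """Weighted least squares: b = (X'S^-1 X)^-1 X'S^-1 F and Var(b) = (X'S^-1 X)^-1."""
    S_inv = np.linalg.inv(S)
    XtSi = X.T @ S_inv
    cov_b = np.linalg.inv(XtSi @ X)
    b = cov_b @ XtSi @ F
    return b, cov_b

# Hypothetical error-free example: F is generated from a known b
X = np.array([[0.60, 0.40],
              [0.55, 0.45],
              [0.50, 0.50]])
b_true = np.array([0.75, 0.25])
F = X @ b_true
S = np.diag([0.0025, 0.0025, 0.0025])
b, cov_b = wls(F, X, S)
print(np.round(b, 6))
```

Because the example data contain no error, the estimate recovers the generating parameters up to rounding.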

The test statistic for lack of fit is given by

$$SS(F(p) = Xb) = F(p)'S^{-1}F(p) - b'X'S^{-1}Xb,$$

which is $\chi^2$-distributed with $r(s - 1) - r(r + 1)$ degrees of freedom when the null hypothesis is true.

The test statistic for hypotheses of the form $Cb = 0$, where $C$ is an appropriate matrix, is computed by

$$SS(Cb = 0) = b'C'[C(X'S^{-1}X)^{-1}C']^{-1}Cb,$$

which is $\chi^2$-distributed with rank $C$ degrees of freedom when the null hypothesis is true.
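Both test statistics can be computed together from $F(p)$, $X$, $S$, and $C$. The sketch below is illustrative only: the function name `gsk_statistics`, the two-chain block-diagonal layout of $X$, and all numbers are assumptions made for the example. The lack-of-fit degrees of freedom are taken as the number of rows of $X$ minus the number of columns, which equals $r(s-1) - r(r+1)$ for a single chain.

```python
import numpy as np

def gsk_statistics(F, X, S, C):
    """Return SS(F(p) = Xb), its df, SS(Cb = 0), and its df."""
    S_inv = np.linalg.inv(S)
    M = X.T @ S_inv @ X                      # X'S^-1 X
    M_inv = np.linalg.inv(M)
    b = M_inv @ X.T @ S_inv @ F              # weighted least squares estimate
    ss_fit = F @ S_inv @ F - b @ M @ b       # lack of fit
    Cb = C @ b
    ss_hyp = Cb @ np.linalg.inv(C @ M_inv @ C.T) @ Cb
    df_fit = X.shape[0] - X.shape[1]
    df_hyp = int(np.linalg.matrix_rank(C))
    return ss_fit, df_fit, ss_hyp, df_hyp

# Hypothetical error-free data for two chains with identical parameters
X1 = np.array([[0.60, 0.40], [0.55, 0.45], [0.50, 0.50]])
X2 = np.array([[0.30, 0.70], [0.40, 0.60], [0.45, 0.55]])
Z = np.zeros((3, 2))
X = np.block([[X1, Z], [Z, X2]])
F = X @ np.array([0.75, 0.25, 0.75, 0.25])
S = 0.0025 * np.eye(6)
C = np.array([[1.0, 0.0, -1.0, 0.0],
              [0.0, 1.0, 0.0, -1.0]])
ss_fit, df_fit, ss_hyp, df_hyp = gsk_statistics(F, X, S, C)
print(df_fit, df_hyp)
```

Since the example data are error free and both chains share the same parameters, both statistics are zero up to rounding.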


Objection to the suggested procedure

It is well known that the solution offered by the least squares method provides best linear unbiased estimators, but it is also known that the validity of familiar test statistics accompanying the solution is suspect when certain assumptions are not met. In the above extension of the GSK procedure one of the fundamental prerequisites, namely that $X$ is a matrix of constants, is most obviously violated. In fact, $X$ must simultaneously play both the role of a matrix of constants and the role of a matrix of random variates.

Monte Carlo study of the test statistics

As the available literature (BILLINGSLEY, 1961; LEE, JUDGE, and ZELLNER, 1970; KENDALL and STUART, 1968) hardly offers analytical hints, a Monte Carlo study was conducted to investigate the distributions of the GSK test statistics under the null hypothesis. GRIZZLE, STARMER, and KOCH (1969) would expect $\chi^2$-distributions. A Monte Carlo study has to take a number of factors into account; to reduce the multiplicity of the cases to be studied the levels of the factors have been held to a minimum.

There are five factors of interest, with levels fixed as follows:

1. the number of MARKOV states: $r + 1 = 2$
2. the number of units per observation: $n_i = \sum_{j=1}^{r+1} n_{ij} = 100$ for all $i$, $i = 1, \ldots, s$
3. the number of ANOVA factors in a design where the MARKOV chains are the dependent variables: 1
4. the levels of the ANOVA factor: $q = 2$, i.e., there were two MARKOV chains
5. the number of time points (observations) per MARKOV chain: $s/q = 10$.

The matrix

0.75  0.25

was chosen as a transition matrix for both MARKOV chains. This matrix is suitable because it is likely to prevent least squares solutions of less than zero.

With these definitions an algorithm can be programmed. The algorithm first creates at time point 1 a random number $\pi_1$ for chain 1 and another random number $\pi_2$ for chain 2. $\pi_1$ and $\pi_2$ are drawn from a uniform distribution. They give the starting probabilities

chain 1: $[\pi_1,\ 1 - \pi_1]$    chain 2: $[\pi_2,\ 1 - \pi_2]$

To these starting probabilities an $N(0,\ \pi_i(1 - \pi_i)/n)$-distributed random number is added; this takes into account the fact that the observation of $\pi_i$ is not free of error. The fallible $p_i$ are noted. Next, the error-free $\pi_i$ are continued by multiplication with the transition matrix $P$; again, error is added and the resulting $p_i$ are recorded. This happens $s/2 = 10$ times per MARKOV chain $i$, $i = 1, 2$.

Table 2
Scheme for modeling one simple MARKOV chain

time point    true data + error = fallible data

The fallible data of both MARKOV chains are now submitted to an analysis by the GSK program. The relevant test statistics are recorded, the algorithm finds two new random numbers, and the whole process begins again.
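The data-generating step of the algorithm might look as follows for one two-state chain. This is a sketch under stated assumptions, not the author's program: only the row 0.75, 0.25 of the transition matrix is taken from the text, the second row of `P` is a hypothetical choice, the second parameter of the normal distribution is read as a variance, and the function name `simulate_chain` is made up for the example.

```python
import numpy as np

# First row as given in the text; the second row is an ASSUMPTION for illustration
P = np.array([[0.75, 0.25],
              [0.25, 0.75]])

def simulate_chain(P, n, steps, rng):
    """Return (true, fallible) first-state probabilities of one two-state chain."""
    pi1 = rng.uniform()                       # random starting probability
    state = np.array([pi1, 1.0 - pi1])
    true, fallible = [], []
    for _ in range(steps):
        # N(0, pi(1 - pi)/n) error, with the second parameter read as a variance
        err = rng.normal(0.0, np.sqrt(state[0] * state[1] / n))
        true.append(state[0])
        fallible.append(state[0] + err)       # recorded, error-laden proportion
        state = state @ P                     # continue the error-free chain
    return np.array(true), np.array(fallible)

rng = np.random.default_rng(0)
true, fallible = simulate_chain(P, n=100, steps=10, rng=rng)
print(np.round(fallible, 3))
```

The sketch does not clip fallible values to the interval from 0 to 1; the text does not say how such cases were handled.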

Formulation of the hypotheses

The test statistics which are computed by the GSK program are assumed to be $\chi^2$-distributed. This is the assumption to be subjected to a statistical test.

Of definite interest are the distributions of the following test statistics:

1. test of lack of fit

$$SS(F(p) = Xb) = F(p)'S^{-1}F(p) - b'X'S^{-1}Xb.$$

Under $H_0$ the statistic should be $\chi^2$-distributed with $r(s - q) - qr(r + 1)$ degrees of freedom, where $q$ is the number of MARKOV chains.

2. test of equality of MARKOV chains

$$SS(Cb = 0) = b'C'[C(X'S^{-1}X)^{-1}C']^{-1}Cb, \quad \text{where } C = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \end{bmatrix}.$$

Under $H_0$ the statistic should be $\chi^2$-distributed with rank $C$ degrees of freedom.
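As a worked check of the two degrees-of-freedom expressions, the design values of the study ($r + 1 = 2$ states, $q = 2$ chains, $s/q = 10$ time points per chain, hence $s = 20$) can be inserted directly. The variable names are chosen for this illustration only.

```python
import numpy as np

r, q = 1, 2            # r + 1 = 2 states, q = 2 chains
s = 10 * q             # s/q = 10 time points per chain

# Lack of fit: r(s - q) - q r(r + 1)
df_fit = r * (s - q) - q * r * (r + 1)

# Equality of chains: rank of the contrast matrix C
C = np.array([[1, 0, -1, 0],
              [0, 1, 0, -1]])
df_equal = int(np.linalg.matrix_rank(C))

print(df_fit, df_equal)
```

These values correspond to the $\chi^2_{14}$- and $\chi^2_2$-distributions named in the table captions.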


The kind of problem formulated here makes it especially important to control the $\beta$-error, that is, the risk of not rejecting $H_0$ when it should be rejected.

It is our goal to confirm $H_0$, i.e., to confirm that the above test statistics are $\chi^2$-distributed under $H_0$.

With the aid of COHEN's power tables (COHEN, 1969, pp. 247-251) the order of magnitude of the simulation study can be estimated. 860 runs through the described algorithm are necessary to make possible the detection of small deviations from the expected $\chi^2$-distribution at the 1 % level with probability 0.99. Following COHEN, a small deviation is a value $e \approx 0.05$, where

$$e = \sum_{i=1}^{m} \frac{(p_{oi} - p_{ei})^2}{p_{ei}}$$

with $m$ as the number of intervals. Thus a simulation including 1000 runs is more than adequate.

First results

The distributions of the two test statistics which stem from the GSK procedure were compared to the $\chi^2$-distributions which were expected.

Therefore we defined 15 percentiles per distribution and retained 14 intervals (the STURGES formula (1926) suggests $1 + 3.32 \log 1000 \approx 11$ intervals). The test statistics which fell into these intervals were counted; the frequency table thus constructed can then be compared with $H_0$, the complete fit, and can be evaluated statistically by the $\chi^2_{14-1}$ distribution ($\chi^2_{13,0.10} = 19.81$; $\chi^2_{13,0.01} = 27.69$).
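The expected frequencies under $H_0$ follow from the 15 percentile boundaries and the number of runs, and the STURGES value can be checked in the same breath. A small sketch, assuming 1000 runs and the percentile boundaries used in the frequency tables:

```python
import math

percentiles = [0, 1, 2, 5, 10, 20, 30, 50, 70, 80, 90, 95, 98, 99, 100]
runs = 1000

# Expected count per interval = runs * (width of the interval in percent) / 100
expected = [runs * (hi - lo) // 100
            for lo, hi in zip(percentiles, percentiles[1:])]

# STURGES formula with the base-10 logarithm
sturges = 1 + 3.32 * math.log10(runs)

print(expected)
print(round(sturges, 2))
```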

Table 3
Frequencies of observed test statistics in 14 intervals between 15 specified percentiles of the $\chi^2_{14}$- and $\chi^2_2$-distribution

percentile     SS(F(p)=Xb)    SS(Cb=0)    H_0 (expected
interval                                  proportions)
0-1                  6            16           10
1-2                  7            10           10
2-5                 17            29           30
5-10                42            66           50
10-20               72           113          100
20-30               79           104          100
30-50              162           198          200
50-70              190           186          200
70-80              124            90          100
80-90              111            91          100
90-95               85            50           50
95-98               65            30           30
98-99               26            12           10
99-100              35             7           10
chi^2_13        170.99         13.36         0.00

Table 3 confirms that the $\chi^2$-distribution is appropriate for the comparison of MARKOV chains; there is no evidence that the observed distribution differs from the hypothetical one. Table 3 also makes clear that the $\chi^2$-distribution is not suitable to test the fit of the model.

A technical extension of the GSK procedure

The $\hat{F}(p)$ which are estimated by the term $Xb$ should produce an error-free MARKOV chain; but in general that is not the case. A correct MARKOV chain could be produced by iteration. The rule is that the computed $\hat{F}(p)$ are utilized to construct a new matrix $\tilde{X}$. This is without problems with the exception of the starting values of a chain; it was decided that these values should remain constant. The $\hat{F}(p)$ also allow a reestimation of the variance covariance matrix $S$.

The original MARKOV equation whose transition matrix is to be found by means of weighted least squares is changed to

$$F(p) = \tilde{X}b + \text{error},$$

where $\tilde{X}$ is an error-free simple MARKOV chain which can be determined by iteration.
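One possible reading of this iteration, for a single two-state chain ($r = 1$), is sketched below. Several details are assumptions and are not taken from the text: the first pass uses the observed series, each later pass rebuilds the error-free chain from the constant starting value and the current estimate $b$, and $S$ is re-estimated as a diagonal binomial variance from the current chain. The function name `iterate_gsk` and the sample data are hypothetical.

```python
import numpy as np

def iterate_gsk(p1, n, iterations=5):
    """Iterated weighted least squares for one two-state chain.

    p1: observed first-state proportions at the s time points.
    Returns the last estimate b = (b_11, b_21), the error-free chain implied
    by it, and the lack-of-fit statistic of the last pass.
    """
    p1 = np.asarray(p1, dtype=float)
    F = p1[1:]                                # F(p): time points 2..s
    chain = p1.copy()                         # first pass: observed series
    for _ in range(iterations):
        X = np.column_stack([chain[:-1], 1.0 - chain[:-1]])
        # Assumed re-estimate of S: diagonal binomial variances from the chain
        S_inv = np.diag(n / (chain[1:] * (1.0 - chain[1:])))
        M = X.T @ S_inv @ X
        b = np.linalg.solve(M, X.T @ S_inv @ F)
        ss_fit = F @ S_inv @ F - b @ M @ b
        # Rebuild the error-free chain; the starting value stays constant
        chain = np.empty_like(p1)
        chain[0] = p1[0]
        for j in range(1, len(p1)):
            chain[j] = chain[j - 1] * b[0] + (1.0 - chain[j - 1]) * b[1]
    return b, chain, ss_fit

# Hypothetical observed proportions, n = 100 units per time point
observed = [0.90, 0.71, 0.59, 0.56, 0.52, 0.51]
b, chain, ss_fit = iterate_gsk(observed, n=100)
print(np.round(b, 3), round(float(ss_fit), 3))
```

The sketch does not guard against chain values leaving the interval from 0 to 1, which would make the variance estimate meaningless.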

Further results

1000 simulation runs of five iterations each were performed. Normally, a stable state was reached by five iterations. The investigation of the two test statistics showed that the test of lack of fit can safely be done after iteration; however, the comparison of MARKOV chains is no longer possible at that point.

Table 4
Frequencies of observed test statistics in 14 intervals between 15 specified percentiles of the $\chi^2_{14}$- and $\chi^2_2$-distribution

percentile     SS(F(p)=Xb)    SS(Cb=0)    H_0 (expected
interval                                  proportions)
0-1                 11            15           10
1-2                 10            15           10
2-5                 28            34           30
5-10                58            79           50
10-20               74           135          100
20-30              105           104          100
30-50              209           215          200
50-70              199           183          200
70-80              115            87          100
80-90               89            71          100
90-95               55            37           50
95-98               28             8           30
98-99                7            10           10
99-100              12             7           10
chi^2_13         14.33         67.85         0.00
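The goodness-of-fit values in the last row of Table 4 can be recomputed from the listed frequencies with the usual formula $\sum (o - e)^2 / e$. The sketch below does so; the function name `chi_square` is chosen for the illustration, and the critical value 27.69 is the 1 % point of the $\chi^2_{13}$ distribution quoted earlier in the text.

```python
expected = [10, 10, 30, 50, 100, 100, 200, 200, 100, 100, 50, 30, 10, 10]
ss_fit_freq = [11, 10, 28, 58, 74, 105, 209, 199, 115, 89, 55, 28, 7, 12]
ss_cb_freq = [15, 15, 34, 79, 135, 104, 215, 183, 87, 71, 37, 8, 10, 7]

def chi_square(observed, expected):
    """Pearson goodness-of-fit statistic over the 14 intervals."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_1_percent = 27.69   # chi^2 with 13 df, as quoted in the text

for name, freq in [("SS(F(p)=Xb)", ss_fit_freq), ("SS(Cb=0)", ss_cb_freq)]:
    value = chi_square(freq, expected)
    print(name, round(value, 2), "reject" if value > critical_1_percent else "retain")
```

The recomputed values agree with the 14.33 and 67.85 reported in Table 4.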

Discussion

Both types of results suggest use of a differentiated decision system for the analysis of aggregate time series data when simple MARKOV chains are presumed.

If there is more than one time series and we want to compare the time series, GSK is used in its original version. The number $z$ of significantly different time series is to be determined. The GSK test statistic is $\chi^2$-distributed with rank $C$ degrees of freedom when $H_0$ is true.

A test of fit of the model and the estimation of the MARKOV parameters must be performed iteratively. The solution must be found for each of $z$ time series. Under $H_0$ the lack of fit statistic is $\chi^2$-distributed with $r(s - q) - r(r + 1)$ degrees of freedom, where $q$ is the number of collapsed (equal) time series.

If there is no reason to reject the MARKOV model, the estimates $b$ can be interpreted as elements of a transition matrix.

The following demonstrates the collapsing with two statistically equal time series with $s/2 = 3$ and $r = 1$; the last index refers to chains.

[Flow chart. Box labels recoverable from the original: START; test number of different time series; collapse X to one time series, or divide data into different series; time series is / is not a MARKOV chain; b are elements of the MARKOV transition matrix; END.]

Fig. 1. Decision flow chart for the analysis of aggregate time series.

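$$F(p) = Xb + \text{error}$$

collapsing:

$$\begin{bmatrix} p_{211} \\ p_{311} \\ p_{212} \\ p_{312} \end{bmatrix} = \begin{bmatrix} p_{111} & p_{121} \\ p_{211} & p_{221} \\ p_{112} & p_{122} \\ p_{212} & p_{222} \end{bmatrix} b + \text{error}$$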
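The effect of collapsing on the design matrix can be illustrated for the case just described ($s/2 = 3$ time points per chain, $r = 1$). In this sketch the block-diagonal form for separate chains, the stacked form for the collapsed chain, the helper `design`, and all proportions are assumptions for illustration.

```python
import numpy as np

# Hypothetical observed proportions, indexed [time point, state]
chain1 = np.array([[0.60, 0.40], [0.68, 0.32], [0.71, 0.29]])
chain2 = np.array([[0.30, 0.70], [0.52, 0.48], [0.64, 0.36]])

def design(chain):
    """F(p) and X for one two-state chain (r = 1)."""
    return chain[1:, 0], chain[:-1, :]

F1, X1 = design(chain1)
F2, X2 = design(chain2)
F = np.concatenate([F1, F2])

Z = np.zeros_like(X1)
X_separate = np.block([[X1, Z], [Z, X2]])   # one parameter set per chain
X_collapsed = np.vstack([X1, X2])           # one common parameter set

df_separate = X_separate.shape[0] - X_separate.shape[1]
df_collapsed = X_collapsed.shape[0] - X_collapsed.shape[1]
print(X_separate.shape, X_collapsed.shape, df_separate, df_collapsed)
```

Collapsing halves the number of parameters, so the lack-of-fit test gains degrees of freedom: here $r(s - q) - r(r + 1) = 1(6 - 2) - 1(1 + 1) = 2$.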

This was controlled again by 1000 simulation runs. Because the two simulated MARKOV chains were based on identical transition matrices, both chains were treated as one single chain. The test statistic $SS(F(p) = \tilde{X}b)$ was computed. Its distribution was $\chi^2$ with $r(s - q) - r(r + 1) = 1(20 - 2) - 1(1 + 1) = 16$ degrees of freedom.

Table 5
Frequencies of the observed test statistic in 14 intervals between 15 specified percentiles of the $\chi^2_{16}$-distribution

percentile     SS(F(p)=Xb)    H_0 (expected
interval                      proportions)
0-1                 10             10
1-2                 16             10
2-5                 27             30
5-10                49             50
10-20               92            100
20-30               86            100
30-50              196            200
50-70              216            200
70-80              100            100
80-90              102            100
90-95               62             50
95-98               29             30
98-99                6             10
99-100              13             10
chi^2_13         13.47           0.00


The sociometric data which have been analyzed by BEDALL (1974) were reanalyzed under these circumstances. The criteria for comparison of time series have not changed; thus, the results are identical. The procedure to estimate lack of fit of a model was, however, different. The test statistic for group I became $SS(F(p) = \tilde{X}b) = 8.57$ and for group II $SS(F(p) = \tilde{X}b) = 5.61$. Table 6 displays the new transition matrices.

Table 6
Reestimated transition matrices for two MARKOV chains. Original data and analysis in Biom. Z. 16 (1974) 7, pp. 451-458.

                          voter belongs to
                          social group I          social group II
                          chooses at time         chooses at time
                          point t                 point t
                          I        II             I        II
chose at time      I      0.55     0.45           0.95     0.05
point t - 1        II     0.47     0.63           0.01     0.99

Thus there is evidence that the GSK procedure is useful for the analysis of simple MARKOV chains on the basis of aggregate time series data. It allows for the comparison of different chains. Chains which are treated as cells of an experimental design can be investigated and compared with the test statistics offered by GSK; inferences about main effects of ANOVA factors and interactions can be made on statistical grounds.

It is our opinion that the GSK procedure represents a powerful tool for the above purposes.

It would, however, be highly desirable to add a feature not yet implemented in the standard procedure. This new feature would be an iteration of the original algorithm to determine the matrix $\tilde{X}$; it would permit estimation of the degree of fit of the mathematical model to the data and estimation of the MARKOV parameters. Figure 1 outlines a rather clumsy way of coming to that end.

Possibly, a more elegant solution would be to iterate not only the $SS(F(p) = \tilde{X}b)$-section but the complete GSK procedure inclusive of tests. This could be done by calculating the parameters $b$ which are modified by hypotheses, $b(Cb = 0) = [I - (X'S^{-1}X)^{-1}C'[C(X'S^{-1}X)^{-1}C']^{-1}C]\,b$, and using them to estimate the error-free MARKOV chain $\tilde{X}$.
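The hypothesis-modified parameters $b(Cb = 0)$ in the formula above can be sketched in a few lines. The function name `restricted_estimate` and the numerical values are hypothetical; `cov_b` stands for $(X'S^{-1}X)^{-1}$.

```python
import numpy as np

def restricted_estimate(b, cov_b, C):
    """b(Cb = 0) = [I - cov_b C' (C cov_b C')^-1 C] b, with cov_b = (X'S^-1 X)^-1."""
    K = cov_b @ C.T @ np.linalg.inv(C @ cov_b @ C.T)
    return (np.eye(len(b)) - K @ C) @ b

# Hypothetical unrestricted estimates for two chains (b_11, b_21 per chain)
b = np.array([0.80, 0.20, 0.70, 0.30])
cov_b = 0.01 * np.eye(4)                    # assumed equal, uncorrelated variances
C = np.array([[1.0, 0.0, -1.0, 0.0],
              [0.0, 1.0, 0.0, -1.0]])

b_restricted = restricted_estimate(b, cov_b, C)
print(np.round(b_restricted, 4))
```

With equal variances the restricted estimate is the average of the two chains' parameters, and it satisfies $Cb = 0$ by construction.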

Acknowledgements

I should like to thank Mr. H. ZIMMERMANN, Munich, who proposed the iterative approach when I confronted him with the first Monte Carlo results.


References

BEDALL, F. K., 1974: Die Analyse von aggregierten Zeitreihendaten unter der Annahme einer einfachen Markoffkette. Biom. Z. 16, 451-458.

BILLINGSLEY, P., 1961: Statistical methods in Markov chains. Ann. Math. Statist. 32, 12-40.

COHEN, J., 1969: Statistical Power Analysis for the Behavioral Sciences. Academic Press, New York.

GRIZZLE, J. E., C. F. STARMER, G. G. KOCH, 1969: Analysis of categorical data by linear models. Biometrics 25, 489-504.

KENDALL, M. G., A. STUART, 1968: The Advanced Theory of Statistics, Vol. 3. Design and Analysis, and Time-Series. Griffin, London.

LEE, T. C., G. G. JUDGE, A. ZELLNER, 1970: Estimating the Parameters of the Markov Probability Model from Aggregate Time Series Data. North-Holland, Amsterdam.

STURGES, H. A., 1926: The choice of a class interval. J. Amer. Statist. Assoc. 21, 65-66.

Received: X/21/1976

Author's address: Dr. FRITZ K. BEDALL, Staatsinstitut für Bildungsforschung und Bildungsplanung, Arabellastr. 1/VI, 8000 München 81, BRD

