ESTIMATION METHODS IN RANDOM COEFFICIENT
REGRESSION FOR CONTINUOUS AND BINARY
LONGITUDINAL DATA
BY
S. Samuel Bederman
A THESIS SUBMITTED IN CONFORMITY WITH THE
REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
GRADUATE DEPARTMENT OF COMMUNITY HEALTH
UNIVERSITY OF TORONTO
© Copyright by S. Samuel Bederman, 1997
ESTIMATION METHODS IN RANDOM COEFFICIENT REGRESSION
FOR CONTINUOUS AND BINARY LONGITUDINAL DATA
Master of Science
S. Samuel Bederman
Graduate Department of Community Health
University of Toronto, 1997
Abstract
Random coefficient regression (RCR) models are commonly used in the analysis of longitudinal data. Longitudinal studies involve a number of subjects on whom repeated outcome and explanatory measurements are taken over time. RCR is necessary since each subject may have a different relationship between the explanatory and outcome measurements. This thesis attempts to improve upon the current techniques of estimation used in RCR in two ways. First, the weighted least-squares estimator put forth by Swamy [16] is adapted to allow an iterative procedure to update the parameter estimates. This iterated weighted least-squares estimator is compared with the weighted least-squares and unweighted least-squares estimators. Second, these three RCR estimators are then extended, by analogy, to the case where the repeated outcome variable is binary. These theoretical methods are explained in detail, tested on simulated datasets, and used to analyze a longitudinal dataset in both the continuous and binary outcome cases.
Contents

I Introduction 1

1 An Introduction to Random Component Models 2
1.1 The Need for Random Components . . . 3
1.2 Random Effects versus Random Coefficients . . . 8
1.3 Extension to the Binary Outcome Case . . . 9

II Continuous Outcome Longitudinal Data 10

2 Theory of Random Coefficients for Multiple Linear Regression 11
2.1 The Model . . . 11
2.2 Estimation Methods . . . 13
2.2.1 First Stage Estimation (The Individual) . . . 13
2.2.2 Second Stage Estimation (The Aggregate) . . . 14
2.2.3 Third Stage Estimation (The Updating Procedure) . . . 18
2.3 Inference . . . 19

3 Continuous Outcome Longitudinal Data Simulation 22
3.1 The Simulation . . . 22
3.2 The Results . . . 40
3.2.1 Estimation . . . 40
3.2.2 Type I Error Rates and Statistical Power . . . 41

4 An Application: Environmental Health Among Asthmatics in the City of Windsor 43

III Binary Outcome Longitudinal Data

5 Theory of Random Coefficients for Multiple Logistic Regression 50
5.1 The Model . . . 50
5.2 Estimation Methods . . . 52
5.2.1 First Stage Estimation (The Individual) . . . 52
5.2.2 Second Stage Estimation (The Aggregate) . . . 57
5.2.3 Third Stage Estimation (The Updating Procedure) . . . 60

6 Binary Outcome Longitudinal Data Simulation 62
6.1 The Simulation . . . 62
6.2 The Results . . . 77
6.2.1 Estimation . . . 77
6.2.2 Type I Error Rates and Statistical Power . . . 79

7 An Application: Word Recall Success in Head Injury Patients 81

IV Conclusions 86

8 Overall Conclusions of the Algorithm 87
8.1 Continuous Outcome . . . 87
8.1.1 The Procedures . . . 87
8.1.2 Simulation Findings . . . 89
8.2 Binary Outcome . . . 90
8.2.1 The Procedures . . . 90
8.2.2 Simulation Findings . . . 90

9 Discussion 93
9.1 Limitations . . . 93
9.2 Extensions . . . 95

V Appendix

A Henderson and Swamy Model Equivalency 98

B Estimates of Variance 103
B.1 Regression Parameter Variance Estimation . . . 103
B.2 Variances of Logistic Regression Estimators using the Delta Method . . . 105

C Updating Conjectures 109
C.1 The Individual Parameter Estimate . . . 109
C.2 The Individual's Sum of Squares for Error . . . 110
C.3 The Regression Parameter Variance Estimator . . . 111

D Continuous Outcome Simulation Results 114

E Binary Outcome Simulation Results 125

Bibliography
List of Tables

1.1 Example Data of 5 Individuals . . . 6
3.1 Definition of the Factors used in the Simulation Study . . . 25
3.2 Tests used for the different levels of N and T . . . 26
3.3 Situations where the MIX estimator failed to converge . . . 28
3.4 Estimates of the slope for the remaining estimators when the MIX estimator failed to converge . . . 29
4.1 Parameter Estimates for the Asthma Data . . . 45
4.2 Parameter Variance Estimates for the Asthma Data . . . 46
6.1 Situations where the IWLS estimator failed to converge when β = (2, 4)′ . . . 64
6.2 Estimates of the slope for the remaining estimators when the IWLS estimator failed to converge for β = (0, 0)′ . . . 65
6.3 Estimates of the slope for the remaining estimators when the IWLS estimator failed to converge for β = (2, 4)′ . . . 66
7.1 Parameter Estimates for the Head-Injury Data . . . 82
7.2 Parameter Variance Estimates for the Head-Injury Data . . . 83
9.1 Simulated Range of Logistic Probabilities for given Parameters . . . 94
D.1 Analysis of Variance Table for Slope . . . 115
D.2 Analysis of Variance Table for Variance Ratio of Slope . . . 115
D.3 Analysis of Variance Table for Empirical Variance of Slope . . . 116
D.4 Analysis of Variance Table for Mean Square Error of Slope . . . 116
D.5 Maximum Likelihood ANOVA Table for Rejection Rates of the Slope . . . 117
D.6 Number of Times Σ̂_ββ was Not Positive-Definite (out of 100 samples) . . . 117
D.7 Average of Slopes . . . 118
D.8 Variance Ratio of Slope . . . 119
D.9 Empirical Variance of Slope . . . 120
D.10 Mean Square Error of Slope . . . 121
D.11 Number of Test Rejections for Slope where β = (0, 0)′ . . . 122
D.12 Number of Test Rejections for Slope where β = (1, 2)′ . . . 123
D.13 Average number of iterations for the Methods MIX and IWLS . . . 124
E.1 Analysis of Variance Table for Slope . . . 126
E.2 Analysis of Variance Table for Variance Ratio of Slope . . . 126
E.3 Analysis of Variance Table for Empirical Variance of Slope . . . 126
E.4 Analysis of Variance Table for Mean Square Error of Slope . . . 127
E.5 Maximum Likelihood ANOVA Table for Rejection Rates of the Slope . . . 127
E.6 Number of Times Σ̂_ββ was Not Positive-Definite (out of 100 samples) . . . 127
E.7 Average of Slopes . . . 128
E.8 Variance Ratio of Slope . . . 128
E.9 Empirical Variance of Slope . . . 129
E.10 Mean Square Error of Slope . . . 129
E.11 Number of Test Rejections for Slope where β = (0, 0)′ . . . 130
E.12 Number of Test Rejections for Slope where β = (2, 4)′ . . . 130
E.13 Average number of iterations for the Methods GEE and IWLS . . . 131
List of Figures

1.1 Plots of Y versus X for each individual and for the entire sample for the Example Data . . . 7
Plot of Rejection Rate of Slope under the null hypothesis for all five estimators . . . 24
Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 30
Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 31
Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 32
Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 33
Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 34
Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 35
Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 36
Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 37
Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 38
Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 39
Plots of Parameter Estimates against their Weights for Asthma Data . . . 48
Plot of Rejection Rate of Slope under the null hypothesis for all five estimators . . . 64
Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 67
Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 68
Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 69
Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 70
Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 71
Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 72
Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 73
Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 74
6.9 Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 75
6.10 Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 76
7.1 Plots of Intercept and Slope against their Weights for Head-Injury Data . . . 84
Part I
Introduction
Chapter 1
An Introduction to Random
Component Models
The essentials of this thesis are contained in four main parts. In the first part, an introduction to the importance of random component models and their development and emergence in the analysis of longitudinal studies is given. The second part treats the continuous outcome situation: Chapter 2 outlines the theory of the model, the associated assumptions, and the methods of estimation in the case of continuous outcome longitudinal data. The results of several competing techniques used to analyze simulated data are reported in Chapter 3, and their use in a study of the effect of environmental pollutants on respiratory health among asthmatics is reported in Chapter 4. In the third part, the procedures designed for the case of binary outcome longitudinal data are studied. The theory of this model is explained in Chapter 5, as are the underlying assumptions and estimation methods associated with this type of dataset. Another simulation, reported in Chapter 6, is used to study the properties of these procedures, and the analysis of a longitudinal study of word recall success after head injury is given in Chapter 7. The final part, including Chapters 8 and 9, consists of some concluding remarks as well as a discussion of possible extensions and limitations of the techniques encountered in this thesis.
1.1 The need for Random Components
Random components regression theory rests on one of the most fundamental assumptions about longitudinal data structure: heterogeneity of individuals. At the most basic level, it incorporates a clustered sampling design into the analysis of the data. A simple example where we have 5 individuals with 3 replicates each might look like this:

[Table: Individual / Measured Values]
In this case, our model equation for the jth of 3 observations on the ith of 5 individuals would look like:

    y_ij = μ + α_i + e_ij                                        (1.1)

In the more general case, we would have Equation 1.1 with j = 1, …, t_i and i = 1, …, n. In this model, y_ij is the jth measure of outcome for the ith individual, μ is the overall population average of the measure of outcome, α_i is the ith individual's deviation from the population average, which is assumed to be a random variable with mean zero and variance σ_α², and e_ij is the random error for the jth measurement on the ith individual, with mean zero and variance σ².
From this model and its assumptions, we can show that:

    E{ȳ..} = μ   and   Var{ȳ..} = (Σ_i t_i²/N²) σ_α² + σ²/N      (1.2)

where N = Σ_{i=1}^n t_i. If we decompose the total sum of squared deviations into a between and a within component, we get:

    Σ_i Σ_j (y_ij − ȳ..)² = Σ_i t_i (ȳ_i. − ȳ..)² + Σ_i Σ_j (y_ij − ȳ_i.)²

where we call the two terms the sum of squares between and the sum of squares within, respectively, i.e.

    SST = SSB + SSW

Using these results, we have:

    E{SSB} = [N − Σ_i t_i²/N] σ_α² + [n − 1] σ²   (from Anderson & Bancroft [1])
    E{SSW} = (N − n) σ²

so our estimates of the variance components in the model are:

    σ̂² = SSW/(N − n)   and   σ̂_α² = [SSB/(n − 1) − SSW/(N − n)] · (n − 1)/(N − Σ_i t_i²/N)

and our estimate of the variance of the overall weighted mean, ȳ.., using Result 1.2, is given by:

    Vâr{ȳ..} = C₁ MSB + C₂ MSW                                   (1.3)

where C₁ and C₂ are the coefficients obtained by substituting σ̂_α² and σ̂² into Result 1.2, MSB is the SSB divided by its degrees of freedom (n − 1), and MSW is the SSW divided by its degrees of freedom (N − n). If t_i = t, then C₁ = 1/(nt) and C₂ = 0, and we see that the variance of the mean simplifies to

    Vâr{ȳ..} = MSB/(nt)
Let us examine the importance of modeling the correct variance components of the model. If we proceeded in the analysis of a longitudinal study with heterogeneity of the individuals, where all individuals have the same number of repeated measurements (t_i = t), and we neglected to take the clustering into account, then our naive estimate of the variance of a single measurement y_ij would be:

    Vâr_N{y_ij} = SST/(nt − 1) = SSB/(nt − 1) + SSW/(nt − 1)

Our true estimate of the variance of a single measurement, using Result 1.2, would be:

    Vâr{y_ij} = σ̂_α² + σ̂² = SSB/[t(n − 1)] + SSW/(nt)

If we take the expected values of these equations we get:

    E{Vâr_N{y_ij}} = [t(n − 1)/(nt − 1)] σ_α² + σ²
    E{Vâr{y_ij}} = σ_α² + σ²

Since t(n − 1)/(nt − 1) = 1 − (t − 1)/(nt − 1) ≤ 1, we have Vâr_N{y_ij} ≤ Vâr{y_ij} in expectation for σ_α² ≥ 0; that is, the naive estimate of the variance of an observation, which neglects the clustering, will underestimate the true variance of an observation. Let us compare these two variance estimates for three specific cases.

1. If σ_α² = 0, the two estimates have the same expectation.
2. If σ_α² > 0 and t → ∞, then t(n − 1)/(nt − 1) → (n − 1)/n and Vâr_N{y_ij} < Vâr{y_ij}.
3. If σ_α² > 0 and n → ∞, then t(n − 1)/(nt − 1) → 1 and the underestimation becomes negligible.

Therefore, in the cases where heterogeneity exists (σ_α² > 0) and our sample size is not very large, neglecting the clustering of the data underestimates the true variance of each observation.
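The calculations above can be illustrated with a small simulation. The sketch below (Python/NumPy; the sample sizes and variance values are illustrative choices, not taken from the thesis) estimates the variance components from a balanced clustered sample and compares the naive and clustered estimates of the variance of a single observation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 30, 5                  # individuals, repeats per individual
sigma2_a, sigma2 = 4.0, 1.0   # between- and within-individual variances
mu = 10.0

# Clustered data: y_ij = mu + a_i + e_ij
a = rng.normal(0, np.sqrt(sigma2_a), n)
y = mu + a[:, None] + rng.normal(0, np.sqrt(sigma2), (n, t))

N = n * t
ybar_i = y.mean(axis=1)                   # individual means
ybar = y.mean()                           # overall mean

SSW = ((y - ybar_i[:, None]) ** 2).sum()  # sum of squares within
SSB = t * ((ybar_i - ybar) ** 2).sum()    # sum of squares between
MSW, MSB = SSW / (N - n), SSB / (n - 1)

sigma2_hat = MSW                          # within-individual variance estimate
sigma2_a_hat = (MSB - MSW) / t            # between-individual variance estimate

# Naive variance of one observation, ignoring clustering: SST/(nt - 1)
var_naive = (SSB + SSW) / (N - 1)
# Clustered estimate: SSB/[t(n - 1)] + SSW/(nt)
var_true = SSB / (t * (n - 1)) + SSW / (n * t)
```

With σ_α² well above zero, `var_naive` comes out smaller than `var_true`, matching the inequality derived above.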
Now, let us compare the distribution of this variance estimate when different individuals have different numbers of repeated measurements with the distribution when all individuals have the same number of repeated measurements. If we look at Equation 1.3, when t_i = t, the estimated variance of the estimate of the overall mean is given by:

    Vâr{ȳ..} = MSB/(nt)

where MSB is the SSB divided by its degrees of freedom (n − 1). Therefore, the distribution of Vâr{ȳ..} is proportional to a χ² with n − 1 degrees of freedom. In the case where t_i ≠ t for i = 1, …, n, the distribution of Vâr{ȳ..} given in Equation 1.3 is not proportional to a χ² with n − 1 degrees of freedom, but rather to a χ² with degrees of freedom given by Satterthwaite's approximation as:

    df = [C₁ MSB + C₂ MSW]² / [(C₁ MSB)²/(n − 1) + (C₂ MSW)²/(N − n)]   (Searle et al. [14, page 134])

where C₁ and C₂ are given in Equation 1.3.
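A sketch of the unbalanced case follows (Python/NumPy). Note the explicit formulas for C₁ and C₂ below are reconstructed by substituting the moment estimates of the variance components into Result 1.2, since the original display of Equation 1.3 is not fully legible; the cluster sizes and variances are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
t_i = np.array([3, 4, 4, 5, 5, 6, 6, 7, 8, 8])  # unequal numbers of repeats
n, N = len(t_i), int(t_i.sum())
sigma2_a, sigma2, mu = 3.0, 1.0, 0.0

# Unbalanced one-way random-components data: y_ij = mu + a_i + e_ij
a = rng.normal(0, np.sqrt(sigma2_a), n)
y = [mu + a[i] + rng.normal(0, np.sqrt(sigma2), t) for i, t in enumerate(t_i)]

ybar_i = np.array([yi.mean() for yi in y])
ybar = np.concatenate(y).mean()                 # overall weighted mean

SSW = sum(((yi - m) ** 2).sum() for yi, m in zip(y, ybar_i))
SSB = (t_i * (ybar_i - ybar) ** 2).sum()
MSW, MSB = SSW / (N - n), SSB / (n - 1)

# Coefficients C1, C2 of Equation 1.3 (reconstructed, as noted above)
t_tilde = (N - (t_i ** 2).sum() / N) / (n - 1)
C1 = ((t_i ** 2).sum() / N ** 2) / t_tilde
C2 = 1 / N - C1                                 # reduces to C2 = 0 when all t_i are equal
var_mean = C1 * MSB + C2 * MSW                  # estimated variance of the overall mean

# Satterthwaite's approximate degrees of freedom
df = (C1 * MSB + C2 * MSW) ** 2 / (
    (C1 * MSB) ** 2 / (n - 1) + (C2 * MSW) ** 2 / (N - n))
```

In the balanced case the same code gives C₂ = 0 and df = n − 1, recovering the exact χ² result above.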
As in the case of linear regression, where we estimate the straight-line relationship between outcome and covariate, random component models can be similarly extended to model continuous covariates. The main difference between fixed effects linear regression and random effects linear regression is that the vector of intercepts and slopes for each cluster is assumed to be randomly distributed with some variance-covariance structure.

Consider an example of five individuals with 5 repeated outcome and explanatory measurements given in Table 1.1. If we perform simple linear regression analyses on each individual (within-individual analysis) we find that all of the estimated slopes are negative and their average is -1.7. An overall regression of individuals' average Y against average X (between-individual analysis) yields 3.0 as the estimate of the slope. If we perform one last analysis in which we use all of the data but disregard any clustering by the individual (marginal analysis) we get a slope estimate of 1.3. The individual regression lines as well as the marginal regression line are plotted in Figure 1.1.

Table 1.1: Example Data of 5 Individuals

Figure 1.1: Plots of Y versus X for each individual and for the entire sample for the Example Data.
We can clearly see that from a marginal analysis the estimate of the slope is positive, but from a within-individual analysis it is negative. This occurs because the marginal analysis is a weighted average of the within-individual slope and the slope of the between-individual analysis. Since the between-individual analysis is in the opposite direction of the within-individual analysis, we see that the marginal analysis is 'pulled' away from the within-individual analysis to a positive value.

Therefore, ignoring heterogeneity between individuals in the statistical analysis when it exists may give biased estimates of the slope, in addition to biasing the estimate of its standard error.
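The reversal between the marginal and within-individual slopes is easy to reproduce. The sketch below builds five clusters with a common negative within-individual slope but cluster centers placed along a positive between-individual trend (all numbers here are invented for illustration; the actual Table 1.1 values are not used):

```python
import numpy as np

rng = np.random.default_rng(2)
n, t = 5, 5
within_slope = -1.7                       # common within-individual slope

# Cluster centers chosen so that higher average X goes with higher
# average Y, i.e. a positive between-individual trend
x_centers = np.arange(n) * 2.0
y_centers = 3.0 * x_centers

X, Y, slopes = [], [], []
for i in range(n):
    x = x_centers[i] + rng.uniform(-1, 1, t)
    y = y_centers[i] + within_slope * (x - x_centers[i]) + rng.normal(0, 0.1, t)
    slopes.append(np.polyfit(x, y, 1)[0])  # within-individual slope estimate
    X.append(x)
    Y.append(y)

marginal_slope = np.polyfit(np.concatenate(X), np.concatenate(Y), 1)[0]
# Every within-individual slope is negative, yet the marginal analysis,
# which ignores the clustering, is pulled to a positive slope.
```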
1.2 Random Effects versus Random Coefficients
There is a slight distinction between random effects models and random coefficients models. In the model discussed above, we assumed that the individual parameters, the individual deviations from the population mean, had mean zero. The model equation was given by:

    y_ij = μ + α_i + ε_ij,   i = 1, …, n,   j = 1, …, t_i        (1.8)

We could re-parameterize this equation as:

    y_ij = μ_i + ε_ij,   i = 1, …, n,   j = 1, …, t_i            (1.9)

where instead of each α_i having mean zero, the μ_i's now have mean μ. We have defined μ_i = μ + α_i. So, in Equation 1.8, we see that there are fixed effects, μ, as well as random effects, α_i and ε_ij, while in Equation 1.9, all effects are random.

In the context of multiple linear regression, the random effects model is given by:

    y_ij = Σ_{k=1}^p (β_k + B_ki) x_kij + ε_ij                   (1.10)

where the β_k's are fixed (k = 1, …, p), the B_ki's are random with mean zero and some variance-covariance structure, and the ε_ij's are random errors with mean zero and variance σ². For the case of random coefficients, the analogous model is given by:

    y_ij = Σ_{k=1}^p β_ki x_kij + ε_ij                           (1.11)

where the β_ki's are all random with mean β_k and the same variance-covariance structure as the random parameters in Equation 1.10, and the ε_ij's are random errors with mean zero and variance σ².
1.3 Extension to the Binary Outcome Case
As in continuous outcome longitudinal data, heterogeneity can exist when our outcome of interest is binary, that is, when we are measuring the presence or absence of some event. When we have clustering in the data, the standard errors of our parameter estimates differ. Therefore, we would like a method that can properly model this heterogeneity and carry out unbiased estimation and testing of these parameters. The generalized estimating equation (GEE) techniques of Liang & Zeger, which are discussed further in Part III, can model this heterogeneity. However, the GEE estimator models a marginal relationship between the outcome and explanatory variables and only corrects the standard errors of these estimates by incorporating this heterogeneity. Although these standard errors may be correct, a marginal analysis in many situations fails to give us accurate estimates of the regression parameters, as we saw in Section 1.1. The SAS procedure MIXED can also account for the heterogeneity by modeling the within-subject relationships with a random effects technique. However, the SAS procedure MIXED can only be used in the continuous outcome case. Other methods exist that model random components for binary outcome situations [12, 15, 18, 19], but they tend to be very computer-intensive. Therefore, we would like to investigate the possibility of extending the 'simple' methods of Swamy that are explored in Part II to the case where the outcome of interest is binary.
Part II
Continuous Outcome Longitudinal
Data
Chapter 2
Theory of Random Coefficients for
Multiple Linear Regression
Let us suppose that we have t repeated measurements for n individuals. The model for the ith individual is given by

    Y_i = X_i β_i + ε_i                                          (2.1)

In this model, Y_i is a t × 1 vector representing the ith individual's t repeated measurements on the response variable, X_i is a t × p matrix of t corresponding repeated measurements on p explanatory variables, β_i is a p × 1 vector of individual-specific regression coefficients corresponding to the p explanatory variables, and ε_i is a t × 1 vector of t random errors.

So, from this model, we can see that each individual has a different relationship between his or her outcome measurements and explanatory variables, defined by the p regression parameters of β_i. For the purpose of inference in this model, we need to make the following assumptions:

A1. β_i ~ MVN(β, Σ_ββ) and β_i ⊥ β_k, {i ≠ k; i, k ∈ 1, …, n}; i.e. each β_i comes from a multivariate normal population of dimension p with mean vector β and variance-covariance matrix Σ_ββ, and the β_i's are independent across individuals.

A2. ε_i ~ MVN(0, σ²I); i.e. each ε_ij, j = 1, …, t, comes from a univariate normal population with mean zero and variance σ², and all of the ε_ij are independent of each other (this is often referred to as the conditional independence assumption).

A3. β_i ⊥ ε_k, {∀ i, k ∈ 1, …, n}; i.e. each individual's vector of regression parameters, β_i, is independent of their own and all other individuals' vectors of random errors, ε_k.

A4. The X_i's are fixed and of full column rank for each i.

A5. For a fixed number of explanatory variables, p, min(n, t) > p; i.e. there are both more repeated measurements per individual and more individuals than there are explanatory variables.

A6. The elements of (X_i′X_i)⁻¹ are uniformly O(t⁻¹); i.e. there exists a finite upper bound, M, such that the elements of t(X_i′X_i)⁻¹ are less than M in absolute value for all i and t.
The first assumption describes the fundamentals of the random coefficient regression model: each individual's parameters come from an underlying multivariate Gaussian population, and these individual parameter vectors are independent of each other. The second assumption is clearly a more restrictive assumption and is not necessary for random coefficient regression in general, but for the purpose of this thesis we will use it to simplify much of the computation. We will address this assumption later in the discussion.

From assumptions A1 and A2 and Equation 2.1, we can see that, conditional on X_i,

    Y_i ~ MVN(X_i β, X_i Σ_ββ X_i′ + σ²I)                        (2.2)

Using the above model, we can now proceed with developing techniques for estimation of the unknown parameters.
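A minimal sketch of generating data satisfying Equation 2.1 and assumptions A1 and A2 follows; the dimensions, β, Σ_ββ, and the covariate distribution are arbitrary illustrative choices, not values from the thesis:

```python
import numpy as np

rng = np.random.default_rng(3)
n, t, p = 30, 8, 2
beta = np.array([1.0, 2.0])                    # population mean vector
Sigma_bb = np.array([[0.5, 0.1],
                     [0.1, 0.3]])              # Sigma_beta_beta
sigma2 = 1.0

L = np.linalg.cholesky(Sigma_bb)
X_list, beta_i, Y_list = [], [], []
for i in range(n):
    Xi = np.column_stack([np.ones(t), rng.uniform(0, 5, t)])  # intercept + covariate
    bi = beta + L @ rng.normal(size=p)         # A1: beta_i ~ MVN(beta, Sigma_bb)
    ei = rng.normal(0, np.sqrt(sigma2), t)     # A2: eps_i ~ MVN(0, sigma2 * I)
    X_list.append(Xi)
    beta_i.append(bi)
    Y_list.append(Xi @ bi + ei)                # Equation 2.1: Y_i = X_i beta_i + eps_i
```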
2.2 Estimation Methods
Commonly used estimation methods in random coefficient regression analysis are often
referred to as two-stage estimation methods. In this thesis a third stage is added to allow
the original estimates to be updated using an iterative technique.
2.2.1 First Stage Estimation (The Individual)
The first stage of estimation comes at the level of the observational unit, i.e. the individual. Each individual contributes a set of measurements on both response and explanatory variables. We use these measurements to model each individual's regression parameters separately. Since each individual follows the model given in Equation 2.1, we can use the theory of least squares to estimate those parameters. This least-squares estimator is given by:

    b_i = (X_i′X_i)⁻¹ X_i′Y_i                                    (2.3)

From assumptions A1 and A2, we can see that, conditional on X_i, we have the following results:

    E{b_i} = β                                                   (2.4)
    Var{b_i} = Σ_ββ + σ² (X_i′X_i)⁻¹                             (2.5)

where each b_i is independent of all b_k, for all i and k where i, k ∈ 1, …, n.

At this point we can also get an estimate of the within-individual component of variance, σ². The sum of squares for error for the ith individual is given by:

    ê_i′ê_i = (Y_i − X_i b_i)′(Y_i − X_i b_i)

Therefore, because it is well known that E{ê_i′ê_i/(t − p)} = σ², it follows that:

    σ̂² = (1/n) Σ_{i=1}^n ê_i′ê_i/(t − p)

is an unbiased estimate of the pooled within-individual variance.

With estimates of each individual's regression parameters, b_i, and an estimate of the within-individual variance component, σ̂², we can now move to the second stage of estimation, that of estimation over the individuals by aggregating the individual regression parameter estimates b_i.
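The first stage can be sketched as follows: fit ordinary least squares within each individual, then pool the per-individual error mean squares. The simulated data below are an illustrative stand-in (the parameter values are invented):

```python
import numpy as np

rng = np.random.default_rng(4)
n, t, p = 30, 8, 2
beta, sigma2 = np.array([1.0, 2.0]), 1.0

# Simulated RCR data: individual coefficients scattered around beta
X = [np.column_stack([np.ones(t), rng.uniform(0, 5, t)]) for _ in range(n)]
b_true = beta + rng.normal(0, 0.5, (n, p))
Y = [X[i] @ b_true[i] + rng.normal(0, np.sqrt(sigma2), t) for i in range(n)]

# First stage: b_i = (X_i'X_i)^{-1} X_i'Y_i for each individual
b = np.array([np.linalg.solve(X[i].T @ X[i], X[i].T @ Y[i]) for i in range(n)])

# Pooled within-individual variance: average of SSE_i/(t - p)
sse = np.array([((Y[i] - X[i] @ b[i]) ** 2).sum() for i in range(n)])
sigma2_hat = (sse / (t - p)).mean()
```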
2.2.2 Second Stage Estimation (The Aggregate)
Unweighted Least-Squares Estimator

Gumpertz & Pantula [8] define the following unweighted least-squares (ULS) estimator as the unweighted average of the individual regression parameter estimates:

    β̂_ULS = (1/n) Σ_{i=1}^n b_i                                 (2.6)

It is important to note that this estimator does not depend on the variance components in the model. This property makes it a simple way of estimating the aggregated regression parameter without the need for estimating these variance components. However, for the purpose of inference, the variance components may be needed in calculating the degrees of freedom, as shown below in Section 2.3.
Weighted Least-Squares Estimator
For the case where σ² and Σ_ββ are known, Swamy [16, 17] showed that the generalized least-squares estimator given by

    β̂_GLS = (Σ_{i=1}^n W_i)⁻¹ (Σ_{i=1}^n W_i b_i)               (2.7)

is the Best Linear Unbiased Estimator (BLUE) for β, where W_i is the inverse of the variance of the regression parameter estimate for the ith individual, namely, W_i⁻¹ = Var{b_i} = Σ_ββ + σ² (X_i′X_i)⁻¹ (see Equation 2.5).

Since σ² and Σ_ββ are seldom known, Swamy [16, 17] proposed the estimated generalized least-squares (EGLS) or weighted least-squares (WLS) estimator, similar to β̂_GLS but with estimates of W_i. For this thesis, we will refer to it as the WLS estimator; it is given by:

    β̂_WLS = (Σ_{i=1}^n Ŵ_i)⁻¹ (Σ_{i=1}^n Ŵ_i b_i)               (2.8)

where Ŵ_i⁻¹ = Σ̂_ββ + σ̂² (X_i′X_i)⁻¹. A more commonly known version of the generalized least-squares estimator was put forth by Henderson [10] and improved upon by Harville [9] and Laird & Ware [11]. It is given by:

    β̂ = (Σ_{i=1}^n X_i′ W_yi X_i)⁻¹ (Σ_{i=1}^n X_i′ W_yi Y_i)

where

    W_yi⁻¹ = Var{Y_i} = X_i Σ_ββ X_i′ + σ²I   (see Equation 2.2)

In Appendix A it is shown that the Henderson and Swamy versions of the generalized least-squares estimator are, in fact, identical under the assumption of conditional independence.

Therefore, our two working estimators for β (β̂_ULS and β̂_WLS) are identical except for their averaging weights: β̂_ULS uses equal weights 1/n (i.e. unweighted) and β̂_WLS uses weights given by the inverse of the variance of the regression parameter estimates.
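The claimed equivalence of the Swamy and Henderson forms (proved in Appendix A) can be checked numerically with known σ² and Σ_ββ: on any simulated dataset (the one below is an illustrative fabrication) the two formulas agree to machine precision:

```python
import numpy as np

rng = np.random.default_rng(5)
n, t, p = 10, 6, 2
Sigma_bb = np.array([[0.5, 0.1],
                     [0.1, 0.3]])
sigma2 = 1.0
L = np.linalg.cholesky(Sigma_bb)

X = [np.column_stack([np.ones(t), rng.uniform(0, 5, t)]) for _ in range(n)]
Y = [Xi @ (np.array([1.0, 2.0]) + L @ rng.normal(size=p))
     + rng.normal(0, np.sqrt(sigma2), t) for Xi in X]

# Swamy form: weighted average of b_i with W_i = [Sigma_bb + sigma2 (X_i'X_i)^{-1}]^{-1}
b = [np.linalg.solve(Xi.T @ Xi, Xi.T @ Yi) for Xi, Yi in zip(X, Y)]
W = [np.linalg.inv(Sigma_bb + sigma2 * np.linalg.inv(Xi.T @ Xi)) for Xi in X]
swamy = np.linalg.solve(sum(W), sum(Wi @ bi for Wi, bi in zip(W, b)))

# Henderson form: GLS on Y_i with Var{Y_i} = X_i Sigma_bb X_i' + sigma2 I
Vinv = [np.linalg.inv(Xi @ Sigma_bb @ Xi.T + sigma2 * np.eye(t)) for Xi in X]
A = sum(Xi.T @ Vi @ Xi for Xi, Vi in zip(X, Vinv))
c = sum(Xi.T @ Vi @ Yi for Xi, Vi, Yi in zip(X, Vinv, Y))
henderson = np.linalg.solve(A, c)
```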
Since we have already defined our estimate of the within-individual variance, σ̂², we need an estimate of Σ_ββ in order to proceed with the weighted least-squares estimator.

Let the statistic S_bb be given by:

    S_bb = Σ_{i=1}^n (b_i − β̂_ULS)(b_i − β̂_ULS)′

It has been reported [16, 17, 8] and can be proven (refer to Appendix B.1) that:

    E{S_bb} = (n − 1) Σ_ββ + (1 − 1/n) σ² Σ_{i=1}^n (X_i′X_i)⁻¹

and, using the fact that β̂_ULS = (1/n) Σ_{i=1}^n b_i, the method of moments estimator of the variance-covariance matrix of the random coefficients is given by:

    Σ̂_ββ = S_bb/(n − 1) − (σ̂²/n) Σ_{i=1}^n (X_i′X_i)⁻¹

Since we are calculating the difference of two matrices, there is no guarantee that the resulting matrix will be positive-definite. Thus, it is possible that we may not get a positive-definite estimate for our variance-covariance matrix of the random coefficients. To safeguard against this, Carter & Yang [3] proposed a modification that produces a corrected estimator, defined through the quantity D = Σ_{i=1}^n t + Trace{σ̂² Σ_{i=1}^n (X_i′X_i)⁻¹} and the smallest root θ̂ of an associated equation (see [3] for the details). It is this corrected estimate of the parameter variance that will be used in all subsequent estimation techniques for Part II.
Since we now have estimates of the variance components in the model (σ̂² and Σ̂_ββ) as well as the individual regression parameter estimates (b_i), we are in a position to calculate the weighted least-squares estimator given in Equation 2.8.

As shown by Equation B.2 in Appendix B.1, Var{β̂_ULS} = E{S_bb}/[n(n − 1)]; therefore,

    Vâr{β̂_ULS} = S_bb/[n(n − 1)]

is an estimate for the variance of the unweighted least-squares estimator. For the Swamy estimator, we have:

    Var{β̂_GLS} = (Σ_{i=1}^n W_i)⁻¹

Therefore,

    Vâr{β̂_WLS} = (Σ_{i=1}^n Ŵ_i)⁻¹

is an estimate for the variance of the weighted least-squares estimator.
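Putting the second stage together, a sketch of the ULS and WLS estimators with their variance estimates follows (simulated data with invented parameter values). Where Σ̂_ββ fails to be positive-definite the code simply floors its eigenvalues at zero; this is a crude stand-in for the Carter & Yang correction, not the corrected estimator itself:

```python
import numpy as np

rng = np.random.default_rng(7)
n, t, p = 40, 10, 2
beta, sigma2 = np.array([1.0, 2.0]), 1.0
Sigma_bb = np.array([[0.5, 0.1],
                     [0.1, 0.3]])
L = np.linalg.cholesky(Sigma_bb)

X = [np.column_stack([np.ones(t), rng.uniform(0, 5, t)]) for _ in range(n)]
Y = [Xi @ (beta + L @ rng.normal(size=p)) + rng.normal(0, 1.0, t) for Xi in X]

# First stage
b = np.array([np.linalg.solve(Xi.T @ Xi, Xi.T @ Yi) for Xi, Yi in zip(X, Y)])
sigma2_hat = np.mean([((Yi - Xi @ bi) ** 2).sum() / (t - p)
                      for Xi, Yi, bi in zip(X, Y, b)])

# Unweighted least-squares estimator and its variance estimate S_bb/[n(n-1)]
b_uls = b.mean(axis=0)
S_bb = sum(np.outer(d, d) for d in b - b_uls)
var_uls = S_bb / (n * (n - 1))

# Moment estimator of Sigma_bb, eigenvalues floored at zero if the
# matrix difference is not positive-definite (stand-in for Carter & Yang)
G = [np.linalg.inv(Xi.T @ Xi) for Xi in X]
Sigma_hat = S_bb / (n - 1) - (sigma2_hat / n) * sum(G)
w_eig, V = np.linalg.eigh(Sigma_hat)
Sigma_hat = (V * np.maximum(w_eig, 0)) @ V.T

# Swamy's weighted least-squares estimator and its variance estimate
W = [np.linalg.inv(Sigma_hat + sigma2_hat * Gi) for Gi in G]
b_wls = np.linalg.solve(sum(W), sum(Wi @ bi for Wi, bi in zip(W, b)))
var_wls = np.linalg.inv(sum(W))
```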
2.2.3 Third Stage Estimation (The Updating Procedure)
Are we able to 'borrow' information across individuals to improve our individual regression parameter estimates? Can we re-calculate our sums of squares for error by weighting them by the reciprocals of their variances? Or, can we update our estimates of the weights Wᵢ? Symbolically, these may be stated as follows:
where Wᵢ⁻¹ = Var{yᵢ} = Xᵢ Σ_ββ Xᵢ′ + σ²I.
These conjectures are explored one at a time and the results are given in Appendix C.
From Appendix C.1 we can see that updating our individual parameter estimates by borrowing information across individuals gives us exactly the same parameter estimate. Therefore, we cannot improve upon the individual's estimate by using information from the entire sample. Also, from Appendix C.2 we see that we cannot re-calculate our sum of squares for error for an individual using the aggregate group variance as a weight to give us an improved estimate. In both Conjectures 1 and 2, we clearly see that using information from the whole sample does not give us more information about the individual. This is consistent with our random coefficient regression assumptions, namely, that individuals behave differently and independently from each other. However, from Conjecture 3 and Appendix C.3, it is evident that S̃_bb is clearly different from S_bb, but their expectations are approximately equal. So, we have now identified a means of updating our estimate of the regression parameter variance which will, in turn, update our estimate of the aggregate regression parameters.
Iterated Weighted Least-Squares Estimator
Therefore our iterated weighted least-squares estimator, β̂*, is given by:
where,
Thus, we iterate through estimates for β̂* until convergence is achieved. At that point we call the value of β̂* at convergence the iterated weighted least-squares (IWLS) estimator. For the purpose of inference, its variance is given by:
where W̃ᵢ is the weight matrix calculated using the updated variance estimate.
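The structure of the updating procedure can be sketched for a single coefficient. The variance update below is a schematic stand-in for the S̃_bb-based matrix formulas, not an exact transcription:

```python
def iwls(b, v, tol=1e-8, max_iter=100):
    """Schematic scalar IWLS: alternate between a weighted aggregate
    estimate and an updated between-individual variance until the
    aggregate estimate stabilises.  b: individual estimates,
    v: their sampling variances."""
    n = len(b)
    beta = sum(b) / n
    # initial moment estimate of the between-individual variance
    s2b = max(sum((bi - beta) ** 2 for bi in b) / (n - 1) - sum(v) / n, 0.0)
    for _ in range(max_iter):
        w = [1.0 / (s2b + vi) for vi in v]
        new_beta = sum(wi * bi for wi, bi in zip(w, b)) / sum(w)
        # re-estimate the between-individual variance about the new aggregate
        s2b = max(sum((bi - new_beta) ** 2 for bi in b) / (n - 1) - sum(v) / n, 0.0)
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    return beta

print(iwls([1.8, 2.1, 2.4, 1.5], [0.10, 0.40, 0.40, 0.10]))
```

With these illustrative numbers the loop settles after a couple of passes, consistent with the fast convergence reported for the IWLS estimator later in the thesis.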
2.3 Inference
For the unweighted least-squares estimator defined in Equation 2.6, with its variance given by Equation 2.15, we define the statistic:
for testing H₀ : Lβ_ULS = λ₀, where L is a q × p matrix of linearly independent rows.
Gumpertz & Pantula [8] proved for this statistic, T²_ULS, that:
1. For fixed n and t tending to infinity, a scaled T²_ULS is distributed as F(q, n − q).
2. For fixed t and n tending to infinity, T²_ULS is distributed as χ²_q.
3. For n·t large and q = 1, T²_ULS is distributed as F(1, ν) ≡ T²(ν) (cf. Satterthwaite on Page 6), where
and L = l′.
For the weighted least-squares estimator defined in Equation 2.8, with its variance given by Equation 2.17, we define the statistic:
for testing H₀ : Lβ_WLS = λ₀, where L is a q × p matrix of linearly independent rows.
Swamy [16, 17] and Carter & Yang [3] proved for this statistic, T²_WLS, that:
1. For fixed n and t tending to infinity, a scaled T²_WLS is distributed as F(q, n − q).
2. For fixed t and n tending to infinity, T²_WLS is distributed as χ²_q.
3. For n·t large and q = 1, T²_WLS is distributed as F(1, ν) ≡ T²(ν) (cf. Satterthwaite on Page 6), where
ν = (l′Σ̂_ββ l + σ̂² l′Cl)² / {(n − 1)⁻¹ (l′Σ̂_ββ l)² + [t²(nt − np)]⁻¹ σ̂⁴ (l′Cl)²}
Now, using the results of Swamy and Carter & Yang, we extend these findings to our iterated estimator given in Equation 2.21, with its variance given by Equation 2.25, and define the statistic:
for testing H₀ : Lβ_IWLS = λ₀, where L is a q × p matrix of linearly independent rows. We will assume for this statistic, T²_IWLS, that:
1. For fixed n and t tending to infinity, a scaled T²_IWLS is distributed as F(q, n − q).
2. For fixed t and n tending to infinity, T²_IWLS is distributed as χ²_q.
3. For n·t large and q = 1, T²_IWLS is distributed as F(1, ν) ≡ T²(ν) (cf. Satterthwaite on Page 6), where
It is these test statistics that will be used in all subsequent analyses of continuous outcome data.
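For q = 1 each of the statistics above reduces to a squared standardized difference referred to an F or chi-square distribution. A minimal sketch, using the chi-square reference for illustration:

```python
def wald_T2(beta_hat, beta0, var_hat):
    """Squared standardized difference between the estimate and the
    hypothesised value, i.e. the q = 1 form of the test statistics."""
    return (beta_hat - beta0) ** 2 / var_hat

t2 = wald_T2(1.95, 0.0, 0.25)      # illustrative estimate and variance
print(t2, t2 > 3.84)               # 3.84: upper 5% point of chi-square, 1 df
```

Which reference distribution (F, chi-square, or the Satterthwaite T²(ν)) is appropriate depends on the sizes of n and t, as the three cases above describe.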
In the next chapter, we compare the statistical properties of bias and power for the three estimators, β̂_ULS, β̂_WLS, and β̂_IWLS, as well as two other well-known estimators: the Henderson estimator (Equation 2.9), calculated via the SAS procedure MIXED and referred to as the MIX estimator, and the ordinary least-squares estimator, which ignores all assumptions of random coefficient regression modeling and treats every observation equally and independently, calculated via the SAS procedure REG and referred to as the REG estimator.
Chapter 3
Continuous Outcome Longitudinal Data Simulation
3.1 The Simulation
Let us consider the model given by Equation 2.1 where we set p = 2; that is, we have only one covariate and we are interested in estimating an intercept and a slope. Therefore, the model for the jth of T observations for the ith of N individuals is given by:
where
• βᵢ = (βᵢ₀, βᵢ₁)′ ~ NID(β, Σ_ββ), where β = (β₀, β₁)′ takes on the values of (0,0)′ or (1,2)′ and γ (Scale) takes on the values of 1 or 5.
• Xᵢⱼ = (1, xᵢⱼ)′, where xᵢⱼ is NID with variance σ²ₓ, and σ²ₓ (Sigmax2) takes on the values of 0.1 or 1.0.
• eᵢⱼ ~ NID(0, σ²) and σ² (Sigma2) takes on the values of 1 or 5.
• The quantities N and T take on the values of 10 or 50.
Therefore, for this simulation we have a 2⁶ factorial design, or equivalently, for each level of the true regression parameters, β = (0,0)′ or β = (1,2)′, we have a 2⁵ factorial design.
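The design above can be sketched as a data generator. The zero-mean covariate and the independent random intercept and slope, each with variance γ, are simplifying assumptions for illustration, since the exact covariance structure is not fully recoverable here:

```python
import random

def simulate(N=10, T=10, beta=(0.0, 0.0), gamma=1.0,
             sigmax2=0.1, sigma2=1.0, seed=1):
    """One simulated dataset: N subjects, T observations each, with a
    random intercept and slope per subject.  Returns (subject, x, y)
    triples."""
    rng = random.Random(seed)
    data = []
    for i in range(N):
        b0 = rng.gauss(beta[0], gamma ** 0.5)   # random intercept
        b1 = rng.gauss(beta[1], gamma ** 0.5)   # random slope
        for j in range(T):
            x = rng.gauss(0.0, sigmax2 ** 0.5)  # covariate
            y = b0 + b1 * x + rng.gauss(0.0, sigma2 ** 0.5)
            data.append((i, x, y))
    return data

data = simulate(N=10, T=10)
print(len(data))  # 100 rows: 10 subjects x 10 measurements
```

Varying beta, gamma, sigmax2, sigma2, N, and T over their two levels reproduces the 2⁶ factorial layout.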
For each of these levels of the factorial design we generated 100 random trials. Each trial consisted of N subjects with T measurements per subject. At each trial, the three random coefficient methods (i.e., ULS, WLS, and IWLS), the Henderson random effects technique of the SAS procedure MIXED (MIX), and the ordinary least-squares estimator of the SAS procedure REG (REG) were used to analyze the simulated dataset.
The estimates of the regression parameters, their estimated standard errors using formulae 2.15, 2.17, and 2.25, and the number of times the null hypothesis (H₀ : β = (0,0)′) was rejected (based on a 5% level of significance) were all recorded for each method on each simulated dataset for the slope only. For the random coefficient methods, rejection numbers were collected based on all three distributions of the test statistics for each of the three methods given by Equations 2.26, 2.27, and 2.28.
Analysis of variance techniques (and logistic regression for the proportions) were used to analyze the results of the simulation. As was anticipated, the Type I error rate of the REG estimator was much larger than that of the other four test statistics. Therefore, in our analysis of the simulation results we created a classification variable, METHOD, that included only the ULS, WLS, IWLS, and MIX estimators. The rejection rate of the slope under the null hypothesis is plotted in Figure 3.0. It is clear that an analysis that ignores all heterogeneity in the data (REG) is inadequate for properly analyzing data of this form. Therefore, the rest of the figures presented below, as well as all statistical analyses, include only those four estimators, which are the outcomes of the METHOD variable.
In our analyses of variance, a full interaction term model was fit for the outcome variables of mean slope, relative variance (see below) of slope, and rejection rate of slope, where our sample was based on approximately 12800 (= 2⁵ × 4 × 100) observations. In
Figure 3.0: Plot of Rejection Rate of Slope under the null hypothesis for all five estimators.
these models, only the main effects and the second order interactions with METHOD were of interest. In the remaining cases, where the outcome of interest depended on factorial-level summary measures and our sample was based on 128 (= 2⁵ × 4) observations, we used a generalized F-test and determined that a fifth order interaction term was statistically significant. This model was fit so that all F-tests of the main effects and their interactions with METHOD were based on a mean square error from this high order model. In all instances where logistic regression was used to analyze the proportion of test rejections, 0.5 was added to each cell to prevent empty cell counts from giving uninformative analysis of variance parameters (refer to Gart & Zweifel [7]).
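The 0.5 correction amounts to forming empirical logits from shifted cell counts, which stay finite even when a cell is empty. A minimal sketch:

```python
import math

def empirical_logit(rejections, trials):
    """Log-odds with 0.5 added to each cell, so an empty cell still
    yields a finite value."""
    return math.log((rejections + 0.5) / (trials - rejections + 0.5))

print(empirical_logit(50, 100))  # 0.0: even split
print(empirical_logit(0, 100))   # finite, rather than minus infinity
```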
Each of the four methods defined by the METHOD variable has associated with it a formula for the estimate of the variance of the estimated slope. An example of such a formula is that given in Equation 2.15 for the unweighted estimator (ULS) and Equation 2.17 for the weighted estimator (WLS). In practice we would like to know whether these calculated variance estimates, V_F, tend to have positive or negative bias. Our estimate of the true variance (empirical variance), V_T, is taken to be the variance of the 100 simulated estimates of the intercept and slope. Because this estimate is our proxy for the true variance, it would have been preferable if it had been based on a much larger number of simulations. Therefore, we can calculate for each of the 32 cells of the factorial design and for each method the average of the 100 calculated variances using the
Table 3.1: Definition of the Factors used in the Simulation Study

F1. Hypothesis: Null, β = (0,0)′ / Alternative, β = (1,2)′
F2. Size of variance of regression parameters (γ): LOW / HIGH
F3. Size of variance of the explanatory variable X (σ²ₓ): LOW / HIGH
F4. Size of the variance of the within-individual error (σ²): LOW / HIGH
F5. Number of clusters or individuals (N): LOW / HIGH
F6. Number of repeated measurements per individual (T): LOW / HIGH
appropriate formulae to obtain an estimate of V_F. As well, we calculated the empirical variance based on the actual variance of the 100 estimates across the 100 simulations for each method and used it as our estimate of the true variance V_T. We then calculated the difference between these values (V_F − V_T) divided by the estimate of the true variance V_T. If this 'variance ratio' is approximately equal to zero, we would conclude that the formulae used to calculate the variance of the slope lead to estimates of the variance that are close to the true value as defined by the simulation study. Our 'mean square error' was calculated as the sum of the empirical variance of the estimate and the square of the difference between the average parameter estimate and the true value.
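The variance ratio and mean square error summaries described above amount to the following, sketched with the population variance standing in for the empirical variance of the simulated estimates:

```python
from statistics import mean, pvariance

def variance_ratio(formula_vars, estimates):
    """VR = (VF - VT) / VT: average formula-based variance relative to
    the empirical variance of the simulated estimates."""
    vf = mean(formula_vars)
    vt = pvariance(estimates)
    return (vf - vt) / vt

def mean_square_error(estimates, true_value):
    """Empirical variance plus squared bias, as defined in the text."""
    return pvariance(estimates) + (mean(estimates) - true_value) ** 2

print(variance_ratio([2.0, 2.0], [1.0, 3.0]))  # (2 - 1) / 1 = 1.0
print(mean_square_error([1.0, 3.0], 1.0))      # 1.0 + 1.0 = 2.0
```

A variance ratio near zero indicates that the variance formula tracks the empirical variance; a positive value indicates positive bias.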
The results of the simulation study are summarized in Figures 3.1 to 3.10. In our analysis of variance we were able to estimate the main effects for each of the four estimation methods, first for the null hypothesis (β = (0,0)′), Figures 3.1 to 3.5, and then for the alternative hypothesis (β = (1,2)′), Figures 3.6 to 3.10. The six factors, as previously described on Page 22, are summarized in Table 3.1.
Each figure has six plots. The top left plot summarizes the results over all 3200 (= 2⁵ × 100) simulations. Each of the remaining five plots gives the results for the low
Table 3.2: Tests used for the different levels of N and T

         T = 10    T = 50
N = 10   T²(ν)     F(q, n − q)
N = 50   χ²        χ²
and high values for the five factors F2 to F6 defined in Table 3.1. This permits visual inspection of the presence of a two-factor interaction of the factor with the variable METHOD.
In Figures 3.1 and 3.6 are reported the mean values of the estimates of the slope of the model. In Figures 3.2 and 3.7 are reported the means for the variance ratio, i.e., the difference of the formula-derived (V_F) and simulated empirical (V_T) variances expressed as a proportion of the simulated empirical variance for the slope. In Figures 3.3 and 3.8 are reported the empirical variance of the slope. In Figures 3.4 and 3.9 are reported the mean square error of the slope. In Figures 3.5 and 3.10 are reported the rejection rates for the null and alternative hypotheses for the slope.
For the plots of rejection rates given in Figures 3.5 and 3.10, we used the most appropriate test among those defined by Equations 2.26, 2.27, and 2.28, for each of the three random coefficient methods ULS, WLS, and IWLS, respectively. For the 2 × 2 cases of possibilities for factors F5 and F6, we summarize the tests that were employed in Table 3.2. In this table, the distributions given in each cell refer to those given in Section 2.3.
Each plot has one or more horizontal lines that define important values. For example, in Figure 3.1 the line is equal to zero, which is the true value of the slope under the null hypothesis. In Figure 3.5, for example, the line is equal to 0.05, the expected rejection rate under the null hypothesis for an unbiased test.
Consider the top right plot in Figure 3.1. The two lines relate to whether factor F4, the variance of the within-individual error, is high (the top line) or low (the bottom line). For all figures we have adopted the convention of using a solid line for the higher value of
the factor and a dotted line for the lower value of the factor.
Two p values are reported at the top of each plot. The first p value is labeled P(m) for main effect. It is the p value associated with the test of whether the average values associated with the dotted and solid lines are significantly different from each other. The second p value, P(i), reported in each plot relates to the test of the interaction between the METHOD and FACTOR variables. For example, if we look at the middle right plot in Figure 3.1, we notice that the average value of the 1600 simulated estimated slopes is approximately equal to -0.005 for high variance of the explanatory variable X, and it is also equal to -0.005 for the ULS procedure but equal to -0.010 for the other three procedures for low variance of the explanatory variable X (factor F3). Here the p value associated with the test of the interaction term METHOD * F3 is equal to 0.996. Therefore, we have no reason to interpret the apparent difference in the ULS means of the estimated slopes for the low and high variance situations as statistically different from the other three procedures. Only if the reported p value is low, say < 0.05, in a plot, as is the case, for example, in the top right of Figure 3.1, where the p value for the main effect F4 (σ²) is equal to 0.0001, should we consider the difference as not being due to chance.
For the top left plot of each figure, the reported p value simply indicates the overall p value associated with the F-test of the comparison of mean values or the χ² test of the comparison of rejection rates.
There were nine cases under the null hypothesis (β = (0,0)′) and nine cases under the alternative hypothesis (β = (1,2)′) where the SAS procedure MIXED failed to converge. In all simulations, the IWLS estimator converged in under 8 iterations. Table 3.3 summarizes the non-convergence situations for both the null and alternative hypotheses.
The results of the other three estimators for the 18 simulated datasets for which MIX did not converge were kept in the statistical analyses and graphical summaries and are displayed in Table 3.4. There are advantages and disadvantages in doing this. If the lack of convergence of the MIX procedure occurred in situations where lack of convergence were appropriate, that is, for aberrant datasets, then the lack of convergence would be considered an advantage of the MIX procedure. If the estimates obtained by the other three methods were quite different from their true values in these 18 situations, then keeping these estimates in the analyses would give an advantage to the MIX procedure. However, if the estimates obtained with the other three estimators were close to the true values, then keeping these 18 datasets in the analyses would demonstrate the strength of these other methods. The values of the estimates for all three methods for these 18 datasets are given in Table 3.4 and demonstrate that in almost all of the cases, the three estimators (ULS, WLS, and IWLS) give estimates close to the true values. The average of the estimates of the slope under the null hypothesis for the ULS estimator is 0.31, for the WLS estimator is 0.30, and for the IWLS estimator is 0.30. Under the alternative hypothesis the average for the ULS estimator is 2.09, for WLS 2.20, and for IWLS 2.19. This implies that these estimators are accurate even when the MIX estimator fails to converge.
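The quoted averages can be checked directly from the slope estimates in Table 3.4 (values as transcribed from the table; the first nine rows are the null-hypothesis cases, the last nine the alternative cases):

```python
from statistics import mean

uls_null  = [-.48, .135, 1.87, .261, .581, -.33, -.38, .374, .791]
wls_null  = [-.17, .140, 1.48, .303, .683, -1.1, .645, .364, .331]
iwls_null = [-.17, .140, 1.48, .303, .683, -1.1, .645, .364, .330]
uls_alt   = [2.39, 2.66, 2.74, 3.63, 2.23, 1.06, .889, 1.86, 1.32]
wls_alt   = [2.34, 2.56, 3.15, 3.34, 1.92, 1.27, 1.13, 2.64, 1.41]
iwls_alt  = [2.34, 2.56, 3.15, 3.34, 1.92, 1.27, 1.13, 2.63, 1.41]

print(f"{mean(uls_null):.2f} {mean(wls_null):.2f} {mean(iwls_null):.2f}")  # 0.31 0.30 0.30
print(f"{mean(uls_alt):.2f} {mean(wls_alt):.2f} {mean(iwls_alt):.2f}")    # 2.09 2.20 2.19
```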
Table 3.3: Situations where the MIX estimator failed to converge.

β = (0,0)′                       β = (1,2)′
γ   σ²  σ²ₓ  N   T   number      γ   σ²  σ²ₓ  N   T   number
1   1   0.1  10  10              1   1   0.1  10  10
Table 3.4: Estimates of the slope for the remaining estimators when the MIX estimator failed to converge.

β = (0,0)′              β = (1,2)′
ULS    WLS    IWLS      ULS    WLS    IWLS
-.48   -.17   -.17      2.39   2.34   2.34
.135   .140   .140      2.66   2.56   2.56
1.87   1.48   1.48      2.74   3.15   3.15
.261   .303   .303      3.63   3.34   3.34
.581   .683   .683      2.23   1.92   1.92
-.33   -1.1   -1.1      1.06   1.27   1.27
-.38   .645   .645      .889   1.13   1.13
.374   .364   .364      1.86   2.64   2.63
.791   .331   .330      1.32   1.41   1.41
Figure 3.1: Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the null hypothesis.
Figure 3.2: Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.
* Variance Ratio (VR) is given by: VR = (V_F − V_T)/V_T; see text, Page 25.
Figure 3.3: Plots of Empirical Variance (V_T) of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.
Figure 3.4: Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.
Figure 3.5: Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the null hypothesis.
Figure 3.6: Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
Figure 3.7: Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
* Variance Ratio (VR) is given by: VR = (V_F − V_T)/V_T; see text, Page 25.
Figure 3.8: Plots of Empirical Variance (V_T) of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
Figure 3.9: Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
Figure 3.10: Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
3.2 The Results
3.2.1 Estimation
From Figures 3.1 and 3.6, we can determine the accuracy of the estimation of the slope. Under the null hypothesis, in Figure 3.1, we see that the ULS estimator is slightly closer to the true value of zero; however, this difference is not statistically significant. We also notice that the estimation of the slope parameter improves when the size of the variance of the within-individual error is high. Under the alternative hypothesis, in Figure 3.6, we see that all four estimators are around the same value of 1.987, which is very close to the true value of 2.0. The estimation of the slope is improved when the size of the variance of the regression parameters is low and the number of individuals is high.
In Figures 3.2 and 3.7 are shown the variance ratios of the estimators. Under the null hypothesis, in Figure 3.2, we see that the MIX estimator has a slightly lower variance ratio than the other three estimators; however, this difference is not statistically significant. The variance ratio of all four estimators seems to decrease when the size of the variance of the regression parameters is low, the size of the variance of the within-individual error is high, the size of the variance of the explanatory variable X is high, the number of individuals is high, and the number of repeated measurements per individual is high. Under the alternative hypothesis, in Figure 3.7, we see that the variance ratio of the ULS estimator is slightly lower than that of the other three estimators; as under the null hypothesis, this difference is not statistically significant. The variance ratio is lower when the size of the variance of the within-individual error is low, the size of the variance of the explanatory variable X is low, the number of individuals is low, and the number of repeated measurements per individual is low. In the case where the number of repeated measurements per individual is low, the ULS estimator has an extremely low variance ratio, although this difference is not statistically significant.
In Figures 3.3 and 3.8 are shown the empirical variances of the estimators. Under the null hypothesis, in Figure 3.3, we see that the empirical variance is almost identical for all four estimators and that the empirical variance is lower when the size of the variance of the regression parameters is low, the size of the variance of the within-individual error is low, the size of the variance of the explanatory variable X is high, the number of individuals is high, and the number of repeated measurements per individual is high. Under the alternative hypothesis, in Figure 3.8, we see that the empirical variance of the ULS estimator is slightly larger than it is for the WLS, IWLS, and MIX estimators. The empirical variance is lower, as under the null hypothesis, when the size of the variance of the regression parameters is low, the size of the variance of the within-individual error is low, the size of the variance of the explanatory variable X is high, the number of individuals is high, and the number of repeated measurements per individual is high. The empirical variance of the ULS estimator is much larger than the empirical variance of the WLS, IWLS, or MIX estimators when the size of the variance of the within-individual error is high, the size of the variance of the explanatory variable X is low, and the number of repeated measurements per individual is low.
The mean square errors are shown in Figures 3.4 and 3.9. For both the null and alternative hypotheses, the mean square errors of the estimators follow exactly the same patterns as the empirical variances of the estimators shown in Figures 3.3 and 3.8, respectively, and described above. This occurs because the bias in the estimation of the slope is negligible when added to the empirical variance in the calculation of the mean square error.
3.2.2 Type I Error Rates and Statistical Power
In Figure 3.5 the Type I error rates, the proportion of times that the null hypothesis is rejected when it is true, are shown. We can see that the rejection rates of the IWLS, and to a smaller extent MIX, estimators are much closer to the true value, 0.05, than those of the ULS or WLS estimators. The rejection rates become closer to the true value when the number of individuals is high. In the case where the number of individuals is low (N = 10), the rejection rate of the ULS estimator is around 0.072, the WLS estimator around 0.070, the IWLS estimator around 0.050, and the MIX estimator around 0.047. This implies that when the sample size is low, the IWLS and MIX estimators are much more accurate than the ULS or WLS estimators in terms of test bias.
From Figure 3.10 we can determine the proportion of times that the null hypothesis is rejected when it is known to be false. This quantity is called the power. In our simulation, we consider the situation where β = (1,2)′ and we see that the power of all four estimators is statistically equivalent; however, the power of the ULS and WLS estimators is slightly larger than that of the IWLS and MIX estimators, although this effect is not significant. The power of the estimators is larger when the size of the variance of the regression parameters is low, the size of the variance of the within-individual error is low, the size of the variance of the explanatory variable X is high, the number of individuals is high, and the number of repeated measurements per individual is high. The largest increases in power occur when the size of the variance of the regression parameters is low and the number of individuals is high. In these cases, the power of all four estimators increases from around 0.7 to very close to 1.0, where all estimators have almost identical rejection rates.
In Appendix D are reported the analysis of variance tables for all main effects and first order interaction terms with the variable METHOD, as well as the summarized data for each of the 32 levels of the factorial design.
Chapter 4
An Application: Environmental Health Among Asthmatics in the City of Windsor
To illustrate an application of the techniques discussed above, we used data from a study of asthmatics in the City of Windsor carried out by the Gage Research Institute. The objective of the study was to estimate the relationship between daily mean concentrations of air pollutants and several indicators of respiratory health in a group of asthmatics. The study was based on a sample of 39 asthmatics who reside in Windsor, Ontario, aged 12 years and older. These asthmatics were classified as such if they had an ongoing need for asthma medication.
For each of the 39 participants, data was collected on 21 consecutive days. Peak flow rates were collected as measures of respiratory status. Each individual recorded the best of three peak flow rates each morning and at bedtime, before the use of asthma medications. For each subject, environmental data was collected by a network of six Ontario Ministry of the Environment fixed-site monitoring stations. The estimate of each subject's exposure was based on pollution readings obtained from the monitoring station closest to his home. This environmental data consisted of a variety of substances, such as ozone, sulfur dioxide (SO₂), total reduced sulfur (TRS), and nitrogen dioxide (NO₂), routinely monitored by the Ministry of the Environment. The measure of pollution used here was based on the mean of hourly readings between 8 AM and 8 PM.
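The exposure summary described above amounts to averaging the hourly readings over the daytime window. A sketch, assuming hours are labeled 0 to 23 and the window includes both endpoints:

```python
def daytime_mean(hourly):
    """Mean of the readings between 8 AM and 8 PM inclusive.
    hourly: dict mapping hour-of-day (0-23) to a pollutant reading."""
    window = [hourly[h] for h in range(8, 21) if h in hourly]
    return sum(window) / len(window)

# Readings equal to the hour number, so the 8 AM - 8 PM mean is 14.0
print(daytime_mean({h: float(h) for h in range(24)}))  # 14.0
```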
For the purpose of this analysis, we will try to estimate the relationship between an individual's evening peak flow rates and the corresponding day's average measurements of sulfur dioxide and total reduced sulfur. Since it was assumed that each of the 39 individuals could have a different relationship between their respiratory outcome measurement and their corresponding environmental data, a random components model was used. For this analysis, we compare the results of the five different techniques discussed in Chapter 2. They are the unweighted least-squares estimator (ULS), the weighted least-squares estimator (WLS), the iterated weighted least-squares estimator (IWLS), the Henderson estimator calculated via the SAS procedure MIXED (MIX), and the ordinary least-squares estimator calculated via the SAS procedure REG (REG). We have also included the generalized estimating equation (GEE) estimator of Liang & Zeger. This estimator, which is discussed further in Part III, uses a marginal analysis for the parameter estimates, equal to the REG estimator, but calculates robust standard errors which account for clustering in the data. The results of this analysis are presented in Table 4.1, where β₀ is the intercept, β₁ is the slope for SO₂, and β₂ is the slope for TRS.
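The idea behind the GEE robust standard error can be sketched in the simplest possible case, the mean of clustered data. This is a sandwich-variance sketch, not the Liang & Zeger estimator itself:

```python
from statistics import mean, pstdev

def robust_se_of_mean(clusters):
    """Cluster-robust (sandwich) standard error for a simple mean.
    clusters: list of lists of observations, one inner list per subject."""
    allobs = [y for c in clusters for y in c]
    m = mean(allobs)
    n = len(allobs)
    # sum over clusters of the squared cluster-level score sum
    s = sum(sum(y - m for y in c) ** 2 for c in clusters)
    return s ** 0.5 / n

def naive_se_of_mean(clusters):
    """Standard error that ignores the clustering."""
    allobs = [y for c in clusters for y in c]
    return pstdev(allobs) / len(allobs) ** 0.5

clusters = [[1.0, 1.0], [3.0, 3.0]]   # perfectly correlated within subject
print(naive_se_of_mean(clusters))     # 0.5
print(robust_se_of_mean(clusters))    # larger: clustering inflates the SE
```

When observations within a subject are positively correlated, the robust standard error exceeds the naive one, which is exactly the pattern seen for the GEE column below.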
We can see from Table 4.1 that the WLS and IWLS methods yield identical results to the accuracy of the data given. This may be due to the small number of iterations (4) required for convergence.
The intercept was estimated similarly by all of the methods used in this analysis. The main difference with the intercept was that the standard error calculated by the REG estimator is much lower than for all five other estimators. This result was due to the large heterogeneity of individuals that can be seen from the estimates of the variance of the intercept (~ 23000) in Table 4.2. In Section 1.1 we introduced the idea that if
Table 4.1: Parameter Estimates for the Asthma Data

                 ULS      WLS      IWLS†    MIX‡
Estimate β₀     373.9    373.9    373.9    374.4
Std. Err.        24.7     24.7     24.7     25.2
P value          0.00*    0.00*    0.00*    0.00
Estimate β₁      -3.6     -1.1     -1.1     -0.5
Std. Err.         5.1      4.6      4.6      3.6
P value          0.49*    0.81*    0.81*    0.89
Estimate β₂      -5.2     -7.4     -7.4     -7.6
Std. Err.         7.0      6.8      6.8      6.9
P value          0.46*    0.28*    0.28*    0.28
σ̂²            2283.3   2283.3   2283.3   2274.8

† IWLS took 4 iterations to converge. ‡ MIX took 20 iterations to converge. GEE took 2 iterations to converge. * P values for ULS, WLS, and IWLS were equivalent to two decimal places for all three tests (refer to Pages 19-21).
heterogeneity is present and is ignored, which is what REG does, then the standard error
of the intercept or mean will be underestimated.
The estimate of the effect of sulfur dioxide (SO₂) on peak flow was less consistent among the different methods than was the estimate of the intercept. The WLS, IWLS,
ULS WLS IWLSt 373.9 373.9 373.9 24.7 24.7 24.7 0.00* O.OO* 0.00* -3.6 -1.1 -1.1 5.1 4.6 4.6
0.49* 0.81* 0.81" -5.2 -7.4 -7.4 7.0 6.8 6.8
0.46* 0.28* 0.28" 2283.3 2283.3 2283.3
and MIX, seemed to have the most similarities in parameter estimate (- -1) and standard
errors (N 4).
In Figure 4.1 we see three graphs. In each graph every individual has the parameter
estimate (given by the vertical axis) plotted against the corresponding diagonal element
of the inverse of the individual's variance-covariance matrix for the estimate, its weight
(given by the horizontal axis).
In the middle graph (the plot of SO2) we see one individual with a slope around -150
and a very small corresponding weight. This individual drives the ULS estimate of the
slope much lower since it is an unweighted average of all of the slopes. We find that the
Table 4.2: Parameter Variance Estimates for the Asthma Data

Each entry shows the estimate of

    Σββ = [ σ²_β0    σ_β0β1   σ_β0β2
            σ_β0β1   σ²_β1    σ_β1β2
            σ_β0β2   σ_β1β2   σ²_β2 ]

METHOD   Estimate of Σββ
ULS       23462.4  -1873.4  -1665.2
          -1873.4    378.1   -216.0
          -1665.2   -216.0   1129.5
WLS       23462.4  -1873.3  -1665.3
          -1873.3    384.4   -221.5
          -1665.3   -221.5   1134.3
IWLS      23462.4  -1873.3  -1665.3
          -1873.3    384.3   -221.5
          -1665.3   -221.5   1134.3
MIX       24397.0   -695.3  -2745.4
           -695.3     71.4     -0.6
          -2745.4     -0.6   1241.9
ULS estimate of the slope is indeed smaller than the WLS, IWLS, and MIX estimates
which, in turn, leads to a smaller p value despite the slightly larger standard error. For the
GEE estimate, we see that the estimate of the slope is much smaller than all of the other
estimates and the robust standard error (21.9) is drastically larger than the naive standard
error (9.2). The combination of this bias and overestimation of the standard error leads
to a p value (≈ 0.8) in the same range as those of the WLS, IWLS, and MIX estimators. The
between-individual slope calculated from the individual average measurements is -27.5 (p
= 0.77), and since the average of the individual slopes is negative (-3.6), we see a marginal
slope (-6.9) that lies between the two because it is a weighted average of these two estimates.
Total reduced sulfur (TRS) has regression estimates that were estimated negatively
by the ULS, WLS, IWLS, and MIX estimators and positively by the GEE and REG esti-
mators. This can occur because the GEE and REG estimates of the regression parameters
are population averages. That is, the GEE and REG estimation is a marginal analysis
which does not account for individuality in the regression estimates. This phenomenon
was previously encountered in our example of Section 1.1. The between-individual slope
of the average total reduced sulfur (TRS) with average evening peak flow rates is 117.3
(p = 0.46). Although this slope is not statistically significant, it still seems to pull the
marginal estimate in a positive direction, away from the negative average of the individual
slopes (-5.2) to a value of 2.8, which is in the opposite direction. The estimate of the slope
of the total reduced sulfur measurement when each individual was modeled with only a
unique intercept was -11.7, a value much closer to the random coefficients methods than
to the marginal method.
By calculating ratios of the variance components in the model as well as the variances of
the explanatory variables, we determined that this example was closest to our simulation
situation where the size of the variance of the regression parameters is high, the size of
the variance of the explanatory variable is low, and the number of individuals is high.
The simulation results indicate that the IWLS estimator was the best estimator for that
particular situation. Thus, using the results of the IWLS estimator on this dataset, we
would conclude that the average sulfur dioxide (p = 0.81) and total reduced sulfur (p =
0.28) measurements have little effect on the evening peak flow rates of asthmatics.
Figure 4.1: Plots of Parameter Estimates against their Weights for Asthma Data.
Part III
Binary Outcome Longitudinal Data
Chapter 5
Theory of Random Coefficients for
Multiple Logistic Regression
Let us suppose that we have t repeated measurements for n individuals where our outcome
variable of interest, y, is binary and can assume the value of 1 or 0. Then the model for
the ith individual is given by

    z_i = X_i β_i + ε_i    (5.1)

where

    z_ij = log( p_ij / (1 - p_ij) ),    p_ij = Pr{y_ij = 1}.

In this model, z_i is a t x 1 vector of t repeated unobservable estimated logits for the ith
individual, where each z_ij represents the logit of the probability of a 'success' in y_ij; X_i
is a t x p matrix of t corresponding repeated measurements on p explanatory variables;
β_i is a p x 1 vector of individual-specific logistic regression coefficients corresponding to
the p explanatory variables; and ε_i is a t x 1 vector of t random logit errors where each
ε_ij has mean zero and variance 1/[p_ij(1 - p_ij)]. For unreplicated data, z_ij is undefined; however,
the model equation is used to illustrate the relationship.
It should be noted that in the formulation of most logistic regression models, the logit
error is modeled implicitly by stressing that z_ij is an unobservable random variable with
its own variance such that, conditional on X_i and β_i, the variance of z_ij is 1/[p_ij(1 - p_ij)].
As in the continuous case, we assume that the individual parameter vectors β_i exhibit
significant variation across individuals and that these parameters follow
the multivariate Gaussian distribution. We will make the following assumptions, which
are the analogues of those given in Section 2.1:

A1. β_i ~ MVN(β, Σββ) and β_i is independent of β_k, {i ≠ k; i, k ∈ 1, ..., n}; i.e. each β_i comes from
a multivariate normal population of dimension p with mean vector β and variance-
covariance matrix Σββ, and the β_i's are independent across individuals.

A2. The X_i's are fixed and of full column rank for each i.

A3. For a fixed number of explanatory variables, p, min(n, t) > p, i.e. there are both
more repeated measurements per individual and more individuals than there
are explanatory variables.

A4. The elements of (X_i'V_i^{-1}X_i)^{-1} are uniformly O(t^{-1}), i.e. there exists a finite upper
bound, M, such that the elements of t(X_i'V_i^{-1}X_i)^{-1} are less than M in absolute
value for all i and t.
Using Assumption A1, Model 5.1, and the properties of the logit, we can see that,
conditional on X_i,

    E{z_i} = X_i β    and    Var{z_i} = X_i Σββ X_i' + V_i

where V_i is the t x t diagonal matrix of the logit error variances. With the above model,
we can now proceed with extending the techniques of the con-
tinuous outcome random coefficient regression model to the binary case in the estimation
of the unknown logistic regression parameters.
5.2 Estimation Methods
In exploring the estimation methods of the random coefficient logistic regression model,
we follow the same pathway that we used in the continuous case - the two-stage model
followed by a third updating stage. Thus, for the binary case, we also have three estimation
stages.
5.2.1 First Stage Estimation (The Individual)
The first stage of estimation is the main point of difference between the continuous and
binary outcome cases. This is because, for each individual, the outcome variable is modeled
as a linear function of the predictors in the continuous case and as a logistic function of
the predictors in the binary case. Thus our estimation of these different parameters will
also be different. For the continuous case, parameter estimation for each individual used
ordinary least-squares. In the binary case, the parameters of the non-linear relationship
are obtained using maximum likelihood estimation. This approach necessitates the use of
the Newton-Raphson method.
The likelihood for the ith individual is given by:

    L_i(β_i) = Π_{j=1}^{t} p_ij^{y_ij} (1 - p_ij)^{1-y_ij}

where y_ij = 1 if a success occurs on the jth observation for the ith individual and y_ij = 0
otherwise. The logarithm of this likelihood is given by:

    l_i(β_i) = Σ_{j=1}^{t} [ y_ij log(p_ij) + (1 - y_ij) log(1 - p_ij) ]

The first derivative of the logarithm of the likelihood with respect to the vector β_i leads
to the score vector,

    S_i(β_i) = X_i'(y_i - p_i)

and the negative of the second derivative will be the information matrix,

    I_i(β_i) = X_i' V_i^{-1} X_i,    V_i^{-1} = diag{p_ij(1 - p_ij)}.

Therefore, using Newton-Raphson, our estimate of the individual logistic regression
parameter, b_i, at the mth and (m + 1)th steps of the iterative process is given by the
equation:

    b_i^{(m+1)} = b_i^{(m)} + [I_i(b_i^{(m)})]^{-1} S_i(b_i^{(m)})

where p_i^{(m)} = (p_i1^{(m)}, ..., p_it^{(m)})' and logit(p_ij^{(m)}) = x_ij1 b_i1^{(m)} + ... + x_ijp b_ip^{(m)}.
Using the properties of this likelihood equation and its derivatives and the assumptions
of the random coefficient model described above, we have that, conditional on X_i and
asymptotically as t → ∞,

    b_i ~ MVN(β_i, (X_i'V_i^{-1}X_i)^{-1})

where each b_i is independent of all b_k, for all i and k where i, k ∈ 1, ..., n.
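As an illustration of this first-stage fit, here is a minimal Newton-Raphson routine for a single individual's logistic regression. This is a sketch in Python with NumPy rather than the SAS code used in the thesis; the function name, starting value, and convergence tolerance are our own choices, and the information matrix is written directly as X' diag(p(1-p)) X.

```python
import numpy as np

def fit_individual_logit(X, y, max_iter=50, tol=1e-8):
    # Newton-Raphson for one individual's logistic regression:
    #   score   S(b) = X'(y - p)
    #   info    I(b) = X' diag(p(1-p)) X
    #   update  b <- b + I(b)^{-1} S(b)
    t, p = X.shape
    b = np.zeros(p)                                # conventional starting value
    for _ in range(max_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ b))          # fitted probabilities p_ij
        score = X.T @ (y - mu)
        info = X.T @ (X * (mu * (1.0 - mu))[:, None])
        step = np.linalg.solve(info, score)
        b = b + step
        if np.max(np.abs(step)) < tol:
            break
    cov = np.linalg.inv(info)                      # est. Var(b_i | X_i, beta_i)
    return b, cov
```

The returned covariance plays the role of (X_i'V_i^{-1}X_i)^{-1} and is the first-stage quantity carried into the second stage.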
Before moving on to the next stage of estimation we must consider the possibility
of non-convergence of the Newton-Raphson algorithm. Convergence will occur in most
situations. There are, however, two situations where it is clear that convergence will
not occur. The first situation occurs when the data have a complete (or quasi-complete)
separation, that is, when the largest covariate measurement for one of the responses
is smaller than (or equal to) the smallest covariate measurement for the other response.
The second situation occurs when the response is constant for an individual, that is, when
all responses are 1 or all responses are 0. Diagrammatically, these cases look like this:

[Schematic plots of the 'Complete Separation' and 'All Same Outcome' cases appear here.]

To proceed with estimation for these cases we will assume that within the two categories
defined by whether y_ij = 1 or y_ij = 0 the vector of covariates is multivariate Gaussian with
common variance-covariance matrix. For the case of a single predictor, this simplifies to
the covariate having constant variance but a different mean for each of the two outcomes
(i.e. f(x_ij | y_ij = k) ~ N(μ_i^(k), Σ_i); k = 0, 1 and Pr{y_ij = 1} = Π_i).
We are interested in modeling:

    logit Pr{y_ij = 1 | x_ij} = β_i0 + x_ij'β_i

where β_i0 is a scalar and β_i is a (p - 1) x 1 vector of logistic regression parameters. We
have,

    logit Pr{y_ij = 1 | x_ij}
        = log [ exp{-(1/2)(x_ij - μ_i^(1))'Σ_i^{-1}(x_ij - μ_i^(1))} Π_i ]
            - log [ exp{-(1/2)(x_ij - μ_i^(0))'Σ_i^{-1}(x_ij - μ_i^(0))} (1 - Π_i) ]

as was shown by Cornfield [5]. Therefore our logistic regression parameters, in terms of
the parameters of the distributions of the measurement variables, are given by:

    β_i  = Σ_i^{-1}(μ_i^(1) - μ_i^(0))
    β_i0 = log[Π_i / (1 - Π_i)] - (1/2)(μ_i^(1) + μ_i^(0))'Σ_i^{-1}(μ_i^(1) - μ_i^(0))

Therefore, the two parameters of the logistic model can be estimated using estimates of
μ_i^(1), μ_i^(0), Σ_i, and Π_i. As estimates we will use:

1. μ̂_i^(1) = (1/t_i^(1)) Σ_j x_ij^(1), where x_ij^(1) is a covariate vector corresponding to y_ij = 1 and t_i^(1) is
the number of repeated measurements with y_ij = 1.

2. μ̂_i^(0) = (1/t_i^(0)) Σ_j x_ij^(0), where x_ij^(0) is a covariate vector corresponding to y_ij = 0 and t_i^(0) is
the number of repeated measurements with y_ij = 0.
With these estimates of the parameters, we can obtain estimates of the logistic regres-
sion parameters as well as derive the variances of these estimators using the delta method.
The results from the delta method are given in Appendix B.2, along with our final estimators and
their variances and covariance; the variance-covariance parameters are analogous to the variance of b_i, conditional
on X_i and β_i, namely, analogous to (X_i'V_i^{-1}X_i)^{-1}.

We can now estimate our logistic regression parameters and their variances even in the
case of complete separation. If one individual has all the same response, then our estimate
of β_i is set equal to zero. In this case, the mean of the non-response distribution becomes
the mean of the all-response distribution, and the pooled estimate of variance becomes the
estimate of variance of the all-response distribution. Also, we add a correction factor to
the number of observations such that if t_i^(1) = 0 then we force t_i^(1) = 1 and t_i^(0) = t - 1, or
vice-versa for the other case.
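For the single-predictor case, this discriminant-based fallback can be sketched as follows. The function name is our own, and the forced counts in the all-same-outcome branch (t^(1) = 1, t^(0) = t - 1) reflect our reading of the correction described above, so they should be treated as an assumption of the sketch.

```python
import numpy as np

def discriminant_logit(x, y):
    # Cornfield-style estimates for one individual with a single predictor:
    #   b1 = (mean1 - mean0) / s2
    #   b0 = log(Pi / (1 - Pi)) - (mean1^2 - mean0^2) / (2 * s2)
    # where s2 is the pooled within-outcome variance and Pi the success rate.
    x, y = np.asarray(x, float), np.asarray(y, int)
    t = len(y)
    t1, t0 = y.sum(), t - y.sum()
    if t1 == 0 or t0 == 0:
        # all responses identical: slope forced to zero, counts adjusted so
        # the estimated success proportion stays strictly between 0 and 1
        t1, t0 = (1, t - 1) if t1 == 0 else (t - 1, 1)
        Pi = t1 / t
        return np.log(Pi / (1 - Pi)), 0.0
    m1, m0 = x[y == 1].mean(), x[y == 0].mean()
    # pooled within-group variance (common-variance assumption)
    s2 = (((x[y == 1] - m1) ** 2).sum() + ((x[y == 0] - m0) ** 2).sum()) / (t - 2)
    Pi = t1 / t
    b1 = (m1 - m0) / s2
    b0 = np.log(Pi / (1 - Pi)) - (m1 ** 2 - m0 ** 2) / (2 * s2)
    return b0, b1
```

Because it never solves the likelihood equations, this fallback returns finite estimates even for completely separated data, where Newton-Raphson would diverge.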
For all individuals, we now have a recipe for estimating the logistic regression param-
eter vector and its variance-covariance matrix. We can now move on to the second stage:
estimating over the aggregate.
5.2.2 Second Stage Estimation (The Aggregate)
We are now ready to define our aggregated estimate of the logistic regression parameter,
β, as well as the estimate of the variance-covariance matrix of this parameter, Σββ. This
is done in the same way as in the continuous case.
Unweighted Least-Squares Estimator
Our first estimator is the logistic analog of the Gumpertz & Pantula [8] unweighted least-
squares (ULS) estimator - the unweighted average of the individual regression parame-
ters:

    β̂_ULS = (1/n) Σ_{i=1}^{n} b_i

It is important to note that this estimator does not depend on the variance components
in the model. This property makes it a simple way of looking at the aggregated regression
parameter without the need for estimating these variance components. This point was
introduced in Section 2.2.2.
Weighted Least-Squares Estimator
For the case where Σββ is not known, Swamy [16, 17] proposed the estimated generalized
least-squares or weighted least-squares (WLS) estimator for the continuous outcome case.
We will argue by analogy and define the weighted estimator by

    β̂_WLS = (Σ_{i=1}^{n} W_i)^{-1} Σ_{i=1}^{n} W_i b_i

where W_i^{-1} = Var{b_i} = Σββ + (X_i'V_i^{-1}X_i)^{-1}, which was given by Result 5.4 on Page 53.
Therefore, our two working estimators for β (β̂_ULS and β̂_WLS) are the analogues of
those given in Chapter 2, Equations 2.6 and 2.8. β̂_ULS uses the equal weights 1/n (i.e.
unweighted) and β̂_WLS uses weights given by the inverse of the variance of the regression
parameter estimates.
We now need to define our estimate of the between-individual variance, Σββ, in order
to proceed with the weighted least-squares estimator.
Let the statistic S_bb be given by:

    S_bb = (1/(n-1)) Σ_{i=1}^{n} (b_i - β̂_ULS)(b_i - β̂_ULS)'

It can be shown (refer to Appendix B.1) that:

    E{S_bb} = (1/(n-1)) [ Σ_{i=1}^{n} Var{b_i} - n Var{β̂_ULS} ]

and using the fact that β̂_ULS = (1/n) Σ_{i=1}^{n} b_i, we have:

    E{S_bb} = Σββ + (1/n) Σ_{i=1}^{n} (X_i'V_i^{-1}X_i)^{-1}

Therefore, by the method of moments, our estimate for the variance-covariance matrix of
the random coefficients is given by:

    Σ̂ββ = S_bb - (1/n) Σ_{i=1}^{n} (X_i'V_i^{-1}X_i)^{-1}

Since we are calculating the difference of two matrices, there is no guarantee that
the resulting matrix will be positive-definite. Thus, it is possible that we may not get a
positive-definite estimate for our variance-covariance matrix of the random coefficients.
To safeguard against this, we have adapted the Carter & Yang [3] correction used in
Part II.
Using Equation 2.14, with the continuous-case quantities defined by V_i = σ²I, the
correction takes its adjustment terms as the smallest roots of the corresponding determinantal
equations. If we use V_i as it is defined on Page 51, we then have a working version of the Carter &
Yang correction for the binary outcome case. It is this corrected estimate of the parameter
variance that will be used in all subsequent estimation techniques in Part III.
Since we now have an estimate of the variance component in the model (Σ̂ββ) as well
as the individual regression parameter estimates (b_i), we are now in a position to calculate
the weighted least-squares estimator given in Equation 5.13.
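Putting the second stage together, the ULS and WLS computations can be sketched as follows. This is our own Python sketch: in place of the Carter & Yang correction, negative eigenvalues of Σ̂ββ are simply truncated at zero to keep the weight matrices well-defined, which is an assumption of the sketch and not the thesis's correction.

```python
import numpy as np

def second_stage(b_list, C_list):
    # b_list: first-stage estimates b_i; C_list: their conditional
    # covariances C_i = (X_i' V_i^{-1} X_i)^{-1}.
    B = np.array(b_list)                         # (n, p)
    n, p = B.shape
    b_uls = B.mean(axis=0)                       # unweighted least-squares
    S_bb = np.cov(B, rowvar=False, ddof=1)       # (1/(n-1)) sum (b_i - b_bar)(b_i - b_bar)'
    Sigma = S_bb - sum(C_list) / n               # method-of-moments Sigma_bb
    w, V = np.linalg.eigh(Sigma)                 # truncate negative eigenvalues
    Sigma = V @ np.diag(np.clip(w, 0.0, None)) @ V.T
    W = [np.linalg.inv(Sigma + C) for C in C_list]   # W_i = Var{b_i}^{-1}
    W_sum = sum(W)
    b_wls = np.linalg.solve(W_sum, sum(Wi @ bi for Wi, bi in zip(W, B)))
    var_wls = np.linalg.inv(W_sum)               # estimated Var{beta_WLS}
    return b_uls, b_wls, var_wls
```

The estimated variance of β̂_ULS, S_bb/n, can be read off from the same quantities.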
Before moving on to the third estimation stage, we will first determine the variances of
the two estimators we have discussed. These variances are necessary for inference based
on these estimators. As shown by Equation B.2 in Appendix B.1, Var{β̂_ULS} = (1/n) E{S_bb};
therefore,

    V̂ar{β̂_ULS} = (1/n) S_bb

is an estimate for the variance of the unweighted least-squares estimator. For the Swamy
estimator, we have:

    Var{β̂_WLS} = (Σ_{i=1}^{n} W_i)^{-1}

Therefore,

    V̂ar{β̂_WLS} = (Σ_{i=1}^{n} Ŵ_i)^{-1}

is an estimate for the variance of the weighted least-squares estimator.
5.2.3 Third Stage Estimation (The Updating Procedure)
For this third stage of estimation in the binary outcome case, we will proceed in exactly
the same manner as we did in the continuous case using the result of Appendix C.3 -
that is, we will use our newest estimate of the aggregate logistic regression parameter to
improve our estimation of the between-individual variance. We represent this as:

    S*_bb = (1/(n-1)) Σ_{i=1}^{n} (b_i - β̂_WLS)(b_i - β̂_WLS)'

Although this statistic is clearly different from S_bb if β̂_WLS is different from β̂_ULS, in
Appendix C.3 we see that their expected values are approximately equal. As in the con-
tinuous case, we have identified a means of updating our estimate of the logistic regression
parameter variance which will improve our estimate of the aggregate logistic regression
parameter.
Iterated Weighted Least-Squares Estimator
As was defined for the continuous outcome case, we have:

    β̂^(m+1) = (Σ_{i=1}^{n} W_i^(m))^{-1} Σ_{i=1}^{n} W_i^(m) b_i

where

    S*_bb = (1/(n-1)) Σ_{i=1}^{n} (b_i - β̂*)(b_i - β̂*)'

We iterate through estimates for β until convergence is achieved, at which point we call
the value the iterated weighted least-squares (IWLS) estimate. For the purpose of
inference, its variance is given by:

    V̂ar{β̂_IWLS} = (Σ_{i=1}^{n} Ŵ_i)^{-1}

where Ŵ_i is the weight matrix calculated using β̂_IWLS. We have now defined our three
estimators and their variances. We can now form test statistics to test our prior hypotheses
about them. We will use the same test statistics as those given for the continuous outcome
situation. They can be found on Pages 19 - 21.
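The third-stage updating loop can be sketched as follows (our own Python sketch; the same eigenvalue truncation stands in for the Carter & Yang correction, and the convergence tolerance and iteration cap are our choices):

```python
import numpy as np

def iwls(b_list, C_list, max_iter=50, tol=1e-8):
    # Iterated weighted least-squares: re-estimate Sigma_bb about the
    # current aggregate estimate, re-weight, and repeat to convergence.
    B = np.array(b_list)
    n, p = B.shape
    beta = B.mean(axis=0)                         # start from the ULS estimate
    for _ in range(max_iter):
        D = B - beta
        S = (D[:, :, None] * D[:, None, :]).sum(axis=0) / (n - 1)  # S*_bb
        Sigma = S - sum(C_list) / n               # updated Sigma_bb estimate
        w, V = np.linalg.eigh(Sigma)              # keep it positive semi-definite
        Sigma = V @ np.diag(np.clip(w, 0.0, None)) @ V.T
        W = [np.linalg.inv(Sigma + C) for C in C_list]
        new = np.linalg.solve(sum(W), sum(Wi @ bi for Wi, bi in zip(W, B)))
        if np.max(np.abs(new - beta)) < tol:
            beta = new
            break
        beta = new
    var_iwls = np.linalg.inv(sum(W))              # estimated Var{beta_IWLS}
    return beta, var_iwls
```

When all the C_i are equal the weights are identical and the loop reduces to the ULS average after one pass; heterogeneous C_i are what make the iteration do real work.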
In the next chapter, we compare the statistical properties of bias and power of these
three estimators as well as two other well-known estimators: the Liang & Zeger [20]
generalized estimating equation robust estimator, computed using their SAS GEE macro
and referred to as the GEE estimator, and the ordinary logistic regression naive estimator,
which ignores all assumptions of random coefficient regression modeling and treats every
observation equally and independently, referred to as the LOG estimator.
Chapter 6
Binary Outcome Longitudinal Data
Simulation
6.1 The Simulation
Let us consider the model given by Equation 5.1, where we set p = 2; that is, we have
only one covariate and we are interested in estimating an intercept and slope. Therefore,
the model for the jth of T observations for the ith of N individuals is given by:

    z_ij = β_i0 + β_i1 x_ij + ε_ij

where

- β_i = (β_i0, β_i1)' ~ NID(β, γΣββ), where β = (β_0, β_1)' takes on the values of (0, 0)'
or (2, 4)' and the scale factor γ (Scale) takes on the values of 1 or 5.

- X_ij = (1, x_ij)' with x_ij ~ NID(0, σ_x²), and σ_x² (Sigmax2) takes on the values
of 0.1 or 1.

- ε_ij is the logit error with mean zero and variance which depends only on p_ij.

- The quantities N and T take on the values of 10 or 50.
Therefore, for this simulation we have a 2^5 factorial design or, equivalently, for both levels
of the true regression parameters, we have a 2^4 factorial design, and each level of the
factorial design consists of 100 trials. At each trial, the three random coefficient methods
(i.e. ULS, WLS, and IWLS) and both the robust (GEE) and naive (LOG) estimators
from the Liang & Zeger generalized estimating equation technique, using the SAS GEE
macro, are used to analyze the simulated dataset.
The estimated regression parameters, their estimated standard errors, and the number
of times the null hypothesis (H0: β = (0, 0)') was rejected (based on a 5% level of sig-
nificance) were all recorded for each method on each simulated dataset, for the slope only.
For the random coefficient methods, rejection numbers were collected based on all three
distributions of the test statistics for each of the three methods given by Equations 2.26,
2.27, and 2.28.
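A data generator for one cell of this factorial design can be sketched as follows. This is our own Python sketch: the base covariance of β_i is taken to be γ times the identity matrix, which is an assumption of the sketch, since the exact covariance used in the thesis is not stated here.

```python
import numpy as np

def simulate_dataset(beta=(2.0, 4.0), gamma=1.0, sigmax2=0.1, N=10, T=10, seed=0):
    # One simulated binary longitudinal dataset from Model 5.1 with p = 2:
    #   beta_i ~ N(beta, gamma * I)      (individual intercepts and slopes)
    #   x_ij   ~ N(0, sigmax2)           (single covariate)
    #   y_ij   ~ Bernoulli(expit(beta_i0 + beta_i1 * x_ij))
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, np.sqrt(sigmax2), size=(N, T))
    b = rng.multivariate_normal(np.asarray(beta, float),
                                gamma * np.eye(2), size=N)   # (N, 2)
    eta = b[:, [0]] + b[:, [1]] * X                          # individual logits
    p = 1.0 / (1.0 + np.exp(-eta))
    y = (rng.random((N, T)) < p).astype(int)
    return X, y, b
```

Looping this generator over the two levels of β, γ, σ_x², N, and T, with 100 seeds per cell, reproduces the shape (though not the exact draws) of the factorial design above.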
Analysis of variance techniques (and logistic regression for the proportions) were used
to analyze the results of the simulation. The rejection rate of the slope under
the null hypothesis is plotted in Figure 6.0. It is clear that an analysis that ignores all
heterogeneity in the data (LOG) is inadequate for properly analyzing data of this form.
Therefore, the rest of the figures presented below, as well as all statistical analyses, include
only the four estimators mentioned in the METHOD variable.
In our analyses of variance, a full interaction term model was fit for the outcome
variables of mean slope, relative variance of slope, and rejection rate of slope, where our
sample was based on approximately 6400 (= 2^4 x 4 x 100) observations. In these models,
only the main effects and the second order interactions with METHOD were of interest.
In the remaining cases, where the outcome of interest was dependent on factorial-level
summary measures and our sample was based on 64 (= 2^4 x 4) observations, we used a
Figure 6.0: Plot of Rejection Rate of Slope under the null hypothesis for all five estimators.
Table 6.1: Situations where the IWLS estimator failed to converge when β = (2, 4)'.

generalized F-test to determine that a fourth order interaction term was significant. This
model was fit so that all F-tests of the main effects and their interactions with METHOD
were based on a mean square error from this high order model. In all instances where
logistic regression was used to analyze the proportion of test rejections, 0.5 was added
to each cell to prevent empty cell counts from giving uninformative analysis of variance
parameters (refer to Gart & Zweifel [7]).
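The 0.5-per-cell adjustment is the familiar empirical-logit continuity correction; as a one-line illustration (our own Python sketch with a hypothetical helper name):

```python
import math

def empirical_logit(successes, total):
    # Add 0.5 to each cell of the 2x1 table so that all-success or
    # all-failure cells still yield a finite logit.
    return math.log((successes + 0.5) / (total - successes + 0.5))
```

For example, a cell with 0 rejections out of 10 trials would have an undefined raw logit, while the corrected version returns log(0.5/10.5), a finite value.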
The results of the simulation study are summarized in Figures 6.1 to 6.10. A descrip-
tion of these figures was given in Chapter 3. The only difference is that the factor F4
from Table 3.1, related to the size of the residual variance σ², does not exist in the binary
outcome situation.
Table 6.2: Estimates of the slope for the remaining estimators (ULS, WLS, GEE) when the IWLS estimator failed to converge for β = (0, 0)'.

It should be noted that the GEE method did not converge on two occasions. These
were both under the alternative hypothesis (β = (2, 4)') when the size of the variance of the
regression parameters was high (γ = 5), the size of the variance of the explanatory
variable X was low (σ_x² = 0.1), the number of individuals was high (N = 50), and the
number of repeated measurements per individual was also high (T = 50). The IWLS
estimator failed to converge in under 50 iterations in 46 situations. Four of these occurred
under the null hypothesis (β = (0, 0)') when the size of the variance of the regression
parameters was high (γ = 5), the number of individuals was low (N = 10), and the
number of repeated measurements per individual was also low (T = 10). Of these four
situations, two occurred when the size of the variance of the explanatory variable X was
low (σ_x² = 0.1) and two when it was high (σ_x² = 1.0). The other 42 situations, under the
alternative hypothesis (β = (2, 4)'), are summarized in Table 6.1. The results of the other
three estimators for the 48 simulated datasets on which GEE or IWLS did not converge were
kept in the statistical analyses and graphical summaries. As mentioned in Chapter 3,
there were advantages and disadvantages in doing this. Table 6.2 summarizes the non-
convergence situations for the IWLS estimator under the null hypothesis and Table 6.3
summarizes those under the alternative hypothesis. These latter two tables give the values
of the estimates of the slope for all three remaining estimators (ULS, WLS, GEE). The
average of the estimates of the slope under the null hypothesis is -0.31 for the ULS
estimator, -0.22 for the WLS estimator, and -0.65 for the GEE estimator. Under the
alternative hypothesis the averages are 4.02 for ULS, 1.70 for WLS, and 2.91 for GEE.
This implies that the ULS estimator is more reliable than the WLS or GEE estimators
when the IWLS estimator fails to converge. Thus, non-convergence of the IWLS estimator
may even be an advantage when its estimate would not have been close to the true value.

Table 6.3: Estimates of the slope for the remaining estimators (ULS, WLS, GEE) when the IWLS estimator failed to converge for β = (2, 4)'.
Estimates from the situations where these methods did not achieve convergence
were removed from all analyses, but the remaining estimators from the same trials were
included. The reasons for leaving the remaining estimators in the analysis were given in
the continuous outcome data simulation in Chapter 3.
Figure 6.1: Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the null hypothesis.

Figure 6.2: Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.
* The Variance Ratio (VR) is defined in the text on Page 25.

Figure 6.3: Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.

Figure 6.4: Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.

Figure 6.5: Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the null hypothesis.

Figure 6.6: Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.

Figure 6.7: Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
* The Variance Ratio (VR) is defined in the text on Page 25.

Figure 6.8: Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.

Figure 6.9: Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.

Figure 6.10: Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
6.2 The Results
6.2.1 Estimation
From Figures 6.1 and 6.6 we can determine the accuracy of the estimation of the slope. Under the
null hypothesis, in Figure 6.1, we see that the ULS and GEE estimators give slope esti-
mates closer to the true value than the WLS or IWLS estimators, although this difference
is not statistically significant. The estimate of the slope is closer to its true value when
the size of the variance of the explanatory variable X is high and the number of repeated
measurements per individual is low. Under the alternative hypothesis, in Figure 6.6, we
see that the ULS estimate of the slope is closest to the true value, followed by the GEE
estimate and then the WLS and IWLS estimates. The slope estimate is closer to its true
value when the size of the variance of the regression parameters is low, the size of the
variance of the explanatory variable X is low, the number of individuals is low, and the
number of repeated measurements per individual is high. The GEE estimate of the slope
was greatly improved when the size of the variance of the regression parameters was low.
However, the GEE estimate was unchanged with a change in the number of repeated measurements
per individual, and the ULS estimate was unchanged with a change in the number of in-
dividuals. It appears that the discrepancy between the effects of the number of repeated
measurements per individual under the null and alternative hypotheses can be explained by
the fact that the estimate of the slope may be dampened with a lower number of repeated
measurements per individual.
From Figures 6.2 and 6.7, we see the variance ratios of the four estimators. Under
the null hypothesis, in Figure 6.2, we see that the variance ratios of the WLS and IWLS
estimators are much larger than the variance ratios of the ULS and GEE estimators. These
variance ratios are lower when the size of the variance of the regression parameters is low
and the number of repeated measurements per individual is high. In all cases, the variance
ratios of the ULS and GEE estimators are around zero, and the changes in variance ratios
with changes in the factor levels occur for the WLS and IWLS estimators only. Under the
alternative hypothesis, in Figure 6.7, the variance ratios of the WLS and IWLS estimators
are much larger than the variance ratios of the ULS and GEE estimators, as under the null
hypothesis. These variance ratios are lower when the size of the variance of the regression
parameters is low, the size of the variance of the explanatory variable X is low, the
number of individuals is high, and the number of repeated measurements per individual
is high. As under the null hypothesis, the ULS and GEE variance ratios are around zero
and only the WLS and IWLS variance ratios change with changing factor levels. The
largest change in variance ratio occurs with the number of repeated measurements per
individual for the WLS estimator. When this number is low (T = 10) the variance ratio
of the WLS estimator is around 3.0, but when this number is high (T = 50) the variance
ratio is around 0.5.
In Figures 6.3 and 6.8 we see the empirical variances of the four estimators. Under the
null hypothesis, in Figure 6.3, we see that the empirical variance of the ULS estimator
is much larger than the empirical variances of the other three estimators. The empirical
variances are lower when the size of the variance of the regression parameters is low, the
size of the variance of the explanatory variable X is high, and the number of individuals
is high. Under the alternative hypothesis, in Figure 6.8, we see that, as under the null
hypothesis, the empirical variance of the ULS estimator is much larger than the empirical
variances of the other three estimators, and that the empirical variances are lower when the
size of the variance of the regression parameters is low (except for the GEE estimator), the
size of the variance of the explanatory variable X is high, and the number of individuals
is high. When the number of individuals is high and the size of the variance of the
explanatory variable X is high, the largest decrease in empirical variance is seen for the ULS
estimator (from around 2.0 to around 0.7). When the number of repeated measurements
per individual is high, the empirical variances of the ULS and GEE estimators are lower,
but the empirical variances of the WLS and IWLS estimators are higher.
In Figures 6.4 and 6.9 we see the mean square errors of the estimators. Under the
null hypothesis, in Figure 6.4, we notice that the mean square errors of the estimators
behave in exactly the same manner as the empirical variances of the estimators, which was
described above. This is because the contribution of the bias to the mean square error
is negligible compared with the empirical variance. Under the alternative hypothesis, in
Figure 6.9, we see that the mean square errors of the ULS and GEE estimators are much
lower than the mean square errors for the WLS and IWLS estimators, despite the much
larger empirical variance of the ULS estimator. The mean square error is lower when
the size of the variance of the regression parameters is low, the size of the variance of the
explanatory variable X is low, and the number of repeated measurements per individual is
high. In the case where the number of individuals is high, the ULS and GEE mean square
errors are lower but the WLS and IWLS mean square errors are higher. When the size
of the variance of the regression parameters is low, the largest decrease in mean square
error is in the GEE estimator. However, when the size of the variance of the explanatory
variable X is low and the number of repeated measurements per individual is high, the
largest decrease in mean square error is in the WLS and IWLS estimators. For the case
where the number of repeated measurements per individual is low, the mean square errors
of the WLS and IWLS estimators are both around 11, but when the number of repeated
measurements per individual is high these mean square errors drop to around 2.
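The decomposition used above, mean square error as empirical variance plus squared bias, can be sketched as follows; the function and argument names are illustrative, not from the thesis:

```python
def mse_components(estimates, true_value):
    """Decompose replicate estimates of a parameter into bias,
    empirical variance, and mean square error.  With divisor n for
    both variance and MSE, mse == emp_var + bias**2 holds exactly."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_value
    emp_var = sum((e - mean_est) ** 2 for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, emp_var, mse
```

When the bias term is negligible, as under the null hypothesis here, the MSE and the empirical variance necessarily behave in the same way.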
6.2.2 Type I Error Rates and Statistical Power
In Figure 6.5 we see the proportion of times that the null hypothesis is rejected when it
is known to be true. This rejection rate is much closer to the true value, 0.05, for the
ULS and GEE estimators than it is for the WLS and IWLS estimators. The rejection
rates become closer to 0.05 when the size of the variance of the regression parameters is
low. When the number of individuals is low, the rejection rates for the WLS and IWLS
estimators approach 0.05, but the rejection rate of the GEE estimator moves further
away from the true value; and when the number of repeated measurements per individual
is high, the rejection rates for the WLS, IWLS, and GEE estimators approach 0.05, but
the rejection rate of the ULS estimator moves further away from the true value. That
is, the GEE estimator with a low number of individuals and the ULS estimator with a
low number of repeated measurements per individual have increases in their test biases.
The power, the proportion of times that the null hypothesis is rejected when it is
known to be false, is shown in Figure 6.10. We can clearly see that the power of the GEE
estimator is the highest (around 0.95), followed by the ULS estimator (around 0.80), and
the power of the WLS and IWLS estimators is the lowest (around 0.60). The power of
all of the estimators is higher when the size of the variance of the regression parameters
is low, the number of individuals is high, and the number of repeated measurements per
individual is high. When the size of the variance of the explanatory variable X is high,
the power of the GEE estimator increases slightly but decreases slightly for the WLS and
IWLS estimators. All changes in factor levels affected the power of the WLS and IWLS
estimators much more than the ULS or GEE estimators. The biggest increase in power
occurred when the number of repeated measurements per individual increased from 10 to
50. In this case, the power of the WLS estimator increased from around 0.20 to around
0.95.
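Rejection rates like these are Monte Carlo estimates: simulate many datasets under a hypothesis, apply the test to each, and record the proportion of rejections. A minimal sketch with a two-sided z-test on a sample mean, a stand-in for the thesis's actual tests and simulation code:

```python
import math
import random

def rejection_rate(mu, n=30, sims=2000, crit=1.959964, seed=1):
    """Monte Carlo rejection rate of a two-sided z-test of H0: mu = 0,
    applied to the mean of n N(mu, 1) observations per simulated
    dataset.  With mu = 0 this estimates the Type I error rate; with
    mu != 0 it estimates power."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        xbar = sum(rng.gauss(mu, 1.0) for _ in range(n)) / n
        if abs(xbar) * math.sqrt(n) > crit:
            rejections += 1
    return rejections / sims
```

With 2,000 simulations the standard error of an estimated 0.05 rejection rate is about 0.005, which is why the thesis's 100 simulations per design cell give only coarse estimates.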
In Appendix E are reported the analysis of variance tables for all main effects and first
order interaction terms with the variable METHOD, as well as the summarized data for
each of the 16 levels of the factorial design.
Chapter 7
An Application: Word Recall
Success in Head Injury Patients
As an application of the techniques of this thesis in the binary outcome case, we use the
Post-Traumatic Amnesia Assessment study carried out by the Rotman Research Institute
of the Baycrest Centre for Geriatric Care at the University of Toronto. The main objective
of this study was to improve the assessment of post-traumatic amnesia in traumatic brain
injury patients. The subjects under study consisted of 140 individuals who required
hospitalization for a head injury subsequent to a serious accident. The patients, between
the ages of 16 and 65, with no previous significant neurological disease, were included in
the study from hospitalizations at the Sunnybrook and St. Michael's Hospitals in Toronto.
Since it was hypothesized that the ability to recall words after 24 hours was related
to recovery of orientation, the dependent variable used in this analysis was the occurrence
of perfect recall of at least two of three words given 24 hours prior to the test, where
testing was ended after perfect recall of all three words occurred on three successive days.
As the explanatory variable, we used the Galveston Orientation and Amnesia Test
(GOAT), a measure of recovery of orientation, which was calculated on a scale of up
to 100, where a score of 75 or better is considered to be in the 'normal range'.
Table 7.1: Parameter Estimates for the Head-Injury Data

    Estimate    β0       Std.Err.   P value    β1      Std.Err.   P value
    ULS        -15.2      5.11      0.003*     17.0     5.53      0.002*
    WLS         -3.90     2.00      0.050*      4.60    2.54      0.050*
    IWLS†       -4.66     2.74      0.106*      5.40    3.26      0.089*
    GEE‡        -2.62     0.42      0.0000      3.29    0.52      0.0000
    LOG         -2.62     0.46      0.0000      3.29    0.58      0.0000

† IWLS took 7 iterations to converge. ‡ GEE took 5 iterations to converge.
* P values for ULS, WLS, and IWLS were given only for the χ² distribution
(refer to Pages 19 - 21).
In the analysis of the data, we removed the last three measurements since, by design,
the last three days of measurements must all be 3 out of 3 perfect recalls. We have
also transformed the GOAT score by subtracting 50 and dividing by 50 to bring it into
a more 'reasonable' range for analysis purposes. From the remaining data we included
only the individuals that had more than three days of word-recall tests, and we used the
corresponding transformed GOAT score from the day that the words were given. This
gave us a final sample size of 39 individuals, where the number of repeated measures per
individual ranged from a minimum of 4 to a maximum of 23 with an average value of 8.4.
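The data preparation just described can be sketched as follows; the function names and the record structure are illustrative assumptions, not the study's actual code:

```python
def transform_goat(score):
    """Centre and scale a GOAT score (0-100) as in the analysis above:
    subtract 50 and divide by 50, mapping it into roughly [-1, 1]."""
    return (score - 50.0) / 50.0

def eligible(records):
    """Keep only individuals with more than three days of word-recall
    tests, mirroring the inclusion rule described above.  `records`
    maps a (hypothetical) subject id to that subject's list of daily
    test results."""
    return {sid: days for sid, days in records.items() if len(days) > 3}
```

A score of 75, the boundary of the 'normal range', maps to 0.5 on the transformed scale.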
In this analysis we attempt to describe the association between the GOAT score and
the ability to recall at least two of three words given 24 hours before. We use a random
coefficients model in this scenario due to the variable recovery patterns of individual
patients. We compare the random coefficient techniques of this thesis in the binary
outcome case with the Generalized Estimating Equations (GEE) approach of Liang &
Zeger and ordinary logistic regression (LOG).
In our analysis, 8 of the individuals had the same outcome measure for all repeated
measurements and 9 individuals had complete separations in their data; thus 17 individuals
Table 7.2: Parameter Variance Estimates for the Head-Injury Data

    METHOD    Estimate of Σ_ββ  (elements σ²_β0, σ_β0β1; σ_β0β1, σ²_β1)
    ULS       [  249.2   -296.4 ]
              [ -296.4    360.7 ]
    IWLS      [  117.9   -152.8 ]
              [ -152.8    203.5 ]
    WLS
out of 39 did not have parameters estimated from the Newton-Raphson algorithm.
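Individual logistic fits fail in exactly these two situations. A small screening sketch for a single covariate, with a hypothetical helper name and return strings not taken from the thesis, that flags them before attempting a Newton-Raphson fit:

```python
def fit_excluded(x, y):
    """Return a reason why a per-individual logistic fit would fail,
    or None if Newton-Raphson has a chance of converging.  Flags the
    two problems described above: a constant outcome, and complete
    separation on a single covariate.  (Quasi-complete separation,
    with ties at the boundary, is not flagged by this sketch.)"""
    if len(set(y)) == 1:
        return "constant outcome"
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    # Complete separation: every success lies strictly above (or below)
    # every failure on the covariate axis, so the MLE diverges.
    if min(x1) > max(x0) or max(x1) < min(x0):
        return "complete separation"
    return None
```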
The results of the analysis are presented in Table 7.1, where β0 is the logistic intercept
and β1 is the logistic slope of the transformed GOAT variable. In the table, we can see
that both the WLS and IWLS parameter estimates are dampened by a factor greater
than three compared with the ULS estimate. Also, their standard errors are smaller by
about half. This combination results in a higher p value for these estimators than for
the ULS estimator. The GEE estimator, a population averaged estimate, is closer to zero
than any of the other three estimators, but its standard error is much smaller which, in
turn, gives a much smaller p value than even the ULS estimator.
In Figure 7.1 we see two graphs that resemble those from Figure 4.1 in Chapter 4.
Here the top graph has the individual's estimated intercept against the individual's first
diagonal element of the inverse of the variance-covariance matrix for the parameter esti-
mate (its weight), and the bottom graph has the estimate of the slope against the second
diagonal element of the same matrix (its weight). In the top graph we see one individ-
ual's intercept that is extremely small (-170) with a corresponding weight that is relatively
small. Therefore, we would expect the ULS estimate of the intercept to be smaller than
the WLS or IWLS estimates. Also, from the bottom graph, we see one individual's slope
that is extremely large (175) with a corresponding weight that is moderate in size relative
to the rest of the individuals. Therefore, the ULS slope estimate is greater than the slope
estimate from WLS or IWLS. A marginal analysis in which each individual was modeled
Figure 7.1: Plots of Intercept and Slope against their Weights for Head-Injury Data.
with their own logit intercept but a common slope yielded a transformed GOAT slope
value of -3.6, which is very different from the complete marginal analysis and the random
coefficients analyses, which are both positive.
Since the logistic regression (between individual) slope of each individual's average
transformed GOAT score is 0.42 (p = 0.0001) and the individual slopes are positive and
average to 17, we would expect to see marginal slopes in the same positive direction
and between the two values, which is what the GEE and LOG estimators show (3.3).
By calculating ratios of the variance components in the model as well as the variances
of the explanatory variable, we determined that this example was closest to our simulation
situation where the size of the variance of the regression parameters is high, the size of the
variance of the explanatory variable is low, and the number of individuals is high. The
simulation results show that the ULS estimator was the best estimator for that particular
situation. Thus, using the results of the ULS estimator in this dataset, we would conclude
that the Galveston Orientation and Amnesia Test (p = 0.002) is highly associated with
the probability of a high word recall success in patients that suffered from traumatic
head injuries.
Part IV
Conclusions
Chapter 8
Overall Conclusions of the Algorithm
8.1 Continuous Outcome
8.1.1 The Procedures
For the case where the outcome of interest is a continuous variable, we compared several
procedures of analysis. For a longitudinal dataset where heterogeneity is present, we
showed that the random coefficients methods (the unweighted, weighted, and iterated
weighted least-squares estimators) as well as the random effects Henderson estimator (using
the SAS procedure MIXED) are designed to properly model such situations.
In this thesis, we saw that in the presence of heterogeneity, a marginal analysis fails
to do a proper job in estimating parameters and standard errors, but only in the case where
significant correlation exists between an individual's average explanatory and outcome
measurements. Therefore, in this situation a random components model would not result
in a biased estimate of the slope as would occur in a marginal analysis. However, in
the simulation performed in this thesis the X covariate was generated independently
of the level of the outcome variable, and therefore the potential bias discussed above
would not be a problem. Furthermore, it must be emphasized that in occupational and
environmental studies it is often the case that subjects can take measures to reduce their
exposure to toxins and pollutants. The probability of taking such action may be directly
related to the health or sensitivity of the subjects. Therefore, the very kind of bias that
we mentioned might operate. That is, such preventive behaviour by the subjects could
induce a relationship between their average exposure and average response.
Of the random component models discussed, only the Henderson estimator failed to
converge in the simulation study. This occurred when both the number of individuals
and the number of repeated measurements per individual were low. In these situations
the ULS, WLS, and IWLS estimators all gave estimates. The reliability of these esti-
mators was established by noting the accuracy of their estimates of slope in the cases
where the MIX estimator failed to converge. Also, this Henderson estimator required
several more iterations on average than the IWLS estimator to reach convergence (refer
to Table D.13). One major drawback of the Henderson estimator may be that although
it models individuals separately, like the random coefficients methods, it requires the in-
version of a t x t square matrix, where t is the number of repeated measurements per
individual. The random coefficients methods, on the other hand, invert a p x p matrix,
where p is the number of regression parameters, which is necessarily smaller than t (refer to
Assumption A5 on Page 12). This ensures that the random coefficients algorithm may be
less computer-intensive than the algorithm of the Henderson estimator. Also, in the case
where the variance-covariance matrix of the regression parameters is not positive-definite,
the Carter & Yang correction procedure allows estimation to continue. The SAS proce-
dure MIXED reports that the matrix is not positive-definite but still gives the parameter
estimates based on that matrix. Thus, for the continuous outcome case, it seems that
the three random coefficients estimators are slightly better than the Henderson estimator
and much better than a marginal analysis in the presence of heterogeneity. We turn to
the simulation results to better describe their properties in general and relative to each
other.
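To make the two-stage logic of the random coefficients estimators concrete, here is a minimal sketch. It simplifies the thesis's methods to a single slope: ULS is assumed to average the individual least-squares slopes with equal weight, while WLS weights each slope by the inverse of its estimated variance:

```python
def ols_slope(x, y):
    """Within-individual least-squares slope and its estimated
    variance, from one individual's repeated measurements."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    var_slope = (resid_ss / (n - 2)) / sxx   # s^2 / Sxx
    return slope, var_slope

def uls_wls(individuals):
    """Stage two: pool the individual fits.  ULS is the unweighted mean
    of the slopes; WLS weights each slope by 1 / its estimated variance.
    `individuals` is a list of (x, y) measurement pairs per subject."""
    fits = [ols_slope(x, y) for x, y in individuals]
    uls = sum(b for b, _ in fits) / len(fits)
    weights = [1.0 / v for _, v in fits]
    wls = sum(w * b for (b, _), w in zip(fits, weights)) / sum(weights)
    return uls, wls
```

The WLS pooled slope is pulled toward the more precisely estimated individual slopes, which is the mechanism behind the dampening seen in the binary-outcome application.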
8.1.2 Simulation Findings
For the continuous outcome simulation, we found that under the alternative hypothesis,
the ULS estimator had a larger empirical variance and mean square error than the other
three estimators compared (WLS, IWLS, and MIX), especially when the size of the vari-
ance of the within individual error was high, the size of the variance of the explanatory
variable X was low, and the number of repeated measurements per individual was low.
The Type I error rates of the simulation showed that, in general, the ULS and WLS
estimators had larger test bias than the IWLS and MIX estimators. However, when the
size of the variance of the regression parameters was low, the size of the variance of the
explanatory variable X was high, the number of individuals was low, and the number
of repeated measurements per individual was low, the test bias of the IWLS and MIX
estimators was much lower than that of the ULS and WLS estimators. For the case where the
number of individuals is high, the test biases of the four estimators are all extremely close to
the true value. The power results of the simulation revealed that there was a slightly larger power
for the ULS and WLS estimators than the IWLS and MIX estimators, except when the
size of the variance of the regression parameters was low and the number of individuals
was high; however, these results were not statistically significant. The highest values of
power occurred in the cases where there was low heterogeneity (variance of the regression
parameters) and large sample size. In these cases, the power was very close to 1.0.
In general, it seems that for the continuous outcome situation, the IWLS and MIX
estimators had lower empirical variances and mean square errors than the ULS estimator
(under the alternative hypothesis), less test bias than the ULS and WLS estimators,
and near equivalent power to the ULS and WLS estimators. For the case where the
heterogeneity is low and the sample size is high, all estimators had similar and optimal
results.
8.2 Binary Outcome
8.2.1 The Procedures
In the longitudinal data setting, when the outcome of interest can take on the values
of 1 or 0, the presence of heterogeneity of individuals poses an analytical problem. The
procedures encountered in this thesis are the three random coefficient methods introduced
in the continuous outcome case, namely, the unweighted, weighted, and iterated weighted
least-squares estimators, and the generalized estimating equation approach of Liang &
Zeger, a marginal analysis, with both the robust and naive standard errors. Only the
robust standard error GEE estimator models the heterogeneity in the data, but only in
the estimate of the standard error, not in the estimate of the regression parameters.
As in the continuous outcome case, a marginal analysis fails to give proper regression
estimates, especially when the correlation between average outcome and average explana-
tory measurement is different from the average of the individual regression parameter
estimates. In cases like these, even the robust standard errors cannot give reliable hy-
pothesis tests of the regression parameters. We turn again to the simulation results to
better describe their properties in general and relative to each other.
8.2.2 Simulation Findings
In the binary outcome simulation we found that the ULS and GEE estimators were closer
to their true values than the WLS and IWLS estimators (only significant for the alternative
hypothesis). The variance ratios of the WLS and IWLS estimators were much larger than
the variance ratios of the ULS and GEE estimators, implying that the variance formulae
for those estimators are more accurate than the variance formulae for the WLS and IWLS
estimators. Although different levels of the factors decreased the variance ratios of the
WLS and IWLS estimators, they had almost no effect on the ULS or GEE variance ratios.
The empirical variance of the ULS estimator was much larger than the empirical variance
of the other three estimators; however, the empirical variances of all four estimators were
comparable when the size of the variance of the explanatory variable X was high, the
number of individuals was high, and the number of repeated measurements per individual
was high.
The mean square error of the ULS estimator was much larger than the mean square
errors of the other three estimators under the null hypothesis, due to the relatively equal
bias in the parameter estimation between the four estimators. Under the alternative hy-
pothesis, the mean square error of the WLS and IWLS estimators was larger than the
mean square error of the ULS and GEE estimators, due to the smaller bias in parameter
estimation of the ULS and GEE estimators. We also noticed that under the null hypoth-
esis, as the number of repeated measurements per individual increased from 10 to 50, the
mean square error of the ULS and GEE estimators decreased while that of the WLS and IWLS
estimators increased. Under the alternative hypothesis, as the number of individuals in-
creased from 10 to 50, the mean square error of the ULS and GEE estimators decreased
while that of the WLS and IWLS estimators increased.
The Type I error rates of the simulation showed that the test bias was smaller for the
ULS and GEE estimators than it was for the WLS and IWLS estimators. The test bias
in all four estimators was comparable when the number of repeated measurements per
individual was high. The power of the GEE estimator seemed to be the largest, followed
by the ULS estimator and then the WLS and IWLS estimators. The power of the
GEE estimator remained almost the same for the different levels of the factors, and the
power of all four estimators was comparable when the number of repeated measurements
per individual was high.
It seems that the ULS and GEE estimators are superior to the WLS and IWLS es-
timators in most of the situations. One major problem with the IWLS estimator was
its inability to reach convergence in a large number of cases. A problem with the GEE
estimator is that it is a population average estimator that would provide inaccurate re-
sults if the correlation between average outcome and average explanatory measurement
is different from the average of the individual regression parameter estimates, as already
mentioned. Thus, for the case of binary outcome longitudinal data, the ULS estimator
seems to have performed the best. The main factor that improved estimation and in-
ference in all estimators, in some cases to the point of equivalency between them, was a
higher number of repeated measurements per individual.
Chapter 9
Discussion
9.1 Limitations
We now present some of the limitations of the methods introduced in this thesis. First, in
our random coefficient model, we took each individual's number of repeated measure-
ments to be constant across individuals. We first introduced the discrepancy between
constant and varying numbers of repeated measurements per individual in Section 1.1,
where, in the case when the number of repeated measurements per individual was dif-
ferent, Satterthwaite's approximation was needed to properly test the significance of the
means. In our case, since we modeled each individual separately, estimation of the regres-
sion parameters poses no problem, but the test statistics introduced in Section 2.3 require
slight modifications. Second, we used the assumption of conditional independence in the
random coefficients model (refer to Assumption A2 on Page 12). Although this assump-
tion is limited, many of the results of this thesis depend on its presence. Third, we made
the assumption that the regression parameters had a multivariate Gaussian distribution.
Although this is a reasonable assumption, there are some cases where it is
too restrictive. These last two limitations will be discussed further in the next section.
With respect to the simulation study, the methods used need to be tested on more
Table 9.1: Simulated Range of Logistic Probabilities for given Parameters.

    γ      90% Range        90% Range
    0.1    (0.17, 0.84)     (0.63, 0.98)
extensive ranges for the parameters in the factorial design. That is, testing the methods
when the ranges of variances for the within individual error (in the case of the continuous
outcome), the regression parameters, and the explanatory variable include smaller as well
as larger values would be beneficial in determining more exact patterns of the methods
under these circumstances. Testing these methods with more than 100 simulations per
level of the factorial design would improve the reliability. In most cases, we chose the
parameter estimates based on trial and error, and the number of simulations per level was
determined by computer memory availability. In addition, we tested the range of logistic
probability means based on the parameters in the model post hoc. The 90% range of
probabilities (p_ij) based on 5000 independent simulations is given by Table 9.1. We see
here that when the size of the variance of the regression parameters is high under the
alternative hypothesis, the upper 95th percentile is extremely close to 1.0. This implies
that there is a higher chance of generating individuals with constant 'success' outcome in
these situations.
In the analysis of the simulation results, a blocking design was imposed by the sim-
ulation structure. That is, each simulated dataset was analyzed by all of the methods.
However, in the analysis, this repeated measurement structure was ignored. It was be-
lieved that, in the case of 100 simulations per level of the factorial design, our analysis
of independent outcomes would not be very different from a repeated measures analysis.
Also, we could have used an empirical cut-off point for the rejection rates of each test
mentioned, which would have been based on the percentiles of the simulated distribution
under the null hypothesis. Instead, we relied on the distributions of the test statistics
themselves under different conditions. One problem with this was that since the T or F
test has a broader tail than the χ² distribution, bias was introduced when comparisons
were made between the random coefficients methods and the MIX estimator, which used
a T distribution, or the GEE estimator, which used a χ² distribution.
9.2 Extensions
Some possible extensions of the methods described in this thesis were introduced in the
previous section. First, the assumption of conditional independence may not be prac-
tical. In many cases where subjects are measured over time, these measurements can
be correlated by the proximity of the observations in time. That is, two measurements
taken on consecutive days may be more highly correlated than two measurements taken
several days apart. There are many within individual error correlation structures that
are practical to model, such as first-order autocorrelation or compound symmetry. Any
flexible random components estimation model should be able to model these structures.
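The two structures just named yield t × t within-individual covariance matrices that are simple to construct directly; a sketch with hypothetical parameter names (`sigma2` for the variance, `rho` for the correlation parameter):

```python
def ar1_cov(t, sigma2, rho):
    """First-order autocorrelation (AR(1)) covariance matrix:
    Cov(e_j, e_k) = sigma2 * rho**|j - k|, so correlation decays
    with the time lag between two measurements."""
    return [[sigma2 * rho ** abs(j - k) for k in range(t)] for j in range(t)]

def compound_symmetry_cov(t, sigma2, rho):
    """Compound symmetry: every pair of distinct occasions shares
    the same correlation rho, regardless of the lag between them."""
    return [[sigma2 * (1.0 if j == k else rho) for k in range(t)]
            for j in range(t)]
```

Under AR(1), measurements on consecutive days are more highly correlated than measurements several days apart, which is exactly the behaviour described above.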
Second, the assumption of Gaussian random parameters may be somewhat restrictive.
Although in most situations this assumption seems to be a reasonable one, there may
exist cases where a more flexible model is beneficial. In these cases, new test statistics
must be determined and their distributions must be calculated. Also, in the case where
the explanatory variables are not normally distributed, the logistic regression individual
estimation method of Cornfield cannot be used. Similar techniques can be adapted based
on the assumed distributions of these explanatory variables.
It may be desirable to model both fixed and random components in the same model
using random coefficients methodology. This is often referred to as a 'mixed' model. In
the case of a random effects model, this can be easily done by modeling the explanatory
variables of the fixed regression parameters separately from the explanatory variables of
the random regression parameters. That is, referring to Equation 1.10, the fixed parameters
β1 to βf can be modeled with covariates x1ij to xfij, and the random parameters B1i to
Bri can be modeled with covariates x(f+1)ij to x(f+r)ij, where f is the number of
fixed regression parameters and r is the number of random regression parameters. In the
case of random coefficients, the change in modeling is done with the variance-covariance
matrix Σ_ββ. In this case we set to zero the rows and columns of the elements of the variance-
covariance matrix that correspond with the elements of the regression parameter
vector β_i that are assumed to be fixed. For the purpose of calculation, the matrices are
manipulated in a partitioned form, where all of the random parameters are the bottom r
elements of β_i and the variance-covariance components are the bottom right r x r matrix
within the p x p matrix (p > r) Σ_ββ, where the rest of the matrix is zero.
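The zeroing step described above can be sketched directly; `fix_parameters` and `fixed_idx` are illustrative names, not from the thesis:

```python
def fix_parameters(sigma_bb, fixed_idx):
    """Zero the rows and columns of the p x p parameter variance-
    covariance matrix that correspond to regression parameters treated
    as fixed, leaving the block for the random parameters intact."""
    p = len(sigma_bb)
    out = [row[:] for row in sigma_bb]   # copy, leave the input unchanged
    for i in fixed_idx:
        for j in range(p):
            out[i][j] = 0.0
            out[j][i] = 0.0
    return out
```

With the fixed parameters ordered first, the surviving nonzero entries form exactly the bottom-right r × r block described in the text.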
Further comparisons need to be made with the estimators presented in this thesis,
especially in the binary outcome case. Some other algorithms with which comparisons should
be made are those presented by Schall [12], Waclawiw & Liang [18, 19], and Stiratelli
et al. [15], to name a few.
Part V
Appendix
Appendix A
Henderson and Swamy Model
Equivalency
We want to prove the equivalency of the Swamy and Henderson model formulations.
The Swamy estimator from Equation 2.7 is given by:

    β̂_SWAMY = (Σ_{i=1}^n W_i)^{-1} Σ_{i=1}^n W_i b_i    (A.1)

where W_i^{-1} = Σ_ββ + σ²(X_i'X_i)^{-1} and b_i = (X_i'X_i)^{-1}X_i'y_i.
The Henderson estimator from Equation 2.9 is given by:

    β̂_HEND = (Σ_{i=1}^n X_i'W_{y_i}X_i)^{-1} Σ_{i=1}^n X_i'W_{y_i}y_i    (A.2)

where W_{y_i}^{-1} = X_i Σ_ββ X_i' + σ²I. We want to prove that

    β̂_SWAMY = β̂_HEND
Let us first rewrite Equations A.1 and A.2 as:

    β̂_SWAMY = (Σ_{i=1}^n A_i)^{-1} Σ_{i=1}^n B_i    (A.3)

and

    β̂_HEND = (Σ_{i=1}^n C_i)^{-1} Σ_{i=1}^n D_i    (A.4)

where,

    A_i = W_i,   B_i = W_i b_i,   C_i = X_i'W_{y_i}X_i,   D_i = X_i'W_{y_i}y_i.

We will show that, elementwise, A_i = C_i and B_i = D_i, thus proving the result.
Let us first prove the following lemma:

Lemma A.1 If W = R + ZDZ', then,

    W^{-1} = R^{-1} - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} Z'R^{-1}

Proof of Lemma A.1:
If AB = I and BA = I, where A and B are non-singular square matrices of the same
dimension, then B = A^{-1} and A = B^{-1}.
Therefore, if we let

    V = R^{-1} - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} Z'R^{-1}

we need to show that WV = I and VW = I.

    VW = (R^{-1} - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} Z'R^{-1}) (R + ZDZ')
       = I + R^{-1}ZDZ' - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} Z'
           - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} Z'R^{-1}ZDZ'
       = I + R^{-1}ZDZ' - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} [I + Z'R^{-1}ZD] Z'
       = I + R^{-1}ZDZ' - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} [D^{-1} + Z'R^{-1}Z] DZ'
       = I + R^{-1}ZDZ' - R^{-1}ZDZ'
       = I

Therefore, we have proven the lemma.
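Lemma A.1 is the matrix inversion (Woodbury) identity. A quick numerical sanity check in the scalar (1 × 1) case, where every matrix reduces to a number and the chosen values are arbitrary:

```python
def woodbury_inverse(r, z, d):
    """Right-hand side of Lemma A.1 with scalar R, Z, D:
    W^{-1} = R^{-1} - R^{-1} Z (Z' R^{-1} Z + D^{-1})^{-1} Z' R^{-1}."""
    r_inv = 1.0 / r
    middle = 1.0 / (z * r_inv * z + 1.0 / d)
    return r_inv - r_inv * z * middle * z * r_inv

# W = R + Z D Z' with R = 2, Z = 3, D = 0.5, so W = 6.5
direct = 1.0 / (2.0 + 3.0 * 0.5 * 3.0)
```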
Now, let us prove that A_i = C_i, where A_i and C_i are given above. Dividing both
sides by σ^{-2}, this is equivalent to proving:

    (X_i'{X_iSX_i' + I}^{-1}X_i)^{-1} = S + (X_i'X_i)^{-1}    (A.5)

where S = σ^{-2}Σ_ββ.

Proof of Equation A.5:
Applying Lemma A.1 with R = I, Z = X_i, and D = S,

    X_i'{X_iSX_i' + I}^{-1}X_i = X_i'[I - X_i(X_i'X_i + S^{-1})^{-1}X_i']X_i
                               = X_i'X_i - X_i'X_i(X_i'X_i + S^{-1})^{-1}X_i'X_i

and applying Lemma A.1 again, this time with R = (X_i'X_i)^{-1}, Z = I, and D = S,

    (S + (X_i'X_i)^{-1})^{-1} = X_i'X_i - X_i'X_i(X_i'X_i + S^{-1})^{-1}X_i'X_i

so both sides of Equation A.5 have the same inverse, proving the result.

Now, let us prove that B_i = D_i, where B_i and D_i are given above. So, we need to
prove:

    X_i'{X_iSX_i' + I}^{-1}y_i = (S + (X_i'X_i)^{-1})^{-1}(X_i'X_i)^{-1}X_i'y_i    (A.6)

where S = σ^{-2}Σ_ββ.

Proof of Equation A.6:

    X_i'{X_iSX_i' + I}^{-1}y_i
        = X_i'[I - X_i(X_i'X_i + S^{-1})^{-1}X_i']y_i    [Lemma A.1]
        = [I - X_i'X_i(X_i'X_i + S^{-1})^{-1}]X_i'y_i
        = [X_i'X_i - X_i'X_i(X_i'X_i + S^{-1})^{-1}X_i'X_i](X_i'X_i)^{-1}X_i'y_i
        = (S + (X_i'X_i)^{-1})^{-1}(X_i'X_i)^{-1}X_i'y_i    [Equation A.5]

Therefore, we have shown that for any values of Σ_ββ and σ², in the case of the
conditional independence model,

    β̂_SWAMY = β̂_HEND.
Appendix B
Estimates of Variance
B.1 Regression Parameter Variance Estimation
We want to show that:
Thus proving Equation B.1.
Also, we can see that:
    Σ_{i=1}^n Var{b_i} = Var{Σ_{i=1}^n b_i}
so, that from Equation B.1:
Therefore,
B.2 Variances of Logistic Regression Estimators using
the Delta Method

We intend to prove Equations 5.9, 5.10, and 5.11, where their expectations are respectively
given by:
using the Delta Method and Equations 5.7 and 5.8, respectively given by:
We will prove the following lemma using the Multivariate Delta Method given in
Theorem 14.6-2 of Bishop et al. [2, Page 493]:
Lemma B.1 If k and l are r x 1 vectors then:

    Var{k'l} = E{l'}Var{k}E{l} + E{k'}Var{l}E{k}
             + E{k'}Cov{l,k}E{l} + E{l'}Cov{k,l}E{k}    (B.3)

    Cov{k, k'l} = Var{k}E{l} + Cov{k,l}E{k}    (B.4)
Proof of Lemma B.1: Let us define θ = [κ; λ] and the estimate of θ as θ̂ = [k; l],
where κ = E{k}, λ = E{l}, and κ, λ, k, l are all r x 1 vectors. We will further define
the functions f(θ) = κ'λ and g(θ) = κ.
We know that the Taylor expansion of any function u around θ is:

    u(θ̂) ≈ u(θ) + (θ̂ - θ)' ∂u/∂θ

If we differentiate f and g with respect to the vector θ we get:

    ∂f/∂θ = [λ; κ]   and   ∂g/∂θ = [I_r; 0_r]

where I_r is the r x r identity matrix and 0_r is an r x r matrix of zeros.
Therefore,

    Var{k'l} ≈ [λ' κ'] [ Var{k}   Cov{k,l} ; Cov{l,k}   Var{l} ] [λ; κ]
             = E{l'}Var{k}E{l} + E{k'}Var{l}E{k}
             + E{k'}Cov{l,k}E{l} + E{l'}Cov{k,l}E{k}

thus proving Equation B.3.
    Cov{k, k'l} ≈ [I_r 0_r] [ Var{k}   Cov{k,l} ; Cov{l,k}   Var{l} ] [λ; κ]
               = Var{k}E{l} + Cov{k,l}E{k}

thus proving Equation B.4 and Lemma B.1.
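A Monte Carlo spot-check of the scalar (r = 1) case of Equation B.3, using small perturbations so the first-order approximation is tight; all numerical values here are invented for the check:

```python
import random

def delta_var_kl(mk, ml, vk, vl, ckl):
    """Scalar (r = 1) case of Equation B.3: the delta-method
    approximation to Var{k*l} given means mk, ml, variances vk, vl,
    and covariance ckl of k and l."""
    return ml * ml * vk + mk * mk * vl + 2.0 * mk * ml * ckl

rng = random.Random(7)
prods = []
for _ in range(20000):
    g1 = rng.gauss(0.0, 1.0)
    g2 = 0.5 * g1 + (1.0 - 0.5 ** 2) ** 0.5 * rng.gauss(0.0, 1.0)  # corr 0.5
    prods.append((2.0 + 0.05 * g1) * (3.0 + 0.05 * g2))
m = sum(prods) / len(prods)
mc_var = sum((p - m) ** 2 for p in prods) / (len(prods) - 1)
approx = delta_var_kl(2.0, 3.0, 0.05 ** 2, 0.05 ** 2, 0.5 * 0.05 ** 2)
```

The neglected higher-order terms are of the order of the variances squared, so with standard deviations of 0.05 the simulated and approximated variances agree closely.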
Let us now proceed with Equation 5.10, assuming, as we did with the WLS estimator,
that the variance estimate had no error (Σ̂_ββ = Σ_ββ).

proving Equation 5.10.
Before moving on to the variance of the intercept, we will first redefine our estimate:

So, for the variance of the estimate of the intercept, we have:

proving Equation 5.9 and using the independence of an ancillary and a complete
sufficient statistic (given by Basu's Theorem in Casella & Berger).
Appendix C
Updating Conjectures
C.1 The Individual Parameter Estimate
b̂_i = (X_i' W_{y_i} X_i)^{-1} (X_i' W_{y_i} y_i)
    = (X_i' [X_i Σ_ββ X_i' + σ²I]^{-1} X_i)^{-1} (X_i' [X_i Σ_ββ X_i' + σ²I]^{-1} y_i)
    = (X_i' [X_i S X_i' + I]^{-1} X_i)^{-1} (X_i' [X_i S X_i' + I]^{-1} y_i)      [where S = Σ_ββ / σ²]
    = (S + (X_i'X_i)^{-1}) ((S + (X_i'X_i)^{-1})^{-1} (X_i'X_i)^{-1} X_i'y_i)     [Equations A.5 and A.6]
    = (X_i'X_i)^{-1} X_i'y_i
    = b_i
Therefore, we cannot update our estimate of the individual slope parameters using weights
that are equal to the inverse of the variance of y_i.
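The algebra above can be checked numerically. The sketch below is illustrative (the design matrix, random-coefficient covariance, and error variance are arbitrary choices, not the thesis's settings); it confirms that weighting by the inverse of Var{y_i} = X_i Σ_ββ X_i' + σ²I reproduces the ordinary least squares estimate b_i exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 12                                                   # observations on one subject
X = np.column_stack([np.ones(T), rng.normal(size=T)])    # intercept + slope design
y = rng.normal(size=T)

Sigma_bb = np.array([[0.5, 0.1], [0.1, 0.8]])   # illustrative random-coefficient covariance
sigma2 = 0.3                                    # illustrative within-subject error variance

V = X @ Sigma_bb @ X.T + sigma2 * np.eye(T)     # Var{y_i}
W = np.linalg.inv(V)                            # weights = inverse variance

b_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # (X'WX)^{-1} X'Wy
b_ols = np.linalg.solve(X.T @ X, X.T @ y)           # (X'X)^{-1} X'y

print(np.allclose(b_wls, b_ols))
```

The two estimates coincide for any positive-definite Σ_ββ and σ² > 0, which is exactly the reduction derived above.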
C.2 The Individual's Sum of Squares for Error
Let us transform the problem into a slightly different form. Let
We will now determine what T is and what our improved estimation of the sum of squares
for error is for an individual.
Therefore,
So, we cannot update our sum of squares for error using weights equal to the inverse of
the variance of y_i. Thus, our within-individual variance estimate remains unchanged.
C.3 The Regression Parameter Variance Estimator
Following the same algebra as in the first part of Appendix B.1, it is easy to show that:
Let us replace β̂_WLS with β̂_SWAMY in calculating the expected value of the above equation,
as we have done before in determining its mean and variance. So, we have:
using Equation 2.16. Now, we need to solve (∑_{i=1}^{n} w_i)^{-1} explicitly for Σ_ββ and σ².
Since
we will start with {Σ_ββ + σ²(X_i'X_i)^{-1}}^{-1}.
using Lemma A.1 in the first and second calculations.
Now, we have that:
So, therefore:

(1/n) Σ_ββ + (σ²/n²) ∑_{i=1}^{n} (X_i'X_i)^{-1} + o(t^{-1})        (A.6 on Page 12)
using Lemma A.1.
Our approximation of (∑_{i=1}^{n} w_i)^{-1} is, therefore, given by:
So now substituting back into Equation C.1, we have:
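The accuracy of this two-term expansion can be checked numerically. In the sketch below (illustrative values for Σ_ββ, σ², and the X_i; not the thesis's simulation settings), the exact (∑ w_i)^{-1} with Swamy-type weights w_i = {Σ_ββ + σ²(X_i'X_i)^{-1}}^{-1} is compared to the approximation (1/n)Σ_ββ + (σ²/n²)∑(X_i'X_i)^{-1}:

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 200, 10
Sigma_bb = np.array([[1.0, 0.3], [0.3, 2.0]])    # illustrative Σ_ββ
sigma2 = 0.1                                     # illustrative σ²

Xs = [np.column_stack([np.ones(T), rng.normal(size=T)]) for _ in range(n)]
XtXinv = [np.linalg.inv(X.T @ X) for X in Xs]

# Exact inverse of the sum of the weight matrices.
W_sum = sum(np.linalg.inv(Sigma_bb + sigma2 * A) for A in XtXinv)
exact = np.linalg.inv(W_sum)

# Two-term approximation derived above.
approx = Sigma_bb / n + (sigma2 / n**2) * sum(XtXinv)

rel_err = np.abs(exact - approx).max() / np.abs(exact).max()
print(f"max relative error: {rel_err:.2e}")
```

The neglected terms are second order in σ²(X_i'X_i)^{-1}, so for these values the relative error is well below one percent.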
Appendix D
Continuous Outcome Simulation
Results
In this appendix, we present the results of the continuous outcome simulation. In Ta-
bles D.1 - D.4 are results of the Analysis of Variance Tables of the parameter estimates,
the variance ratios, the empirical variances, and the mean square errors of the slope for
the main effects and their interactions with the METHOD variable. These Analysis of
Variance Tables were first described in Section 3.1. Table D.5 is the logistic regression
analyses of the proportion of times that the null hypothesis of the slope was rejected from
each level of the factorial design, respectively. Table D.6 is the number of times that the
Carter & Yang matrix correction was used on a non positive-definite variance-covariance
matrix of regression parameters. Tables D.7 - D.10 are the summarized data for each
level of the factorial design of the parameters that correspond to the Analysis of Variance
Tables in D.1 - D.4. Tables D.11 and D.12 are the number of test rejections for the slope
under the null and alternative hypotheses where all test statistics are reported for the
ULS, WLS, and IWLS estimators. Table D.13 shows the average number of iterations
of the IWLS and MIX estimators required for convergence. In tables where the last row
begins with a series of dots, the values given are overall averages.
Table D.1: Analysis of Variance Table for Slope
Source: METHOD, γ, σ²_β, σ², N, T, γ*METHOD, σ²_β*METHOD, σ²*METHOD, N*METHOD, T*METHOD
[Columns: df, MS, Prob for β = (0,0)' and β = (1,2)'; values not recoverable from the transcript.]
Table D.2: Analysis of Variance Table for Variance Ratio of Slope
Source: METHOD, γ, σ²_β, σ², N, T, and their interactions with METHOD
[Columns: df, MS, Prob for β = (0,0)' and β = (1,2)'; values not recoverable from the transcript.]
Table D.3: Analysis of Variance Table for Empirical Variance of Slope
[Columns: df, MS, Prob for β = (0,0)' and β = (1,2)'. Only one Prob column survives in the
transcript: 0.019, 0.000, 0.000, 0.000, 0.000, 0.000, 0.288, 0.015, 0.023, 0.118, 0.019.]
Table D.4: Analysis of Variance Table for Mean Square Error of Slope
[Recoverable fragment: METHOD, df 3, Prob 0.000 and 0.579; remaining values not recoverable
from the transcript.]
Table D.5: Maximum Likelihood ANOVA Table for Rejection Rates of the Slope
[Recoverable fragment: METHOD, β = (1,2)': df 3, χ² = 1.21, Prob = 0.7503; remaining values
not recoverable from the transcript.]
Table D.6: Number of Times Σ̂_ββ was Not Positive-Definite (out of 100 samples)
[Recoverable fragment: β = (0,0)': df 3, χ² = 8.99, Prob = 0.0294; remaining values not
recoverable from the transcript.]
Table D.7: Average of Slopes
Methods: ULS, WLS, IWLS, MIX, REG, for β = (0,0)' and β = (1,2)'
[Values not recoverable from the transcript.]
Table D.8: Variance Ratio of Slope
β = (0,0)': ULS .392, WLS .229, IWLS .243, MIX .173, REG -.47
β = (1,2)': ULS .079, WLS .004, IWLS .010, MIX .022, REG -.53
[Only these rows are recoverable from the transcript.]
Table D.11: Number of Test Rejections for Slope where β = (0,0)'
† Result is based on 99 samples. ‡ Result is based on 98 samples. § Result is based on 95 samples.
[Table values not recoverable from the transcript.]
Table D.12: Number of Test Rejections for Slope where β = (1,2)'
† Result is based on 99 samples. ‡ Result is based on 98 samples. § Result is based on 94 samples.
[Table values not recoverable from the transcript.]
Table D.13: Average number of iterations for the Methods MIX and IWLS
[Columns: MIX, IWLS for β = (0,0)' and β = (1,2)'; values not recoverable from the transcript.]
Appendix E
Binary Outcome Simulation Results
In this appendix, we present the results of the binary outcome simulation. In Tables E.1 -
E.4 are results of the Analysis of Variance Tables of the parameter estimates, the variance
ratios, the empirical variances, and the mean square errors of the slope for the main
effects and their interactions with the METHOD variable. These Analysis of Variance
Tables were described in Section 6.1. Table E.5 is the logistic regression analyses of the
proportion of times that the null hypothesis of the slope was rejected from each level
of the factorial design, respectively. Table E.6 is the number of times that the adapted
Carter & Yang matrix correction was used on a non positive-definite variance-covariance
matrix of regression parameters. Tables E.7 - E.10 are the summarized data for each
level of the factorial design of the parameters that correspond to the Analysis of Variance
Tables in E.1 - E.4. Tables E.11 and E.12 are the number of test rejections for the slope
under the null and alternative hypotheses where all test statistics are reported for the
ULS, WLS, and IWLS estimators. Table E.13 shows the average number of iterations
of the IWLS and GEE estimators required for convergence. In tables where the last row
begins with a series of dots, the values given are overall averages.
Table E.1: Analysis of Variance Table for Slope
Source: METHOD, γ, σ²_β, N, T, and their interactions with METHOD
[Values not recoverable from the transcript.]
Table E.2: Analysis of Variance Table for Variance Ratio of Slope
Source: METHOD, γ, σ²_β, N, T, and their interactions with METHOD
[Columns: df, MS, Prob for β = (0,0)' and β = (2,4)'; values not recoverable from the transcript.]
Table E.3: Analysis of Variance Table for Empirical Variance of Slope

                       β = (0,0)'                β = (2,4)'
Source             df     MS      Prob       df     MS      Prob
METHOD              3   1593.1    0.000       3   1656.9    0.000
γ                   1   137.64    0.000       1   100.41    0.000
σ²_β                1   9.136     0.085       1   408.19    0.000
N                   1   0.415     0.713       1   661.19    0.000
T                   1   1494.2    0.000       1   1421.6    0.000
γ*METHOD            3   52.627    0.000       3   31.145    0.008
σ²_β*METHOD         3   7.118     0.074       3   171.05    0.000
N*METHOD            3   2.620     0.466       3   250.36    0.000
T*METHOD            3   536.38    0.000       3   587.35    0.000
Table E.4: Analysis of Variance Table for Mean Square Error of Slope
Source: METHOD, γ, σ²_β, N, T, γ*METHOD, σ²_β*METHOD, N*METHOD, T*METHOD
[Columns: df, MS, Prob for β = (0,0)' and β = (2,4)'; values not recoverable from the transcript.]
Table E.5: Maximum Likelihood ANOVA Table for Rejection Rates of the Slope
Source: METHOD, γ, σ²_β, N, T, METHOD*γ, METHOD*σ²_β, METHOD*N, METHOD*T
[Values not recoverable from the transcript.]
Table E.6: Number of Times Σ̂_ββ was Not Positive-Definite (out of 100 samples)
[Columns: df, χ², Prob for β = (0,0)' and β = (2,4)'; values not recoverable from the transcript.]
Table E.7: Average of Slopes
Methods: ULS, WLS, IWLS, GEE, LOG
Recoverable rows:
β = (0,0)', design cell (γ = 1, σ²_β = 0.1, N = 10, T = 10): -.04, .022, .016, -.01, -.01
β = (2,4)': 3.27, 1.50, 1.40, 3.60, 3.60
[Remaining values not recoverable from the transcript.]
Table E.8: Variance Ratio of Slope
Methods: ULS, WLS, IWLS, GEE, LOG
Recoverable rows:
β = (2,4)': .013, 3.37, 2.52, -.49, -.44
β = (0,0)', design cell (γ = 1, σ²_β = 0.1, N = 10, T = 10): .347, 2.43, 2.48, -.18, -.23
[Remaining values and row labels not recoverable from the transcript.]
Table E.9: Empirical Variance of Slope
[Columns: ULS, WLS, IWLS, GEE, LOG for β = (0,0)' and β = (2,4)'; the per-cell values appear
in the transcript but their design-cell row labels are not recoverable.]

Table E.10: Mean Square Error of Slope
Methods: ULS, WLS, IWLS, GEE, LOG
Recoverable rows (design cell γ = 1, σ²_β = 0.1, N = 10, T = 10):
β = (0,0)': 2.23, .356, .367, .582, .582
β = (2,4)': 4.58, .601, .577, 2.20, 2.20
[The remaining per-cell values appear in the transcript but their row labels are not recoverable.]
Table E.11: Number of Test Rejections for Slope where β = (0,0)'
‡ Result is based on 98 samples.
[Table values not recoverable from the transcript.]
Table E.12: Number of Test Rejections for Slope where β = (2,4)'
† Result is based on 99 samples. ‡ Result is based on 98 samples. § Result is based on 97
samples. ¶ Result is based on 96 samples. ‖ Result is based on 95 samples. # Result is based
on 93 samples.
[Table values not recoverable from the transcript.]
Table E.13: Average number of iterations for the Methods GEE and IWLS
[Columns: GEE, IWLS for β = (0,0)' and β = (2,4)'; values not recoverable from the transcript.]
Bibliography
[l] Anderson, R. L. and Bancroft, T. A. (1952). Statistical Theory in Research. New York: McGraw-Hill.
[2] Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W. (1975). Discrete Multivariate Analysis: Theory and Practice. Cambridge, MA: M.I.T. Press.
[3] Carter, R. L. and Yang, M. C. K. (1986). Large Sample Inference in Random Co- efficient Regression Models. Communications in Statistics-Theory and Methods 15, 2507-2525.
[4] Casella, G. and Berger, R. L. (1990). Statistical Inference. Belmont, CA: Duxbury Press.
[5] Cornfield, J. (1962). Joint Dependence of Risk of Coronary Heart Disease on Serum Cholesterol and Systolic Blood Pressure: A Discriminant Function Analysis. Federation Proceedings 21, 58-61.
[6] Diggle, P. J., Liang, K.-Y., and Zeger, S. L. (1994). Analysis of Longitudinal Data. Oxford: Oxford University Press.
[7] Gart, J. J. and Zweifel, J. R. (1967). On the Bias of Various Estimators of the Logit and its Variance with Application to Quantal Bioassay. Biometrika 54, 181-187.
[8] Gumpertz, M. and Pantula, S. G. (1989). A Simple Approach to Inference in Random Coefficient Models. The American Statistician 43, 203-210.
[9] Harville, D. A. (1977). Maximum Likelihood Approaches to Variance Component Estimation and to Related Problems. Journal of the American Statistical Association 72, 320-340.
[10] Henderson, C. R. (1975). Best Linear Unbiased Estimation and Prediction Under a Selection Model. Biometrics 31, 423-447.
[11] Laird, N. M. and Ware, J. H. (1982). Random-Effects Models for Longitudinal Data. Biometrics 38, 963-974.
[12] Schall, R. (1991). Estimation in Generalized Linear Models with Random Effects. Biometrika 78, 719-727.
[13] Schott, J. R. (1997). Matrix Analysis for Statistics. New York: John Wiley & Sons, Inc.
[14] Searle, S., Casella, G., and McCulloch, C. (1992). Variance Components. New York: John Wiley & Sons, Inc.
[15] Stiratelli, R., Laird, N. M., and Ware, J. H. (1984). Random Effects Models for Serial Observations with Binary Response. Biometrics 40, 961-971.
[16] Swamy, P. A. V. B. (1970). Efficient Inference in a Random Coefficient Regression Model. Econometrica 38, 311-323.
[17] Swamy, P. A. V. B. (1971). Statistical Inference in Random Coefficient Regression Models. Berlin: Springer-Verlag.
[18] Waclawiw, M. A. and Liang, K.-Y. (1993). Prediction of Random Effects in the Generalized Linear Model. Journal of the American Statistical Association 88, 171-178.
[19] Waclawiw, M. A. and Liang, K.-Y. (1994). Empirical Bayes Estimation and Inference for the Random Effects Model with Binary Response. Statistics in Medicine 13, 541-551.
[20] Zeger, S. L. and Liang, K.-Y. (1986). Longitudinal Data Analysis for Discrete and Continuous Outcomes. Biometrics 42, 121-130.