ESTIMATION METHODS IN RANDOM COEFFICIENT
REGRESSION FOR CONTINUOUS AND BINARY
LONGITUDINAL DATA
BY
S. Samuel Bederman
A THESIS SUBMITTED IN CONFORMITY WITH THE
REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
GRADUATE DEPARTMENT OF COMMUNITY HEALTH
UNIVERSITY OF TORONTO
© Copyright by S. Samuel Bederman, 1997
ESTIMATION METHODS IN RANDOM COEFFICIENT REGRESSION
FOR CONTINUOUS AND BINARY LONGITUDINAL DATA
Master of Science
S. Samuel Bederman
Graduate Department of Community Health
University of Toronto, 1997
Abstract
Random coefficient regression (RCR) models are commonly used in the analysis of longitudinal data. Longitudinal studies involve a number of subjects on whom repeated outcome and explanatory measurements are taken over time. RCR is necessary since each subject may have a different relationship between the explanatory and outcome measurements. This thesis attempts to improve upon the current techniques of estimation used in RCR in two ways. First, the weighted least-squares estimator put forth by Swamy [16] is adapted to allow an iterative procedure to update the parameter estimates. This iterated weighted least-squares estimator is compared with the weighted least-squares and unweighted least-squares estimators. Second, these three RCR estimators are then extended, by analogy, to the case where the repeated outcome variable is binary. These theoretical methods are explained in detail, tested on simulated datasets, and used to analyze a longitudinal dataset in both the continuous and binary outcome cases.
Contents

I Introduction 1

1 An Introduction to Random Component Models 2
1.1 The Need for Random Components . . . 3
1.2 Random Effects versus Random Coefficients . . . 8
1.3 Extension to the Binary Outcome Case . . . 9

II Continuous Outcome Longitudinal Data 10

2 Theory of Random Coefficients for Multiple Linear Regression 11
2.1 The Model . . . 11
2.2 Estimation Methods . . . 13
2.2.1 First Stage Estimation (The Individual) . . . 13
2.2.2 Second Stage Estimation (The Aggregate) . . . 14
2.2.3 Third Stage Estimation (The Updating Procedure) . . . 18
2.3 Inference . . . 19

3 Continuous Outcome Longitudinal Data Simulation 22
3.1 The Simulation . . . 22
3.2 The Results . . . 40
3.2.1 Estimation . . . 40
3.2.2 Type I Error Rates and Statistical Power . . . 41

4 An Application: Environmental Health Among Asthmatics in the City of Windsor 43

III Binary Outcome Longitudinal Data

5 Theory of Random Coefficients for Multiple Logistic Regression 50
5.1 The Model . . . 50
5.2 Estimation Methods . . . 52
5.2.1 First Stage Estimation (The Individual) . . . 52
5.2.2 Second Stage Estimation (The Aggregate) . . . 57
5.2.3 Third Stage Estimation (The Updating Procedure) . . . 60

6 Binary Outcome Longitudinal Data Simulation 62
6.1 The Simulation . . . 62
6.2 The Results . . . 77
6.2.1 Estimation . . . 77
6.2.2 Type I Error Rates and Statistical Power . . . 79

7 An Application: Word Recall Success in Head Injury Patients 81

IV Conclusions 86

8 Overall Conclusions of the Algorithm 87
8.1 Continuous Outcome . . . 87
8.1.1 The Procedures . . . 87
8.1.2 Simulation Findings . . . 89
8.2 Binary Outcome . . . 90
8.2.1 The Procedures . . . 90
8.2.2 Simulation Findings . . . 90

9 Discussion 93
9.1 Limitations . . . 93
9.2 Extensions . . . 95

V Appendix

A Henderson and Swamy Model Equivalency 98

B Estimates of Variance 103
B.1 Regression Parameter Variance Estimation . . . 103
B.2 Variances of Logistic Regression Estimators using the Delta Method . . . 105

C Updating Conjectures 109
C.1 The Individual Parameter Estimate . . . 109
C.2 The Individual's Sum of Squares for Error . . . 110
C.3 The Regression Parameter Variance Estimator . . . 111

D Continuous Outcome Simulation Results 114

E Binary Outcome Simulation Results 125

Bibliography
List of Tables

1.1 Example Data of 5 Individuals . . . 6
3.1 Definition of the Factors used in the Simulation Study . . . 25
3.2 Tests used for the different levels of N and T . . . 26
3.3 Situations where the MIX estimator failed to converge . . . 28
3.4 Estimates of the slope for the remaining estimators when the MIX estimator failed to converge . . . 29
4.1 Parameter Estimates for the Asthma Data . . . 45
4.2 Parameter Variance Estimates for the Asthma Data . . . 46
6.1 Situations where the IWLS estimator failed to converge when β = (2, 4)′ . . . 64
6.2 Estimates of the slope for the remaining estimators when the IWLS estimator failed to converge for β = (0, 0)′ . . . 65
6.3 Estimates of the slope for the remaining estimators when the IWLS estimator failed to converge for β = (2, 4)′ . . . 66
7.1 Parameter Estimates for the Head-Injury Data . . . 82
7.2 Parameter Variance Estimates for the Head-Injury Data . . . 83
9.1 Simulated Range of Logistic Probabilities for given Parameters . . . 94
D.1 Analysis of Variance Table for Slope . . . 115
D.2 Analysis of Variance Table for Variance Ratio of Slope . . . 115
D.3 Analysis of Variance Table for Empirical Variance of Slope . . . 116
D.4 Analysis of Variance Table for Mean Square Error of Slope . . . 116
D.5 Maximum Likelihood ANOVA Table for Rejection Rates of the Slope . . . 117
D.6 Number of Times Σ̂_ββ was Not Positive-Definite (out of 100 samples) . . . 117
D.7 Average of Slopes . . . 118
D.8 Variance Ratio of Slope . . . 119
D.9 Empirical Variance of Slope . . . 120
D.10 Mean Square Error of Slope . . . 121
D.11 Number of Test Rejections for Slope where β = (0, 0)′ . . . 122
D.12 Number of Test Rejections for Slope where β = (1, 2)′ . . . 123
D.13 Average number of iterations for the Methods MIX and IWLS . . . 124
E.1 Analysis of Variance Table for Slope . . . 126
E.2 Analysis of Variance Table for Variance Ratio of Slope . . . 126
E.3 Analysis of Variance Table for Empirical Variance of Slope . . . 126
E.4 Analysis of Variance Table for Mean Square Error of Slope . . . 127
E.5 Maximum Likelihood ANOVA Table for Rejection Rates of the Slope . . . 127
E.6 Number of Times Σ̂_ββ was Not Positive-Definite (out of 100 samples) . . . 127
E.7 Average of Slopes . . . 128
E.8 Variance Ratio of Slope . . . 128
E.9 Empirical Variance of Slope . . . 129
E.10 Mean Square Error of Slope . . . 129
E.11 Number of Test Rejections for Slope where β = (0, 0)′ . . . 130
E.12 Number of Test Rejections for Slope where β = (2, 4)′ . . . 130
E.13 Average number of iterations for the Methods GEE and IWLS . . . 131
List of Figures

1.1 Plots of Y versus X for each individual and for the entire sample for the Example Data . . . 7
Plot of Rejection Rate of Slope under the null hypothesis for all five estimators . . . 24
Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 30
Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 31
Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 32
Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 33
Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 34
Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 35
Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 36
Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 37
Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 38
Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 39
Plots of Parameter Estimates against their Weights for Asthma Data . . . 48
Plot of Rejection Rate of Slope under the null hypothesis for all five estimators . . . 64
Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 67
Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 68
Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 69
Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 70
Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the null hypothesis . . . 71
Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 72
Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 73
Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 74
6.9 Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 75
6.10 Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the alternative hypothesis . . . 76
7.1 Plots of Intercept and Slope against their Weights for Head-Injury Data . . . 84
Part I
Introduction
Chapter 1
An Introduction to Random
Component Models
The essentials of this thesis are contained in four main parts. In the first part, an introduction to the importance of random component models and their development and emergence in the analysis of longitudinal studies is given. The second part treats the continuous outcome situation: Chapter 2 outlines the theory of the model, the associated assumptions, and the methods of estimation in the case of continuous outcome longitudinal data. The results of several competing techniques used to analyze simulated data are reported in Chapter 3, and their use in a study of the effect of environmental pollutants on respiratory health among asthmatics is reported in Chapter 4. In the third part, the procedures designed for the case of binary outcome longitudinal data are studied. The theory of this model is explained in Chapter 5, as are the underlying assumptions and estimation methods associated with this type of dataset. Another simulation, reported in Chapter 6, is used to study the properties of these procedures, and the analysis of a longitudinal study of word recall success after head injury is given in Chapter 7. The final part, including Chapters 8 and 9, consists of some concluding remarks as well as a discussion of possible extensions and limitations of the techniques encountered in this thesis.
1.1 The need for Random Components
Random components regression theory rests on one of the most fundamental assumptions about longitudinal data structure: heterogeneity of individuals. At the most basic level, it incorporates a clustered sampling design into the analysis of the data. A simple example where we have 5 individuals with 3 replicates each might look like this:

[Table: Individual / Measured Values]
In this case, our model equation for the jth of 3 observations on the ith of 5 individuals would look like:

    y_ij = μ + α_i + e_ij                                        (1.1)

In the more general case, we would have Equation 1.1 with j = 1, …, t_i and i = 1, …, n. In this model, y_ij is the jth measure of outcome for the ith individual, μ is the overall population average of the measure of outcome, α_i is the ith individual's deviation from the population average, which is assumed to be a random variable with mean zero and variance σ_α², and e_ij is the random error for the jth measurement on the ith individual, with mean zero and variance σ².
From this model and its assumptions, we can show that:

    E{ȳ..} = μ   and   Var{ȳ..} = (Σ_i t_i²/N²) σ_α² + σ²/N      (1.2)

where N = Σ_{i=1}^n t_i. If we decompose the total sum of squared deviations into a between and a within component, we get:

    Σ_i Σ_j (y_ij − ȳ..)² = Σ_i t_i (ȳ_i. − ȳ..)² + Σ_i Σ_j (y_ij − ȳ_i.)²

where we call the two terms the sum of squares between and the sum of squares within, respectively, i.e.

    SST = SSB + SSW

Using these results, we have:

    E{SSB} = [N − Σ_i t_i²/N] σ_α² + [n − 1] σ²   (from Anderson & Bancroft [1])
    E{SSW} = (N − n) σ²

so our estimates of the variance components in the model are:

    σ̂² = SSW/(N − n)   and   σ̂_α² = [SSB/(n − 1) − SSW/(N − n)] · (n − 1)/(N − Σ_i t_i²/N)

and our estimate of the variance of the overall weighted mean, ȳ.., using Result 1.2, is given by:

    Vâr{ȳ..} = C₁ MSB + C₂ MSW                                   (1.3)

where C₁ and C₂ are the coefficients obtained by substituting σ̂_α² and σ̂² into Result 1.2, MSB is the SSB divided by its degrees of freedom (n − 1), and MSW is the SSW divided by its degrees of freedom (N − n). If t_i = t, then C₁ = 1/(nt) and C₂ = 0, and we see that the variance of the mean simplifies to

    Vâr{ȳ..} = MSB/(nt)
Let us examine the importance of modeling the correct variance components of the model. If we proceeded in the analysis of a longitudinal study with heterogeneity of the individuals, where all individuals have the same number of repeated measurements (t_i = t), and we neglected to take the clustering into account, then our naive estimate of the variance of a single measurement y_ij would be:

    Vâr_N{y_ij} = SST/(nt − 1) = SSB/(nt − 1) + SSW/(nt − 1)

Our true estimate of the variance of a single measurement, using Result 1.2, would be:

    Vâr{y_ij} = σ̂_α² + σ̂² = SSB/[t(n − 1)] + SSW/(nt)

If we take the expected values of these equations we get:

    E{Vâr_N{y_ij}} = [t(n − 1)/(nt − 1)] σ_α² + σ²
    E{Vâr{y_ij}} = σ_α² + σ²

Since t(n − 1)/(nt − 1) = 1 − (t − 1)/(nt − 1) ≤ 1, we have Vâr_N{y_ij} ≤ Vâr{y_ij} in expectation for σ_α² ≥ 0; that is, the naive estimate of the variance of an observation, which neglects the clustering, will underestimate the true variance of an observation. Let us compare these two variance estimates for three specific cases.

1. If σ_α² = 0, the two estimates have the same expectation.
2. If σ_α² > 0 and t → ∞, then t(n − 1)/(nt − 1) → (n − 1)/n and Vâr_N{y_ij} < Vâr{y_ij}.
3. If σ_α² > 0 and n → ∞, then t(n − 1)/(nt − 1) → 1 and the underestimation becomes negligible.

Therefore, in the cases where heterogeneity exists (σ_α² > 0) and our sample size is not very large, neglecting the clustering of the data underestimates the true variance of each observation.
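The calculations above can be illustrated with a small simulation. The sketch below (Python/NumPy; the sample sizes and variance values are illustrative choices, not taken from the thesis) estimates the variance components from a balanced clustered sample and compares the naive and clustered estimates of the variance of a single observation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 30, 5                  # individuals, repeats per individual
sigma2_a, sigma2 = 4.0, 1.0   # between- and within-individual variances
mu = 10.0

# Clustered data: y_ij = mu + a_i + e_ij
a = rng.normal(0, np.sqrt(sigma2_a), n)
y = mu + a[:, None] + rng.normal(0, np.sqrt(sigma2), (n, t))

N = n * t
ybar_i = y.mean(axis=1)                   # individual means
ybar = y.mean()                           # overall mean

SSW = ((y - ybar_i[:, None]) ** 2).sum()  # sum of squares within
SSB = t * ((ybar_i - ybar) ** 2).sum()    # sum of squares between
MSW, MSB = SSW / (N - n), SSB / (n - 1)

sigma2_hat = MSW                          # within-individual variance estimate
sigma2_a_hat = (MSB - MSW) / t            # between-individual variance estimate

# Naive variance of one observation, ignoring clustering: SST/(nt - 1)
var_naive = (SSB + SSW) / (N - 1)
# Clustered estimate: SSB/[t(n - 1)] + SSW/(nt)
var_true = SSB / (t * (n - 1)) + SSW / (n * t)
```

With σ_α² well above zero, `var_naive` comes out smaller than `var_true`, matching the inequality derived above.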
Now, let us compare the distribution of this variance estimate when different individuals have different numbers of repeated measurements with the distribution when all individuals have the same number of repeated measurements. If we look at Equation 1.3, when t_i = t, the estimated variance of the estimate of the overall mean is given by:

    Vâr{ȳ..} = MSB/(nt)

where MSB is the SSB divided by its degrees of freedom (n − 1). Therefore, the distribution of Vâr{ȳ..} is proportional to a χ² with n − 1 degrees of freedom. In the case where t_i ≠ t for i = 1, …, n, the distribution of Vâr{ȳ..} given in Equation 1.3 is not proportional to a χ² with n − 1 degrees of freedom, but rather to a χ² with degrees of freedom given by Satterthwaite's approximation as:

    df = [C₁ MSB + C₂ MSW]² / [(C₁ MSB)²/(n − 1) + (C₂ MSW)²/(N − n)]   (Searle et al. [14, page 134])

where C₁ and C₂ are given in Equation 1.3.
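A sketch of the unbalanced case follows (Python/NumPy). Note the explicit formulas for C₁ and C₂ below are reconstructed by substituting the moment estimates of the variance components into Result 1.2, since the original display of Equation 1.3 is not fully legible; the cluster sizes and variances are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
t_i = np.array([3, 4, 4, 5, 5, 6, 6, 7, 8, 8])  # unequal numbers of repeats
n, N = len(t_i), int(t_i.sum())
sigma2_a, sigma2, mu = 3.0, 1.0, 0.0

# Unbalanced one-way random-components data: y_ij = mu + a_i + e_ij
a = rng.normal(0, np.sqrt(sigma2_a), n)
y = [mu + a[i] + rng.normal(0, np.sqrt(sigma2), t) for i, t in enumerate(t_i)]

ybar_i = np.array([yi.mean() for yi in y])
ybar = np.concatenate(y).mean()                 # overall weighted mean

SSW = sum(((yi - m) ** 2).sum() for yi, m in zip(y, ybar_i))
SSB = (t_i * (ybar_i - ybar) ** 2).sum()
MSW, MSB = SSW / (N - n), SSB / (n - 1)

# Coefficients C1, C2 of Equation 1.3 (reconstructed, as noted above)
t_tilde = (N - (t_i ** 2).sum() / N) / (n - 1)
C1 = ((t_i ** 2).sum() / N ** 2) / t_tilde
C2 = 1 / N - C1                                 # reduces to C2 = 0 when all t_i are equal
var_mean = C1 * MSB + C2 * MSW                  # estimated variance of the overall mean

# Satterthwaite's approximate degrees of freedom
df = (C1 * MSB + C2 * MSW) ** 2 / (
    (C1 * MSB) ** 2 / (n - 1) + (C2 * MSW) ** 2 / (N - n))
```

In the balanced case the same code gives C₂ = 0 and df = n − 1, recovering the exact χ² result above.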
As in the case of linear regression, where we estimate the straight-line relationship between outcome and covariate, random component models can be similarly extended to model continuous covariates. The main difference between fixed effects linear regression and random effects linear regression is that the vector of intercepts and slopes for each cluster is assumed to be randomly distributed with some variance-covariance structure.

Consider an example of five individuals with 5 repeated outcome and explanatory measurements given in Table 1.1. If we perform simple linear regression analyses on each individual (within-individual analysis) we find that all of the estimated slopes are negative and their average is -1.7. An overall regression of individuals' average Y against average X (between-individual analysis) yields 3.0 as the estimate of the slope. If we perform one last analysis in which we use all of the data but disregard any clustering by the individual (marginal analysis) we get a slope estimate of 1.3. The individual regression lines as well as the marginal regression line are plotted in Figure 1.1.

Table 1.1: Example Data of 5 Individuals

Figure 1.1: Plots of Y versus X for each individual and for the entire sample for the Example Data.
We can clearly see that from a marginal analysis the estimate of the slope is positive, but from a within-individual analysis it is negative. This occurs because the marginal analysis is a weighted average of the within-individual slope and the slope of the between-individual analysis. Since the between-individual analysis is in the opposite direction of the within-individual analysis, we see that the marginal analysis is 'pulled' away from the within-individual analysis to a positive value.

Therefore, ignoring heterogeneity between individuals in the statistical analysis when it exists may give biased estimates of the slope, in addition to biasing the estimate of its standard error.
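The reversal between the marginal and within-individual slopes is easy to reproduce. The sketch below builds five clusters with a common negative within-individual slope but cluster centers placed along a positive between-individual trend (all numbers here are invented for illustration; the actual Table 1.1 values are not used):

```python
import numpy as np

rng = np.random.default_rng(2)
n, t = 5, 5
within_slope = -1.7                       # common within-individual slope

# Cluster centers chosen so that higher average X goes with higher
# average Y, i.e. a positive between-individual trend
x_centers = np.arange(n) * 2.0
y_centers = 3.0 * x_centers

X, Y, slopes = [], [], []
for i in range(n):
    x = x_centers[i] + rng.uniform(-1, 1, t)
    y = y_centers[i] + within_slope * (x - x_centers[i]) + rng.normal(0, 0.1, t)
    slopes.append(np.polyfit(x, y, 1)[0])  # within-individual slope estimate
    X.append(x)
    Y.append(y)

marginal_slope = np.polyfit(np.concatenate(X), np.concatenate(Y), 1)[0]
# Every within-individual slope is negative, yet the marginal analysis,
# which ignores the clustering, is pulled to a positive slope.
```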
1.2 Random Effects versus Random Coefficients
There is a slight distinction between random effects models and random coefficients models. In the model discussed above, we assumed that the individual parameters, the individual deviations from the population mean, had mean zero. The model equation was given by:

    y_ij = μ + α_i + ε_ij,   i = 1, …, n,   j = 1, …, t_i        (1.8)

We could re-parameterize this equation as:

    y_ij = μ_i + ε_ij,   i = 1, …, n,   j = 1, …, t_i            (1.9)

where instead of each α_i having mean zero, the μ_i's now have mean μ. We have defined μ_i = μ + α_i. So, in Equation 1.8, we see that there are fixed effects, μ, as well as random effects, α_i and ε_ij, while in Equation 1.9, all effects are random.

In the context of multiple linear regression, the random effects model is given by:

    y_ij = Σ_{k=1}^p (β_k + B_ki) x_kij + ε_ij                   (1.10)

where the β_k's are fixed (k = 1, …, p), the B_ki's are random with mean zero and some variance-covariance structure, and the ε_ij's are random errors with mean zero and variance σ². For the case of random coefficients, the analogous model is given by:

    y_ij = Σ_{k=1}^p β_ki x_kij + ε_ij                           (1.11)

where the β_ki's are all random with mean β_k and the same variance-covariance structure as the random parameters in Equation 1.10, and the ε_ij's are random errors with mean zero and variance σ².
1.3 Extension to the Binary Outcome Case
As in continuous outcome longitudinal data, heterogeneity can exist when our outcome of interest is binary, that is, when we are measuring the presence or absence of some event. When we have clustering in the data, the standard errors of our parameter estimates differ. Therefore, we would like a method that can properly model this heterogeneity and carry out unbiased estimation and testing of these parameters. The generalized estimating equation (GEE) techniques of Liang & Zeger, which are discussed further in Part III, can model this heterogeneity. However, the GEE estimator models a marginal relationship between the outcome and explanatory variables and only corrects the standard errors of these estimates by incorporating this heterogeneity. Although these standard errors may be correct, a marginal analysis in many situations fails to give us accurate estimates of the regression parameters, as we saw in Section 1.1. The SAS procedure MIXED can also account for the heterogeneity by modeling the within-subject relationships with a random effects technique. However, the SAS procedure MIXED can only be used in the continuous outcome case. Other methods exist that model random components for binary outcome situations [12, 15, 18, 19], but they tend to be very computer-intensive. Therefore, we would like to investigate the possibility of extending the 'simple' methods of Swamy that are explored in Part II to the case where the outcome of interest is binary.
Part II
Continuous Outcome Longitudinal
Data
Chapter 2
Theory of Random Coefficients for
Multiple Linear Regression
Let us suppose that we have t repeated measurements for n individuals. The model for the ith individual is given by

    Y_i = X_i β_i + ε_i                                          (2.1)

In this model, Y_i is a t × 1 vector representing the ith individual's t repeated measurements on the response variable, X_i is a t × p matrix of t corresponding repeated measurements on p explanatory variables, β_i is a p × 1 vector of individual-specific regression coefficients corresponding to the p explanatory variables, and ε_i is a t × 1 vector of t random errors.

So, from this model, we can see that each individual has a different relationship between his or her outcome measurements and explanatory variables, defined by the p regression parameters of β_i. For the purpose of inference in this model, we need to make the following assumptions:

A1. β_i ~ MVN(β, Σ_ββ) and β_i ⊥ β_k, {i ≠ k; i, k ∈ 1, …, n}; i.e. each β_i comes from a multivariate normal population of dimension p with mean vector β and variance-covariance matrix Σ_ββ, and the β_i's are independent across individuals.

A2. ε_i ~ MVN(0, σ²I); i.e. each ε_ij, j = 1, …, t, comes from a univariate normal population with mean zero and variance σ², and all of the ε_ij are independent of each other (this is often referred to as the conditional independence assumption).

A3. β_i ⊥ ε_k, {∀ i, k ∈ 1, …, n}; i.e. each individual's vector of regression parameters, β_i, is independent of their own and all other individuals' vectors of random errors, ε_k.

A4. The X_i's are fixed and of full column rank for each i.

A5. For a fixed number of explanatory variables, p, min(n, t) > p; i.e. there are both more repeated measurements per individual and more individuals than there are explanatory variables.

A6. The elements of (X_i′X_i)⁻¹ are uniformly O(t⁻¹); i.e. there exists a finite upper bound, M, such that the elements of t(X_i′X_i)⁻¹ are less than M in absolute value for all i and t.
The first assumption describes the fundamentals of the random coefficient regression model: each individual's parameters come from an underlying multivariate Gaussian population, and these individual parameter vectors are independent of each other. The second assumption is clearly a more restrictive assumption and is not necessary for random coefficient regression in general, but for the purpose of this thesis we will use it to simplify much of the computation. We will address this assumption later in the discussion.

From assumptions A1 and A2 and Equation 2.1, we can see that, conditional on X_i,

    Y_i ~ MVN(X_i β, X_i Σ_ββ X_i′ + σ²I)                        (2.2)

Using the above model, we can now proceed with developing techniques for estimation of the unknown parameters.
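A minimal sketch of generating data satisfying Equation 2.1 and assumptions A1 and A2 follows; the dimensions, β, Σ_ββ, and the covariate distribution are arbitrary illustrative choices, not values from the thesis:

```python
import numpy as np

rng = np.random.default_rng(3)
n, t, p = 30, 8, 2
beta = np.array([1.0, 2.0])                    # population mean vector
Sigma_bb = np.array([[0.5, 0.1],
                     [0.1, 0.3]])              # Sigma_beta_beta
sigma2 = 1.0

L = np.linalg.cholesky(Sigma_bb)
X_list, beta_i, Y_list = [], [], []
for i in range(n):
    Xi = np.column_stack([np.ones(t), rng.uniform(0, 5, t)])  # intercept + covariate
    bi = beta + L @ rng.normal(size=p)         # A1: beta_i ~ MVN(beta, Sigma_bb)
    ei = rng.normal(0, np.sqrt(sigma2), t)     # A2: eps_i ~ MVN(0, sigma2 * I)
    X_list.append(Xi)
    beta_i.append(bi)
    Y_list.append(Xi @ bi + ei)                # Equation 2.1: Y_i = X_i beta_i + eps_i
```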
2.2 Estimation Methods
Commonly used estimation methods in random coefficient regression analysis are often
referred to as two-stage estimation methods. In this thesis a third stage is added to allow
the original estimates to be updated using an iterative technique.
2.2.1 First Stage Estimation (The Individual)
The first stage of estimation comes at the level of the observational unit, i.e. the individual. Each individual contributes a set of measurements on both response and explanatory variables. We use these measurements to model each individual's regression parameters separately. Since each individual follows the model given in Equation 2.1, we can use the theory of least squares to estimate those parameters. This least-squares estimator is given by:

    b_i = (X_i′X_i)⁻¹ X_i′Y_i                                    (2.3)

From assumptions A1 and A2, we can see that, conditional on X_i, we have the following results:

    E{b_i} = β                                                   (2.4)
    Var{b_i} = Σ_ββ + σ² (X_i′X_i)⁻¹                             (2.5)

where each b_i is independent of all b_k, for all i and k where i, k ∈ 1, …, n.

At this point we can also get an estimate of the within-individual component of variance, σ². The sum of squares for error for the ith individual is given by:

    ê_i′ê_i = (Y_i − X_i b_i)′(Y_i − X_i b_i)

Therefore, because it is well known that E{ê_i′ê_i/(t − p)} = σ², it follows that:

    σ̂² = (1/n) Σ_{i=1}^n ê_i′ê_i/(t − p)

is an unbiased estimate of the pooled within-individual variance.

With estimates of each individual's regression parameters, b_i, and an estimate of the within-individual variance component, σ̂², we can now move to the second stage of estimation, that of estimation over the individuals by aggregating the individual regression parameter estimates b_i.
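The first stage can be sketched as follows: fit ordinary least squares within each individual, then pool the per-individual error mean squares. The simulated data below are an illustrative stand-in (the parameter values are invented):

```python
import numpy as np

rng = np.random.default_rng(4)
n, t, p = 30, 8, 2
beta, sigma2 = np.array([1.0, 2.0]), 1.0

# Simulated RCR data: individual coefficients scattered around beta
X = [np.column_stack([np.ones(t), rng.uniform(0, 5, t)]) for _ in range(n)]
b_true = beta + rng.normal(0, 0.5, (n, p))
Y = [X[i] @ b_true[i] + rng.normal(0, np.sqrt(sigma2), t) for i in range(n)]

# First stage: b_i = (X_i'X_i)^{-1} X_i'Y_i for each individual
b = np.array([np.linalg.solve(X[i].T @ X[i], X[i].T @ Y[i]) for i in range(n)])

# Pooled within-individual variance: average of SSE_i/(t - p)
sse = np.array([((Y[i] - X[i] @ b[i]) ** 2).sum() for i in range(n)])
sigma2_hat = (sse / (t - p)).mean()
```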
2.2.2 Second Stage Estimation (The Aggregate)
Unweighted Least-Squares Estimator

Gumpertz & Pantula [8] define the following unweighted least-squares (ULS) estimator as the unweighted average of the individual regression parameter estimates:

    β̂_ULS = (1/n) Σ_{i=1}^n b_i                                 (2.6)

It is important to note that this estimator does not depend on the variance components in the model. This property makes it a simple way of estimating the aggregated regression parameter without the need for estimating these variance components. However, for the purpose of inference, the variance components may be needed in calculating the degrees of freedom, as shown below in Section 2.3.
Weighted Least-Squares Estimator
For the case where σ² and Σ_ββ are known, Swamy [16, 17] showed that the generalized least-squares estimator given by

    β̂_GLS = (Σ_{i=1}^n W_i)⁻¹ (Σ_{i=1}^n W_i b_i)               (2.7)

is the Best Linear Unbiased Estimator (BLUE) for β, where W_i is the inverse of the variance of the regression parameter estimate for the ith individual, namely, W_i⁻¹ = Var{b_i} = Σ_ββ + σ² (X_i′X_i)⁻¹ (see Equation 2.5).

Since σ² and Σ_ββ are seldom known, Swamy [16, 17] proposed the estimated generalized least-squares (EGLS) or weighted least-squares (WLS) estimator, similar to β̂_GLS but with estimates of W_i. For this thesis, we will refer to it as the WLS estimator; it is given by:

    β̂_WLS = (Σ_{i=1}^n Ŵ_i)⁻¹ (Σ_{i=1}^n Ŵ_i b_i)               (2.8)

where Ŵ_i⁻¹ = Σ̂_ββ + σ̂² (X_i′X_i)⁻¹. A more commonly known version of the generalized least-squares estimator was put forth by Henderson [10] and improved upon by Harville [9] and Laird & Ware [11]. It is given by:

    β̂ = (Σ_{i=1}^n X_i′ W_yi X_i)⁻¹ (Σ_{i=1}^n X_i′ W_yi Y_i)

where

    W_yi⁻¹ = Var{Y_i} = X_i Σ_ββ X_i′ + σ²I   (see Equation 2.2)

In Appendix A it is shown that the Henderson and Swamy versions of the generalized least-squares estimator are, in fact, identical under the assumption of conditional independence.

Therefore, our two working estimators for β (β̂_ULS and β̂_WLS) are identical except for their averaging weights: β̂_ULS uses equal weights 1/n (i.e. unweighted) and β̂_WLS uses weights given by the inverse of the variance of the regression parameter estimates.
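The claimed equivalence of the Swamy and Henderson forms (proved in Appendix A) can be checked numerically with known σ² and Σ_ββ: on any simulated dataset (the one below is an illustrative fabrication) the two formulas agree to machine precision:

```python
import numpy as np

rng = np.random.default_rng(5)
n, t, p = 10, 6, 2
Sigma_bb = np.array([[0.5, 0.1],
                     [0.1, 0.3]])
sigma2 = 1.0
L = np.linalg.cholesky(Sigma_bb)

X = [np.column_stack([np.ones(t), rng.uniform(0, 5, t)]) for _ in range(n)]
Y = [Xi @ (np.array([1.0, 2.0]) + L @ rng.normal(size=p))
     + rng.normal(0, np.sqrt(sigma2), t) for Xi in X]

# Swamy form: weighted average of b_i with W_i = [Sigma_bb + sigma2 (X_i'X_i)^{-1}]^{-1}
b = [np.linalg.solve(Xi.T @ Xi, Xi.T @ Yi) for Xi, Yi in zip(X, Y)]
W = [np.linalg.inv(Sigma_bb + sigma2 * np.linalg.inv(Xi.T @ Xi)) for Xi in X]
swamy = np.linalg.solve(sum(W), sum(Wi @ bi for Wi, bi in zip(W, b)))

# Henderson form: GLS on Y_i with Var{Y_i} = X_i Sigma_bb X_i' + sigma2 I
Vinv = [np.linalg.inv(Xi @ Sigma_bb @ Xi.T + sigma2 * np.eye(t)) for Xi in X]
A = sum(Xi.T @ Vi @ Xi for Xi, Vi in zip(X, Vinv))
c = sum(Xi.T @ Vi @ Yi for Xi, Vi, Yi in zip(X, Vinv, Y))
henderson = np.linalg.solve(A, c)
```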
Since we have already defined our estimate of the within-individual variance, σ̂², we need an estimate of Σ_ββ in order to proceed with the weighted least-squares estimator.

Let the statistic S_bb be given by:

    S_bb = Σ_{i=1}^n (b_i − β̂_ULS)(b_i − β̂_ULS)′

It has been reported [16, 17, 8] and can be proven (refer to Appendix B.1) that:

    E{S_bb} = (n − 1) Σ_ββ + (1 − 1/n) σ² Σ_{i=1}^n (X_i′X_i)⁻¹

and, using the fact that β̂_ULS = (1/n) Σ_{i=1}^n b_i, the method of moments estimator of the variance-covariance matrix of the random coefficients is given by:

    Σ̂_ββ = S_bb/(n − 1) − (σ̂²/n) Σ_{i=1}^n (X_i′X_i)⁻¹

Since we are calculating the difference of two matrices, there is no guarantee that the resulting matrix will be positive-definite. Thus, it is possible that we may not get a positive-definite estimate for our variance-covariance matrix of the random coefficients. To safeguard against this, Carter & Yang [3] proposed a modification that produces a corrected estimator, defined through the quantity D = Σ_{i=1}^n t + Trace{σ̂² Σ_{i=1}^n (X_i′X_i)⁻¹} and the smallest root θ̂ of an associated equation (see [3] for the details). It is this corrected estimate of the parameter variance that will be used in all subsequent estimation techniques for Part II.
Since we now have estimates of the variance components in the model (σ̂² and Σ̂_ββ) as well as the individual regression parameter estimates (b_i), we are in a position to calculate the weighted least-squares estimator given in Equation 2.8.

As shown by Equation B.2 in Appendix B.1, Var{β̂_ULS} = E{S_bb}/[n(n − 1)]; therefore,

    Vâr{β̂_ULS} = S_bb/[n(n − 1)]

is an estimate for the variance of the unweighted least-squares estimator. For the Swamy estimator, we have:

    Var{β̂_GLS} = (Σ_{i=1}^n W_i)⁻¹

Therefore,

    Vâr{β̂_WLS} = (Σ_{i=1}^n Ŵ_i)⁻¹

is an estimate for the variance of the weighted least-squares estimator.
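Putting the second stage together, a sketch of the ULS and WLS estimators with their variance estimates follows (simulated data with invented parameter values). Where Σ̂_ββ fails to be positive-definite the code simply floors its eigenvalues at zero; this is a crude stand-in for the Carter & Yang correction, not the corrected estimator itself:

```python
import numpy as np

rng = np.random.default_rng(7)
n, t, p = 40, 10, 2
beta, sigma2 = np.array([1.0, 2.0]), 1.0
Sigma_bb = np.array([[0.5, 0.1],
                     [0.1, 0.3]])
L = np.linalg.cholesky(Sigma_bb)

X = [np.column_stack([np.ones(t), rng.uniform(0, 5, t)]) for _ in range(n)]
Y = [Xi @ (beta + L @ rng.normal(size=p)) + rng.normal(0, 1.0, t) for Xi in X]

# First stage
b = np.array([np.linalg.solve(Xi.T @ Xi, Xi.T @ Yi) for Xi, Yi in zip(X, Y)])
sigma2_hat = np.mean([((Yi - Xi @ bi) ** 2).sum() / (t - p)
                      for Xi, Yi, bi in zip(X, Y, b)])

# Unweighted least-squares estimator and its variance estimate S_bb/[n(n-1)]
b_uls = b.mean(axis=0)
S_bb = sum(np.outer(d, d) for d in b - b_uls)
var_uls = S_bb / (n * (n - 1))

# Moment estimator of Sigma_bb, eigenvalues floored at zero if the
# matrix difference is not positive-definite (stand-in for Carter & Yang)
G = [np.linalg.inv(Xi.T @ Xi) for Xi in X]
Sigma_hat = S_bb / (n - 1) - (sigma2_hat / n) * sum(G)
w_eig, V = np.linalg.eigh(Sigma_hat)
Sigma_hat = (V * np.maximum(w_eig, 0)) @ V.T

# Swamy's weighted least-squares estimator and its variance estimate
W = [np.linalg.inv(Sigma_hat + sigma2_hat * Gi) for Gi in G]
b_wls = np.linalg.solve(sum(W), sum(Wi @ bi for Wi, bi in zip(W, b)))
var_wls = np.linalg.inv(sum(W))
```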
2.2.3 Third Stage Estimation (The Updating Procedure)
Are we able to 'borrow' information across individuals to improve our individual regression parameter estimates? Can we re-calculate our sums of squares for error by weighting them by the reciprocals of their variances? Or, can we update our estimates of the weights Wᵢ? Symbolically, these may be stated as follows:
where Wᵢ⁻¹ = Var{yᵢ} = Xᵢ Σ_ββ Xᵢ′ + σ²I.
These conjectures are explored one at a time and the results are given in Appendix C.
From Appendix C.1 we can see that updating our individual parameter estimates by borrowing information across individuals gives us exactly the same parameter estimate. Therefore, we cannot improve upon the individual's estimate by using information from the entire sample. Also, from Appendix C.2 we see that we cannot re-calculate our sum of squares for error for an individual using the aggregate group variance as a weight to give us an improved estimate. In both Conjectures 1 and 2, we clearly see that using information from the whole sample does not give us more information about the individual. This is consistent with our random coefficient regression assumptions, namely, that individuals behave differently and independently from each other. However, from Conjecture 3 and Appendix C.3, it is evident that S̃_bb is clearly different from S_bb, but their expectations are approximately equal. So, we have now identified a means of updating our estimate of the regression parameter variance which will, in turn, update our estimate of the aggregate regression parameters.
Iterated Weighted Least-Squares Estimator
Therefore our iterated weighted least-squares estimator, β̂*, is given by:
where,
Thus, we iterate through estimates for β̂* until convergence is achieved. At that point we call the value of β̂* at convergence the iterated weighted least-squares (IWLS) estimator. For the purpose of inference, its variance is given by:
where W̃ᵢ is the weight matrix calculated using the updated variance estimate.
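The structure of the updating procedure can be sketched for a single coefficient. The variance update below is a schematic stand-in for the S̃_bb-based matrix formulas, not an exact transcription:

```python
def iwls(b, v, tol=1e-8, max_iter=100):
    """Schematic scalar IWLS: alternate between a weighted aggregate
    estimate and an updated between-individual variance until the
    aggregate estimate stabilises.  b: individual estimates,
    v: their sampling variances."""
    n = len(b)
    beta = sum(b) / n
    # initial moment estimate of the between-individual variance
    s2b = max(sum((bi - beta) ** 2 for bi in b) / (n - 1) - sum(v) / n, 0.0)
    for _ in range(max_iter):
        w = [1.0 / (s2b + vi) for vi in v]
        new_beta = sum(wi * bi for wi, bi in zip(w, b)) / sum(w)
        # re-estimate the between-individual variance about the new aggregate
        s2b = max(sum((bi - new_beta) ** 2 for bi in b) / (n - 1) - sum(v) / n, 0.0)
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    return beta

print(iwls([1.8, 2.1, 2.4, 1.5], [0.10, 0.40, 0.40, 0.10]))
```

With these illustrative numbers the loop settles after a couple of passes, consistent with the fast convergence reported for the IWLS estimator later in the thesis.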
2.3 Inference
For the unweighted least-squares estimator defined in Equation 2.6, with its variance given by Equation 2.15, we define the statistic:
for testing H₀ : Lβ_ULS = λ₀, where L is a q × p matrix of linearly independent rows.
Gumpertz & Pantula [8] proved for this statistic, T²_ULS, that:
1. For fixed n and t tending to infinity, a scaled T²_ULS is distributed as F(q, n − q).
2. For fixed t and n tending to infinity, T²_ULS is distributed as χ²_q.
3. For n·t large and q = 1, T²_ULS is distributed as F(1, ν) ≡ T²(ν) (cf. Satterthwaite on Page 6), where
and L = l′.
For the weighted least-squares estimator defined in Equation 2.8, with its variance given by Equation 2.17, we define the statistic:
for testing H₀ : Lβ_WLS = λ₀, where L is a q × p matrix of linearly independent rows.
Swamy [16, 17] and Carter & Yang [3] proved for this statistic, T²_WLS, that:
1. For fixed n and t tending to infinity, a scaled T²_WLS is distributed as F(q, n − q).
2. For fixed t and n tending to infinity, T²_WLS is distributed as χ²_q.
3. For n·t large and q = 1, T²_WLS is distributed as F(1, ν) ≡ T²(ν) (cf. Satterthwaite on Page 6), where
ν = (l′Σ̂_ββ l + σ̂² l′Cl)² / {(n − 1)⁻¹ (l′Σ̂_ββ l)² + [t²(nt − np)]⁻¹ σ̂⁴ (l′Cl)²}
Now, using the results of Swamy and Carter & Yang, we extend these findings to our iterated estimator given in Equation 2.21, with its variance given by Equation 2.25, and define the statistic:
for testing H₀ : Lβ_IWLS = λ₀, where L is a q × p matrix of linearly independent rows. We will assume for this statistic, T²_IWLS, that:
1. For fixed n and t tending to infinity, a scaled T²_IWLS is distributed as F(q, n − q).
2. For fixed t and n tending to infinity, T²_IWLS is distributed as χ²_q.
3. For n·t large and q = 1, T²_IWLS is distributed as F(1, ν) ≡ T²(ν) (cf. Satterthwaite on Page 6), where
It is these test statistics that will be used in all subsequent analyses of continuous outcome data.
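For q = 1 each of the statistics above reduces to a squared standardized difference referred to an F or chi-square distribution. A minimal sketch, using the chi-square reference for illustration:

```python
def wald_T2(beta_hat, beta0, var_hat):
    """Squared standardized difference between the estimate and the
    hypothesised value, i.e. the q = 1 form of the test statistics."""
    return (beta_hat - beta0) ** 2 / var_hat

t2 = wald_T2(1.95, 0.0, 0.25)      # illustrative estimate and variance
print(t2, t2 > 3.84)               # 3.84: upper 5% point of chi-square, 1 df
```

Which reference distribution (F, chi-square, or the Satterthwaite T²(ν)) is appropriate depends on the sizes of n and t, as the three cases above describe.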
In the next chapter, we compare the statistical properties of bias and power for the three estimators, β̂_ULS, β̂_WLS, and β̂_IWLS, as well as two other well-known estimators: the Henderson estimator (Equation 2.9), calculated via the SAS procedure MIXED and referred to as the MIX estimator, and the ordinary least-squares estimator, which ignores all assumptions of random coefficient regression modeling and treats every observation equally and independently, calculated via the SAS procedure REG and referred to as the REG estimator.
Chapter 3
Continuous Outcome Longitudinal Data Simulation
3.1 The Simulation
Let us consider the model given by Equation 2.1 where we set p = 2; that is, we have only one covariate and we are interested in estimating an intercept and a slope. Therefore, the model for the jth of T observations for the ith of N individuals is given by:
where
• βᵢ = (βᵢ₀, βᵢ₁)′ ~ NID(β, Σ_ββ), where β = (β₀, β₁)′ takes on the values of (0,0)′ or (1,2)′ and γ (Scale) takes on the values of 1 or 5.
• Xᵢⱼ = (1, xᵢⱼ)′, where xᵢⱼ is NID with variance σ²ₓ, and σ²ₓ (Sigmax2) takes on the values of 0.1 or 1.0.
• eᵢⱼ ~ NID(0, σ²) and σ² (Sigma2) takes on the values of 1 or 5.
• The quantities N and T take on the values of 10 or 50.
Therefore, for this simulation we have a 2⁶ factorial design, or equivalently, for each level of the true regression parameters, β = (0,0)′ or β = (1,2)′, we have a 2⁵ factorial design.
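The design above can be sketched as a data generator. The zero-mean covariate and the independent random intercept and slope, each with variance γ, are simplifying assumptions for illustration, since the exact covariance structure is not fully recoverable here:

```python
import random

def simulate(N=10, T=10, beta=(0.0, 0.0), gamma=1.0,
             sigmax2=0.1, sigma2=1.0, seed=1):
    """One simulated dataset: N subjects, T observations each, with a
    random intercept and slope per subject.  Returns (subject, x, y)
    triples."""
    rng = random.Random(seed)
    data = []
    for i in range(N):
        b0 = rng.gauss(beta[0], gamma ** 0.5)   # random intercept
        b1 = rng.gauss(beta[1], gamma ** 0.5)   # random slope
        for j in range(T):
            x = rng.gauss(0.0, sigmax2 ** 0.5)  # covariate
            y = b0 + b1 * x + rng.gauss(0.0, sigma2 ** 0.5)
            data.append((i, x, y))
    return data

data = simulate(N=10, T=10)
print(len(data))  # 100 rows: 10 subjects x 10 measurements
```

Varying beta, gamma, sigmax2, sigma2, N, and T over their two levels reproduces the 2⁶ factorial layout.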
For each of these levels of the factorial design we generated 100 random trials. Each trial consisted of N subjects with T measurements per subject. At each trial, the three random coefficient methods (i.e., ULS, WLS, and IWLS), the Henderson random effects technique of the SAS procedure MIXED (MIX), and the ordinary least-squares estimator of the SAS procedure REG (REG) were used to analyze the simulated dataset.
The estimates of the regression parameters, their estimated standard errors using formulae 2.15, 2.17, and 2.25, and the number of times the null hypothesis (H₀ : β = (0,0)′) was rejected (based on a 5% level of significance) were all recorded for each method on each simulated dataset for the slope only. For the random coefficient methods, rejection numbers were collected based on all three distributions of the test statistics for each of the three methods given by Equations 2.26, 2.27, and 2.28.
Analysis of variance techniques (and logistic regression for the proportions) were used to analyze the results of the simulation. As was anticipated, the Type I error rate of the REG estimator was much larger than that of the other four test statistics. Therefore, in our analysis of the simulation results we created a classification variable, METHOD, that included only the ULS, WLS, IWLS, and MIX estimators. The rejection rate of the slope under the null hypothesis is plotted in Figure 3.0. It is clear that an analysis that ignores all heterogeneity in the data (REG) is inadequate for properly analyzing data of this form. Therefore, the rest of the figures presented below, as well as all statistical analyses, include only those four estimators, which are the outcomes of the METHOD variable.
In our analyses of variance, a full interaction term model was fit for the outcome variables of mean slope, relative variance (see below) of slope, and rejection rate of slope, where our sample was based on approximately 12800 (= 2⁵ × 4 × 100) observations. In
Figure 3.0: Plot of Rejection Rate of Slope under the null hypothesis for all five estimators.
these models, only the main effects and the second order interactions with METHOD were of interest. In the remaining cases, where the outcome of interest depended on factorial-level summary measures and our sample was based on 128 (= 2⁵ × 4) observations, we used a generalized F-test and determined that a fifth order interaction term was statistically significant. This model was fit so that all F-tests of the main effects and their interactions with METHOD were based on a mean square error from this high order model. In all instances where logistic regression was used to analyze the proportion of test rejections, 0.5 was added to each cell to prevent empty cell counts from giving uninformative analysis of variance parameters (refer to Gart & Zweifel [7]).
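The 0.5 correction amounts to forming empirical logits from shifted cell counts, which stay finite even when a cell is empty. A minimal sketch:

```python
import math

def empirical_logit(rejections, trials):
    """Log-odds with 0.5 added to each cell, so an empty cell still
    yields a finite value."""
    return math.log((rejections + 0.5) / (trials - rejections + 0.5))

print(empirical_logit(50, 100))  # 0.0: even split
print(empirical_logit(0, 100))   # finite, rather than minus infinity
```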
Each of the four methods defined by the METHOD variable has associated with it a formula for the estimate of the variance of the estimated slope. An example of such a formula is that given in Equation 2.15 for the unweighted estimator (ULS) and Equation 2.17 for the weighted estimator (WLS). In practice we would like to know whether these calculated variance estimates, V_F, tend to have positive or negative bias. Our estimate of the true variance (empirical variance), V_T, is taken to be the variance of the 100 simulated estimates of the intercept and slope. Because this estimate is our proxy for the true variance, it would have been preferable if it had been based on a much larger number of simulations. Therefore, we can calculate for each of the 32 cells of the factorial design and for each method the average of the 100 calculated variances using the
Table 3.1: Definition of the Factors used in the Simulation Study

F1. Hypothesis: Null, β = (0,0)′ / Alternative, β = (1,2)′
F2. Size of variance of regression parameters (γ): LOW / HIGH
F3. Size of variance of the explanatory variable X (σ²ₓ): LOW / HIGH
F4. Size of the variance of the within-individual error (σ²): LOW / HIGH
F5. Number of clusters or individuals (N): LOW / HIGH
F6. Number of repeated measurements per individual (T): LOW / HIGH
appropriate formulae to obtain an estimate of V_F. As well, we calculated the empirical variance based on the actual variance of the 100 estimates across the 100 simulations for each method and used it as our estimate of the true variance V_T. We then calculated the difference between these values (V_F − V_T) divided by the estimate of the true variance V_T. If this 'variance ratio' is approximately equal to zero, we would conclude that the formulae used to calculate the variance of the slope lead to estimates of the variance that are close to the true value as defined by the simulation study. Our 'mean square error' was calculated as the sum of the empirical variance of the estimate and the square of the difference between the average parameter estimate and the true value.
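The variance ratio and mean square error summaries described above amount to the following, sketched with the population variance standing in for the empirical variance of the simulated estimates:

```python
from statistics import mean, pvariance

def variance_ratio(formula_vars, estimates):
    """VR = (VF - VT) / VT: average formula-based variance relative to
    the empirical variance of the simulated estimates."""
    vf = mean(formula_vars)
    vt = pvariance(estimates)
    return (vf - vt) / vt

def mean_square_error(estimates, true_value):
    """Empirical variance plus squared bias, as defined in the text."""
    return pvariance(estimates) + (mean(estimates) - true_value) ** 2

print(variance_ratio([2.0, 2.0], [1.0, 3.0]))  # (2 - 1) / 1 = 1.0
print(mean_square_error([1.0, 3.0], 1.0))      # 1.0 + 1.0 = 2.0
```

A variance ratio near zero indicates that the variance formula tracks the empirical variance; a positive value indicates positive bias.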
The results of the simulation study are summarized in Figures 3.1 to 3.10. In our analysis of variance we were able to estimate the main effects for each of the four estimation methods, first for the null hypothesis (β = (0,0)′), Figures 3.1 to 3.5, and then for the alternative hypothesis (β = (1,2)′), Figures 3.6 to 3.10. The six factors, as previously described on Page 22, are summarized in Table 3.1.
Each figure has six plots. The top left plot summarizes the results over all 3200 (= 2⁵ × 100) simulations. Each of the remaining five plots gives the results for the low
Table 3.2: Tests used for the different levels of N and T

         T = 10    T = 50
N = 10   T²(ν)     F(q, n − q)
N = 50   χ²        χ²
and high values for the five factors F2 to F6 defined in Table 3.1. This permits visual inspection of the presence of a two-factor interaction of the factor with the variable METHOD.
In Figures 3.1 and 3.6 are reported the mean values of the estimates of the slope of the model. In Figures 3.2 and 3.7 are reported the means for the variance ratio, i.e., the difference of the formula-derived (V_F) and simulated empirical (V_T) variances expressed as a proportion of the simulated empirical variance for the slope. In Figures 3.3 and 3.8 are reported the empirical variance of the slope. In Figures 3.4 and 3.9 are reported the mean square error of the slope. In Figures 3.5 and 3.10 are reported the rejection rates for the null and alternative hypotheses for the slope.
For the plots of rejection rates given in Figures 3.5 and 3.10, we used the most appropriate test among those defined by Equations 2.26, 2.27, and 2.28, for each of the three random coefficient methods ULS, WLS, and IWLS, respectively. For the 2 × 2 cases of possibilities for factors F5 and F6, we summarize the tests that were employed in Table 3.2. In this table, the distributions given in each cell refer to those given in Section 2.3.
Each plot has one or more horizontal lines that define important values. For example, in Figure 3.1 the line is equal to zero, which is the true value of the slope under the null hypothesis. In Figure 3.5, for example, the line is equal to 0.05, the expected rejection rate under the null hypothesis for an unbiased test.
Consider the top right plot in Figure 3.1. The two lines relate to whether factor F4, the variance of the within-individual error, is high (the top line) or low (the bottom line). For all figures we have adopted the convention of using a solid line for the higher value of
the factor and a dotted line for the lower value of the factor.
Two p values are reported at the top of each plot. The first p value is labeled P(m) for main effect. It is the p value associated with the test of whether the average values associated with the dotted and solid lines are significantly different from each other. The second p value, P(i), reported in each plot relates to the test of the interaction between the METHOD and FACTOR variables. For example, if we look at the middle right plot in Figure 3.1, we notice that the average value of the 1600 simulated estimated slopes is approximately equal to -0.005 for high variance of the explanatory variable X, and it is also equal to -0.005 for the ULS procedure but equal to -0.010 for the other three procedures for low variance of the explanatory variable X (factor F3). Here the p value associated with the test of the interaction term METHOD * F3 is equal to 0.996. Therefore, we have no reason to interpret the apparent difference in the ULS means of the estimated slopes for the low and high variance situations as statistically different from the other three procedures. Only if the reported p value is low, say < 0.05, in a plot, as is the case, for example, in the top right of Figure 3.1, where the p value for the main effect F4 (σ²) is equal to 0.0001, should we consider the difference as not being due to chance.
For the top left plot of each figure, the reported p value simply indicates the overall p value associated with the F-test of the comparison of mean values or the χ² test of the comparison of rejection rates.
There were nine cases under the null hypothesis (β = (0,0)′) and nine cases under the alternative hypothesis (β = (1,2)′) where the SAS procedure MIXED failed to converge. In all simulations, the IWLS estimator converged in under 8 iterations. Table 3.3 summarizes the non-convergence situations for both the null and alternative hypotheses.
The results of the other three estimators for the 18 simulated datasets for which MIX did not converge were kept in the statistical analyses and graphical summaries and are displayed in Table 3.4. There are advantages and disadvantages in doing this. If the lack of convergence of the MIX procedure occurred in situations where lack of convergence were appropriate, that is, for aberrant datasets, then the lack of convergence would be considered an advantage of the MIX procedure. If the estimates obtained by the other three methods were quite different from their true values in these 18 situations, then keeping these estimates in the analyses would give an advantage to the MIX procedure. However, if the estimates obtained with the other three estimators were close to the true values, then keeping these 18 datasets in the analyses would demonstrate the strength of these other methods. The values of the estimates for all three methods for these 18 datasets are given in Table 3.4 and demonstrate that in almost all of the cases, the three estimators (ULS, WLS, and IWLS) give estimates close to the true values. The average of the estimates of the slope under the null hypothesis for the ULS estimator is 0.31, for the WLS estimator is 0.30, and for the IWLS estimator is 0.30. Under the alternative hypothesis the average for the ULS estimator is 2.09, for WLS 2.20, and for IWLS 2.19. This implies that these estimators are accurate even when the MIX estimator fails to converge.
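The quoted averages can be checked directly from the slope estimates in Table 3.4 (values as transcribed from the table; the first nine rows are the null-hypothesis cases, the last nine the alternative cases):

```python
from statistics import mean

uls_null  = [-.48, .135, 1.87, .261, .581, -.33, -.38, .374, .791]
wls_null  = [-.17, .140, 1.48, .303, .683, -1.1, .645, .364, .331]
iwls_null = [-.17, .140, 1.48, .303, .683, -1.1, .645, .364, .330]
uls_alt   = [2.39, 2.66, 2.74, 3.63, 2.23, 1.06, .889, 1.86, 1.32]
wls_alt   = [2.34, 2.56, 3.15, 3.34, 1.92, 1.27, 1.13, 2.64, 1.41]
iwls_alt  = [2.34, 2.56, 3.15, 3.34, 1.92, 1.27, 1.13, 2.63, 1.41]

print(f"{mean(uls_null):.2f} {mean(wls_null):.2f} {mean(iwls_null):.2f}")  # 0.31 0.30 0.30
print(f"{mean(uls_alt):.2f} {mean(wls_alt):.2f} {mean(iwls_alt):.2f}")    # 2.09 2.20 2.19
```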
Table 3.3: Situations where the MIX estimator failed to converge.

β = (0,0)′                       β = (1,2)′
γ   σ²  σ²ₓ  N   T   number      γ   σ²  σ²ₓ  N   T   number
1   1   0.1  10  10              1   1   0.1  10  10
Table 3.4: Estimates of the slope for the remaining estimators when the MIX estimator failed to converge.

β = (0,0)′              β = (1,2)′
ULS    WLS    IWLS      ULS    WLS    IWLS
-.48   -.17   -.17      2.39   2.34   2.34
.135   .140   .140      2.66   2.56   2.56
1.87   1.48   1.48      2.74   3.15   3.15
.261   .303   .303      3.63   3.34   3.34
.581   .683   .683      2.23   1.92   1.92
-.33   -1.1   -1.1      1.06   1.27   1.27
-.38   .645   .645      .889   1.13   1.13
.374   .364   .364      1.86   2.64   2.63
.791   .331   .330      1.32   1.41   1.41
Figure 3.1: Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the null hypothesis.
Figure 3.2: Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.
* Variance Ratio (VR) is given by: VR = (V_F − V_T)/V_T; see text, Page 25.
Figure 3.3: Plots of Empirical Variance (V_T) of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.
Figure 3.4: Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.
Figure 3.5: Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the null hypothesis.
Figure 3.6: Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
Figure 3.7: Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
* Variance Ratio (VR) is given by: VR = (V_F − V_T)/V_T; see text, Page 25.
Figure 3.8: Plots of Empirical Variance (V_T) of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
Figure 3.9: Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
Figure 3.10: Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
3.2 The Results
3.2.1 Estimation
From Figures 3.1 and 3.6, we can determine the accuracy of the estimation of the slope. Under the null hypothesis, in Figure 3.1, we see that the ULS estimator is slightly closer to the true value of zero; however, this difference is not statistically significant. We also notice that the estimation of the slope parameter improves when the size of the variance of the within-individual error is high. Under the alternative hypothesis, in Figure 3.6, we see that all four estimators are around the same value of 1.987, which is very close to the true value of 2.0. The estimation of the slope is improved when the size of the variance of the regression parameters is low and the number of individuals is high.
In Figures 3.2 and 3.7 are shown the variance ratios of the estimators. Under the null hypothesis, in Figure 3.2, we see that the MIX estimator has a slightly lower variance ratio than the other three estimators; however, this difference is not statistically significant. The variance ratio of all four estimators seems to decrease when the size of the variance of the regression parameters is low, the size of the variance of the within-individual error is high, the size of the variance of the explanatory variable X is high, the number of individuals is high, and the number of repeated measurements per individual is high. Under the alternative hypothesis, in Figure 3.7, we see that the variance ratio of the ULS estimator is slightly lower than that of the other three estimators; as under the null hypothesis, this difference is not statistically significant. The variance ratio is lower when the size of the variance of the within-individual error is low, the size of the variance of the explanatory variable X is low, the number of individuals is low, and the number of repeated measurements per individual is low. In the case where the number of repeated measurements per individual is low, the ULS estimator has an extremely low variance ratio, although this difference is not statistically significant.
In Figures 3.3 and 3.8 are shown the empirical variances of the estimators. Under the null hypothesis, in Figure 3.3, we see that the empirical variance is almost identical for all four estimators and that the empirical variance is lower when the size of the variance of the regression parameters is low, the size of the variance of the within-individual error is low, the size of the variance of the explanatory variable X is high, the number of individuals is high, and the number of repeated measurements per individual is high. Under the alternative hypothesis, in Figure 3.8, we see that the empirical variance of the ULS estimator is slightly larger than it is for the WLS, IWLS, and MIX estimators. The empirical variance is lower, as under the null hypothesis, when the size of the variance of the regression parameters is low, the size of the variance of the within-individual error is low, the size of the variance of the explanatory variable X is high, the number of individuals is high, and the number of repeated measurements per individual is high. The empirical variance of the ULS estimator is much larger than the empirical variance of the WLS, IWLS, or MIX estimators when the size of the variance of the within-individual error is high, the size of the variance of the explanatory variable X is low, and the number of repeated measurements per individual is low.
The mean square errors are shown in Figures 3.4 and 3.9. For both the null and alternative hypotheses, the mean square errors of the estimators follow exactly the same patterns as the empirical variances of the estimators shown in Figures 3.3 and 3.8, respectively, and described above. This occurs because the bias in the estimation of the slope is negligible when added to the empirical variance in the calculation of the mean square error.
3.2.2 Type I Error Rates and Statistical Power
In Figure 3.5 the Type I error rates, the proportion of times that the null hypothesis is rejected when it is true, are shown. We can see that the rejection rates of the IWLS, and to a smaller extent MIX, estimators are much closer to the true value, 0.05, than those of the ULS or WLS estimators. The rejection rates become closer to the true value when the number of individuals is high. In the case where the number of individuals is low (N = 10), the rejection rate of the ULS estimator is around 0.072, the WLS estimator around 0.070, the IWLS estimator around 0.050, and the MIX estimator around 0.047. This implies that when the sample size is low, the IWLS and MIX estimators are much more accurate than the ULS or WLS estimators in terms of test bias.
From Figure 3.10 we can determine the proportion of times that the null hypothesis is rejected when it is known to be false. This quantity is called the power. In our simulation, we consider the situation where β = (1,2)′ and we see that the power of all four estimators is statistically equivalent; however, the power of the ULS and WLS estimators is slightly larger than that of the IWLS and MIX estimators, although this effect is not significant. The power of the estimators is larger when the size of the variance of the regression parameters is low, the size of the variance of the within-individual error is low, the size of the variance of the explanatory variable X is high, the number of individuals is high, and the number of repeated measurements per individual is high. The largest increases in power occur when the size of the variance of the regression parameters is low and the number of individuals is high. In these cases, the power of all four estimators increases from around 0.7 to very close to 1.0, where all estimators have almost identical rejection rates.
In Appendix D are reported the analysis of variance tables for all main effects and first order interaction terms with the variable METHOD, as well as the summarized data for each of the 32 levels of the factorial design.
Chapter 4
An Application: Environmental Health Among Asthmatics in the City of Windsor
To illustrate an application of the techniques discussed above, we used data from a study of asthmatics in the City of Windsor carried out by the Gage Research Institute. The objective of the study was to estimate the relationship between daily mean concentrations of air pollutants and several indicators of respiratory health in a group of asthmatics. The study was based on a sample of 39 asthmatics who reside in Windsor, Ontario, aged 12 years and older. These asthmatics were classified as such if they had an ongoing need for asthma medication.
For each of the 39 participants, data was collected on 21 consecutive days. Peak flow rates were collected as measures of respiratory status. Each individual recorded the best of three peak flow rates each morning and at bedtime, before the use of asthma medications. For each subject, environmental data was collected by a network of six Ontario Ministry of the Environment fixed-site monitoring stations. The estimate of each subject's exposure was based on pollution readings obtained from the monitoring station closest to his home. This environmental data consisted of a variety of substances, such as ozone, sulfur dioxide (SO₂), total reduced sulfur (TRS), and nitrogen dioxide (NO₂), routinely monitored by the Ministry of the Environment. The measure of pollution used here was based on the mean of hourly readings between 8 AM and 8 PM.
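The exposure summary described above amounts to averaging the hourly readings over the daytime window. A sketch, assuming hours are labeled 0 to 23 and the window includes both endpoints:

```python
def daytime_mean(hourly):
    """Mean of the readings between 8 AM and 8 PM inclusive.
    hourly: dict mapping hour-of-day (0-23) to a pollutant reading."""
    window = [hourly[h] for h in range(8, 21) if h in hourly]
    return sum(window) / len(window)

# Readings equal to the hour number, so the 8 AM - 8 PM mean is 14.0
print(daytime_mean({h: float(h) for h in range(24)}))  # 14.0
```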
For the purpose of this analysis, we will try to estimate the relationship between an individual's evening peak flow rates and the corresponding day's average measurements of sulfur dioxide and total reduced sulfur. Since it was assumed that each of the 39 individuals could have a different relationship between their respiratory outcome measurement and their corresponding environmental data, a random components model was used. For this analysis, we compare the results of the five different techniques discussed in Chapter 2. They are the unweighted least-squares estimator (ULS), the weighted least-squares estimator (WLS), the iterated weighted least-squares estimator (IWLS), the Henderson estimator calculated via the SAS procedure MIXED (MIX), and the ordinary least-squares estimator calculated via the SAS procedure REG (REG). We have also included the generalized estimating equation (GEE) estimator of Liang & Zeger. This estimator, which is discussed further in Part III, uses a marginal analysis for the parameter estimates, equal to the REG estimator, but calculates robust standard errors which account for clustering in the data. The results of this analysis are presented in Table 4.1, where β₀ is the intercept, β₁ is the slope for SO₂, and β₂ is the slope for TRS.
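The idea behind the GEE robust standard error can be sketched in the simplest possible case, the mean of clustered data. This is a sandwich-variance sketch, not the Liang & Zeger estimator itself:

```python
from statistics import mean, pstdev

def robust_se_of_mean(clusters):
    """Cluster-robust (sandwich) standard error for a simple mean.
    clusters: list of lists of observations, one inner list per subject."""
    allobs = [y for c in clusters for y in c]
    m = mean(allobs)
    n = len(allobs)
    # sum over clusters of the squared cluster-level score sum
    s = sum(sum(y - m for y in c) ** 2 for c in clusters)
    return s ** 0.5 / n

def naive_se_of_mean(clusters):
    """Standard error that ignores the clustering."""
    allobs = [y for c in clusters for y in c]
    return pstdev(allobs) / len(allobs) ** 0.5

clusters = [[1.0, 1.0], [3.0, 3.0]]   # perfectly correlated within subject
print(naive_se_of_mean(clusters))     # 0.5
print(robust_se_of_mean(clusters))    # larger: clustering inflates the SE
```

When observations within a subject are positively correlated, the robust standard error exceeds the naive one, which is exactly the pattern seen for the GEE column below.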
We can see from Table 4.1 that the WLS and IWLS methods yield identical results to the accuracy of the data given. This may be due to the small number of iterations (4) required for convergence.
The intercept was estimated similarly by all of the methods used in this analysis. The main difference with the intercept was that the standard error calculated by the REG estimator is much lower than for all five other estimators. This result was due to the large heterogeneity of individuals that can be seen from the estimates of the variance of the intercept (~ 23000) in Table 4.2. In Section 1.1 we introduced the idea that if
Table 4.1: Parameter Estimates for the Asthma Data

                 ULS      WLS      IWLS†    MIX‡
Estimate β₀     373.9    373.9    373.9    374.4
Std. Err.        24.7     24.7     24.7     25.2
P value          0.00*    0.00*    0.00*    0.00
Estimate β₁      -3.6     -1.1     -1.1     -0.5
Std. Err.         5.1      4.6      4.6      3.6
P value          0.49*    0.81*    0.81*    0.89
Estimate β₂      -5.2     -7.4     -7.4     -7.6
Std. Err.         7.0      6.8      6.8      6.9
P value          0.46*    0.28*    0.28*    0.28
σ̂²            2283.3   2283.3   2283.3   2274.8

† IWLS took 4 iterations to converge. ‡ MIX took 20 iterations to converge. GEE took 2 iterations to converge. * P values for ULS, WLS, and IWLS were equivalent to two decimal places for all three tests (refer to Pages 19-21).
heterogeneity is present and is ignored, which is what REG does, then the standard error
of the intercept or mean will be underestimated.
The estimate of the effect of sulfur dioxide (SO₂) on peak flow was less consistent among the different methods than was the estimate of the intercept. The WLS, IWLS,
ULS WLS IWLSt 373.9 373.9 373.9 24.7 24.7 24.7 0.00* O.OO* 0.00* -3.6 -1.1 -1.1 5.1 4.6 4.6
0.49* 0.81* 0.81" -5.2 -7.4 -7.4 7.0 6.8 6.8
0.46* 0.28* 0.28" 2283.3 2283.3 2283.3
and MIX, seemed to have the most similarities in parameter estimate (- -1) and standard
errors (N 4).
In Figure 4.1 we see three graphs. In each graph every individual has the parameter
estimate (given by the vertical axis) plotted against the corresponding diagonal element
of the inverse of the individual's variance-covariance matrix for the estimate, its weight
(given by the horizontal axis).
In the middle graph (the plot of SO2) we see one individual with a slope around -150
and a very small corresponding weight. This individual drives the ULS estimate of the
slope much lower since it is an unweighted average of all of the slopes. We find that the
Table 4.2: Parameter Variance Estimates for the Asthma Data

Each entry shows the estimate of

    Σββ = [ σ²_β0    σ_β0β1   σ_β0β2
            σ_β0β1   σ²_β1    σ_β1β2
            σ_β0β2   σ_β1β2   σ²_β2 ]

METHOD   Estimate of Σββ
ULS       23462.4  -1873.4  -1665.2
          -1873.4    378.1   -216.0
          -1665.2   -216.0   1129.5
WLS       23462.4  -1873.3  -1665.3
          -1873.3    384.4   -221.5
          -1665.3   -221.5   1134.3
IWLS      23462.4  -1873.3  -1665.3
          -1873.3    384.3   -221.5
          -1665.3   -221.5   1134.3
MIX       24397.0   -695.3  -2745.4
           -695.3     71.4     -0.6
          -2745.4     -0.6   1241.9
ULS estimate of the slope is indeed smaller than the WLS, IWLS, and MIX estimates
which, in turn, leads to a smaller p value despite the slightly larger standard error. For the
GEE estimate, we see that the estimate of the slope is much smaller than all of the other
estimates and the robust standard error (21.9) is drastically larger than the naive standard
error (9.2). The combination of this bias and overestimation of the standard error leads
to a p value (≈ 0.8) in the same range as those of the WLS, IWLS, and MIX estimators. The
between-individual slope calculated from the individual average measurements is -27.5 (p
= 0.77), and since the average of the individual slopes is negative (-3.6), we see a marginal
slope (-6.9) that lies between the two because it is a weighted average of these two estimates.
Total reduced sulfur (TRS) has regression estimates that were estimated negatively
by the ULS, WLS, IWLS, and MIX estimators and positively by the GEE and REG esti-
mators. This can occur because the GEE and REG estimates of the regression parameters
are population averages. That is, the GEE and REG estimation is a marginal analysis
which does not account for individuality in the regression estimates. This phenomenon
was previously encountered in our example of Section 1.1. The between-individual slope
of the average total reduced sulfur (TRS) with average evening peak flow rates is 117.3
(p = 0.46). Although this slope is not statistically significant, it still seems to pull the
marginal estimate in a positive direction, away from the negative average of the individual
slopes (-5.2) to a value of 2.8, which is in the opposite direction. The estimate of the slope
of the total reduced sulfur measurement when each individual was modeled with only a
unique intercept was -11.7, a value much closer to the random coefficients methods than
to the marginal method.
By calculating ratios of the variance components in the model as well as the variances of
the explanatory variables, we determined that this example was closest to our simulation
situation where the size of the variance of the regression parameters is high, the size of
the variance of the explanatory variable is low, and the number of individuals is high.
The simulation results indicate that the IWLS estimator was the best estimator for that
particular situation. Thus, using the results of the IWLS estimator on this dataset, we
would conclude that the average sulfur dioxide (p = 0.81) and total reduced sulfur (p =
0.28) measurements have little effect on the evening peak flow rates of asthmatics.
Figure 4.1: Plots of Parameter Estimates against their Weights for Asthma Data.
Part III
Binary Outcome Longitudinal Data
Chapter 5
Theory of Random Coefficients for
Multiple Logistic Regression
Let us suppose that we have t repeated measurements for n individuals where our outcome
variable of interest, y, is binary and can assume the value of 1 or 0. Then the model for
the ith individual is given by

    z_i = X_i β_i + ε_i    (5.1)

where

    z_ij = log( p_ij / (1 - p_ij) ),    p_ij = Pr{y_ij = 1}.

In this model, z_i is a t x 1 vector of t repeated unobservable estimated logits for the ith
individual, where each z_ij represents the logit of the probability of a 'success' in y_ij; X_i
is a t x p matrix of t corresponding repeated measurements on p explanatory variables;
β_i is a p x 1 vector of individual-specific logistic regression coefficients corresponding to
the p explanatory variables; and ε_i is a t x 1 vector of t random logit errors where each
ε_ij has mean zero and variance 1/[p_ij(1 - p_ij)]. For unreplicated data, z_ij is undefined; however,
the model equation is used to illustrate the relationship.
It should be noted that in the formulation of most logistic regression models, the logit
error is modeled implicitly by stressing that z_ij is an unobservable random variable with
its own variance such that, conditional on X_i and β_i, the variance of z_ij is 1/[p_ij(1 - p_ij)].
As in the continuous case, we assume that the individual parameter vectors β_i exhibit
significant variation across individuals and that these parameters follow
the multivariate Gaussian distribution. We will make the following assumptions, which
are the analogues of those given in Section 2.1:

A1. β_i ~ MVN(β, Σββ) and β_i is independent of β_k, {i ≠ k; i, k ∈ 1, ..., n}; i.e. each β_i comes from
a multivariate normal population of dimension p with mean vector β and variance-
covariance matrix Σββ, and the β_i's are independent across individuals.

A2. The X_i's are fixed and of full column rank for each i.

A3. For a fixed number of explanatory variables, p, min(n, t) > p, i.e. there are both
more repeated measurements per individual and more individuals than there
are explanatory variables.

A4. The elements of (X_i'V_i^{-1}X_i)^{-1} are uniformly O(t^{-1}), i.e. there exists a finite upper
bound, M, such that the elements of t(X_i'V_i^{-1}X_i)^{-1} are less than M in absolute
value for all i and t.
Using Assumption A1, Model 5.1, and the properties of the logit, we can see that,
conditional on X_i,

    E{z_i} = X_i β    and    Var{z_i} = X_i Σββ X_i' + V_i

where V_i is the t x t diagonal matrix of the logit error variances. With the above model,
we can now proceed with extending the techniques of the con-
tinuous outcome random coefficient regression model to the binary case in the estimation
of the unknown logistic regression parameters.
5.2 Estimation Methods
In exploring the estimation methods of the random coefficient logistic regression model,
we follow the same pathway that we used in the continuous case - the two-stage model
followed by a third updating stage. Thus, for the binary case, we also have three estimation
stages.
5.2.1 First Stage Estimation (The Individual)
The first stage of estimation is the main point of difference between the continuous and
binary outcome cases. This is because, for each individual, the outcome variable is modeled
as a linear function of the predictors in the continuous case and as a logistic function of
the predictors in the binary case. Thus our estimation of these different parameters will
also be different. For the continuous case, parameter estimation for each individual used
ordinary least-squares. In the binary case, the parameters of the non-linear relationship
are obtained using maximum likelihood estimation. This approach necessitates the use of
the Newton-Raphson method.
The likelihood for the ith individual is given by:

    L_i(β_i) = Π_{j=1}^{t} p_ij^{y_ij} (1 - p_ij)^{1-y_ij}

where y_ij = 1 if a success occurs on the jth observation for the ith individual and y_ij = 0
otherwise. The logarithm of this likelihood is given by:

    l_i(β_i) = Σ_{j=1}^{t} [ y_ij log(p_ij) + (1 - y_ij) log(1 - p_ij) ]

The first derivative of the logarithm of the likelihood with respect to the vector β_i leads
to the score vector,

    S_i(β_i) = X_i'(y_i - p_i)

and the negative of the second derivative will be the information matrix,

    I_i(β_i) = X_i' V_i^{-1} X_i,    V_i^{-1} = diag{p_ij(1 - p_ij)}.

Therefore, using Newton-Raphson, our estimate of the individual logistic regression
parameter, b_i, at the mth and (m + 1)th steps of the iterative process is given by the
equation:

    b_i^{(m+1)} = b_i^{(m)} + [I_i(b_i^{(m)})]^{-1} S_i(b_i^{(m)})

where p_i^{(m)} = (p_i1^{(m)}, ..., p_it^{(m)})' and logit(p_ij^{(m)}) = x_ij1 b_i1^{(m)} + ... + x_ijp b_ip^{(m)}.
Using the properties of this likelihood equation and its derivatives and the assumptions
of the random coefficient model described above, we have that, conditional on X_i and
asymptotically as t → ∞,

    b_i ~ MVN(β_i, (X_i'V_i^{-1}X_i)^{-1})

where each b_i is independent of all b_k, for all i and k where i, k ∈ 1, ..., n.
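As an illustration of this first-stage fit, here is a minimal Newton-Raphson routine for a single individual's logistic regression. This is a sketch in Python with NumPy rather than the SAS code used in the thesis; the function name, starting value, and convergence tolerance are our own choices, and the information matrix is written directly as X' diag(p(1-p)) X.

```python
import numpy as np

def fit_individual_logit(X, y, max_iter=50, tol=1e-8):
    # Newton-Raphson for one individual's logistic regression:
    #   score   S(b) = X'(y - p)
    #   info    I(b) = X' diag(p(1-p)) X
    #   update  b <- b + I(b)^{-1} S(b)
    t, p = X.shape
    b = np.zeros(p)                                # conventional starting value
    for _ in range(max_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ b))          # fitted probabilities p_ij
        score = X.T @ (y - mu)
        info = X.T @ (X * (mu * (1.0 - mu))[:, None])
        step = np.linalg.solve(info, score)
        b = b + step
        if np.max(np.abs(step)) < tol:
            break
    cov = np.linalg.inv(info)                      # est. Var(b_i | X_i, beta_i)
    return b, cov
```

The returned covariance plays the role of (X_i'V_i^{-1}X_i)^{-1} and is the first-stage quantity carried into the second stage.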
Before moving on to the next stage of estimation we must consider the possibility
of non-convergence of the Newton-Raphson algorithm. Convergence will occur in most
situations. There are, however, two situations where it is clear that convergence will
not occur. The first situation occurs when the data have a complete (or quasi-complete)
separation, that is, when the largest covariate measurement for one of the responses
is smaller than (or equal to) the smallest covariate measurement for the other response.
The second situation occurs when the response is constant for an individual, that is, when
all responses are 1 or all responses are 0. Diagrammatically, these cases look like this:

[Schematic plots of the 'Complete Separation' and 'All Same Outcome' cases appear here.]

To proceed with estimation for these cases we will assume that within the two categories
defined by whether y_ij = 1 or y_ij = 0 the vector of covariates is multivariate Gaussian with
common variance-covariance matrix. For the case of a single predictor, this simplifies to
the covariate having constant variance but a different mean for each of the two outcomes
(i.e. f(x_ij | y_ij = k) ~ N(μ_i^(k), Σ_i); k = 0, 1 and Pr{y_ij = 1} = Π_i).
We are interested in modeling:

    logit Pr{y_ij = 1 | x_ij} = β_i0 + x_ij'β_i

where β_i0 is a scalar and β_i is a (p - 1) x 1 vector of logistic regression parameters. We
have,

    logit Pr{y_ij = 1 | x_ij}
        = log [ exp{-(1/2)(x_ij - μ_i^(1))'Σ_i^{-1}(x_ij - μ_i^(1))} Π_i ]
            - log [ exp{-(1/2)(x_ij - μ_i^(0))'Σ_i^{-1}(x_ij - μ_i^(0))} (1 - Π_i) ]

as was shown by Cornfield [5]. Therefore our logistic regression parameters, in terms of
the parameters of the distributions of the measurement variables, are given by:

    β_i  = Σ_i^{-1}(μ_i^(1) - μ_i^(0))
    β_i0 = log[Π_i / (1 - Π_i)] - (1/2)(μ_i^(1) + μ_i^(0))'Σ_i^{-1}(μ_i^(1) - μ_i^(0))

Therefore, the two parameters of the logistic model can be estimated using estimates of
μ_i^(1), μ_i^(0), Σ_i, and Π_i. As estimates we will use:

1. μ̂_i^(1) = (1/t_i^(1)) Σ_j x_ij^(1), where x_ij^(1) is a covariate vector corresponding to y_ij = 1 and t_i^(1) is
the number of repeated measurements with y_ij = 1.

2. μ̂_i^(0) = (1/t_i^(0)) Σ_j x_ij^(0), where x_ij^(0) is a covariate vector corresponding to y_ij = 0 and t_i^(0) is
the number of repeated measurements with y_ij = 0.
With these estimates of the parameters, we can obtain estimates of the logistic regres-
sion parameters as well as derive the variances of these estimators using the delta method.
The results from the delta method are given in Appendix B.2, along with our final estimators and
their variances and covariance; the variance-covariance parameters are analogous to the variance of b_i, conditional
on X_i and β_i, namely, analogous to (X_i'V_i^{-1}X_i)^{-1}.

We can now estimate our logistic regression parameters and their variances even in the
case of complete separation. If one individual has all the same response, then our estimate
of β_i is set equal to zero. In this case, the mean of the non-response distribution becomes
the mean of the all-response distribution, and the pooled estimate of variance becomes the
estimate of variance of the all-response distribution. Also, we add a correction factor to
the number of observations such that if t_i^(1) = 0 then we force t_i^(1) = 1 and t_i^(0) = t - 1, or
vice-versa for the other case.
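For the single-predictor case, this discriminant-based fallback can be sketched as follows. The function name is our own, and the forced counts in the all-same-outcome branch (t^(1) = 1, t^(0) = t - 1) reflect our reading of the correction described above, so they should be treated as an assumption of the sketch.

```python
import numpy as np

def discriminant_logit(x, y):
    # Cornfield-style estimates for one individual with a single predictor:
    #   b1 = (mean1 - mean0) / s2
    #   b0 = log(Pi / (1 - Pi)) - (mean1^2 - mean0^2) / (2 * s2)
    # where s2 is the pooled within-outcome variance and Pi the success rate.
    x, y = np.asarray(x, float), np.asarray(y, int)
    t = len(y)
    t1, t0 = y.sum(), t - y.sum()
    if t1 == 0 or t0 == 0:
        # all responses identical: slope forced to zero, counts adjusted so
        # the estimated success proportion stays strictly between 0 and 1
        t1, t0 = (1, t - 1) if t1 == 0 else (t - 1, 1)
        Pi = t1 / t
        return np.log(Pi / (1 - Pi)), 0.0
    m1, m0 = x[y == 1].mean(), x[y == 0].mean()
    # pooled within-group variance (common-variance assumption)
    s2 = (((x[y == 1] - m1) ** 2).sum() + ((x[y == 0] - m0) ** 2).sum()) / (t - 2)
    Pi = t1 / t
    b1 = (m1 - m0) / s2
    b0 = np.log(Pi / (1 - Pi)) - (m1 ** 2 - m0 ** 2) / (2 * s2)
    return b0, b1
```

Because it never solves the likelihood equations, this fallback returns finite estimates even for completely separated data, where Newton-Raphson would diverge.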
For all individuals, we now have a recipe for estimating the logistic regression param-
eter vector and its variance-covariance matrix. We can now move on to the second stage:
estimating over the aggregate.
5.2.2 Second Stage Estimation (The Aggregate)
We are now ready to define our aggregated estimate of the logistic regression parameter,
β, as well as the estimate of the variance-covariance matrix of this parameter, Σββ. This
is done in the same way as in the continuous case.
Unweighted Least-Squares Estimator
Our first estimator is the logistic analog of the Gumpertz & Pantula [8] unweighted least-
squares (ULS) estimator - the unweighted average of the individual regression parame-
ters:

    β̂_ULS = (1/n) Σ_{i=1}^{n} b_i

It is important to note that this estimator does not depend on the variance components
in the model. This property makes it a simple way of looking at the aggregated regression
parameter without the need for estimating these variance components. This point was
introduced in Section 2.2.2.
Weighted Least-Squares Estimator
For the case where Σββ is not known, Swamy [16, 17] proposed the estimated generalized
least-squares or weighted least-squares (WLS) estimator for the continuous outcome case.
We will argue by analogy and define the weighted estimator by

    β̂_WLS = (Σ_{i=1}^{n} W_i)^{-1} Σ_{i=1}^{n} W_i b_i

where W_i^{-1} = Var{b_i} = Σββ + (X_i'V_i^{-1}X_i)^{-1}, which was given by Result 5.4 on Page 53.
Therefore, our two working estimators for β (β̂_ULS and β̂_WLS) are the analogues of
those given in Chapter 2, Equations 2.6 and 2.8. β̂_ULS uses the equal weights 1/n (i.e.
unweighted) and β̂_WLS uses weights given by the inverse of the variance of the regression
parameter estimates.
We now need to define our estimate of the between-individual variance, Σββ, in order
to proceed with the weighted least-squares estimator.
Let the statistic S_bb be given by:

    S_bb = (1/(n-1)) Σ_{i=1}^{n} (b_i - β̂_ULS)(b_i - β̂_ULS)'

It can be shown (refer to Appendix B.1) that:

    E{S_bb} = (1/(n-1)) [ Σ_{i=1}^{n} Var{b_i} - n Var{β̂_ULS} ]

and using the fact that β̂_ULS = (1/n) Σ_{i=1}^{n} b_i, we have:

    E{S_bb} = Σββ + (1/n) Σ_{i=1}^{n} (X_i'V_i^{-1}X_i)^{-1}

Therefore, by the method of moments, our estimate for the variance-covariance matrix of
the random coefficients is given by:

    Σ̂ββ = S_bb - (1/n) Σ_{i=1}^{n} (X_i'V_i^{-1}X_i)^{-1}

Since we are calculating the difference of two matrices, there is no guarantee that
the resulting matrix will be positive-definite. Thus, it is possible that we may not get a
positive-definite estimate for our variance-covariance matrix of the random coefficients.
To safeguard against this, we have adapted the Carter & Yang [3] correction used in
Part II.
Using Equation 2.14, with the continuous-case quantities defined by V_i = σ²I, the
correction takes its adjustment terms as the smallest roots of the corresponding determinantal
equations. If we use V_i as it is defined on Page 51, we then have a working version of the Carter &
Yang correction for the binary outcome case. It is this corrected estimate of the parameter
variance that will be used in all subsequent estimation techniques in Part III.
Since we now have an estimate of the variance component in the model (Σ̂ββ) as well
as the individual regression parameter estimates (b_i), we are now in a position to calculate
the weighted least-squares estimator given in Equation 5.13.
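Putting the second stage together, the ULS and WLS computations can be sketched as follows. This is our own Python sketch: in place of the Carter & Yang correction, negative eigenvalues of Σ̂ββ are simply truncated at zero to keep the weight matrices well-defined, which is an assumption of the sketch and not the thesis's correction.

```python
import numpy as np

def second_stage(b_list, C_list):
    # b_list: first-stage estimates b_i; C_list: their conditional
    # covariances C_i = (X_i' V_i^{-1} X_i)^{-1}.
    B = np.array(b_list)                         # (n, p)
    n, p = B.shape
    b_uls = B.mean(axis=0)                       # unweighted least-squares
    S_bb = np.cov(B, rowvar=False, ddof=1)       # (1/(n-1)) sum (b_i - b_bar)(b_i - b_bar)'
    Sigma = S_bb - sum(C_list) / n               # method-of-moments Sigma_bb
    w, V = np.linalg.eigh(Sigma)                 # truncate negative eigenvalues
    Sigma = V @ np.diag(np.clip(w, 0.0, None)) @ V.T
    W = [np.linalg.inv(Sigma + C) for C in C_list]   # W_i = Var{b_i}^{-1}
    W_sum = sum(W)
    b_wls = np.linalg.solve(W_sum, sum(Wi @ bi for Wi, bi in zip(W, B)))
    var_wls = np.linalg.inv(W_sum)               # estimated Var{beta_WLS}
    return b_uls, b_wls, var_wls
```

The estimated variance of β̂_ULS, S_bb/n, can be read off from the same quantities.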
Before moving on to the third estimation stage, we will first determine the variances of
the two estimators we have discussed. These variances are necessary for inference based
on these estimators. As shown by Equation B.2 in Appendix B.1, Var{β̂_ULS} = (1/n) E{S_bb};
therefore,

    V̂ar{β̂_ULS} = (1/n) S_bb

is an estimate for the variance of the unweighted least-squares estimator. For the Swamy
estimator, we have:

    Var{β̂_WLS} = (Σ_{i=1}^{n} W_i)^{-1}

Therefore,

    V̂ar{β̂_WLS} = (Σ_{i=1}^{n} Ŵ_i)^{-1}

is an estimate for the variance of the weighted least-squares estimator.
5.2.3 Third Stage Estimation (The Updating Procedure)
For this third stage of estimation in the binary outcome case, we will proceed in exactly
the same manner as we did in the continuous case using the result of Appendix C.3 -
that is, we will use our newest estimate of the aggregate logistic regression parameter to
improve our estimation of the between-individual variance. We represent this as:

    S*_bb = (1/(n-1)) Σ_{i=1}^{n} (b_i - β̂_WLS)(b_i - β̂_WLS)'

Although this statistic is clearly different from S_bb if β̂_WLS is different from β̂_ULS, in
Appendix C.3 we see that their expected values are approximately equal. As in the con-
tinuous case, we have identified a means of updating our estimate of the logistic regression
parameter variance which will improve our estimate of the aggregate logistic regression
parameter.
Iterated Weighted Least-Squares Estimator
As was defined for the continuous outcome case, we have:

    β̂^(m+1) = (Σ_{i=1}^{n} W_i^(m))^{-1} Σ_{i=1}^{n} W_i^(m) b_i

where

    S*_bb = (1/(n-1)) Σ_{i=1}^{n} (b_i - β̂*)(b_i - β̂*)'

We iterate through estimates for β until convergence is achieved, at which point we call
the value the iterated weighted least-squares (IWLS) estimate. For the purpose of
inference, its variance is given by:

    V̂ar{β̂_IWLS} = (Σ_{i=1}^{n} Ŵ_i)^{-1}

where Ŵ_i is the weight matrix calculated using β̂_IWLS. We have now defined our three
estimators and their variances. We can now form test statistics to test our prior hypotheses
about them. We will use the same test statistics as those given for the continuous outcome
situation. They can be found on Pages 19 - 21.
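The third-stage updating loop can be sketched as follows (our own Python sketch; the same eigenvalue truncation stands in for the Carter & Yang correction, and the convergence tolerance and iteration cap are our choices):

```python
import numpy as np

def iwls(b_list, C_list, max_iter=50, tol=1e-8):
    # Iterated weighted least-squares: re-estimate Sigma_bb about the
    # current aggregate estimate, re-weight, and repeat to convergence.
    B = np.array(b_list)
    n, p = B.shape
    beta = B.mean(axis=0)                         # start from the ULS estimate
    for _ in range(max_iter):
        D = B - beta
        S = (D[:, :, None] * D[:, None, :]).sum(axis=0) / (n - 1)  # S*_bb
        Sigma = S - sum(C_list) / n               # updated Sigma_bb estimate
        w, V = np.linalg.eigh(Sigma)              # keep it positive semi-definite
        Sigma = V @ np.diag(np.clip(w, 0.0, None)) @ V.T
        W = [np.linalg.inv(Sigma + C) for C in C_list]
        new = np.linalg.solve(sum(W), sum(Wi @ bi for Wi, bi in zip(W, B)))
        if np.max(np.abs(new - beta)) < tol:
            beta = new
            break
        beta = new
    var_iwls = np.linalg.inv(sum(W))              # estimated Var{beta_IWLS}
    return beta, var_iwls
```

When all the C_i are equal the weights are identical and the loop reduces to the ULS average after one pass; heterogeneous C_i are what make the iteration do real work.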
In the next chapter, we compare the statistical properties of bias and power of these
three estimators as well as two other well-known estimators: the Liang & Zeger [20]
generalized estimating equation robust estimator, computed using their SAS GEE macro
and referred to as the GEE estimator, and the ordinary logistic regression naive estimator,
which ignores all assumptions of random coefficient regression modeling and treats every
observation equally and independently, referred to as the LOG estimator.
Chapter 6
Binary Outcome Longitudinal Data
Simulation
6.1 The Simulation
Let us consider the model given by Equation 5.1, where we set p = 2; that is, we have
only one covariate and we are interested in estimating an intercept and slope. Therefore,
the model for the jth of T observations for the ith of N individuals is given by:

    z_ij = β_i0 + β_i1 x_ij + ε_ij

where

- β_i = (β_i0, β_i1)' ~ NID(β, γΣββ), where β = (β_0, β_1)' takes on the values of (0, 0)'
or (2, 4)' and the scale factor γ (Scale) takes on the values of 1 or 5.

- X_ij = (1, x_ij)' with x_ij ~ NID(0, σ_x²), and σ_x² (Sigmax2) takes on the values
of 0.1 or 1.

- ε_ij is the logit error with mean zero and variance which depends only on p_ij.

- The quantities N and T take on the values of 10 or 50.
Therefore, for this simulation we have a 2^5 factorial design or, equivalently, for both levels
of the true regression parameters, we have a 2^4 factorial design, and each level of the
factorial design consists of 100 trials. At each trial, the three random coefficient methods
(i.e. ULS, WLS, and IWLS) and both the robust (GEE) and naive (LOG) estimators
from the Liang & Zeger generalized estimating equation technique, using the SAS GEE
macro, are used to analyze the simulated dataset.
The estimated regression parameters, their estimated standard errors, and the number
of times the null hypothesis (H0: β = (0, 0)') was rejected (based on a 5% level of sig-
nificance) were all recorded for each method on each simulated dataset, for the slope only.
For the random coefficient methods, rejection numbers were collected based on all three
distributions of the test statistics for each of the three methods given by Equations 2.26,
2.27, and 2.28.
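A data generator for one cell of this factorial design can be sketched as follows. This is our own Python sketch: the base covariance of β_i is taken to be γ times the identity matrix, which is an assumption of the sketch, since the exact covariance used in the thesis is not stated here.

```python
import numpy as np

def simulate_dataset(beta=(2.0, 4.0), gamma=1.0, sigmax2=0.1, N=10, T=10, seed=0):
    # One simulated binary longitudinal dataset from Model 5.1 with p = 2:
    #   beta_i ~ N(beta, gamma * I)      (individual intercepts and slopes)
    #   x_ij   ~ N(0, sigmax2)           (single covariate)
    #   y_ij   ~ Bernoulli(expit(beta_i0 + beta_i1 * x_ij))
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, np.sqrt(sigmax2), size=(N, T))
    b = rng.multivariate_normal(np.asarray(beta, float),
                                gamma * np.eye(2), size=N)   # (N, 2)
    eta = b[:, [0]] + b[:, [1]] * X                          # individual logits
    p = 1.0 / (1.0 + np.exp(-eta))
    y = (rng.random((N, T)) < p).astype(int)
    return X, y, b
```

Looping this generator over the two levels of β, γ, σ_x², N, and T, with 100 seeds per cell, reproduces the shape (though not the exact draws) of the factorial design above.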
Analysis of variance techniques (and logistic regression for the proportions) were used
to analyze the results of the simulation. The rejection rate of the slope under
the null hypothesis is plotted in Figure 6.0. It is clear that an analysis that ignores all
heterogeneity in the data (LOG) is inadequate for properly analyzing data of this form.
Therefore, the rest of the figures presented below, as well as all statistical analyses, include
only the four estimators mentioned in the METHOD variable.
In our analyses of variance, a full interaction term model was fit for the outcome
variables of mean slope, relative variance of slope, and rejection rate of slope, where our
sample was based on approximately 6400 (= 2^4 x 4 x 100) observations. In these models,
only the main effects and the second order interactions with METHOD were of interest.
In the remaining cases, where the outcome of interest was dependent on factorial-level
summary measures and our sample was based on 64 (= 2^4 x 4) observations, we used a
Figure 6.0: Plot of Rejection Rate of Slope under the null hypothesis for all five estimators.
Table 6.1: Situations where the IWLS estimator failed to converge when β = (2, 4)'.

generalized F-test to determine that a fourth order interaction term was significant. This
model was fit so that all F-tests of the main effects and their interactions with METHOD
were based on a mean square error from this high order model. In all instances where
logistic regression was used to analyze the proportion of test rejections, 0.5 was added
to each cell to prevent empty cell counts from giving uninformative analysis of variance
parameters (refer to Gart & Zweifel [7]).
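The 0.5-per-cell adjustment is the familiar empirical-logit continuity correction; as a one-line illustration (our own Python sketch with a hypothetical helper name):

```python
import math

def empirical_logit(successes, total):
    # Add 0.5 to each cell of the 2x1 table so that all-success or
    # all-failure cells still yield a finite logit.
    return math.log((successes + 0.5) / (total - successes + 0.5))
```

For example, a cell with 0 rejections out of 10 trials would have an undefined raw logit, while the corrected version returns log(0.5/10.5), a finite value.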
The results of the simulation study are summarized in Figures 6.1 to 6.10. A descrip-
tion of these figures was given in Chapter 3. The only difference is that the factor F4
from Table 3.1, related to the size of the residual variance σ², does not exist in the binary
outcome situation.
Table 6.2: Estimates of the slope for the remaining estimators (ULS, WLS, GEE) when the IWLS estimator failed to converge for β = (0, 0)'.

It should be noted that the GEE method did not converge on two occasions. These
were both under the alternative hypothesis (β = (2, 4)') when the size of the variance of the
regression parameters was high (γ = 5), the size of the variance of the explanatory
variable X was low (σ_x² = 0.1), the number of individuals was high (N = 50), and the
number of repeated measurements per individual was also high (T = 50). The IWLS
estimator failed to converge in under 50 iterations in 46 situations. Four of these occurred
under the null hypothesis (β = (0, 0)') when the size of the variance of the regression
parameters was high (γ = 5), the number of individuals was low (N = 10), and the
number of repeated measurements per individual was also low (T = 10). Of these four
situations, two occurred when the size of the variance of the explanatory variable X was
low (σ_x² = 0.1) and two when it was high (σ_x² = 1.0). The other 42 situations, under the
alternative hypothesis (β = (2, 4)'), are summarized in Table 6.1. The results of the other
three estimators for the 48 simulated datasets on which GEE or IWLS did not converge were
kept in the statistical analyses and graphical summaries. As mentioned in Chapter 3,
there were advantages and disadvantages in doing this. Table 6.2 summarizes the non-
convergence situations for the IWLS estimator under the null hypothesis and Table 6.3
summarizes those under the alternative hypothesis. These latter two tables give the values
of the estimates of the slope for all three remaining estimators (ULS, WLS, GEE). The
average of the estimates of the slope under the null hypothesis is -0.31 for the ULS
estimator, -0.22 for the WLS estimator, and -0.65 for the GEE estimator. Under the
alternative hypothesis the averages are 4.02 for ULS, 1.70 for WLS, and 2.91 for GEE.
This implies that the ULS estimator is more reliable than the WLS or GEE estimators
when the IWLS estimator fails to converge. Thus, non-convergence of the IWLS estimator
may even be an advantage when its estimate would not have been close to the true value.

Table 6.3: Estimates of the slope for the remaining estimators (ULS, WLS, GEE) when the IWLS estimator failed to converge for β = (2, 4)'.
Estimates from the situations where these methods did not achieve convergence
were removed from all analyses, but the remaining estimators from the same trials were
included. The reasons for leaving the remaining estimators in the analysis were given in
the continuous outcome data simulation in Chapter 3.
Figure 6.1: Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the null hypothesis.

Figure 6.2: Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.
* The Variance Ratio (VR) is defined in the text on Page 25.

Figure 6.3: Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.

Figure 6.4: Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the null hypothesis.

Figure 6.5: Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the null hypothesis.

Figure 6.6: Plots of Slope Mean against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.

Figure 6.7: Plots of Variance Ratio* of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
* The Variance Ratio (VR) is defined in the text on Page 25.

Figure 6.8: Plots of Empirical Variance (VT) of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.

Figure 6.9: Plots of Mean Square Error of Slope against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.

Figure 6.10: Plots of Slope Rejection Rates against Method for different levels of the Parameters of the Simulation under the alternative hypothesis.
6.2 The Results
6.2.1 Estimation
From Figures 6.1 and 6.6 we can determine the accuracy of the estimation of the slope. Under the
null hypothesis, in Figure 6.1, we see that the ULS and GEE estimators give slope esti-
mates closer to the true value than the WLS or IWLS estimators, although this difference
is not statistically significant. The estimate of the slope is closer to its true value when
the size of the variance of the explanatory variable X is high and the number of repeated
measurements per individual is low. Under the alternative hypothesis, in Figure 6.6, we
see that the ULS estimate of the slope is closest to the true value, followed by the GEE
estimate and then the WLS and IWLS estimates. The slope estimate is closer to its true
value when the size of the variance of the regression parameters is low, the size of the
variance of the explanatory variable X is low, the number of individuals is low, and the
number of repeated measurements per individual is high. The GEE estimate of the slope
was greatly improved when the size of the variance of the regression parameters was low.
However, the GEE estimate was unchanged with a change in the number of repeated measurements
per individual, and the ULS estimate was unchanged with a change in the number of in-
dividuals. It appears that the discrepancy between the effects of the number of repeated
measurements per individual under the null and alternative hypotheses can be explained by
the fact that the estimate of the slope may be dampened with a lower number of repeated
measurements per individual.
From Figures 6.2 and 6.7, we see the variance ratios of the four estimators. Under
the null hypothesis, in Figure 6.2, we see that the variance ratios of the WLS and IWLS
estimators are much larger than the variance ratios of the ULS and GEE estimators. These
variance ratios are lower when the size of the variance of the regression parameters is low
and the number of repeated measurements per individual is high. In all cases, the variance
ratios of the ULS and GEE estimators are around zero, and the changes in variance ratios
with changes in the factor levels occur for the WLS and IWLS estimators only. Under the
alternative hypothesis, in Figure 6.7, the variance ratios of the WLS and IWLS estimators
are much larger than the variance ratios of the ULS and GEE estimators, as under the null
hypothesis. These variance ratios are lower when the size of the variance of the regression
parameters is low, the size of the variance of the explanatory variable X is low, the
number of individuals is high, and the number of repeated measurements per individual
is high. As under the null hypothesis, the ULS and GEE variance ratios are around zero
and only the WLS and IWLS variance ratios change with changing factor levels. The
largest change in variance ratio occurs with the number of repeated measurements per
individual for the WLS estimator. When this number is low (T = 10) the variance ratio
of the WLS estimator is around 3.0, but when this number is high (T = 50) the variance
ratio is around 0.5.
In Figures 6.3 and 6.8 we see the empirical variances of the four estimators. Under the
null hypothesis, in Figure 6.3, we see that the empirical variance of the ULS estimator
is much larger than the empirical variances of the other three estimators. The empirical
variances are lower when the size of the variance of the regression parameters is low, the
size of the variance of the explanatory variable X is high, and the number of individuals
is high. Under the alternative hypothesis, in Figure 6.8, we see that, as under the null
hypothesis, the empirical variance of the ULS estimator is much larger than the empirical
variances of the other three estimators, and that the empirical variances are lower when the
size of the variance of the regression parameters is low (except for the GEE estimator), the
size of the variance of the explanatory variable X is high, and the number of individuals
is high. When the number of individuals is high and the size of the variance of the
explanatory variable X is high, the largest decrease in empirical variance is seen for the ULS
estimator (from around 2.0 to around 0.7). When the number of repeated measurements
per individual is high, the empirical variances of the ULS and GEE estimators are lower,
but the empirical variances of the WLS and IWLS estimators are higher.
In Figures 6.4 and 6.9 we see the mean square errors of the estimators. Under the
null hypothesis, in Figure 6.4, we notice that the mean square errors of the estimators
behave in exactly the same manner as the empirical variances of the estimators, which was
described above. This is because the contribution of the bias to the mean square error
is negligible compared with the empirical variance. Under the alternative hypothesis, in
Figure 6.9, we see that the mean square errors of the ULS and GEE estimators are much
lower than the mean square errors for the WLS and IWLS estimators, despite the much
larger empirical variance of the ULS estimator. The mean square error is lower when
the size of the variance of the regression parameters is low, the size of the variance of the
explanatory variable X is low, and the number of repeated measurements per individual is
high. In the case where the number of individuals is high, the ULS and GEE mean square
errors are lower but the WLS and IWLS mean square errors are higher. When the size
of the variance of the regression parameters is low, the largest decrease in mean square
error is in the GEE estimator. However, when the size of the variance of the explanatory
variable X is low and the number of repeated measurements per individual is high, the
largest decrease in mean square error is in the WLS and IWLS estimators. For the case
where the number of repeated measurements per individual is low, the mean square errors
of the WLS and IWLS estimators are both around 11, but when the number of repeated
measurements per individual is high these mean square errors drop to around 2.
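The decomposition used above, mean square error as empirical variance plus squared bias, can be sketched as follows; the function and argument names are illustrative, not from the thesis:

```python
def mse_components(estimates, true_value):
    """Decompose replicate estimates of a parameter into bias,
    empirical variance, and mean square error.  With divisor n for
    both variance and MSE, mse == emp_var + bias**2 holds exactly."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_value
    emp_var = sum((e - mean_est) ** 2 for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, emp_var, mse
```

When the bias term is negligible, as under the null hypothesis here, the MSE and the empirical variance necessarily behave in the same way.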
6.2.2 Type I Error Rates and Statistical Power
In Figure 6.5 we see the proportion of times that the null hypothesis is rejected when it
is known to be true. This rejection rate is much closer to the true value, 0.05, for the
ULS and GEE estimators than it is for the WLS and IWLS estimators. The rejection
rates become closer to 0.05 when the size of the variance of the regression parameters is
low. When the number of individuals is low, the rejection rates for the WLS and IWLS
estimators approach 0.05, but the rejection rate of the GEE estimator moves further
away from the true value; and when the number of repeated measurements per individual
is high, the rejection rates for the WLS, IWLS, and GEE estimators approach 0.05, but
the rejection rate of the ULS estimator moves further away from the true value. That
is, the GEE estimator with a low number of individuals and the ULS estimator with a
low number of repeated measurements per individual have increases in their test biases.
The power, the proportion of times that the null hypothesis is rejected when it is
known to be false, is shown in Figure 6.10. We can clearly see that the power of the GEE
estimator is the highest (around 0.95), followed by the ULS estimator (around 0.80), and
the power of the WLS and IWLS estimators is the lowest (around 0.60). The power of
all of the estimators is higher when the size of the variance of the regression parameters
is low, the number of individuals is high, and the number of repeated measurements per
individual is high. When the size of the variance of the explanatory variable X is high,
the power of the GEE estimator increases slightly but decreases slightly for the WLS and
IWLS estimators. All changes in factor levels affected the power of the WLS and IWLS
estimators much more than the ULS or GEE estimators. The biggest increase in power
occurred when the number of repeated measurements per individual increased from 10 to
50. In this case, the power of the WLS estimator increased from around 0.20 to around
0.95.
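Rejection rates like these are Monte Carlo estimates: simulate many datasets under a hypothesis, apply the test to each, and record the proportion of rejections. A minimal sketch with a two-sided z-test on a sample mean, a stand-in for the thesis's actual tests and simulation code:

```python
import math
import random

def rejection_rate(mu, n=30, sims=2000, crit=1.959964, seed=1):
    """Monte Carlo rejection rate of a two-sided z-test of H0: mu = 0,
    applied to the mean of n N(mu, 1) observations per simulated
    dataset.  With mu = 0 this estimates the Type I error rate; with
    mu != 0 it estimates power."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        xbar = sum(rng.gauss(mu, 1.0) for _ in range(n)) / n
        if abs(xbar) * math.sqrt(n) > crit:
            rejections += 1
    return rejections / sims
```

With 2,000 simulations the standard error of an estimated 0.05 rejection rate is about 0.005, which is why the thesis's 100 simulations per design cell give only coarse estimates.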
In Appendix E are reported the analysis of variance tables for all main effects and first
order interaction terms with the variable METHOD, as well as the summarized data for
each of the 16 levels of the factorial design.
Chapter 7
An Application: Word Recall
Success in Head Injury Patients
As an application of the techniques of this thesis in the binary outcome case, we use the
Post-Traumatic Amnesia Assessment study carried out by the Rotman Research Institute
of the Baycrest Centre for Geriatric Care at the University of Toronto. The main objective
of this study was to improve the assessment of post-traumatic amnesia in traumatic brain
injury patients. The subjects under study consisted of 140 individuals who required
hospitalization for a head injury subsequent to a serious accident. The patients, between
the ages of 16 and 65, with no previous significant neurological disease, were included in
the study from hospitalizations at the Sunnybrook and St. Michael's Hospitals in Toronto.
Since it was hypothesized that the ability to recall words after 24 hours was related
to recovery of orientation, the dependent variable used in this analysis was the occurrence
of perfect recall of at least two of three words given 24 hours prior to the test, where
testing was ended after perfect recall of all three words occurred on three successive days.
As the explanatory variable, we used the Galveston Orientation and Amnesia Test
(GOAT), a measure of recovery of orientation, which was calculated on a scale of up
to 100, where a score of 75 or better is considered to be in the 'normal range'.
Table 7.1: Parameter Estimates for the Head-Injury Data

    Estimate    β0       Std.Err.   P value    β1      Std.Err.   P value
    ULS        -15.2      5.11      0.003*     17.0     5.53      0.002*
    WLS         -3.90     2.00      0.050*      4.60    2.54      0.050*
    IWLS†       -4.66     2.74      0.106*      5.40    3.26      0.089*
    GEE‡        -2.62     0.42      0.0000      3.29    0.52      0.0000
    LOG         -2.62     0.46      0.0000      3.29    0.58      0.0000

† IWLS took 7 iterations to converge. ‡ GEE took 5 iterations to converge.
* P values for ULS, WLS, and IWLS were given only for the χ² distribution
(refer to Pages 19 - 21).
In the analysis of the data, we removed the last three measurements since, by design,
the last three days of measurements must all be 3 out of 3 perfect recalls. We have
also transformed the GOAT score by subtracting 50 and dividing by 50 to bring it into
a more 'reasonable' range for analysis purposes. From the remaining data we included
only the individuals that had more than three days of word-recall tests, and we used the
corresponding transformed GOAT score from the day that the words were given. This
gave us a final sample size of 39 individuals, where the number of repeated measures per
individual ranged from a minimum of 4 to a maximum of 23 with an average value of 8.4.
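The data preparation just described can be sketched as follows; the function names and the record structure are illustrative assumptions, not the study's actual code:

```python
def transform_goat(score):
    """Centre and scale a GOAT score (0-100) as in the analysis above:
    subtract 50 and divide by 50, mapping it into roughly [-1, 1]."""
    return (score - 50.0) / 50.0

def eligible(records):
    """Keep only individuals with more than three days of word-recall
    tests, mirroring the inclusion rule described above.  `records`
    maps a (hypothetical) subject id to that subject's list of daily
    test results."""
    return {sid: days for sid, days in records.items() if len(days) > 3}
```

A score of 75, the boundary of the 'normal range', maps to 0.5 on the transformed scale.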
In this analysis we attempt to describe the association between the GOAT score and
the ability to recall at least two of three words given 24 hours before. We use a random
coefficients model in this scenario due to the variable recovery patterns of individual
patients. We compare the random coefficient techniques of this thesis in the binary
outcome case with the Generalized Estimating Equations (GEE) approach of Liang &
Zeger and ordinary logistic regression (LOG).
In our analysis, 8 of the individuals had the same outcome measure for all repeated
measurements and 9 individuals had complete separations in their data; thus 17 individuals
Table 7.2: Parameter Variance Estimates for the Head-Injury Data

    METHOD    Estimate of Σ_ββ  (elements σ²_β0, σ_β0β1; σ_β0β1, σ²_β1)
    ULS       [  249.2   -296.4 ]
              [ -296.4    360.7 ]
    IWLS      [  117.9   -152.8 ]
              [ -152.8    203.5 ]
    WLS
out of 39 did not have parameters estimated from the Newton-Raphson algorithm.
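Individual logistic fits fail in exactly these two situations. A small screening sketch for a single covariate, with a hypothetical helper name and return strings not taken from the thesis, that flags them before attempting a Newton-Raphson fit:

```python
def fit_excluded(x, y):
    """Return a reason why a per-individual logistic fit would fail,
    or None if Newton-Raphson has a chance of converging.  Flags the
    two problems described above: a constant outcome, and complete
    separation on a single covariate.  (Quasi-complete separation,
    with ties at the boundary, is not flagged by this sketch.)"""
    if len(set(y)) == 1:
        return "constant outcome"
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    # Complete separation: every success lies strictly above (or below)
    # every failure on the covariate axis, so the MLE diverges.
    if min(x1) > max(x0) or max(x1) < min(x0):
        return "complete separation"
    return None
```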
The results of the analysis are presented in Table 7.1, where β0 is the logistic intercept
and β1 is the logistic slope of the transformed GOAT variable. In the table, we can see
that both the WLS and IWLS parameter estimates are dampened by a factor greater
than three compared with the ULS estimate. Also, their standard errors are smaller by
about half. This combination results in a higher p value for these estimators than for
the ULS estimator. The GEE estimator, a population averaged estimate, is closer to zero
than any of the other three estimators, but its standard error is much smaller which, in
turn, gives a much smaller p value than even the ULS estimator.
In Figure 7.1 we see two graphs that resemble those from Figure 4.1 in Chapter 4.
Here the top graph has the individual's estimated intercept against the individual's first
diagonal element of the inverse of the variance-covariance matrix for the parameter esti-
mate (its weight), and the bottom graph has the estimate of the slope against the second
diagonal element of the same matrix (its weight). In the top graph we see one individ-
ual's intercept that is extremely small (-170) with a corresponding weight that is relatively
small. Therefore, we would expect the ULS estimate of the intercept to be smaller than
the WLS or IWLS estimates. Also, from the bottom graph, we see one individual's slope
that is extremely large (175) with a corresponding weight that is moderate in size relative
to the rest of the individuals. Therefore, the ULS slope estimate is greater than the slope
estimate from WLS or IWLS. A marginal analysis in which each individual was modeled
Figure 7.1: Plots of Intercept and Slope against their Weights for Head-Injury Data.
with their own logit intercept but a common slope yielded a transformed GOAT slope
value of -3.6, which is very different from the complete marginal analysis and the random
coefficients analyses, which are both positive.
Since the logistic regression (between individual) slope of each individual's average
transformed GOAT score is 0.42 (p = 0.0001) and the individual slopes are positive and
average to 17, we would expect to see marginal slopes in the same positive direction
and between the two values, which is what the GEE and LOG estimators show (3.3).
By calculating ratios of the variance components in the model as well as the variances
of the explanatory variable, we determined that this example was closest to our simulation
situation where the size of the variance of the regression parameters is high, the size of the
variance of the explanatory variable is low, and the number of individuals is high. The
simulation results show that the ULS estimator was the best estimator for that particular
situation. Thus, using the results of the ULS estimator in this dataset, we would conclude
that the Galveston Orientation and Amnesia Test (p = 0.002) is highly associated with
the probability of a high word recall success in patients that suffered from traumatic
head injuries.
Part IV
Conclusions
Chapter 8
Overall Conclusions of the Algorithm
8.1 Continuous Outcome
8.1.1 The Procedures
For the case where the outcome of interest is a continuous variable, we compared several
procedures of analysis. For a longitudinal dataset where heterogeneity is present, we
showed that the random coefficients methods (the unweighted, weighted, and iterated
weighted least-squares estimators) as well as the random effects Henderson estimator (using
the SAS procedure MIXED) are designed to properly model such situations.
In this thesis, we saw that in the presence of heterogeneity, a marginal analysis fails
to do a proper job in estimating parameters and standard errors, but only in the case where
significant correlation exists between an individual's average explanatory and outcome
measurements. Therefore, in this situation a random components model would not result
in a biased estimate of the slope as would occur in a marginal analysis. However, in
the simulation performed in this thesis the X covariate was generated independently
of the level of the outcome variable, and therefore the potential bias discussed above
would not be a problem. Furthermore, it must be emphasized that in occupational and
environmental studies it is often the case that subjects can take measures to reduce their
exposure to toxins and pollutants. The probability of taking such action may be directly
related to the health or sensitivity of the subjects. Therefore, the very kind of bias that
we mentioned might operate. That is, such preventive behaviour by the subjects could
induce a relationship between their average exposure and average response.
Of the random component models discussed, only the Henderson estimator failed to
converge in the simulation study. This occurred when both the number of individuals
and the number of repeated measurements per individual were low. In these situations
the ULS, WLS, and IWLS estimators all gave estimates. The reliability of these esti-
mators was established by noting the accuracy of their estimates of slope in the cases
where the MIX estimator failed to converge. Also, this Henderson estimator required
several more iterations on average than the IWLS estimator to reach convergence (refer
to Table D.13). One major drawback of the Henderson estimator may be that although
it models individuals separately, like the random coefficients methods, it requires the in-
version of a t x t square matrix, where t is the number of repeated measurements per
individual. The random coefficients methods, on the other hand, invert a p x p matrix,
where p is the number of regression parameters, which is necessarily smaller than t (refer to
Assumption A5 on Page 12). This ensures that the random coefficients algorithm may be
less computer-intensive than the algorithm of the Henderson estimator. Also, in the case
where the variance-covariance matrix of the regression parameters is not positive-definite,
the Carter & Yang correction procedure allows estimation to continue. The SAS proce-
dure MIXED reports that the matrix is not positive-definite but still gives the parameter
estimates based on that matrix. Thus, for the continuous outcome case, it seems that
the three random coefficients estimators are slightly better than the Henderson estimator
and much better than a marginal analysis in the presence of heterogeneity. We turn to
the simulation results to better describe their properties in general and relative to each
other.
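To make the two-stage logic of the random coefficients estimators concrete, here is a minimal sketch. It simplifies the thesis's methods to a single slope: ULS is assumed to average the individual least-squares slopes with equal weight, while WLS weights each slope by the inverse of its estimated variance:

```python
def ols_slope(x, y):
    """Within-individual least-squares slope and its estimated
    variance, from one individual's repeated measurements."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    var_slope = (resid_ss / (n - 2)) / sxx   # s^2 / Sxx
    return slope, var_slope

def uls_wls(individuals):
    """Stage two: pool the individual fits.  ULS is the unweighted mean
    of the slopes; WLS weights each slope by 1 / its estimated variance.
    `individuals` is a list of (x, y) measurement pairs per subject."""
    fits = [ols_slope(x, y) for x, y in individuals]
    uls = sum(b for b, _ in fits) / len(fits)
    weights = [1.0 / v for _, v in fits]
    wls = sum(w * b for (b, _), w in zip(fits, weights)) / sum(weights)
    return uls, wls
```

The WLS pooled slope is pulled toward the more precisely estimated individual slopes, which is the mechanism behind the dampening seen in the binary-outcome application.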
8.1.2 Simulation Findings
For the continuous outcome simulation, we found that under the alternative hypothesis,
the ULS estimator had a larger empirical variance and mean square error than the other
three estimators compared (WLS, IWLS, and MIX), especially when the size of the vari-
ance of the within individual error was high, the size of the variance of the explanatory
variable X was low, and the number of repeated measurements per individual was low.
The Type I error rates of the simulation showed that, in general, the ULS and WLS
estimators had larger test bias than the IWLS and MIX estimators. However, when the
size of the variance of the regression parameters was low, the size of the variance of the
explanatory variable X was high, the number of individuals was low, and the number
of repeated measurements per individual was low, the test bias of the IWLS and MIX
estimators was much lower than that of the ULS and WLS estimators. For the case where the
number of individuals is high, the test biases of the four estimators are all extremely close to
the true value. The power results of the simulation revealed that there was a slightly larger power
for the ULS and WLS estimators than the IWLS and MIX estimators, except when the
size of the variance of the regression parameters was low and the number of individuals
was high; however, these results were not statistically significant. The highest values of
power occurred in the cases where there was low heterogeneity (variance of the regression
parameters) and large sample size. In these cases, the power was very close to 1.0.
In general, it seems that for the continuous outcome situation, the IWLS and MIX
estimators had lower empirical variances and mean square errors than the ULS estimator
(under the alternative hypothesis), less test bias than the ULS and WLS estimators,
and near equivalent power to the ULS and WLS estimators. For the case where the
heterogeneity is low and the sample size is high, all estimators had similar and optimal
results.
8.2 Binary Outcome
8.2.1 The Procedures
In the longitudinal data setting, when the outcome of interest can take on the values
of 1 or 0, the presence of heterogeneity of individuals poses an analytical problem. The
procedures encountered in this thesis are the three random coefficient methods introduced
in the continuous outcome case, namely, the unweighted, weighted, and iterated weighted
least-squares estimators, and the generalized estimating equation approach of Liang &
Zeger, a marginal analysis, with both the robust and naive standard errors. Only the
robust standard error GEE estimator models the heterogeneity in the data, but only in
the estimate of the standard error, not in the estimate of the regression parameters.
As in the continuous outcome case, a marginal analysis fails to give proper regression
estimates, especially when the correlation between average outcome and average explana-
tory measurement is different from the average of the individual regression parameter
estimates. In cases like these, even the robust standard errors cannot give reliable hy-
pothesis tests of the regression parameters. We turn again to the simulation results to
better describe their properties in general and relative to each other.
8.2.2 Simulation Findings
In the binary outcome simulation we found that the ULS and GEE estimators were closer
to their true values than the WLS and IWLS estimators (only significant for the alternative
hypothesis). The variance ratios of the WLS and IWLS estimators were much larger than
the variance ratios of the ULS and GEE estimators, implying that the variance formulae
for those estimators are more accurate than the variance formulae for the WLS and IWLS
estimators. Although different levels of the factors decreased the variance ratios of the
WLS and IWLS estimators, they had almost no effect on the ULS or GEE variance ratios.
The empirical variance of the ULS estimator was much larger than the empirical variance
of the other three estimators; however, the empirical variances of all four estimators were
comparable when the size of the variance of the explanatory variable X was high, the
number of individuals was high, and the number of repeated measurements per individual
was high.
The mean square error of the ULS estimator was much larger than the mean square
errors of the other three estimators under the null hypothesis, due to the relatively equal
bias in the parameter estimation between the four estimators. Under the alternative hy-
pothesis, the mean square error of the WLS and IWLS estimators was larger than the
mean square error of the ULS and GEE estimators, due to the smaller bias in parameter
estimation of the ULS and GEE estimators. We also noticed that under the null hypoth-
esis, as the number of repeated measurements per individual increased from 10 to 50, the
mean square error of the ULS and GEE estimators decreased while that of the WLS and IWLS
estimators increased. Under the alternative hypothesis, as the number of individuals in-
creased from 10 to 50, the mean square error of the ULS and GEE estimators decreased
while that of the WLS and IWLS estimators increased.
The Type I error rates of the simulation showed that the test bias was smaller for the
ULS and GEE estimators than it was for the WLS and IWLS estimators. The test bias
in all four estimators was comparable when the number of repeated measurements per
individual was high. The power of the GEE estimator seemed to be the largest, followed
by the ULS estimator and then the WLS and IWLS estimators. The power of the
GEE estimator remained almost the same for the different levels of the factors, and the
power of all four estimators was comparable when the number of repeated measurements
per individual was high.
It seems that the ULS and GEE estimators are superior to the WLS and IWLS es-
timators in most of the situations. One major problem with the IWLS estimator was
its inability to reach convergence in a large number of cases. A problem with the GEE
estimator is that it is a population average estimator that would provide inaccurate re-
sults if the correlation between average outcome and average explanatory measurement
is different from the average of the individual regression parameter estimates, as already
mentioned. Thus, for the case of binary outcome longitudinal data, the ULS estimator
seems to have performed the best. The main factor that improved estimation and in-
ference in all estimators, in some cases to the point of equivalency between them, was a
higher number of repeated measurements per individual.
Chapter 9
Discussion
9.1 Limitations
We now present some of the limitations of the methods introduced in this thesis. First, in
our random coefficient model, we took each individual's number of repeated measure-
ments to be constant across individuals. We first introduced the discrepancy between
constant and varying numbers of repeated measurements per individual in Section 1.1,
where, in the case when the number of repeated measurements per individual was dif-
ferent, Satterthwaite's approximation was needed to properly test the significance of the
means. In our case, since we modeled each individual separately, estimation of the regres-
sion parameters poses no problem, but the test statistics introduced in Section 2.3 require
slight modifications. Second, we used the assumption of conditional independence in the
random coefficients model (refer to Assumption A2 on Page 12). Although this assump-
tion is limited, many of the results of this thesis depend on its presence. Third, we made
the assumption that the regression parameters had a multivariate Gaussian distribution.
Although this is a reasonable assumption, there are some cases where it is
too restrictive. These last two limitations will be discussed further in the next section.
With respect to the simulation study, the methods used need to be tested on more
Table 9.1: Simulated Range of Logistic Probabilities for given Parameters.

    γ      90% Range        90% Range
    0.1    (0.17, 0.84)     (0.63, 0.98)
extensive ranges for the parameters in the factorial design. That is, testing the methods
when the ranges of variances for the within individual error (in the case of the continuous
outcome), the regression parameters, and the explanatory variable include smaller as well
as larger values would be beneficial in determining more exact patterns of the methods
under these circumstances. Testing these methods with more than 100 simulations per
level of the factorial design would improve the reliability. In most cases, we chose the
parameter estimates based on trial and error, and the number of simulations per level was
determined by computer memory availability. In addition, we tested the range of logistic
probability means based on the parameters in the model post hoc. The 90% range of
probabilities (p_ij) based on 5000 independent simulations is given by Table 9.1. We see
here that when the size of the variance of the regression parameters is high under the
alternative hypothesis, the upper 95th percentile is extremely close to 1.0. This implies
that there is a higher chance of generating individuals with constant 'success' outcome in
these situations.
In the analysis of the simulation results, a blocking design was imposed by the sim-
ulation structure. That is, each simulated dataset was analyzed by all of the methods.
However, in the analysis, this repeated measurement structure was ignored. It was be-
lieved that, in the case of 100 simulations per level of the factorial design, our analysis
of independent outcomes would not be very different from a repeated measures analysis.
Also, we could have used an empirical cut-off point for the rejection rates of each test
mentioned, which would have been based on the percentiles of the simulated distribution
under the null hypothesis. Instead, we relied on the distributions of the test statistics
themselves under different conditions. One problem with this was that since the T or F
test has a broader tail than the χ² distribution, bias was introduced when comparisons
were made between the random coefficients methods and the MIX estimator, which used
a T distribution, or the GEE estimator, which used a χ² distribution.
9.2 Extensions
Some possible extensions of the methods described in this thesis were introduced in the
previous section. First, the assumption of conditional independence may not be prac-
tical. In many cases where subjects are measured over time, these measurements can
be correlated by the proximity of the observations in time. That is, two measurements
taken on consecutive days may be more highly correlated than two measurements taken
several days apart. There are many within individual error correlation structures that
are practical to model, such as first-order autocorrelation or compound symmetry. Any
flexible random components estimation model should be able to model these structures.
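The two structures just named yield t × t within-individual covariance matrices that are simple to construct directly; a sketch with hypothetical parameter names (`sigma2` for the variance, `rho` for the correlation parameter):

```python
def ar1_cov(t, sigma2, rho):
    """First-order autocorrelation (AR(1)) covariance matrix:
    Cov(e_j, e_k) = sigma2 * rho**|j - k|, so correlation decays
    with the time lag between two measurements."""
    return [[sigma2 * rho ** abs(j - k) for k in range(t)] for j in range(t)]

def compound_symmetry_cov(t, sigma2, rho):
    """Compound symmetry: every pair of distinct occasions shares
    the same correlation rho, regardless of the lag between them."""
    return [[sigma2 * (1.0 if j == k else rho) for k in range(t)]
            for j in range(t)]
```

Under AR(1), measurements on consecutive days are more highly correlated than measurements several days apart, which is exactly the behaviour described above.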
Second, the assumption of Gaussian random parameters may be somewhat restrictive.
Although in most situations this assumption seems to be a reasonable one, there may
exist cases where a more flexible model is beneficial. In these cases, new test statistics
must be determined and their distributions must be calculated. Also, in the case where
the explanatory variables are not normally distributed, the logistic regression individual
estimation method of Cornfield cannot be used. Similar techniques can be adapted based
on the assumed distributions of these explanatory variables.
It may be desirable to model both fixed and random components in the same model
using random coefficients methodology. This is often referred to as a 'mixed' model. In
the case of a random effects model, this can be easily done by modeling the explanatory
variables of the fixed regression parameters separately from the explanatory variables of
the random regression parameters. That is, referring to Equation 1.10, the fixed parameters
β1 to βf can be modeled with covariates x1ij to xfij, and the random parameters B1i to
Bri can be modeled with covariates x(f+1)ij to x(f+r)ij, where f is the number of
fixed regression parameters and r is the number of random regression parameters. In the
case of random coefficients, the change in modeling is done with the variance-covariance
matrix Σ_ββ. In this case we set to zero the rows and columns of the elements of the variance-
covariance matrix that correspond with the elements of the regression parameter
vector β_i that are assumed to be fixed. For the purpose of calculation, the matrices are
manipulated in a partitioned form, where all of the random parameters are the bottom r
elements of β_i and the variance-covariance components are the bottom right r x r matrix
within the p x p matrix (p > r) Σ_ββ, where the rest of the matrix is zero.
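The zeroing step described above can be sketched directly; `fix_parameters` and `fixed_idx` are illustrative names, not from the thesis:

```python
def fix_parameters(sigma_bb, fixed_idx):
    """Zero the rows and columns of the p x p parameter variance-
    covariance matrix that correspond to regression parameters treated
    as fixed, leaving the block for the random parameters intact."""
    p = len(sigma_bb)
    out = [row[:] for row in sigma_bb]   # copy, leave the input unchanged
    for i in fixed_idx:
        for j in range(p):
            out[i][j] = 0.0
            out[j][i] = 0.0
    return out
```

With the fixed parameters ordered first, the surviving nonzero entries form exactly the bottom-right r × r block described in the text.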
Further comparisons need to be made with the estimators presented in this thesis,
especially in the binary outcome case. Some other algorithms with which comparisons should
be made are those presented by Schall [12], Waclawiw & Liang [18, 19], and Stiratelli
et al. [15], to name a few.
Part V
Appendix
Appendix A
Henderson and Swamy Model
Equivalency
We want to prove the equivalency of the Swamy and Henderson model formulations.
The Swamy estimator from Equation 2.7 is given by:

    β̂_SWAMY = (Σ_{i=1}^n W_i)^{-1} Σ_{i=1}^n W_i b_i    (A.1)

where W_i^{-1} = Σ_ββ + σ²(X_i'X_i)^{-1} and b_i = (X_i'X_i)^{-1}X_i'y_i.
The Henderson estimator from Equation 2.9 is given by:

    β̂_HEND = (Σ_{i=1}^n X_i'W_{y_i}X_i)^{-1} Σ_{i=1}^n X_i'W_{y_i}y_i    (A.2)

where W_{y_i}^{-1} = X_i Σ_ββ X_i' + σ²I. We want to prove that

    β̂_SWAMY = β̂_HEND
Let us first rewrite Equations A.1 and A.2 as:

    β̂_SWAMY = (Σ_{i=1}^n A_i)^{-1} Σ_{i=1}^n B_i    (A.3)

and

    β̂_HEND = (Σ_{i=1}^n C_i)^{-1} Σ_{i=1}^n D_i    (A.4)

where,

    A_i = W_i,   B_i = W_i b_i,   C_i = X_i'W_{y_i}X_i,   D_i = X_i'W_{y_i}y_i.

We will show that, elementwise, A_i = C_i and B_i = D_i, thus proving the result.
Let us first prove the following lemma:

Lemma A.1 If W = R + ZDZ', then,

    W^{-1} = R^{-1} - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} Z'R^{-1}

Proof of Lemma A.1:
If AB = I and BA = I, where A and B are non-singular square matrices of the same
dimension, then B = A^{-1} and A = B^{-1}.
Therefore, if we let

    V = R^{-1} - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} Z'R^{-1}

we need to show that WV = I and VW = I.

    VW = (R^{-1} - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} Z'R^{-1}) (R + ZDZ')
       = I + R^{-1}ZDZ' - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} Z'
           - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} Z'R^{-1}ZDZ'
       = I + R^{-1}ZDZ' - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} [I + Z'R^{-1}ZD] Z'
       = I + R^{-1}ZDZ' - R^{-1}Z (Z'R^{-1}Z + D^{-1})^{-1} [D^{-1} + Z'R^{-1}Z] DZ'
       = I + R^{-1}ZDZ' - R^{-1}ZDZ'
       = I

Therefore, we have proven the lemma.
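Lemma A.1 is the matrix inversion (Woodbury) identity. A quick numerical sanity check in the scalar (1 × 1) case, where every matrix reduces to a number and the chosen values are arbitrary:

```python
def woodbury_inverse(r, z, d):
    """Right-hand side of Lemma A.1 with scalar R, Z, D:
    W^{-1} = R^{-1} - R^{-1} Z (Z' R^{-1} Z + D^{-1})^{-1} Z' R^{-1}."""
    r_inv = 1.0 / r
    middle = 1.0 / (z * r_inv * z + 1.0 / d)
    return r_inv - r_inv * z * middle * z * r_inv

# W = R + Z D Z' with R = 2, Z = 3, D = 0.5, so W = 6.5
direct = 1.0 / (2.0 + 3.0 * 0.5 * 3.0)
```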
Now, let us prove that A_i = C_i, where A_i and C_i are given above. Dividing both
sides by σ^{-2}, this is equivalent to proving:

    (X_i'{X_iSX_i' + I}^{-1}X_i)^{-1} = S + (X_i'X_i)^{-1}    (A.5)

where S = σ^{-2}Σ_ββ.

Proof of Equation A.5:
Applying Lemma A.1 with R = I, Z = X_i, and D = S,

    X_i'{X_iSX_i' + I}^{-1}X_i = X_i'[I - X_i(X_i'X_i + S^{-1})^{-1}X_i']X_i
                               = X_i'X_i - X_i'X_i(X_i'X_i + S^{-1})^{-1}X_i'X_i

and applying Lemma A.1 again, this time with R = (X_i'X_i)^{-1}, Z = I, and D = S,

    (S + (X_i'X_i)^{-1})^{-1} = X_i'X_i - X_i'X_i(X_i'X_i + S^{-1})^{-1}X_i'X_i

so both sides of Equation A.5 have the same inverse, proving the result.

Now, let us prove that B_i = D_i, where B_i and D_i are given above. So, we need to
prove:

    X_i'{X_iSX_i' + I}^{-1}y_i = (S + (X_i'X_i)^{-1})^{-1}(X_i'X_i)^{-1}X_i'y_i    (A.6)

where S = σ^{-2}Σ_ββ.

Proof of Equation A.6:

    X_i'{X_iSX_i' + I}^{-1}y_i
        = X_i'[I - X_i(X_i'X_i + S^{-1})^{-1}X_i']y_i    [Lemma A.1]
        = [I - X_i'X_i(X_i'X_i + S^{-1})^{-1}]X_i'y_i
        = [X_i'X_i - X_i'X_i(X_i'X_i + S^{-1})^{-1}X_i'X_i](X_i'X_i)^{-1}X_i'y_i
        = (S + (X_i'X_i)^{-1})^{-1}(X_i'X_i)^{-1}X_i'y_i    [Equation A.5]

Therefore, we have shown that for any values of Σ_ββ and σ², in the case of the
conditional independence model,

    β̂_SWAMY = β̂_HEND.
Appendix B
Estimates of Variance
B.1 Regression Parameter Variance Estimation
We want to show that:
Thus proving Equation B.1.
Also, we can see that:
    Σ_{i=1}^n Var{b_i} = Var{Σ_{i=1}^n b_i}
so, that from Equation B.1:
Therefore,
B.2 Variances of Logistic Regression Estimators using
the Delta Method

We intend to prove Equations 5.9, 5.10, and 5.11, where their expectations are respectively
given by:
using the Delta Method and Equations 5.7 and 5.8, respectively given by:
We will prove the following lemma using the Multivariate Delta Method given in
Theorem 14.6-2 of Bishop et al. [2, Page 493]:
Lemma B.1 If k and l are r x 1 vectors then:

    Var{k'l} = E{l'}Var{k}E{l} + E{k'}Var{l}E{k}
             + E{k'}Cov{l,k}E{l} + E{l'}Cov{k,l}E{k}    (B.3)

    Cov{k, k'l} = Var{k}E{l} + Cov{k,l}E{k}    (B.4)
Proof of Lemma B.1: Let us define θ = [κ; λ] and the estimate of θ as θ̂ = [k; l],
where κ = E{k}, λ = E{l}, and κ, λ, k, l are all r x 1 vectors. We will further define
the functions f(θ) = κ'λ and g(θ) = κ.
We know that the Taylor expansion of any function u around θ is:

    u(θ̂) ≈ u(θ) + (θ̂ - θ)' ∂u/∂θ

If we differentiate f and g with respect to the vector θ we get:

    ∂f/∂θ = [λ; κ]   and   ∂g/∂θ = [I_r; 0_r]

where I_r is the r x r identity matrix and 0_r is an r x r matrix of zeros.
Therefore,

    Var{k'l} ≈ [λ' κ'] [ Var{k}   Cov{k,l} ; Cov{l,k}   Var{l} ] [λ; κ]
             = E{l'}Var{k}E{l} + E{k'}Var{l}E{k}
             + E{k'}Cov{l,k}E{l} + E{l'}Cov{k,l}E{k}

thus proving Equation B.3.
    Cov{k, k'l} ≈ [I_r 0_r] [ Var{k}   Cov{k,l} ; Cov{l,k}   Var{l} ] [λ; κ]
               = Var{k}E{l} + Cov{k,l}E{k}

thus proving Equation B.4 and Lemma B.1.
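A Monte Carlo spot-check of the scalar (r = 1) case of Equation B.3, using small perturbations so the first-order approximation is tight; all numerical values here are invented for the check:

```python
import random

def delta_var_kl(mk, ml, vk, vl, ckl):
    """Scalar (r = 1) case of Equation B.3: the delta-method
    approximation to Var{k*l} given means mk, ml, variances vk, vl,
    and covariance ckl of k and l."""
    return ml * ml * vk + mk * mk * vl + 2.0 * mk * ml * ckl

rng = random.Random(7)
prods = []
for _ in range(20000):
    g1 = rng.gauss(0.0, 1.0)
    g2 = 0.5 * g1 + (1.0 - 0.5 ** 2) ** 0.5 * rng.gauss(0.0, 1.0)  # corr 0.5
    prods.append((2.0 + 0.05 * g1) * (3.0 + 0.05 * g2))
m = sum(prods) / len(prods)
mc_var = sum((p - m) ** 2 for p in prods) / (len(prods) - 1)
approx = delta_var_kl(2.0, 3.0, 0.05 ** 2, 0.05 ** 2, 0.5 * 0.05 ** 2)
```

The neglected higher-order terms are of the order of the variances squared, so with standard deviations of 0.05 the simulated and approximated variances agree closely.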
Let us now proceed with Equation 5.10, assuming, as we did with the WLS estimator,
that the variance estimate had no error (Σ̂_ββ = Σ_ββ).

proving Equation 5.10.
Before moving on to the variance of the intercept, we will first redefine our estimate:

So, for the variance of the estimate of the intercept, we have:

proving Equation 5.9 and using the independence of an ancillary and a complete
sufficient statistic (given by Basu's Theorem in Casella & Berger).
Appendix C
Updating Conjectures
C.1 The Individual Parameter Estimate
b̂_i = (X_i' W_{y_i} X_i)^{-1} (X_i' W_{y_i} y_i)
    = (X_i' [X_i Σ_ββ X_i' + σ²I]^{-1} X_i)^{-1} (X_i' [X_i Σ_ββ X_i' + σ²I]^{-1} y_i)
    = (X_i' [X_i S X_i' + I]^{-1} X_i)^{-1} (X_i' [X_i S X_i' + I]^{-1} y_i)      [where S = Σ_ββ / σ²]
    = (S + (X_i'X_i)^{-1}) ((S + (X_i'X_i)^{-1})^{-1} (X_i'X_i)^{-1} X_i'y_i)     [Equations A.5 and A.6]
    = (X_i'X_i)^{-1} X_i'y_i
    = b_i
Therefore, we cannot update our estimate of the individual slope parameters using weights
that are equal to the inverse of the variance of y_i.
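The algebra above can be checked numerically. The sketch below is illustrative (the design matrix, random-coefficient covariance, and error variance are arbitrary choices, not the thesis's settings); it confirms that weighting by the inverse of Var{y_i} = X_i Σ_ββ X_i' + σ²I reproduces the ordinary least squares estimate b_i exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 12                                                   # observations on one subject
X = np.column_stack([np.ones(T), rng.normal(size=T)])    # intercept + slope design
y = rng.normal(size=T)

Sigma_bb = np.array([[0.5, 0.1], [0.1, 0.8]])   # illustrative random-coefficient covariance
sigma2 = 0.3                                    # illustrative within-subject error variance

V = X @ Sigma_bb @ X.T + sigma2 * np.eye(T)     # Var{y_i}
W = np.linalg.inv(V)                            # weights = inverse variance

b_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # (X'WX)^{-1} X'Wy
b_ols = np.linalg.solve(X.T @ X, X.T @ y)           # (X'X)^{-1} X'y

print(np.allclose(b_wls, b_ols))
```

The two estimates coincide for any positive-definite Σ_ββ and σ² > 0, which is exactly the reduction derived above.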
C.2 The Individual's Sum of Squares for Error
Let us transform the problem into a slightly different form. Let
We will now determine what T is and what our improved estimation of the sum of squares
for error is for an individual.
Therefore,
So, we cannot update our sum of squares for error using weights equal to the inverse of
the variance of y_i. Thus, our within-individual variance estimate remains unchanged.
C.3 The Regression Parameter Variance Estimator
Following the same algebra as in the first part of Appendix B.1, it is easy to show that:
Let us replace β̂_WLS with β̂_SWAMY in calculating the expected value of the above equation,
as we have done before in determining its mean and variance. So, we have:
using Equation 2.16. Now, we need to solve (∑_{i=1}^{n} w_i)^{-1} explicitly for Σ_ββ and σ².
Since
we will start with {Σ_ββ + σ²(X_i'X_i)^{-1}}^{-1}.
using Lemma A.1 in the first and second calculations.
Now, we have that:
So, therefore:

(1/n) Σ_ββ + (σ²/n²) ∑_{i=1}^{n} (X_i'X_i)^{-1} + o(t^{-1})        (A.6 on Page 12)
using Lemma A.1.
Our approximation of (∑_{i=1}^{n} w_i)^{-1} is, therefore, given by:
So now substituting back into Equation C.1, we have:
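The accuracy of this two-term expansion can be checked numerically. In the sketch below (illustrative values for Σ_ββ, σ², and the X_i; not the thesis's simulation settings), the exact (∑ w_i)^{-1} with Swamy-type weights w_i = {Σ_ββ + σ²(X_i'X_i)^{-1}}^{-1} is compared to the approximation (1/n)Σ_ββ + (σ²/n²)∑(X_i'X_i)^{-1}:

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 200, 10
Sigma_bb = np.array([[1.0, 0.3], [0.3, 2.0]])    # illustrative Σ_ββ
sigma2 = 0.1                                     # illustrative σ²

Xs = [np.column_stack([np.ones(T), rng.normal(size=T)]) for _ in range(n)]
XtXinv = [np.linalg.inv(X.T @ X) for X in Xs]

# Exact inverse of the sum of the weight matrices.
W_sum = sum(np.linalg.inv(Sigma_bb + sigma2 * A) for A in XtXinv)
exact = np.linalg.inv(W_sum)

# Two-term approximation derived above.
approx = Sigma_bb / n + (sigma2 / n**2) * sum(XtXinv)

rel_err = np.abs(exact - approx).max() / np.abs(exact).max()
print(f"max relative error: {rel_err:.2e}")
```

The neglected terms are second order in σ²(X_i'X_i)^{-1}, so for these values the relative error is well below one percent.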
Appendix D
Continuous Outcome Simulation
Results
In this appendix, we present the results of the continuous outcome simulation. In Ta-
bles D.1 - D.4 are results of the Analysis of Variance Tables of the parameter estimates,
the variance ratios, the empirical variances, and the mean square errors of the slope for
the main effects and their interactions with the METHOD variable. These Analysis of
Variance Tables were first described in Section 3.1. Table D.5 is the logistic regression
analyses of the proportion of times that the null hypothesis of the slope was rejected from
each level of the factorial design, respectively. Table D.6 is the number of times that the
Carter & Yang matrix correction was used on a non positive-definite variance-covariance
matrix of regression parameters. Tables D.7 - D.10 are the summarized data for each
level of the factorial design of the parameters that correspond to the Analysis of Variance
Tables in D.1 - D.4. Tables D.11 and D.12 are the number of test rejections for the slope
under the null and alternative hypotheses where all test statistics are reported for the
ULS, WLS, and IWLS estimators. Table D.13 shows the average number of iterations
of the IWLS and MIX estimators required for convergence. In tables where the last row
begins with a series of dots, the values given are overall averages.
Table D.1: Analysis of Variance Table for Slope
Source: METHOD, γ, σ²_β, σ², N, T, γ*METHOD, σ²_β*METHOD, σ²*METHOD, N*METHOD, T*METHOD
[Columns: df, MS, Prob for β = (0,0)' and β = (1,2)'; values not recoverable from the transcript.]
Table D.2: Analysis of Variance Table for Variance Ratio of Slope
Source: METHOD, γ, σ²_β, σ², N, T, and their interactions with METHOD
[Columns: df, MS, Prob for β = (0,0)' and β = (1,2)'; values not recoverable from the transcript.]
Table D.3: Analysis of Variance Table for Empirical Variance of Slope
[Columns: df, MS, Prob for β = (0,0)' and β = (1,2)'. Only one Prob column survives in the
transcript: 0.019, 0.000, 0.000, 0.000, 0.000, 0.000, 0.288, 0.015, 0.023, 0.118, 0.019.]
Table D.4: Analysis of Variance Table for Mean Square Error of Slope
[Recoverable fragment: METHOD, df 3, Prob 0.000 and 0.579; remaining values not recoverable
from the transcript.]
Table D.5: Maximum Likelihood ANOVA Table for Rejection Rates of the Slope
[Recoverable fragment: METHOD, β = (1,2)': df 3, χ² = 1.21, Prob = 0.7503; remaining values
not recoverable from the transcript.]
Table D.6: Number of Times Σ̂_ββ was Not Positive-Definite (out of 100 samples)
[Recoverable fragment: β = (0,0)': df 3, χ² = 8.99, Prob = 0.0294; remaining values not
recoverable from the transcript.]
Table D.7: Average of Slopes
Methods: ULS, WLS, IWLS, MIX, REG, for β = (0,0)' and β = (1,2)'
[Values not recoverable from the transcript.]
Table D.8: Variance Ratio of Slope
β = (0,0)': ULS .392, WLS .229, IWLS .243, MIX .173, REG -.47
β = (1,2)': ULS .079, WLS .004, IWLS .010, MIX .022, REG -.53
[Only these rows are recoverable from the transcript.]
Table D.11: Number of Test Rejections for Slope where β = (0,0)'
† Result is based on 99 samples. ‡ Result is based on 98 samples. § Result is based on 95 samples.
[Table values not recoverable from the transcript.]
Table D.12: Number of Test Rejections for Slope where β = (1,2)'
† Result is based on 99 samples. ‡ Result is based on 98 samples. § Result is based on 94 samples.
[Table values not recoverable from the transcript.]
Table D.13: Average number of iterations for the Methods MIX and IWLS
[Columns: MIX, IWLS for β = (0,0)' and β = (1,2)'; values not recoverable from the transcript.]
Appendix E
Binary Outcome Simulation Results
In this appendix, we present the results of the binary outcome simulation. In Tables E.1 -
E.4 are results of the Analysis of Variance Tables of the parameter estimates, the variance
ratios, the empirical variances, and the mean square errors of the slope for the main
effects and their interactions with the METHOD variable. These Analysis of Variance
Tables were described in Section 6.1. Table E.5 is the logistic regression analyses of the
proportion of times that the null hypothesis of the slope was rejected from each level
of the factorial design, respectively. Table E.6 is the number of times that the adapted
Carter & Yang matrix correction was used on a non positive-definite variance-covariance
matrix of regression parameters. Tables E.7 - E.10 are the summarized data for each
level of the factorial design of the parameters that correspond to the Analysis of Variance
Tables in E.1 - E.4. Tables E.11 and E.12 are the number of test rejections for the slope
under the null and alternative hypotheses where all test statistics are reported for the
ULS, WLS, and IWLS estimators. Table E.13 shows the average number of iterations
of the IWLS and GEE estimators required for convergence. In tables where the last row
begins with a series of dots, the values given are overall averages.
Table E.1: Analysis of Variance Table for Slope
Source: METHOD, γ, σ²_β, N, T, and their interactions with METHOD
[Values not recoverable from the transcript.]
Table E.2: Analysis of Variance Table for Variance Ratio of Slope
Source: METHOD, γ, σ²_β, N, T, and their interactions with METHOD
[Columns: df, MS, Prob for β = (0,0)' and β = (2,4)'; values not recoverable from the transcript.]
Table E.3: Analysis of Variance Table for Empirical Variance of Slope

                       β = (0,0)'                β = (2,4)'
Source             df     MS      Prob       df     MS      Prob
METHOD              3   1593.1    0.000       3   1656.9    0.000
γ                   1   137.64    0.000       1   100.41    0.000
σ²_β                1   9.136     0.085       1   408.19    0.000
N                   1   0.415     0.713       1   661.19    0.000
T                   1   1494.2    0.000       1   1421.6    0.000
γ*METHOD            3   52.627    0.000       3   31.145    0.008
σ²_β*METHOD         3   7.118     0.074       3   171.05    0.000
N*METHOD            3   2.620     0.466       3   250.36    0.000
T*METHOD            3   536.38    0.000       3   587.35    0.000
Table E.4: Analysis of Variance Table for Mean Square Error of Slope
Source: METHOD, γ, σ²_β, N, T, γ*METHOD, σ²_β*METHOD, N*METHOD, T*METHOD
[Columns: df, MS, Prob for β = (0,0)' and β = (2,4)'; values not recoverable from the transcript.]
Table E.5: Maximum Likelihood ANOVA Table for Rejection Rates of the Slope
Source: METHOD, γ, σ²_β, N, T, METHOD*γ, METHOD*σ²_β, METHOD*N, METHOD*T
[Values not recoverable from the transcript.]
Table E.6: Number of Times Σ̂_ββ was Not Positive-Definite (out of 100 samples)
[Columns: df, χ², Prob for β = (0,0)' and β = (2,4)'; values not recoverable from the transcript.]
Table E.7: Average of Slopes
Methods: ULS, WLS, IWLS, GEE, LOG
Recoverable rows:
β = (0,0)', design cell (γ = 1, σ²_β = 0.1, N = 10, T = 10): -.04, .022, .016, -.01, -.01
β = (2,4)': 3.27, 1.50, 1.40, 3.60, 3.60
[Remaining values not recoverable from the transcript.]
Table E.8: Variance Ratio of Slope
Methods: ULS, WLS, IWLS, GEE, LOG
Recoverable rows:
β = (2,4)': .013, 3.37, 2.52, -.49, -.44
β = (0,0)', design cell (γ = 1, σ²_β = 0.1, N = 10, T = 10): .347, 2.43, 2.48, -.18, -.23
[Remaining values and row labels not recoverable from the transcript.]
Table E.9: Empirical Variance of Slope
[Columns: ULS, WLS, IWLS, GEE, LOG for β = (0,0)' and β = (2,4)'; the per-cell values appear
in the transcript but their design-cell row labels are not recoverable.]

Table E.10: Mean Square Error of Slope
Methods: ULS, WLS, IWLS, GEE, LOG
Recoverable rows (design cell γ = 1, σ²_β = 0.1, N = 10, T = 10):
β = (0,0)': 2.23, .356, .367, .582, .582
β = (2,4)': 4.58, .601, .577, 2.20, 2.20
[The remaining per-cell values appear in the transcript but their row labels are not recoverable.]
Table E.11: Number of Test Rejections for Slope where β = (0,0)'
‡ Result is based on 98 samples.
[Table values not recoverable from the transcript.]
Table E.12: Number of Test Rejections for Slope where β = (2,4)'
† Result is based on 99 samples. ‡ Result is based on 98 samples. § Result is based on 97
samples. ¶ Result is based on 96 samples. ‖ Result is based on 95 samples. # Result is based
on 93 samples.
[Table values not recoverable from the transcript.]
Table E.13: Average number of iterations for the Methods GEE and IWLS
[Columns: GEE, IWLS for β = (0,0)' and β = (2,4)'; values not recoverable from the transcript.]
Bibliography
[l] Anderson, R. L. and Bancroft, T. A. (1952). Statistical Theory in Research. New York: McGraw-Hill.
[2] Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W. (1975). Discrete Multivariate Analysis: Theory and Practice. Cambridge, MA: M.I.T. Press.
[3] Carter, R. L. and Yang, M. C. K. (1986). Large Sample Inference in Random Co- efficient Regression Models. Communications in Statistics-Theory and Methods 15, 2507-2525.
[4] Casella, G. and Berger, R. L. (1990). Statistical Inference. Belmont, CA: Duxbury Press.
[5] Cornfield, J. (1962). Joint Dependence of Risk of Coronary Heart Disease on Serum Cholesterol and Systolic Blood Pressure: A Discriminant Function Analysis. Federation Proceedings 21, 58-61.
[6] Diggle, P. J., Liang, K.-Y., and Zeger, S. L. (1994). Analysis of Longitudinal Data. Oxford: Oxford University Press.
[7] Gart, J. J. and Zweifel, J. R. (1967). On the Bias of Various Estimators of the Logit and its Variance with Application to Quantal Bioassay. Biometrika 54, 181-187.
[8] Gumpertz, M. and Pantula, S. G. (1989). A Simple Approach to Inference in Random Coefficient Models. The American Statistician 43, 203-210.
[9] Harville, D. A. (1977). Maximum Likelihood Approaches to Variance Component Estimation and to Related Problems. Journal of the American Statistical Association 72, 320-340.
[10] Henderson, C. R. (1975). Best Linear Unbiased Estimation and Prediction Under a Selection Model. Biometrics 31, 423-447.
[11] Laird, N. M. and Ware, J. H. (1982). Random-Effects Models for Longitudinal Data. Biometrics 38, 963-974.
[12] Schall, R. (1991). Estimation in Generalized Linear Models with Random Effects. Biometrika 78, 719-727.
[13] Schott, J. R. (1997). Matrix Analysis for Statistics. New York: John Wiley & Sons, Inc.
[14] Searle, S., Casella, G., and McCulloch, C. (1992). Variance Components. New York: John Wiley & Sons, Inc.
[15] Stiratelli, R., Laird, N. M., and Ware, J. H. (1984). Random Effects Models for Serial Observations with Binary Response. Biometrics 40, 961-971.
[16] Swamy, P. A. V. B. (1970). Efficient Inference in a Random Coefficient Regression Model. Econometrica 38, 311-323.
[17] Swamy, P. A. V. B. (1971). Statistical Inference in Random Coefficient Regression Models. Berlin: Springer-Verlag.
[18] Waclawiw, M. A. and Liang, K.-Y. (1993). Prediction of Random Effects in the Generalized Linear Model. Journal of the American Statistical Association 88, 171-178.
[19] Waclawiw, M. A. and Liang, K.-Y. (1994). Empirical Bayes Estimation and Inference for the Random Effects Model with Binary Response. Statistics in Medicine 13, 541-551.
[20] Zeger, S. L. and Liang, K.-Y. (1986). Longitudinal Data Analysis for Discrete and Continuous Outcomes. Biometrics 42, 121-130.