running head: hierarchical linear modeling of ...web.pdx.edu/~newsomj/mlrdyad4.doc · web...

49
Nonconvergence and sample bias 1 Running head: HIERARCHICAL LINEAR MODELING OF DYADIC DATA Nonconvergence and Sample Bias in Hierarchical Linear Modeling of Dyadic Data Jason T. Newsom and Masami Nishishiba Portland State University Draft: 2/13/02

Upload: others

Post on 03-Sep-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 1

Running head: HIERARCHICAL LINEAR MODELING OF DYADIC DATA

Nonconvergence and Sample Bias in Hierarchical Linear Modeling of Dyadic Data

Jason T. Newsom and Masami Nishishiba

Portland State University

Draft: 2/13/02

We thank Tasha Beretvas, Joop Hox, and Aloen Townsend for helpful comments on an earlier

draft of the manuscript. Address correspondence to Jason T. Newsom, Ph.D., Institute on Aging,

School of Community Health, Portland State University, P.O. Box 751, Portland, OR 97207-

0751 or [email protected].

Page 2: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 2

Abstract

Recent statistical development and software availability for hierarchical linear models has led to

an increasingly wide range of research applications. Although researchers have begun to use

these techniques and even recommend their use with dyadic data, very little is known about

estimation with only two observations per group. This Monte Carlo study examines the effects

of the number of dyads and the intraclass correlation coefficient on convergence difficulties and

bias of parameter estimates and their standard errors. Results show that convergence problems

with intercept and slope variance estimates are extremely common. Bias was generally low for

fixed effects and their standard errors, but random effects and their standard errors frequently

showed serious bias. The findings suggest that random effect estimation and significance tests

are largely impractical with dyadic data.

Page 3: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 3

Convergence Difficulties and Sample Bias in Hierarchical Linear Modeling of Dyadic Data

With the recent development and widespread availability of hierarchical linear modeling

(HLM) techniques, new analysis strategies for a variety of research designs have emerged. HLM

is an appropriate analytic technique for analysis of nested or hierarchically structured data in

which individual observations are nested within groups. A common example is data that involve

students nested within classrooms. Students who share the same teacher, facilities, or curriculum

tend to have related or dependent scores. These data structures lead to violation of standard

independence assumptions of traditional regression analysis whenever the measure of

nonindependence, the intraclass correlation coefficient (ICC), is greater than zero. Among

several new areas of the application of HLM is the analysis of dyadic data, such as couples,

twins, parent-child interaction, or friendship pairs. There is a growing body of published reports

that use HLM for analysis of dyadic data (e.g., Barnett, Marshall, Raudenbush, & Brennan,

1993; Kurdek, 1997; Windle & Dumenci, 1997), and a few sources that suggest HLM as an

option for dyadic data (e.g., Maguire, 1999; Kashy & Kenny, 2000; Newsom, in press). Analysis

of dyadic data is appropriate under the rationale of HLM, because dyad members are individuals

nested within groups of two. However, little simulation work has been conducted that examines

the behavior of estimates and their standard errors when data are from individuals nested within

very small groups, and no studies that we could find have examined dyads. The growing

application of HLM to dyadic data has been proceeding with virtually no information available

to researchers about the practical difficulties of estimation, such as convergence failures, bias in

regression estimates, or bias in standard errors and significance tests. The present study

examines these issues in an extensive Monte Carlo study of the HLM with dyadic data under a

variety of conditions.

Page 4: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 4

Hierarchical Linear Models

HLM, sometimes referred to as "multilevel regression" or "multilevel modeling", is a

regression-based analysis that can be conceptualized as a two-level regression (Aitkin &

Longford, 1986; de Leeuw & Kreft, 1986: Goldstein, 1986; Mason, Wong, & Entwisle, 1984;

Raudenbush & Bryk, 1986). The first level of analysis involves an identical regression analysis

repeated within each group. This regression model follows the ordinary least squares regression

model1 with as many as p independent variables:

Level 1:

In the above equation, subscripts i and j represent individuals and groups, respectively. The

subscript, p, designates the number of predictors at level 1. In the case of dyads, the number of

observations per group, nj, is equal to two. In the second-level of analysis, the intercept, 0, and

slopes, p, serve as dependent variables in another regression analysis using predictors measured

at the group level. Such predictors might include classroom size, the teacher’s race, or average

student socioeconomic status with classroom data, or household income, the number of years

married, or twin’s age in the dyadic example. As in ordinary least squares regression analysis,

the intercept represents the value of the dependent variable when the predictors equal zero. It is

also possible to “center” predictors, by computing deviations from the group mean or grand

mean. Centering produces level-1 intercepts that represent the group mean or grand mean for

each group (for a more complete discussion of this topic, see Kreft, de Leeuw, & Aiken, 1995). 1 Although there are several estimators available with HLM, such as restricted maximum

likelihood, generalized least squares, or ordinary least squares, we will use “ordinary least

squares” to refer to standard, nonhierarchical linear regression, and “HLM” to refer to multilevel

regression using any of these possible estimators.

Page 5: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 5

Thus, the level-two analysis is a regression that predicts the intercept or a particular slope for

each group. Under certain coding schemes, the level-2 intercept can be interpreted as a grand

mean. The level-2 slopes can be interpreted as the effect of a level-2 predictor on the group

average when predicting intercepts or the effect of the level-2 predictor on the relationship

between a level-1 predictor and the dependent variable. The latter is referred to as a “cross-

group interaction” (e.g., Kreft and de Leew, 1998). There are p + 1 possible level-2 equations,

but, for simplicity, we present only two level-2 equations, based on an analysis with only one

level-1 predictor, 1:

Level-2 equations:

In the equations above, there are q possible level-2 predictors and represents the coefficient for

the intercept or slope. 00 or 10 are intercepts and 0q and 1q are slopes and both are commonly

referred to as “fixed effects.” u0j and u1j are level-2 residuals. Their variances, 00 and 11, which

represent the variation of the intercept or slope across groups, are referred to as “random effects”

and can be of principal interest to researchers.

The level-1 and level-2 equations can be written as a single regression equation using

algebraic substitution:

The extent to which observations within a group are related can be expressed as

an estimate of the ratio the between-group variation relative to the total variation in the

population, called the ICC:

Page 6: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 6

In this equation, is the ICC, 00 is the variance of the intercept when there are no predictors2,

and is the within-group variance (i.e, the variance of r). When the ICC equals 0, there is no

difference between OLS regression estimates and those obtained with HLM, because no

clustering exists. OLS standard error estimates become increasingly negatively biased as

clustering within groups increases.

Behavior of HLM Estimates and Their Standard Errors

Over the last decade, there has been a rapid growth in the popularity of multilevel models

in the social sciences and psychology, but it is surprising to learn that there is a great paucity of

published Monte Carlo studies examining the bias or efficiency of coefficients, random effects,

and their standard errors. Textbook discussions frequently cite unpublished doctoral

dissertations or technical reports, and conclude that the most commonly used estimator, restricted

maximum likelihood (REML), shows little bias in fixed effects estimates. 3 Although much of

this work is relevant, researchers interested in applying hierarchical models to dyads do not have

ready access to these findings nor are the findings easy to extrapolate to the dyadic case.

Moreover, in practice, researchers often encounter convergence difficulties due to nonpositive

definite variance estimates, but there is little work documenting the conditions under which these

problems are most likely to occur. Of the few Monte Carlo studies conducted, most have

attempted to retain a constant total sample size, while attempting to evaluate the effects of

differing numbers and observations per group. Thus, their findings make it difficult to

extrapolate to the effects of varying the number of groups when groups sizes are small. In

addition, varying both group size and the number of observations per group simultaneously may

have other problems. Bliese (1998), for instance, shows that the ICC is dependent on group size,

Page 7: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 7

making it difficult to assess the independent effects of ICC and group size without special

precautions.

Perhaps the most frequently discussed simulation work was done by Bassiri (1988),

reported in an unpublished doctoral dissertation. Bassiri examined the effects of intraclass

correlation (.10 vs. .25), the number of observations per group (5 through 150), and the number

of groups (10 through 150) on fixed effects estimates, their standard errors, random effects

estimates, and Type I and Type II error rates with REML estimation. She concluded that, in

general, level-2 estimates (fixed and random effects) are unbiased, consistent, and asymptotically

efficient, that standard errors are primarily a function of the number of groups rather than the

size of the groups, and higher ICC values are associated with poorer precision in fixed and

random effects estimates.

In another doctoral dissertation, Kim (1990) examined the magnitude of slope estimates,

the number of groups (25, 50, 100, and 200), and the number of observations within each group

(10, 20, 30). GLS and full ML estimates were compared using relatively complex regression

models (i.e., 9 parameters). His findings suggest that the greatest bias in fixed and random

effects occurs when there are a relatively large number of observations per group and there are a

small number of groups (i.e., 40 case per group and 25 groups). Kim, however, did not

manipulate group size and the number of groups independently (e.g., in the conditions in which

there were a small number of observations per group there were also more groups), so it is

difficult to discern which of these factors is responsible for parameter bias. In addition, Kim’s

study only included 50 replications per cell, which will lead to less reliable estimates of the true

sampling variability (Efron, 1990).

Page 8: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 8

In another small study, Mok (1995) reached similar conclusions. She concluded that the

number of groups is a more important factor than the number of observations per group in both

bias and efficiency of estimates. She states, "…if resources were available for a sample size n,

comprising J schools with I students from each school, then less bias and more efficiency would

be expected from sample designs involving more schools (large J), and fewer students per school

(small I) than sample designs involving fewer schools (small J), and fewer students per school

(small I)" (p.6).

Busing (1993; see also van der Leeden and Busing; 1994) conducted the most extensive

simulation study to date, using a large number of replications and examining a wide range of

ICC values (.2, .4, .6., and .8), number of groups (5, 10, 25, 50, 100, and 300), and the number of

observations within groups (5, 10, 25, 50, and 100). In addition to parameter and standard error

bias, he examined convergence difficulties and improper solutions. Because the number of

observations per group and the number of groups were independently manipulated and because

as few as 5 observations per group were investigated, these findings are perhaps the most

applicable to the dyadic case. Although fixed estimates performed well in all conditions, Busing

reports that random effects estimates were biased unless more than 300 groups were used. These

results are consistent with those reported by both Kim (1990) and Mok (1995) which show

random effects estimates are affected more by the number of groups rather than the number of

observations per group. With very few observations per group, there was a greater propensity

for nonconvergence or improper solutions (i.e., nonpositive definite matrix), although the largest

percentage of convergence problems, which was found in the smallest group size condition, was

under 2%. It is unclear, however, the extent to which these results will generalize to as few as 2

observations per group.

Page 9: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 9

Donoghue and Jenkins (1992) reported on a small simulation study with 20 replications

in each cell of the design, examining the effects of model misspecification. The study compared

misspecified models (i.e., inclusion of a predictor in the model when it was unrelated to the

outcome and failure to included a predictor in the model even though it had a large relationship

with the outcome) to correctly specified models. The authors report no bias of within-group

error estimates, slopes, intercepts, or covariance estimates in most conditions. These authors also

examine convergence difficulties. Although they indicate that nonpositive definite matrices or

convergence failures were more common with only 5 observations per group, detailed results for

all conditions in their study were not presented. The number of groups and the number of

individuals per group, were not independently manipulated, however, because every condition

contained 1500 observations total, ranging from 10 groups with 150 observations to 300 groups

with 5 observations).

In a recent simulation by Maas and Hox (2002), the authors examine the effects of ICC,

(.10 through .30) group size, and the number of groups of parameter bias. Their study includes

sample sizes as low as 5 observations, with the number of groups varying from 10 to 100.

Although they report that fixed effects are unbiased under all conditions, they do find evidence

of important bias in variance estimates and their standard errors with a small number of

observations per group and a small number of groups.

Although they do not present Monte Carlo findings, Bryk and Raudenbush (1992) and

Snijders and Bosker (1994, 1999) describe pertinent analytic work. Bryk and Raudenbush state

that estimates will be unbiased with balanced data (i.e., equal number of observations per group)

but are too small with unbalanced data, and standard errors will tend to be negatively biased

unless large total sample sizes are used. In addition, they speculate that a small number of

Page 10: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 10

observations per group may be problematic: "We suspect that the likelihood of [matrix of

variance estimates] can be quite skewed if nj [number of observations per group] is small, even

if J [number of groups] is large, thus rendering test results inaccurate" (p. 224, bracketed text

added).

Snijders and Bosker (1994,1999), based on analytic work with power analysis, discuss

the relationship between standard errors, group size, number of groups, and intraclass

correlation. They show a nonmonotonic decrease in standard errors associated with more

observations per group in some conditions. The shape of the decline across sample sizes,

however, differed by ICC, and, although there was little difference between 2 and 5 observations

per group as a function of ICC, larger group sizes only led to an overall decline in standard errors

when the ICC was low. Higher ICC values (e.g., .2 to .4) showed a slight or no increase in

standard error with larger samples. This finding suggests that, with dyadic data, increasing the

number of groups may not produce consistency in standard errors for larger ICC values. As with

several other studies, the effects of group size and number of groups are difficult to distinguish,

because group size and the number of groups were not independently examined (i.e., a constant

total sample size was used).

Finally, based on their synthesis of existing simulation work and experience with HLM,

some authors have provided recommendations for the optimal number of observations per group

and number of groups. Kreft (1996) proposes a general 30/30 rule, in which there are 30 groups

and 30 observations per group. Elaborating on Kreft’s recommendations, Hox (1998) suggests a

minimum ratio of 50 groups to 20 observations per group in order to test cross-level interactions,

and suggests a minimum ratio of 100 to 10 to test random effects. No specific guidance is

provided on the lowest possible number of observations per group or the minimum number of

Page 11: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 11

groups required when there are a small number of observations, but these recommendations carry

the implication that smaller group sizes of number of groups may be problematic.

In our review of the simulation and analytic work on HLM estimates, it appears that

dyadic researchers have little information to draw from in making analytic decisions. Authors

have most often concluded that the number of groups is more important than group size. No

simulation studies to date have investigated group sizes as small as two. Although two studies

have noted problems with convergence or nonpositive definite matrices that tend to increase with

fewer observations per group, more information is clearly needed on these estimation problems.

Thus, a variety of questions remain about convergence difficulties and the behavior of parameter

estimates and their standard errors when dyadic data are analyzed with hierarchical models. For

instance, what is the minimum number of dyads needed to obtain unbiased parameter estimates?

How likely are convergence difficulties in estimating random effects when dyadic data are used,

and under what conditions are they most or least likely to occur? To what extent do bias and

convergence problems depend on the number of dyads or the ICC?

To address these questions, we examine nonconvergence problems, bias in fixed effects

estimates (intercept and slope estimates), bias in random effects estimates (variance), and bias in

standard errors for these estimates in a large simulation study. These factors are examined under

a wide variety of ICC values and sample sizes in which population values are known.

Method

Page 12: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 12

Design

A Monte Carlo study was undertaken to examine the effects of intraclass correlation and

sample size on nonconvergence, parameter bias, and standard error bias. The study was a 4

(intraclass correlation) X 5 (sample size) experimental design. With a group size of two, a model

estimating both intercept and slope variance is not identified.4 Thus, two separate models were

tested in order to examine nonconvergence, parameter bias, and standard error bias in the

variances estimates of intercepts and slopes. Model 1 estimated the variance of the intercept and

Model 2 estimated the variance of the slope. Each cell of the design contained 200 samples

randomly drawn from a larger population so that expected values could be compared with known

population values.

Data Generation

A total of 400,000 observations were generated representing population data for the four

intraclass correlation conditions. The first step involved the generation of a dependent variable

representing paired, clustered observations. A normal deviate (with mean of 0 and SD of 1) for

50,000 observations was generated using the pseudorandom number generator in SPSS Version

10.0 (i.e., the RV.NORMAL function). To create clustered dyads, a second variable was

generated based on the normal deviate plus random error. The degree of random error in the

computation of the second variable was then varied to create four data sets with differing

correlations between the normal deviate and the second variable (r=.05, .10, .20, and .30). A

third variable, representing a predictor with a known correlation (r=.3), was generated using the

normal deviate plus random error. The data were then disaggregated to create a dependent

variable with one correlated predictor for the four intraclass correlation conditions, representing

four populations of 100,000 observations each.

Page 13: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 13

The 4 X 5 design involved a total of 4,000 replications. Two hundred samples for each

of the 20 conditions were repeatedly drawn from one of the four population data sets using the

pseudorandom number generator in SPSS Version 10.0 (i.e., the SAMPLE procedure). Five

sample sizes, representing different numbers of dyads, nj=50, 100, 200, 500, and 1000 (i.e., 100,

200, 400, 1000, and 2000 individuals respectively) were chosen to represent a range of sample

sizes likely to be employed in psychological research.

Model Specification and Data Analysis

Analyses were conducted using SAS Version 6.12 PROC MIXED with REML

estimation. Two different models were tested: Model 1 specified a random effect of the intercept

(i.e., estimation of 00) and Model 2 specified a random effect of the slope (i.e., estimate of 11).

The two-level equations for the models are as follows:

Model 1

Substituting, this model can be represented by a single equation as:

Model 2

Expressed as a single equation:

Page 14: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 14

Nonconvergence was estimated by the number of samples in which no estimate was

available for random effect. A failure to obtain an estimate of the random effect in all cases was

due to a nonpositive gamma matrix, which contains the random effects estimates. Nonpositive

definiteness occurs because an estimate generated during the iterative process is zero or negative,

and because the inversion of such a matrix is undefined, iterations are stopped and no estimate is

available. When a nonpositive definite matrix occurs, SAS PROC MIXED provides output for

fixed effects estimates but prints a warning regarding the random effects. Other packages, may

handle nonpositive solutions differently. For example, by default HLM version 5 resets

nonpositive variances to zero during the iteration process.

We computed bias estimates for the two fixed effects, o and 1, and the random effect,

00 for Model 1 or 11 for Model 2. Bias was computed by subtracting the expected sample value

from the population value: . To obtain a measure of the magnitude of this bias, we

computed a percentage bias measure using a z-value,

in which SD represents the standard deviation of sample estimates of the parameter, .The

cumulative probability of the z-value was then obtained, and .5 was subtracted. The probability

value was multiplied by 100 to obtain a percent difference from the population mean based on

the normal distribution. We chose this computation of percent bias because of difficulties with

the use of a more common measure of percent bias that divides bias by the population value.

The more common measure of percent bias leads to division by zero if the population value is

Page 15: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 15

zero, and, because of the use of random deviates, population values for the intercept were equal

to zero.

Bias in standard errors was computed by subtracting the standard deviation of the

parameter across samples from the expected value of the standard error for that parameter:

In this equation, SE represents the sample standard error estimate for the parameter, and SD

represents the standard deviation of sample estimates.

Results

Nonconvergence

The number and percentage of samples of 200 that had convergence failures is presented

in Table 1. For Model 1, in which intercept variance was estimated, nonconvergence was a

greater problem with fewer dyads and smaller intraclass correlation. As many as 39% of

samples had a nonpositive definite matrix in the condition with the lowest intraclass correlation

(ICC = .05) and 50 dyads. With a low ICC, even 1000 dyads were insufficient to avoid

convergence problems altogether. With higher ICC values, however, fewer dyads are required to

avoid convergence problems, J of 1000, 200, and 100 for ICCs of .10, .20, and .30, respectively.

For Model 2, which estimated slope variance, convergence was highly problematic and were

seemingly unrelated to intraclass correlation or number of dyads. For nearly all cells of the

design, nearly 50% of variance estimates for slopes were unobtainable.

Parameter Bias

Results for the bias estimates of 0, 1, 00, and 11 appear in Table 2. Bias estimates of

fixed effects 0 and 1 show little if any bias across all conditions. Percentage bias of these

Page 16: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 16

parameters was below 10% and nearly always below 5%. Moreover, bias estimates were not

consistently in a positive or negative direction.

Random effects results for 00, and 11, were quite different, however. Both variance

estimates were consistently positively biased. Bias in intercept variance estimates in Model 1,

00, decreases as the number of dyads increases. This decrease also appears to be dependent on

the ICC, with sample size having a greater effect for higher intraclass coefficients and a slight

tendency toward underestimation of intercept variance with high ICCs in large samples. In

Model 1, which estimates intercept variance (i.e., 00,), all percentage bias estimates were greater

than 10% when ICC = .05. For ICC of .10, .20, and .30, 1000, 200, and 50 dyads, respectively,

appear to be sufficient to keep bias in the estimate of intercept variance under 10%. In Model 2,

which estimates slope variance (i.e., 11,), bias estimates were greater than 10% in all ICC

conditions and were not dependent on sample size.

In summary, it appears that there is little bias in intercept or slope parameter estimates.

Intercept variance estimates will be problematic unless the ICC is greater than .10 and there is

sufficient sample size. Slope variance estimates appear to be positively biased regardless of the

ICC value or the sample size and are therefore likely to be impractical for dyadic researchers.

Standard Error Bias

To investigate potential problems with significance tests of fixed or random effects, we

computed bias for the standard error estimates of 0, 1, 00, and 11. Results are presented in

Table 3. There are few apparent biases for standard errors of intercepts, with nearly every bias

estimate below 10%. However, there did appear to be a trend toward negative bias as ICCs

values increased, and this may suggest some potential difficulties with Type I errors for very

large number of dyads sizes and higher ICCs. Bias was also generally low in standard error

Page 17: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 17

estimates of slopes. Although several cells showed greater than 10% overestimation, these

observations appear to be limited to the lower sample size conditions (e.g., fewer than 200

dyads) in Model 2.

There was evidence of severe bias for variance estimates in both Model 1 and Model 2.

Standard errors for variance estimates of intercepts in Model 1 showed positive bias as high as

69%. Although bias was reduced as sample size increased, as many as 1000 dyads was not

sufficient to reduce the bias in the standard error estimates for the intercept when ICC = .05. To

obtain unbiased standard error estimates for intercepts (Model 1), 500, 200, and 200 dyads

appear to be necessary for ICCs of .10, .20, and .30, respectively. For standard error estimates of

slopes (Model 2), the positive bias was quite high for all conditions, with an average bias of

approximately 64%.

Discussion

An extensive study of convergence difficulties and parameter bias shows important

problems with the use of HLM with dyadic data whenever random effects estimates are of

interest. Although fixed effects estimates of slopes and intercepts and their standard errors show

little bias across any sample size or ICC condition, random effects estimates led to frequent

convergence difficulties, positive bias in parameters, and positive bias in standard errors.

Although intercept random effects were less problematic with larger sample sizes and higher

ICC values, there were serious convergence difficulties, parameter bias, and standard error bias

for slope random effects. In practice, it will be difficult to estimate intercept or slope variances

in many dyadic applications.

These results provide greatly needed information on the application of HLM to dyadic

data and suggest that such applications are largely impractical when research questions concern

Page 18: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 18

variability across dyads. Our findings indicate that convergence with random effects will often

be a serious problem, but a proportion of models specifying random effects will converge. In

these cases, researchers may be tempted to report their findings, but, because of serious problems

with bias in random effects estimates and their standard errors, neither the estimates nor the

significance tests using Wald ratios (i.e., z-test using ratio of the estimate to the standard error)

can be trusted.

Dyadic researchers not interested in random effects estimates could successfully employ

HLM, but estimates would have to be assumed to be invariant across dyads. Inclusion or

exclusion of random effects may have an impact on fixed effects estimates (Kreft & de Leeuw,

1998; Longford, 1993), however, and therefore constraints on random parameters is not

recommended without reasonable justification. When level-2 predictors are incorporated and the

variance is set to zero, researchers can conceptualize level-1 parameters as varying nonrandomly

(Raudenbush & Bryk, 2002). In this case, slopes or intercepts from level 1 varying completely

as a function of some level-2 variable. An additional problem for researchers is that a proportion

of models with random effects will converge and estimates of random effects and their

significance tests will be obtained.

In this study, our focus has been on the application of HLM to dyadic data, but the results

also have implications for growth curve models with two time points. One important difference

may be the range of ICC values expected with longitudinal data. Many variables may have

higher ICC values or consistency over time than were examined in our study (i.e., greater

than .3). Our choice of values for ICC was guided by common values encountered in the dyadic

literature rather than those expected for repeated measures. The results, however, suggest that

ICC values had no effect on slope variance estimates or their standard errors within the range of

Page 19: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 19

values we examined, and, thus, there is little reason to expect a difference in this trend with

higher ICC values.

Although the present study represents valuable information on dyadic analysis with

HLM, there are some important limitations of our findings. Perhaps most importantly,

convergence difficulties and bias in random effects may be maximally estimated here, because

values of both the intercept and slope variance in the population were low. Nonpositive sample

variance estimates are most likely to occur under these circumstances, and, thus, bias in sample

estimates of intercept or slope variance may be less severe when their respective population

parameters are large. Nevertheless, such circumstances are likely to be common in practice

where researchers are unaware of true population values.

More generally, we did not systematically vary whether random or fixed parameters were

correctly or incorrectly specified in the model, and we cannot be certain of the impact of model

misspecification on estimates or standard errors. For fixed effects, both models we tested

represent some degree of misspecification.. With only two observations per group, one must

constrain either the variance estimate for the intercept or slope and their covariance to zero to

identify the model. Furthermore, more explicit comparisons between correct and incorrect

model specification would be needed for exact information about Type I or Type II error rates.

In addition to these issues, there are several other limitations of the present study. First,

the models we examined were relatively simple and did not include level-2 predictors or multiple

level-1 predictors. It is difficult to imagine, however, that better estimates would be obtained

with more complex models. Second, our manipulation of the ICC was not independent of

within-group variance or between-group variance, but differences in these variance components

are not likely to be independent of ICC values in practice. Because sample ICC values were

Page 20: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 20

compared to population ICC values, however, our results provide a fair comparison between

sample and population values and thus accurate estimates of bias in each of the ICC conditions.

Moreover, Bryk and Raudenbush (1992, p. 221) show that standard errors do not depend on

within- or between-group variance when balanced data are used. Third, a number of authors

have attempted to retain an equivalent sample size while examining the effects of the number of

observations per group and the number of groups (e.g., Bassiri, 1988; Kim, 1990; Mok, 1995).

Other authors (Busing, 1993; van der Leeden & Busing, 1994) have attempted to examine group

size and the number of groups independently, allowing total sample size to fluctuate. It may be

impossible to disentangle the importance of group size, number of groups, and total sample size.

Regardless of the resolution of this issue, our focus on dyadic data meant that the effects of the

number of groups and the total sample size could not be separated.

Our findings do not provide dyadic researchers interested in testing random effects with

much reason for optimism, and we can offer few suggestions for existing solutions. Some

authors (Bryk & Raudenbush, 1992; Snijders & Bosker, 1999) recommend that variance tests be

conducted via likelihood ratio tests using FIML instead of Wald tests. We did not obtain either

FIML estimates or nested model tests, and, thus, have no information on the usefulness of this

recommendation with dyadic data. Because REML and FIML estimates are similar with larger

sample sizes (Kim, 1990), we would expect little difference from our findings using FIML in

place of REML. Furthermore, given the rather severe bias in estimates for random effects, it is

unlikely nested tests of these biased parameters would be a substantial improvement over Wald

ratio tests. Researchers also rarely report or test random effects using nested models, typically

relying on the Wald tests automatically generated by the software. van der Leeden et al. (1997)

have suggested the use of bootstrap methods for standard error estimates and future work on this

Page 21: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 21

may prove useful for testing random effects with dyads. Another possible solution may involve

a bias correction factor that could be applied to parameters and standard error estimates. Until

such work is completed and tested in simulation work, researchers need to be cautioned against

the use of random effects with dyadic data.

Page 22: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 22

References

Aitkin, M., & Longford, N. (1986). Statistical modeling issues in school effectiveness

studies. Journal of the Royal Statistical Society, Series A, 149, 1-43.

Barnett, R.C., Marshall, N.L., Raudenbush, S.W., & Brennan, R.T. (1993). Gender and

the relationship between job experiences and psychological distress: A study of dual-earner

couples. Journal of Personality and Social Psychology, 64, 794-806.

Bassiri, D. (1988). Large and small sample properties of maximum likelihood estimates

for the hierarchical linear model. Unpublished doctoral dissertation. East Lansing, MI,

Michigan State University.

Bliese, P.D. (1998). Group size, ICC values, and group-level correlations: A simulation.

Organizational Research Methods, 1, 355-373

Bryk, A.S., & Raudenbush, S. W. (1992). Hierarchical linear models: Applications and

data analysis methods. Newbury Park, CA: Sage.

Busing, F.M.T.A. (1993). Distribution characteristics of variance estimates in two-level

models. Preprint PRM 93-04. Psychometric and Research Methodology, Leiden, Netherlands.

de Leeuw, J. & Kreft, I. (1986). Random coefficient models for multilevel analysis.

Journal of Educational Statistics, 11, 57-85.

Donoghue, J.R., & Jenkins, F. (1992). A Monte Carlo study of ethe effects of model

misspecification on HLM estimates. Technical report, Educational Testing Service, Princeton,

NJ. November 1992.

Efron (1990). More efficient bootstrap computations. Journal of the American Statistical

Association, 85, 79-89.

Page 23: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 23

Goldstein, H. (1986). Multilevel mixed linear model analysis with iterative generalized

least squares. Biometrika, 73, 43-56.

Hox, J. (1998). Multilevel modeling: When and why. In R.Mathar & M. Schader,

Classifcation, data analysis, and data highways. Berlin, Germany: Springer-Verlag.

Kashy, D.A., & Kenny, D.A. (2000). The analysis of data from dyads and groups. In

H.T. Reis & C. M. Judd, Handbook of research methods in social and personality psychology

(pp. 451-477). Cambridge: Cambridge University Press.

Kim, K.-S. (1990). Multilevel data analysis: A comparison of analytic alternatives.

Unpublished doctoral dissteration. University of California, Los Angeles, CA.

Kreft, I.G.G. (1996). Are multilevel techniques necessary? An overview, including

simulation studies. Unpublished manuscript, California State University, Los Angeles, CA.

Kreft, I.G.G., & de Leeuw, J. (1998). Introducing multilevel modeling. London: Sage.

Kreft, I.G.T., de Leeuw, J., & Aiken, L. (1995). The effect of different forms of

centering in hierarchical linear models. Multivariate Behavioral Research, 30, 1-22.

Kurdek, L.A. (1997). Adjustment to relationship dissolution in gay, lesbian, and

heterosexual partners. Personal Relationships, 4, 145-161.

Longford, N. (1993). Random coefficient models. Oxford, England: Oxford University

Press.

Maas, C.J.M., & Hox, J.J. (2002). Sufficient sample sizes for multilevel modeling.

Unpublished manuscript, Utrecht University, The Netherlands.

Maguire, M. C. (1999). Treating the dyad as the unit of analysis: A primer on three

analytic approaches. Journal of Marriage and Family, 61, 213-223

Page 24: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 24

Mason, W., Wong, G., & Entwisle, B. (1984). Contextual analysis through the multi-

level linear model. In S. Leinhardt, (Ed.), Sociological Methodology (pp. 72-103). San

Francisco: Jossey-Bass.

Mok, M. (1995) Sample size requirements for 2-level designs in educational research.

Unpublished manuscript, Macquarie University, Sydney Australia.

Newsom, J.T. (in press). A multilevel structural equation model for dyadic data.

Structural Equation Modeling.

Raudenbush, S., & Bryk, A. (1986). A hierarchical model for studying school effects.

Sociology of Education, 59, 1-17.

Raudenbush, S., & Bryk, A. (2002). Hierarchical linear models: Applications and data

analysis methods (second edition). Thousand Oaks, CA: Sage.

Kreft, IG.G., de Leeuw, J., & Aiken, L. (1995). The effect of different forms of centering

in hierarchical linear models. Multivariate Behavioral Research, 30, 1-22.

Snijders, T. A.B., & Bosker, R.J. (1994). Modeled variance in two-level models.

Sociological Methods and Research, 22, 342-363

Snijders, T. A.B., & Bosker, R.J. (1999). Multilevel analysis: An introduction to basic

and advanced multilevel modeling. Thousand Oaks, CA: Sage.

van der Leeden, R., & Busing, F.M.T.A. (1994). First iteration versus IGLS/RIGLS

estimates in two-level models: A Monte Carlo study with ML3. Preprint PRM 94-03.

Psychometrics and Research Methodology, Leiden, Netherlands.

Windle, M., & Dumenci, L. (1997). Parental and occupational stress as predictors of

depressive symptoms among dual-income couples: A multilevel modeling approach. Journal of

Marriage & the Family, 59, 625-634.

Page 25: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 25

Table 1

Random Effect Nonconvergence in Models 1 (Intercept) and 2 (Slope).

ICC .05 ICC .10 ICC .20 ICC .30

Number of dyads n % n % n % n %

Model 1(Intercept variance)

50 78 39.0 54 27.0 14 7.0 5 2.5

100 59 29.5 49 24.5 5 2.5 0 0.0

200 54 27.0 26 13.0 0 0.0 0 0.0

500 29 14.5 2 1.0 0 0.0 0 0.0

1000 15 7.5 0 0.0 0 0.0 0 0.0

Model 2 (Slope variance)

50 115 57.5 113 56.5 111 55.5 114 57.0

100 114 57.0 99 49.5 98 49.0 109 54.5

200 113 56.5 104 52.0 89 44.5 115 57.5

500 118 59.0 98 49.0 81 40.5 114 57.0

1000 114 57.0 113 56.5 71 35.5 122 61.0

Page 26: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 26

Table 2

Sample Bias of Parameter Estimates for Intercepts, Slopes, and Random Effects.

  B0 B1   T00/ T11

Number

of dyads bias % bias % bias %

Model 1

ICC .05 50 0.00362 1.64 0.00773 3.23 0.07344 32.69

100 -0.00392 -2.37 -0.00303 -1.85 0.04269 28.18

200 -0.00264 -1.98 0.00559 4.65 0.02815 24.58

500 -0.00598 -7.74 0.00468 6.38 0.01518 18.29

1000 -0.00212 -3.95 0.00348 6.66 0.00807 12.83

ICC .10

50 0.00816 3.24 -0.00284 -1.15 0.05591 21.35

100 0.00295 1.72 0.00914 5.11 0.03022 16.57

200 -0.00323 -2.56 0.00455 3.88 0.01173 8.47

500 0.00036 0.48 0.00137 1.88 0.00517 5.39

1000 -0.00019 -0.36 0.00069 1.29 0.00003 0.042

ICC .20

50 -0.00239 -0.95 -0.00418 -1.84 0.02768 9.10

100 -0.00169 -0.92 -0.00303 -1.98 0.01978 9.45

200 0.00015 0.12 0.00092 0.82 0.00436 2.77

500 -0.00155 -1.95 -0.00139 -1.90 0.00073 0.67

1000 -0.00112 -1.96 0.00012 0.22 -0.00013 -0.18

ICC.30

50 -0.01604 -5.71 -0.00599 -2.59 0.01200 4.06

100 -0.00641 -3.55 -0.00080 -0.48 0.00815 3.73

200 -0.00095 -0.71 0.00173 1.46 -0.00006 -0.04

500 0.00012 0.14 -0.00043 -0.59 -0.00254 -2.58

1000 -0.00068 -1.19 -0.00069 -1.28 -0.00610 -7.80

Page 27: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 27

(Table 2 continued)

  B0 B1   T00/ T11

Number

of dyads bias % bias % bias %

Model 2    

ICC .05

50 0.00526 2.33 0.00525 2.36 0.07022 39.00

100 -0.00055 -0.34 0.00739 4.64 0.04671 39.15

200 -0.00260 -1.99 0.00350 2.83 0.02936 39.72

500 -0.00093 -1.15 0.00463 6.71 0.01955 39.34

1000 0.00116 2.17 0.00252 4.68 0.01204 36.18

ICC .10    

50 -0.00454 -1.90 0.00120 0.47 0.07712 35.78

100 0.00010 0.053 0.00670 4.03 0.04568 37.56

200 -0.00096 -0.79 0.00285 2.17 0.03518 39.37

500 0.00059 0.83 0.00083 1.10 0.01821 37.05

1000 -0.00986 2.24 0.00070 -0.26 0.07924 43.08

ICC .20  

50 -0.00986 -4.03 0.00070 0.29 0.07924 31.01

100 -0.00307 -1.79 0.00440 2.81 0.04561 35.51

200 -0.00433 -3.26 0.00068 0.56 0.03281 36.30

500 -0.00213 -2.65 0.00056 0.69 0.01817 34.42

1000 -0.00189 -3.51 0.00107 1.96 0.01169 33.12

ICC .30    

50 -0.00142 -5.06 -0.01358 -5.48 0.06667 35.31

100 0.00308 1.62 0.00168 1.01 0.04562 39.18

200 -0.00154 -1.25 0.00590 5.11 0.03341 40.27

500 -0.00180 -2.32 -0.00145 -2.22 0.01862 39.91

1000 0.00279 4.97 0.00199 3.80 0.01222 41.34

Page 28: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 28

Table 3

Sample Bias of Standard Error Estimates for Intercepts, Slopes, and Random Effects.

B0 B1 T00 /T11

Number

of dyads J Bias % Bias % Bias %

Model 1

ICC .05

50 122 0.01251 14.18 -0.00074 -0.78 0.05404 69.31

100 141 0.00390 5.91 0.00130 1.98 0.03617 65.96

200 146 -0.00401 -7.52 -0.00029 -0.60 0.02215 52.04

500 171 0.00032 1.03 0.00093 3.18 0.00876 27.48

1000 185 0.00047 2.18 0.00051 2.46 0.00408 16.56

ICC .10

50 146 0.00268 2.68 -0.00282 -2.85 0.03595 36.23

100 151 0.00367 5.37 -0.00356 -5.00 0.02359 33.42

200 174 -0.00018 -0.35 0.00100 2.13 0.01069 19.47

500 198 0.00085 2.77 0.00092 3.18 0.00283 7.41

1000 200 0.00156 7.54 -0.00002 -0.12 0.00129 4.67

ICC.20

50 186 0.00499 4.97 0.00325 3.60 0.01590 13.22

100 195 0.00076 1.04 0.00497 8.15 0.01161 14.05

200 200 -0.00038 -0.73 0.00183 4.07 0.00311 4.96

500 200 0.00108 3.41 0.00048 1.64 -0.00191 -4.41

1000 200 0.00047 2.09 -0.00001 -0.08 -0.00057 -1.90

ICC .30

50 195 -0.00355 -3.18 -0.00156 -1.70 0.01924 16.33

100 200 0.00490 6.82 -0.00163 -2.45 0.00987 11.33

200 200 0.00042 0.78 -0.00132 -2.80 0.00203 3.09

500 200 0.00053 1.57 0.00017 0.57 0.00309 7.88

1000 200 0.00120 5.27 -0.00098 -4.57 -0.00110 -3.49

Page 29: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 29

(Table 3 Continued )

B0 B1 T00 /T11

Number

of

dyads J Bias % Bias % Bias %

Model 2

ICC .05

50 122 0.00602 6.73 0.01743 19.86 0.04722 82.50

100 141 0.00172 2.63 0.00792 12.49 0.02738 72.39

200 146 -0.00464 -8.88 0.00057 1.16 0.01988 85.69

500 171 -0.00227 -7.01 0.00362 13.23 0.00963 61.29

1000 185 -0.00005 -0.24 0.00016 0.76 0.00614 55.50

ICC .10

50 146 0.00034 0.36 0.00363 3.56 0.03425 47.55

100 151 -0.00650 -8.81 0.00580 8.75 0.02711 68.43

200 174 -0.00060 -1.25 -0.00188 -3.57 0.01665 59.02

500 198 0.00190 6.73 0.00113 3.77 0.00997 61.80

1000 200 0.00159 7.53 0.00038 1.79 0.00779 74.53

ICC .20

50 186 -0.00275 -2.82 0.01020 10.75 0.01491 16.53

100 195 -0.00112 -1.63 0.00976 15.62 0.02539 58.93

200 200 -0.00535 -10.11 0.00199 4.15 0.01477 49.25

500 200 -0.00191 -5.97 -0.00078 -2.44 0.00851 47.41

1000 200 -0.00017 -0.77 0.00019 0.86 0.00612 50.15

ICC .30

50 195 -0.01690 -15.15 0.00434 4.40 0.03493 55.01

100 200 -0.00844 -11.16 0.00552 8.36 0.02891 78.35

200 200 -0.00134 -2.74 0.00394 8.57 0.01916 74.39

500 200 -0.00090 -2.91 0.00413 15.37 0.01179 80.77

1000 200 -0.00105 -4.73 0.00078 3.75 0.00851 94.93

Page 30: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 30

Footnotes

2 The notation for does not typically involve an exponent, although it is a variance rather than a

standard deviation.

3Although several of the studies mentioned here (e.g., van der Leeden & Busing, 1994) address

issues of the relative efficiency of different estimators, such as full maximum likelihood and

generalized least squares, we focus our discussion on findings that are applicable to REML, by far the

most commonly used estimator.

4 It is not possible to estimate both variances (or “random effects”) in hierarchical linear models,

because the number of random parameters that can be estimated is limited by the number of variance-

covariance matrix elements within each level-2 unit (Snijders & Bosker, 1999). The variance-

covariance matrix among level-1 units is given by:

where , and are the variance estimates for the intercept and slope and the covariance between

intercept and slope, respectively, and is the variance of the level-1 residual. and is the

predictor variable for the two observations in a dyad (Goldstein, 1999). The level-1 variance-

covariance matrix for dyadic data (i.e., 2 observations per group or nj = 2), given above, contains

or unique elements. A model containing a single level-1 predictor with

random effects estimated for the intercept and slope, such as that in the models tested here requires 4

Page 31: Running head: HIERARCHICAL LINEAR MODELING OF ...web.pdx.edu/~newsomj/mlrdyad4.doc · Web viewHierarchical linear models: Applications and data analysis methods. Newbury Park, CA:

Nonconvergence and sample bias 31

parameters to be estimated: , , , and . In general, parameters are

required for a model with all possible random variances and covariances. With dyadic data, a model

with a random intercept and slope results in 1 too many parameters to be estimated given the number

of covariance elements available, leading to identification problems.