L5 – Repeated Measures and Longitudinal Analyses using MIXED 1
Review of GLM Repeated / Intro to Longitudinal Analyses
(Done in SPSS because I believe you can’t do traditional repeated measures in R.)
The Chapter 5 data file is ch5hortest. The hor stands for horizontal. We’ll see what that means shortly.
test1, test2, test3: Scores on identical achievement tests taken at 3 different times with approximately equal intervals between tests.
effective is a dichotomous variable equal to 1 if the teacher is perceived to be effective and 0 if not.
courses ??? I don’t know what this represents.
female is a dichotomous variable equal to 1 if teacher is female?
ses is student SES as a Z-score.
ses_mean is mean SES of all students in a school.
courses_mean Mean of courses for a school. I don’t know what this represents.
Questions to ask (p. 142)
1. Is there change in average achievement across the 3 test periods?
2. If there is change, is it systematic as opposed to merely random?
Systematically increasing? Systematically decreasing?
3. If it’s systematic, what is the shape of the curve of change across time?
Linear? Quadratic? Cubic?
4. Is change related to student ses or teacher effectiveness?
The above are all basic questions that can be asked of repeated measures data.
All such questions can be answered using traditional repeated measures analyses.
As we’ll see, they can also be answered by treating repeated measures data as if they were multilevel data.
Analyses of the horizontal file data using the GLM Repeated Measures procedure, p. 151
(These kinds of analyses should be vaguely familiar to you from last semester.)
These analyses require the horizontal file, ch5hortest.sav.
I called the repeated measures factor, “time”.
I’ve not mentioned this before, but you can give the variable that is being measured across time periods a unique name.
Type that name into the “Measure Name:” field. Then click [Add].
Here it is called “test”. That is, I’m measuring something called “test” at each of three time periods.
Obviously, I could have named the measure time. But in a more elaborate dataset, several things might be measured at different time periods: test, height, depression, etc.
This says that what we’re measuring at the three time periods are test scores.
The data editor columns corresponding to the three times are specified to SPSS.
I recommend, as usual, that you ask for as many visual displays of the data as are practicable.
I recommend, as usual, that you ask for descriptive statistics, effect sizes, and observed power.
Since we have no covariates, the estimated means will be identical to the observed means.
The output of the analysis using traditional repeated measures.

GLM test1 test2 test3
  /WSFACTOR=time 3 Polynomial
  /MEASURE=test
  /METHOD=SSTYPE(3)
  /PLOT=PROFILE(time)
  /EMMEANS=TABLES(time)
  /PRINT=DESCRIPTIVE ETASQ OPOWER
  /CRITERIA=ALPHA(.05)
  /WSDESIGN=time.

General Linear Model
[DataSet1] G:\MdbT\P595C(Multilevel)\Multilevel and Longitudinal Modeling with IBM SPSS\Ch5Datasets&ModelSyntaxes\ch5hortest.sav

Within-Subjects Factors
Measure: test
time   Dependent Variable
1      test1
2      test2
3      test3
Descriptive Statistics
Mean Std. Deviation N
test1 48.6323 9.71254 8670
test2 53.1073 9.88757 8670
test3 57.0944 9.89402 8670
Multivariate Tests (c)
Effect: time
Test                 Value   F              Hypothesis df   Error df   Sig.   Partial Eta Squared   Noncent. Parameter   Observed Power (b)
Pillai's Trace       .359    2424.124 (a)   2.000           8668.000   .000   .359                  4848.248             1.000
Wilks' Lambda        .641    2424.124 (a)   2.000           8668.000   .000   .359                  4848.248             1.000
Hotelling's Trace    .559    2424.124 (a)   2.000           8668.000   .000   .359                  4848.248             1.000
Roy's Largest Root   .559    2424.124 (a)   2.000           8668.000   .000   .359                  4848.248             1.000
a. Exact statistic
b. Computed using alpha = .05
c. Design: Intercept. Within Subjects Design: time
One factor – time – one test result.
The results displayed in Multivariate Tests say that there are significant differences in mean test score across time periods.
Recall that the data must “pass” Mauchly’s test if we wish to use the Sphericity Assumed test below.
Mauchly's Test of Sphericity (b)
Measure: test
Within Subjects Effect   Mauchly's W   Approx. Chi-Square   df   Sig.   Epsilon (a): Greenhouse-Geisser   Huynh-Feldt   Lower-bound
time                     .977          206.113              2    .000   .977                              .977          .500
Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
a. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.
b. Design: Intercept. Within Subjects Design: time
The data failed Mauchly’s test, so we’ll ignore the Sphericity Assumed line in the table below.
Tests of Within-Subjects Effects
Measure: test
Source        Correction           Type III Sum of Squares   df          Mean Square   F          Sig.   Partial Eta Squared   Noncent. Parameter   Observed Power (a)
time          Sphericity Assumed   310760.337                2           155380.168    2581.645   .000   .229                  5163.290             1.000
              Greenhouse-Geisser   310760.337                1.954       159031.307    2581.645   .000   .229                  5044.748             1.000
              Huynh-Feldt          310760.337                1.955       158995.891    2581.645   .000   .229                  5045.871             1.000
              Lower-bound          310760.337                1.000       310760.337    2581.645   .000   .229                  2581.645             1.000
Error(time)   Sphericity Assumed   1043513.536               17338       60.186
              Greenhouse-Geisser   1043513.536               16939.944   61.601
              Huynh-Feldt          1043513.536               16943.717   61.587
              Lower-bound          1043513.536               8669.000    120.373
a. Computed using alpha = .05
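The corrected rows in this table can be reproduced by hand, which may help show what the corrections do. The sketch below is only an arithmetic check using numbers printed in the tables above; the variable names are mine, and because epsilon is printed to three decimals the corrected error df will not match SPSS exactly.

```r
# Values copied from the SPSS tables above
ss_time  <- 310760.337    # Type III SS for time
ss_error <- 1043513.536   # Type III SS for Error(time)
df_time  <- 2             # 3 time periods - 1
df_error <- 17338         # (8670 - 1) * 2
eps_gg   <- 0.977         # Greenhouse-Geisser epsilon (rounded in the output)

# The Greenhouse-Geisser correction multiplies both df by epsilon
gg_df_time  <- eps_gg * df_time
gg_df_error <- eps_gg * df_error

# The F ratio itself is unchanged by the correction
f_ratio <- (ss_time / df_time) / (ss_error / df_error)

# Partial eta squared = SS_effect / (SS_effect + SS_error)
partial_eta2 <- ss_time / (ss_time + ss_error)

print(c(gg_df_time = gg_df_time, gg_df_error = gg_df_error,
        F = f_ratio, partial_eta2 = partial_eta2))
```

Only the degrees of freedom used to look up the p value change from row to row; the F of 2581.645 is the same in every row.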
A nice feature of the GLM Repeated Measures analysis procedure is its automatic test of the shape of the curve of means across time periods. That test assumes that the time periods are equally spaced, however, so don’t rely on it if they are not. We don’t know for sure whether the time periods in this dataset are equally spaced. We’ll assume they are.
As we’ll see, if we want such tests in MIXED, we’ll have to create them using polynomials (ugh). But analyzing the shape using multilevel techniques WILL allow the time periods to be unequally spaced.
The test below tells us that the shape of the curve of means over the three time periods is not precisely linear, but curvilinear. Alas, it doesn’t tell us in this table, what the nature of the curve is – whether it is curved downward or curved upward.
Tests of Within-Subjects Contrasts
Measure: test
Source        time        Type III Sum of Squares   df     Mean Square   F          Sig.   Partial Eta Squared   Noncent. Parameter   Observed Power (a)
time          Linear      310416.221                1      310416.221    4801.239   .000   .356                  4801.239             1.000
              Quadratic   344.115                   1      344.115       6.176      .013   .001                  6.176                .700
Error(time)   Linear      560479.947                8669   64.653
              Quadratic   483033.589                8669   55.720
a. Computed using alpha = .05
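The direction of the bend can be recovered from the observed means. The sketch below assumes equally spaced periods and applies R's orthonormal polynomial contrasts, contr.poly(3), to the three means from the Descriptive Statistics table. A negative quadratic contrast means the middle mean sits above the straight line joining the endpoints, i.e., the curve bends downward. Multiplying each squared contrast by N should land close to the Type III sums of squares above, up to rounding of the printed means.

```r
# Observed means and N from the Descriptive Statistics table
means <- c(test1 = 48.6323, test2 = 53.1073, test3 = 57.0944)
n <- 8670

# Orthonormal polynomial contrasts for 3 equally spaced levels
cmat <- contr.poly(3)                  # column 1 = linear, column 2 = quadratic

lin_est  <- sum(cmat[, 1] * means)     # positive: means rise over time
quad_est <- sum(cmat[, 2] * means)     # negative: curve bends downward

# Sum of squares for each contrast = N * (contrast estimate)^2
ss_lin  <- n * lin_est^2
ss_quad <- n * quad_est^2

print(round(c(linear = lin_est, quadratic = quad_est,
              SS_linear = ss_lin, SS_quadratic = ss_quad), 3))
```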
Be careful!! There is no between-subjects factor, but this table will always be displayed. It’s simply telling us that the overall mean of all the scores is significantly different from 0.
Tests of Between-Subjects Effects
Measure: test
Transformed Variable: Average
Source      Type III Sum of Squares   df     Mean Square   F            Sig.   Partial Eta Squared   Noncent. Parameter   Observed Power (a)
Intercept   7.291E7                   1      7.291E7       429850.489   .000   .980                  429850.489           1.000
Error       1470402.182               8669   169.616
a. Computed using alpha = .05
Estimated Marginal Means
These are means of the dependent variable within each group computed assuming that all covariates had the same value.
Since there are no covariates, these means will be the observed means.
time
Measure: test
time   Mean   Std. Error   95% Confidence Interval: Lower Bound   Upper Bound
1 48.632 .104 48.428 48.837
2 53.107 .106 52.899 53.315
3 57.094 .106 56.886 57.303
Profile Plots
Note that even though the overall relationship looks very nearly linear, the quadratic component was significant, suggesting that the very slight downward bend in the curve is a significant one.
Recall that the sample size was 8000+, so even a very small real effect is likely to be statistically significant.
(Plot annotation: very slight downward curvature.)
Traditional repeated measures analyses with student / person factors.
1) Teaching effectiveness:
0 = teacher of the student not judged to be effective; 1 = teacher judged to be effective
2) Student SES as a continuously varying quantity potentially different for each student.
Enter the same responses as those entered when there were no between-subjects covariates.
I could have entered effective as a covariate, since it has only two values.
But I wanted to create graphs with separate lines for each value of effective. For that, it has to be a factor.
Note that since we have a continuous covariate in the model, it makes sense to get the parameters of the equation corresponding to that covariate. That’s why I checked the “Parameter Estimates” box.
GLM test1 test2 test3 BY effective WITH ses
  /WSFACTOR=time 3 Polynomial
  /MEASURE=test
  /METHOD=SSTYPE(3)
  /PLOT=PROFILE(time time*effective)
  /EMMEANS=TABLES(time) WITH(ses=MEAN)
  /EMMEANS=TABLES(effective*time) WITH(ses=MEAN)
  /PRINT=DESCRIPTIVE ETASQ OPOWER PARAMETER
  /CRITERIA=ALPHA(.05)
  /WSDESIGN=time
  /DESIGN=ses effective.

General Linear Model Output
[DataSet1] G:\MdbT\P595C(Multilevel)\Multilevel and Longitudinal Modeling with IBM SPSS\Ch5Datasets&ModelSyntaxes\ch5hortest.sav

Within-Subjects Factors
Measure: test
time   Dependent Variable
1      test1
2      test2
3      test3
Between-Subjects Factors
effective (Teacher effectiveness)   N
.00                                 3901
1.00                                4769

Descriptive Statistics
        effective (Teacher effectiveness)   Mean      Std. Deviation   N
test1   .00                                 46.9255   12.14551         3901
        1.00                                50.0284   6.82068          4769
        Total                               48.6323   9.71254          8670
test2   .00                                 50.5716   12.39563         3901
        1.00                                55.1815   6.51977          4769
        Total                               53.1073   9.88757          8670
test3   .00                                 51.7330   10.02157         3901
        1.00                                61.4799   7.28562          4769
        Total                               57.0944   9.89402          8670
Multivariate Tests (c)
Effect             Test                 Value   F              Hypothesis df   Error df   Sig.   Partial Eta Squared   Noncent. Parameter   Observed Power (b)
time               Pillai's Trace       .360    2434.511 (a)   2.000           8666.000   .000   .360                  4869.021             1.000
                   Wilks' Lambda        .640    2434.511 (a)   2.000           8666.000   .000   .360                  4869.021             1.000
                   Hotelling's Trace    .562    2434.511 (a)   2.000           8666.000   .000   .360                  4869.021             1.000
                   Roy's Largest Root   .562    2434.511 (a)   2.000           8666.000   .000   .360                  4869.021             1.000
time * ses         Pillai's Trace       .001    2.196 (a)      2.000           8666.000   .111   .001                  4.392                .451
                   Wilks' Lambda        .999    2.196 (a)      2.000           8666.000   .111   .001                  4.392                .451
                   Hotelling's Trace    .001    2.196 (a)      2.000           8666.000   .111   .001                  4.392                .451
                   Roy's Largest Root   .001    2.196 (a)      2.000           8666.000   .111   .001                  4.392                .451
time * effective   Pillai's Trace       .104    502.407 (a)    2.000           8666.000   .000   .104                  1004.814             1.000
                   Wilks' Lambda        .896    502.407 (a)    2.000           8666.000   .000   .104                  1004.814             1.000
                   Hotelling's Trace    .116    502.407 (a)    2.000           8666.000   .000   .104                  1004.814             1.000
                   Roy's Largest Root   .116    502.407 (a)    2.000           8666.000   .000   .104                  1004.814             1.000
a. Exact statistic
b. Computed using alpha = .05
c. Design: Intercept + ses + effective. Within Subjects Design: time
Time: There were significant differences in mean test scores across the three time periods.
Note that SPSS automatically tested two interactions: the time*ses interaction and the time*effective interaction. Thanks, SPSS. Remember that interactions are moderation. We’re asking, does ses moderate the increase in test scores over time? And does effective moderate the increase in test scores over time?
Time*ses: Since it’s not significant, we have no evidence that change across time differed between low ses kids and high ses kids. No moderation detected.
Time*effective: Since it is significant, it tells us that the difference in mean test scores across times was different for kids with less effective teachers than it was for kids with more effective teachers. That is, effective moderates the Test~Time relationship.
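To see what the time*effective interaction looks like in the raw numbers, the sketch below simply differences the group means from the Descriptive Statistics table. It is descriptive arithmetic only, not a significance test, and the object names are mine.

```r
# Group means copied from the Descriptive Statistics table
means <- rbind(
  ineffective = c(test1 = 46.9255, test2 = 50.5716, test3 = 51.7330),
  effective   = c(test1 = 50.0284, test2 = 55.1815, test3 = 61.4799)
)

# Gain within each interval, per group
gains <- means[, 2:3] - means[, 1:2]
colnames(gains) <- c("t1_to_t2", "t2_to_t3")

# Total gain from test1 to test3, per group
total_gain <- means[, "test3"] - means[, "test1"]

print(round(gains, 4))
print(round(total_gain, 4))
```

The group with effective teachers gains about 11.5 points overall against about 4.8 for the other group, and its gain grows in the second interval while the other group's gain shrinks. That difference in change is what the interaction test is picking up.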
The data have to pass Mauchly’s test of sphericity in order for us to be able to interpret the Sphericity Assumed line below.
Mauchly's Test of Sphericity (b)
Measure: test
Within Subjects Effect   Mauchly's W   Approx. Chi-Square   df   Sig.   Epsilon (a): Greenhouse-Geisser   Huynh-Feldt   Lower-bound
time                     .969          269.218              2    .000   .970                              .971          .500
Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
a. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.
b. Design: Intercept + ses + effective. Within Subjects Design: time
Sphericity did not hold (we failed Mauchly’s test), so we must ignore the “Sphericity Assumed” results in the next table.
(You must ignore them. Don’t mess with the God of Statistics!!)
Tests of Within-Subjects Effects
Measure: test
Source             Correction           Type III Sum of Squares   df          Mean Square   F          Sig.   Partial Eta Squared   Noncent. Parameter   Observed Power (a)
time               Sphericity Assumed   284416.583                2           142208.292    2486.938   .000   .223                  4973.875             1.000
                   Greenhouse-Geisser   284416.583                1.941       146558.224    2486.938   .000   .223                  4826.248             1.000
                   Huynh-Feldt          284416.583                1.942       146492.120    2486.938   .000   .223                  4828.425             1.000
                   Lower-bound          284416.583                1.000       284416.583    2486.938   .000   .223                  2486.938             1.000
time * ses         Sphericity Assumed   245.937                   2           122.969       2.150      .116   .000                  4.301                .443
                   Greenhouse-Geisser   245.937                   1.941       126.730       2.150      .118   .000                  4.173                .436
                   Huynh-Feldt          245.937                   1.942       126.673       2.150      .118   .000                  4.175                .436
                   Lower-bound          245.937                   1.000       245.937       2.150      .143   .000                  2.150                .311
time * effective   Sphericity Assumed   52072.332                 2           26036.166     455.320    .000   .050                  910.641              1.000
                   Greenhouse-Geisser   52072.332                 1.941       26832.572     455.320    .000   .050                  883.612              1.000
                   Huynh-Feldt          52072.332                 1.942       26820.470     455.320    .000   .050                  884.011              1.000
                   Lower-bound          52072.332                 1.000       52072.332     455.320    .000   .050                  455.320              1.000
Error(time)        Sphericity Assumed   991194.398                17334       57.182
                   Greenhouse-Geisser   991194.398                16819.517   58.931
                   Huynh-Feldt          991194.398                16827.107   58.905
                   Lower-bound          991194.398                8667.000    114.364
a. Computed using alpha = .05
The conclusions here are the same as those from the Multivariate Tests table.
1) There is an effect of time; 2) there is no time*ses interaction; and 3) there is a time*effective interaction.
So mean test performance changes across time, and it changes in a different way for kids with effective teachers than it does for kids with ineffective teachers.
The kid’s ses does not affect how his/her test scores change over time.
What about the effect of ses on test scores overall? That answer is in the Test of Between Subjects Effects below.
But first, the nature of the changes over time . . . linear? Quadratic?
Tests of Within-Subjects Contrasts
Measure: test
Source             time        Type III Sum of Squares   df     Mean Square   F          Sig.   Partial Eta Squared   Noncent. Parameter   Observed Power (a)
time               Linear      283779.090                1      283779.090    4795.547   .000   .356                  4795.547             1.000
                   Quadratic   637.493                   1      637.493       11.551     .001   .001                  11.551               .925
time * ses         Linear      244.668                   1      244.668       4.135      .042   .000                  4.135                .529
                   Quadratic   1.270                     1      1.270         .023       .879   .000                  .023                 .053
time * effective   Linear      47359.971                 1      47359.971     800.330    .000   .085                  800.330              1.000
                   Quadratic   4712.361                  1      4712.361      85.386     .000   .010                  85.386               1.000
Error(time)        Linear      512874.459                8667   59.176
                   Quadratic   478319.939                8667   55.189
a. Computed using alpha = .05
time linear: The overall mean test scores changed in a linear fashion over time.
time quadratic: The overall curve relating mean test scores to time was curved, not a straight line.
time*effective linear: The slopes of the curves relating test scores to time were different for the two effectiveness levels.
time*effective quadratic: The shapes of the curves relating test scores to time were different for the two effectiveness levels.
Wow! These data are getting interesting.
OK, OK, what about the overall effect of ses? Of effective? The Tests of Between Subjects Effects are below.
Tests of Between-Subjects Effects
Measure: test
Transformed Variable: Average
Source      Type III Sum of Squares   df     Mean Square   F            Sig.   Partial Eta Squared   Noncent. Parameter   Observed Power (a)
Intercept   7.122E7                   1      7.122E7       492920.714   .000   .983                  492920.714           1.000
ses         91.225                    1      91.225        .631         .427   .000                  .631                 .125
effective   218038.615                1      218038.615    1509.048     .000   .148                  1509.048             1.000
Error       1252273.450               8667   144.488
a. Computed using alpha = .05
The dependent variable in these tests of Between-Subjects Effects is the mean across all levels of the repeated measures variable – the mean across the three time periods.
The mean scores across the 3 times were not related to student SES values (p = .427). (Hmm, this is different from the result with the ch3 data.)
The average scores across the 3 times were related to teaching effectiveness.
High effective: Average of test1, test2, test3 is large
Low effective: Average of test1, test2, test3 is low
The following are reported because of the presence of a covariate, ses, and because I asked for Parameter Estimates.
At each time period, test scores were regressed onto SES. The relationship was NS for each regression.
Parameter Estimates
Dependent Variable   Parameter          B        Std. Error   t         Sig.   95% CI Lower Bound   95% CI Upper Bound   Partial Eta Squared   Noncent. Parameter   Observed Power (a)
test1                Intercept          50.020   .139         360.028   .000   49.748               50.293               .937                  360.028              1.000
                     ses                .221     .132         1.681     .093   -.037                .479                 .000                  1.681                .390
                     [effective=.00]    -3.103   .207         -14.990   .000   -3.509               -2.697               .025                  14.990               1.000
                     [effective=1.00]   0 (b)    .            .         .      .                    .                    .                     .                    .
test2                Intercept          55.178   .139         395.910   .000   54.905               55.451               .948                  395.910              1.000
                     ses                .088     .132         .669      .504   -.171                .347                 .000                  .669                 .103
                     [effective=.00]    -4.610   .208         -22.201   .000   -5.017               -4.203               .054                  22.201               1.000
                     [effective=1.00]   0 (b)    .            .         .      .                    .                    .                     .                    .
test3                Intercept          61.483   .125         491.978   .000   61.238               61.728               .965                  491.978              1.000
                     ses                -.082    .118         -.696     .486   -.315                .150                 .000                  .696                 .107
                     [effective=.00]    -9.747   .186         -52.348   .000   -10.112              -9.382               .240                  52.348               1.000
                     [effective=1.00]   0 (b)    .            .         .      .                    .                    .                     .                    .
a. Computed using alpha = .05
b. This parameter is set to zero because it is redundant.
So, the average of all three test scores was not related to ses, as shown above in the Tests of Between-Subjects Effects.
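This table adds to that by telling us that none of the individual test scores was related to ses: neither test1, nor test2, nor test3, as shown here.
Because SPSS was told that effective is a factor, not a continuous variable, it has no single slope. Its effect appears in the [effective=.00] rows, which give the difference between the less effective and more effective groups at each time, with effective=1.00 as the reference category.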
Estimated Marginal Means
1. time
Measure:test
time Mean Std. Error
95% Confidence Interval
Lower Bound Upper Bound
1 48.477a .103 48.274 48.680
2 52.877a .104 52.673 53.080
3 56.606a .093 56.424 56.789
a. Covariates appearing in the model are evaluated at the following values:
ses = .0370.
The estimated marginal means at each time are computed as if all participants had the same SES score, the average of all SES scores, .0370.
These could be referred to as mean test scores adjusted for ses.
2. Teacher effectiveness * time
Measure: test
Teacher effectiveness   time   Mean         Std. Error   95% CI Lower Bound   95% CI Upper Bound
.00                     1      46.926 (a)   .154         46.625               47.226
                        2      50.572 (a)   .154         50.270               50.873
                        3      51.733 (a)   .138         51.462               52.004
1.00                    1      50.028 (a)   .139         49.756               50.301
                        2      55.182 (a)   .139         54.908               55.455
                        3      61.480 (a)   .125         61.235               61.725
a. Covariates appearing in the model are evaluated at the following values: ses = .0370.
The estimated marginal means at each combination of effectiveness level and time period are computed as if all participants had the same SES score, .0370.
These could be referred to as test means at each combination of time period and effectiveness level adjusted for ses.
Since ses was not significant, these are virtually identical to the observed means reported several pages ago. They may not be in cases when the covariate is significantly related to the dependent variable.
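The estimated marginal means can be rebuilt from the Parameter Estimates table, which may make "adjusted for ses" more concrete. The sketch below plugs the printed coefficients into Intercept + B_ses * .0370 (+ the [effective=.00] coefficient for the less effective group). The time-only means then appear to be the simple unweighted average of the two group means, which would explain why they differ slightly from the observed overall means. Object names are mine, and agreement is only to about the third decimal because the coefficients are rounded.

```r
# Coefficients copied from the Parameter Estimates table (test1, test2, test3)
b_intercept <- c(50.020, 55.178, 61.483)   # effective = 1 is the reference group
b_ses       <- c(0.221, 0.088, -0.082)
b_ineff     <- c(-3.103, -4.610, -9.747)   # [effective=.00] rows
ses_mean    <- 0.0370                       # value at which ses is evaluated

# Estimated marginal means per group, holding ses at its mean
emm_effective   <- b_intercept + b_ses * ses_mean
emm_ineffective <- emm_effective + b_ineff

# Time-only marginal means: unweighted average over the two groups
emm_time <- (emm_effective + emm_ineffective) / 2

print(round(rbind(emm_ineffective, emm_effective, emm_time), 3))
```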
Profile Plots
Overall plot across effective groups.
The graph below explains the slight downward bend in the overall curve.
Students with effective teachers actually gained momentum from the 2nd to the 3rd test.
Students with ineffective teachers were not able to continue upward at the same rate – they began falling behind even more.
Note that students with less effective teachers did more poorly from the beginning. Presumably, lack of effectiveness affected performance on the first test.
Obviously, these results could have huge policy implications.
Overall Summary . . .
The SPSS GLM Repeated Measures procedure does a great job of giving you valuable information concerning your data.
I strongly recommend that you analyze your data with this procedure, if it is appropriate for your data.
Let’s see how well the MIXED procedure can provide similar answers.
Longitudinal Analysis Using MIXED I, 2E p. 191 ff
We’ve analyzed these data using the traditional repeated measures techniques. Now we’ll see how the same data can be analyzed using the multilevel MIXED procedure.
Horizontal vs. Vertical arrangement
Traditional repeated measures analyses require that the repeated measures occupy different columns of the data editor. This arrangement will be called the horizontal arrangement. Here are the first few cases of the Ch 5 data arranged horizontally . . .
The multilevel R procedures and the SPSS MIXED procedure, on the other hand, require that each repeated measure occupy a different row of the data editor.
This will be called a vertical arrangement.
It’s also called person.period or ppt arrangement.
Here are the Ch5 data arranged vertically.
(Some other variables, e.g., time, have been added. More on those later.)
Major points about going from horizontal to vertical . . .
1. Each time period is a row in the vertical arrangement.
2. Values which changed from time to time from column to column in the same row now change from row to row in the same column.
3. Values which were constant across time are copied from row to row.
The major differences between the “hor” and “ver” versions of the data are that
1) test1, test2, and test3 have been replaced by a single column, test, with the values of test1, test2, and test3 placed on successive lines of the data file.
2) All variables that applied to the person whose values did not change from one time period to the next, e.g., effective and ses, were copied downwards so they appear at each time period.
3) a variable called time, with values 0, 1, and 2 for each of the three successive lines of data has been added.
How do we get from one form to the other? Later. Trust me.
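In the meantime, here is a minimal sketch of the horizontal-to-vertical conversion using base R's reshape(). The two-person data frame is made up purely for illustration (it is not the Chapter 5 data), and this is only one of several ways to do the conversion, not necessarily the one used to build the course files.

```r
# Hypothetical horizontal (wide) data: one row per person
wide <- data.frame(
  id        = 1:2,
  test1     = c(40, 50),
  test2     = c(45, 52),
  test3     = c(51, 58),
  effective = c(0, 1),
  ses       = c(-0.5, 0.3)
)

# Stack test1..test3 into a single column, test, and add time = 0, 1, 2
long <- reshape(
  wide,
  direction = "long",
  varying   = c("test1", "test2", "test3"),
  v.names   = "test",
  timevar   = "time",
  times     = 0:2,
  idvar     = "id"
)

# Sort so each person's three rows sit together, as in the vertical file
long <- long[order(long$id, long$time), ]
rownames(long) <- NULL
print(long)
```

Each person now has three rows, test holds the stacked scores, and effective and ses are copied down the rows, matching the three points listed above.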
Longitudinal Analyses as Multilevel analyses
Level 1 data:
The individual observations from a given person at different time periods are the Level 1 Y values.
The times at which the observations were taken are the X values. So, invariably, time is the level 1 predictor.
Characteristics that vary from one time to the next are Level 1 characteristics. We won’t actually consider such characteristics here.
Level 1 Equation
Y = Intercept + Slope * Time + Residual
Level 2 data:
People are the Level 2 entities – analogous to groups in cross-sectional analysis. Each person gives us a group of scores.
Characteristics of people – their ses, sex, etc, are Level 2 characteristics
Typically, these characteristics are assumed to affect the intercepts of the equations relating Y to X (time) and the slopes of the same equations.
New Notation (ARGH!!!)
In this chapter, Heck et al, 2E p 187, introduce a new set of symbols for old concepts. So does everyone else, so we can’t blame just Heck and his co-authors.
p replaces B in the Level 1 equations.
B replaces g in the Level 2 models of intercept and slope.
I was not responsible for this. The reason probably involves different traditions doing the same analyses.
The cross-sectional tradition used the B + g notation.
The longitudinal tradition uses the p + B notation.
So, for a basic model with a Level 1 predictor (time), a Level 2 Intercept predictor (person characteristic) and a Level 2 slope predictor (person characteristic). . .
(t=time; i=person) or (i=observation at a time period; j=person)
Level 1 Model
$Y_{ti} = p_{0i} + p_{1i} a_{ti} + e_{ti}$
(a is time or a time-related characteristic; it’s usually just time)
Level 2 Model of Intercept
$p_{0i} = B_{00} + B_{01} \cdot \text{Personcharacteristic}_i + u_{0i}$
Level 2 Model of Slope
$p_{1i} = B_{10} + B_{11} \cdot \text{Personcharacteristic}_i + u_{1i}$
Combined
$Y_{ti} = B_{00} + B_{01} \cdot \text{Personcharacteristic}_i + u_{0i} + (B_{10} + B_{11} \cdot \text{Personcharacteristic}_i + u_{1i}) a_{ti} + e_{ti}$
Really!!??
A simple random intercepts model of Ch5Vertical.
Linear change over time with random intercepts.
Level 1 Model: Y = Intercept + slope * time + error
In symbols: $Y_{ti} = p_{0i} + p_{1i} \cdot \text{time}_t + e_{ti}$
Level 2 Model of intercept:
Intercept = constant + random deviate
In symbols: $p_{0i} = B_{00} + u_{0i}$
Level 2 Model of slope:
Slope = constant
In symbols: $p_{1i} = B_{10}$
Combined model:
Y = constant + random deviate + slope*time + error
$Y_{ti} = B_{00} + u_{0i} + B_{10} \cdot \text{time}_t + e_{ti}$
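One way to get a feel for the combined model is to generate data from it and check that lme recovers the pieces. The sketch below is a simulation under assumed values: the constant 48.7, slope 4.2, and the two standard deviations are illustrative choices of mine, and the data are artificial, not the Chapter 5 file.

```r
library(nlme)   # provides lme()

set.seed(1)
n_id <- 500                               # hypothetical number of persons
id   <- rep(seq_len(n_id), each = 3)      # three rows per person (vertical layout)
time <- rep(0:2, times = n_id)

b00 <- 48.7                               # constant (assumed value)
b10 <- 4.2                                # common slope (assumed value)
u0  <- rnorm(n_id, mean = 0, sd = 6)      # random deviate, one per person
e   <- rnorm(3 * n_id, mean = 0, sd = 7.8)  # residual, one per observation

# Combined model: Y = constant + random deviate + slope*time + error
test <- b00 + u0[id] + b10 * time + e
sim  <- data.frame(id, time, test)

# Random intercepts only: the slope is fixed across persons
fit <- lme(fixed = test ~ time, random = ~ 1 | id, data = sim)
print(fixef(fit))     # should be near 48.7 and 4.2
print(VarCorr(fit))   # StdDev column should be near 6 and 7.8
```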
R code and SPSS code to apply this model follow
R application of ch5Vertical data (more on creating vertical data later)
Is there Linear change over time with random intercepts?
R -> Rcmdr -> nlme -> import data -> ch5growthdata-vertical
> ch5a <- lme(fixed = test ~ time, random = ~1|id, data=ch5vert)
> summary(ch5a)
Linear mixed-effects model fit by REML
 Data: ch5vert
       AIC      BIC    logLik
  189388.4 189421.1 -94690.22

Random effects:
 Formula: ~1 | id
        (Intercept) Residual
StdDev:    6.039054 7.759078

Fixed effects: test ~ time
               Value  Std.Error    DF  t-value p-value
(Intercept) 48.71361 0.09996515 17339 487.3059       0
time         4.23105 0.05892309 17339  71.8062       0
 Correlation:
     (Intr)
time -0.589

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-4.99170789 -0.35203890 -0.06226746  0.30966008  6.64026936

Number of Observations: 26010
Number of Groups: 8670
I discovered a new R command, intervals, that will display confidence intervals for all estimates. We can use these to determine if the estimates are significantly different from 0.

> intervals(ch5a)
Approximate 95% confidence intervals

 Fixed effects:
               lower      est.    upper
(Intercept) 48.51767 48.713608 48.90955
time         4.11555  4.231045  4.34654
attr(,"label")
[1] "Fixed effects:"

 Random Effects:
  Level: id
                  lower     est.    upper
sd((Intercept)) 5.89741 6.039054 6.184101

 Within-group standard error:
   lower     est.    upper
7.677904 7.759078 7.841110

Same Analysis in SPSS
Syntax
MIXED test WITH time
  /FIXED=time | SSTYPE(3)
  /METHOD=REML
  /PRINT=G SOLUTION TESTCOV
  /RANDOM=INTERCEPT | SUBJECT(id) COVTYPE(UN)
  /REPEATED=time | SUBJECT(id) COVTYPE(ID).     <--- Note the new line.
Mixed Model Analysis
Fixed Effects
Estimates of Fixed Effects (a)
Parameter   Estimate    Std. Error   df          t         Sig.   95% CI Lower Bound   95% CI Upper Bound
Intercept   48.713608   .099965      17830.208   487.304   .000   48.517666            48.909550
time        4.231045    .058923      17339.000   71.806    .000   4.115550             4.346540
a. Dependent Variable: test.

Covariance Parameters
Estimates of Covariance Parameters (a)
Parameter                             Estimate    Std. Error   Wald Z   Sig.   95% CI Lower Bound   95% CI Upper Bound
Repeated Measures          Variance   60.202875   .646577      93.110   .000   58.948852            61.483575
Intercept [subject = id]   Variance   36.471083   .885401      41.192   .000   34.776367            38.248386
a. Dependent Variable: test.
Note that while the fixed parameter estimates in nlme and SPSS are identical, there are small differences in the variance parameters. More on this later.
Squaring the nlme standard deviations gives the corresponding variances:
6.039054^2 = 36.47017
7.759078^2 = 60.20329
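The comparison between the two programs can be checked directly from the printed numbers. The sketch below squares the nlme standard deviations, lines them up against the SPSS variances, and rebuilds the 95% interval for the time slope as estimate plus or minus a t quantile times the standard error. I am assuming the t quantile uses the DF shown in the summary; the object names are mine.

```r
# nlme reports standard deviations; SPSS MIXED reports variances
nlme_sd  <- c(intercept = 6.039054,  residual = 7.759078)
spss_var <- c(intercept = 36.471083, residual = 60.202875)

nlme_var <- nlme_sd^2
print(round(rbind(nlme_var, spss_var, difference = nlme_var - spss_var), 5))

# 95% confidence interval for the time slope from its estimate and SE
est <- 4.231045
se  <- 0.05892309
df  <- 17339
ci  <- est + c(-1, 1) * qt(0.975, df) * se
print(round(ci, 5))
```

The variances agree to about three decimals, and the rebuilt interval matches both the intervals() output and the SPSS bounds.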
2nd Model 1 of Ch5Vertical.
Linear change over time with random intercepts and random slopes.
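Level 1 Model: Y = Intercept + slope * time + error
In symbols: $Y_{ti} = p_{0i} + p_{1i} \cdot \text{time}_t + e_{ti}$
Level 2 Model of intercept:
Intercept = constant + random deviate
In symbols: $p_{0i} = B_{00} + u_{0i}$
Level 2 Model of slope:
Slope = constant + random deviate
In symbols: $p_{1i} = B_{10} + u_{1i}$
Combined model:
Y = constant + deviate + (constant+deviate)*time + error
$Y_{ti} = B_{00} + u_{0i} + (B_{10} + u_{1i}) \cdot \text{time}_t + e_{ti}$
Multiplying through by time . . .
$Y_{ti} = B_{00} + u_{0i} + B_{10} \cdot \text{time}_t + u_{1i} \cdot \text{time}_t + e_{ti}$
where $p_{0i} = B_{00} + u_{0i}$ is the person's intercept and $p_{1i} = B_{10} + u_{1i}$ is the person's slope.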
R application of ch5Vertical data
Linear change over time with random intercepts and slopes
R -> Rcmdr -> nlme -> import data -> ch5growthdata-vertical
Note that I’ve unchecked the “Convert value labels to factor levels” box. I’ve found that especially useful when working with labeled dichotomous variables, which are to be treated as covariates.
The call to lme is nearly identical to the call for the random intercepts only model. The only change is that “random = ~time|id” is substituted for “random = ~1|id” to tell lme that the slope with respect to time is to vary randomly from person to person.
Recall that lme assumes that the intercepts will vary randomly whenever slopes vary randomly.
From Rcmdr, importing the data
> library(foreign)   # read.spss comes from the foreign package
> ch5vert <-
+   read.spss("G:/MDBO/html2/p5520/p5520 Data/ch5growthdata-vertical.sav",
+   use.value.labels=FALSE, max.value.labels=Inf, to.data.frame=TRUE)
> colnames(ch5vert) <- tolower(colnames(ch5vert))   # lower-case all variable names

Call to lme
> library(nlme)      # lme comes from the nlme package
> ch5b <- lme(fixed = test ~ time, random = ~time|id, data=ch5vert)
> summary(ch5b)
Linear mixed-effects model fit by REML
 Data: ch5vert
       AIC      BIC    logLik
  189295.6 189344.6 -94641.81

Random effects:
 Formula: ~time | id
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev   Corr
(Intercept) 5.997067 (Intr)
time        2.109559 -0.097
Residual    7.466786

Fixed effects: test ~ time
               Value  Std.Error    DF  t-value p-value
(Intercept) 48.71361 0.09750377 17339 499.6074       0
time         4.23105 0.06106200 17339  69.2910       0
 Correlation:
     (Intr)
time -0.564

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-4.46673565 -0.33153341 -0.06519042  0.29046169  6.37548055

Number of Observations: 26010
Number of Groups: 8670
Recall that the Corr value in the Random effects block (-0.097) is the correlation between random intercept deviations and random slope deviations.
> intervals(ch5b)
Approximate 95% confidence intervals

 Fixed effects:
                lower      est.     upper
(Intercept) 48.522490 48.713608 48.904725
time         4.111357  4.231045  4.350733
attr(,"label")
[1] "Fixed effects:"

 Random Effects:
  Level: id
                           lower        est.       upper
sd((Intercept))        5.8487060  5.99706733  6.14919202
sd(time)               1.9143223  2.10955860  2.32470650
cor((Intercept),time) -0.1161948 -0.09725023 -0.07823498

 Within-group standard error:
   lower     est.    upper
7.371448 7.466786 7.563357
(Residual variance = 7.466786^2 = 55.752893)

SPSS Analysis
MIXED test WITH time
  /FIXED=time | SSTYPE(3)
  /METHOD=REML
  /PRINT=G SOLUTION TESTCOV
  /RANDOM=INTERCEPT time | SUBJECT(id) COVTYPE(UNR)
  /REPEATED=time | SUBJECT(id) COVTYPE(ID).
Estimates of Fixed Effects (a)
Parameter   Estimate    Std. Error   df         t         Sig.   95% CI Lower Bound   95% CI Upper Bound
Intercept   48.713608   .097504      8669.000   499.608   .000   48.522477            48.904738
time        4.231045    .061062      8669.000   69.291    .000   4.111349             4.350741
a. Dependent Variable: test.

Estimates of Covariance Parameters (a)
Parameter                                     Estimate    Std. Error   Wald Z   Sig.   95% CI Lower Bound   95% CI Upper Bound
Repeated Measures                 Variance    55.752907   .846785      65.841   .000   54.117699            57.437524
Intercept + time [subject = id]   Var(1)      35.964739   1.437137     25.025   .000   33.255481            38.894715
                                  Var(2)      4.450225    .648347      6.864    .000   3.344812             5.920961
                                  Corr(2,1)   -.097247    .053412      -1.821   .069   -.200487             .008130
a. Dependent Variable: test.
Note that while the fixed effects estimates are identical, there are small differences in the random effects variances.
Note that the correlation between random intercepts and random slopes is negligible, at -0.097.
Variances from squaring the nlme standard deviations (added by me):
5.99706733^2 = 35.96482
2.10955860^2 = 4.450237
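A complementary check on whether the random slopes were worth adding is to compare the two REML fits, which share the same fixed effects (test ~ time). The sketch below uses only the logLik values printed in the two summaries. In nlme the same comparison is available as anova(ch5a, ch5b). The df of 2 counts the added slope variance and the intercept-slope correlation. Because a variance cannot be negative, the plain chi-square p value is conservative, so treat this as a rough guide rather than an exact test.

```r
# REML log-likelihoods printed by summary(ch5a) and summary(ch5b)
loglik_intercepts_only <- -94690.22
loglik_random_slopes   <- -94641.81

# Likelihood-ratio statistic for adding the slope variance and the
# intercept-slope correlation (2 extra covariance parameters)
lrt     <- 2 * (loglik_random_slopes - loglik_intercepts_only)
p_value <- pchisq(lrt, df = 2, lower.tail = FALSE)

print(c(LRT = lrt, p = p_value))
```

The statistic is about 96.8 on 2 df, which agrees with the SPSS Wald test in pointing to real person-to-person variation in slopes.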
What have we found so far?
When we assume that intercepts (initial math scores) of students may vary randomly and that slopes of the change in scores across times may vary randomly, we find that test scores increase significantly across the three time periods, by an average of about 4.2 points per time period. Note that the traditional repeated measures analysis did not give us this average change per time period.
We find that there is significant random variability in residuals of the individual test values from the predicted values based on time alone, leaving room for more predictors. The traditional analysis did not tell us about that variability.
We find that the variation in intercepts is significantly greater than 0, as is the variation in slopes. Neither of these could have been discovered in the traditional analysis.
We find little correlation between random variation in intercepts and random variation in slopes. The traditional analysis did not mention anything about that.
NOTHING ABOUT QUADRATIC CHANGE OR EFFECTIVE OR SES BECAUSE THESE ARE LEVEL 2 SYSTEMATIC EFFECTS THAT WERE NOT INCLUDED IN THE ABOVE ANALYSES.
More on that in the next lecture.