4. hlm
5/20/2018 4. HLM
Doing HLM using SAS PROC MIXED
Kaz
www.estat.us
Feb 2005
-
My points
(1) Easy to compare HLM with models that are not HLM, which is helpful. This is because PROC MIXED lets you run models that are not HLM.
(2) Easy to understand what makes HLM HLM. In SAS, what is not essential to HLM is done outside PROC MIXED (e.g., centering).
-
OLS vs. HLM in PROC MIXED
The difference is a RANDOM statement.

OLS regression syntax
PROC MIXED;
  MODEL Y = X;
RUN;

HLM syntax
PROC MIXED;
  MODEL Y = X;
  RANDOM intercept X / subject=school;
RUN;

(1) OLS
Y_jk = b0 + error_jk

(2) HLM
Level1: Y_jk = b0 + error_jk
Level2: b0 = g0 + error_k
or
Y_jk = g0 + error_jk + error_k
-
Again, turning a simple linear model into HLM

(1)
PROC MIXED;
  MODEL Y = X W;
RUN;

(2)
PROC MIXED;
  MODEL Y = X W;
  RANDOM intercept X W / subject=GroupID;
RUN;

(3) The RANDOM statement in (2) reads: "I request that the intercept, as well as the effects of X and W, be estimated for each subject, which can be identified by GroupID."
-
How to write SAS PROC MIXED syntax: Intuitive way
(1) Write all the variable names in the MODEL statement.
  model Y = X W;
(2) Decide which variables' effects you want to estimate by school.
  random intercept X W / subject=school;
-
More careful way
1. Start from a level-specific specification, e.g.,
   level1: y = b0 + b1*X + error_ij
   level2: b0 = g00 + g01*W + error_0j
   level2: b1 = g10 + g11*W + error_1j
2. Insert the level-2 equations into the level-1 equation.
3. Write the variable names involved in the MODEL statement.
4. Find the random components (written in Roman letters).
   RULE1: Put "intercept" in the RANDOM statement to accommodate higher-level errors.
   RULE2: If a variable name sits right next to a level-2 error with an asterisk (e.g., X*level-2 error), put that variable name in the RANDOM statement.
   (RULE3: No need to worry about the residual. It is set by default.)
-
Example 1: ANOVA model
Level1: Y_ij = b0j + Residual_ij
Level2: b0j = g00 + U_0j
Y = g00 + U_0j + Residual

proc mixed;
  class school;
  model Y = ;
  random intercept / subject=school;
run;

I said:
RULE1: Put "intercept" in the RANDOM statement to accommodate higher-level errors.
RULE2: If a variable name sits right next to a level-2 error with an asterisk (e.g., X*level-2 error), put that variable name in the RANDOM statement.
Only RULE1 is relevant in this model.
-
Example 2: Slopes-as-outcomes model
Level1: Y_ij = b0j + b1j*X + Residual_ij
Level2: b0j = g00 + g01*W + U_0j
Level2: b1j = g10 + g11*W + U_1j
Y = g00 + g01*W + g10*X + g11*W*X + U_1j*X + U_0j + Residual

proc mixed;
  class school;
  model Y = W X W*X;
  random intercept X / subject=school;
run;

What were RULE1 and RULE2?
-
How to do substitution: Cheating using HLM software!
Push the MIXED button to get a little window like this.
-
How to do substitution by hand
Level1: Y_ij = b0j + b1j*X + Residual_ij
Level2: b0j = g00 + g01*W + U_0j
Level2: b1j = g10 + g11*W + U_1j

1. Insert the higher-level equations into the level-1 equation.
   Y = [g00 + g01*W + U_0j] + [g10 + g11*W + U_1j]*X + Residual_ij
2. Take out the brackets.
   Y = g00 + g01*W + U_0j + g10*X + g11*W*X + U_1j*X + Residual_ij
3. Notice which parts are structural and which parts are random components.
   Y = g00 + g01*W + g10*X + g11*W*X + U_1j*X + U_0j + Residual_ij

proc mixed;
  class school;
  model Y = W X W*X;
  random intercept X / subject=school;
run;

What were RULE1 and RULE2?
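
A slightly fuller version of this call might look like the sketch below. The data set name mydata is hypothetical. TYPE=UN is an addition that is not on the slide: it requests an unstructured covariance matrix so that the intercept error (U_0j) and the slope error (U_1j) are allowed to covary, which is how HLM software usually treats them.

```sas
/* Sketch only: "mydata" is a hypothetical data set holding Y, X, W, and school */
proc mixed data=mydata covtest noclprint;
  class school;
  /* Structural part: g00 + g01*W + g10*X + g11*W*X */
  model Y = W X W*X / solution ddfm=kr;
  /* RULE1 -> intercept (for U_0j); RULE2 -> X (for U_1j*X) */
  /* TYPE=UN is an assumption added here; it lets U_0j and U_1j covary */
  random intercept X / subject=school type=un;
run;
```

Without TYPE=UN, PROC MIXED uses its default variance-components structure, which estimates the two variances but fixes their covariance at zero.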
-
Fixed effects or random effects

OLS regression is a fixed-effects model
PROC MIXED;
  MODEL Y = X;
RUN;
OLS regression is a model with fixed effects only. So, in a way, OLS is a special case of HLM. This is an awfully inflexible model that does not consider the existence of various sources of error.

HLM
PROC MIXED;
  MODEL Y = X;
  RANDOM intercept X / subject=groupID;
RUN;
If a researcher thinks the effect of X (and the intercept) differs by group, we should treat these coefficients as random effects.
-
Benefit 1 of using random effects
A conceptual one: useful for thinking about micro-macro problems

(1)
Student: Math score = b0 + b1*parents' education level + ... + error
Country: b1 = g10 + g11*SELECTION + error

(2)
Classroom: teacher perception of math ability of class = b0 + b1*average parents' education level + b2*average math score + b3*noise + error
Country:
b1 = g10 + g11*National Exam + error
b2 = g20 + g21*National Exam + error
b3 = g30 + g31*National Exam + error
-
Benefit 2: Statistical benefit
In deriving a grand mean (of the effect of X or of an intercept), HLM does shrinkage. This pulls inaccurate group means towards the grand mean, so we can reduce the influence of outliers if their estimates are inaccurate (i.e., having large error variance and/or coming from a small number of observations within each group unit).

Shrunk school mean = reliability*school mean + (1 - reliability)*grand mean
where reliability is a function of the number of observations in a group unit and the variances. (R&B HLM book, p. 48)

Quiz:
1) What happens to a school whose reliability is 1?
2) What happens if all schools are 1 on reliability?
3) What happens if all schools are .5 on reliability?
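
To get a feel for shrinkage, a small DATA step like the following may help. It is only a sketch: all numbers (the two variances, the grand mean, and the two schools) are made up for illustration. It assumes the usual random-intercept form of reliability, between-school variance divided by (between-school variance + within-school variance / n).

```sas
/* Illustration with made-up numbers; nothing here comes from the ESM data */
data shrink;
  tau00      = 20;  /* between-school variance (assumed) */
  sigma2     = 80;  /* within-school variance (assumed) */
  grand_mean = 50;  /* grand mean (assumed) */
  input school $ n school_mean;
  /* Reliability rises toward 1 as the school's n grows */
  reliability = tau00 / (tau00 + sigma2 / n);
  /* Weighted average of the school's own mean and the grand mean */
  shrunk_mean = reliability * school_mean + (1 - reliability) * grand_mean;
  datalines;
A 5 70
B 50 70
;
run;

proc print data=shrink noobs;
  var school n school_mean reliability shrunk_mean;
run;
```

Both schools have the same raw mean, but school A has far fewer observations, so its reliability is lower and its mean is pulled further towards the grand mean.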
-
Quick decision rule
Random or fixed?
Do I open the door at 11 PM?
Literature, theory
Exploratory analysis (let's see what happens)
-
More complicated: Two-step decisions regarding random effects
(I need your help in phrasing this.)
Step 1: Is the effect different by school?
Step 2: Random or fixed?
  Fixed: Use a series of dummy variables (in reality, too tedious)
  Random: Shrinkage applies, and you get a precision-weighted grand mean
-
Example
Student Engagement study using ESM
by Uekawa, Borman, and Lee
-
Engagement Level (Rasch model composite)
When you were signaled the first time today,
(each item is answered on a four-point scale: SD, D, A, SA)
- I was paying attention.
- I did not feel like listening.
- My motivation level was high.
- I was bored.
- I was enjoying class.
- I was focused more on class than anything else.
- I wished the class would end soon.
- I was completely into class.

The MEANS Procedure
Analysis Variable : engagement
N      Mean        Std Dev      Minimum       Maximum
2316   0.1167889   10.0106694   -31.5283511   26.4164991
-
3-level HLM
Level1: Repeated measures (10 beeps)
Level2: Students (10 kids from a class)
Level3: Courses (34 courses, Monday to Friday)
-
3-level HLM
libname here "C:\";
/* This is a three-level model */
proc mixed data=here.esm covtest noclprint;
  class IDclass IDstudent;
  model engagement = / solution ddfm=kr;
  random intercept / sub=IDstudent(IDclass);
  random intercept / sub=IDclass;
run;

Quiz: How can we make this a 2-level HLM?
-
PROC MIXED statement
proc mixed data=here.esm covtest noclprint;

COVTEST does a test for the covariance components (whether the variances are significantly larger than zero). The reason you have to request such a simple thing is that COVTEST is not based on the chi-square test that one would use for a test of variance. It uses a Wald Z test instead, which is not really appropriate for variances. Shockingly, SAS has not corrected this problem for a while. Anyway, because SAS feels bad about it, it does not want to make it a default option, which is why you have to request it. Not many people know this, and I myself could not believe it. So I guess that means we cannot fully trust the result of COVTEST and must use it with caution.

When there are lots of group units, use NOCLPRINT to suppress the printing of group names.
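
One chi-square-based alternative to COVTEST is a likelihood-ratio comparison of two nested models. The sketch below is an illustration, not something from the slides. It fits the three-level model with and without the class-level random intercept and compares the "-2 Res Log Likelihood" values. The data set names fit_full, fit_reduced, and lrt are hypothetical, and the code assumes that the first row of the FitStatistics ODS table is the -2 Res Log Likelihood line.

```sas
/* Model A: with the class-level random intercept */
proc mixed data=here.esm method=reml;
  class IDclass IDstudent;
  model engagement = / solution;
  random intercept / sub=IDstudent(IDclass);
  random intercept / sub=IDclass;
  ods output FitStatistics=fit_full;
run;

/* Model B: same fixed effects, class-level random intercept dropped */
proc mixed data=here.esm method=reml;
  class IDclass IDstudent;
  model engagement = / solution;
  random intercept / sub=IDstudent(IDclass);
  ods output FitStatistics=fit_reduced;
run;

/* Difference in -2 Res Log Likelihood, referred to a chi-square with 1 df */
data lrt;
  merge fit_full(rename=(Value=full))
        fit_reduced(rename=(Value=reduced));
  if _n_ = 1;                        /* assumed: row 1 is -2 Res Log Likelihood */
  diff = max(reduced - full, 0);     /* guard against tiny negative differences */
  p_value = 1 - probchi(diff, 1);
run;

proc print data=lrt noobs;
  var full reduced diff p_value;
run;
```

Because a variance cannot go below zero, this p-value is generally regarded as conservative, so it is best read as a rough guide.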
-
CLASS statement
class IDclass IDstudent Hisp;

We throw in the variables that we want SAS to treat as categorical variables. Variables that are characters (e.g., city names) must be on this line (it won't run otherwise). Group IDs, such as IDclass in my example data, must also be on this line; otherwise, it won't run. Variables that are numeric but dummy-coded (e.g., black=1 if black; else 0) don't have to be on this line, but the output will be easier to read if you put them there.

One thing that is a pain in the neck with the CLASS statement is that it chooses the reference category by alphabetical order. Whichever group in a classification variable comes last when alphabetically ordered will be used as the reference group. We can control this by data manipulation. For example, if gender is "Boy" or "Girl", I tend to create a new variable to make it explicit that girl is the reference group:
  if gender="Boy" then gender2="(1) Boy";
  if gender="Girl" then gender2="(2) Girl";
-
MODEL statement
model engagement = / solution ddfm=kr;

DDFM=KR specifies the way in which the degrees of freedom are calculated. It seems closest to the degrees-of-freedom option used by Bryk, Raudenbush, and Congdon's HLM program.
It could be computationally very heavy if a model is complicated. DDFM=BW would run faster, though the DF would be wrong.
-
RANDOM statement
random intercept X / sub=IDstudent(IDclass);
random intercept X / sub=IDclass;

We can estimate the variance of slopes for categorical variables using the GROUP= option, without necessarily making them into dummy variables.

random intercept race / sub=IDclass group=race;
(instead of: random intercept black white hispanic / sub=IDclass;)
-
MODEL 1
libname here "G:\SAS";
proc mixed data=here.esm covtest noclprint;
  weight precision_weight;
  class IDclass IDstudent;
  model engagement = / solution ddfm=kr;
  random intercept / sub=IDstudent(IDclass);
  random intercept / sub=IDclass;
run;

The Mixed Procedure
Covariance Parameter Estimates
Cov Parm    Subject              Estimate   Standard Error   Z Value   Pr Z
Intercept   IDstudent(IDclass)   23.3556    2.5061           9.32
-
MODEL 2
libname here "G:\SAS";
proc mixed data=here.esm covtest noclprint;
  weight precision_weight;
  class IDclass IDstudent subject;
  model engagement = hisp / solution ddfm=kr;
  random intercept / sub=IDstudent(IDclass);
  random intercept hisp / sub=IDclass;
run;

Solution for Fixed Effects
Effect      Estimate   Standard Error   DF     t Value   Pr > |t|
Intercept   -0.6287    0.5101           33.1   -1.23     0.2265
hisp        -2.2113    1.0031           17.7   -2.20     0.0410

Covariance Parameter Estimates
Cov Parm    Subject              Estimate   Standard Error   Z Value   Pr Z
Intercept   IDstudent(IDclass)   22.4743    2.4712           9.09
-
MODEL 3
proc mixed data=here.esm covtest noclprint;
  weight precision_weight;
  class IDclass IDstudent subject;
  model engagement = hisp math hisp*math / solution ddfm=kr;
  random intercept / sub=IDstudent(IDclass);
  random intercept hisp / sub=IDclass;
run;

Solution for Fixed Effects
Effect      Estimate   Standard Error   DF     t Value   Pr > |t|
Intercept   -0.3249    0.7061           34.2   -0.46     0.6484
hisp        -4.1236    1.4562           14.5   -2.83     0.0129
math        -0.6081    1.0108           33.7   -0.60     0.5515
hisp*math    3.3305    1.9233           15.2    1.73     0.1035

The Mixed Procedure
Covariance Parameter Estimates
Cov Parm    Subject              Estimate   Standard Error   Z Value   Pr Z
Intercept   IDstudent(IDclass)   22.6987    2.5020           9.07
-
MODEL 3
Solution for Fixed Effects
Effect      Estimate   Standard Error   DF     t Value   Pr > |t|
Intercept   -0.3249    0.7061           34.2   -0.46     0.6484
hisp        -4.1236    1.4562           14.5   -2.83     0.0129
math        -0.6081    1.0108           33.7   -0.60     0.5515
hisp*math    3.3305    1.9233           15.2    1.73     0.1035

The Mixed Procedure
Covariance Parameter Estimates
Cov Parm    Subject              Estimate   Standard Error   Z Value   Pr Z
Intercept   IDstudent(IDclass)   22.6987    2.5020           9.07
Residual                         31.9761    1.0269           31.14
-
Which is easier to understand?

In HLM software      In SAS PROC MIXED
Level-1 Intercept    Disappears
Level-2 Intercept    Disappears
Level-3 Intercept    Intercept
Level-1 Error        Residual
Level-2 Error        Random effects
Level-3 Error        Random effects

HLM way
Level1: engagement = b0 + b1*Hispanic + residual
Level2: b0 = g00 + A
Level2: b1 = g10
Level3: g00 = t_000 + t_001*Math + B
Level3: g10 = t_100 + t_101*Math + C

PROC MIXED way
engagement = t_000 + t_001*Math + t_100*Hispanic + t_101*Math*Hispanic + C*Hispanic + B + A + residual
-
Why do we center variables?
engagement = t_000 + t_001*Math + t_100*Hispanic + t_101*Math*Hispanic + C*Hispanic + B + A + residual

Imagine we have to report to teachers their students' average engagement score.
We want to use B + t_000. To be clear about the meaning of the t_000 part, we could center variables, if it makes sense.
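
One way to get the two pieces of B + t_000 out of PROC MIXED is sketched below, using an intercept-only model for simplicity. The SOLUTION option on a RANDOM statement asks for the estimated random effects, and the ODS tables SolutionF and SolutionR hold the fixed-effect and random-effect estimates. The output data set names fixed_effects and random_effects are hypothetical.

```sas
/* Sketch: save the fixed intercept and the estimated random effects */
proc mixed data=here.esm covtest noclprint;
  class IDclass IDstudent;
  model engagement = / solution ddfm=kr;
  random intercept / sub=IDstudent(IDclass);
  /* SOLUTION requests the estimated random effects */
  random intercept / sub=IDclass solution;
  ods output SolutionF=fixed_effects      /* fixed intercept */
             SolutionR=random_effects;    /* estimated random effects */
run;
```

The rows of random_effects whose subject is an IDclass value hold the class-level deviations (B). Adding the fixed intercept to each of them gives the class averages that could be reported to teachers.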
-
What about centering?

In SAS, we use PROC STANDARD to do centering, and this is done outside of PROC MIXED. When I learned this, I thought, "I have done it before!" because centering is similar to the concept of Z-scores.

This is GROUP MEAN centering:
proc standard data=X mean=0;
  by GroupID;
  var X;
run;

This is GRAND MEAN centering:
proc standard data=X mean=0;
  var X;
run;

By the way, just for your information, this creates Z-scores:
proc standard data=X mean=0 std=1;
  var X;
run;

When you use SAS PROC MIXED, you notice that centering is not really a topic specific to HLM, because it is done outside PROC MIXED.
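
In practice, group-mean centering usually needs two extra details that the short version leaves out: the data must be sorted by the group ID before a BY statement can be used, and an OUT= data set should be named so you know where the centered values went. A minimal sketch, assuming a variable X in here.esm and using the hypothetical data set names esm_sorted and esm_centered:

```sas
/* BY-group processing requires data sorted by the BY variable */
proc sort data=here.esm out=esm_sorted;
  by IDclass;
run;

/* Group-mean centering: within each class, X is shifted to have mean 0 */
proc standard data=esm_sorted mean=0 out=esm_centered;
  by IDclass;
  var X;   /* X is a placeholder for the variable to be centered */
run;
```

In esm_centered, X holds the centered values, so copy X to a new variable beforehand if the raw values are still needed.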
-
What does it mean to center a dummy variable, like gender?
1. To adjust for gender composition.
2. Without it, the intercept refers to either males or females (whichever group is coded 0).
   With it, the intercept is adjusted for gender composition.
3. See my Excel presentation if we have time: www.estat.us/sas/centering.xls
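
As a rough sketch of the same idea with the ESM example, the numeric 0/1 dummy hisp could be grand-mean centered before fitting the model. The data set name esm_gc is hypothetical, and hisp is assumed to be numeric and left out of the CLASS statement.

```sas
/* Grand-mean center the 0/1 dummy: hisp becomes (hisp - overall mean of hisp) */
proc standard data=here.esm mean=0 out=esm_gc;
  var hisp;
run;

/* With centered hisp, the intercept is the expected engagement at the
   average Hispanic composition rather than for the group coded 0 */
proc mixed data=esm_gc covtest noclprint;
  class IDclass IDstudent;
  model engagement = hisp / solution ddfm=kr;
  random intercept / sub=IDstudent(IDclass);
  random intercept / sub=IDclass;
run;
```

Note that the mean here is taken over all rows of the data (every beep), so students with more observations count more in the average.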
-
END
To go back to my HLM page: www.estat.us/id38.html