4. hlm

5/20/2018 4. HLM


Doing HLM using SAS PROC MIXED

Kaz
www.estat.us

Feb 2005
http://www.estat.us/


My points

(1) Easy to compare HLM and other models that are not HLM; thus, helpful. This is because PROC MIXED lets you run models that are not HLM.

(2) Easy to understand what makes HLM HLM. In SAS, what is not essential to HLM is done outside PROC MIXED (e.g., centering).


OLS vs. HLM in PROC MIXED

The difference is a RANDOM statement.

OLS regression syntax

PROC MIXED;
MODEL Y = X;
RUN;

HLM syntax

PROC MIXED;
MODEL Y = X;
RANDOM intercept X / subject=school;
RUN;

(1) OLS

Y_jk = b0 + error_jk

(2) HLM

Level1: Y_jk = b0 + error_jk
Level2: b0 = g0 + error_k
or
Y_jk = g0 + error_jk + error_k


Again, turning a simple linear model into HLM

(1)
PROC MIXED;
MODEL Y = X W;
RUN;

(2)
PROC MIXED;
MODEL Y = X W;
RANDOM intercept X W / subject=GroupID;
RUN;

(3) The RANDOM statement in (2) reads: "I request that the intercept, as well as the effects of X and W, be estimated for each subject, which can be identified by GroupID."


How to write SAS PROC MIXED syntax: Intuitive way

(1) Write all the variable names in the MODEL statement.

model Y = X W;

(2) Decide which variables' effects you want to estimate by school.

random intercept X W / subject=school;


More careful way

1. Start from the level-specific specification, e.g.,
   level1: y = b0 + b1*X + error_ij
   level2: b0 = g00 + g01*W + error_0j
   level2: b1 = g10 + g11*W + error_1j

2. Insert the level-2 equations into the level-1 equation.

3. Write the variable names involved in the MODEL statement.

4. Find the random components (written in the Roman alphabet).
   RULE1: Put "intercept" in the RANDOM statement to accommodate higher-level errors.
   RULE2: If the name of any variable sits right next to a level-2 error with an asterisk (e.g., X*level-2 error), put that variable name in the RANDOM statement.
   (RULE3: No need to worry about the residual. It is set by default.)


Example 1: ANOVA model

Level1: Y_ij = b0j + Residual_ij
Level2: b0j = g00 + U_0j

Y_ij = g00 + U_0j + Residual_ij

proc mixed;
class school;
model Y = ;
random intercept / subject=school;
run;

I said:

RULE1: Put "intercept" in the RANDOM statement to accommodate higher-level errors.
RULE2: If the name of any variable sits right next to a level-2 error with an asterisk (e.g., X*level-2 error), put that variable name in the RANDOM statement.

Only RULE1 is relevant in this model.


Example 2: Slopes-as-outcomes model

Level1: Y_ij = b0j + b1j*X + Residual_ij
Level2: b0j = g00 + g01*W + U_0j
Level2: b1j = g10 + g11*W + U_1j

Y_ij = g00 + g01*W + g10*X + g11*W*X + U_1j*X + U_0j + Residual_ij

proc mixed;
class school;
model Y = W X W*X;
random intercept X / subject=school;
run;

What were RULE1 and RULE2?


How to do substitution: Cheating using HLM software!

Push the MIXED button to get a little window like this.

[Screenshot: the HLM software's mixed-model window]


How to do substitution by hand

Level1: Y_ij = b0j + b1j*X + Residual_ij
Level2: b0j = g00 + g01*W + U_0j
Level2: b1j = g10 + g11*W + U_1j

1. Insert the higher-level equations into the level-1 equation.
   Y = [g00 + g01*W + U_0j] + [g10 + g11*W + U_1j]*X + Residual_ij

2. Take out the brackets.
   Y = g00 + g01*W + U_0j + g10*X + g11*W*X + U_1j*X + Residual_ij

3. Notice which parts are structural and which parts are random components.
   Y = g00 + g01*W + g10*X + g11*W*X + U_1j*X + U_0j + Residual_ij

proc mixed;
model Y = W X W*X;
random intercept X / subject=school;
run;

What were RULE1 and RULE2?


Fixed Effects or Random Effects

OLS regression is a fixed-effect model:

PROC MIXED;
MODEL Y = X;
RUN;

OLS regression is a model with fixed effects only. So in a way OLS is a special case of HLM. It is an awfully inflexible model that does not consider the existence of various sources of error.

HLM:

PROC MIXED;
MODEL Y = X;
RANDOM intercept X / subject=groupID;
RUN;

If a researcher thinks the effect of X (and the intercept) differs by group, we should treat these coefficients as random effects.


Benefit 1 of using random effects

Conceptual one: useful for thinking about micro-macro problems

(1)
Student: Math score = b0 + b1*parents' education level + ... + error
Country: b1 = g10 + g11*SELECTION + error

(2)
Classroom: teacher perception of math ability of class = b0 + b1*average parents' education level + b2*average math score + b3*noise + error
Country: b1 = g10 + g11*National Exam + error
         b2 = g20 + g21*National Exam + error
         b3 = g30 + g31*National Exam + error


Benefit 2: Statistical benefit

In deriving a grand mean (re: the effect of X or an intercept), HLM does shrinkage. This pulls inaccurate group means towards the grand mean, so we can reduce the influence of outliers if their estimates are inaccurate (i.e., having large error variance and/or coming from a small number of observations within each group unit).

Shrunk school mean = reliability*school mean
where reliability is a function of the number of observations in a group unit and the variance (R&B HLM book, p. 48).

Quiz:
1) What happens to a school whose reliability is 1?
2) What happens if all schools are 1 on reliability?
3) What happens if all schools are .5 on reliability?
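For a random-intercept model, the reliability and the shrunken school mean are usually written as below. This is only a sketch in the standard notation of the HLM literature, and the symbols are assumptions not defined on the slide: tau_00 is the between-school variance, sigma^2 the within-school variance, n_j the number of observations in school j, Y-bar_j the observed school mean, and gamma-hat_00 the estimated grand mean.

```latex
\begin{align*}
\lambda_j &= \frac{\tau_{00}}{\tau_{00} + \sigma^2 / n_j} \\
\hat{\beta}^{*}_{0j} &= \lambda_j \, \bar{Y}_{\cdot j} + (1 - \lambda_j)\, \hat{\gamma}_{00} \\
\hat{\beta}^{*}_{0j} - \hat{\gamma}_{00} &= \lambda_j \left( \bar{Y}_{\cdot j} - \hat{\gamma}_{00} \right)
\end{align*}
```

The last line shows that when school means are expressed as deviations from the grand mean, the formula reduces to the shorthand "shrunk school mean = reliability*school mean". Reliability moves toward 1 as n_j grows or as sigma^2 becomes small relative to tau_00.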


Quick decision rule

Random or fixed?

Do I open the door at 11 PM?

Literature
Theory

Exploratory analysis (let's see what happens.)


More complicated: Two-step decisions regarding random effects

(I need your help in phrasing this.)

Step 1: Does the effect differ by school?

Step 2: Random or fixed?
Fixed: Use a series of dummy variables (in reality too tedious).
Random: Shrinkage applies and we get a precision-guided grand mean.
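The two options in Step 2 can be sketched in PROC MIXED as follows. This is a minimal illustration, not the analysis from the study: the dataset name mydata is hypothetical, and Y, X, and school simply follow the placeholder names used on the earlier slides.

```sas
/* Fixed: school-specific intercepts and X slopes via dummy coding.   */
/* CLASS makes SAS build the dummy variables for school for us.       */
proc mixed data=mydata;             /* mydata is a hypothetical name  */
   class school;
   model Y = X school X*school / solution;
run;

/* Random: school deviations in the intercept and the X slope are     */
/* treated as random effects, so shrinkage applies.                   */
proc mixed data=mydata covtest noclprint;
   class school;
   model Y = X / solution;
   random intercept X / subject=school;
run;
```

With many schools the fixed version prints one intercept and one slope contrast per school, which is the tedium mentioned above; the random version summarizes the same variation with a few variance components.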


Example

Student Engagement study using ESM
by Uekawa, Borman, and Lee


Engagement Level (Rasch model composite)

When you were signaled the first time today,       SD   D   A   SA
I was paying attention                              O    O   O   O
I did not feel like listening                       O    O   O   O
My motivation level was high                        O    O   O   O
I was bored                                         O    O   O   O
I was enjoying class                                O    O   O   O
I was focused more on class than anything else      O    O   O   O
I wished the class would end soon                   O    O   O   O
I was completely into class                         O    O   O   O

The MEANS Procedure

Analysis Variable : engagement

   N         Mean       Std Dev        Minimum       Maximum
2316    0.1167889    10.0106694    -31.5283511    26.4164991


3-level HLM

Level1: Repeated measures (10 beeps)
Level2: Students (10 kids from a class)
Level3: Courses (34 courses, Monday to Friday)


3-level HLM

Libname here "C:\";
/* This is a three-level model */
proc mixed data=here.esm covtest noclprint;
class IDclass IDstudent;
model engagement= /solution ddfm=kr;
random intercept /sub=IDstudent(IDclass);
random intercept /sub=IDclass;
run;

Quiz: How can we make this a 2-level HLM?


PROC MIXED statement

proc mixed data=here.esm covtest noclprint;

COVTEST requests a test for covariance components (whether variances are significantly larger than zero). The reason you have to request such a simple thing is that COVTEST is not based on the chi-square test that one would use for a test of variance. It uses a Wald Z test instead, which is not really appropriate. Shockingly, SAS has not corrected this problem for a while. Anyway, because SAS feels bad about it, it does not make this a default option, which is why you have to request it. Not many people know this, and I myself could not believe it. So I guess that means we cannot fully trust the result of COVTEST and must use it with caution.

When there are lots of group units, use NOCLPRINT to suppress the printing of group names.


CLASS statement

class IDclass IDstudent Hisp;

We throw in the variables that we want SAS to treat as categorical variables. Variables that are characters (e.g., city names) must be on this line (it won't run otherwise). Group IDs, such as IDclass in my example data, must also be on this line; otherwise, it won't run. Variables that are numeric but dummy-coded (e.g., black=1 if black; else 0) don't have to be on this line, but the output will be easier to read if they are.

One thing that is a pain in the neck with the CLASS statement is that it chooses a reference category by alphabetical order. Whichever group in a classification variable comes last when alphabetically ordered will be used as the reference group. We can control this by data manipulation. For example, if gender=BOY or GIRL, then I tend to create a new variable to make it explicit that girl is the reference group:

length gender2 $8;  /* long enough for "(2) Girl" */
if gender="BOY" then gender2="(1) Boy";
if gender="GIRL" then gender2="(2) Girl";


MODEL statement

model engagement= /solution ddfm=kr;

ddfm=kr specifies the way in which the degrees of freedom are calculated. It seems closest to the degrees-of-freedom option used by Bryk, Raudenbush, and Congdon's HLM program.

It could be computationally very heavy if a model is complicated. ddfm=bw would run faster, though the DF would be wrong.


Random statement

random intercept X /sub=IDstudent(IDclass);
random intercept X /sub=IDclass;

We can estimate the variance of slopes for categorical variables using the group= option, without necessarily making them into dummy variables.

random intercept race /sub=IDclass group=race;

(instead of random intercept black white hispanic /sub=IDclass;)

MODEL 1


Libname here "G:\SAS";
proc mixed data=here.esm covtest noclprint;
weight precision_weight;
class IDclass IDstudent;
model engagement= /solution ddfm=kr;
random intercept /sub=IDstudent(IDclass);
random intercept /sub=IDclass;
run;

The Mixed Procedure

Covariance Parameter Estimates

                                            Standard        Z
Cov Parm     Subject               Estimate    Error    Value    Pr Z
Intercept    IDstudent(IDclass)     23.3556   2.5061     9.32


MODEL 2

Libname here "G:\SAS";
proc mixed data=here.esm covtest noclprint;
weight precision_weight;
class IDclass IDstudent subject;
model engagement= hisp /solution ddfm=kr;
random intercept /sub=IDstudent(IDclass);
random intercept hisp /sub=IDclass;
run;

Solution for Fixed Effects

                       Standard
Effect      Estimate      Error      DF    t Value    Pr > |t|
Intercept    -0.6287     0.5101    33.1      -1.23      0.2265
hisp         -2.2113     1.0031    17.7      -2.20      0.0410


MODEL 3

proc mixed data=here.esm covtest noclprint;
weight precision_weight;
class IDclass IDstudent subject;
model engagement= hisp math hisp*math /solution ddfm=kr;
random intercept /sub=IDstudent(IDclass);
random intercept hisp /sub=IDclass;
run;

Solution for Fixed Effects

                       Standard
Effect      Estimate      Error      DF    t Value    Pr > |t|
Intercept    -0.3249     0.7061    34.2      -0.46      0.6484
hisp         -4.1236     1.4562    14.5      -2.83      0.0129
math         -0.6081     1.0108    33.7      -0.60      0.5515
hisp*math     3.3305     1.9233    15.2       1.73      0.1035


MODEL 3

The Mixed Procedure

Covariance Parameter Estimates

                                            Standard        Z
Cov Parm     Subject               Estimate    Error    Value    Pr Z
Intercept    IDstudent(IDclass)     22.6987   2.5020     9.07
Residual                            31.9761   1.0269    31.14


Which is easy to understand?

In HLM software        In SAS PROC MIXED
Level-1 Intercept      Disappears
Level-2 Intercept      Disappears
Level-3 Intercept      Intercept
Level-1 Error          Residual
Level-2 Error          Random effects
Level-3 Error          Random effects

HLM way

Level1: engagement = b0 + b1*Hispanic + residual
Level2: b0 = g00 + A
Level2: b1 = g10
Level3: g00 = t_000 + t_001*Math + B
Level3: g10 = t_100 + t_101*Math + C

PROC MIXED way

engagement = t_000 + t_001*Math + t_100*Hispanic + t_101*Math*Hispanic + C*Hispanic + B + A + residual
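The step from the HLM way to the PROC MIXED way is the same substitution used for the two-level examples, done twice. The sketch below writes it out, assuming the Math coefficient in the intercept equation is labeled t_001 so that it is distinct from the Hispanic coefficient t_100, and writing r for the residual.

```latex
\begin{align*}
\text{engagement}
  &= b_0 + b_1\,\text{Hispanic} + r \\
  &= (g_{00} + A) + g_{10}\,\text{Hispanic} + r \\
  &= (t_{000} + t_{001}\,\text{Math} + B + A)
     + (t_{100} + t_{101}\,\text{Math} + C)\,\text{Hispanic} + r \\
  &= t_{000} + t_{001}\,\text{Math} + t_{100}\,\text{Hispanic}
     + t_{101}\,\text{Math}\cdot\text{Hispanic}
     + C\,\text{Hispanic} + B + A + r
\end{align*}
```

Read against the MODEL 3 syntax, the four t terms are the fixed effects in "model engagement= hisp math hisp*math", B and C*Hispanic correspond to "random intercept hisp /sub=IDclass", and A corresponds to "random intercept /sub=IDstudent(IDclass)".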


Why do we center variables?

engagement = t_000 + t_001*Math + t_100*Hispanic + t_101*Math*Hispanic + C*Hispanic + B + A + residual

Imagine we have to report to teachers their students' average engagement score.

We want to use B + t_000. To be clear about the meaning of the t_000 part, we could center variables, if it makes sense.


What about centering?

In SAS, we use PROC STANDARD to do centering, and this is done outside of PROC MIXED. When I learned this, I thought, "I have done it before!" because centering is similar to the concept of Z-scores.

This is GROUP MEAN centering (the data must already be sorted by GroupID for the BY statement to work):

proc standard data=X mean=0;
by GroupID;
var X;
run;

This is GRAND MEAN centering:

proc standard data=X mean=0;
var X;
run;

By the way, just for your information, this is how to create Z-scores:

proc standard data=X mean=0 std=1;
var X;
run;

When you use SAS PROC MIXED, you notice that centering is not really a topic that is specific to HLM, because it is done outside PROC MIXED.
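PROC STANDARD replaces X with its centered value in the output data set. If you would rather keep the raw variable next to the centered one, a PROC SQL step is one alternative. The sketch below is only an illustration under assumed names: mydata, centered, X_groupc, and Y are hypothetical and do not come from the ESM example.

```sas
/* Group-mean centering that keeps the original X.                    */
/* PROC SQL remerges the group mean back onto every row, so no        */
/* prior sorting is needed. All data set and variable names here      */
/* are hypothetical.                                                   */
proc sql;
   create table centered as
   select *,
          X - mean(X) as X_groupc   /* deviation from the group mean */
   from mydata
   group by GroupID;
quit;

/* The centered variable is then used like any other predictor. */
proc mixed data=centered covtest noclprint;
   class GroupID;
   model Y = X_groupc / solution ddfm=kr;
   random intercept X_groupc / subject=GroupID;
run;
```

Dropping the GROUP BY clause gives grand-mean centering instead, because mean(X) is then taken over the whole data set.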


What does it mean to center a dummy variable, like gender?

1. To adjust for gender composition.

2. - Without it, the intercept = either male or female.
   - With it, the intercept is adjusted for gender composition.

3. See my Excel presentation if we have time. www.estat.us/sas/centering.xls
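Point 2 can be checked with a small derivation. It is a sketch under simplifying assumptions that are not on the slide: a single-level regression with one 0/1 dummy G (coded 1 = female here, a hypothetical coding), no other predictors, and G-bar the proportion of females in the sample.

```latex
\begin{align*}
\text{Uncentered:}\quad Y_i &= b_0 + b_1 G_i + e_i,
  & b_0 &= \bar{Y}_{\text{male}} \\
\text{Centered:}\quad Y_i &= b_0^{*} + b_1 (G_i - \bar{G}) + e_i,
  & b_0^{*} &= b_0 + b_1 \bar{G}
            = (1 - \bar{G})\,\bar{Y}_{\text{male}} + \bar{G}\,\bar{Y}_{\text{female}}
\end{align*}
```

Without centering, the intercept is the mean of whichever group is coded 0. With centering, it is the mean at the sample's actual gender mix, which is what "adjusted for gender composition" refers to.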


END

To go back to my HLM page: www.estat.us/id38.html