2006 may maryland mixture keynote v25 · presentation at the university of maryland cilvr...
TRANSCRIPT
1
1
Latent Variable Hybrids:Overview Of Old And New Models
Bengt MuthénUCLA
Presentation at the University of Maryland CILVR conference"Mixture Models in Latent Variable Research",
May 18-19, 2006
2
Growth mixtureanalysis
Latent transition analysis, Latent class growth analysis
Growth analysis(random effects)
LongitudinalModels
Factor mixtureanalysis
Regression mixture analysis, Latent class analysis
Factor analysis,SEM
Cross-SectionalModels
HybridsCategorical Latent Variables
ContinuousLatent Variables
Overview
2
3
Measurement InvarianceYes No
Factor Analysis Emphasis
Overview Of Hybrids: Modeling With Categorical And Continuous Latent Variables
Cluster Analysis and Factor Analysis Emphasis
4
f
y
c
Measurement InvarianceYes
Factor Analysis EmphasisNo
YesParametric Factor Distribution
Mixture Factor Analysis
factor
Overview Of Hybrids: Modeling With Categorical And Continuous Latent Variables
Cluster Analysis and Factor Analysis Emphasis
3
5
f
y
c
factor
Overview Of Hybrids: Measurement Invariance, Parametric Factor Distribution
• Cross-sectional examples:• Mixture factor analysis (McDonald, 1967, 2003; Muthen, 1989;
Yung, 1997; Lubke & Muthen, 2005)
6
Overview Of Hybrids: Measurement Invariance, Parametric Factor Distribution
• Longitudinal examples:• Growth mixture modeling of trajectory classes
(Verbeke & LeSaffre, 1996; Muthen & Shedden, 1999)• Intervention effects varying across trajectory classes
(Muthen et al, 2002)• Regime (latent class) switching (Dolan, Schmittman,
Lubke, Neale, 2005)
f
y
cfactor
4
7
Measurement InvarianceYes
Factor Analysis Emphasis
Yes
Parametric Factor Distribution
Mixture Factor Analysis
No
Non-Parametric FA
No
f
y
c
factor
f
y
c
factor
Overview Of Hybrids: Modeling With Categorical And Continuous Latent Variables
Cluster Analysis and Factor Analysis Emphasis
8
f
y
c
factor
Overview Of Hybrids:Non-Parametric Factor Distribution
• Cross-sectional examples:• Latent class factor analysis for DSM-V (Muthen & Asparouhov, 2006)• IRT (?)
• Longitudinal examples:• Binary growth; non-normal random effects (Aitkin, 1999)• Trajectory groups for criminal offenders (Nagin & Land, 1993)
5
9
Measurement InvarianceYes No
Factor Analysis Emphasis
Yes
Parametric Factor Distribution
No Yes
Parametric Factor Distribution
No
Non-Parametric FMA
fc fc
Item j
Item k
f
y
c
factor
f
y
c
factor
Mixture Factor Analysis Non-Parametric FA Factor Mixture Analysis
Overview Of Hybrids: Modeling With Categorical And Continuous Latent Variables
Cluster Analysis and Factor Analysis Emphasis
10
fc
Item j
Item k
Overview Of Hybrids: Factor Mixture Analysis
• Cross-sectional examples:• Factor mixture analysis in psychometrics (Blafield, 1980; Yung, 1997)• Structural equation mixtures; market segmentation
(Jedidi, Jagpal & DeSarbo, 1997)• IRT mixtures; solution strategies, guessing, levels of difficulty - Saltus
(Yamamoto, 1987; Mislevy & Verhelst, 1990; Mislevy & Wilson, 1996; Wilson, de Boeck, Acton, 2005)
6
11
Overview Of Hybrids: Factor Mixture Analysis (Continued)
• Cross-sectional examples (continued):• Factor mixture analyzers; continuous micro-array expression data
(McLachlan, Do & Ambroise, 2004)• Factor mixture modeling; binary diagnostic criteria; genetics for
twins, siblings (Muthen, Asparouhov, Rebollo, 2006)• Classic normal finite mixtures; Fisher’s Iris data • Non-parametric factor mixture analysis (?)
• Longitudinal examples:• Factor mixture latent transition analysis (Muthen, 2006)
fc
Item j
Item k
12
Unclassified Contributors
• Dayton, Meehl, Vermunt
• Suggestions welcome
7
13
Model Testing Issues
14
NLSY 1989: Latent Class Analysis Of DSM-III-RAlcohol Dependence Criteria (n = 8313)
0.830.110.020.240.00Continue0.400.020.000.080.00Relief0.430.030.000.100.00Give up0.960.730.020.830.03Major role-Hazard0.650.090.000.190.00Time spent0.600.050.010.140.00Cut down0.990.940.120.960.15Larger0.810.350.010.450.01Tolerance0.490.070.000.140.00Withdrawal
DSM-III-R Criterion Conditional Probability of Fulfilling a Criterion
0.030.210.750.220.78IIIIIIIII
Two-class solution1 Three-class solution2
1Likelihood ratio chi-square fit = 1779 with 492 degrees of freedom2Likelihood ratio chi-square fit = 448 with 482 degrees of freedom
Prevalence
Latent Classes
Source: Muthén & Muthén (1995)
8
15
Alcohol Dependence
Alcohol Abuse
2.60
2.20
1.80
1.40
1.00
0.60
0.20
-0.20
-0.20 0.20 0.60 1.00 1.40 1.80 2.20 2.60
CLASS1.0000002.0000003.000000
Estimated Factor Scores From Two-Factor Model By Class
LCA, 3 classes: logL = -14,139, 29 parameters, BIC = 28,539FA, 2 factors: logL = -14,083, 26 parameters, BIC = 28,401FMA 2 classes, 1 factor, loadings invariant:
logL = -14,054, 29 parameters, BIC = 28,370
16
• Nylund, Muthen and Asparouhov (2006) simulation study
• BLRT has better Type I error than NCS and LMR
• BLRT finds the right number of classes better than BIC and LMR
Deciding On The Number Of Classes: Bootstrapped LRT
9
17
694017802010001000
69401472907624500
6781694334089220010-Item(ComplexStructure)
543543543n
ClassesClassesClassesModel
BLRTLMRBIC
Latent class analysis with categorical outcomes
Nylund, Asparouhov And Muthen (2006)
18
Factor (IRT) Mixture Example: The Latent Structure Of ADHD
• UCLA clinical sample of 425 males ages 5-18, all with ADHDdiagnosis
• Subjects assessed by clinicians: 1) direct interview with child (> 7 years), 2) interview with mother about child
• KSADS: Nine inattentiveness items, nine hyperactivity items;dichotomously scored
• Families with at least 2 ADHD affected children• Parent data, candidate gene data on sib pairs
10
19
‘Driven by motor’‘Forgetful in daily activities’
‘Talks excessively’‘Loses things’
‘Interrupts or intrudes’‘Dislikes/avoids tasks’
‘Difficulty waiting turn’‘Difficulty organizing tasks’
‘Blurts out answers’‘Difficulty following instructions’
‘Difficulty playing quietly’‘Doesn’t listen’
‘Runs or climbs excessively’‘Makes a lot of careless mistakes’
‘Fidgets’‘Easily distracted’
‘Difficulty remaining seated’‘Difficulty sustaining attn on tasks/play’Hyperactivity Items:Inattentiveness Items:
The Latent Structure Of ADHD (Continued)
20
7625
7547
7496
7452
7430
7523
BIC
0.75-3499LCA – 4c
0.94-3464LCA – 5c
0.113-3431LCA – 6c
0.27132-3413LCA – 7c
0.56-3545LCA – 3c
0.37-3650LCA – 2c
BLRT p value for k-1 classes# ParametersLikelihoodModel
The Latent Structure Of ADHD: Model Fit Results
LCA-3c is best by BIC and LCA-6c is best by BLRT
11
21
Three-Class And Six-Class LCA Item Profiles
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
5
5.5
6
6.5
7
7.5
8
8.5
9
9.5
10
10.
5
11
11.
5
12
12.
5
13
13.
5
14
14.
5
15
15.
5
16
16.
5
17
KSADS Items
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
0.55
0.6
0.65
0.7
0.75
0.8
0.85
0.9
0.95
1
Class 1, 23.0%Class 2, 7.6%Class 3, 24.8%Class 4, 18.8%Class 5, 11.2%Class 6, 14.7%
22
0.27
0.
0.
0.
0.
0.
BLRT p value for k-1 classes
7331
7625
7547
7496
7452
7430
7523
BIC
94-3464LCA – 5c
113-3431LCA – 6c
132-3413LCA – 7c
53-3505EFA – 2f
75-3499LCA – 4c
56-3545LCA – 3c
37-3650LCA – 2c
# ParametersLikelihoodModel
The Latent Structure Of ADHD: Model Fit Results
The EFA model is better than LCA - 3c, but no classification of individuals is obtained
12
23
728059-3461FMA – 2c, 2f
7318
7331
7625
7547
7496
7452
7430
7523
BIC
χ2_diff (16) = 58p < 0.01
0.27
0.
0.
0.
0.
0.
BLRT p value for k-1
94-3464LCA – 5c
113-3431LCA – 6c
132-3413LCA – 7c
75-3432
FMA – 2c, 2fClass-varyingFactor loadings
53-3505EFA – 2f
75-3499LCA – 4c
56-3545LCA – 3c
37-3650LCA – 2c
# ParametersLikelihoodModel
The Latent Structure Of ADHD: Model Fit Results
24
Three-Class LCA And Two-Class, Two-Factor FMA Item Profiles
13
25
Factor Mixture Modeling
Categorical outcomes plus continuous-normal latent variables have the statistical and computational disadvantage of
- normality assumption - heavy computations due to numerical integration
Non-parametric latent variable distribution avoids the normality assumption and at the same time the computational disadvantage!
26
Non-Parametric Estimation Of TheRandom Effect Distribution Using Mixtures
Wei
ght
Points Points
Wei
ght
Estimated weights and points (class probabilities and class means)
14
27
• Factor analysis generalizes to random effects repeated measures (growth) analysis
• Latent class analysis generalizes to latent transition analysis
• Factor mixture analysis generalizes to growth mixture modeling and generalized latent transition analysis
Longitudinal Analysis
28
Growth Mixture Modeling:Shapes of Growth Curves
15
29
Placebo Non-Responders, 55% Placebo Responders, 45%
Ham
ilton
Dep
ress
ion
Rat
ing
Scal
e
05
1015
2025
30
Baseli
ne
Wash-i
n
48 ho
urs
1 wee
k
2 wee
ks
4 wee
ks
8 wee
ks
0
5
10
15
20
25
30
05
1015
2025
30
Baseli
ne
Wash-i
n
48 ho
urs
1 wee
k
2 wee
ks
4 wee
ks
8 wee
ks
0
5
10
15
20
25
30
A Clinical Trial Of Depression Medication:
Two-Class Growth Mixture Modeling
30
Example: Mixed-Effects Regression Models ForStudying The Natural History Of Prostate Disease
Source: Pearson, Morrell, Landis and Carter (1994), Statistics in Medicine
MIXED-EFFECT REGRESSION MODELS
Years Before Diagnosis Years Before Diagnosis
Figure 2. Longitudinal PSA curves estimated from the linear mixed-effects model for the group average (thick solid line) and for each individual in the study (thin solid lines)
Controls
BPH Cases
Local/RegionalCancers
Metastatic Cancers
PSA
Lev
el (n
g/m
l)
0
4
8
12
16
20
24
28
32
36
40
44
0
4
8
12
16
04
8
0
4
8
15 10 5 015 10 5 0
16
31
Growth Mixture Modeling Of Developmental Pathways
Outcome
Escalating
Early Onset
Normative
Agex
i
u
s
c
y1 y2 y3 y4
q
18 37
32
Multilevel Growth Mixture Modeling
17
33
Mat
h A
chie
vem
ent
Poor Development: 20% Moderate Development: 28% Good Development: 52%
69% 8% 1%Dropout:
7 8 9 10
4060
8010
0
Grades 7-107 8 9 10
4060
8010
0
Grades 7-107 8 9 10
4060
8010
0
Grades 7-10
Growth Mixture Modeling:LSAY Math Achievement Trajectory ClassesAnd The Prediction Of High School Dropout
34
hsdrop
c
i s
math7 math8 math9 math10female
hispanic
black
mother’s ed.
home res.
expectations
drop thoughts
arrested
expelled
Muthen (2004)
18
35mstrat
hsdrop
c
iw sw
math7 math8 math9 math10
math7 math9 math10
ib lunch
math8
hsdrop
C#1
C#2
female
hispanic
black
mother’s ed.
home res.
expectations
drop thoughts
arrested
expelled Within
Between
36
iu
iy sy
su
maleblackhispesfh123hsdrpcoll
qy
qu
y18 y19 y20 y24 y25
u18 u19 u20 u24 u25
neve
r
once
2 or
3 ti
mes
4 or
5 ti
mes
6 or
7 ti
mes
8 or
9 ti
mes
10 o
r mor
e tim
es
HD83
0
50
100
150
200
250
300
350
400
450
500
550
600
650
700
750
800
850
900
950
1000
1050
Cou
nt
Two-Part Growth Modeling Of Frequency Of Heavy Drinking Ages 18 – 25
Olsen and Schafer (2001)neve
r
once
2 or
3 ti
mes
4 or
5 ti
mes
6 or
7 ti
mes
8 or
9 ti
mes
10 o
r mor
e tim
es
0
50
100
150
200
250
300
350
400
450
500
550
600
650
700
750
800
850
900
950
1000
1050
Cou
nt
neve
r
once
2 or
3 ti
mes
4 or
5 ti
mes
6 or
7 ti
mes
8 or
9 ti
mes
10 o
r mor
e tim
es
HD83
0
50
100
150
200
250
300
350
400
450
500
550
600
650
700
750
800
850
900
950
1000
1050
Cou
nt
19
37
Two-Part Modeling Extensions In Mplus
• Growth modeling• Distal outcome• Parallel processes• Trajectory classes (mixtures)• Multilevel
• Factor analysis• Mixtures
• Latent classes for binary and continuous parts may be incorrectly picked up as additional factors in conventionalanalysis
• Multilevel
38
Growth Modeling Paradigms:Debate In Criminology And Annals AAPSS
Random Effects
Raudenbush
HLM
Latent Classes
Nagin
PROC Traj
Growth Mixtures
Muthen
Mplus
troubled waters
Sampson & Laub
20
39
Growth Mixture Modeling Versus Latent Class Growth Analysis
Escalating
Early Onset
Normative
Age18 37
Outcome
Escalating
Early Onset
Normative
Age18 37
Outcome
GMM LCGA
Muthen Nagin
40
• Nagin’s inconsistency: Latent classesused as
• Substantively distinct andmeaningful subgroups
• Non-parametric representation ofthe latent variable distribution
• Resolution: Combine substantive andnon-parametric latent classes
• Non-parametrically describedvariation within substantive themes
• Easy to set up in Mplus using twolatent class variables
Substantive And Non-Parametric Latent Classes
3 non-parametric classes within each of 2 substantive classes
21
41
Transition Probabilities
0.6 0.4
0.3 0.7
c21 2
1 2
1
2
1
2
c1
c1
Mover Class
Stayer Class c2
(c=1)
(c=2)
0.90 0.10
0.05 0.95
Time 1 Time 2
Latent Transition Analysis
u11 u12 u21 u22
c1 c2
c
42
Time 1 Time 2
Latent Transition Analysis
c1
u11 . . . u1p
c2
u21 . . . u2p
22
43
Time 1 Time 2
Factor Mixture Latent Transition Analysis
f1
c1
u11 . . . u1p
f2
c2
u21 . . . u2p
44
Time 1 Time 2
Factor Mixture Latent Transition Analysis
Item probability
f1
c1
u11 . . . u1p
f2
c2
u21 . . . u2p
Item probability
Item Item
23
45
• 1,137 first-grade students in Baltimore public schools
• 9 items: Stubborn, Break rules, Break things, Yells at others,Takes others property, Fights, Lies, Teases classmates, Talksback to adults
• Skewed, 6-category items; dichotomized (almost never vs other)
• Two time points: Fall and Spring of Grade 1
• For each time point, a 2-class, 1-factor FMA was found best fitting
Factor Mixture Latent Transition Analysis:Aggressive-Disruptive Behavior In The Classroom
46
Factor Mixture Latent Transition Analysis:Aggressive-Disruptive Behavior In The Classroom
(Continued)
16,30640-8,012 FMA LTAfactors relatedacross time
17,44521-8,649 Conventional LTA
BIC# parametersLoglikelihoodModel
24
47
Factor Mixture Latent Transition Analysis:Aggressive-Disruptive Behavior In The
Classroom (Continued)
Estimated Latent Transition Probabilities, Fall to Spring
0.590.41High0.060.94LowHighLow
FMA-LTA0.830.17High0.070.93LowHighLow
Conventional LTA
48
Within
Between
c1
u11 . . . u1p
c2
u21 . . . u2p
c1#1 c2#2
Two-Level Latent Transition Analysis
25
49
Visit www.statmodel.com to check out
- Web videos of courses
- Recent papers
- Version 4 User’s Guide
- Mplus Discussion
- Short course announcements
Mplus Website