structural equation modeling using mplus chongming yang, ph. d. 3-22-2012
TRANSCRIPT
Structural Equation Modeling Using Mplus
Chongming Yang, Ph.D.
3-22-2012
“In the past twenty years we have witnessed a paradigm shift in the analysis of correlational data. Confirmatory factor analysis and structural equation modeling have replaced exploratory factor analysis and multiple regression as the standard methods.”
Kenny, D. A., Kashy, D. A., & Bolger, N. (1998). Data analysis in social psychology. In D. T. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), The Handbook of Social Psychology, Vol. 1 (pp. 233-265). New York: McGraw-Hill.
New Paradigm in Data Analysis
Structural?
  Structuralism
  Components
  Relations
Objectives
  Introduction to SEM
    Model
    Source of the model
    Parameters
    Estimation
    Model evaluation
    Applications
  Estimate simple models with Mplus
Continuous Dependent Variables
Session I
Four Moments/Information of a Variable
  Mean
  Variance
  Skewness
  Kurtosis
Variance & Covariance
$V = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
$Cov = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
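As a small numerical check of the variance and covariance definitions, here is a minimal Python sketch. It assumes the n - 1 denominator; the function names and the five data pairs are illustrative, not taken from the slides.

```python
from statistics import mean

def variance(xs):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def covariance(xs, ys):
    """Sample covariance: cross-products of deviations, divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Illustrative data: y decreases exactly as x increases.
x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]

var_x = variance(x)
cov_xy = covariance(x, y)
print(var_x, cov_xy)
```

Because y is a perfect decreasing linear function of x here, the covariance equals the negative of the variance of x.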
Covariance Matrix (S)
        x1      x2      x3
x1      V11
x2      Cov21   V22
x3      Cov31   Cov32   V33
Statistical Model
  Probabilistic statement about relations of variables
  Imperfect but useful representation of reality
Structural Equation Modeling
  A system of regression equations for latent variables to estimate and test direct and indirect effects without the influence of measurement errors.
  To estimate and test theories about interrelations among observed and latent variables.
Latent Variable / Construct / Factor
  A hypothetical variable
    cannot be measured directly
    inferred from observable manifestations
  Multiple manifestations (indicators)
  Normally distributed interval dimension
  No objective measurement unit
How is Depression Distributed in?
  College students
  Patients for Depression Therapy
Normal Distributions
Levels of Analyses
  Observed
  Latent
Test Theories
  Classical True Score Theory:
    Observed Score = True Score + Error
  Item Response Theory
  Generalizability (Raykov & Marcoulides, 2006)
Graphic Symbols of SEM
  Rectangle -- observed variable
  Oval -- latent variable or error
  Single-headed arrow -- causal relation
  Double-headed arrow -- correlation
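Graphic Measurement Model of Latent $\xi$
[Figure: latent variable $\xi$ measured by indicators X1, X2, X3, with loadings $\lambda_1$, $\lambda_2$, $\lambda_3$ and errors $\delta_1$, $\delta_2$, $\delta_3$]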
Equations
  Specific equations
    $X_1 = \lambda_1 \xi + \delta_1$
    $X_2 = \lambda_2 \xi + \delta_2$
    $X_3 = \lambda_3 \xi + \delta_3$
  Matrix symbols
    $X = \Lambda \xi + \delta$
Relations of Variances
  $V_{X_1} = \lambda_1^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_1)$
  $V_{X_2} = \lambda_2^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_2)$
  $V_{X_3} = \lambda_3^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_3)$
  $\delta$ = measurement error / uniqueness
Sample Covariance Matrix (S)
        x1      x2      x3
x1      V11
x2      Cov21   V22
x3      Cov31   Cov32   V33
Relation of Covariances
  Variance of $\xi$ = common covariance of X1, X2, and X3
[Figure: diagram of the variance of $\xi$; other labels not recoverable]
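Unknown Parameters
  $V_{X_1} = \lambda_1^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_1)$
  $V_{X_2} = \lambda_2^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_2)$
  $V_{X_3} = \lambda_3^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_3)$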
Unstandardized Parameterization (scaling)
  $\lambda_1 = 1$ (fix the loading of X1 at 1; X1 is called the reference indicator)
  Variance of $\xi$ = common variance of X1, X2, and X3
  Squared $\lambda$ = explained variance of X ($R^2$)
  Variance of $\delta$ = unexplained variance in X
  Mean of $\delta$ = 0
Standardized Parameterization (scaling)
  Variance of $\xi$ = 1 = common variance of X1, X2, and X3
  Squared $\lambda$ = explained variance of X ($R^2$)
  Variance of $\delta$ = $1 - \lambda^2$
  Mean of $\xi$ = 0
  Mean of $\delta$ = 0
Two Kinds of Parameters
  Fixed at 1 or 0
  Freely estimated
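[Figure: example structural equation model. Latent variables and their indicators: General Intelligence (Verbal, Reasoning, Analytic), Emotional Intelligence (Recognize/Assess, Self Control), Personality (Openness, Agreeableness), Job Satisfaction (Being Appreciated, Social Relations), Marital Satisfaction (Perceived Benefit, Perceived Cost). Indicator errors d1-d7 and e1-e4; disturbances z1 and z2.]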
Structural Equation Model in Matrix Symbols
  $X = \Lambda_x \xi + \delta$ (exogenous)
  $Y = \Lambda_y \eta + \varepsilon$ (endogenous)
  $\eta = B\eta + \Gamma\xi + \zeta$ (structural model)
Note: Measurement model reflects the true score theory
Structural Equation Model in Matrix Symbols
  $X = \tau_x + \Lambda_x \xi + \delta$ (measurement)
  $Y = \tau_y + \Lambda_y \eta + \varepsilon$ (measurement)
  $\eta = \alpha + B\eta + \Gamma\xi + \zeta$ (structural)
Note: SEM with mean structure.
Model Implied Covariance Matrix ($\Sigma$)
Note: This covariance matrix contains unknown parameters in the equations.
(I-B) = non-singular
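For reference, the textbook form of the model implied covariance matrix for the model without mean structure can be written out as below. This is a sketch under assumed notation: $\Phi$ (covariance matrix of $\xi$), $\Psi$ (covariance matrix of $\zeta$), and $\Theta_\varepsilon$, $\Theta_\delta$ (covariance matrices of the measurement errors) are standard LISREL-style symbols that do not appear elsewhere in this transcript. The upper-left block is the covariance of Y, the lower-right block is the covariance of X, and the off-diagonal blocks are the covariances between them.

```latex
\Sigma(\theta) =
\begin{bmatrix}
\Lambda_y (I-B)^{-1} (\Gamma \Phi \Gamma' + \Psi) \left[(I-B)^{-1}\right]' \Lambda_y' + \Theta_\varepsilon
  & \Lambda_y (I-B)^{-1} \Gamma \Phi \Lambda_x' \\
\Lambda_x \Phi \Gamma' \left[(I-B)^{-1}\right]' \Lambda_y'
  & \Lambda_x \Phi \Lambda_x' + \Theta_\delta
\end{bmatrix}
```

The inverse of (I - B) appears in every block involving Y, which is why (I - B) must be non-singular.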
Sample Covariance Matrix (S)
        x1      x2      x3      x4      …
x1      v11
x2      cov21   v22
x3      cov31   cov32   v33
x4      cov41   cov42   cov43   v44
…
Mean    Mean1   Mean2   Mean3   Mean4   …
Total info = P(P+1)/2 + Means (if included)
Estimations/Fit Functions
  Hypothesis: $\Sigma = S$ or $\Sigma - S = 0$
  Maximum Likelihood
  $F = \log|\Sigma| + \mathrm{trace}(S\Sigma^{-1}) - \log|S| - (p+q)$
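The behavior of the maximum likelihood fit function can be checked numerically. The sketch below evaluates F for two observed variables using hand-written 2x2 determinant and inverse helpers, so no external library is needed. The matrices and function names are illustrative assumptions, and `k` plays the role of p + q (the number of observed variables).

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def trace_of_product(a, b):
    """trace(A B) for 2x2 matrices, computed as sum over i, j of a_ij * b_ji."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

def f_ml(S, Sigma):
    """F = log|Sigma| + trace(S Sigma^-1) - log|S| - k."""
    k = len(S)  # number of observed variables
    return (math.log(det2(Sigma)) + trace_of_product(S, inv2(Sigma))
            - math.log(det2(S)) - k)

S = [[2.0, 0.8], [0.8, 1.5]]                  # illustrative sample covariance matrix
perfect = f_ml(S, S)                          # implied matrix reproduces S exactly
misfit = f_ml(S, [[2.0, 0.0], [0.0, 1.5]])    # implied matrix ignores the covariance
print(perfect, misfit)
```

F is zero (up to rounding) when the implied matrix equals S and grows as the two matrices diverge, which is what the iterative estimation exploits.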
Convergence -- Reaching Limit
  Minimize F by adjusting the unknown parameters through an iterative process
  Convergence value: difference in F between the last two iterations
  Default convergence = .0001
  Increase to help convergence (0.001 or 0.01)
  e.g. Analysis: convergence = .01;
No Convergence
  No unique parameter estimates
  Lack of degrees of freedom (under-identification)
  Variance of reference indicator too small
  Fixed parameters are left to be freely estimated
  Misspecified model
Absolute Fit Index
  $\chi^2 = F(N-1)$ (N = sample size)
  df = p(p+1)/2 – q
  p = number of observed variables, so p(p+1)/2 = number of variances and covariances (plus p means if a mean structure is included)
  q = number of unknown parameters to be estimated
  prob = ? (A nonsignificant $\chi^2$ indicates good fit. Why?)
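The degrees-of-freedom bookkeeping can be sketched in a few lines of Python. The function names and the two example models are illustrative assumptions; the parameter counts follow the reference-indicator scaling described earlier (one loading per factor fixed at 1).

```python
def model_df(p, q, with_means=False):
    """df = available information minus number of free parameters.

    p: number of observed variables
    q: number of freely estimated parameters
    """
    info = p * (p + 1) // 2 + (p if with_means else 0)
    return info - q

def chi_square(f_min, n):
    """Chi-square from the minimized fit function value: F * (N - 1)."""
    return f_min * (n - 1)

# One factor, three indicators: 2 free loadings + 1 factor variance + 3 error variances.
df_one_factor = model_df(p=3, q=6)

# Two correlated factors, six indicators:
# 4 free loadings + 2 factor variances + 1 factor covariance + 6 error variances.
df_two_factor = model_df(p=6, q=13)

chi2 = chi_square(f_min=0.05, n=201)  # hypothetical minimized F and sample size
print(df_one_factor, df_two_factor, chi2)
```

The three-indicator model is just-identified (df = 0), so its fit cannot be tested, while the six-indicator model leaves 8 degrees of freedom.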
Relative Fit: Relative to Baseline (Null) Model
  Fix all unknown parameters at 0
  Variables not related (all relation parameters = 0)
  Model implied covariances = 0
  Fit to sample covariance matrix S
  Obtain $\chi^2$, df, prob < .0000
Relative Fit Indices
  $\mathrm{CFI} = 1 - \frac{\chi^2 - df}{\chi^2_b - df_b}$
  b = baseline model
  (Comparative Fit Index, desirable ≥ .95; 95% better than the baseline model)
  $\mathrm{TLI} = \frac{\chi^2_b/df_b - \chi^2/df}{\chi^2_b/df_b - 1}$
  (Tucker-Lewis Index, desirable ≥ .90)
  $\mathrm{RMSEA} = \sqrt{\frac{\chi^2 - df}{n \cdot df}}$
  (Root Mean Square Error of Approximation, desirable ≤ .06; penalizes a large model with more unknown parameters)
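To make the three indices concrete, the following Python sketch computes them from chi-square values. All input numbers are hypothetical. Flooring the noncentrality term at 0, so that RMSEA stays defined and CFI does not exceed 1 when chi-square falls below df, is a common convention assumed here and is not stated on the slide.

```python
import math

def cfi(chi2, df, chi2_b, df_b):
    """Comparative Fit Index relative to the baseline model (b)."""
    return 1 - max(chi2 - df, 0) / (chi2_b - df_b)

def tli(chi2, df, chi2_b, df_b):
    """Tucker-Lewis Index."""
    return (chi2_b / df_b - chi2 / df) / (chi2_b / df_b - 1)

def rmsea(chi2, df, n):
    """Root Mean Square Error of Approximation."""
    return math.sqrt(max(chi2 - df, 0) / (n * df))

# Hypothetical results: tested model, baseline model, sample size.
chi2, df = 12.0, 8
chi2_b, df_b = 500.0, 15
n = 200

cfi_value = cfi(chi2, df, chi2_b, df_b)
tli_value = tli(chi2, df, chi2_b, df_b)
rmsea_value = rmsea(chi2, df, n)
print(round(cfi_value, 3), round(tli_value, 3), round(rmsea_value, 3))
```

With these inputs all three indices fall on the desirable side of the cutoffs listed above.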
Absolute Fit -- SRMR
  Standardized Root Mean Square Residual
  SRMR = difference between observed and implied covariances in standardized metric
  Desirable when < .08, but no consensus
  Does not penalize for number of model parameters, unlike RMSEA
Special Case A
[Figure: Verbal Aggression (indicators t4a3, t4a93, t4a94; errors e1-e3) and Physical Aggression (indicators t4a37, t4a57, t4a90; errors e4-e6) regressed on observed variable Sex; disturbances d1 and d2]
Special Case A
  Assumption: $x = \xi$
  $y = \tau_y + \Lambda_y \eta + \varepsilon$
  $\eta = \alpha + \Gamma x + \zeta$
Special Case B
[Figure: Verbal Aggression (indicators x1-x3; errors e1-e3) and Physical Aggression (indicators x4-x6; errors e4-e6) predicting observed variable Peer Status; disturbance d]
Special Case B
  Assumption: $y = \eta$
  $x = \tau_x + \Lambda_x \xi + \delta$
  $y = \alpha + \Gamma \xi + \zeta$
Other Special Cases of SEM
  Confirmatory Factor Analysis (measurement model only)
  Multiple & Multivariate Regression
  ANOVA / MANOVA (multigroup CFA)
  ANCOVA
  Path Analysis Model (no latent variables)
  Simultaneous Econometric Equations…
  Growth Curve Modeling
  …
EFA vs. CFA
[Figure: two path diagrams, "Exploratory Factor Analysis" and "Confirmatory Factor Analysis", each with Factor 1 and Factor 2, indicators x1-x6, and errors e1-e6]
Multiple Regression
[Figure: path diagram with predictors x1, x2, x3, outcome Y, and error e1]
ANCOVA
[Figure: path diagram with Group, Pretest1, and Pretest2 predicting Posttest1 and Posttest2; errors e1 and e2]
Multivariate Normality Assumption
  Observed data are summarized perfectly by the covariance matrix S (plus means M); S is thus an estimator of the population covariance matrix
Consequences of Violation
  Inflated $\chi^2$ and deflated CFI and TLI: reject plausible models
  Inflated standard errors: attenuate factor loadings and structural parameters
  (Cause: sample covariances were underestimated)
Accommodating Strategies
  Correcting fit
    Satorra-Bentler scaled $\chi^2$ & standard errors (estimator = mlm; in Mplus)
  Correcting standard errors
    Bootstrapping
  Transforming nonnormal variables
    Transforming into new normal indicators (undesirable)
  SEM with categorical variables
Satorra-Bentler Scaled $\chi^2$ & SE
  S-B $\chi^2$ = $d^{-1}$(ML-based $\chi^2$) (d = scaling factor that incorporates kurtosis)
  Effect: performs well with continuous data in terms of $\chi^2$, CFI, TLI, RMSEA, parameter estimates, and standard errors.
  Also works with certain categorical variables (see next slide)
  Analysis: estimator = MLM;
Workable Categorical Data
[Figure: bar chart of a five-category variable (categories 1 to 5)]
Nonworkable Categorical Data
[Figure: bar chart of a three-category variable (categories 1 to 3)]
Bootstrapping
  Original    btstrp1    btstrp2    …
  x   y       x   y      x   y
  1   5       5   3      1   3
  2   4       1   1      5   4
  3   3       3   2      4   1
  4   2       4   5      2   2
  5   1       2   4      3   5
  .   .       .   .      .   .
Limitation of Bootstrapping
  Assumption: Sample = Population
  Useful diagnostic tool
  Does not compensate for
    small or unrepresentative samples
    severely non-normal data
    absence of independent samples for cross-validation
  Analysis: Bootstrap = 500 (standard/residual);
  Output: stand cinterval;
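The resampling idea behind the bootstrap can be sketched in plain Python. The sketch below resamples whole (x, y) cases with replacement, recomputes a covariance in each of 500 draws, and reads a percentile interval off the sorted estimates. The data pairs, the seed, and all function names are illustrative assumptions; this is not how Mplus implements its bootstrap internally.

```python
import random
from statistics import mean

def covariance(pairs):
    """Sample covariance of a list of (x, y) cases, n - 1 denominator."""
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    return sum((x - mx) * (y - my) for x, y in pairs) / (len(pairs) - 1)

def bootstrap(pairs, statistic, draws=500, seed=2012):
    """Resample cases with replacement and return the sorted statistic values."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(pairs)
    estimates = []
    for _ in range(draws):
        resample = [pairs[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return sorted(estimates)

original = [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]
estimates = bootstrap(original, covariance)

# Percentile 95% interval from the empirical distribution of the estimates.
lower = estimates[int(0.025 * len(estimates))]
upper = estimates[int(0.975 * len(estimates)) - 1]
print(covariance(original), lower, upper)
```

With only five cases the interval is very wide, which illustrates the point that bootstrapping does not compensate for a small sample.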
Examining Group Differences in Latent Variables (MANOVA)
  $X_{g1} = \tau_{g1} + \Lambda_{g1}\xi_{g1} + \delta_{g1}$
  $X_{g2} = \tau_{g2} + \Lambda_{g2}\xi_{g2} + \delta_{g2}$
  $X_{g1} - X_{g2} = (\tau_{g1} - \tau_{g2}) + (\Lambda_{g1}\xi_{g1} - \Lambda_{g2}\xi_{g2}) + (\delta_{g1} - \delta_{g2})$
Imposing equality constraints on $\Lambda$ and using items with invariant loadings:
  $X_{g1} - X_{g2} = \Delta\tau + \Lambda(\xi_{g1} - \xi_{g2}) + (\delta_{g1} - \delta_{g2})$, where $\Delta\tau = \tau_{g1} - \tau_{g2}$
Given $E(\delta) = 0$, by assigning $\xi_{g1} = 0$:
  $X_{g1} - X_{g2} = \Delta\tau - \Lambda\xi_{g2}$
Measurement Invariance (Hierarchical restrictions)
  Configural invariance – same items
  Metric Factorial Invariance
    Weak – additional invariant loadings ($\Lambda$)
    Strong – additional invariant intercepts ($\tau$)
    Strict – additional invariant error variances ($\Theta$)
  (Steven & Reise, 1997)
Partial Invariance
  Majority of factor loadings invariant
  Variant factor loadings are allowed to be freely estimated across groups
Two Applications of Invariance Tests
  Develop unbiased test
  Examine group difference in latent variables
Advantages of Multigroup Analysis
  Test all parameters across groups
  Allow invariant variances across groups
  Large sample sizes
    How large is large enough? (Muthén & Muthén, 2002)
MIMIC Model
[Figure: latent factor F with covariates x1, x2, x3 as causes and indicators y1-y4 with errors e1-e4]
MIMIC Model for Examining Group Difference
  MIMIC = multiple indicators, multiple causes
  Indicators = functions of latent variable
  Controlling for the latent variable, a covariate should have no effects on indicators
  Significant covariate effects = biases in the levels
Assumptions of MIMIC Model
  Invariant factor loadings across subgroups
  Invariant variances (latent & observed)
  Small sample size
Multiple Programs Integrated
  SEM of both continuous and categorical variables
  Multilevel modeling
  Mixture modeling (identify hidden groups)
  Complex survey data modeling (stratification, clustering, weights)
  Modern missing data treatment
  Monte Carlo simulations
Types of Mplus Files
  Data (*.dat, *.txt)
  Input (specify a model, <=80 columns/line)
  Output (automatically produced)
  Plot
Data File Format
  Free
    Delimited by tab, space, or comma
    No missing values
    Default in Mplus
    Computationally slow with large data set
  Fixed
    Format = 3F3, 5F3.2, F5.1;
Mplus Input
  DATA: File = ?
  VARIABLE: Names=?; Usevar=?; Categ=?;
  ANALYSIS: Type = ?
  MODEL: (BY, ON, WITH)
  OUTPUT: Stand;
Model Specification in Mplus
  BY      Measured by       (F by x1 x2 x3 x4)
  ON      Regressed on      (y on x)
  WITH    Correlated with   (x with y)
  XWITH   Interact with     (inter | F1 xwith F2)
  PON     Pair ON           (y1 y2 pon x1 x2 = y1 on x1; y2 on x2)
  PWITH   Pair WITH         (x1 x2 pwith y1 y2 = x1 with y1; x2 with y2)
Default Specification
  Error or residual (disturbance)
  Covariance of exogenous variables in CFA
  Certain covariances of residuals (z2)
[Figure: residuals z1 and z2]
Practice
  Prepare two data files for Mplus
    Mediation.sav
    Aggress.sav
  Model Specification
    Single Group CFA
    Examine Mediation Effects in a Full SEM
    Run a MIMIC model of aggressions
    Multigroup CFA to examine measurement invariance
SPSS Data
  Missing Values?
    Leave as blank to use fixed format
    Recode into special number to use free format
  Save as & choose file type
    Fixed ASCII
    Free *.dat (with or without variable names?)
  Copy & paste variable names into Mplus input file
Stata2mplus
  Converting a Stata data file to *.dat
  Find out: http://www.ats.ucla.edu/stat/stata/faq/stata2mplus.htm
Graphic Model
[Figure: structural model with latent factors F1 (y1-y3), F2 (y4-y6), F3 (y7-y9), F4 (y10-y12), and F5 (y13-y15); disturbances d3, d4, d5]
Model Specification
Model:
  f1 by y1-y3;
  f2 by y4-y6;
  f3 by y7-y9;
  f4 by y10-y12;
  f5 by y13-y15;
  f3 on f1 f2;
  f4 on f2;
  f5 on f2 f3 f4;
MeaErrors are au
Modification Indices
  Lower bound estimate of the expected chi-square decrease
  Freely estimating a parameter fixed at 0
  Mplus: Output: stand Mod(10);
  Start with least important parameters (covariance of errors)
  Caution: justification?
Indirect (Mediation) Effect
  A*B
  Mplus specification:
    Model Indirect: DV IND Mediator IV;
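The product A*B can be illustrated with a short calculation. The Python sketch below multiplies two path coefficients and attaches a first-order delta-method (Sobel-type) standard error. The choice of that standard error formula and all numbers are assumptions made for illustration; the slides do not say which standard error is used.

```python
import math

def indirect_effect(a, b):
    """Indirect effect of IV on DV through the mediator: product of the two paths."""
    return a * b

def delta_method_se(a, se_a, b, se_b):
    """First-order delta-method standard error of a*b (assumed formula)."""
    return math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)

# Hypothetical estimates: A = IV -> Mediator, B = Mediator -> DV.
a, se_a = 0.5, 0.1
b, se_b = 0.4, 0.1

effect = indirect_effect(a, b)
se = delta_method_se(a, se_a, b, se_b)
z = effect / se
print(round(effect, 3), round(se, 3), round(z, 2))
```

Because the sampling distribution of a product is typically skewed, bootstrap confidence intervals are often preferred over this z ratio for testing an indirect effect.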
Model Comparison
  Model:
    Probabilistic statement about the relations of variables
    Imperfect but useful
  Models Differ:
    Different variables and different relations
    Same variables but different relations
Nested Model
  A nested model (b) comes from the general model (a) by
    Removing a parameter (e.g. a path)
    Fixing a parameter at a value (e.g. 0)
    Constraining a parameter to be equal to another
  Both models have the same variables
Equality Constraints in Mplus
  Parameter labels:
    Numbers
    Letters
    Combination of numbers and letters
  Constraint (B=A)
    F3 on F1 (A);
    F3 on F2 (A);
Test If A=B
[Figure: the structural model with factors F1-F5 (indicators y1-y15; disturbances d3, d4, d5), with the two paths under comparison labeled A and B]
Model Comparison via $\chi^2$ Difference
  $\chi^2$ =      df =      (Nested model)
  $\chi^2$ =      df =      (Default model)
  ___________________________________
  $\chi^2_{dif}$ =      $df_{dif}$ =      p = ? (a single tail)
Find p value at the following website:
http://www.tutor-homework.com/statistics_tables/statistics_tables.html
Conclusion: If p > .05, there is no significant difference in fit between the default model and the nested model; that is, the hypothesis that the constrained parameters are equal is not rejected.
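The p value of a chi-square difference can also be computed directly rather than looked up. The Python sketch below uses closed-form expressions for the upper tail of the chi-square distribution with integer degrees of freedom, so it needs only the standard library. The chi-square values in the example are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with df degrees of freedom > x)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction series.
    root = math.sqrt(x)
    total = math.erfc(root / math.sqrt(2.0))
    term = math.sqrt(2.0 / math.pi) * math.exp(-half) * root
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total

def chi2_difference_test(chi2_nested, df_nested, chi2_default, df_default):
    """Return (chi-square difference, df difference, p value)."""
    diff = chi2_nested - chi2_default
    df_diff = df_nested - df_default
    return diff, df_diff, chi2_sf(diff, df_diff)

# Hypothetical results: one equality constraint (A = B) adds one df.
diff, df_diff, p = chi2_difference_test(15.2, 9, 12.0, 8)
print(round(diff, 2), df_diff, round(p, 3))
```

In this hypothetical case p exceeds .05, so the equality constraint does not significantly worsen fit.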
Other Comparison Criteria
  AIC difference: $\chi^2_1 - \chi^2_2 - 2(df_1 - df_2) = \Delta\chi^2 - 2(\Delta df)$ (as in the $\chi^2_{dif}$ test)
  BIC
  Smaller is better
  Difference > 2
Practice
  Test if effect A=B
Run CFA with Real Data
[Figure: Verbal Aggression (indicators a3, a93, a94; errors e1-e3) and Physical Aggression (indicators a37, a57, a90; errors e4-e6)]
Multigroup Analysis
VARIABLE: USEVAR = X1 X2 X3 X4;
          Grouping IS sex (0=F 1=M);
ANALYSIS: TYPE = MISSING H1;
MODEL: F1 BY X1 - X4;
MODEL M: F1 BY X2 - X4;
Note: sex is grouping variable and is not used in the model.
Test Measurement Invariance: Default Model
Model:
  F1 By a3 a93 (1) a94 (2);
  F2 By a37 a57 (3) a90 (4);
Model M:
  F1 By a93 () a94 ();
  F2 By a57 () a90 ();
Output: stand;
Note: Reference indicators in the second group are omitted.
Test Measurement Invariance: Constrained Model
Model:
  F1 By a3 a93 (1) a94 (2);
  F2 By a37 a57 (3) a90 (4);
Model M:
  F1 By a93 (1) a94 (2);
  F2 By a57 (3) a90 (4);
Output: stand;
Note: Reference indicators in the second group are omitted.
Estimate with Real Data
[Figure: MIMIC model with Verbal Aggression (indicators a3, a93, a94; errors e1-e3) and Physical Aggression (indicators a37, a57, a90; errors e4-e6) regressed on Sex, Race1, and Race2; disturbances d1 and d2]
SEM with Categorical Indicators
Session II
Problems of Ordinal Scales
  Not truly interval measure of a latent dimension, having measurement errors
  Limited range, biased against extreme scores
  Items are equally weighted (implicitly by 1) when summed up or averaged, losing item sensitivity
Criticisms on Using Ordinal Scales as Measures of Latent Constructs
  Stevens (1951): …means should be avoided because its meaning could be easily interpreted beyond ranks.
  Merbitz (1989): Ordinal scales and foundations of misinference
  Muthén (1983): Pearson product moment correlations of ordinal scales will produce distorted results in structural equation modeling.
  Wright (1998): “…misuses nonlinear raw scores or Likert scales as though they were linear measures will produce systematically distorted results. …It’s not only unfair, it is immoral.”
Assumption of Categorical Indicators
  A categorical indicator is a coarse categorization of a normally distributed underlying dimension
Latent (Polychoric) Correlation
Categorization of Latent Dimension & Threshold
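[Figure: a continuous latent dimension Y cut by thresholds into response categories, shown for a binary item (No, Yes), a three-category item (Never, Sometimes, Often), and a five-category item (1 to 5); categories m-1 and m are adjacent]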
Threshold
  The value of a latent dimension at which respondents have a 50% probability of responding in either of two adjacent categories
  Number of thresholds = response categories – 1, e.g. a binary variable has one threshold.
  Mplus specification: [x$1] [y$2];
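One way to see where thresholds come from is to compute them from observed category proportions. The Python sketch below assumes a standard normal underlying dimension and places each threshold at the normal quantile of the cumulative proportion below it. This is a simplified univariate illustration with hypothetical function names; the counts are taken from the collapsed Female row of the inconsistent-categories example later in this session, and the procedure is not claimed to match Mplus's estimator.

```python
from statistics import NormalDist

def thresholds(counts):
    """Thresholds implied by category counts under a standard normal latent dimension.

    A variable with k categories yields k - 1 thresholds.
    """
    total = sum(counts)
    standard_normal = NormalDist()
    cumulative = 0
    result = []
    for count in counts[:-1]:  # the last category has no upper threshold
        cumulative += count
        result.append(standard_normal.inv_cdf(cumulative / total))
    return result

four_category = thresholds([57, 86, 32, 18])
binary = thresholds([50, 50])
print([round(t, 2) for t in four_category], binary)
```

A category with zero observations at either end would push a cumulative proportion to 0 or 1, where the normal quantile is undefined, which is one reason empty cells need to be collapsed first.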
Normal Cumulative Distributions
Measurement Models of Categorical Indicators (2P IRT)
  Probit: $P(u = 1 \mid \eta) = \Phi[(-\tau + \lambda\eta)\,\theta^{-1/2}]$
  (Estimation = Weighted Least Squares with df adjusted for Means and Variances)
  Logistic: $P(u = 1 \mid \eta) = \frac{1}{1 + e^{-(-\tau + \lambda\eta)}}$
  (Maximum Likelihood Estimation)
Converting CFA to IRT Parameters
  Probit Conversion
    $a = \lambda\theta^{-1/2}$
    $b = \tau/\lambda$
  Logit Conversion
    $a = \lambda/D$ (D = 1.7)
    $b = \tau/\lambda$
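The conversions can be sketched as two small Python functions. The sketch assumes a factor with mean 0 and variance 1, and in the probit example it assumes the residual variance equals 1 minus the squared standardized loading. The function names and numbers are hypothetical.

```python
import math

def probit_to_irt(loading, threshold, residual_variance):
    """Probit conversion: a = loading / sqrt(residual variance), b = threshold / loading."""
    a = loading / math.sqrt(residual_variance)
    b = threshold / loading
    return a, b

def logit_to_irt(loading, threshold, scaling=1.7):
    """Logit conversion: a = loading / D, b = threshold / loading, with D = 1.7."""
    a = loading / scaling
    b = threshold / loading
    return a, b

# Hypothetical standardized probit estimates: loading .8, threshold .4.
a_probit, b_probit = probit_to_irt(0.8, 0.4, residual_variance=1 - 0.8 ** 2)

# Hypothetical logit estimates: loading 1.7, threshold 0.85.
a_logit, b_logit = logit_to_irt(1.7, 0.85)
print(round(a_probit, 3), round(b_probit, 3), round(a_logit, 3), round(b_logit, 3))
```

In both metrics the difficulty b is the threshold rescaled by the loading, so items with the same threshold but stronger loadings have difficulties closer to 0.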
Sample Information
  Latent correlation matrix
    equivalent to covariance matrix of continuous indicators
  Threshold matrix Δ
    equivalent to means of continuous indicators
One Parameter Item Response Theory Model
  Analysis: Estimator = ML;
  Model:
    F by [email protected]
    [email protected]
    …
Stages of Estimation
  Sample information: correlations/thresholds/intercepts (Maximum Likelihood)
  Correlation structure (Weighted Least Squares)
$F = \sum_{g=1}^{G} (s^{(g)} - \sigma^{(g)})' \, W^{(g)-1} (s^{(g)} - \sigma^{(g)})$
$W^{-1}$ matrix
  Elements:
    S1 intercepts and/or thresholds
    S2 slopes
    S3 residual variances and correlations
  $W^{-1}$: divided by sample size
Estimation
  WLSMV:
  Weighted Least Squares estimation with degrees of freedom adjusted for Means and Variances of latent and observed variables
Baseline Model
  Freely estimated thresholds of all the categorical indicators
  df = $p^2 - 3p$ (p = 3 of polychoric correlations)
Multigroup Analysis
VARIABLE: USEVAR = X1 X2 X3 X4;
          Grouping IS sex (0=F 1=M);
ANALYSIS: TYPE = MISSING H1;
MODEL: F1 BY X1 - X4;
MODEL M: F1 BY X2 - X4;
Data Preparation Tip
  Categorical indicators are required to have consistent response categories across groups
  Run Crosstab to identify zero cells
  Recode variables to collapse certain categories to eliminate zero cells
Inconsistent Categories
            1     2     3     4     5
  Male      60    80    43    4     0
  Female    57    86    32    16    2
After collapsing categories 4 and 5:
            1     2     3     4
  Male      60    80    43    4
  Female    57    86    32    18
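The recoding step can be sketched in Python using the counts from the two tables. The function below merges the highest category into the one below it for every group as long as any group has an empty top category. Collapsing only from the top is an assumption chosen to match this example; zero cells elsewhere would need a different rule, and the function name is hypothetical.

```python
def collapse_empty_top(table):
    """Merge the top category downward while any group has a zero count in it.

    table: dict mapping group name to a list of counts per response category.
    Returns a new dict; the input is left unchanged.
    """
    rows = {group: list(counts) for group, counts in table.items()}
    while (len(next(iter(rows.values()))) > 2
           and any(counts[-1] == 0 for counts in rows.values())):
        for counts in rows.values():
            last = counts.pop()   # remove the top category
            counts[-1] += last    # and add its cases to the category below
    return rows

crosstab = {
    "Male": [60, 80, 43, 4, 0],
    "Female": [57, 86, 32, 16, 2],
}
collapsed = collapse_empty_top(crosstab)
print(collapsed)
```

Both groups end up with the same four categories, and the group totals are unchanged.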
Test Measurement Invariance: Default Model
Model:
  F1 By a3 a93 (1) a94 (2);
  F2 By a37 a57 (3) a90 (4);
Model M:
  F1 By a93 () a94 ();
  F2 By a57 () a90 ();
Output: stand;
Savedata: difftest is agg.dat;
Specify Dependent Variables as Categorical
  Variable:
    Categ = x1-x3;
    Categ = all;
Model Comparison with Categorical Dependent Variables
  1. Run H0 model with the following at the end of input file:
     Savedata: difftest is test.dat;
  2. Run a nested model H1 with equality constraint(s) on parameter(s), with the following in the input file:
     Analysis: difftest is test.dat;
  3. Examine the chi-square difference test in the output of the H1 model
Test Measurement Invariance: Nested Model
Analysis: type = missing h1;
          difftest is agg.dat;
Model:
  F1 By a3 a93 (1) a94 (2);
  F2 By a37 a57 (3) a90 (4);
Model M:
  F1 By a93 (1) a94 (2);
  F2 By a57 (3) a90 (4);
Output: stand;
Reporting Results
Guidelines:
  Conceptual model
  Software + version
  Data (continuous or categorical?)
  Treatment of missing values
  Estimation method
  Model fit indices ($\chi^2$(df), p, CFI, TLI, RMSEA)
  Measurement properties (factor loadings + reliability)
  Structural parameter estimates (estimate, significance, 95% confidence intervals) ($\beta$ = .23*, CI = .18~.28)
Reliability of Categorical Indicators (variance approach)
$\omega = \frac{(\sum \lambda_i)^2}{(\sum \lambda_i)^2 + \sum \psi_i^2}$, where
  $(\sum \lambda_i)^2$ = square of the sum of standardized factor loadings
  $\sum \psi_i^2$ = sum of residual variances
  i = items or indicators
  $\psi_i^2 = 1 - \lambda_i^2$
McDonald, R. P. (1999). Test theory: A unified treatment (p. 89). Mahwah, New Jersey: Lawrence Erlbaum Associates.
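The variance-approach reliability can be computed from standardized loadings alone. The Python sketch below takes each residual variance as 1 minus the squared loading, as stated above. The function name and the three loadings are hypothetical.

```python
def reliability_from_loadings(loadings):
    """Squared sum of standardized loadings over that quantity plus summed residual variances."""
    squared_sum = sum(loadings) ** 2
    residual_sum = sum(1 - loading ** 2 for loading in loadings)
    return squared_sum / (squared_sum + residual_sum)

# Hypothetical standardized loadings of three categorical indicators.
loadings = [0.7, 0.8, 0.6]
reliability = reliability_from_loadings(loadings)
print(round(reliability, 3))
```

The value rises with stronger loadings and with more indicators, since the squared sum in the numerator grows faster than the residual sum.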
Calculator of Reliability (Categorical Indicators)
  SPSS reliability data
  SPSS reliability syntax
Interactions in SEM
  Observed or Latent
  Categorical or Continuous
  Nine possible combinations
  Treatment: see User's Guide
Troubleshooting Strategy
  Start with one part of a big model
  Ensure every part works
  Estimate all parts simultaneously
Important Resources
  Mplus Website: www.statmodel.com
  Papers: http://www.statmodel.com/papers.shtml
  Mplus discussions: http://www.statmodel.com/cgi-bin/discus/discus.cgi