Introduction to Mplus
Piotr Wilk
piotr.wilk@schulich.uwo.ca
May 12, 2010
SPONSORED BY:
Research Data Centre
Population and Life Course Studies (PLCS) Interdisciplinary Development Initiative
OVERVIEW
• Mplus modeling framework
• Mplus language
• Examples of research using Mplus
WHAT IS MPLUS?
• Statistical modeling program for structural equation modeling... and more
• Extremely flexible modeling framework:
  – Multiple types of data formats (variables)
  – Multiple types of statistical models (relationships)
• Code-based path-centric specification
MODELING FRAMEWORK
• Describes the structure of the data (rectangles and circles)
• Describes relationships between variables (arrows)
• Acknowledges complex data structures:
  – Multilevel and multiple population data
DATA STRUCTURE
• Observed variables (rectangles)
• Latent variables (circles)
• Combinations
OBSERVED OUTCOME VARIABLES
• Continuous (y)
• Categorical (u):
  – Censored
  – Binary
  – Ordered categorical (ordinal)
  – Unordered categorical (nominal)
  – Counts
• Combinations:
  – Single model (y with u)
LATENT VARIABLES
• Continuous latent variables (f):
  – Continuous indicators
  – Categorical indicators
• Categorical latent variables (c):
  – Measurement model
  – Group membership
RELATIONSHIPS: OBSERVED VARIABLES
• Linear regression for continuous outcomes
• Probit or logistic regression for binary outcomes
• Poisson or zero-inflated regression for count outcomes
• Simultaneous modeling of several related relationships:
  – Path analysis
RELATIONSHIPS: LATENT VARIABLES
• Continuous latent variables:
  – Structural equation modeling
• Categorical latent variables:
  – Mixture modeling
  – Latent class analysis
• Latent variable interactions
RELATIONSHIPS: ALL VARIABLES
• Ability to combine all types of variables and all types of relationships into a single analytical framework
MODELING POSSIBILITIES: EXAMPLES
• Complex survey data
• Multiple group analysis
• Multilevel modeling
• Mixture modeling
• Latent class analysis
• Longitudinal data analysis
• Modeling with missing data
• Monte Carlo simulations
• And more…
COMPLEX SURVEY DATA
• Adjustment of standard errors:
  – Takes into account stratification and/or non-independence of observations
  – Unequal probabilities of selection (sampling weights)
• Multilevel framework:
  – Specify a separate model for each level of the multilevel data
• Both approaches can be combined
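As a rough illustration of the standard-error adjustment described above, the sketch below declares the design variables in the VARIABLE command and requests TYPE = COMPLEX. The file name and the variable names (strat, psu, wt) are hypothetical and not taken from the presentation.

```mplus
TITLE: Regression with complex survey adjustments (illustrative)
DATA: FILE IS survey.dat;        ! hypothetical file name
VARIABLE:
  NAMES ARE y x1 x2 strat psu wt;
  USEVARIABLES ARE y x1 x2;      ! analysis variables only
  STRATIFICATION IS strat;       ! stratum identifier
  CLUSTER IS psu;                ! primary sampling unit
  WEIGHT IS wt;                  ! sampling weight
ANALYSIS: TYPE = COMPLEX;
MODEL: y ON x1 x2;
```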
MULTILEVEL MODELING
• Multilevel models separate the overall variance into two sources:
  – Within (individual-level variation)
  – Between (group-level variation)
• Allows random intercepts and random slopes
• Random effects can be specified for any relationship
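A minimal sketch of a two-level model with a random intercept and a random slope, assuming a hypothetical data file with an individual-level covariate x, a group-level covariate w, and a cluster identifier clus:

```mplus
TITLE: Two-level regression with a random slope (illustrative)
DATA: FILE IS twolevel.dat;      ! hypothetical file name
VARIABLE:
  NAMES ARE y x w clus;
  WITHIN = x;                    ! modeled on the within level only
  BETWEEN = w;                   ! modeled on the between level only
  CLUSTER = clus;
ANALYSIS: TYPE = TWOLEVEL RANDOM;
MODEL:
  %WITHIN%
  s | y ON x;                    ! s is the random slope of y on x
  %BETWEEN%
  y s ON w;                      ! random intercept and slope regressed on w
  y WITH s;                      ! intercept-slope covariance
```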
MIXTURE MODELING
• Modeling with categorical latent variables
• Represents subpopulations where membership is not known but is inferred from the data
LATENT CLASS ANALYSIS
• A special case of mixture modeling• Explains relationships among observed
dependent variables• Provides classification of individuals into
more homogenous sub-groups
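A small sketch of a two-class latent class analysis with four binary indicators. The file names and the number of classes are assumptions chosen for illustration. With default starting values no MODEL command is needed, and the SAVEDATA command writes each individual's class probabilities and most likely class.

```mplus
TITLE: Latent class analysis with binary indicators (illustrative)
DATA: FILE IS lca.dat;           ! hypothetical file name
VARIABLE:
  NAMES ARE u1-u4;
  CATEGORICAL = u1-u4;           ! binary indicators
  CLASSES = c (2);               ! one latent class variable, two classes
ANALYSIS: TYPE = MIXTURE;
SAVEDATA:
  FILE IS lca_classes.dat;       ! hypothetical output file
  SAVE = CPROBABILITIES;         ! posterior class probabilities
```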
LONGITUDINAL DATA ANALYSIS
• Broad class of statistical methods for longitudinal data
• Latent growth curve analysis
  – Resembles classic confirmatory factor analysis
• Multilevel modeling
MODELING WITH MISSING DATA
• Several options for estimating models with missing data
• Estimation based on two assumptions:
  – Missing completely at random
  – Missing at random
• Non-ignorable missing data modeling:
  – Categorical outcomes as indicators of missingness
• Generates and analyzes multiple data sets using multiple imputation
• Computes bootstrapped standard errors
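The multiple imputation feature can be sketched roughly as follows. The file names, the missing-value code -99, and the number of imputed data sets are assumptions made for illustration.

```mplus
TITLE: Multiple imputation of missing values (illustrative)
DATA: FILE IS study.dat;         ! hypothetical file name
VARIABLE:
  NAMES ARE y1-y4 x1 x2;
  MISSING ARE ALL (-99);         ! -99 is an assumed missing-value code
DATA IMPUTATION:
  IMPUTE = y1-y4 x1 x2;          ! variables to be imputed
  NDATASETS = 10;                ! number of imputed data sets
  SAVE = studyimp*.dat;          ! asterisk is replaced by the data set number
ANALYSIS: TYPE = BASIC;
```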
MONTE CARLO SIMULATIONS
• Extensive Monte Carlo facilities for data generation and data analysis
• Generates several types of data based on specified parameters
• Can be used for power analysis
• Other Monte Carlo features:
  – Saving generated data and parameter estimates
  – Analytical results from each replication can be saved in an external file
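A minimal sketch of a Monte Carlo study for a simple regression, with invented population values (slope 0.3, residual variance 0.91), sample size, and number of replications. In a set-up like this, the proportion of replications in which the slope is significant can be read as an estimate of power.

```mplus
TITLE: Monte Carlo study for a linear regression (illustrative)
MONTECARLO:
  NAMES ARE y x;
  NOBSERVATIONS = 200;           ! assumed sample size per replication
  NREPS = 500;                   ! assumed number of replications
  SEED = 12345;
  RESULTS = mc_results.dat;      ! estimates from each replication
MODEL POPULATION:                ! data-generating values
  [x@0]; x@1;
  y ON x*0.3;
  [y*0]; y*0.91;
MODEL:                           ! analysis model
  y ON x*0.3;
  [y*0]; y*0.91;
```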
OTHER USEFUL FEATURES
• Indirect effects (specific paths)
• Bootstrap standard errors and confidence intervals
• Robust estimation of standard errors and chi-square tests for model fit
• And more…
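As a sketch of how a specific indirect effect and bootstrap confidence intervals might be requested together, assuming a hypothetical mediation data set with outcome y, mediator m, and predictor x:

```mplus
TITLE: Indirect effect with bootstrap intervals (illustrative)
DATA: FILE IS mediation.dat;     ! hypothetical file name
VARIABLE: NAMES ARE y m x;
ANALYSIS: BOOTSTRAP = 1000;      ! assumed number of bootstrap draws
MODEL:
  m ON x;
  y ON m x;
MODEL INDIRECT:
  y IND m x;                     ! effect of x on y through m
OUTPUT: CINTERVAL (BCBOOTSTRAP); ! bias-corrected bootstrap intervals
```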
COMMAND STRUCTURE
• Mplus is a command-based program
• There are ten sets of Mplus commands:
  – TITLE:
  – DATA:
  – VARIABLE:
  – DEFINE:
  – ANALYSIS:
  – MODEL:
  – OUTPUT:
  – SAVEDATA:
  – PLOT:
  – MONTECARLO:
GENERAL RULES
• All commands must begin on a new line and must be followed by a colon (:)
• Some commands have numerous subcommands
• Semicolons (;) separate subcommands
• Individual lines of code cannot exceed 80 characters
• Not case sensitive (only variable names are case sensitive)
• Exclamation mark in front (!) serves as a comment character
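The rules above can be seen together in a short skeleton input file. The file and variable names are hypothetical.

```mplus
! A line that starts with an exclamation mark is a comment
TITLE: Skeleton input file (illustrative)
DATA: FILE IS mydata.dat;        ! each command ends with a colon
VARIABLE:
  NAMES ARE y x1 x2;             ! each subcommand ends with a semicolon
  USEVARIABLES ARE y x1;
model: y on x1;                  ! keywords may be upper or lower case
OUTPUT: SAMPSTAT;
```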
TITLE COMMAND
• Specifies a title that will be printed on each page of the output file
• No limit on length
DATA COMMAND
• Specifies where the data file is located and the format of the data
• Records may be in free format or fixed format
• Accepts covariance or correlation matrices
• Data files from other statistical packages have to be converted:
  – SAS and SPSS: fixed format ASCII file
  – Stata: stata2mplus command
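Two hedged sketches of the DATA command follow, one for individual records in free format and one for summary data. The file names and the sample size are assumptions. The second alternative is commented out so that the fragment contains only one DATA command.

```mplus
! Alternative 1: individual data, free format
DATA:
  FILE IS mydata.dat;
  FORMAT IS FREE;

! Alternative 2: a covariance matrix as input
! DATA:
!   FILE IS mycov.dat;
!   TYPE IS COVARIANCE;
!   NOBSERVATIONS = 500;
```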
DEFINE COMMAND
• Allows for transformation and creation of new variables
• Supports a large number of transformation functions
• Allows for conditional statements
  – Selection of observations
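A brief sketch of a transformation and a conditional statement in the DEFINE command, using hypothetical variables x and age. Variables created in DEFINE are listed at the end of the USEVARIABLES list.

```mplus
TITLE: Creating variables with DEFINE (illustrative)
DATA: FILE IS mydata.dat;        ! hypothetical file name
VARIABLE:
  NAMES ARE y x age;
  USEVARIABLES ARE y logx senior;
DEFINE:
  logx = LOG(x);                     ! natural log transformation
  IF (age GE 65) THEN senior = 1;    ! conditional statements
  IF (age LT 65) THEN senior = 0;
MODEL: y ON logx senior;
```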
ANALYSIS COMMAND
• Specifies analysis type(s) and estimation procedure
• Many estimation options are available
• Some analyses require additional commands
MODEL COMMAND: OVERVIEW
• Specifies the parameters of the model
• Models are built in terms of relationships between variables:
  – Variable RELATIONSHIP Variable
MODEL COMMAND: RELATIONSHIPS
• BY keyword ("measured by"):
  – Defines the latent variables
• ON keyword ("regressed on"):
  – Structural path between variables
• WITH keyword ("correlated with"):
  – Correlation between two variables
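The three keywords can be combined in one small model, sketched here with hypothetical indicators y1-y6:

```mplus
TITLE: BY, ON, and WITH in one model (illustrative)
DATA: FILE IS mydata.dat;        ! hypothetical file name
VARIABLE: NAMES ARE y1-y6;
MODEL:
  f1 BY y1-y3;                   ! f1 is measured by y1, y2, y3
  f2 BY y4-y6;                   ! f2 is measured by y4, y5, y6
  f2 ON f1;                      ! f2 is regressed on f1
  y1 WITH y4;                    ! residual covariance of y1 and y4
```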
MODEL COMMAND: PARAMETERS
• Variances:
  – Variable name without brackets
• [Means] or thresholds [catvar$1]:
  – Variable name inside square brackets
• {Scale factors}:
  – Variable name in curly brackets
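A sketch of how these parameter types might appear in a MODEL command, assuming three continuous indicators y1-y3 and one binary indicator u1 (all hypothetical). The scale factor line is meant for the Delta parameterization used with weighted least squares estimation of categorical outcomes.

```mplus
TITLE: Variances, means, thresholds, scale factors (illustrative)
DATA: FILE IS mydata.dat;        ! hypothetical file name
VARIABLE:
  NAMES ARE y1-y3 u1;
  CATEGORICAL = u1;
MODEL:
  f1 BY y1* y2 y3 u1;            ! asterisk frees the first loading
  f1@1;                          ! factor variance fixed at 1
  y1-y3;                         ! residual variances of y1 to y3
  [y1-y3];                       ! intercepts of y1 to y3
  [u1$1];                        ! first threshold of u1
  {u1@1};                        ! scale factor of u1 fixed at 1
```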
OUTPUT COMMAND
• Specifies optional outputs to be generated
• Mplus creates an output file using the extension .out (text file)
• Specific elements of output can be included or suppressed
SAVEDATA COMMAND
• Determines what to save in new text files
• Analysis dependent outputs:
  – Datasets
  – Parameter estimates
  – Latent class memberships
  – Cook's distances or "influence" statistics
PLOT COMMAND
• Provides graphical displays of observed data and results:
  – Histograms / scatterplots
  – Individual observed and estimated values
  – Sample and estimated means and proportions/probabilities
• Available for:
  – Total sample
  – By group / class
  – Adjusted for covariates
• Editing and exporting of plots
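The OUTPUT, SAVEDATA, and PLOT commands are typically appended to a model, as in the rough sketch below. The two-factor model and the file names are hypothetical.

```mplus
TITLE: Requesting output, saved data, and plots (illustrative)
DATA: FILE IS mydata.dat;        ! hypothetical file name
VARIABLE: NAMES ARE y1-y6;
MODEL:
  f1 BY y1-y3;
  f2 BY y4-y6;
OUTPUT: SAMPSTAT STANDARDIZED MODINDICES;
SAVEDATA:
  FILE IS fscores.dat;           ! hypothetical output file
  SAVE = FSCORES;                ! factor scores for f1 and f2
PLOT: TYPE = PLOT3;
```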
DEFAULTS
• The command language is set up with defaults to minimize the amount of text
• Version-specific defaults
• Example: Missing data
  – Mplus assumes either that there are no missing values or, when missing values are declared, that they are missing at random (FIML estimation)
  – Listwise deletion must be requested under the DATA command
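A short sketch of overriding the missing data default, assuming a hypothetical file in which -99 marks missing values:

```mplus
TITLE: Requesting listwise deletion (illustrative)
DATA:
  FILE IS mydata.dat;            ! hypothetical file name
  LISTWISE = ON;                 ! drop cases with any missing value
VARIABLE:
  NAMES ARE y x1 x2;
  MISSING ARE ALL (-99);         ! -99 is an assumed missing-value code
MODEL: y ON x1 x2;
```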
SUMMARY: PROS
• Many great features not available in other packages
• Ability to combine various data types
• Path-centric specification:
  – Relatively intuitive and easy to learn
  – Extensions to larger models are easy to implement
• Commitment to development
• Excellent support
SUMMARY: CONS
• Cost: Mplus is a commercial package
• Annual fee: Support and updates
• Matrix specification is not supported
• No data management beyond Monte Carlo capabilities, transformations, and selection of observations
ADDITIONAL RESOURCES
• Technical and theoretical support:
  – Homepage: www.statmodel.com
  – Discussion forum: www.statmodel.com/cgi-bin/discus/discus.cgi
• Online manuals and tutorials
• Other websites:
  – http://www.ats.ucla.edu/stat/mplus/
MPLUS COMMERCIAL VERSION
• Current version: 6.0 (new!)
• Base Program: 595 USD
• Mixture "add-on": 745 USD
• Multilevel "add-on": 745 USD
• Combination "add-on": 895 USD
MPLUS DEMO VERSION
• Free version of the software
• www.statmodel.com/demo.shtml
• Limit on the number of variables:
  – 2 independent variables
  – 6 dependent variables
CONCLUSION
• Advantages and disadvantages of using only one program
• Each program has strengths and weaknesses
• Use the correct one for the problem at hand
EXAMPLES
• Path analysis (3.11)
• Structural equation modeling (5.11)
• Latent growth curve analysis
  – Quadratic growth (6.9)
  – Parallel processes (6.13)
• Mixture modeling (7.1)
• Advanced models
  – Latent class growth curve analysis
  – Complier average causal effect
PATH ANALYSIS
TITLE: Path analysis with continuous dependent variables
DATA: FILE IS ex3.11.dat;
VARIABLE: NAMES ARE y1-y3 x1-x3;
MODEL: y1 y2 ON x1 x2 x3;   ! y1 and y2 regressed on all three covariates
       y3 ON y1 y2 x2;      ! y3 regressed on y1, y2, and x2
STRUCTURAL EQUATION MODEL
TITLE: SEM with continuous indicators
DATA: FILE IS ex5.11.dat;
VARIABLE: NAMES ARE y1-y12;
MODEL: f1 BY y1-y3;      ! measurement part: three indicators per factor
       f2 BY y4-y6;
       f3 BY y7-y9;
       f4 BY y10-y12;
       f4 ON f3;         ! structural part
       f3 ON f1 f2;
LATENT GROWTH MODEL
TITLE: Quadratic growth model
DATA: FILE IS ex6.9.dat;
VARIABLE: NAMES ARE y11-y14;
MODEL: i s q | y11@0 y12@1 y13@2 y14@3;   ! intercept, linear, quadratic
PLOT: TYPE IS PLOT3;
      SERIES = y11 (0) y12 (1) y13 (2) y14 (3);
LATENT GROWTH MODEL
TITLE: Growth model for two parallel processes
DATA: FILE IS ex6.13.dat;
VARIABLE: NAMES ARE y11-y14 y21-y24;
MODEL: i1 s1 | y11@0 y12@1 y13@2 y14@3;   ! first process
       i2 s2 | y21@0 y22@1 y23@2 y24@3;   ! second process
       s1 ON i2;    ! each slope regressed on the other intercept
       s2 ON i1;
MIXTURE MODEL
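TITLE: Mixture regression analysis
DATA: FILE IS ex7.1.dat;
VARIABLE: NAMES ARE y x1 x2;
          CLASSES = c (2);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  y ON x1 x2;
  c ON x1;       ! class membership regressed on x1
  %c#2%
  y ON x2;       ! class-specific slope in class 2
  y;             ! class-specific residual variance in class 2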
GROWTH MIXTURE MODEL
COMPLIER AVERAGE CAUSAL EFFECT
[Path diagram with nodes: Outcome, Covariates, Compliance, Treatment]