Northern Arizona University

W. Martin-EPS 725-09

Department of Educational Psychology

EPS 725: Multivariate Statistics

Structural Equation Modeling (SEM)

(Also, analysis of moment structures, analysis of covariance structures, or causal modeling.)

Structural Model Example

Example (Keith, 2006, p. 333)

The purpose of this study is to examine the effects of peer rejection on kindergarten students’ academic and emotional adjustment; it is based upon a study by Buhs and Ladd (2001). These data (N = 399) are simulated to be consistent with the Buhs and Ladd study. The structural model includes a measurement model (similar to CFA) in which four latent variables are each measured by two of the 8 observed variables. The structural model also includes hypotheses about the effects of one latent variable on another, along with disturbances for the endogenous latent variables in the model. Disturbances (d) represent all influences on the dependent variables other than those shown in the model. The unique-error variances (r) of the measured variables represent all influences on the measured variables beyond the influence of the latent variable (Keith, 2006).

Study Variables

Rejection (latent variable) – averaged sociometric ratings (observed variable) for each child by the other children in class and number of nominations (observed variable) as a child who others did not want to play with.

Classroom Participation (latent variable) – change from a previous rating in teacher ratings of cooperative participation (observed variable) and autonomous participation (observed variable).

Achievement Adjustment (latent variable) was measured using the Metropolitan Readiness Test (MRT) Quantitative scale (observed variable) and Language scale (observed variable).

Emotional Adjustment (latent variable) was measured by student self-ratings of loneliness (observed variable) and their desire to avoid school (observed variable).

Research Questions

Adequacy of the Model (Model Fit)

1. Is the estimated population covariance matrix generated by the initial model consistent with the sample covariance matrix of the data?

2. Is the estimated population covariance matrix generated by the modified model consistent with the sample covariance matrix of the data?

Coefficient Effects

3. Are there significant direct effects of Rejection on kindergarten students’ Academic and Emotional Adjustment?

4. Which coefficients estimating parameters in the total model were significant?

Mediation Effects

5. Does classroom participation mediate the effect of Rejection on Academic Adjustment and Emotional Adjustment? What were the indirect effects of Rejection on Academic and Emotional Adjustment?

General SEM Process

In SEM, a hypothesized model has a set of underlying parameters that correspond to (1) regression coefficients and (2) the variances and covariances of the independent variables in the model (Bentler, 1995). “These parameters are estimated from the sample data to be a ‘best guess’ about population values. The estimated parameters are then combined by means of covariance algebra to produce an estimated population covariance matrix. This estimated population covariance matrix is compared with the sample covariance matrix and, ideally, the difference is very small and not statistically significant” (Ullman, 2007, p. 684).

The data in this example are analyzed using Analysis of Moment Structures (AMOS/SPSS) and will be interpreted using the major SEM steps of: (1) specifying the model, (2) estimating the model, (3) assessing the fit of the model, and (4) modifying the model.

[Figure: flowchart of the SEM process: Model Specification, Estimation Techniques, Assessing Fit of Model, Model Modification]

1. Model Specification (model hypotheses) means specifying a model whose parameters are estimated from sample data; the estimated parameters are then used to produce the estimated population covariance matrix (Ullman, 2007). Parameters are numerical characteristics of the SEM relationships and include regression coefficients, variances, and covariances.

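[Figure: Effects of Peer Rejection, Model Specification. Path diagram with four latent variables and their observed indicators (unique-error terms in parentheses): Rejection: averaged rating (reversed) (r1), negative nominations (r2); Classroom Participation (change): cooperative participation (r3), autonomous participation (r4); Achievement: MRT Quantitative (r5), MRT Language (r6); Emotional Adjustment: loneliness (reversed) (r7), school avoidance (reversed) (r8). Disturbances d1, d2, and d3 are attached to the endogenous latent variables, and selected paths are fixed to 1.]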

Please identify which variables are unmeasured (latent) variables (factors) and measured (observed) variables in the hypothesized model.

_____________________________________________________________________

_____________________________________________________________________

Identification of the Model

Before estimation techniques are undertaken, a determination is made as to whether the parameters are identified. “Identification refers to whether the parameters of the model can be uniquely determined by the sample data” (Kaplan, 2000). In other words, does enough information exist to identify a solution to a set of structural equations (Hair et al., 2006)? Only models that are overidentified or identified (just identified) can be estimated.

The first step in identification entails counting the number of data points and the number of parameters that are to be estimated. Data points are the number of sample variances and covariances, while the number of parameters is a total of the number of regression coefficients, variances, and covariances that are to be estimated.

(1) An overidentified model, which is necessary for testing the adequacy of the model, has more data points than parameters.

(2) An identified (just identified) model has the same number of data points and parameters. Hypotheses about adequacy of the model cannot be tested, but hypotheses about paths in the model can be tested.

(3) An underidentified model has fewer data points than parameters, and the parameters cannot be estimated. However, one can make estimation possible by reducing the number of parameters through fixing, constraining, or deleting some of them. “A parameter may be fixed by setting it to a specific value or constrained by setting the parameter equal to another parameter” (Ullman, 2007, p. 709).

The formula for obtaining the number of data points in a model is: p(p + 1) / 2

where p = the number of measured variables (Ullman, 2007). The numbers in the formula are constants. The measured variables produce a sum-of-squares and cross-products matrix, from which the variances (the p diagonal elements) and the covariances (the p(p - 1) / 2 unique off-diagonal elements) are obtained.

The number of data points is the number of sample variances and covariances. The number of parameters is the sum of the number of regression coefficients, variances, and covariances that are to be estimated.
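The counting rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the example model (4 measured variables, 9 free parameters) are hypothetical and are not the peer rejection model. Having at least as many data points as parameters is a necessary condition for identification, not a guarantee of it.

```python
def count_data_points(p):
    """Number of sample variances and covariances for p measured variables."""
    return p * (p + 1) // 2


def identification_status(p, n_parameters):
    """Return (degrees of freedom, status) from the counting rule."""
    df = count_data_points(p) - n_parameters
    if df > 0:
        status = "overidentified"
    elif df == 0:
        status = "just identified"
    else:
        status = "underidentified"
    return df, status


# Hypothetical model (not the worksheet example): 4 measured variables,
# 9 parameters to be estimated.
df, status = identification_status(p=4, n_parameters=9)
print(df, status)
```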

The number of data points in the example: ___________________=_________

The number of parameters: regression coefficients _____ + covariances _____ + variances _____ = ________

Number of data points – number of parameters = _____________ df

Is the model overidentified (more data points than parameters), which is necessary for analysis? Yes or No

A second step involves examining the measurement portion of the model which is the relationship between the measured indicators (variables) and the factors. This is done to establish the scale of each factor and to assess the identifiability of this portion of the model. In this example, the variances of the observed variables and factors were fixed to 1.0. According to Kaplan (2000), this is a common way to set the metric so that the disturbance (error) terms are in the same scale as their relevant endogenous variables.

Evaluations of Assumptions of SEM

Sample Size and Missing Data: SEM is typically a large-sample technique because covariances (like correlations), parameter estimates, and chi-square tests of fit are all sensitive to sample size. The best indicators of a good SEM model (as with EFA) are the size of the factor loadings, the number of variables, and the size of the sample. Thompson (2000) recommended a sample size of at least 100. The ratio of the number of people to the number of measured (observed) variables should be at least 10:1, and preferably 15:1 or 20:1.
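As a rough illustration of the ratio guideline, the check can be written as below. The function names and the sample figures (250 cases, 10 observed variables) are hypothetical and are not the values for this study.

```python
def cases_per_variable(n_cases, n_observed):
    """Ratio of cases to measured (observed) variables."""
    return n_cases / n_observed


def ratio_guideline(ratio):
    """Label a ratio against the 10:1, 15:1, and 20:1 guidelines."""
    if ratio >= 20:
        return "meets 20:1"
    if ratio >= 15:
        return "meets 15:1"
    if ratio >= 10:
        return "meets the 10:1 minimum"
    return "below the 10:1 minimum"


# Hypothetical study: 250 cases and 10 observed variables.
ratio = cases_per_variable(250, 10)
print(ratio, ratio_guideline(ratio))
```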

What is the ratio of cases to observed variables in this study and is it acceptable using the criteria identified by Thompson?_______________________________________________________________

_______________________________________________________________________

Multivariate Normality and Outliers: Most SEM estimation techniques assume multivariate normality. Ullman (2007) states, “screen the measured variables for outliers, both univariate and multivariate, and the skewness and kurtosis of the measured variables examined” (p. 683) in the standard manner. All measured variables, whether IVs or DVs, are screened together for outliers. Variables can be transformed if they are highly skewed or outside the expected range for kurtosis. If transformation does not work, an estimation method that deals with nonnormality can be selected. For example, maximum likelihood and generalized least squares work well with large sample sizes, while the Yuan-Bentler test works well for smaller samples (Ullman, 2007).

Please conduct a Mahalanobis distance analysis using SPSS Standard MRA with id # as the DV and all observed variables as the IVs. Please identify the critical χ2 at .999 = ________ based upon ______ df.

Identify all of the case numbers and values of those cases that were multivariate outliers.

________________________________________________________________

________________________________________________________________
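For readers who want to see what the Mahalanobis distance screening computes, here is a minimal pure-Python sketch. It calculates the squared distance of each case from the centroid directly rather than through the SPSS regression procedure described above. The function names and the two-variable data set are hypothetical. Each distance would still be compared with the critical χ2 value from a table, with df equal to the number of variables.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def covariance_matrix(rows):
    """Sample covariance matrix (n - 1 in the denominator)."""
    n, p = len(rows), len(rows[0])
    m = mean_vector(rows)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def mahalanobis_squared(rows):
    """Squared Mahalanobis distance of every case from the centroid."""
    m = mean_vector(rows)
    s = covariance_matrix(rows)
    distances = []
    for r in rows:
        d = [r[j] - m[j] for j in range(len(m))]
        y = solve(s, d)  # y = S^-1 d
        distances.append(sum(d[j] * y[j] for j in range(len(d))))
    return distances


# Hypothetical data: six cases near the line y = x, and one case far off it.
data = [(1, 1.2), (2, 1.9), (3, 3.1), (4, 4.0), (5, 5.1), (6, 5.8), (3, 6.0)]
d2 = mahalanobis_squared(data)
for case_number, value in enumerate(d2, start=1):
    print(case_number, round(value, 3))
```

In this made-up data set the last case, which breaks the pattern shared by the others, receives the largest distance.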

Linearity: SEM examines only linear relationships among variables. While linearity is difficult to assess, Ullman (2007) suggests “linear relationships among pairs of measured variables can be assessed through inspection of scatterplots” (p. 683). If nonlinear relationships exist among variables, the measured variables can be raised to powers. Please conduct the same Standard MRA that was used for the multivariate outlier assessment, but now request a standardized residual plot and collinearity diagnostics.

Please interpret whether the standardized residual plot is or is not demonstrating linearity, normality, and homoscedasticity.

________________________________________________________________

________________________________________________________________

________________________________________________________________

________________________________________________________________

________________________________________________________________

________________________________________________________________

Multicollinearity and Singularity: SEM will not work properly if the variables are perfect linear combinations or are highly correlated. “Inspect the determinant of the covariance matrix. An extremely small determinant may indicate a problem with multicollinearity or singularity” (Ullman, 2007, p. 683). Delete variables causing singularity or create composite variables and use them in the analysis.
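The determinant check that Ullman describes can be illustrated with a short sketch. The function name and both covariance matrices are hypothetical. In the second matrix the third variable is the sum of the first two, so the matrix is singular.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


# Hypothetical covariance matrix of two weakly related variables.
well_behaved = [[1.0, 0.2],
                [0.2, 1.0]]

# Hypothetical covariance matrix of x, y, and x + y (x and y uncorrelated,
# each with variance 1): the third variable is a perfect linear combination.
singular = [[1.0, 0.0, 1.0],
            [0.0, 1.0, 1.0],
            [1.0, 1.0, 2.0]]

print(determinant(well_behaved))
print(determinant(singular))
```

A determinant of zero, or one that is extremely small relative to the scale of the variables, is the warning sign described in the quotation.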

Please interpret the tolerance, VIF, and condition index/variance proportions from the Standard MRA.

________________________________________________________________

________________________________________________________________

________________________________________________________________

Model Estimation Techniques

After the model has been identified, population parameters are estimated with the purpose of minimizing the difference between the observed and estimated population covariance matrices. Researchers are usually interested in using the χ2 test statistic. The performance of the χ2 is affected by three factors: (1) sample size, (2) nonnormality of the distribution of errors, of factors, and of errors and factors, and (3) violation of the assumption of independence of factors and errors (Ullman, 2007). Estimation procedures examined in Monte Carlo studies are discussed below by factor:

Estimation Methods and Sample Size: Using maximum likelihood estimation (MLE), Hair et al. (2006) recommend a minimum sample size of 100-150, but ideally 150-400. Hair et al. indicate that when a sample size is >400, “the method becomes more sensitive and almost any difference is detected, making goodness-of-fit measures suggest poor fit” (p. 741). Ho generalized least squares (GLS). A test statistic similar to Hotelling’s T with a Yuan-Bentler adjustment to the asymptotically distribution free (ADF) estimator was effective with sample sizes of 60-120.

Estimation Methods and Nonnormality: With sample sizes of 2,500+, both ML and GLS worked fine. The ADF was not good with sample sizes under 2,500. With small samples, the Yuan-Bentler test statistic performed best.

Estimation Methods and Dependence: The assumption that errors are independent is important to SEM. When errors and factors were dependent, ML and GLS performed poorly, always rejecting the true model. ADF was poor unless the model had 2,500 cases. Elliptical distribution theory (EDT) was better but still rejected too many true models. Overall, for medium to large samples the scaled ML was best, and for small samples it was the Yuan-Bentler test statistic.

Choice of Estimation Method (Ullman, 2007)

(1) Large samples + normality + independence: ML, the scaled ML, or GLS.

(2) ML is the most commonly used estimation method in SEM.

(3) Small samples (60-120): Yuan-Bentler test statistic.

We will be using ML for this example.
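For reference, the quantity that ML estimation minimizes is usually written as the discrepancy function below. This is the standard textbook form, not something taken from the handout: S is the sample covariance matrix, Σ(θ) is the estimated population covariance matrix implied by the parameters θ, p is the number of measured variables, and N is the sample size. The multiplier N - 1 is an assumption here, since some programs use N instead.

```latex
F_{ML} = \ln\lvert\Sigma(\theta)\rvert - \ln\lvert S\rvert
         + \operatorname{tr}\!\left(S\,\Sigma(\theta)^{-1}\right) - p,
\qquad
\chi^2 = (N - 1)\,F_{ML}
```

When Σ(θ) reproduces S exactly, the trace term equals p and the two log-determinants cancel, so F_ML and the χ2 are both zero.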

Assessing the Fit of the Model

After specification (identification) and estimation, the question is whether it is a good model. One aspect of good fit is the agreement between the sample covariance matrix and the estimated population covariance matrix. Goodness of fit can be conceptualized as a series of models all nested within one another, like hierarchical models in log-linear modeling. Nested models are subsets of one another. At one end of the continuum is the independence model (completely unrelated variables), with df equal to the number of data points minus the number of variances that are estimated. At the other end of the continuum is the saturated (full or perfect) model, with zero degrees of freedom.

Assessment of fit can be unclear because of sample size. With large samples, trivial differences can be significant with χ2. With small samples, the test statistic may not be distributed as χ2, leading to inaccurate probability levels. The probability levels are also inaccurate when the underlying assumptions of χ2 are violated. Because of these problems, a rule of thumb directly related to χ2 is that a good-fitting model may be indicated when the ratio of the χ2 to the degrees of freedom is less than 2.
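The ratio rule of thumb is simple arithmetic. It is sketched here with hypothetical numbers (χ2 = 30.0 on 20 df) rather than the output for this example; the function names are illustrative only.

```python
def chi_square_ratio(chi_square, df):
    """Ratio of the model chi-square to its degrees of freedom."""
    if df <= 0:
        raise ValueError("the ratio is undefined when df is not positive")
    return chi_square / df


def ratio_suggests_good_fit(chi_square, df, cutoff=2.0):
    """Apply the 'ratio less than 2' rule of thumb."""
    return chi_square_ratio(chi_square, df) < cutoff


# Hypothetical output values, not those of the peer rejection model.
print(chi_square_ratio(30.0, 20))
print(ratio_suggests_good_fit(30.0, 20))
```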

Overall fit of the model assessing χ2. We are comparing the estimated population covariance matrix to the sample covariance matrix and we want the difference to be small and not statistically significant. Please refer to the “Result (Default model)” and fill in the following information.

Chi-square Degrees of freedom Probability level

__________ _________________ ______________

Also, divide the χ2 by the degrees of freedom.

______________________________________________________

Interpretation:___________________________________________

_______________________________________________________

Overall fit of the model assessing fit indices. There are several fit indices available for interpretation (see Ullman, 2007, pp. 716-720). We will be using three of the most commonly used fit indices, as identified below.

The comparative fit index (CFI) employs the noncentral χ2 distribution with noncentrality parameters. It ranges from 0 to 1, and CFI values greater than .95 indicate a good-fitting model.

The root mean square error of approximation (RMSEA) estimates lack of fit in a model compared to a perfect (saturated) model. Values of .06 or less indicate good fit, while values larger than .10 indicate poor fit.

The goodness-of-fit index (GFI) is analogous to R2. In the past, values greater than .90 were considered good. More recently, greater than .95 has been used as indicative of a good fit.

Look for these values in the output under “Model Fit Summary” using the “Default model” row.
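To make the cutoffs concrete, the sketch below computes CFI and RMSEA from chi-square values using their commonly cited textbook formulas. Both formulas are assumptions in the sense that they do not appear in this handout, and AMOS reports the indices directly. GFI is omitted because it requires the covariance matrices themselves. All input values and function names are hypothetical.

```python
import math


def cfi(chi2_model, df_model, chi2_independence, df_independence):
    """Comparative fit index from estimated noncentrality parameters."""
    d_model = max(chi2_model - df_model, 0.0)
    d_independence = max(chi2_independence - df_independence, d_model)
    if d_independence == 0.0:
        return 1.0
    return 1.0 - d_model / d_independence


def rmsea(chi2_model, df_model, n_cases):
    """Root mean square error of approximation (N - 1 version)."""
    return math.sqrt(max(chi2_model - df_model, 0.0)
                     / (df_model * (n_cases - 1)))


def describe_cfi(value):
    return "good fit" if value > 0.95 else "below the .95 cutoff"


def describe_rmsea(value):
    if value <= 0.06:
        return "good fit"
    if value > 0.10:
        return "poor fit"
    return "between the good and poor cutoffs"


# Hypothetical values: model chi-square 40 on 20 df, independence model
# chi-square 520 on 28 df, 200 cases.
cfi_value = cfi(40.0, 20, 520.0, 28)
rmsea_value = rmsea(40.0, 20, 200)
print(round(cfi_value, 3), describe_cfi(cfi_value))
print(round(rmsea_value, 3), describe_rmsea(rmsea_value))
```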

CFI = __________ Interpretation:_____________________________________

RMSEA = __________ Interpretation:________________________________

GFI = __________ Interpretation:_____________________________________

Summary of overall fit of the model: ________________________________________________________________

________________________________________________________________

________________________________________________________________

Examination of Estimates of Parameters

We want to check whether each of the latent and observed variables is significant and thus contributing to the model. Go to the section titled “Regression Weights: (Group number 1 – Default model)”. These are the unstandardized regression weights. The *** means p < .001. Critical ratios (C.R.) are z scores, so the critical z value for α = .05, two-tailed, is 1.96. The P values give an approximate two-tailed probability for critical ratios this large or larger. P is calculated with the assumption that the parameter estimates are normally distributed and is only correct in large samples (Arbuckle, 2006).
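The relationship between an estimate, its critical ratio, and the two-tailed P value can be sketched as follows. The code assumes that the critical ratio is the estimate divided by its standard error and that P comes from the standard normal distribution. The estimate (0.50) and standard error (0.20) are hypothetical, as are the function names.

```python
import math


def critical_ratio(estimate, standard_error):
    """C.R. = estimate / standard error, interpreted as a z score."""
    return estimate / standard_error


def two_tailed_p(z):
    """Two-tailed probability of a z score this large or larger."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical regression weight and standard error.
cr = critical_ratio(0.50, 0.20)
p = two_tailed_p(cr)
print(cr, round(p, 4), "significant" if abs(cr) > 1.96 else "not significant")
```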

We are testing that each single parameter equals zero. Identify all latent and observed variables with their estimate, C.R., and p value that are not significant (p > .05) below.

______________________________________________________________

_______________________________________________________________

_______________________________________________________________

Modification Index (M.I.) and Parameter Change (Par Change)

Go to the sections titled “Variances: (Group number 1 – Default model)”, “M.I. Par Change”, and “Regression Weights: (Group number 1 – Default model)”. The modification index (M.I.) is an estimate of the decrease in chi-square if the two listed variables were allowed to correlate. The parameter change (Par Change) gives an approximate estimate of how much the parameter would change if it were relaxed. Identify the variables with the highest Par Change value from the “Regression Weights: (Group number 1 – Default model)” section. We chose just one to identify for this example; there could be others we would also want to look at.

Variables M.I. Par Change

_______ _____ __________

Write a Summary of Overall Model Fit and a possible model modification.

________________________________________________________________

________________________________________________________________

________________________________________________________________

________________________________________________________________

________________________________________________________________

________________________________________________________________

Model Modification

The two reasons for model modification are to (1) improve fit in exploratory work and (2) test hypotheses in theoretical work. Choosing to make a modification must be based upon theory or common sense. We are attempting to improve the fit of the model post hoc by adding a path from Achievement to Emotional Adjustment, which was identified in the previous run by looking at the modification index. Because the two models remain nested after the new path is added, we will also use the chi-square difference test to compare the previous model to the modified model and check for improvement. Additionally, we will look at the same fit indices as before.
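The chi-square difference test for nested models can be sketched as below. Adding a single path changes the degrees of freedom by 1, and in that special case the P value can be obtained from the normal distribution with the standard library alone; other df differences would need a chi-square table or a statistics package. The chi-square values and function names are hypothetical and are not the output for this example.

```python
import math


def chi_square_difference(chi2_initial, df_initial, chi2_modified, df_modified):
    """Return (difference in chi-square, difference in df) for nested models."""
    return chi2_initial - chi2_modified, df_initial - df_modified


def p_value_one_df(delta_chi2):
    """Upper-tail probability of a chi-square value with exactly 1 df."""
    if delta_chi2 < 0:
        raise ValueError("the chi-square difference must be non-negative")
    return math.erfc(math.sqrt(delta_chi2 / 2.0))


# Hypothetical output: initial model chi-square 60.0 on 16 df,
# modified model chi-square 45.0 on 15 df.
delta_chi2, delta_df = chi_square_difference(60.0, 16, 45.0, 15)
if delta_df != 1:
    raise ValueError("this sketch only handles a difference of 1 df")
p = p_value_one_df(delta_chi2)
print(delta_chi2, delta_df, p < 0.001)
```

A significant difference indicates that the modified model fits better than the initial model.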

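[Figure: Effects of Peer Rejection, Model Specification (path diagram for the modified model). Latent variables and their observed indicators (unique-error terms in parentheses): Rejection: averaged rating (reversed) (r1), negative nominations (r2); Classroom Participation (change): cooperative participation (r3), autonomous participation (r4); Achievement: MRT Quantitative (r5), MRT Language (r6); Emotional Adjustment: loneliness (reversed) (r7), school avoidance (reversed) (r8). Disturbances d1, d2, and d3 are attached to the endogenous latent variables, and selected paths are fixed to 1.]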

Overall fit of the model assessing χ2. We are comparing the estimated population covariance matrix to the sample covariance matrix and we want the difference to be small and not statistically significant. Please refer to the “Result (Default model)” on page 3 of the output and fill in the following information.

Chi-square Degrees of freedom Probability level

__________ _________________ ______________

Also, divide the χ2 by the degrees of freedom.

______________________________________________________

Interpretation:___________________________________________

_______________________________________________________

Comparison of the Fit of the Initial Model and the Modified Model

Initial Model df ____________ - Modified Model df ____________ = __________

χ2.999 = ________ based upon ______df

Initial Model χ2 ____________ - Modified Model χ2 ____________ = __________

Is the difference between the two models significant? Yes No

Symbolic summary statement:_____________________

Overall fit of the model assessing fit indices.

CFI = __________ Interpretation:_______________________________

RMSEA = __________ Interpretation:________________________________

GFI = __________ Interpretation:________________________________

Examination of Estimates of Parameters

We want to check whether each of the latent and observed variables is significant and thus contributing to the model. Go to the section titled “Regression Weights: (Group number 1 – Default model)”. These are the unstandardized regression weights. The *** means p < .001. Critical ratios (C.R.) are z scores, so the critical z value for α = .05, two-tailed, is 1.96. The P values give an approximate two-tailed probability for critical ratios this large or larger. P is calculated with the assumption that the parameter estimates are normally distributed and is only correct in large samples (Arbuckle, 2006).

We are testing that each single parameter equals zero. Identify all latent and observed variables with their estimate, C.R., and p value that are not significant (p > .05) below.

______________________________________________________________

_______________________________________________________________

_______________________________________________________________

Please answer the following research questions.

Adequacy of the Model (Model Fit)

2. Is the estimated population covariance matrix generated by the modified model consistent with the sample covariance matrix of the data?

______________________________________________________________

_______________________________________________________________

_______________________________________________________________

_______________________________________________________________

Coefficient Effects

3. Are there significant direct effects of rejection on kindergarten students’ Academic and Emotional Adjustment?

______________________________________________________________

_______________________________________________________________

_______________________________________________________________

_______________________________________________________________

4. Which coefficients estimating parameters in the total model were significant?

______________________________________________________________

_______________________________________________________________

_______________________________________________________________

_______________________________________________________________

Mediation Effects

5. Does classroom participation mediate the effect of Rejection on Academic Adjustment and Emotional Adjustment? What were the indirect effects of Rejection on Academic and Emotional Adjustment?

______________________________________________________________

_______________________________________________________________

_______________________________________________________________

_______________________________________________________________

References

Arbuckle, J. L. (2006). Amos 7.0 user’s guide. Chicago, IL: SPSS, Inc.

Bentler, P. M. (1995). EQS: Structural equations program manual. Encino, CA: Multivariate Software, Inc.

Buhs, E. S., & Ladd, G. W. (2001). Peer rejection as an antecedent of young children’s school adjustment: An examination of mediating processes. Developmental Psychology, 37, 550-560.

Hair, J. F., Jr., Black, W. C., Babin, B. J., Anderson, R. E., & Tatham, R. L. (2006). Multivariate data analysis. Upper Saddle River, NJ: Pearson Prentice Hall.

Kaplan, D. (2000). Structural equation modeling: Foundations and extensions. Thousand Oaks, CA: Sage Publications, Inc.

Keith, T. Z. (2006). Multiple regression and beyond. Boston, MA: Allyn and Bacon.

Thompson, B. (2000). Ten commandments of structural equation modeling. In L. G. Grimm & P. R. Yarnold (Eds.), Reading and understanding more multivariate statistics. Washington, DC: American Psychological Association.

Ullman, J. B. (2007). Structural equation modeling. In B. G. Tabachnick & L. S. Fidell (Eds.), Using multivariate statistics (pp. 676-780). Boston, MA: Pearson Education, Inc.
