general structural equations (lisrel) week 3 #4 mean models reviewed non-parallel slopes non-normal...

General Structural Equations (LISREL)

Week 3 #4: Mean Models Reviewed

Non-parallel slopes; Non-normal data

2

Models for Means and Intercepts (continued)

Multiple Group Models:

For “zero order” latent variable mean differences:
• “Free” individual measurement equation intercepts but constrain them to equality across groups
• Fix the latent variable means to 0 in group 1
• Free the latent variable means in groups 2->k
• If the latent variables of interest are endogenous and if there are exogenous latent variables in the model, constrain construct equation path coefficients to zero.

3

Models for Means and Intercepts (continued)


Individual LV mean parameters represent a contrast with (difference from) the “reference group” (the group with its LV mean set to zero). LR tests are required for joint hypotheses (e.g., constrain means to zero in all groups vs. a model with groups 2->k freed).

Check modification indices on measurement equation intercepts to verify “proportional indicator differences” assumption holds (or at least holds approximately)

4

AMOS Programming
• Check off “means and intercepts”
• Means and intercepts will now appear on the diagram. Where variances used to appear, there will now be two parameters (mean + variance); where the variable is dependent, one parameter (intercept) will appear.
• Impose appropriate parameter constraints

[insert brief demonstration here!]

5

Review yesterday’s slides from slide 52

Uses World Values Study 1990 data for an example

We’ll use an updated version (new data, some difference in countries) today

Refer to handout (slides not reproduced)

6

Means1a.LS8: tau-x elements allowed to vary between countries. Must fix kappa (mean of ksi’s) to 0 since otherwise not identified.

Chi-square=233.65 df=42

United States:

TAU-X
    A006      F028      F066      F063      F118      F119
--------  --------  --------  --------  --------  --------
  1.6191    3.6383    2.2287    8.5530    4.7504    2.9739
(0.0263)  (0.0688)  (0.0563)  (0.0733)  (0.0941)  (0.0749)
 61.4969   52.8937   39.5980  116.6334   50.4717   39.6838

TAU-X
    F120      F121
--------  --------
  4.3443    5.9000
(0.0883)  (0.0757)
 49.2263   77.9553

Canada:

TAU-X
    A006      F028      F066      F063      F118      F119
--------  --------  --------  --------  --------  --------
  2.1202    4.7402    3.2042    7.4657    5.4974    3.3091
(0.0236)  (0.0612)  (0.0551)  (0.0706)  (0.0812)  (0.0646)
 89.9232   77.4453   58.1887  105.7780   67.7035   51.2445

TAU-X
    F120      F121
--------  --------
  4.4986    6.0079
(0.0713)  (0.0636)

7

Means1b.ls8: Measurement model like Means1a, but now we express group 1 versus group 2 differences in means with 2 parameters (1 for each latent variable), as opposed to calculating them for each indicator using, e.g., TX 1 [1] - TX 1 [2].

Chi-square=276.27 df=48

KAPPA in Group 2 (Canada) [Kappa in Group 1 is fixed to zero]
   KSI 1     KSI 2
--------  --------
  1.0712    0.3236
(0.0731)  (0.0948)
 14.6538    3.4138

The above provides significance tests for:
• Canada-U.S. differences in religiosity (z=14.6538, p<.001)
• Canada-U.S. differences in sex/morality attitudes (z=3.4138, p<.001)

For a joint significance test to see if both the means for Religiosity and Sex/morality are different (null hypothesis: differences both = 0), see program Means1c.ls8. Chi-square = 512.9661 df=50 for this model; subtract the chi-squares (512.97 - 276.27 = 236.70) for the test (df=2).
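The chi-square subtraction for the joint test can be checked numerically. The following is a minimal sketch in Python that uses only the chi-square and df values reported on the slides. The helper names `chi2_sf_even_df` and `lr_difference_test` are illustrative, and the closed-form tail probability is exact only when the df difference is even (as it is here, df=2).

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # sf(x; 2m) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def lr_difference_test(chi2_restricted, df_restricted, chi2_free, df_free):
    """Chi-square difference (likelihood ratio) test for two nested models."""
    diff = chi2_restricted - chi2_free
    ddf = df_restricted - df_free
    return diff, ddf, chi2_sf_even_df(diff, ddf)

# Means1c (both kappas constrained to 0) versus Means1b (kappas free in group 2)
diff, ddf, p = lr_difference_test(512.9661, 50, 276.27, 48)
print(f"chi-square difference = {diff:.2f}, df = {ddf}, p = {p:.3g}")
```

The same helper applies to any pair of nested models on these slides whose df difference is even; for odd df differences a general chi-square routine would be needed instead.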

8

Diagnostics for this model: see Modification Indices for TX vectors.

USA

Modification Indices for TAU-X
    A006      F028      F066      F063      F118      F119
--------  --------  --------  --------  --------  --------
  0.6495    0.2995    2.8724    8.2808   27.0494    2.0749

Modification Indices for TAU-X
    F120      F121
--------  --------
 12.1313    5.2727

CANADA

Modification Indices for TAU-X
    A006      F028      F066      F063      F118      F119
--------  --------  --------  --------  --------  --------
  0.6495    0.2995    2.8725    8.2808   27.0495    2.0749

Modification Indices for TAU-X
    F120      F121
--------  --------
 12.1312    5.2728

Expected Change for TAU-X
    A006      F028      F066      F063      F118      F119
--------  --------  --------  --------  --------  --------
  0.0164    0.0238    0.0593    0.1261    0.3003    0.0637

Expected Change for TAU-X
    F120      F121
--------  --------
 -0.1981   -0.1015

9

Means2a Model with exogenous single-indicator variables.

Single indicator ksi-variables: gender, age, education.

Specification GA=IN in group 2 implies a parallel slopes model.

Thus, the AL parameters in group 2 can be interpreted as “group 1 vs. group 2 differences, controlling for differences in sex, education and age”.

TAU-X
  GENDER       AGE      EDUC
--------  --------  --------
  0.4217   42.3840    4.5365
(0.0146)  (0.4750)  (0.0413)
 28.9360   89.2300  109.8409

ALPHA
   ETA 1     ETA 2
--------  --------
  1.2272    0.5898
(0.0714)  (0.0954)
 17.1899    6.1819

KAPPA
  GENDER       AGE      EDUC
--------  --------  --------
 -0.0196    3.3360   -0.4151
(0.0187)  (0.6297)  (0.0504)
 -1.0482    5.2977   -8.2333

10

Diagnostics: test of the equal slopes (GA=IN) assumption:

Modification Indices for GAMMA
            GENDER       AGE      EDUC
          --------  --------  --------
ETA 1       7.7083    6.9705    0.2122
ETA 2       3.1923    0.1765    9.3836

A global test will require the estimation of a separate model (Means2b) with GA=PS (parallel slopes assumption relaxed).

Chi-square comparisons
            Chi-square    df      CFI
Means2a:       699.807    90    .9635
Means2b:       669.594    84    .9649

11

Means2b

ALPHA CANADA (FIXED TO 0 IN US)
   ETA 1     ETA 2
--------  --------
  1.2545    0.6371
(0.0725)  (0.0968)
 17.3057    6.5809

GAMMA - USA
            GENDER       AGE      EDUC
          --------  --------  --------
ETA 1       0.6845   -0.0170    0.0817
          (0.1003)  (0.0031)  (0.0352)
            6.8230   -5.5398    2.3209
ETA 2       0.0624   -0.0144    0.3074
          (0.1462)  (0.0045)  (0.0520)

GAMMA - Canada
            GENDER       AGE      EDUC
          --------  --------  --------
ETA 1       0.9597   -0.0308    0.1525
          (0.0931)  (0.0028)  (0.0389)
           10.3099  -11.1125    3.9173
ETA 2      -0.0936   -0.0246    0.5333
          (0.1200)  (0.0036)  (0.0521)

12

Expressing effects when the parallel slopes assumption is relaxed: is the pattern diverging, converging, or crossover?

Equation:
Eta1 = alpha1 + gamma1 Ksi1 + gamma2 Ksi2 + gamma3 Ksi3 + zeta1

Hold all Ksi variables except one constant at 0. This is not quite the overall mean (the Ksi mean is 0 in group 1, but in group 2 it is 0 + kappa), but close enough.

In group 1, alpha1 = 0, so the equation is:
Eta1 = gamma1[1] Ksi1 [+ alpha1 (=0) + gamma2 Ksi2 (=0) + gamma3 Ksi3 (=0)] + zeta1
where E(zeta1) = 0

In group 2, alpha1 = alpha1[2]:
Eta1 = alpha1[2] + gamma1[2] Ksi1 [+ other terms = 0]

Now, the question is: at what values do we evaluate the equation?

1. Ksi1 = 0. This is the Ksi1 mean in group 1. (We could, alternatively, use something like kappa1[2]/2, which is halfway between the group 1 and group 2 means of Ksi1, or even a weighted version.)

2. Ksi1 = 0 + k standard deviations, where k can be any reasonable number (1? 1.5? 2.0?)

3. Ksi1 = 0 - k standard deviations.

13

How do we find the standard deviation of Ksi? Look at the PHI matrix to obtain the variances, and take the square root of these!

PHI USA
            GENDER        AGE      EDUC
          --------  ---------  --------
GENDER      0.2441
          (0.0102)
           23.9687
AGE        -0.4381   259.2400
          (0.2350)  (10.8158)
           -1.8642    23.9687
EDUC        0.0251     1.7457    1.9599
          (0.0204)   (0.6670)  (0.0818)

14

For education, if we had a pooled estimate (Canada + US) we could use it; otherwise, we can approximate by averaging the two variance estimates: (1.9599 + 1.4733)/2 ~ 1.72, and sqrt(1.72) = 1.3. So we will want to evaluate at EDUC=0, EDUC=+1.3 (or perhaps +2.6?), and EDUC=-1.3 (or perhaps -2.6?).

At Educ = 0, the Canada-US difference is 1.2545 (see alpha parameter, above):
USA = 0
Canada = 1.2545

At Educ = -2.6:
USA = 0 + (-2.6 * .0817) = -.2124 [USA gamma for educ = .0817]
Canada = 1.2545 + (-2.6 * .1525) = .858 [Canadian gamma for educ = .1525]

At Educ = +2.6:
USA = 0 + (2.6 * .0817) = .2124
Canada = 1.2545 + (2.6 * .1525) = 1.651
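The hand calculations for education can be reproduced in a few lines. This is a minimal sketch in Python, assuming the Means2b alpha and gamma estimates quoted on the slides; the dictionary `PARAMS` and the function `predicted_eta1` are illustrative names, not part of any LISREL output.

```python
# Group-specific intercepts (alpha) and education slopes (gamma) for Eta1,
# copied from the Means2b output quoted on the slides.
PARAMS = {
    "USA": {"alpha": 0.0, "gamma_educ": 0.0817},
    "Canada": {"alpha": 1.2545, "gamma_educ": 0.1525},
}

def predicted_eta1(group, educ):
    """Expected Eta1 with the other Ksi variables held at 0 and E(zeta1) = 0."""
    p = PARAMS[group]
    return p["alpha"] + p["gamma_educ"] * educ

for educ in (-2.6, 0.0, 2.6):
    usa = predicted_eta1("USA", educ)
    canada = predicted_eta1("Canada", educ)
    print(f"EDUC={educ:+.1f}  USA={usa:.4f}  Canada={canada:.4f}  gap={canada - usa:.4f}")
```

Because the Canadian slope is larger than the U.S. slope, the Canada-US gap widens as education increases, which is a diverging pattern over this range.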

15

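[Figure: line plot with Education on the horizontal axis (-2.6, 0, 2.6), a vertical axis running from -0.5 to 2, and separate lines for USA and Canada]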

16

For age, the approximate standard deviation is sqrt(270) = 16.43. We could thus use 0 ± 16.43 or 0 ± 32.86 (or 0 ± (1.5 * 16.43)); or, if we knew that the mean was approximately 42 (see tau-x parameter), we could simply do something like ± 20 years (more intuitive).

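[Figure: line plot with Age (0 = 42 years) on the horizontal axis (-20, 0, 20), a vertical axis running from -0.5 to 2, and separate lines for USA and Canada]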

17

Models for Four Groups
• Canada
• U.S.A.
• Germany
• U.K.

Means3a GA=PS Chi-square = 1892.25 df=180

Means3b GA=IN Chi-square = 1986.94 df=198

18

Value of Educ        USA     Canada         UK    Germany
-2.6            -0.19188    0.70818    2.73732    1.56508
 0               0          1.087      2.4339     1.8139
 2.6             0.19188    1.46582    2.13048    2.06272

Formulas:
USA:     =0.0738*B8
Canada:  =1.087+(B8*0.1457)
UK:      =2.4339+(B8*-0.1167)
Germany: =1.8139+(B8*0.0957)

[B8 refers to the first education row. The formula becomes B9, B10 for the rows below.]
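The spreadsheet formulas can also be used to ask whether any two country lines cross (the "crossover" case) within the plotted range. The sketch below, in Python, assumes the intercepts and slopes given in the formulas; `LINES`, `predict` and `crossing_point` are illustrative names.

```python
# (intercept, education slope) per country, taken from the spreadsheet formulas.
LINES = {
    "USA": (0.0, 0.0738),
    "Canada": (1.087, 0.1457),
    "UK": (2.4339, -0.1167),
    "Germany": (1.8139, 0.0957),
}

def predict(country, educ):
    """Predicted latent mean for a country at a given (mean-centred) education value."""
    intercept, slope = LINES[country]
    return intercept + slope * educ

def crossing_point(a, b):
    """Education value at which two countries' lines intersect (None if parallel)."""
    (ia, sa), (ib, sb) = LINES[a], LINES[b]
    if sa == sb:
        return None
    return (ib - ia) / (sa - sb)

for educ in (-2.6, 0.0, 2.6):
    row = "  ".join(f"{c}={predict(c, educ):.5f}" for c in LINES)
    print(f"EDUC={educ:+.1f}  {row}")

x = crossing_point("UK", "Germany")
print(f"UK and Germany lines cross at EDUC = {x:.2f}")
```

With these coefficients the UK and Germany lines intersect just beyond +2.6, so within the evaluated range they converge without actually crossing.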

19

[Figure: line plot with Education on the horizontal axis (-2.6, 0, 2.6), a vertical axis running from -0.5 to 3, and separate lines for USA, Canada, UK and Germany]

Dealing with data that are not normally distributed within the traditional LISREL framework

Questions:
• How bad is it if our data are not normally distributed?
• What can we do about it?
• Are there easy “fixes”?

21

Non-Normal Data

How about just ignoring the problem?

Early 1980s: Robustness studies. Major findings:

• In almost all cases, using LV models is better than OLS even if the data are non-normal (assumes multiple indicators available)
• Some discussion of conditions under which parameters might not be accurate (e.g., low measurement coefficient models)

22

Non-Normal Data

Early articles:
• A. Boomsma, On the Robustness of LISREL
• Johnson and Creech, American Sociological Review, 48(3), 1983, 398-403
• Henry, ASR, 47: 299-307 (related: Bollen and Barb, ASR, 46: 232-39)

See a good summary of early and later simulation studies: West, Finch and Curran in Hoyle.

23

Non-Normal Data

Formal properties:

                            Consistent?   Asymp. efficient?   Acov(θ)    X2
Multinormal (no kurtosis)        √                √              √        √
Elliptical                       √                √              X        X
Arbitrary                        √                X              X        X

24

Non-Normal Data

Many of the studies have involved CFA models
• E.g., Curran, West, Finch, Psych. Methods, 1(1), 1996.

General findings (non-normal data):
• ML, GLS produce X2 values too high
• Overestimated by 50% in simulations
• GLS, ML produce X2 values slightly larger when sample sizes are small, even when data are normally distributed
• Underestimation of NFI, TLI, CFI
• Also underestimated in small samples, esp. NFI
• Moderate underestimation of std. errors (phi 25%, lambda 50%)

25

Non-Normality

Detection:
• r-th central moment: μr = E[(x - μ1)^r]
• Skewness: standardized 3rd moment, μ3 / (μ2)^(3/2)
• Kurtosis: standardized 4th moment, μ4 / (μ2)^2; mean of 3 (its value under normality)

Tests of statistical significance usually available (Bollen, p. 421): b1, b2 (skew, kurt)
• N(0,1) test statistic for kurtosis (H0: β2 - 3 = 0)
• Different tests (one approximation requires N>1000)
• Joint test κ^2, approx. distributed as X2 with df=2
• Mardia’s multivariate test: skewness, kurtosis, joint.
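The moment definitions translate directly into code. The sketch below, in Python, computes the standardized third and fourth moments from the plain 1/N central moments; the function names and the sample data are made up for illustration, and the significance tests themselves are not implemented.

```python
def central_moment(values, r):
    """r-th central moment using the 1/N divisor."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** r for v in values) / len(values)

def skewness(values):
    """Standardized third moment: m3 / m2^(3/2)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Standardized fourth moment: m4 / m2^2 (3 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

sample = [1, 2, 2, 3, 3, 3, 4, 5, 8, 13, 40]  # made-up, right-skewed data
print(f"skewness = {skewness(sample):.3f}")
print(f"kurtosis = {kurtosis(sample):.3f} (3 under normality)")
```

Note that some packages report excess kurtosis (this quantity minus 3), so it is worth checking which convention a given output uses.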

26

Non-Normality

An alternative estimator: Fwls (also called Fagls):

Fwls = [s - σ(θ)]' W^-1 [s - σ(θ)]

Browne, British Journal of Mathematical and Statistical Psychology, 41 (1988), 193ff.; also 37 (1984), 62-83

Optimal weight matrix? The asymptotic covariance matrix of the sij:

Acov(sij, sgh) = N^-1 (σijgh - σij σgh)

estimated using sijgh = (1/N) Σ (zi)(zj)(zg)(zh), where zi is the mean-deviated value

If multinormal: σijgh = σij σgh + σig σjh + σih σjg (reduces to GLS)

W^-1 is of order ½(k)(k+1) by ½(k)(k+1)
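To make the weight matrix concrete, a single element of the asymptotic covariance matrix can be computed from raw data. This is a minimal sketch in Python using 1/N moments of mean-deviated scores; the function name `acov_element` and the tiny data set are assumptions for illustration, and PRELIS would normally produce the full matrix.

```python
def acov_element(data, i, j, g, h):
    """Estimate Acov(s_ij, s_gh) = (s_ijgh - s_ij * s_gh) / N from raw data rows."""
    n = len(data)
    k = len(data[0])
    means = [sum(row[v] for row in data) / n for v in range(k)]
    # mean-deviated scores z
    z = [[row[v] - means[v] for v in range(k)] for row in data]
    s_ij = sum(r[i] * r[j] for r in z) / n
    s_gh = sum(r[g] * r[h] for r in z) / n
    s_ijgh = sum(r[i] * r[j] * r[g] * r[h] for r in z) / n
    return (s_ijgh - s_ij * s_gh) / n

# Made-up data: 4 cases, 2 variables
data = [(1, 2), (2, 1), (3, 4), (4, 3)]
print(acov_element(data, 0, 0, 0, 0))  # asymptotic variance of s_00
print(acov_element(data, 0, 0, 0, 1))  # asymptotic covariance of s_00 and s_01
```

Filling in every (ij, gh) pair in this way gives the full matrix whose inverse is used as the weight matrix, which is why its size grows so quickly with the number of variables.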

27

Non-Normality

An alternative estimator, Fwls (also called Fagls), continued:

Computationally intense:
• 20 variables: 22,155 distinct elements in the weight matrix
• To be non-singular, N must be > p + ½(p)(p+1)
  20 variables: minimum 230
  30 variables: minimum 495
• Older versions of LISREL used to impose higher restrictions (refused to run until thresholds well above the minima shown above were reached)
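The size figures quoted for the weight matrix follow from simple counting. A minimal sketch in Python, with the illustrative function name `wls_dimensions`:

```python
def wls_dimensions(p):
    """Return (order of W, distinct elements of W, minimum-N bound) for p observed variables."""
    k_star = p * (p + 1) // 2                 # distinct elements of the covariance matrix S
    distinct_w = k_star * (k_star + 1) // 2   # distinct elements of the symmetric matrix W
    min_n = p + k_star                        # bound from the slide: p + p(p+1)/2
    return k_star, distinct_w, min_n

for p in (14, 20, 30):
    order, distinct, min_n = wls_dimensions(p)
    print(f"p={p}: W is {order} x {order}, {distinct} distinct elements, N bound = {min_n}")
```

These are only the algebraic minima; in practice much larger samples are needed for the estimator to behave well.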

28

Non-Normality


The AGLS (Fwls) estimator is commonly available in SEM software:
• LISREL 8
• AMOS
• SAS-CALIS
• EQS

Be careful! Not really suitable for small N problems. It is a good idea to have sample sizes in the thousands, not hundreds.

29

Non-Normality


Using the AGLS (Fwls) estimator in SEM software:
• LISREL 8: ME=WL in the OU statement; must also provide the asymptotic covariance matrix generated by PRELIS (the AC FI= statement follows the CM FI= statement)
• AMOS: check box on analysis options

Again, the problem is that this estimator can be unstable given the size of the matrix (acov) that needs to be inverted (especially in moderate sample sizes)

30

Non-Normality

Sample program in LISREL with ADF estimator:

LISREL model for religiosity and moral conservatism
Part 2: ADF estimation
DA NI=14 NO=1456
CM FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.cov
! asymptotic covariance matrix generated by PRELIS
ACC FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.acc
SE
1 2 3 4 5 6 7 8 9 10 11 12 13 14/
MO NY=11 NX=3 NE=2 Nk=3 fixedx ly=fu,fi ga=fu,fr c
ps=sy,fr te=sy
va 1.0 ly 1 1 ly 8 2
fr ly 2 1 ly 3 1 ly 4 1 ly 5 1
fr ly 6 2 ly 7 2 ly 9 2 ly 10 2 ly 11 2
fr te 2 1 te 11 10 te 7 6
! me=wl requests the weighted least squares (ADF) estimator
ou me=wl se tv sc nd=3 mi

31

Non-Normality

Generating asymptotic covariance matrix in PRELIS


Resultant matrix will be much larger than covariance matrix

33

Non-Normality: ADF estimation

DA NI=14 NO=1456
CM FI=h:\icpsr99\nonnorm\relmor1.cov
ACC FI=h:\icpsr99\nonnorm\relmor1.acc
SE
1 2 3 4 5 6 7 8 9 10 11 12 13 14/
MO NY=11 NX=3 NE=2 Nk=3 fixedx ly=fu,fi ga=fu,fr c
ps=sy,fr te=sy
va 1.0 ly 1 1 ly 8 2
fr ly 2 1 ly 3 1 ly 4 1 ly 5 1
fr ly 6 2 ly 7 2 ly 9 2 ly 10 2 ly 11 2
fr te 2 1 te 11 10 te 7 6
! me=wl requests the weighted least squares (ADF) estimator
ou me=wl se tv sc nd=3 mi

34

Non-Normality: ML, scaled statistics

DA NI=14 NO=1456
CM FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.cov
ACC FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.acc
SE
1 2 3 4 5 6 7 8 9 10 11 12 13 14/
MO NY=11 NX=3 NE=2 Nk=3 fixedx ly=fu,fi ga=fu,fr c
ps=sy,fr te=sy
va 1.0 ly 1 1 ly 8 2
fr ly 2 1 ly 3 1 ly 4 1 ly 5 1
fr ly 6 2 ly 7 2 ly 9 2 ly 10 2 ly 11 2
fr te 2 1 te 11 10 te 7 6
! me=ml with the asymptotic covariance matrix supplied: ML estimates with scaled statistics
ou me=ml se tv sc nd=3 mi

35

Non-Normality

Low tech solutions:

For variables that are continuous, TRANSFORMATION (see classic regression texts such as Fox)

Common transformations:
• X → log(X) (usually natural log)
• X → sqrt(X)
• X → X^2
• X → 1/X (even harder to interpret since this will result in sign reversal)

Transforming to remove skewness often/usually removes kurtosis, but this is not guaranteed

“Normalization” as an extreme option (e.g., map rank-ordered data onto N(0,1) distribution).
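Both the log transformation and the rank-based "normalization" can be tried quickly outside the SEM package. The sketch below, in Python, is illustrative only: the function names, the `offset` parameter and the sample data are assumptions, and the rank-to-quantile rule used, (rank - 0.5)/n, is one common choice among several.

```python
import math
from statistics import NormalDist

def skewness(values):
    """Standardized third central moment (1/N divisor)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def log_transform(values, offset=0.0):
    """Natural log after adding an offset; the offset must make every value positive."""
    shifted = [v + offset for v in values]
    if min(shifted) <= 0:
        raise ValueError("increase the offset: all shifted values must be > 0")
    return [math.log(v) for v in shifted]

def normal_scores(values):
    """Map ranks onto N(0,1) quantiles; tied values simply receive consecutive ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda idx: values[idx])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = NormalDist().inv_cdf((rank - 0.5) / n)
    return scores

raw = [1, 2, 2, 3, 3, 3, 4, 5, 8, 13, 40]  # made-up, right-skewed data
logged = log_transform(raw)
print(f"skewness raw = {skewness(raw):.3f}, logged = {skewness(logged):.3f}")
print([round(s, 2) for s in normal_scores(raw)])
```

The normal-scores version discards the original metric entirely, which is why it is best treated as a last resort.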

36

Non-Normality

Generally, if kurtosis is between +1 and -1, it is not considered too problematic (see Bollen, 1989)

From this…….

37

Transformations

AMOS: Transformations must be performed on the SPSS dataset. Save the new dataset, and work from this (e.g., COMPUTE X1 = LOG(X1).)

LISREL: Transformations can be performed in PRELIS. PRELIS already provides distribution information on variables as a “check”.

PRELIS “compute” dialogue box under transformations. Remember to SAVE the PRELIS dataset after each transformation. Use of a stat package (SPSS, Stata, SAS) may be preferable.

38

Transformations

All the usual caveats apply:

1. If a variable only has 4-5 values, transformation will not normalize it (at the very least, it will still have tucked-in tails), though it could help bring kurtosis closer to within the +1 to -1 range

2. If a categorized variable has one value with a majority of cases, then no transformation will work

3. If the variable has zero or negative values, make sure to add a constant (“offset”) before logging

39

Other solutions:

1. Robust test statistics (Bentler)
   Implementation: EQS, LISREL

2. Muthen has recently developed a WLSM (mean-adjusted) and WLSMV (mean and variance adjusted) estimator
   Implementation: MPLUS only

3. Bootstrapping
   Implementation: AMOS (easy to use), LISREL (awkward)

4. CATEGORICAL VARIABLE MODELS (CVM).

40

Bootstrapping

• Computationally intensive
• Sampling with replacement: from resampling space R draw bootstrap sample S*(n,j), where j = # of samples, n = bootstrap n
• Typically, bootstrap N = sample N
• Repeat resampling B times, get set of values
• Issue: what if, across 200 resamples, 2 of them have ill-defined matrices? Usually, these are discarded

41

Bootstrapping (continued)

• Tests: 5% confidence intervals (want a large # of samples). Confidence intervals do not need to be symmetric (can look to the values at the 95th and at the 5th percentiles among the bootstrapped samples).
• More common to compute standard errors
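The resampling logic can be sketched independently of any SEM package. In the Python example below the statistic is simply a sample mean on made-up data; in an actual application each resample would be used to refit the model and save a parameter estimate. The function names, seed and number of resamples are assumptions for illustration.

```python
import math
import random
import statistics

def bootstrap_estimates(sample, statistic, n_resamples=1000, seed=2024):
    """Draw resamples of the original size with replacement and collect the statistic."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        try:
            estimates.append(statistic(resample))
        except (ValueError, ZeroDivisionError):
            continue  # discard ill-defined resamples
    return estimates

def nearest_rank_percentile(sorted_values, q):
    """Value at quantile q (0 < q <= 1) using the nearest-rank rule."""
    rank = max(1, math.ceil(q * len(sorted_values)))
    return sorted_values[rank - 1]

sample = [1, 2, 2, 3, 3, 3, 4, 5, 8, 13, 40]  # made-up data
estimates = sorted(bootstrap_estimates(sample, statistics.mean))
lower = nearest_rank_percentile(estimates, 0.05)
upper = nearest_rank_percentile(estimates, 0.95)
se = statistics.stdev(estimates)
print(f"5th percentile = {lower:.2f}, 95th percentile = {upper:.2f}, bootstrap SE = {se:.2f}")
```

The percentile limits need not be equidistant from the original estimate, which is the asymmetry the slide refers to; the bootstrap standard error is just the standard deviation of the saved estimates.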

42

Bootstrapping

• Overall model X2 correction (available in AMOS): Bollen and Stine.
• Yang and Bentler (chapter in Marcoulides & Schumacker): “faith” in the bootstrap is based on its appropriateness in other applications
• Simulation study, 1995, exploratory factor analysis: rotated solutions close, but not so with unrotated solutions

“It seems that in the present stage of development, the use of the bootstrap estimator in covariance structure analysis is still limited. It is not clear whether one can trust the bias estimates.”

43

Bootstrapping

Ichikawa and Konishi, 1995
• When data are multinormal, bootstrap se’s are not as good as ML
• Bootstrap doesn’t seem to work when N<150: consistent overestimation (at N=300, not a problem though).

44

The Categorical Variable Model

Conceptual background:

We observe y, with C discrete values; we are interested in latent y*:

yi = Ci - 1   if v(i,Ci-1) < yi*
yi = Ci - 2   if v(i,Ci-2) < yi* ≤ v(i,Ci-1)
yi = Ci - 3   if v(i,Ci-3) < yi* ≤ v(i,Ci-2)
…..
yi = 1        if v(i,1) < yi* ≤ v(i,2)
yi = 0        if yi* ≤ v(i,1)

where the v’s are threshold parameters to be estimated.
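The threshold rule amounts to counting how many thresholds lie strictly below y*. A minimal sketch in Python; the function name `categorize` and the threshold values are made up for illustration.

```python
from bisect import bisect_left

def categorize(y_star, thresholds):
    """Observed category 0..C-1: the number of (ascending) thresholds strictly below y*."""
    return bisect_left(thresholds, y_star)

thresholds = [-1.0, 0.0, 1.0]  # made-up thresholds giving C = 4 categories
for y_star in (-1.5, -1.0, -0.5, 0.0, 0.1, 2.0):
    print(f"y* = {y_star:+.1f} -> y = {categorize(y_star, thresholds)}")
```

Because the upper bound of each interval is inclusive, a y* exactly equal to a threshold falls in the lower of the two adjacent categories.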

45


Observed and Latent Correlations

X-variable scale   Y-variable scale   Observed correlation   Latent correlation
Continuous         continuous         Pearson                Pearson
Continuous         categorical        Pearson                polyserial
Continuous         dichotomous        point-biserial         biserial
Categorical        categorical        Pearson                polychoric
Dichotomous        dichotomous        phi                    tetrachoric

If it is reasonable to assume that continuous and normally distributed y* variables underlie the categorical y variables, a variety of latent correlations can be specified.

46


• First step: estimate thresholds using ML
• Second step: latent correlations estimated
• Third step: obtain a consistent estimator of the asymptotic covariance matrix of the latent correlations (for use in a weighted least squares estimator in the SEM model).

Extreme case: ability to recover the y* model when variables are split into 25%/75% dichotomies: promising (though X2 underestimated)
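The first step can be illustrated for a single variable: if y* is assumed standard normal, each threshold is the normal quantile of the cumulative proportion of cases at or below that category. The Python sketch below covers only this univariate step; the function name is illustrative, and the latent correlations and their asymptotic covariance matrix are not computed.

```python
from statistics import NormalDist

def thresholds_from_proportions(proportions):
    """Thresholds on a standard-normal y* from category proportions (lowest category first)."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    cumulative = 0.0
    thresholds = []
    for p in proportions[:-1]:  # C categories give C - 1 thresholds
        cumulative += p
        thresholds.append(NormalDist().inv_cdf(cumulative))
    return thresholds

# The 25%/75% dichotomy mentioned on the slide
print(thresholds_from_proportions([0.25, 0.75]))
# A made-up four-category item
print(thresholds_from_proportions([0.10, 0.40, 0.30, 0.20]))
```

An empty lowest category would give a cumulative proportion of 0, for which the normal quantile is undefined, so sparse categories usually need to be collapsed first.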
