bollen (1995 sm)

Upload: luh-putu-safitri-pratiwi

Post on 02-Jun-2018


TRANSCRIPT

  • 8/10/2019 Bollen (1995 Sm)

    1/29

    STRUCTURAL EQUATION MODELS THAT ARE NONLINEAR IN LATENT VARIABLES: A LEAST-SQUARES ESTIMATOR

    Kenneth A. Bollen*

    Busemeyer and Jones (1983) and Kenny and Judd (1984) proposed methods to include interactions of latent variables in structural equation models (SEMs). Despite the value of these works, their methods are limited by the required distributional assumptions, by their complexity in implementation, and by the unknown distributions of the estimators. This paper provides a framework for analyzing SEMs ("LISREL" models) that include nonlinear functions of latent or a mix of latent and observed variables in their equations. It permits such nonlinear functions in equations that are part of latent variable models or measurement models. I estimate the coefficient parameters with a two-stage least squares estimator that is consistent and asymptotically normal with a known asymptotic covariance matrix. The observed random variables can come from nonnormal distributions. Several hypothetical cases and an empirical example illustrate the method.

    My thanks to Scott Long, the referees, and Peter Marsden for their comments on this paper and to Laura Stoker and John Zaller for their helpful discussions on the empirical example. I gratefully acknowledge the support from the Center for Advanced Study in the Behavioral Sciences and the Sociology Program of the National Science Foundation (SES-9121564).

    *University of North Carolina at Chapel Hill


    1. INTRODUCTION

    Structural equation models (SEMs), sometimes called LISREL models, are widely used in the social sciences. These general models include multiple regression, confirmatory factor analysis, classical simultaneous equation models, and a variety of other common analysis techniques as special cases (Jöreskog and Sörbom 1993). Though it is straightforward to include nonlinear functions of exogenous or predetermined observed variables into these models (Bollen 1989, pp. 128-29) or to incorporate cross-product terms of "block" variables (Marsden 1983), the treatment of models with equations that are nonlinear in latent or unobserved variables is not fully developed. Typical examples are equations that include the product of two latent variables or the square of a latent variable as explanatory variables.

    Researchers using SEMs have proposed two major solutions to this problem. One is based on the work of Busemeyer and Jones (1983), Bohrnstedt and Marwell (1978), Feucht (1989), and Heise (1986). The other derives from the work of Kenny and Judd (1984). These papers take important steps toward allowing product interactions and squared terms of latent variables into SEMs, but they have several limitations. This paper provides a more general framework for analyzing SEMs that include nonlinear functions of latent or a mix of latent and observed variables. In addition, I propose a limited information estimator for such models that is based on a two-stage least squares (2SLS) procedure described in Bollen (forthcoming). Unlike the other methods, this estimator is simple, easy to implement, and has known asymptotic properties that do not depend on the normality of the observed random variables.

    The next section reviews the literature on product interactions and squares of latent variables in SEMs and on instrumental variable/2SLS methods. Section 3 presents the notation, model assumptions, and the estimator, and Section 4 discusses the selection of instrumental variables (IVs) that are needed to implement the procedure. Section 5 includes three hypothetical examples and one empirical example to illustrate the methodology. The results are summarized in the conclusion in Section 6.


    2. LITERATURE REVIEW

    2.1. Literature on Products of Latent Variables

    An early study in the SEM literature on incorporating products of latent variables in models was by Busemeyer and Jones (1983). Busemeyer and Jones focus on a single equation:

    $$y_1 = \beta_{11} L_1 + \beta_{12} L_2 + \beta_{13} L_1 L_2 + \zeta_1, \tag{1}$$

    where $y_1$ is an observed random variable, $L_1$ and $L_2$ are latent random variables, and $\zeta_1$ is a random disturbance term with a mean of 0.

    The latent variables $L_1$ and $L_2$ are each measured with a single indicator such that

    $$y_2 = L_1 + \epsilon_2 \tag{2}$$

    $$y_3 = L_2 + \epsilon_3, \tag{3}$$

    where $E(\epsilon_i)$ is zero, and $\epsilon_2$, $\epsilon_3$, and $\zeta_1$ are distributed independently of $L_1$ and $L_2$ and of each other. The terms $L_1$, $L_2$, $\epsilon_2$, $\epsilon_3$, and $\zeta_1$ are random variables from normal distributions; $\epsilon_2$, $\epsilon_3$, and $\zeta_1$ are each homoscedastic and nonautocorrelated; and $y_1$, $L_1$, $L_2$, $y_2$, $L_1 L_2$, and $y_3$ are deviated from their means.

    Busemeyer and Jones (1983) show that knowledge of the error variances (or reliability) of $y_2$ and $y_3$, together with the results from Bohrnstedt and Marwell (1978) on estimating the reliability of the product of two normally distributed variables, allows one to consistently estimate the covariance matrix of $y_1$, $L_1$, $L_2$, and $L_1 L_2$. This in turn yields a consistent estimator of the parameters $\beta_{11}$, $\beta_{12}$, and $\beta_{13}$ in equation (1).

    The major limitations of this method are: it allows only a single indicator per latent variable; the error variances of the nonproduct observed variables must be known; tests of statistical significance of parameter estimates are not provided; it offers no methods for estimating equation intercepts; and the robustness of the estimates to violations of the normality and independence assumptions for the nonproduct latent variables and nonproduct disturbances is not given (Bollen 1989, pp. 407-8).

    Feucht (1989) draws on Fuller's (1980) work and suggests modifications that overcome some of these limitations. The Feucht-Fuller method ensures that the moment matrix that is corrected for measurement error is positive-definite, allows for nonnormally distributed explanatory variables, and provides estimates of the standard errors of the resulting coefficient estimates. Single indicators and known error variances (and error covariances if present) are still required, however. Heise's (1986) and Feucht's (1989) Monte Carlo simulation results provide mixed evidence on the value of these single indicator approaches to including interactions of latent variables.

    Kenny and Judd (1984) give an alternative method of incorporating squares of or product interactions of latent variables into SEMs (see also Wong and Long 1987; Hayduk 1987; Bollen 1989). Their method allows multiple measures of each latent variable. Products of these indicators are incorporated into the model as indicators of the products of the latent variables.

    To illustrate the Kenny-Judd method, consider the example including an interaction of latent variables in equation (1) and the indicators $y_2$ for $L_1$ and $y_3$ for $L_2$ in equations (2) and (3). Since Kenny and Judd (1984) treat multiple indicators, add one more indicator each for $L_1$ and $L_2$, as in equations (4) and (5):

    $$y_4 = \lambda_{41} L_1 + \epsilon_4 \tag{4}$$

    $$y_5 = \lambda_{52} L_2 + \epsilon_5 \tag{5}$$

    In addition to the assumptions already made for equations (1) to (3), the assumptions are that $\epsilon_4$ and $\epsilon_5$ have means of zero, come from normal distributions, are each homoscedastic and nonautocorrelated, and are independent of $L_1$, $L_2$, $\epsilon_2$, $\epsilon_3$, $\zeta_1$, and of each other. All $y$ variables are deviated from their means.

    Kenny and Judd (1984) suggest that analysts form indicators of the interaction term, $L_1 L_2$, by taking two-way products of the indicators of $L_1$ with the indicators of $L_2$. This results in four new measurement equations for the indicators of $L_1 L_2$:

    $$y_2 y_3 = L_1 L_2 + L_1 \epsilon_3 + L_2 \epsilon_2 + \epsilon_2 \epsilon_3 \tag{6}$$

    $$y_2 y_5 = \lambda_{52} L_1 L_2 + L_1 \epsilon_5 + \lambda_{52} L_2 \epsilon_2 + \epsilon_2 \epsilon_5 \tag{7}$$

    $$y_4 y_3 = \lambda_{41} L_1 L_2 + L_2 \epsilon_4 + \lambda_{41} L_1 \epsilon_3 + \epsilon_3 \epsilon_4 \tag{8}$$

    $$y_4 y_5 = \lambda_{41} \lambda_{52} L_1 L_2 + \lambda_{41} L_1 \epsilon_5 + \lambda_{52} L_2 \epsilon_4 + \epsilon_4 \epsilon_5 \tag{9}$$

    Equations (1) to (9) give the full model to estimate under the Kenny-Judd approach. This involves the introduction of a number of latent variables and combinations of latent and error variables. The list of such variables is $L_1$, $L_2$, $\epsilon_2$ to $\epsilon_5$, $\zeta_1$, $L_1 L_2$, $L_1 \epsilon_3$, $L_1 \epsilon_5$, $L_2 \epsilon_2$, $L_2 \epsilon_4$, $\epsilon_2 \epsilon_3$, $\epsilon_2 \epsilon_5$, $\epsilon_3 \epsilon_4$, and $\epsilon_4 \epsilon_5$. Estimating the measurement equations (6) to (9) involves linear and nonlinear constraints on the parameters. For instance, in equation (7) the factor loadings for $L_1 L_2$ and for $L_2 \epsilon_2$ are both equal to $\lambda_{52}$ in equation (5). Equation (9) for $y_4 y_5$ has a nonlinear constraint on the coefficient for the $L_1 L_2$ variable. Additional restrictions occur for the variances of the product latent variables in equations (6) to (9). Under the assumption that $L_1$ and $L_2$ come from normal distributions, the variance of $L_1 L_2$ must be kept equal to $\mathrm{VAR}(L_1)\mathrm{VAR}(L_2) + [\mathrm{COV}(L_1, L_2)]^2$. Other examples of the restrictions are in Kenny and Judd (1984). The introduction of the nonlinear constraints implied by the model and assumptions allows consistent estimation of the coefficients of the terms that are nonlinear in the latent variables. Kenny and Judd use a GLS fitting function (Browne 1984) to estimate their model. See Higgins and Judd (1990) for another empirical application.
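The variance restriction on the product term can be checked numerically. The sketch below is illustrative only and is not part of the original paper: the variances (2 and 3), the covariance (1), the sample size, and the function names are all hypothetical choices, and the latent variables are drawn as zero-mean bivariate normal to match the deviation-score assumption in the text.

```python
import numpy as np

def product_variance_normal(var1, var2, cov12):
    """Variance of L1*L2 implied by the normality restriction (zero-mean L1, L2)."""
    return var1 * var2 + cov12 ** 2

def simulated_product_variance(var1, var2, cov12, n, seed=0):
    """Monte Carlo estimate of VAR(L1*L2) from n zero-mean bivariate normal draws."""
    rng = np.random.default_rng(seed)
    cov = np.array([[var1, cov12], [cov12, var2]])
    draws = rng.multivariate_normal(np.zeros(2), cov, size=n)
    return np.var(draws[:, 0] * draws[:, 1], ddof=1)

# Hypothetical values chosen only for illustration.
theory = product_variance_normal(2.0, 3.0, 1.0)
simulated = simulated_product_variance(2.0, 3.0, 1.0, n=500_000)
print(theory, simulated)
```

With nonnormal latent variables the simulated value would generally drift away from the formula, which is one way to see why the restriction ties the Kenny-Judd approach to the normality assumption.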

    The Kenny-Judd method represents an advance in the ability to handle interactions and squares of latent variables, but it still has limitations. One is the lack of knowledge about the robustness of the method to the failure of the normality and independence assumptions. Another is the proliferation of product latent variables, disturbances, and observed variables that occurs with this method. Even a relatively simple model requires many terms when multiple indicators are available for each latent variable involved in the product interaction. Each of the new terms and the accompanying nonlinear constraints must be entered explicitly into the model. Also, the properties of the model with raw rather than deviation scores are not known.

    2.2. Literature on Instrumental Variables and 2SLS

    Other literature has been less concerned with nonlinear functions of latent variables but is relevant to this paper. This is the econometric literature on instrumental variables (IV) and two-stage least squares (2SLS) estimation. Most econometric texts (e.g., Johnston 1984; Judge et al. 1985) provide overviews of these methods. IV and 2SLS techniques are helpful when an explanatory variable in a regression equation is correlated with the disturbance term of the equation. An IV is a variable that is correlated with an "endogenous" explanatory variable, but it is uncorrelated with the disturbance term. In 2SLS the predicted value of the endogenous explanatory variable, from a "first-stage" ordinary least squares (OLS) regression of the explanatory variable on the IV, replaces the explanatory variable in the original equation. The "second stage" of 2SLS is the OLS regression of the original dependent variable on this predicted endogenous explanatory variable and the other explanatory variables. It provides a consistent estimator of the coefficient in the original equation. When more than one IV is available, the 2SLS estimator is an IV estimator that uses an optimal combination of instruments.

    Random measurement error in an explanatory variable creates a correlation between it and the disturbance. The bulk of econometric research on IV and measurement error is restricted to bivariate or multiple regression models with a single explanatory variable measured with error. Reiersøl (1941) was one of the first to suggest the use of IV methods as a correction for an explanatory variable measured with error. Extensions of these methods allow an explanatory variable to have more than one measure or expand to a two- to three-equation model (e.g., Bowden and Turkington 1984, pp. 3-7, 58-62; Aigner et al. 1984). IV methods for models that have nonlinear functions of observed variables also are available (see Bowden and Turkington 1984).
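A small simulation can make the measurement-error case concrete. The following is a minimal sketch under assumptions that are not in the original text: a bivariate regression with a true slope of 2, unit error variances, and a second error-laden measure of the same explanatory variable used as the instrument. All names and numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
beta = 2.0  # hypothetical true slope

x_true = rng.normal(0.0, 1.0, n)            # explanatory variable without error
y = beta * x_true + rng.normal(0.0, 1.0, n)
x_obs = x_true + rng.normal(0.0, 1.0, n)    # measure used as the regressor
x_alt = x_true + rng.normal(0.0, 1.0, n)    # second measure, used as the IV

def slope_ols(x, y):
    """OLS slope of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def slope_iv(x, y, z):
    """Simple IV slope of y on x using the single instrument z."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

ols_slope = slope_ols(x_obs, y)        # attenuated toward beta * reliability
iv_slope = slope_iv(x_obs, y, x_alt)   # close to beta in large samples
print(ols_slope, iv_slope)
```

The second measure works as an instrument here because its measurement error is drawn independently of the error in the first measure and of the equation disturbance; if the two errors were correlated, the IV slope would no longer be consistent.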

    Madansky (1964), Hägglund (1982), and Jöreskog (1983) proposed IV/2SLS methods to estimate factor analysis models. Bollen (forthcoming) developed a 2SLS estimator for the latent variable models as well. But none of these authors dealt with nonlinear functions of latent variables. The next section develops a general model and method that makes use of the 2SLS estimator for such models.

    3. MODEL AND ESTIMATOR

    Busemeyer and Jones (1983), Kenny and Judd (1984), and Feucht (1989) concentrated either on the product of two latent variables or the square of a latent variable in a single equation latent variable model. A more general approach permits any number of equations, allows other nonlinear functions of the latent or observed variables, and applies to the measurement model as well as to the latent variable model.

    Suppose that the model for the latent variables is

    $$\mathbf{L} = \boldsymbol{\alpha}_L + \mathbf{B}_1 \mathbf{L} + \mathbf{B}_2 \mathbf{f}(\mathbf{L}) + \boldsymbol{\zeta}, \tag{10}$$

    where $\mathbf{L}$ is an $m \times 1$ vector of latent variables, $\boldsymbol{\alpha}_L$ is an $m \times 1$ vector of intercept terms, $\mathbf{B}_1$ is an $m \times m$ matrix containing constant coefficients for the effects of $\mathbf{L}$ on other $L$'s, $\mathbf{f}(\mathbf{L})$ is an $n \times 1$ vector of functions that are nonlinear in $\mathbf{L}$, $\mathbf{B}_2$ is an $m \times n$ matrix containing constant coefficients for the effects of $\mathbf{f}(\mathbf{L})$ on $\mathbf{L}$, and $\boldsymbol{\zeta}$ is an $m \times 1$ vector of disturbances with $E(\boldsymbol{\zeta})$ equal to zero and each $\zeta_i$ is i.i.d. That is, the disturbance for each equation is homoscedastic and nonautocorrelated across observations, though the variance and other distributional traits of $\zeta_i$ can differ from $\zeta_j$ for $i \neq j$. Typically some elements of $\mathbf{L}$ or $\mathbf{f}(\mathbf{L})$ are "predetermined" or exogenous in the sense that they are uncorrelated with, or even independently distributed of, $\boldsymbol{\zeta}$.

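    The latent variables in $\mathbf{L}$ are observable through their indicators. A second equation provides the measurement model linking the latent to the observed variables:

    $$\mathbf{y} = \boldsymbol{\alpha}_y + \boldsymbol{\Lambda}_1 \mathbf{L} + \boldsymbol{\Lambda}_2 \mathbf{f}(\mathbf{L}) + \boldsymbol{\epsilon}, \tag{11}$$

    where $\mathbf{y}$ is a $p \times 1$ vector of random variables that are observed, $\boldsymbol{\alpha}_y$ is a $p \times 1$ vector of intercept constants for the measurement equations, $\boldsymbol{\Lambda}_1$ and $\boldsymbol{\Lambda}_2$ are $p \times m$ and $p \times n$ constant coefficient matrices for $\mathbf{L}$ and $\mathbf{f}(\mathbf{L})$, and $\boldsymbol{\epsilon}$ is a $p \times 1$ vector, where each $\epsilon_i$ is an i.i.d. random error of measurement that has a mean of zero and that is independent of $\mathbf{L}$ and $\mathbf{f}(\mathbf{L})$. If a "latent variable" is perfectly measured, then the corresponding element of $\boldsymbol{\alpha}_y$ is zero, the corresponding row of $\boldsymbol{\Lambda}_1$ has a 1 in the column that matches the latent variable and zeros in the rest of the row, and the corresponding row of $\boldsymbol{\Lambda}_2$ is zero, as is the matching element in $\boldsymbol{\epsilon}$.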

    If $\mathbf{B}_2$ and $\boldsymbol{\Lambda}_2$ are zero, then the model in equations (10) and (11) matches general SEMs with intercept terms such as Jöreskog and Sörbom's (1993) LISREL model. In the case of the LISREL model, equation (10) corresponds to the latent variable model, and equation (11) to the measurement (or confirmatory factor analysis) model. What is distinctive about the model of equations (10) and (11) is its inclusion of $\mathbf{f}(\mathbf{L})$. This permits effects that are nonlinear in the latent variables. The nonlinear terms can enter the latent variable or the measurement model. Thus the model is a generalization of the usual SEM.¹

    To help identify the model, assume that each latent variable has an indicator that "scales" the latent variable such that

    $$y_i = L_i + \epsilon_i. \tag{12}$$

    This assumption does not rule out multiple indicators for a latent variable, nor does it require that indicators be influenced by no more than one latent variable. It requires only that there be at least one indicator per latent variable that "loads" exclusively on that latent variable and that scales it by virtue of having a loading of unity. Other scaling choices are possible, but the failure to assign a scale leads to an underidentified model (see, e.g., Bollen 1989, pp. 152-54, 307-9).

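    Partition $\mathbf{y}$ such that the $m$ $y$'s that scale the latent variables occur first (as vector $\mathbf{y}_1$) and the other $(p - m)$ $y$'s second (as vector $\mathbf{y}_2$). This leads to

    $$\mathbf{y} = \begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix}, \tag{13}$$

    where

    $$\mathbf{y}_1 = \mathbf{L} + \boldsymbol{\epsilon}_1 \tag{14}$$

    and

    $$\mathbf{L} = \mathbf{y}_1 - \boldsymbol{\epsilon}_1. \tag{15}$$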

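    Substituting equation (15) into (10) transforms (10) into an equation for the observed scaling variables rather than one for the latent variables:

    $$\mathbf{y}_1 = \boldsymbol{\alpha}_L + \mathbf{B}_1 \mathbf{y}_1 + \mathbf{B}_2 \mathbf{f}(\mathbf{y}_1 - \boldsymbol{\epsilon}_1) + \boldsymbol{\epsilon}_1 - \mathbf{B}_1 \boldsymbol{\epsilon}_1 + \boldsymbol{\zeta}. \tag{16}$$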
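The substitution behind equation (16) can be spelled out in two steps. The intermediate line below is a reconstruction from equations (10) and (15) and does not appear in the original text; boldface marks vectors and matrices.

```latex
\begin{align*}
\mathbf{y}_1 - \boldsymbol{\epsilon}_1
  &= \boldsymbol{\alpha}_L
     + \mathbf{B}_1(\mathbf{y}_1 - \boldsymbol{\epsilon}_1)
     + \mathbf{B}_2\,\mathbf{f}(\mathbf{y}_1 - \boldsymbol{\epsilon}_1)
     + \boldsymbol{\zeta}
  && \text{replace every } \mathbf{L} \text{ in (10) by } \mathbf{y}_1 - \boldsymbol{\epsilon}_1 \\
\mathbf{y}_1
  &= \boldsymbol{\alpha}_L
     + \mathbf{B}_1\mathbf{y}_1
     + \mathbf{B}_2\,\mathbf{f}(\mathbf{y}_1 - \boldsymbol{\epsilon}_1)
     + \boldsymbol{\epsilon}_1
     - \mathbf{B}_1\boldsymbol{\epsilon}_1
     + \boldsymbol{\zeta}
  && \text{expand } \mathbf{B}_1(\cdot) \text{ and move } \boldsymbol{\epsilon}_1 \text{ to the right}
\end{align*}
```

The measurement errors of the scaling indicators therefore enter the equation three times: directly, through the linear coefficients, and inside the nonlinear function.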

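    Similarly use equation (15) to rewrite the measurement model in equation (11) to exclude $\mathbf{L}$:

    $$\mathbf{y} = \boldsymbol{\alpha}_y + \boldsymbol{\Lambda}_1 \mathbf{y}_1 + \boldsymbol{\Lambda}_2 \mathbf{f}(\mathbf{y}_1 - \boldsymbol{\epsilon}_1) - \boldsymbol{\Lambda}_1 \boldsymbol{\epsilon}_1 + \boldsymbol{\epsilon}. \tag{17}$$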

    ¹ One can also view this as the "all Y model" (Bollen 1989, ch. 9) with the addition of nonlinear functions of the latent variables.

    Consider a single equation from the latent variable model in equation (16):

    $$y_i = \alpha_{L_i} + \mathbf{B}_{1i} \mathbf{y}_1 + \mathbf{B}_{2i} \mathbf{f}(\mathbf{y}_1 - \boldsymbol{\epsilon}_1) - \mathbf{B}_{1i} \boldsymbol{\epsilon}_1 + \epsilon_i + \zeta_i, \tag{18}$$

    where $y_i$ is one of the indicators that scales a latent variable. The $i$ subscript signifies the $i$th row of the matrix or vector, so, for instance, $\mathbf{B}_{1i}$ is the $i$th row of $\mathbf{B}_1$ and $\epsilon_i$ is the $i$th element in the $\boldsymbol{\epsilon}_1$ vector.

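    In one broad and useful class of models, the nonlinear function of the latent variables is expressible as

    $$\mathbf{f}(\mathbf{y}_1 - \boldsymbol{\epsilon}_1) = \mathbf{g}_1(\mathbf{y}_1) + \mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\epsilon}_1), \tag{19}$$

    where $\mathbf{g}_1(\cdot)$ and $\mathbf{g}_2(\cdot)$ are functions of the respective variables in parentheses. This class of models includes the common cases of product interactions and quadratic terms of latent variables that Busemeyer and Jones (1983) and Kenny and Judd (1984) examined. For instance, suppose $\mathbf{f}(\mathbf{L})$ is a scalar that consists of the product $L_1 L_2$. Then $\mathbf{f}(\mathbf{y}_1 - \boldsymbol{\epsilon}_1)$ equals the scalar

    $$f(\mathbf{y}_1 - \boldsymbol{\epsilon}_1) = (y_1 - \epsilon_1)(y_2 - \epsilon_2) = y_1 y_2 - y_1 \epsilon_2 - y_2 \epsilon_1 + \epsilon_1 \epsilon_2, \tag{20}$$

    where $y_1 y_2$ is $g_1(\mathbf{y}_1)$ and the last three terms are $g_2(\mathbf{y}_1, \boldsymbol{\epsilon}_1)$. Or if $\mathbf{f}(\mathbf{L})$ equals $L_1^2$, then $\mathbf{f}(\mathbf{y}_1 - \boldsymbol{\epsilon}_1)$ is the scalar

    $$f(\mathbf{y}_1 - \boldsymbol{\epsilon}_1) = y_1^2 - 2 y_1 \epsilon_1 + \epsilon_1^2, \tag{21}$$

    where $y_1^2$ is $g_1(\mathbf{y}_1)$ and the remaining terms are $g_2(\mathbf{y}_1, \boldsymbol{\epsilon}_1)$.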

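    The decomposition in equation (19) is useful because it allows one to place the $\mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\epsilon}_1)$ component in the residual while keeping $\mathbf{g}_1(\mathbf{y}_1)$ in the main part of the model. For these and other functions that are expressible as in equation (19), I can write equation (18) as

    $$y_i = \alpha_{L_i} + \mathbf{B}_{1i} \mathbf{y}_1 + \mathbf{B}_{2i} \mathbf{g}_1(\mathbf{y}_1) + u_i, \tag{22}$$

    where $u_i$ is the composite disturbance term

    $$u_i = \mathbf{B}_{2i} \mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\epsilon}_1) - \mathbf{B}_{1i} \boldsymbol{\epsilon}_1 + \epsilon_i + \zeta_i. \tag{23}$$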

    In general $u_i$ will be correlated with the right-hand side variables in equation (22), and that makes ordinary least squares an inconsistent estimator of $\alpha_{L_i}$, $\mathbf{B}_{1i}$, and $\mathbf{B}_{2i}$. An exception would occur if all the right-hand side variables in the equation are measured without error and are uncorrelated with the equation disturbance, $\zeta_i$. In the more general case where the disturbance correlates with the right-hand side variables, a two-stage least squares (2SLS) estimator provides a consistent estimator of these parameters.
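As a worked illustration, the single-equation interaction model of equations (1) to (3) can be put in the form of equations (22) and (23). The expansion below is a sketch derived here by substituting $L_1 = y_2 - \epsilon_2$ and $L_2 = y_3 - \epsilon_3$ into equation (1). It ignores the mean-deviation of the product term, which would only add a constant, and the label $u_1$ for the composite disturbance is notation assumed for this illustration.

```latex
\begin{align*}
y_1 &= \beta_{11}(y_2 - \epsilon_2) + \beta_{12}(y_3 - \epsilon_3)
       + \beta_{13}(y_2 - \epsilon_2)(y_3 - \epsilon_3) + \zeta_1 \\
    &= \beta_{11} y_2 + \beta_{12} y_3 + \beta_{13}\, y_2 y_3 + u_1, \\
u_1 &= \beta_{13}\,(-\,y_2 \epsilon_3 - y_3 \epsilon_2 + \epsilon_2 \epsilon_3)
       - \beta_{11} \epsilon_2 - \beta_{12} \epsilon_3 + \zeta_1 .
\end{align*}
```

Because $\epsilon_2$ is a component of both $y_2$ and $u_1$, and $\epsilon_3$ of both $y_3$ and $u_1$, the regressors and the composite disturbance share error terms. That shared error is the source of the correlation that makes OLS inconsistent in this example.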

    The literature review described special cases where the 2SLS estimator has been successful. Here I develop a 2SLS estimator that applies to general SEMs, including the latent variable and the measurement model. And the 2SLS estimator allows for equations that are nonlinear in the latent or observed variables, requiring only that they be linear in the parameters.

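    To develop this procedure, I modify the notation somewhat. Define $N$ to be the number of cases, $\mathbf{y}_1^{(i)}$ to be the $N$ row matrix of values for the variables in $\mathbf{y}_1$ that have nonzero coefficients in the $y_i$ equation, and $\mathbf{g}_1(\mathbf{y}_1)^{(i)}$ to be the $N$ row matrix of values for the variables in $\mathbf{g}_1(\mathbf{y}_1)$ that have nonzero coefficients in the $y_i$ equation. The $N \times 1$ vector $\mathbf{y}_i$ contains the $N$ values of $y_i$ in the sample, and $\mathbf{u}_i$ is an $N \times 1$ vector of the values of $u_i$. Let $\mathbf{B}_1^{(i)}$ be a column vector of the coefficients that correspond to $\mathbf{y}_1^{(i)}$ and $\mathbf{B}_2^{(i)}$ be the coefficient column vector for $\mathbf{g}_1(\mathbf{y}_1)^{(i)}$, with all coefficients being identified parameters. Define $\mathbf{Z}_i = [\mathbf{1} : \mathbf{y}_1^{(i)} : \mathbf{g}_1(\mathbf{y}_1)^{(i)}]$ and $\mathbf{A}_i' = [\alpha_{L_i} : \mathbf{B}_1^{(i)\prime} : \mathbf{B}_2^{(i)\prime}]$. Then rewrite equation (22) as

    $$\mathbf{y}_i = \mathbf{Z}_i \mathbf{A}_i + \mathbf{u}_i. \tag{24}$$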

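    The 2SLS estimator requires a matrix of instrumental variables, say $\mathbf{V}_i$, that satisfy the assumptions

    $$\operatorname{plim}\left(\frac{1}{N} \mathbf{V}_i' \mathbf{Z}_i\right) = \boldsymbol{\Sigma}_{V_i Z_i} \tag{25}$$

    $$\operatorname{plim}\left(\frac{1}{N} \mathbf{V}_i' \mathbf{V}_i\right) = \boldsymbol{\Sigma}_{V_i V_i} \tag{26}$$

    $$\operatorname{plim}\left(\frac{1}{N} \mathbf{V}_i' \mathbf{u}_i\right) = \mathbf{0}, \tag{27}$$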

    where

    plim

    stands

    for the

    probability

    limit

    as N

    goes

    to

    infinity.

    Other assumptions are that the variables in $\mathbf{Z}_i$ have finite variances and covariances, that the right-hand-side matrices of equations (25) to (27) are finite, that $\boldsymbol{\Sigma}_{V_iV_i}$ is nonsingular, and that $\boldsymbol{\Sigma}_{V_iZ_i}$ is nonzero. These assumptions require that the instrumental variables (IVs) correlate with $\mathbf{Z}_i$ and that the IVs not correlate with the composite disturbance $u_i$. As I explain in the next section, the IVs will be observed variables ($y$'s) that are part of the model or nonlinear functions of such observed variables.

    Assume that $E[\mathbf{u}_i\mathbf{u}_i'] = \sigma^2_{u_i}\mathbf{I}$ so that the composite disturbance is homoscedastic and nonautocorrelated. Whether $E[u_i] = 0$ will depend on the nonlinear function of the latent variables that occurs in the original model. For now assume that the model is such that the mean of the composite disturbance is zero; later two of the examples will illustrate the consequences that follow when this assumption is false.

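    In general the $u_i$ of equation (24) will correlate with one or more of the variables in $\mathbf{Z}_i$. This rules out the use of single-stage OLS to estimate $\mathbf{A}_i$. The first stage of the 2SLS estimator is to perform an OLS regression of $\mathbf{Z}_i$ on $\mathbf{V}_i$, with coefficients

    $$(\mathbf{V}_i'\mathbf{V}_i)^{-1}\mathbf{V}_i'\mathbf{Z}_i. \tag{28}$$

    The $\mathbf{V}_i$ matrix is then postmultiplied by this coefficient to form $\hat{\mathbf{Z}}_i$ $(= \mathbf{V}_i(\mathbf{V}_i'\mathbf{V}_i)^{-1}\mathbf{V}_i'\mathbf{Z}_i)$, the predicted $\mathbf{Z}_i$ matrix. The second stage in the 2SLS estimation of $\mathbf{A}_i$ is the OLS regression of $\mathbf{y}_i$ on $\hat{\mathbf{Z}}_i$, which gives coefficients

    $$\hat{\mathbf{A}}_i = (\hat{\mathbf{Z}}_i'\hat{\mathbf{Z}}_i)^{-1}\hat{\mathbf{Z}}_i'\mathbf{y}_i. \tag{29}$$

    As is well known, the 2SLS estimator is a consistent estimator of $\mathbf{A}_i$ (e.g., see Johnston 1984, pp. 478-79).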
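    The two stages in equations (28) and (29) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `two_sls` and the simulated arrays are hypothetical, and least-squares solves are used in place of explicit matrix inverses for numerical stability. The toy data show the mechanics only; they are not meant to mimic a latent variable model.

```python
import numpy as np

def two_sls(y, Z, V):
    """2SLS estimate of A in y = Z A + u, using the IV matrix V."""
    # Stage 1: OLS of Z on V, coefficients (V'V)^{-1} V'Z as in equation (28).
    first_stage_coef = np.linalg.lstsq(V, Z, rcond=None)[0]
    Z_hat = V @ first_stage_coef          # predicted Z matrix
    # Stage 2: OLS of y on Z_hat, equation (29).
    A_hat = np.linalg.lstsq(Z_hat, y, rcond=None)[0]
    return A_hat, Z_hat

# Toy data (hypothetical): three IVs, three right-hand-side columns.
rng = np.random.default_rng(1)
N = 200
V = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
Z = np.column_stack([np.ones(N), V[:, 1:] + 0.5 * rng.normal(size=(N, 2))])
y = Z @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=N)

A_hat, Z_hat = two_sls(y, Z, V)
```

    Two sanity checks follow from the algebra: when the number of IVs equals the number of columns of $\mathbf{Z}_i$, the estimate reduces to $(\mathbf{V}_i'\mathbf{Z}_i)^{-1}\mathbf{V}_i'\mathbf{y}_i$, and when $\mathbf{V}_i = \mathbf{Z}_i$ it reduces to OLS.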

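    Assume that

    $$\frac{1}{\sqrt{N}}\hat{\mathbf{Z}}_i'\mathbf{u}_i \sim AN(\mathbf{0}, \sigma^2_{u_i}\boldsymbol{\Sigma}_{\hat{Z}_i\hat{Z}_i}), \tag{30}$$

    where AN refers to an asymptotically normal distribution. The previous assumptions in equations (25) to (27) imply that

    $$\operatorname{plim}\left(\frac{1}{N}\hat{\mathbf{Z}}_i'\hat{\mathbf{Z}}_i\right)^{-1} = \boldsymbol{\Sigma}^{-1}_{\hat{Z}_i\hat{Z}_i}. \tag{31}$$

    The asymptotic distribution of $\hat{\mathbf{A}}_i$ is then

    $$\sqrt{N}(\hat{\mathbf{A}}_i - \mathbf{A}_i) \overset{D}{\rightarrow} N(\mathbf{0}, \sigma^2_{u_i}\boldsymbol{\Sigma}^{-1}_{\hat{Z}_i\hat{Z}_i}), \tag{32}$$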

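    and an estimate of the asymptotic covariance matrix of $\hat{\mathbf{A}}_i$ is

    $$\operatorname{acov}(\hat{\mathbf{A}}_i) = \hat{\sigma}^2_{u_i}(\hat{\mathbf{Z}}_i'\hat{\mathbf{Z}}_i)^{-1}, \tag{33}$$

    where $\hat{\sigma}^2_{u_i} = (\mathbf{y}_i - \mathbf{Z}_i\hat{\mathbf{A}}_i)'(\mathbf{y}_i - \mathbf{Z}_i\hat{\mathbf{A}}_i)/N$.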
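    A hedged sketch of the covariance estimate in equation (33) follows. The point it illustrates is that the residuals are formed with the original $\mathbf{Z}_i$, while the inverted cross-product matrix uses the predicted $\hat{\mathbf{Z}}_i$. The function name `two_sls_acov` and the simulated data are assumptions made for illustration.

```python
import numpy as np

def two_sls_acov(y, Z, V):
    """2SLS coefficients, disturbance variance, and estimated asymptotic covariance."""
    N = Z.shape[0]
    Z_hat = V @ np.linalg.lstsq(V, Z, rcond=None)[0]
    A_hat = np.linalg.lstsq(Z_hat, y, rcond=None)[0]
    resid = y - Z @ A_hat                     # residuals use Z, not Z_hat
    sigma2_hat = float(resid @ resid) / N     # divisor N, as in the text
    acov = sigma2_hat * np.linalg.inv(Z_hat.T @ Z_hat)   # equation (33)
    return A_hat, sigma2_hat, acov

# Toy data (hypothetical), only to exercise the function.
rng = np.random.default_rng(2)
N = 300
V = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
Z = np.column_stack([np.ones(N), V[:, 1:] + 0.5 * rng.normal(size=(N, 2))])
y = Z @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=N)

A_hat, sigma2_hat, acov = two_sls_acov(y, Z, V)
std_errors = np.sqrt(np.diag(acov))
```

    The square roots of the diagonal of `acov` serve as large-sample standard errors for the coefficients.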

    Thus the preceding procedure provides a consistent estimator of the coefficients for the linear and nonlinear terms in equation (22) as well as a measure of their statistical variability.

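    I have limited the discussion to the latent variable model in equation (10) that allows effects that are nonlinear in the latent variables for the class of models described in equations (22) and (23). A similar series of steps applies to the measurement model in equation (11). Substituting equation (15), $\mathbf{y}_1 - \boldsymbol{\varepsilon}_1$, for $\mathbf{L}$ in equation (11) leads to equation (17). Analogous to equation (18) from the latent variable model, a single equation for the measurement model is

    $$y_i = \alpha_{y_i} + \boldsymbol{\Lambda}_{1i}\mathbf{y}_1 + \boldsymbol{\Lambda}_{2i}\mathbf{f}(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1) - \boldsymbol{\Lambda}_{1i}\boldsymbol{\varepsilon}_1 + \varepsilon_i. \tag{34}$$

    Considering the $\mathbf{g}_1(\cdot)$ and $\mathbf{g}_2(\cdot)$ functions as before leads to

    $$y_i = \alpha_{y_i} + \boldsymbol{\Lambda}_{1i}\mathbf{y}_1 + \boldsymbol{\Lambda}_{2i}\mathbf{g}_1(\mathbf{y}_1) + u_i, \tag{35}$$

    where

    $$u_i = \boldsymbol{\Lambda}_{2i}\mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1) - \boldsymbol{\Lambda}_{1i}\boldsymbol{\varepsilon}_1 + \varepsilon_i. \tag{36}$$

    An appropriate redefinition of $\mathbf{Z}_i$, $\mathbf{A}_i$, and $\mathbf{u}_i$ leads back to equation (24), $\mathbf{y}_i = \mathbf{Z}_i\mathbf{A}_i + \mathbf{u}_i$. Under the assumptions detailed for the latent variable model, one can obtain a consistent 2SLS estimator of $\mathbf{A}_i$ with a known asymptotic distribution.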

    4. INSTRUMENTAL VARIABLE SELECTION

    Key to the success of using the procedures developed in the preceding section is finding appropriate instrumental variables (IVs) that satisfy the conditions for IVs and that lead to an identified model. When treating the selection of IVs, many econometric texts do not explain methods for finding the IVs. In contrast, the 2SLS procedure here depends on the model structure for the creation and selection of IVs. Indeed, the structure of the full model is essential in finding IVs, as is the idea that nonlinear functions of some of the observed variables can serve as IVs.

    In practice the most challenging task is to find IVs that are uncorrelated with the composite disturbance $u_i$. Equations (25) to (27) along with the pattern of correlations among the errors, disturbances, and latent variables of the model are important aids to selecting appropriate IVs. A general procedure for selecting IVs has several steps. Assume that $v_i$ is a variable that might be a suitable instrumental variable. The following steps help to evaluate its eligibility: (1) form $\mathrm{COV}(v_i, u_i)$; (2) if $v_i$ is an endogenous variable, substitute its reduced-form equation for it; (3) substitute the right-hand side of equation (23) or (36) for $u_i$; and (4) take the covariance of the resulting terms and see if it is zero. If so, then $v_i$ passes this condition for an IV.

    A similar series of steps applies in the search for IVs that are nonlinear functions of the observed variables. For instance, when modeling the product of two latent variables, products of indicators that do not "scale" the respective latent variables are often suitable for use as IVs. Suppose that $y_1$ scales the first latent variable and $y_2$ and $y_3$ are additional measures of the same latent variable. Similarly, suppose that the $y_4$ variable scales the second latent variable and $y_5$ and $y_6$ are two other indicators. Then $y_2y_5$, $y_2y_6$, $y_3y_5$, and $y_3y_6$ often will qualify as IVs. Determination of their eligibility follows the same steps of writing a reduced-form expression for each variable in the product, obtaining the product of the reduced forms, and calculating its covariance with $u_i$ to see if it is zero. If so, this product of the observed variables can serve as an IV.
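    The four candidate product IVs named above can be generated mechanically. The sketch below is an assumption-laden illustration: the helper name `product_ivs`, the dictionary layout, and the random placeholder data are all hypothetical. It only constructs the candidates; whether each one is uncorrelated with $u_i$ still has to be established from the model as described in the text.

```python
import numpy as np
from itertools import product

def product_ivs(first_block, second_block):
    """Pairwise products of non-scaling indicators of two latent variables.

    Each argument is a dict mapping an indicator name to an (N,) array.
    Returns the product names and an (N, k) matrix of candidate IVs.
    """
    names, cols = [], []
    for (name_a, a), (name_b, b) in product(first_block.items(),
                                            second_block.items()):
        names.append(f"{name_a}*{name_b}")
        cols.append(np.asarray(a) * np.asarray(b))
    return names, np.column_stack(cols)

# Placeholder data standing in for the non-scaling indicators.
rng = np.random.default_rng(3)
N = 50
first = {"y2": rng.normal(size=N), "y3": rng.normal(size=N)}
second = {"y5": rng.normal(size=N), "y6": rng.normal(size=N)}

iv_names, iv_matrix = product_ivs(first, second)
```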

    Researchers can sometimes form another IV by regressing each observed variable in the product term on all of the individual and product IVs of observed variables and calculating the predicted values from the linear regressions for each component (e.g., $\hat{y}_1$ and $\hat{y}_2$). Then one forms $\hat{y}_1\hat{y}_2$ as an additional IV for the model. This latter IV follows a suggestion of Bowden and Turkington (1981) about the creation of IVs for nonlinear functions of endogenous observed variables in econometric models.
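    A minimal sketch of this predicted-product construction, assuming the individual and product IVs (plus a constant) have already been collected in a matrix `V`; the function name `predicted_product_iv` and the toy data are hypothetical.

```python
import numpy as np

def predicted_product_iv(a, b, V):
    """Product of the OLS fitted values of a and b, each regressed on V."""
    a_hat = V @ np.linalg.lstsq(V, a, rcond=None)[0]
    b_hat = V @ np.linalg.lstsq(V, b, rcond=None)[0]
    return a_hat * b_hat

# Toy check: if a and b lie exactly in the column space of V,
# the fitted values reproduce them and the new IV equals a * b.
rng = np.random.default_rng(4)
N = 40
V = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
a = 2.0 + V[:, 1]
b = 1.0 - V[:, 2]
new_iv = predicted_product_iv(a, b, V)
```

    In applications the two components are not exact linear functions of the IVs, so the new column differs from the raw product and can be appended to `V` as one more instrument.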

    The Kenny and Judd (1984) example discussed above provides an illustration of the selection of IVs for the 2SLS method. Recall that the latent variable equation was

    $$y_1 = \beta_{11}L_1 + \beta_{12}L_2 + \beta_{13}L_1L_2 + \zeta_1, \tag{37}$$

    with $y_2$ and $y_3$ the indicators that scale $L_1$ and $L_2$, respectively (see equations [2] and [3]). Substituting $(y_2 - \varepsilon_2)$ for $L_1$ and $(y_3 - \varepsilon_3)$ for $L_2$ leads to

    $$y_1 = \beta_{11}y_2 + \beta_{12}y_3 + \beta_{13}y_2y_3 + u_1, \tag{38}$$

    where $u_1 = -\beta_{11}\varepsilon_2 - \beta_{12}\varepsilon_3 - \beta_{13}y_2\varepsilon_3 - \beta_{13}y_3\varepsilon_2 + \beta_{13}\varepsilon_2\varepsilon_3 + \zeta_1$.

    All the r.h.s. variables of equation (38) are correlated with the composite disturbance, $u_1$. The $y_4$ and $y_5$ variables are indicators of $L_1$ and $L_2$, respectively (see equations [4] and [5]). The $y_4$, $y_5$, and $y_4y_5$ variables satisfy the conditions for IVs, as the reader can confirm.
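    As one way to carry out that confirmation for a single candidate, the four steps can be applied to $y_4$. The derivation below is a sketch under assumptions that are introduced here for illustration: equation [4] is taken to have the form $y_4 = \lambda_{41}L_1 + \varepsilon_4$ (the symbol $\lambda_{41}$ is hypothetical notation, and any intercept would not affect the covariances), and $\varepsilon_2$, $\varepsilon_3$, $\varepsilon_4$, and $\zeta_1$ are assumed to have zero means and to be mutually independent and independent of $L_1$ and $L_2$.

```latex
\begin{align*}
\mathrm{COV}(y_4, u_1)
  &= \mathrm{COV}\bigl(\lambda_{41}L_1 + \varepsilon_4,\;
     -\beta_{11}\varepsilon_2 - \beta_{12}\varepsilon_3
     - \beta_{13}y_2\varepsilon_3 - \beta_{13}y_3\varepsilon_2
     + \beta_{13}\varepsilon_2\varepsilon_3 + \zeta_1\bigr), \\
\mathrm{COV}(y_4, \varepsilon_2)
  &= \mathrm{COV}(y_4, \varepsilon_3) = \mathrm{COV}(y_4, \zeta_1) = 0, \\
\mathrm{COV}(y_4, y_2\varepsilon_3)
  &= E[y_4y_2]\,E[\varepsilon_3] - E[y_4]\,E[y_2]\,E[\varepsilon_3] = 0, \\
\mathrm{COV}(y_4, y_3\varepsilon_2)
  &= E[y_4y_3]\,E[\varepsilon_2] - E[y_4]\,E[y_3]\,E[\varepsilon_2] = 0, \\
\mathrm{COV}(y_4, \varepsilon_2\varepsilon_3)
  &= E[y_4]\,E[\varepsilon_2]\,E[\varepsilon_3]
     - E[y_4]\,E[\varepsilon_2]\,E[\varepsilon_3] = 0, \\
\text{so}\quad \mathrm{COV}(y_4, u_1) &= 0 .
\end{align*}
```

    The product terms vanish because each contains an error that is independent of every other factor in the expectation; mere zero correlation among the errors would not be enough for those steps. The same argument applies to $y_5$ and, with one more factorization, to $y_4y_5$.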

    Regressing $y_1$ and $y_2$ on these IVs and forming $\hat{y}_1$ and $\hat{y}_2$ and then calculating $\hat{y}_1\hat{y}_2$ leads to another IV. The 2SLS estimator using all four IVs ($y_4$, $y_5$, $y_4y_5$, and $\hat{y}_1\hat{y}_2$) is a consistent estimator of the coefficients in equation (37).
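    A small simulation can make the contrast between OLS and 2SLS in this example concrete. Everything numeric below is a hypothetical choice made for illustration (coefficient values, loadings of 0.6 and 0.7, error standard deviations, independent standard normal latent variables); none of it comes from Kenny and Judd (1984). For simplicity the sketch uses only the constant, $y_4$, $y_5$, and $y_4y_5$ as IVs and omits the predicted-product IV.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
beta = np.array([0.0, 0.5, 0.4, 0.7])   # intercept, b11, b12, b13 (hypothetical)

# Latent variables, errors, and disturbance: independent with zero means.
L1, L2 = rng.normal(size=(2, N))
e2, e3, e4, e5 = rng.normal(scale=0.5, size=(4, N))
zeta1 = rng.normal(scale=0.5, size=N)

y1 = beta[1] * L1 + beta[2] * L2 + beta[3] * L1 * L2 + zeta1   # equation (37)
y2, y3 = L1 + e2, L2 + e3              # scaling indicators of L1 and L2
y4, y5 = 0.6 * L1 + e4, 0.7 * L2 + e5  # additional indicators (assumed loadings)

Z = np.column_stack([np.ones(N), y2, y3, y2 * y3])   # r.h.s. of equation (38)
V = np.column_stack([np.ones(N), y4, y5, y4 * y5])   # instrumental variables

Z_hat = V @ np.linalg.lstsq(V, Z, rcond=None)[0]      # first stage
a_2sls = np.linalg.lstsq(Z_hat, y1, rcond=None)[0]    # second stage
a_ols = np.linalg.lstsq(Z, y1, rcond=None)[0]         # single-stage OLS

print("true :", beta)
print("2SLS :", a_2sls.round(3))
print("OLS  :", a_ols.round(3))
```

    Under these assumptions the 2SLS estimates should land close to the generating values, while the OLS coefficients are attenuated toward zero because $y_2$, $y_3$, and $y_2y_3$ are correlated with $u_1$.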

    Though the specific steps outlined above apply to any model, some general guidelines for ruling out IVs emerge from closer examination of the composite disturbance $u_i$. For the latent variable model in equation (22), equation (23) defines $u_i$; it is repeated here for easy reference:

    $$u_i = \mathbf{B}_{2i}\mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1) - \mathbf{B}_{1i}\boldsymbol{\varepsilon}_1 + \varepsilon_i + \zeta_i. \tag{39}$$

    Note that the latent variable model only has equations for the latent endogenous variables, so we do not have any equations to estimate for the latent exogenous variables in the latent variable model. Any variables correlated with $\zeta_i$ are ineligible as IVs (except in the improbable situation in which a variable has an exactly equal but opposite-in-sign covariance with the remaining components of $u_i$). In the typical situation, this means that other $y$'s that are indicators of an endogenous $L_i$ are ineligible as IVs in the latent variable model since these other indicators correlate with $\zeta_i$.² Less obvious is that indicators of latent variables that are influenced by $L_i$ are unacceptable since they too will correlate with $\zeta_i$. Also, if $\zeta_i$ correlates with $\zeta_j$, then the indicators of $L_j$ are not suitable as IVs in the latent variable model.

    The $\mathbf{B}_{1i}\boldsymbol{\varepsilon}_1$ term means that any of the scaling indicators for the latent variables that appear on the right-hand side of the $y_i$ equation cannot be IVs. Nor can $y$'s whose errors of measurement correlate with the errors of such scaling indicators serve as IVs. Furthermore, any $y$'s that have errors that correlate with $\varepsilon_i$ are ruled out as IVs. Finally, IVs must be uncorrelated with $\mathbf{B}_{2i}\mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1)$. In many cases variables that do not correlate with the other terms in $u_i$ will not correlate with this one, but there are exceptions.

    In the measurement model the composite disturbance $u_i$ equals $\boldsymbol{\Lambda}_{2i}\mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1) - \boldsymbol{\Lambda}_{1i}\boldsymbol{\varepsilon}_1 + \varepsilon_i$. The IVs must be uncorrelated with these components of $u_i$. An indicator whose error correlates with $\varepsilon_i$ is ineligible. Scaling indicators for latent variables that affect $y_i$ or indicators whose errors of measurement correlate with the errors of such scaling indicators cannot qualify as IVs either. Last, the IVs must be uncorrelated with the nonlinear term, $\boldsymbol{\Lambda}_{2i}\mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1)$. Note that unlike the composite disturbance in the latent variable model (see equation [23]), $\zeta_i$ does not appear in the composite disturbance for the measurement equation. This means that some of the observed variables that correlate with $\zeta_i$ and are hence ineligible as IVs for the latent variable equation might still be suitable IVs for equations from the measurement model.

    ² Remember that I am referring to the latent variable model here. In measurement models some of the other indicators of the same latent variable can serve as IVs.

    Another consideration in selecting IVs is that some variables might technically meet the conditions to be an IV, but they may not work well in practice. For instance, if the IVs collectively are poorly correlated with the variables that they are to replace, the resulting 2SLS estimates may be unstable and far from the true parameters. Analysts can check this by examining the $R^2$'s from the first stage of the 2SLS procedure.
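    One way such a check might be coded is sketched below. The function name `first_stage_r2` and the toy data are hypothetical; the function expects `V` to contain a constant column and `Z` to contain only the non-constant right-hand-side variables, since the $R^2$ of a constant column is undefined.

```python
import numpy as np

def first_stage_r2(Z, V):
    """R-squared from the OLS regression of each column of Z on V."""
    Z_hat = V @ np.linalg.lstsq(V, Z, rcond=None)[0]
    ss_res = ((Z - Z_hat) ** 2).sum(axis=0)
    ss_tot = ((Z - Z.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

# Toy data: one variable well predicted by the IV, one unrelated to it.
rng = np.random.default_rng(5)
N = 500
v = rng.normal(size=N)
V = np.column_stack([np.ones(N), v])
z_strong = v + 0.1 * rng.normal(size=N)
z_weak = rng.normal(size=N)
r2 = first_stage_r2(np.column_stack([z_strong, z_weak]), V)
```

    In this toy setup the first entry of `r2` is near one and the second is near zero, which is the pattern that would flag the second variable as poorly served by the available IVs.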

    Low values

    (e.g.,