Analysis of Interaction Effects

James Jaccard, New York University

Overview

Will cover the basics of interaction analysis, highlighting multiple-regression-based strategies

Will discuss advanced issues and complications in interaction analysis. This treatment will be somewhat superficial but hopefully informative

Conceptual Foundations of Interaction Analysis

Causal Theories

Most (but not all) theories rely heavily on the concept of causality, i.e., we seek to identify the determinants of a behavior or mental state and/or the consequences of a behavior or environmental/mental state

I am going to ground interaction analysis in a causal framework

Causal Theories

Causal theories can be complicated, but at their core, there are five types of causal relationships in causal theories

Direct Causal Relationships

A direct causal relationship is when a variable, X, has a direct causal influence on another variable, Y:

[Diagram: Frustration → Aggression (+)]

[Diagram: Quality of Relationship with Mother → Adolescent Drug Use]

Indirect Causal Relationships

An indirect causal relationship is when a variable, X, has a causal influence on another variable, Y, through an intermediary variable, M:

[Diagram: Quality of Relationship with Mother → Adolescent School Work → Adolescent Drug Use]

Spurious Relationship

A spurious relationship is one where two variables that are not causally related share a common cause:

Bidirectional Causal Relationships

A bidirectional causal relationship is when a variable, X, has a causal influence on another variable, Y, and that effect, Y, has a “simultaneous” impact on X:

[Diagram: Quality of Relationship with Mother ⇄ Adolescent Drug Use]

Moderated Causal Relationships

A moderated causal relationship is when the impact of a variable, X, on another variable, Y, differs depending on the value of a third variable, Z

[Diagram: Treatment vs. No Treatment → Depression, moderated by Gender]

[Diagram: Exp Negative Peers → Drug Use, moderated by Quality of Parent-Adolescent Relationship]

The variable that “moderates” the relationship is called a moderator variable.

Causal Theories

We put all these ideas together to build complex theories of phenomena. Here is one example:

[Diagram labels: Quality of Relationship with Mother; Adolescent Drug Use; Adolescent School Work; Time Mother Spends with; Gender]

Interaction Analysis

Interactions, when translated into causal analysis, focus on moderated relationships

When I encounter an interaction effect, I think:

A key step in interaction analysis is to identify the focal independent variable and the moderator variable.

Sometimes it is obvious – such as with the analysis of a treatment for depression on depression as moderated by gender

[Diagram: Treat vs Control → Depression, moderated by Gender]

Sometimes it is not obvious – such as an analysis of the effects of gender and ethnicity on the amount of time an adolescent spends with his or her mother

Statistically, it matters not which variables take on which role. Conceptually, it does.

[Diagram labels: Gender; Ethnicity; Time Spent]

The Statistical Analysis of Interactions

Some Common Practices

Omnibus tests – I do not use these

Hierarchical regression – I use sparingly

Focus on unstandardized coefficients – we tend to stay away from standardized coefficients in interaction analysis because they can be misleading and they do not have “clean” mathematical properties

A “Trick” We Will Use: Linear Transformations

Y = a + b1 X + e

Satisfaction = a + b1 Grade + e

Satisfaction = 12 + -.50 Grade + e

Satisfaction = 9 + -.50 (Grade – 6) + e

Subtracting 6 from Grade leaves the slope unchanged; the intercept becomes the predicted satisfaction at Grade = 6 (12 – .50 × 6 = 9)

“Mean centering” is when we subtract the mean
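A minimal sketch of the linear-transformation trick, assuming hypothetical noise-free data that follow the slide's equation exactly (the grade values 6 to 10 and the helper `fit_line` are invented for illustration). It shows that subtracting a constant from the predictor moves the intercept but not the slope:

```python
import numpy as np

# Hypothetical noise-free data that follow the slide's equation exactly:
# Satisfaction = 12 - 0.50 * Grade
grade = np.array([6.0, 7.0, 8.0, 9.0, 10.0])
satisfaction = 12.0 - 0.50 * grade

def fit_line(x, y):
    """Return (intercept, slope) from an ordinary least squares fit."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1]

raw_intercept, raw_slope = fit_line(grade, satisfaction)
# Subtracting 6 makes the intercept the predicted value at Grade = 6
shifted_intercept, shifted_slope = fit_line(grade - 6.0, satisfaction)
# Mean centering: the intercept becomes the predicted value at the mean grade
centered_intercept, centered_slope = fit_line(grade - grade.mean(), satisfaction)
```

The same idea is what later lets a regression coefficient be read as an effect "at" whatever value of the moderator has been moved to zero.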

Will focus on four cases:

Categorical IV and Categorical MV

Continuous IV and Categorical MV

Categorical IV and Continuous MV

Continuous IV and Continuous MV

Assume you know the basics of multiple regression and dummy variables in multiple regression

Y = Relationship satisfaction (0 to 10)

X = Gender (female = 1, male = 0)

Z = Grade (6th = 1, 7th = 0)

          6th    7th
Female    8.0    7.0
Male      7.0    4.0

Three questions:

Is there a gender difference for 6th graders?

Is there a gender difference for 7th graders?

Are these gender effects different?

Gender effect for 6th grade: 8 – 7 = 1

Gender effect for 7th grade: 7 – 4 = 3

Interaction contrast: (8-7) – (7– 4) = -2

Y = a + b1 Gender + b2 Grade + b3 (Gender)(Grade)

Y = 4.0 + 3.0 Gender + b2 Grade + -2.0 (Gender)(Grade)

Flipped: Y = 7.0 + 1.0 Gender + b2 Grade + 2.0 (Gender)(Grade)

b1 is the gender effect for the grade that is coded 0: 7th graders in the first equation, 6th graders after the Grade coding is flipped
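As a check on the coefficients above, here is a small sketch that solves the dummy-variable model directly from the four cell means on the slide. The function name `fit_interaction` and the `flip_grade` switch are illustrative choices, not part of the original material:

```python
import numpy as np

# Cell means from the slide: (gender, grade) -> mean satisfaction
# Gender: female = 1, male = 0; Grade: 6th = 1, 7th = 0
cell_means = {(1, 1): 8.0, (1, 0): 7.0, (0, 1): 7.0, (0, 0): 4.0}

def fit_interaction(cells, flip_grade=False):
    """Solve Y = a + b1*Gender + b2*Grade + b3*Gender*Grade from cell means."""
    rows, y = [], []
    for (gender, grade), mean in cells.items():
        g = 1 - grade if flip_grade else grade  # optionally recode 6th = 0, 7th = 1
        rows.append([1.0, gender, g, gender * g])
        y.append(mean)
    coef, *_ = np.linalg.lstsq(np.array(rows), np.array(y), rcond=None)
    return coef  # [a, b1, b2, b3]

original = fit_interaction(cell_means)                  # 7th grade is the reference
flipped = fit_interaction(cell_means, flip_grade=True)  # 6th grade is the reference
```

With these cell means, `original` recovers a = 4.0, b1 = 3.0 and b3 = -2.0, and `flipped` recovers a = 7.0, b1 = 1.0 and b3 = 2.0, matching the two equations. The b2 values (3.0 and -3.0) are the grade effect for males under each coding.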

Extend to groups > 2 (add 8th grade)


Inclusion of covariates

How to generate means and tables

X = Time spent together (in hours)

Z = Gender (female = 1, male = 0)

Three questions:

For females: b = 0.33

For males: b = 0.20

Are the effects different: 0.33 – 0.20 = 0.13

Y = a + b1 Gender + 0.20 Time + 0.13 (Gender)(Time)

Flipped: Y = a + b1 Gender + 0.33 Time + -0.13 (Gender)(Time)

Do not estimate slopes separately; use flipped reference group strategy
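The flipped reference group strategy can be sketched as follows, assuming hypothetical noise-free data. The slopes 0.33 and 0.20 come from the slide; the intercepts 2.0 and 3.0, the range of hours, and the helper `fit` are invented for illustration:

```python
import numpy as np

# Hypothetical noise-free data: the slopes (0.33 for females, 0.20 for males)
# come from the slide; the intercepts 2.0 and 3.0 are invented for illustration.
time = np.tile(np.arange(0.0, 10.0), 2)
gender = np.repeat([1.0, 0.0], 10)  # female = 1, male = 0
y = np.where(gender == 1, 2.0 + 0.33 * time, 3.0 + 0.20 * time)

def fit(gender_code):
    """Fit Y = a + b1*Gender + b2*Time + b3*Gender*Time; return [a, b1, b2, b3]."""
    design = np.column_stack(
        [np.ones_like(time), gender_code, time, gender_code * time]
    )
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

male_ref = fit(gender)          # Time coefficient = slope for males
female_ref = fit(1.0 - gender)  # flipped coding: Time coefficient = slope for females
```

Both fits are the same model, so the interaction coefficient only changes sign (0.13 versus -0.13). In a real analysis, each fit would also supply the standard error and test for the slope of its reference group, all from one pooled model.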

Extend to groups > 2 (use grade as example)

Study conducted in Miami with bilingual Latinos

Ad language: Half shown ad in Spanish (0) and half in English (1)

Latino identity: 1 = not at all, 7 = strongly identify

Outcome = Attitude toward product (1 = unfavorable, 7 = favorable)

Hypothesized moderated relationship

Common Analysis Form: Median Split

Many researchers are not sure how to analyze this, so they apply a median split to the continuous moderator variable and conduct an ANOVA

Why this is bad practice….

Identity    Mean English – Mean Spanish
1            1.50
2            1.00
3            0.50
4            0.00
5           -0.50
6           -1.00
7           -1.50

Y = a + b1 Ad language + b2 Identity + b3 Ad X Identity

To make the intercept meaningful, 1 was subtracted from the Latino identity measure, so that it ranges from 0 to 6

Mean attitude for Spanish ad for Latino ID = 1 is 3.215

Mean attitude for English ad for Latino ID = 1 is 4.922

Mean difference for Latino ID = 1 is 1.707 (p < 0.05)

Identity    Mean English    Mean Spanish    Difference
1           4.922           3.215            1.707*
2           4.915           3.662            1.253*
3           4.908           4.108            0.800*
4           4.901           4.555            0.346*
5           4.895           5.002           -0.107
6           4.888           5.449           -0.561*
7           4.882           5.896           -1.014*

* p < 0.05
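The rows of this table can be generated by re-running the regression with Identity recentered at each value in turn. The sketch below illustrates that, using the 14 tabled means as noise-free pseudo-observations in place of the raw data, which are not available here. That substitution and the helper `language_effect_at` are assumptions made for illustration; no standard errors or significance tests can be obtained this way:

```python
import numpy as np

identity = np.arange(1, 8, dtype=float)
mean_english = np.array([4.922, 4.915, 4.908, 4.901, 4.895, 4.888, 4.882])
mean_spanish = np.array([3.215, 3.662, 4.108, 4.555, 5.002, 5.449, 5.896])

# Stack the 14 tabled means as pseudo-observations (English = 1, Spanish = 0)
ad = np.concatenate([np.ones(7), np.zeros(7)])
ident = np.concatenate([identity, identity])
y = np.concatenate([mean_english, mean_spanish])

def language_effect_at(value):
    """Coefficient for Ad language when Identity is recentered at `value`."""
    shifted = ident - value
    design = np.column_stack([np.ones_like(y), ad, shifted, ad * shifted])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

# English-minus-Spanish difference at each identity score
effects = {int(v): round(float(language_effect_at(v)), 3) for v in identity}
```

Subtracting 1 from Identity gives the effect of ad language for respondents scoring 1, subtracting 2 gives it for those scoring 2, and so on. Up to rounding in the tabled means, `effects` reproduces the Difference column.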

(Common practice, Mean = 3, SD = 1.2; Show R program)

Y: Child anxiety (0 to 20)

X: Parent anxiety (0 to 20)

Z: Parenting behavior: Control (0 to 20)


Control    b for Y onto X
7          .10
8          .20
9          .30
10         .40
11         .50
12         .60
13         .70

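Y = a + b1 Control + 0.10 PA + 0.10 (Control)(PA)

(PA = parent anxiety. The 0.10 coefficient for PA is the slope of Y on PA where Control = 0, so the equation matches the table only if Control has been transformed by subtracting 7)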
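A sketch of how the slope of child anxiety on parent anxiety (PA) can be read off at any level of Control by recentering Control before fitting. The 0.10 coefficients come from the slide. The values a = 2.0 and b1 = 0.05, the 0 to 20 grid of noise-free data, and the assumption that Control was centered at 7 in the slide's equation are all invented for illustration:

```python
import numpy as np

# Hypothetical noise-free data. The 0.10 coefficients come from the slide;
# a = 2.0 and b1 = 0.05 are invented, and Control is assumed centered at 7.
control, pa = np.meshgrid(np.arange(0.0, 21.0), np.arange(0.0, 21.0))
control, pa = control.ravel(), pa.ravel()
y = 2.0 + 0.05 * (control - 7) + 0.10 * pa + 0.10 * (control - 7) * pa

def slope_of_pa_at(control_value):
    """Refit with Control recentered; the PA coefficient is the simple slope."""
    c = control - control_value
    design = np.column_stack([np.ones_like(y), c, pa, c * pa])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[2]

# Slope of Y on PA at Control = 7, 8, ..., 13
simple_slopes = {v: round(float(slope_of_pa_at(v)), 2) for v in range(7, 14)}
```

Under these assumptions `simple_slopes` runs from 0.10 at Control = 7 to 0.70 at Control = 13, the same pattern as the table: each one-unit increase in Control raises the slope by the interaction coefficient.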

(Common practice versus regions of significance)

(Why we include component parts)

Advanced Topics

Three Way Interactions

Identify focal independent variable

Identify first order moderator variable

Identify second order moderator variable

[Diagram labels: Gender; Satisfaction; Ethnicity]

European American (1)

              G7 (1)    G8 (0)
Female (1)    6.0       6.0
Male (0)      5.0       4.0

IC = (6-5) – (6-4) = -1

Latinos (0)

              G7 (1)    G8 (0)
Female (1)    6.0       6.0
Male (0)      6.0       6.0

IC = (6-6) – (6-6) = 0

TW = [(6-5) – (6-4)] - [(6-6) – (6-6)] = -1

Y = 6.0 + 0 Gender + b2 Grade + b3 Ethnic + 0 (Gender)(Grade) + b5 (Gender)(Ethnic) + b6 (Grade)(Ethnic) + -1 (Gender)(Grade)(Ethnic)
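The three-way contrast and the regression coefficients can be checked against each other with a short sketch that uses only the eight cell means from the slides. The helper `interaction_contrast` is an illustrative name:

```python
import numpy as np
from itertools import product

# Cell means from the slides, keyed by (gender, grade, ethnic)
# Gender: female = 1, male = 0; Grade: G7 = 1, G8 = 0;
# Ethnic: European American = 1, Latino = 0
cell_means = {
    (1, 1, 1): 6.0, (1, 0, 1): 6.0, (0, 1, 1): 5.0, (0, 0, 1): 4.0,
    (1, 1, 0): 6.0, (1, 0, 0): 6.0, (0, 1, 0): 6.0, (0, 0, 0): 6.0,
}

def interaction_contrast(ethnic):
    """(female - male at G7) minus (female - male at G8) within one ethnic group."""
    m = cell_means
    return (m[1, 1, ethnic] - m[0, 1, ethnic]) - (m[1, 0, ethnic] - m[0, 0, ethnic])

# Difference between the two-way contrasts of the two ethnic groups
three_way = interaction_contrast(1) - interaction_contrast(0)

# Saturated regression on the eight cells: one equation per cell
rows, y = [], []
for g, gr, e in product([0, 1], repeat=3):
    rows.append([1, g, gr, e, g * gr, g * e, gr * e, g * gr * e])
    y.append(cell_means[g, gr, e])
coef = np.linalg.solve(np.array(rows, dtype=float), np.array(y))
```

Here `coef` is ordered as intercept, Gender, Grade, Ethnic, Gender×Grade, Gender×Ethnic, Grade×Ethnic, and the three-way term. Its last entry equals `three_way`, and the Gender×Grade entry equals the interaction contrast for the group coded 0 on Ethnic (Latinos).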

Modeling Non-Linear Interactions

Y = α + β1 X + β2 Z + ε

β1 = α’ + β3 Z + β4 Z²

Substitute right hand side for β1:

Y = α + (α’ + β3 Z + β4 Z²) X + β2 Z + ε

Expand:

Y = α + α’X + β3 XZ + β4 XZ² + β2 Z + ε

Re-arrange terms:

Y = α + α’X + β2 Z + β3 XZ + β4 XZ² + ε

Re-label and you have your model:

Y = α + β1 X + β2 Z + β3 XZ + β4 XZ² + ε

Use centering strategy to isolate effect of X on Y (β1 ) at any given value of Z; also consider modeling intercept
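A sketch of the centering strategy for this model, assuming hypothetical coefficient values and noise-free data on a small grid (all numbers and function names are invented for illustration). After Z is recentered at a chosen value, the coefficient for X is the effect of X on Y at that value of Z, and it agrees with the direct formula β1 + β3 Z + β4 Z²:

```python
import numpy as np

# Hypothetical coefficients, invented for illustration only
alpha, b1, b2, b3, b4 = 1.0, 0.50, 0.20, 0.30, -0.05

x, z = np.meshgrid(np.arange(0.0, 6.0), np.arange(0.0, 6.0))
x, z = x.ravel(), z.ravel()
y = alpha + b1 * x + b2 * z + b3 * x * z + b4 * x * z**2  # noise-free

def effect_of_x_at(z_value):
    """Coefficient for X after recentering Z at `z_value`."""
    zc = z - z_value
    design = np.column_stack([np.ones_like(y), x, zc, x * zc, x * zc**2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

def effect_by_formula(z_value):
    """Same quantity computed directly: b1 + b3*Z + b4*Z^2."""
    return b1 + b3 * z_value + b4 * z_value**2
```

For example, `effect_of_x_at(2.0)` and `effect_by_formula(2.0)` both give 0.50 + 0.60 - 0.20 = 0.90 under these made-up coefficients.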

Exploratory Interaction Analysis

Use program in R

Y = Tenured or not (using MLPM)

X = Number of articles published

Z = Number of years since hired

Y = α + β1 X + ε

X COEFFICIENT AND M VALUES

N      M Value    X Slope
478    1.000      .000
475    2.000      .002
457    3.000      .007
408    4.000      .007
330    5.000      .009
246    6.000      .008
166    7.000      .005
115    8.000      .009
74     9.000      .011
48     10.000     .001

Regression Mixture Modeling

BI = α + β1 Aact + β2 PN + β3 PBC + ε

Mixture Regression

When we regress Y onto a set of predictors, we assume that people are drawn from a single population with common linear coefficients

But, in reality, we probably are mixing heterogeneous population segments with different coefficients characterizing the segments

With “mixed” populations, the overall regression analysis may characterize neither segment very well and can lead to sub-optimal inferences and intervention strategies
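A toy illustration of this aggregation problem, assuming two hypothetical segments of equal size with invented slopes. It is not a mixture model, only a demonstration of what pooling does:

```python
import numpy as np

# Two hypothetical segments with the same X values but different slopes
x = np.arange(0.0, 10.0)
y_segment_a = 1.0 + 0.50 * x  # X matters in segment A
y_segment_b = 1.0 + 0.00 * x  # X is irrelevant in segment B

def slope(xs, ys):
    """Least squares slope of ys on xs."""
    design = np.column_stack([np.ones_like(xs), xs])
    coef, *_ = np.linalg.lstsq(design, ys, rcond=None)
    return coef[1]

# One regression on everyone, ignoring segment membership
pooled_slope = slope(
    np.concatenate([x, x]), np.concatenate([y_segment_a, y_segment_b])
)
```

The pooled slope comes out at 0.25, which describes neither the segment where the slope is 0.50 nor the one where it is zero.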

Another Example of Aggregation Bias

[Diagram: SN → Intention, moderated by Latent Class X]

Mixture Model for Heavy Episodic Drinking

A four class model fits data best (entries are linear coefficients)

                   Aact    SN     DN     PBC
Segment 1 (42%)    .33     .02    .01    -.01
Segment 2 (17%)    .10     .29    .30     .01
Segment 3 (21%)    .30     .29    .05     .04
Segment 4 (20%)    .48     .09    .25    -.03

Interaction Analysis and Establishing Generalizability

It is common for people to conclude that an effect “generalizes” in the absence of a statistically significant interaction effect

The problem is that we can never accept the null hypothesis of a zero interaction contrast

Generalizability

Example with RCT of obesity treatment and gender

Solution: Adopt the framework of equivalence testing

Step 1: Specify a threshold value that will be used to define functional equivalence

Step 2: Specify the range of functional equivalence

Step 3: Calculate the 95% CI for the interaction contrast

Step 4: Determine if the CI is completely within the range of functional equivalence
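The four steps above can be sketched as a small function. The contrast values, standard errors, threshold, and the large-sample critical value of 1.96 are all hypothetical choices made for illustration:

```python
def equivalence_check(contrast, std_error, threshold, critical=1.96):
    """Return the CI for the interaction contrast and whether it lies
    entirely inside the range of functional equivalence (-threshold, +threshold)."""
    # Step 3: 95% CI, assuming a normal critical value (an approximation)
    lower = contrast - critical * std_error
    upper = contrast + critical * std_error
    # Step 4: the whole interval must fall inside the equivalence range
    within = (-threshold < lower) and (upper < threshold)
    return (lower, upper), within

# Hypothetical numbers, invented for illustration (Steps 1 and 2: threshold = 2.0)
ci_narrow, equivalent_narrow = equivalence_check(contrast=0.5, std_error=0.5, threshold=2.0)
ci_wide, equivalent_wide = equivalence_check(contrast=0.5, std_error=1.5, threshold=2.0)
```

With the same point estimate, the precisely estimated contrast supports functional equivalence while the imprecise one does not, which is the distinction a non-significant interaction test alone cannot make.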

Measurement Error

It is well known that measurement error can bias parameter estimates in multiple regression. This holds with vigor for interaction analysis

One approach to dealing with measurement error in general is to use latent variable modeling

[Diagram: latent variable Depression measured by indicators D1, D2, D3, with error terms e1, e2, e3]

Latent Variable Regression

[Diagram: latent variable model including Support, with indicators X1, X2, X3 and Z1, Z2, Z3 and error terms e4, e5, e6, e10, e11]

There are about half a dozen approaches to modeling latent variable interactions (e.g., quasi-maximum likelihood; Bayesian). I recommend the approach developed by Herbert Marsh as a good balance between utility and complexity, coupled with Huber-White sandwich estimators for robustness

Latent Variable Regression

Latent variable regression using multiple group analysis

[Diagram: indicators X1, X2, X3 and Z1, Z2, Z3 with error terms e4, e5, e6]

Multi-Group Modeling in SEM

Assumption Violations

If assumptions of normality or variance homogeneity are suspect

Use approaches with robust standard errors

Huber-White sandwich estimators

Bootstrapping

Be careful of outlier resistant robust methods

Rand Wilcox work with smoothers
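A minimal sketch of Huber-White (HC0) sandwich standard errors for an interaction model, assuming simulated heteroscedastic data. The sample size, coefficients, and variable names are invented for illustration, and HC0 is only one of several sandwich variants:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
z = rng.integers(0, 2, size=n).astype(float)
# Hypothetical data whose error variance grows with |x| (heteroscedastic)
y = 1.0 + 0.5 * x + 0.3 * z + 0.4 * x * z + rng.normal(size=n) * (0.5 + np.abs(x))

design = np.column_stack([np.ones(n), x, z, x * z])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
residuals = y - design @ coef

bread = np.linalg.inv(design.T @ design)
meat = design.T @ (design * residuals[:, None] ** 2)  # X' diag(e^2) X
hc0_cov = bread @ meat @ bread                        # Huber-White (HC0) covariance
robust_se = np.sqrt(np.diag(hc0_cov))

# Conventional covariance for comparison, which assumes constant error variance
sigma2 = residuals @ residuals / (n - design.shape[1])
classical_se = np.sqrt(np.diag(sigma2 * bread))
```

The last element of `robust_se` is the robust standard error of the interaction coefficient. Comparing it with the matching element of `classical_se` shows how much the constant-variance assumption matters in a given data set.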

Thank God It Has Ended!
