Analysis of Interaction Effects
James Jaccard, New York University
Overview
Will cover the basics of interaction analysis, highlighting multiple regression based strategies
Will discuss advanced issues and complications in interaction analysis. This treatment will be somewhat superficial but hopefully informative
Conceptual Foundations of Interaction Analysis
Causal Theories
Most (but not all) theories rely heavily on the concept of causality, i.e., we seek to identify the determinants of a behavior or mental state and/or the consequences of a behavior or environmental/mental state
I am going to ground interaction analysis in a causal framework
Causal Theories
Causal theories can be complicated, but at their core, there are five types of causal relationships in causal theories
Direct Causal Relationships
A direct causal relationship is when a variable, X, has a direct causal influence on another variable, Y:
X → Y
Direct Causal Relationships
Frustration → Aggression (+)
Quality of Relationship with Mother → Adolescent Drug Use (-)
Indirect Causal Relationships
An indirect causal relationship is when a variable, X, has a causal influence on another variable, Y, through an intermediary variable, M:
X → M → Y
Indirect Causal Relationships
Quality of Relationship with Mother → Adolescent School Work Ethic → Adolescent Drug Use
Spurious Relationship
A spurious relationship is one where two variables that are not causally related share a common cause:
X ← C → Y
Bidirectional Causal Relationships
A bidirectional causal relationship is when a variable, X, has a causal influence on another variable, Y, and that effect, Y, has a “simultaneous” impact on X:
X ⇄ Y
Bidirectional Causal Relationships
Quality of Relationship with Mother ⇄ Adolescent Drug Use
Moderated Causal Relationships
A moderated causal relationship is when the impact of a variable, X, on another variable, Y, differs depending on the value of a third variable, Z
X → Y, with Z acting on the X → Y path
Moderated Causal Relationships
Treatment vs. No Treatment → Depression, moderated by Gender
Exp Negative Peers → Drug Use, moderated by Quality of Parent-Adolescent Relationship
Moderated Causal Relationships
The variable that “moderates” the relationship is called a moderator variable.
Causal Theories
We put all these ideas together to build complex theories of phenomena. Here is one example:
[Path diagram with variables: Quality of Relationship with Mother, Adolescent Drug Use, Adolescent School Work Ethic, Time Mother Spends with Child, Gender]
Interaction Analysis
Interactions, when translated into causal analysis, focus on moderated relationships
When I encounter an interaction effect, I think:
X → Y, with Z acting on the X → Y path
A key step in interaction analysis is to identify the focal independent variable and the moderator variable.
Sometimes it is obvious – such as with the analysis of the effect of a treatment for depression on depression, as moderated by gender
Interaction Analysis
Treat vs Control → Depression, moderated by Gender
Sometimes it is not obvious – such as an analysis of the effects of gender and ethnicity on the amount of time an adolescent spends with his or her mother
Interaction Analysis
Statistically, it matters not which variables take on which role. Conceptually, it does.
Ethnicity → Time Spent, moderated by Gender
The Statistical Analysis of Interactions
Some Common Practices
Omnibus tests – I do not use these
Hierarchical regression – I use sparingly
Focus on unstandardized coefficients – we tend to stay away from standardized coefficients in interaction analysis because they can be misleading and they do not have “clean” mathematical properties
A “Trick” We Will Use: Linear Transformations
Y = a + b1 X + e
Satisfaction = a + b1 Grade + e
Satisfaction = 12 - 0.50 Grade + e
Satisfaction = 9 - 0.50 (Grade – 6) + e
“Mean centering” is when we subtract the mean
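The arithmetic behind this trick can be checked with a short sketch. It uses the slide's example coefficients (intercept 12, slope -0.50); the function name `shift_intercept` is made up here for illustration and is not from the presentation.

```python
def shift_intercept(intercept, slope, shift):
    """Intercept of Y = a + b*X after X is re-expressed as (X - shift).

    Because a + b*X = (a + b*shift) + b*(X - shift), the slope is unchanged
    and only the intercept moves.
    """
    return intercept + slope * shift


a, b = 12.0, -0.50                   # Satisfaction = 12 - 0.50 * Grade
a_at_6 = shift_intercept(a, b, 6)    # intercept when Grade becomes (Grade - 6)

# Both forms give the same prediction for any grade.
for grade in (6, 7, 8):
    original = a + b * grade
    shifted = a_at_6 + b * (grade - 6)
    assert abs(original - shifted) < 1e-12

print(a_at_6)  # 9.0, the predicted satisfaction for 6th graders
```

The same idea is what makes the intercept (and, later, lower-order coefficients) interpretable at a value of interest: subtract that value from the predictor before fitting.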
Interaction Analysis
Assume you know the basics of multiple regression and dummy variables in multiple regression
Will focus on four cases (IV = independent variable, MV = moderator variable):
Categorical IV and Categorical MV
Continuous IV and Categorical MV
Categorical IV and Continuous MV
Continuous IV and Continuous MV
Categorical IV and Categorical MV
Y = Relationship satisfaction (0 to 10)
X = Gender (female = 1, male = 0)
Z = Grade (6th = 1, 7th = 0)
         6th   7th
Female   8.0   7.0
Male     7.0   4.0
Categorical IV and Categorical MV
Three questions:
Is there a gender difference for 6th graders?
Is there a gender difference for 7th graders?
Are these gender effects different?
Categorical IV and Categorical MV
Gender effect for 6th grade: 8 – 7 = 1
Gender effect for 7th grade: 7 – 4 = 3
Interaction contrast: (8 – 7) – (7 – 4) = -2
Categorical IV and Categorical MV
Y = a + b1 Gender + b2 Grade + b3 (Gender)(Grade)
Y = 4.0 + 3.0 Gender + b2 Grade - 2.0 (Gender)(Grade)
Flipped (Grade recoded so 6th = 0, 7th = 1): Y = 7.0 + 1.0 Gender + b2 Grade + 2.0 (Gender)(Grade)
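The link between the 2 x 2 table of means and the dummy-coded coefficients can be sketched as follows. The cell means are the ones from the slides; the function `two_by_two_coefficients` and its `ref_x` / `ref_z` arguments are hypothetical names introduced only for this illustration.

```python
def two_by_two_coefficients(means, ref_x=0, ref_z=0):
    """Coefficients of Y = a + b1*X + b2*Z + b3*X*Z implied by four cell means.

    means[(x, z)] is the cell mean for original codes x and z (each 0 or 1).
    ref_x / ref_z say which original code is treated as 0 (the reference
    group) in the analysis.
    """
    x0, x1 = ref_x, 1 - ref_x
    z0, z1 = ref_z, 1 - ref_z
    a = means[(x0, z0)]                              # mean of the reference cell
    b1 = means[(x1, z0)] - a                         # effect of X when Z is at its reference
    b2 = means[(x0, z1)] - a                         # effect of Z when X is at its reference
    b3 = (means[(x1, z1)] - means[(x0, z1)]) - b1    # interaction contrast
    return a, b1, b2, b3


# Keys are (Gender, Grade) with female = 1, male = 0 and 6th = 1, 7th = 0.
cell_means = {(1, 1): 8.0, (1, 0): 7.0, (0, 1): 7.0, (0, 0): 4.0}

print(two_by_two_coefficients(cell_means))            # (4.0, 3.0, 3.0, -2.0)
print(two_by_two_coefficients(cell_means, ref_z=1))   # (7.0, 1.0, -3.0, 2.0)
```

With 7th grade as the reference group, b1 is the gender effect for 7th graders (3.0); flipping the reference group makes b1 the gender effect for 6th graders (1.0) and reverses the sign of the interaction contrast.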
Categorical IV and Categorical MV
Extend to groups > 2 (add 8th grade)
Inclusion of covariates
How to generate means and tables
Continuous IV and Categorical MV
Y = Relationship satisfaction (0 to 10)
X = Time spent together (in hours)
Z = Gender (female = 1, male = 0)
For females: b = 0.33
For males: b = 0.20
Three questions:
Are the effects different? 0.33 – 0.20 = 0.13
Continuous IV and Categorical MV
Y = a + b1 Gender + 0.20 Time + 0.13 (Gender)(Time)
Flipped (male = 1, female = 0): Y = a + b1 Gender + 0.33 Time - 0.13 (Gender)(Time)
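A small sketch of how the two equations relate, using the slide's slopes (0.20 for males, 0.33 for females). The helper names `slopes_by_group` and `flip_reference` are assumptions made for illustration only.

```python
def slopes_by_group(b_time, b_interaction):
    """Slopes of Y on Time implied by
    Y = a + b1*Gender + b_time*Time + b_interaction*(Gender*Time)."""
    slope_reference = b_time                   # group coded 0
    slope_other = b_time + b_interaction       # group coded 1
    return slope_reference, slope_other


def flip_reference(b_time, b_interaction):
    """Time and Gender*Time coefficients after swapping the 0/1 codes."""
    return b_time + b_interaction, -b_interaction


male_slope, female_slope = slopes_by_group(0.20, 0.13)
flipped_time, flipped_interaction = flip_reference(0.20, 0.13)

print(round(male_slope, 2), round(female_slope, 2))             # 0.2 0.33
print(round(flipped_time, 2), round(flipped_interaction, 2))    # 0.33 -0.13
```

In an actual analysis the point of refitting with the flipped coding is that the Time coefficient then comes with its own standard error and significance test for the other group, all from a single model.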
Continuous IV and Categorical MV
Do not estimate slopes separately; use flipped reference group strategy
Extend to groups > 2 (use grade as example)
Categorical IV and Continuous MV
Study conducted in Miami with bilingual Latinos
Ad language: Half shown ad in Spanish (0) and half in English (1)
Latino identity: 1 = not at all, 7 = strong identity
Outcome = Attitude toward product (1 = unfavorable, 7 = favorable)
Hypothesized moderated relationship
Common Analysis Form: Median Split
Many researchers are not sure how to analyze this, so they use a median split for the continuous moderator variable and conduct an ANOVA
Why this is bad practice…
Categorical IV and Continuous MV
Identity   Mean English – Mean Spanish
1           1.50
2           1.00
3           0.50
4           0.00
5          -0.50
6          -1.00
7          -1.50
Y = a + b1 Ad language + b2 Identity + b3 Ad X Identity
Categorical IV and Continuous MV
To make the intercept meaningful, 1 was subtracted from the Latino identity measure, so it ranged from 0 to 6
Y = a + b1 Ad language + b2 Identity + b3 Ad X Identity
Categorical IV and Continuous MV
Mean attitude for Spanish ad for Latino ID = 1 is 3.215
Mean attitude for English ad for Latino ID = 1 is 4.922
Mean difference for Latino ID = 1 is 1.707 (p < 0.05)
Categorical IV and Continuous MV
Identity   Mean English   Mean Spanish   Difference
1          4.922          3.215           1.707*
2          4.915          3.662           1.253*
3          4.908          4.108           0.800*
4          4.901          4.555           0.346*
5          4.895          5.002          -0.107
6          4.888          5.449          -0.561*
7          4.882          5.896          -1.014*
* p < 0.05
(Common practice, Mean = 3, SD = 1.2; Show R program)
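The table of predicted means can be regenerated from a single regression equation, which is a rough sketch of what the re-centering strategy does row by row. The intercept (3.215) and ad-language coefficient (1.707) are the values reported on the slides; the identity slope (0.4468) and product-term coefficient (-0.4535) are not reported and were back-calculated from the tabled means, so treat them as approximations. The function name is hypothetical.

```python
def predicted_attitude(ad_english, identity,
                       a=3.215, b1=1.707, b2=0.4468, b3=-0.4535):
    """Predicted attitude from Y = a + b1*Ad + b2*Identity0 + b3*Ad*Identity0.

    Identity0 = identity - 1, so the intercept is the Spanish-ad mean at
    Latino identity = 1. b2 and b3 are back-calculated approximations.
    """
    identity0 = identity - 1
    return a + b1 * ad_english + b2 * identity0 + b3 * ad_english * identity0


for identity in range(1, 8):
    english = predicted_attitude(1, identity)
    spanish = predicted_attitude(0, identity)
    print(identity, round(english, 3), round(spanish, 3),
          round(english - spanish, 3))

# Identity score at which the English-minus-Spanish difference crosses zero
b1, b3 = 1.707, -0.4535
crossover = 1 + b1 / -b3
```

The crossover falls between identity scores 4 and 5, which matches the sign change in the Difference column. To get a significance test for the difference at a given identity score, one would subtract that score from Identity and refit; b1 then becomes the difference at that score.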
Continuous IV and Continuous MV
Y: Child anxiety (0 to 20)
X: Parent anxiety (0 to 20)
Z: Parenting behavior: Control (0 to 20)
Control   b for Y onto X
7         .10
8         .20
9         .30
10        .40
11        .50
12        .60
13        .70
Continuous IV and Continuous MV
Y = a + b1 Control + 0.10 PA + 0.10 (Control)(PA)
(Common practice versus regions of significance)
(Why we include component parts)
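A minimal sketch of the simple-slope logic for two continuous variables. It assumes that Control in the equation has been shifted so that 0 corresponds to a raw score of 7 (Control – 7); that is the reading under which the 0.10 coefficient for PA agrees with the first row of the table, but the slides do not state it explicitly. The function name and the `center` argument are illustrative.

```python
def slope_of_y_on_pa(control, b_pa=0.10, b_product=0.10, center=7):
    """Slope of child anxiety on parent anxiety (PA) at a given Control score.

    From Y = a + b1*C + b_pa*PA + b_product*C*PA with C = Control - center,
    the slope of Y on PA is b_pa + b_product * C.
    """
    return b_pa + b_product * (control - center)


for control in range(7, 14):
    print(control, round(slope_of_y_on_pa(control), 2))

# Re-centering Control at 10 makes the PA coefficient equal the slope at 10.
b_pa_centered_at_10 = slope_of_y_on_pa(10)
```

Each one-unit increase in Control raises the slope of Y on PA by the product-term coefficient (0.10), which is why the table steps from .10 to .70.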
Advanced Topics
Three Way Interactions
Identify focal independent variable
Identify first order moderator variable
Identify second order moderator variable
[Diagram: Gender → Satisfaction, with Grade and Ethnicity as moderators]
Three Way Interactions
European American (1)
             G7 (1)   G8 (0)
Female (1)   6.0      6.0
Male (0)     5.0      4.0
IC = (6 – 5) – (6 – 4) = -1
Latinos (0)
             G7 (1)   G8 (0)
Female (1)   6.0      6.0
Male (0)     6.0      6.0
IC = (6 – 6) – (6 – 6) = 0
TW = [(6 – 5) – (6 – 4)] – [(6 – 6) – (6 – 6)] = -1
Y = 6.0 + 0 Gender + b2 Grade + b3 Ethnic + 0 (Gender)(Grade)
+ b5 (Gender)(Ethnic) + b6 (Grade)(Ethnic) - 1 (Gender)(Grade)(Ethnic)
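The eight cell means determine all eight coefficients of the dummy-coded three-way model. The sketch below, with hypothetical function names, recovers them from the slide's tables (female = 1, G7 = 1, European American = 1) and checks that the equation reproduces every cell mean.

```python
from itertools import product


def three_way_coefficients(m):
    """Coefficients [a, b1..b7] from cell means m[(gender, grade, ethnic)].

    Order: intercept, Gender, Grade, Ethnic, Gender*Grade, Gender*Ethnic,
    Grade*Ethnic, Gender*Grade*Ethnic. All codes are 0/1.
    """
    a = m[0, 0, 0]
    b1 = m[1, 0, 0] - a
    b2 = m[0, 1, 0] - a
    b3 = m[0, 0, 1] - a
    b4 = m[1, 1, 0] - m[1, 0, 0] - m[0, 1, 0] + a   # Gender x Grade contrast when Ethnic = 0
    b5 = m[1, 0, 1] - m[1, 0, 0] - m[0, 0, 1] + a   # Gender x Ethnic contrast when Grade = 0
    b6 = m[0, 1, 1] - m[0, 1, 0] - m[0, 0, 1] + a   # Grade x Ethnic contrast when Gender = 0
    ic_other = (m[1, 1, 1] - m[0, 1, 1]) - (m[1, 0, 1] - m[0, 0, 1])
    b7 = ic_other - b4                              # difference between the two interaction contrasts
    return [a, b1, b2, b3, b4, b5, b6, b7]


def predict(coefs, g, r, e):
    a, b1, b2, b3, b4, b5, b6, b7 = coefs
    return (a + b1 * g + b2 * r + b3 * e + b4 * g * r
            + b5 * g * e + b6 * r * e + b7 * g * r * e)


means = {
    (1, 1, 1): 6.0, (1, 0, 1): 6.0, (0, 1, 1): 5.0, (0, 0, 1): 4.0,  # European American
    (1, 1, 0): 6.0, (1, 0, 0): 6.0, (0, 1, 0): 6.0, (0, 0, 0): 6.0,  # Latinos
}
coefs = three_way_coefficients(means)
print(coefs)  # [6.0, 0.0, 0.0, -2.0, 0.0, 2.0, 1.0, -1.0]

for g, r, e in product((0, 1), repeat=3):
    assert predict(coefs, g, r, e) == means[g, r, e]
```

The intercept (6.0), the Gender coefficient (0), the Gender x Grade coefficient (0), and the three-way coefficient (-1) match the values shown on the slide; the remaining values are what the tabled means imply for b2, b3, b5, and b6.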
Modeling Non-Linear Interactions
Y = α + β1 X + β2 Z + ε
β1 = α’ + β3 Z + β4 Z²
Substitute right hand side for β1:
Y = α + (α’ + β3 Z + β4 Z²) X + β2 Z + ε
Expand:
Y = α + α’X + β3 XZ + β4 XZ² + β2 Z + ε
Modeling Non-Linear Interactions
Re-arrange terms:
Y = α + α’X + β2 Z + β3 XZ + β4 XZ² + ε
Re-label and you have your model:
Y = α + β1 X + β2 Z + β3 XZ + β4 XZ² + ε
Use centering strategy to isolate effect of X on Y (β1) at any given value of Z; also consider modeling intercept
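One way to see why the centering strategy works here is to write the final model with Z re-expressed around a chosen value. In this sketch, z_0 and Z_c = Z - z_0 are notation introduced for illustration, and the β's denote the coefficients of the re-centered model.

```latex
\begin{align*}
Y &= \alpha + \beta_1 X + \beta_2 Z_c + \beta_3 X Z_c + \beta_4 X Z_c^{2} + \varepsilon,
  \qquad Z_c = Z - z_0 \\
\frac{\partial Y}{\partial X} &= \beta_1 + \beta_3 Z_c + \beta_4 Z_c^{2} \\
\left.\frac{\partial Y}{\partial X}\right|_{Z = z_0} &= \beta_1
\end{align*}
```

So after subtracting z_0 from Z and refitting, the coefficient for X is the effect of X on Y at Z = z_0, and its standard error applies to that conditional effect.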
Exploratory Interaction Analysis
Y = Tenured or not (using MLPM)
X = Number of articles published
Z = Number of years since hired
Y = α + β1 X + ε
Use program in R
X COEFFICIENT AND M VALUES
N     M Value   X Slope
478   1.000     .000
475   2.000     .002
457   3.000     .007
408   4.000     .007
330   5.000     .009
246   6.000     .008
166   7.000     .005
115   8.000     .009
74    9.000     .011
48    10.000    .001
Regression Mixture Modeling
Mixture Regression
BI = α + β1 Aact + β2 PN + β3 PBC + ε
When we regress Y onto a set of predictors, we assume that people are drawn from a single population with common linear coefficients
But, in reality, we probably are mixing heterogeneous population segments with different coefficients characterizing the segments
With “mixed” populations, the overall regression analysis can characterize neither segment very well and lead to sub-optimal inferences and intervention strategies
Mixture Regression
Another Example of Aggregation Bias
Mixture Regression
[Diagram: Aact, SN, and PBC predicting Intention, with Latent Class X]
Mixture Model for Heavy Episodic Drinking
A four class model fits data best (entries are linear coefficients)
                   Aact   SN    DN    PBC
Segment 1 (42%):   .33    .02   .01   -.01
Segment 2 (17%):   .10    .29   .30    .01
Segment 3 (21%):   .30    .29   .05    .04
Segment 4 (20%):   .48    .09   .25   -.03
Interaction Analysis and Establishing Generalizability
Generalizability
It is common for people to conclude that an effect “generalizes” in the absence of a statistically significant interaction effect
Problem is that we can never accept the null hypothesis of a zero interaction contrast
Example with RCT of obesity treatment and gender
Generalizability
Solution: Adopt the framework of equivalence testing
Step 1: Specify a threshold value that will be used to define functional equivalence
Step 2: Specify the range of functional equivalence
Step 3: Calculate the 95% CI for the interaction contrast
Step 4: Determine if the CI is completely within the range of functional equivalence
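The four steps can be sketched in a few lines. Everything numeric below is hypothetical: the interaction contrasts, standard error, and threshold are made-up values, and the 95% CI uses a normal approximation (critical value 1.96), which is an assumption since the slides do not say how the interval is computed.

```python
def equivalence_check(contrast, std_error, threshold, z_crit=1.96):
    """Return the CI for an interaction contrast and whether it lies
    entirely inside the range of functional equivalence.

    Steps 1-2: the range of functional equivalence is (-threshold, +threshold).
    Step 3:    CI = contrast +/- z_crit * std_error (normal approximation).
    Step 4:    equivalence is concluded only if the whole CI is inside the range.
    """
    lower = contrast - z_crit * std_error
    upper = contrast + z_crit * std_error
    inside = (-threshold < lower) and (upper < threshold)
    return (lower, upper), inside


# Hypothetical example: treatment effect differs by 0.5 units between genders,
# SE = 1.0, and differences smaller than 3 units are deemed unimportant.
ci, equivalent = equivalence_check(contrast=0.5, std_error=1.0, threshold=3.0)
print(ci, equivalent)

# A larger contrast with the same precision fails the check.
ci_2, equivalent_2 = equivalence_check(contrast=2.0, std_error=1.0, threshold=3.0)
print(ci_2, equivalent_2)
```

Note that a non-significant interaction is not enough to pass: in the second example the interval just excludes zero at its lower end yet extends past the threshold, and a wide interval that includes zero would fail for the same reason.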
Measurement Error
It is well known that measurement error can bias parameter estimates in multiple regression. This holds with vigor for interaction analysis
One approach to dealing with measurement error in general is to use latent variable modeling
Measurement Error
[Diagram: latent variable Depression with indicators D1, D2, D3 and error terms e1, e2, e3]
Latent Variable Regression
[Path diagram with latent variables X, Z, Support, and Y; indicators X1–X3, Z1–Z3, Y1–Y2; product indicators X1Z1, X2Z2, X3Z3; error terms e1–e11; disturbance d3]
There are about half a dozen approaches to modeling latent variable interactions (e.g., quasi-maximum likelihood; Bayesian). I recommend the approach developed by Herbert Marsh as a good balance between utility and complexity, coupled with Huber-White sandwich estimators for robustness
Latent Variable Regression
Latent variable regression using multiple group analysis
[Path diagram: latent variables X, Z, and Y; indicators X1–X3, Z1–Z3, Y1–Y2; error terms e1–e8; disturbance d3]
Multi-Group Modeling in SEM
Assumption Violations
If assumptions of normality or variance homogeneity are suspect:
Use approaches with robust standard errors
Huber-White sandwich estimators
Bootstrapping
Be careful of outlier resistant robust methods
Rand Wilcox work with smoothers
Thank God It Has Ended!