Statistical Considerations in Identifying Mechanisms of Change
J. Scott Tonigan
The statistical search for mechanisms of change involves multiple inferential tests, ones that generally follow a fixed sequence designed to demonstrate mediation. While there are several popular approaches to conducting such tests, e.g., SEM and MRA, the inflated Type I error rate problem associated with conducting these tests has received little, if any, attention. This paper offers 2 solutions to avoid committing Type I errors associated with mediational tests. Most straightforward, investigators may choose to use a Bonferroni adjustment. In contrast, a design-based approach can be used that tests rival explanations for the observed effects. Examples drawn from addiction research are provided.
Key Words: Mediation, Type I Error, Alcohol Treatment.
Presentations at this Satellite Session on Mechanisms of Behavior Change have first highlighted different conceptual frameworks to investigate how changes in addictive behavior occur, followed by case studies that illustrate the relative advantages of particular statistical techniques in investigating treatment-specific and non-specific active ingredients. Interestingly, as a starting point all case studies presented in this workshop endorsed Baron and Kenny's (1986) 4 inferential tests to demonstrate full (as opposed to partial) mediation when searching for mechanisms producing behavior change: (1) efficacy test, (2) intervention test, (3) change/mediator test, and (4) delta efficacy test. Also interesting, none of the presentations addressed the implications of the multiple-comparison problem for their respective mediational analyses. This oversight of the inflated Type I error inherent in Baron and Kenny's (1986) approach is not unique to today's presentations. It is rare, indeed, to see any adjustment for inflated Type I error associated with tests of mediation in the peer-reviewed literature, and one cannot help but wonder how such adjustments may alter declarations of "significance."

To what extent is inflated Type I error a concern when conducting classical tests for mediation? We know, for example, that orthogonal contrasts have an inflated Type I error rate of $1 - (1 - \alpha)^{c}$, where c equals the number of contrasts. In the simple case of 4 inferential (and orthogonal) tests at α = 0.05, this formula informs us that our Type I error rate will equal about 0.19. Typically, then, there is about a 20% chance of falsely rejecting at least one "true" null hypothesis when applying Baron and Kenny's approach to mediation. It is important to stress that this Type I error rate estimation is conservative because mediational tests are not orthogonal. In fact, 3 of the 4 inferential tests include the intervention (A) term.

Two approaches for contending with inflated Type I error
associated with mediation analysis are presented, both of which have advantages and disadvantages. Most straightforward is simply partitioning alpha to account for multiple inferential tests (e.g., Harris, 1985; Maxwell and Delaney, 1989). Here, α = 0.05 may be divided by the number of planned tests, and the obtained inferential statistic can be compared with the critical value associated with the reduced α (e.g., 0.05/4 = 0.0125). When conducting Baron and Kenny's 4 mediation tests, for ease the obtained statistics can be compared with tabled critical values of α = 0.01, with rejection of the null hypothesis reported at 0.05.

Not widely known, the overall α of 0.05 can be maintained across tests without the need to allocate it equally among the planned inferential tests. Specifically, when 3 tests are planned it is not necessary to use 3 in the denominator and 0.05 as the numerator. One may, for example, desire a more liberal critical value for 1 test. Here, one can divide 0.05 by 2 and assign the more liberal critical value associated with 0.025 to the desired test. To maintain overall α = 0.05, however, the remaining 2 tests need to partition the remaining 0.025 (which also does not require equal weighting). Obviously, the cost of having one liberal critical value is paid by setting more stringent critical values for the other, less important, tests. While the mechanics of non-equal partitioning of α are relatively simple, the application of non-equal weighting requires theoretical justification or, at a minimum, a justifiable rationale.

As an illustration of this approach, let's consider Connors
et al.'s (2001) prediction that Alcoholics Anonymous (AA)-related benefit was explained, in part, by increased self-efficacy to remain abstinent. Applying Baron and Kenny's criteria, they concluded that self-efficacy was a likely change mechanism for outpatient clients in Project MATCH because: (1) AA attendance predicted later abstinence (β = 0.25, p < 0.001), (2) AA attendance predicted later self-efficacy (β = 0.14, p < 0.004), (3) self-efficacy predicted later abstinence (β = 0.39, p < 0.001), and (4) a 24% decrement in the intervention effect was observed with the mediator in the model, Δ = 0.06. All of these tests met the more stringent Bonferroni-adjusted alpha (α of 0.01).

From the Center on Alcoholism, Substance Abuse and Addictions, University of New Mexico, Department of Psychology, Albuquerque, New Mexico (JST).
Reprint requests: J. Scott Tonigan, PhD, 2350 Alamo S.E., Albuquerque, New Mexico 87131; E-mail: [email protected]
Conflict of Interest: The author states no conflict of interest.
Copyright © 2007 by the Research Society on Alcoholism.
DOI: 10.1111/j.1530-0277.2007.00494.x
Alcoholism: Clinical and Experimental Research, Vol. 31, No. S3, October 2007
Alcohol Clin Exp Res, Vol 31, No S3, 2007: pp 55S–56S

A design-based approach to contend with inflated Type I
error involves testing for rival explanations for observed mediation. Here, the focus is on how, if at all, variables not accounted for in the model may account for the "apparent" mediation. Obviously, there are an infinite number of rival explanations for an observed causal mechanism shown to produce an intended effect, many of which are not even measured. Astute advance planning, however, will lead to the inclusion of some likely rival candidates for producing intended outcomes, and exclusion of these rival mediators via statistical means adds substantial confidence that observed findings are not spurious Type I errors.

We recently applied the rival-variable approach to validate a mediation model. In particular, we reasoned that AA-related benefit could be explained by practicing of the AA program (12 steps). Figure 1 displays our predictions and observed path coefficients between AA attendance, abstinence, and a composite measure of completion of AA steps among Project MATCH clients in Albuquerque, NM, at 10-year follow-up. As shown, applying Baron and Kenny's criteria led us to conclude that AA-related benefit was produced, in part, by practicing of the 12 steps. We questioned, however, whether frequency of AA attendance mobilized AA step work, especially at long-term follow-up when many of the participants had already completed the 12 steps. We reasoned that AA step work may actually be explained by recent treatment experiences of study participants, many of whom would be encouraged to attend AA and work the 12 steps. Here, our investigation focused on the validity of the intervention test (2) within Baron and Kenny's criteria to establish mediation. Of course, the rival explanation approach can be applied to any path in the mediation model.
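The fourth (delta efficacy) criterion in Baron and Kenny's sequence reduces to simple arithmetic, which the sketch below illustrates. It is a minimal illustration rather than the analysis reported here: the function name `proportional_decrement` is hypothetical, and the Figure 1 example assumes that .22 is the attendance-to-abstinence coefficient without the mediator and that the parenthesized .15 is the same coefficient with AA step work in the model.

```python
def proportional_decrement(total_effect: float, adjusted_effect: float) -> float:
    """Share of the intervention effect that disappears once the mediator is modeled."""
    if total_effect == 0:
        raise ValueError("total_effect must be non-zero")
    return (total_effect - adjusted_effect) / total_effect


# Connors et al. (2001), as reported in the text: a coefficient of 0.25 and a drop of 0.06.
connors = proportional_decrement(0.25, 0.25 - 0.06)

# Figure 1, ASSUMING .22 is the unadjusted path and (.15) the adjusted path.
figure_1 = proportional_decrement(0.22, 0.15)

print(f"Connors et al.: {connors:.0%} decrement")
print(f"Figure 1 (assumed reading): {figure_1:.0%} decrement")
```

Under these assumptions the first value reproduces the 24% decrement quoted for Connors et al. (2001), and the Figure 1 coefficients would imply a decrement of roughly one third. The decrement is descriptive only; it does not by itself control the Type I error of the other three tests.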
Figure 2 shows that our rival predictions are partially supported. Specifically, frequency of formal therapy days in the 6-month period before the interview was significantly and positively related to frequency of AA meeting attendance during the same period (β = 0.35, p < 0.001). In contrast, frequency of therapy days did not predict the extent of AA step work, as we had predicted it would. While rival variables not included in Figure 2 may, in fact, account for increased AA step work, we can be reasonably confident that such increases are not an artifact of recent treatment experiences.

In sum, there is little evidence to suggest that inflated Type I error is a concern among researchers conducting mediation analyses, regardless of the specific path under consideration. This is unfortunate. Prescribed methods for testing putative causal models involve a sequence of statistical tests that increase the likelihood of falsely rejecting the null hypothesis. This paper briefly described 2 methods to control for, and to evaluate, the possibility of a Type I error. Alternative approaches to investigating mediation that use only a single statistical test are available (e.g., the difference in coefficients and product of coefficients approaches), but adoption of these techniques by researchers is slow. The cautionary nature of this presentation is not intended to curtail the conduct of mediational analyses; quite the opposite. Succinctly, we need to investigate mechanisms producing behavior change in ways that enhance the probability of later replications. Significant strides in this direction will be achieved with the reduction of spurious findings in the published literature.
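The arithmetic behind the familywise error rate and the two ways of partitioning alpha discussed in this paper can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a prescribed procedure: the function names are hypothetical, the tests are treated as independent when computing the familywise rate, and the reported bounds from Connors et al. (2001) (e.g., p < 0.001) stand in for exact p-values.

```python
from typing import List, Sequence


def familywise_error(alpha: float, n_tests: int) -> float:
    """Probability of at least one false rejection across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def partition_alpha(alpha: float, weights: Sequence[float]) -> List[float]:
    """Split an overall alpha across tests in proportion to the given weights.

    Equal weights give the usual Bonferroni cut-off; unequal weights give one
    test a more liberal cut-off at the expense of the others. The per-test
    values always sum to the overall alpha.
    """
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty sequence of positive numbers")
    total = sum(weights)
    return [alpha * w / total for w in weights]


def rejections(p_values: Sequence[float], cutoffs: Sequence[float]) -> List[bool]:
    """Flag each test whose p-value falls below its own cut-off."""
    return [p < c for p, c in zip(p_values, cutoffs)]


# Four tests at alpha = 0.05, treated as independent (an assumption).
rate_4_tests = familywise_error(0.05, 4)

# Equal partitioning across Baron and Kenny's 4 tests: 0.05 / 4 each.
equal_cutoffs = partition_alpha(0.05, [1, 1, 1, 1])

# Unequal partitioning across 3 tests: half of alpha to the favored test,
# the remaining half shared by the other two.
unequal_cutoffs = partition_alpha(0.05, [2, 1, 1])

# Reported p-value bounds for the three tests in Connors et al. (2001)
# that carry a p-value; the bounds are used in place of exact values.
connors_bounds = [0.001, 0.004, 0.001]
connors_pass = rejections(connors_bounds, equal_cutoffs[:3])

print(f"Familywise rate for 4 tests: {rate_4_tests:.3f}")
print("Equal cut-offs:", equal_cutoffs)
print("Unequal cut-offs:", unequal_cutoffs)
print("Connors et al. tests below cut-off:", connors_pass)
```

Expressing the allocation as weights keeps the overall alpha fixed by construction, so a more liberal cut-off for one test necessarily tightens the others. The choice of weights still needs the theoretical justification described above.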
REFERENCES
Baron RM, Kenny DA (1986) The moderator-mediator distinction in social psychological research: conceptual, strategic, and statistical considerations. J Pers Soc Psychol 51:1173–1182.
Connors GJ, Tonigan JS, Miller WR (2001) A longitudinal model of AA affiliation, participation, and outcome: retrospective study of the Project MATCH outpatient and aftercare samples. J Stud Alcohol 62:817–825.
Harris RJ (1985) A Primer of Multivariate Statistics, 2nd edn. Academic Press, Inc, New York.
Maxwell SE, Delaney HD (1989) Designing Experiments and Analyzing Data: A Model Comparison Perspective. Wadsworth Publishing Company, Belmont, CA.
Fig. 1. Stage 1 rival model. [Path diagram with three variables: AA Meetings Attended (Years 9-10), AA Steps Completed, and Percent Days Abstinent (End of Year 10). Coefficients shown: .22* (.15), .49*, .18*.]
Fig. 2. Stage 2 rival model. [Path diagram with four variables: Days Formal Therapy (Years 9-10), AA Meetings Attended (Years 9-10), AA Steps Completed, and Percent Days Abstinent (End of Year 10). Coefficients shown: .22* (.15), .48, .18*, .35, .01.]