Transcript
Page 1: The Statistics of Synergism

J Mol Cell Cardiol 30, 723–731 (1998)

Review

The Statistics of Synergism

Bryan K. Slinker

Department of Veterinary and Comparative Anatomy, Pharmacology, and Physiology, Washington State University, Pullman, WA, USA

(Received 1 December 1997, accepted in revised form 2 February 1998)

B. K. Slinker. The Statistics of Synergism. Journal of Molecular and Cellular Cardiology (1998) 30, 723–731. Biological scientists often want to determine whether two agents or events, for example, extracellular stimuli and/or intracellular signaling pathways, act synergistically when eliciting a biological response. When setting out to study whether two experimental treatments act synergistically, most biologists design the correct experiment: they administer four treatment combinations consisting of (1) the first treatment alone, (2) the second treatment alone, (3) both treatments together, and (4) neither treatment (i.e. the control). Many biologists are less clear about the correct statistical approach to determining whether the data collected in such an experimental design support a conclusion regarding synergism, or lack thereof. The non-additivity of two experimental treatments that is central to the definition of synergism leads to an algebraic formulation corresponding to the statistical null hypothesis appropriate for testing whether or not there is synergism. The resulting complex contrast among the four treatment group means is identical to the interaction effect tested in a two-way analysis of variance (ANOVA). This should not be surprising, because synergism, by definition, occurs when two treatments interact, rather than act independently, to influence a biological response. Hence, in the most readily implemented approach, the correct statistical analysis of a question of synergism is based on testing the interaction effect in a two-way ANOVA. This review presents the rationale for this correct approach to analysing data when the question is of synergism, and applies this approach to a recent published example. In addition, a common incorrect approach to analysing data with regard to synergism is presented. Finally, several associated statistical issues with regard to correctly implementing a two-way ANOVA are discussed.

© 1998 Academic Press Limited

Key Words: Analysis of variance; ANOVA.

Introduction

Synergism: interaction of discrete agencies . . ., agents (as drugs), or conditions such that the total effect is greater than the sum of individual effects (Merriam-Webster, 1996)

co-operative action of discrete agencies such that the total effect is greater than the sum of the effects taken independently (Merriam-Webster, 1981)

Biological scientists often want to determine whether two agents or events, for example extracellular stimuli (Boyadjieva and Sarkar, 1997) and/or intracellular signaling pathways (Boyadjieva and Sarkar, 1997; Yamazaki et al., 1997), act synergistically to produce a biological response. By synergism, most biologists mean something to the effect that “the whole is greater than the sum of the parts”, which is in accordance with the formal definitions stated above.

When setting out to study synergism, most biologists almost instinctively design the correct experiment: they administer four treatment combinations consisting of (1) the first treatment alone, (2) the second treatment alone, (3) both treatments together, and (4) neither treatment (i.e. the control). For example, Boyadjieva and Sarkar (1997) studied the interaction of cAMP-mediated cell signaling and ethanol in regulating β-endorphin secretion from hypothalamic cells in culture by studying four treatment groups: dibutyl-cAMP alone, ethanol alone, combined treatment with dibutyl-cAMP and ethanol, and no treatment (control). Similarly, when Yamazaki et al. (1997) wanted to study whether there was synergism of protein kinase A (PKA)-

Please address all correspondence to: Bryan K. Slinker, VCAPP Department, Washington State University, Pullman, WA 99164-6520, USA.

0022–2828/98/040723+09 $25.00/0 mc980655 1998 Academic Press Limited

Page 2: The Statistics of Synergism


and protein kinase C (PKC)-mediated cell signaling in the activation of the Raf-1/mitogen-activated protein (MAP) kinase cascade in cardiac myocytes, they studied four treatment groups: forskolin (FSK) alone to activate PKA-mediated signaling, 12-O-tetradecanoylphorbol-13-acetate (TPA) alone to activate PKC-mediated signaling, combined treatment with FSK and TPA, and no treatment (control).

The interpretative task is to analyse these data in such a way that any claim of synergism, or lack thereof, is substantiated by the observed responses to the four treatments. Typically, such substantiation depends on a statistical treatment of the data. Unfortunately, the instincts of many biologists fail them at this point and, having designed the appropriate experiment, they fail to match the experimental design with the correct statistical analysis of the data (unlike the examples cited above, which were chosen because they did perform the correct statistical analysis).

                              Factor A
                        Absent (−)   Present (+)
Factor B   Absent (−)    neither         A
           Present (+)      B           both

Figure 1 Schematic diagram of the typical 2×2 factorial experimental design employed to investigate questions of synergism. Each of the two factors, A and B, has two levels. That is, factor A is either present (+) or absent (−), and factor B is either absent (−) or present (+). It is the existence of two levels for each of two treatment factors that leads to the designation of this experimental design as a 2×2 factorial.
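As a small illustration of the 2×2 layout in Figure 1, the sketch below enumerates the four treatment groups from the two factor levels. The function name `group_label` and the use of booleans for absent/present are assumptions made here for illustration; they do not come from the article.

```python
from itertools import product


def group_label(a_present, b_present):
    """Name the treatment group for one cell of the 2x2 factorial design."""
    if a_present and b_present:
        return "both"
    if a_present:
        return "A"
    if b_present:
        return "B"
    return "neither"


# Each factor has two levels: absent (False) or present (True).
levels = (False, True)
design = {(a, b): group_label(a, b) for a, b in product(levels, levels)}

for (a, b), label in design.items():
    print(f"A {'(+)' if a else '(-)'}  B {'(+)' if b else '(-)'}  ->  {label}")
```

Two factors with two levels each give exactly four cells, which is why the design needs all four treatment groups, including the control.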

The meaning of synergism leads directly to the appropriate statistical analysis

For every experimental design there is, generally speaking, at least one correct statistical analysis of the data. The correct statistical analysis is determined by, among other things, the type of data, the number of treatments, the organization of treatment groups, the distribution of the data collected, and the biological question(s) that led to doing the experiment in the first place. Unfortunately, for each experimental design, there are also many incorrect statistical treatments of the data. In the case of synergism, the correct statistical approach is dictated by the meaning of synergism. To demonstrate this, let us define two generic experimental treatments, A and B.

The biological hypothesis is that treatments A and B synergistically influence a response. The full experimental design needed to test this hypothesis requires four treatment groups: (1) neither treatment (control), (2) treatment A alone, (3) treatment B alone, and (4) both treatments A and B. Four sample means, $\bar{X}_{\text{neither}}$, $\bar{X}_{A}$, $\bar{X}_{B}$, and $\bar{X}_{\text{both}}$, are computed from the data collected under each of these treatment conditions. These sample means provide estimates of the true, but unknown, population means, $\mu_{\text{neither}}$, $\mu_{A}$, $\mu_{B}$, and $\mu_{\text{both}}$, respectively (Fig. 1 shows this experimental design). The sample means and standard deviations are used to formulate the statistical hypothesis tests that will provide evidence to substantiate claims regarding whether or not the treatments have the expected effect.

Recall that the basis of statistical hypothesis testing is that each time we collect data, we are sampling a small number of all possible experimental units (e.g. rats, humans, cell culture wells, and so on). If we repeated the experiment on another sample, the measured response(s) would be different, simply because of random sampling variability. Given this random sampling variability, the statistical question is, are any observed biological response(s) due to the experimental manipulation(s), or are they more likely due to simple random sampling variation?

To answer this question, we re-state it as a statistical null hypothesis about the unknown population means (usually, that the treatment has no effect). By assuming certain characteristics of the data and experimental design, a statistic (such as t or F) can be formulated to incorporate the relevant information in the data. Then, the distribution of all the possible values that would be expected of this statistic if the null hypothesis is true is determined. Finally, the sample data are used to calculate a sample statistic, which is compared to the distribution of all the possible values of the statistic that are consistent with the null hypothesis. If the sample statistic lies in a predefined extreme region of the distribution expected when the null hypothesis is true, say the 5% most extreme values, the null hypothesis is “rejected”, because it is unlikely that the difference in sample means is due to simple random sampling variability and, therefore, we conclude that a statistically significant treatment

Page 3: The Statistics of Synergism


effect has been demonstrated (with P<0.05). Conversely, if the sample statistic is not within this extreme region, the null hypothesis is “accepted”, because the data are consistent with what would be expected if it is true (i.e. the observed differences are consistent with simple random sampling variation). Therefore, we conclude that there is not a statistically significant treatment effect.
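The logic of comparing a sample statistic with its null distribution can be sketched by simulation. The example below is an illustration only, under assumptions that are not taken from the article: two groups of eight observations drawn from the same normal population, a pooled-variance t statistic, and 10 000 simulated experiments. It estimates the cut-off beyond which the 5% most extreme values of |t| fall when the null hypothesis is true.

```python
import math
import random


def pooled_t(x, y):
    """Two-sample t statistic using a pooled variance estimate."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    pooled_var = ss / (nx + ny - 2)
    return (mx - my) / math.sqrt(pooled_var * (1 / nx + 1 / ny))


def simulate_null(n_per_group=8, n_experiments=10_000, seed=1):
    """Return sorted |t| values when both groups share one population."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_experiments):
        x = [rng.gauss(0, 1) for _ in range(n_per_group)]
        y = [rng.gauss(0, 1) for _ in range(n_per_group)]
        stats.append(abs(pooled_t(x, y)))
    return sorted(stats)


null_abs_t = simulate_null()
# Value exceeded by only the 5% most extreme simulated statistics.
cutoff = null_abs_t[int(0.95 * len(null_abs_t))]
print(f"simulated 5% cut-off for |t|: {cutoff:.2f}")
```

With 14 degrees of freedom, the tabulated two-tailed 5% critical value of t is approximately 2.14, so the simulated cut-off should land close to that value. A sample statistic beyond the cut-off falls in the rejection region described above.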

So, if the biological hypothesis is that treatments A and B synergistically influence a response, what is the corresponding statistical null hypothesis? To find out, we first re-state the biological expectation of synergism:

the biological response to both treatments A and B together will be greater than the sum of the response to treatment A alone and the response to treatment B alone.

Note that the focus is on the response of three of the treatment groups compared to the fourth, control, group (i.e. neither treatment A nor B). To derive the appropriate statistical null hypothesis, this narrative biological expectation needs to be mathematically re-stated in terms of the four treatment group means. Using the notation of the population means given above, this mathematical restatement is

$$(\mu_{\text{both}} - \mu_{\text{neither}}) > (\mu_{A} - \mu_{\text{neither}}) + (\mu_{B} - \mu_{\text{neither}}) \tag{1}$$

where $(\mu_{\text{both}} - \mu_{\text{neither}})$ is the response to both treatments A and B together and $(\mu_{A} - \mu_{\text{neither}}) + (\mu_{B} - \mu_{\text{neither}})$ is the sum of the response to treatment A alone and the response to treatment B alone. Equation (1) states what is expected if there is synergism. The opposite (what is expected when there is no synergism) is the statement of the statistical null hypothesis (denoted by $H_0$), and is given by

$$H_0: (\mu_{\text{both}} - \mu_{\text{neither}}) = (\mu_{A} - \mu_{\text{neither}}) + (\mu_{B} - \mu_{\text{neither}}) \tag{2}$$

This algebraic combination of means is illustrated graphically in Figure 2.

In statistical terminology, equation (2) is a complex contrast among the four treatment group means. To convert equation (2) to the usual form of a null hypothesis for a contrast, it is rearranged to

$$H_0: \mu_{\text{both}} - \mu_{A} - \mu_{B} + \mu_{\text{neither}} = 0 \tag{3}$$

Equations (2) and (3) re-state the biological question of synergism as the statistical null hypothesis that is used as the basis for analysing the experimental data. The procedure for computing an F statistic for the complex contrast in equation (3) can be found in many advanced statistics texts [for example, Maxwell and Delaney (1990)], and is implemented in some of the larger statistical computing packages.

There is, however, a more accessible way to approach the statistical computation. The formal definitions of synergism that were stated above emphasize that synergism is recognized when two agents interact or, equivalently, when their combined effect is manifest in a non-additive manner. In the statistical analysis of data collected in experimental designs using two treatment factors, the biological concept of non-additive effects has a direct counterpart in the statistical interaction effect that may be included in a two-way analysis of variance (ANOVA). In fact, it can be shown that the usual F statistic computed for the interaction effect in a two-way ANOVA is equivalent to testing the null hypothesis stated by the contrast in equation (3) (Maxwell and Delaney, 1990).¹

[Figure 2: plot of population means against factor A, absent (−) or present (+), with separate lines for factor B absent (−) and present (+); the marked vertical distances are A − neither, B − neither, and both − neither.]

Figure 2 The relationship of population mean values for a hypothetical experimental design, using generic treatment factors A and B (as diagrammed in Fig. 1) to the interaction contrast defined by equations (1)–(3). If there is synergism, the quantity $(\mu_{\text{both}} - \mu_{\text{neither}})$ will be significantly larger than the sum of the quantities $(\mu_{A} - \mu_{\text{neither}})$ and $(\mu_{B} - \mu_{\text{neither}})$, as stated in equation (1), and the lines connecting the means will diverge as shown by the two solid lines (i.e. the solid lines are not parallel). On the other hand, if there is no interaction, $(\mu_{\text{both}} - \mu_{\text{neither}})$ will equal the sum of $(\mu_{A} - \mu_{\text{neither}})$ and $(\mu_{B} - \mu_{\text{neither}})$, as stated in equation (2), and the lines connecting the means would be parallel, as shown by the dashed line and lower solid line.

¹ This development ignores the distinction between one- and two-tailed statistical hypothesis tests. Strictly speaking, the definition of synergism leads to a one-tailed hypothesis. However, I present the issue in terms of a two-tailed hypothesis test that, statistically speaking, makes no distinction between synergism and competition, i.e. the treatments are either independent or they interact (in which case synergism would be concluded if the combined response mean was greater than the sum of the means of individual treatments). The issue of one- v two-tailed hypotheses is ignored because it needlessly complicates this discussion. Furthermore, most investigators will want to use a two-tailed approach because if they fail to show synergism, they will want to conclude that the effects are independent.
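To make the contrast in equation (3) concrete, the following sketch computes the estimated interaction contrast and its F statistic for a balanced design with equal group sizes. The group means, error mean square, and group size are hypothetical values chosen only so the arithmetic is easy to follow, and the function names are illustrative.

```python
def interaction_contrast(means):
    """Estimate of the left-hand side of the null hypothesis in equation (3)."""
    return means["both"] - means["A"] - means["B"] + means["neither"]


def contrast_f(means, ms_error, n_per_group):
    """F statistic (1 numerator df) for the interaction contrast.

    Every contrast coefficient is +1 or -1, so with equal group sizes the
    variance of the estimated contrast is ms_error * 4 / n_per_group.
    """
    estimate = interaction_contrast(means)
    return estimate ** 2 / (ms_error * 4 / n_per_group)


# Hypothetical values (not data from any study cited in the text).
means = {"neither": 1.0, "A": 3.0, "B": 4.0, "both": 9.0}
ms_error = 2.0     # pooled within-group variance (assumed)
n_per_group = 4    # observations per treatment group (assumed)

estimate = interaction_contrast(means)
f_stat = contrast_f(means, ms_error, n_per_group)
df_error = 4 * (n_per_group - 1)
print(f"contrast = {estimate}, F(1, {df_error}) = {f_stat}")
```

In this made-up case the combined response exceeds control by 8, while the two separate responses sum to 5, so the contrast is 3. Whether that departure from additivity is statistically significant depends on how large the resulting F is relative to the random variation captured by the error mean square.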

Page 4: The Statistics of Synergism


[Figure 3: two panels, each plotting the response against FSK, absent (−) or present (+), with separate lines for TPA absent (−) and present (+). Panel (a): MAP kinase activity (axis 0 to 14). Panel (b): [3H]-phenylalanine incorporation (axis 0 to 250).]

Figure 3 Response of MAP kinase activity (a) and [3H]-phenylalanine incorporation (used as a measure of protein synthesis) (b) in neonatal rat cardiac myocytes to the presence and absence of 10⁻⁵ mol/l FSK and 10⁻⁹ mol/l TPA. The data are mean±... for a sample size of four in each group. These data correspond to the data shown in Figures 4 and 8 of Yamazaki et al. (1997), with one modification: the data as reported by Yamazaki et al. for the control group (i.e. neither FSK nor TPA) MAP kinase activity were assigned a value of 1.0, so the mean is reported as 1.0, without retaining the variability in the data (i.e. the ... is implicitly assigned a value of zero). Because this artificially underestimates the true population variability, the data as shown in Figure 3(a) have assumed a ... comparable to that of the other groups. Presenting the data from the four treatment groups as shown in this figure emphasizes the 2×2 factorial nature of the experimental design, and also emphasizes the presence of an interaction as shown by the non-parallel lines (if there was no interaction, the lines would be parallel). This presentation format is, accordingly, more useful than the format shown in Figure 4. Before it can be definitively concluded that there is interaction, however, a two-way ANOVA must be performed to see whether the visual appearance of interaction is large enough relative to the random variation in the data to allow rejecting the null hypothesis of no interaction [see Table 1(a) and (b) for the results of the corresponding statistical analysis of the data shown in panels (a) and (b)].
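A minimal sketch of the two-way ANOVA computation for a balanced, completely randomized 2×2 design is given below. The observations are hypothetical (they are not the Yamazaki et al. data), and the function and variable names are assumptions made for illustration. The sketch computes the sums of squares for factor A, factor B, the interaction, and error, and then the three F ratios.

```python
def mean(values):
    return sum(values) / len(values)


def two_way_anova_2x2(cells):
    """Sums of squares and F ratios for a balanced 2x2 design.

    `cells` maps (a_present, b_present) to a list of observations, with
    the same number of observations in every cell.
    """
    n = len(cells[(False, False)])
    cell_mean = {key: mean(obs) for key, obs in cells.items()}
    grand = mean(list(cell_mean.values()))

    # Marginal means pool the two cells that share a level of one factor.
    a_mean = {a: mean([cell_mean[(a, b)] for b in (False, True)]) for a in (False, True)}
    b_mean = {b: mean([cell_mean[(a, b)] for a in (False, True)]) for b in (False, True)}

    ss_a = 2 * n * sum((m - grand) ** 2 for m in a_mean.values())
    ss_b = 2 * n * sum((m - grand) ** 2 for m in b_mean.values())
    ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = sum((x - cell_mean[key]) ** 2 for key, obs in cells.items() for x in obs)

    df_error = 4 * (n - 1)
    ms_error = ss_error / df_error
    return {
        "SS": {"A": ss_a, "B": ss_b, "AB": ss_ab, "error": ss_error},
        "F": {"A": ss_a / ms_error, "B": ss_b / ms_error, "AB": ss_ab / ms_error},
        "df_error": df_error,
    }


# Hypothetical observations, four per treatment group.
cells = {
    (False, False): [1.0, 2.0, 3.0, 2.0],    # neither
    (True, False): [3.0, 4.0, 5.0, 4.0],     # A alone
    (False, True): [4.0, 5.0, 6.0, 5.0],     # B alone
    (True, True): [9.0, 10.0, 11.0, 10.0],   # both
}
result = two_way_anova_2x2(cells)
for term in ("A", "B", "AB"):
    print(f"{term}: SS = {result['SS'][term]:.2f}, F = {result['F'][term]:.2f}")
```

For these made-up numbers the interaction contrast among the cell means is 10 − 4 − 5 + 2 = 3, and the interaction sum of squares equals n × 3² / 4 = 9, which illustrates that the interaction term and the contrast carry the same information in a balanced design.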

Doing it right v doing it wrong, an example

Synergism the right way

Yamazaki et al. (1997) wanted to determine in cardiac myocytes whether there was synergism in the actions of PKA- and PKC-mediated activation of the Raf-1/mitogen-activated protein (MAP) kinase pathway. To stimulate PKA-mediated signaling, they exposed neonatal rat cardiac myocytes in culture to 10⁻⁵ mol/l FSK. To stimulate PKC-mediated signaling, they exposed neonatal rat cardiac myocytes to 10⁻⁹ mol/l TPA. In addition, they exposed neonatal rat cardiac myocytes to both FSK and TPA together. Finally, as a control, a group of neonatal rat cardiac myocytes was not exposed to either FSK or TPA. They measured three biological responses: MAP kinase activity, Raf-1 activity, and protein synthesis ([3H]-phenylalanine incorporation). Using the reported means and standard deviations in their Figures 4 (MAP kinase activity) and 8 ([3H]-phenylalanine incorporation), I simulated their data (thus, my results closely parallel theirs, but the reader should keep in mind that I am not working with their exact data set). These data are plotted in Figures 3(a) and 3(b), respectively.

The results of performing a two-way ANOVA on the data shown in Figures 3(a) and 3(b) are shown in Table 1(a) and (b), respectively. For MAP kinase activity, the F statistic for the interaction effect is 4.79, with 1 and 12 degrees of freedom [Table 1(a)]. The critical value of F for α=0.05 and 1 and 12 degrees of freedom is 4.75. The computed value of 4.79 exceeds this critical value and so it is concluded that there is a statistically significant positive interaction effect [P<0.05; the exact P=0.049, as reported in Table 1(a)]. The biological interpretation of this statistical result and the positive interaction shown in the pattern of means in Figure 3(a) is that FSK and TPA acted synergistically to activate MAP kinase. (These authors reached a similar conclusion when evaluating Raf-1 activity, a result I am not showing.)

For protein synthesis, as measured by [3H]-phenylalanine incorporation, the interaction F statistic is 2.04 with 1 and 12 degrees of freedom [Table 1(b)]. The P value associated with the computed F is 0.179 [Table 1(b)], and so it is concluded that there is not a statistically significant interaction effect (P=0.179). The biological interpretation of this statistical result is that FSK and TPA acted independently (i.e. there was neither synergism nor competition) to affect protein synthesis ([3H]-phenylalanine incorporation). At this point, we can conclude that the effects of FSK and TPA on protein synthesis, if any, are independent. To see if either

Page 5: The Statistics of Synergism


Table 1 Computer printout of the two-way ANOVA results of the MAP kinase activity (a) and [3H]-phenylalanine incorporation (b) responses to the four combinations of FSK and TPA treatments studied by Yamazaki et al. (1997). The plots of the data corresponding to each analysis are shown in the corresponding panels of Figure 3.

(a) Analysis of variance table for MAP kinase activity

Source term        DF   Sum of squares   Mean square   F-ratio   Prob. level   Power (α=0.05)
A (FSK)             1   97.44651         97.44651      25.85     0.000269∗     0.991548
B (TPA)             1   126.1387         126.1387      33.46     0.000087∗     0.998564
AB                  1   18.07525         18.07525       4.79     0.049044∗     0.469694
S                  12   45.24201         3.770167
Total (adjusted)   15   286.9025
Total              16

∗ Term significant at α=0.05.

(b) Analysis of variance table for [3H]-phenylalanine incorporation

Source term        DF   Sum of squares   Mean square   F-ratio   Prob. level   Power (α=0.05)
A (FSK)             1   8965.438         8965.438      10.61     0.006862∗     0.798543
B (TPA)             1   12508.07         12508.07      14.80     0.002321∗     0.909736
AB                  1   1724.616         1724.616       2.04     0.178616      0.233707
S                  12   10139.74         844.9785
Total (adjusted)   15   33337.87
Total              16

∗ Term significant at α=0.05.
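The F-ratios and probability levels in Table 1(a) can be re-derived from the printed sums of squares. The sketch below is one way to do this using only the Python standard library; it relies on the fact that an F statistic with 1 numerator degree of freedom is the square of a t statistic, and it integrates the t density numerically with Simpson's rule. The helper names and the number of integration steps are assumptions made for illustration.

```python
import math


def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + t * t / df) ** (-(df + 1) / 2)


def f_upper_tail_1df(f_stat, df_error, steps=2000):
    """P(F > f_stat) for 1 and df_error degrees of freedom.

    Uses F(1, v) = t(v)**2 and Simpson's rule (steps must be even) to
    integrate the t density from 0 to sqrt(f_stat).
    """
    upper = math.sqrt(f_stat)
    h = upper / steps
    total = t_density(0.0, df_error) + t_density(upper, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df_error)
    central_area = 2 * total * h / 3   # P(-sqrt(F) < t < sqrt(F))
    return 1 - central_area


# Values copied from Table 1(a).
sums_of_squares = {"A (FSK)": 97.44651, "B (TPA)": 126.1387, "AB": 18.07525}
ss_error, df_error = 45.24201, 12
ms_error = ss_error / df_error

results = {}
for term, ss in sums_of_squares.items():
    f_ratio = ss / ms_error   # each effect has 1 df, so its mean square equals its SS
    results[term] = (f_ratio, f_upper_tail_1df(f_ratio, df_error))
    print(f"{term}: F = {results[term][0]:.2f}, P = {results[term][1]:.6f}")
```

Each F-ratio is simply the effect mean square divided by the error mean square (row S), which is why all three tests share the same 12 error degrees of freedom.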

has a significant independent effect, we then proceed to evaluate each main effect in the two-way ANOVA. Both main effects are statistically significant [Table 1(b)]: the F statistic for the FSK main effect is 10.61 (P=0.007), and the F statistic for the TPA main effect is 14.80 (P=0.002). The biological interpretation of the results of these statistical tests, when taken together, is that FSK and TPA do not act synergistically, but rather that each has an independent effect to significantly increase protein synthesis.

Synergism the wrong way

Although these authors (Yamazaki et al., 1997) did the analysis correctly, it is important in this discussion to examine a common way that such data are analysed incorrectly. Figures 4(a) and 4(b) show the data in Figures 3(a) and 3(b), respectively, replotted in the bar graph format that is used typically to display such data. The most common incorrect analyses of these experimental data would be either: (1) to perform a one-way ANOVA, ignoring the factorial structure of the experimental design, and consider the four treatment groups to be levels of a single treatment factor, then follow the ANOVA by multiple pair-wise comparisons; or (2) to forego a one-way ANOVA and simply proceed directly to do multiple pair-wise comparisons among the four treatment group means.

If MAP kinase activity and [3H]-phenylalanine incorporation are each analysed using one-way ANOVA, followed by multiple comparisons using Tukey’s procedure (Glantz, 1997), identical conclusions are reached for both responses: the F statistics for both MAP kinase activity (F=21.37; P<0.005) and [3H]-phenylalanine incorporation (F=9.15; P=0.002) are significant, and Tukey’s procedure for multiple comparisons (with an experiment-wise error rate of 0.05) leads to the identification of the statistically significant pair-wise differences shown in Figures 4(a) and 4(b). From these patterns of significant pair-wise differences, which are identical for both MAP kinase activity and [3H]-phenylalanine incorporation, it is not possible to make any judgements about the presence or absence of synergism between FSK and TPA, which, as we have seen from the correct analysis, is present for MAP kinase activity but not for protein synthesis. Faced with these responses, however, some would be tempted to construct an argument roughly as follows: neither FSK nor TPA alone had

Page 6: The Statistics of Synergism


[Figure 4: two bar-graph panels with the treatment groups None, FSK, TPA, and FSK + TPA on the horizontal axis. Panel (a): MAP kinase activity (axis 0 to 14). Panel (b): [3H]-phenylalanine incorporation (axis 0 to 250). In both panels the FSK + TPA bar is marked ∗ and the FSK and TPA bars are marked †.]

Figure 4 Response of MAP kinase activity (a) and [3H]-phenylalanine incorporation as a measure of protein synthesis (b) in neonatal rat cardiac myocytes to the presence and absence of 10⁻⁵ mol/l FSK and 10⁻⁹ mol/l TPA. The data are mean±... for a sample size of four in each group. These data are the same data as shown in Figures 3(a) and 3(b), respectively, but have been re-plotted to emphasize an incorrect analysis of the data using a one-way ANOVA followed by multiple comparisons using Tukey’s method. ∗ P<0.05 compared to control group (i.e. neither FSK nor TPA); † P<0.05 compared to FSK+TPA group. Notice that (1) this approach yielded identical interpretations for both MAP kinase activity and [3H]-phenylalanine incorporation, (2) we cannot tell from the pattern of significant pair-wise differences whether or not there is synergism (i.e. there are many non-synergistic responses that would yield this same pattern), and (3) for the [3H]-phenylalanine incorporation, for example, neither FSK nor TPA treatment alone is significantly different than control, implying that neither has a significant independent effect. This is in sharp contrast to the results of the 2×2 factorial ANOVA, which provides strong evidence that both FSK and TPA have significant independent effects to increase protein synthesis.
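The one-way ANOVA F statistics quoted in the text (F=21.37 and F=9.15) can be recovered from the two-way results in Table 1. A sketch, assuming a balanced design: in that case the between-groups sum of squares of a one-way ANOVA on the four groups equals the sum of the A, B, and AB sums of squares from the two-way ANOVA, with 3 degrees of freedom, while the error term is unchanged. The dictionary layout and function name are illustrative.

```python
# Sums of squares copied from Table 1; the error rows have 12 degrees of freedom.
table_1 = {
    "MAP kinase activity": {
        "A": 97.44651, "B": 126.1387, "AB": 18.07525, "error": 45.24201,
    },
    "[3H]-phenylalanine incorporation": {
        "A": 8965.438, "B": 12508.07, "AB": 1724.616, "error": 10139.74,
    },
}


def one_way_f(ss, df_error=12):
    """One-way ANOVA F for four groups, rebuilt from two-way sums of squares."""
    ss_between = ss["A"] + ss["B"] + ss["AB"]   # valid for a balanced design
    ms_between = ss_between / 3                 # 4 groups - 1
    ms_error = ss["error"] / df_error
    return ms_between / ms_error


one_way = {response: one_way_f(ss) for response, ss in table_1.items()}
for response, f_value in one_way.items():
    print(f"{response}: one-way F(3, 12) = {f_value:.2f}")
```

The one-way analysis lumps the three separate 1-degree-of-freedom questions (A, B, and their interaction) into a single 3-degree-of-freedom test, which is why a significant one-way F cannot by itself say anything about synergism.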

a significant effect, when compared to control; but treatment with both FSK and TPA did have a significant effect, which was also significantly different than FSK treatment alone and TPA treatment alone. Therefore, there is synergism. If this line of argument were pursued fully, it would be concluded that FSK and TPA acted synergistically to activate MAP kinase and to stimulate protein synthesis. The latter conclusion, as we have seen from the correct analysis using two-way ANOVA, is wrong.

There is yet another problem with the multiple pair-wise comparisons approach. Using a multiple comparisons procedure, in this case Tukey’s test, led to the conclusion that there was no significant effect of either FSK alone or TPA alone for both MAP kinase activity and [3H]-phenylalanine incorporation. Contrast this latter interpretation with the 2×2 factorial ANOVA result shown in Table 1(b). Because there was no significant interaction, it was concluded that the effects of FSK and TPA were independent (i.e. that there was neither synergism nor competition). In the absence of a significant interaction, each of the two main effects in the ANOVA was examined, one for the independent effect of FSK and the other for the independent effect of TPA. As we saw previously, the main effect of FSK treatment and the main effect of TPA treatment are both statistically significant. Thus, based on the two-way ANOVA result, it would be concluded that, although there was no synergism between FSK and TPA, each treatment exerted a statistically significant independent effect. This contrasts sharply with the multiple pair-wise comparison result [see Fig. 4(b)], which indicated that the [3H]-phenylalanine incorporation responses to both FSK alone and TPA alone were not significantly different from control, which suggests, incorrectly, that neither treatment alone had an effect.

There are two reasons for this difference in conclusion using multiple pair-wise comparisons v using two-way ANOVA. First, correctly conducting a set of multiple pair-wise comparisons causes the loss of statistical power because each individual pair-wise comparison is tested at a significance level less than 0.05 in order to keep the total error rate for the collection of pair-wise tests at no more than 0.05. In contrast, there is no need to do multiple pair-wise comparisons if the two-way ANOVA is used. Second, in the absence of a significant interaction effect, each main effect in the two-way ANOVA, for example, for FSK, is equivalent conceptually to conducting a single t test to compare the mean of the two FSK (−) groups (i.e. pooling all data for both TPA treatment conditions) with the mean of the two FSK (+) groups (again, pooling all data for both TPA treatment conditions). The main effect for TPA has an analogous interpretation. Thus, by correctly taking into account the factorial structure of the experimental design by using two-way ANOVA, the sample size is effectively doubled when assessing the independent effects of FSK alone and TPA alone. Both the avoidance of multiple

Page 7: The Statistics of Synergism


comparisons and the effective doubling of sample size increase the statistical power of the two-way ANOVA compared to using one-way ANOVA and/or multiple comparisons. The two-way ANOVA approach, thus, has a decided advantage. Accordingly, even if synergism is not the central experimental question, there is still an advantage to carrying through any factorial structure from the experimental design into the statistical analysis of the data.

Additional considerations

It is beyond the scope of this brief review to examine the full range of additional statistical considerations that must be kept in mind when analysing experimental data using two-way ANOVA. However, there are three important issues that I will mention briefly.

There is more than one type of two-way ANOVA

I have drawn a correspondence between the interaction implied by biological synergism and the statistical interaction effect in a two-way ANOVA, and have considered an example of a 2×2 factorial experimental design to illustrate the correct procedure. In the example from Yamazaki et al. (1997), each of the four treatment conditions was studied in a separate group of cell cultures, such that any one cell culture dish received only one of the four experimental treatments. Accordingly, the data were analysed using a so-called completely randomized ANOVA. However, there are 2×2 factorial experimental designs in which each subject (where subject could be a patient, animal, organ, culture dish, etc.) receives more than one treatment. For example, in theory at least, it would have been possible for Yamazaki et al. to have subjected the same cell culture dishes to all four experimental treatments, so that each subject (in this case cell culture dish) received all of the treatments. If this were done, both treatment factors would be said to be repeated measures factors. Similarly, it would be possible to design an experiment in which only one of the two treatment factors is repeated measures, but the other is not, leading to what is sometimes called a split-plot design. Although assessing synergism still requires evaluating the interaction effect in a 2×2 factorial design, as was illustrated in the example above, the presence of repeated measures factors requires a different specification of the mathematical model underlying the ANOVA. This, in turn, changes the computation of the F statistics in the ANOVA. It is beyond the scope of this brief review to examine the details of implementing the 2×2 ANOVA when one or more of the factors involves repeated measures, and the reader should consult a statistician and/or advanced statistics text for more information [see, for example, Glantz and Slinker (1990) or Maxwell and Delaney (1990)]. It is important to note, however, that the complicating issue of repeated measures does not change the interpretation of the statistical interaction effect as it relates to the biological hypothesis of synergism, as was presented above; it only changes the computation of the F statistics on which the interpretation is based.

Evaluating the assumptions underlying ANOVA

The different types of ANOVA assume certain things about the nature of the experiment and the resulting data. The so-called completely randomized ANOVA (i.e. no repeated measures) that has been illustrated in this review assumes that the sample data are drawn randomly from normally distributed populations, and that there is equality of variances in these populations. It is beyond the scope of this review to delve into these assumptions, which are treated in many statistical texts [see, for example, Madansky (1988), Glantz and Slinker (1990), and Maxwell and Delaney (1990)]. However, as with any statistical analysis, the worth of a statistical treatment of experimental data diminishes considerably if either of these assumptions is violated and no corrective action is taken.

For experimental designs in which one or both of the treatment factors is a so-called repeated measures factor, the assumption of equality of variances that applies to a completely randomized design does not apply in the same way. Rather, for each treatment involving repeated measures, this assumption is replaced with a more complex assumption of the equality of variances of the differences between levels of the treatment and equality of correlation among levels of the treatment. This more complex assumption is often referred to as compound symmetry, and is beyond the scope of this brief review [see, for example, Maxwell and Delaney (1990) for a thorough discussion of this assumption and associated corrective measures]. Fortunately, compound symmetry is guaranteed if there are only two levels of the treatment factor. Thus, even if one or both of the treatments in the 2×2 factorial experimental design discussed in this review is a repeated measures

Page 8: The Statistics of Synergism


treatment, the issue of compound symmetry is moot.

Is the sample size large enough to give credence to a claim that there is no synergism?

An often ignored aspect of the statistical analysis of experimental data is statistical power and its converse, the probability that there has been an error when concluding that the null hypothesis could not be rejected when, in fact, the treatment did have an effect. For a two-way ANOVA, statistical power must be treated separately for each of the two main treatment effects and for the interaction (i.e. power must be considered separately for each F statistic). For a discussion of synergism, the focus is obviously on the power of the interaction effect. Once again, however, it is beyond the scope of this brief review to treat the issue of statistical power fully, and the reader is referred to the introductory explanation of statistical power in Glantz (1997) and to more advanced treatments in Zar (1984) and Cohen (1988), which address the details of how to implement statistical power analysis in a factorial ANOVA.

Because it relates to the probability of falsely accepting the null hypothesis, statistical power becomes an issue only when the statistical analysis of the data suggests that there is no statistically significant effect. If the statistical power is low, there can be little confidence in such a “negative result”. Of the many factors that must be considered when evaluating statistical power, an insufficient sample size is the only easily remedied problem for failure to achieve adequate statistical power in a given experiment. Accordingly, statistical power considerations are most useful prospectively, during experimental planning, at which time the required sample size necessary to satisfy certain assumptions about the expected magnitude and variability in the outcome variable(s) can be estimated. However, any time a “negative result” is obtained based on a statistical hypothesis test, the statistical power of the test, as performed on that set of data, can be estimated. This retrospective consideration of the statistical power of a test as performed on the data is an important part of the discussion of negative results, because it quantifies the probability of falsely accepting the null hypothesis and, accordingly, provides quantitative insight into the strength of the evidence supporting the resulting conclusion(s).

To make this more concrete, let us again consider the results from analysing the data of Yamazaki et al. (1997) shown in Figure 3 (data plots) and Table 1 (statistical printouts). For MAP kinase activity, the results of the two-way ANOVA [Table 1(a)] led to the conclusion that there was a statistically significant interaction effect (P=0.049), and thus it was concluded that PKA- and PKC-mediated signaling synergistically activated MAP kinase. Because this is a “positive result”, a retrospective analysis of the power of the F test for the interaction effect in these data is moot. In contrast, for [³H]-phenylalanine incorporation, the result of the two-way ANOVA [Table 1(b)] led to the conclusion that there was not a statistically significant interaction effect (P=0.179). Thus, it was concluded that, although PKA- and PKC-mediated signaling synergistically activated MAP kinase activity [as well as Raf-1 activity, data not shown here, see Yamazaki et al. (1997)], these synergies in activation of signaling intermediates were not expressed “downstream” in increased protein synthesis. However, the plots of MAP kinase activity [Fig. 3(a)] and [³H]-phenylalanine incorporation [Fig. 3(b)] look similar: both suggest an interaction effect because the lines are not parallel. However, the lack of parallelism is not as marked for the data in Figure 3(b) as it is for the data in Figure 3(a). Relative to the random variation in the sample data, the interaction effect suggested in Figure 3(b) is not large, and so the null hypothesis of no interaction cannot be rejected and it must be concluded, therefore, that there is no synergism.

But how confident can we be of this negative conclusion? To answer this question, we must compute the statistical power of the test for interaction in these data. The probability of falsely accepting the null hypothesis, the so-called Type II error in statistics, is 1−power. The power for the interaction effect for the data shown in Figure 3(b) is estimated to be 0.234 [Table 1(b)]. Hence, the probability of falsely accepting the null hypothesis is 1−0.234=0.766, meaning that there is a 76.6% chance that we have falsely accepted the null hypothesis, given the observed magnitude of interaction and the observed sample standard deviations.

Because each measured response variable will have a different amount of random variation, and because each may respond to the experimental treatments with a different magnitude of effect, no one sample size can necessarily accommodate all of the desired experimental response variables. Hence, the magnitude of PKA- and PKC-mediated change in MAP kinase activity and the random variation in MAP kinase activity from sample to sample are accommodated adequately by the sample size of four used in this experiment. However, the relatively smaller magnitude of effect on [³H]-phenylalanine
incorporation coupled with, perhaps, a larger random sampling variability, led to inadequate power for this response variable with a sample size of four, assuming that an investigator would consider the observed magnitude of increased protein synthesis to be biologically important.

Acknowledgements

Supported by a grant-in-aid from the Washington Affiliate of the American Heart Association (96-077290).

References

B NI, S DK, 1997. The role of cAMP in ethanol-regulated β-endorphin release from hypothalamic neurons. Alcohol Clin Exp Res 21: 728–731.

Cohen J, 1988. Statistical Power Analysis for the Behavioral Sciences, 2nd edn. Hillsdale, NJ: Lawrence Erlbaum Assoc.

Glantz SA, 1997. Primer of Biostatistics, 4th edn. New York: McGraw-Hill.

Glantz SA, Slinker BK, 1990. Primer of Applied Regression and Analysis of Variance. New York: McGraw-Hill.

M A, 1988. Prescriptions for Working Statisticians. New York: Springer-Verlag.

Maxwell SE, Delaney HD, 1990. Designing Experiments and Analyzing Data: A Model Comparison Perspective. Belmont, CA: Wadsworth Publishing Co.

Merriam-Webster, 1981. Webster’s New Collegiate Dictionary. Springfield, MA: G. and C. Merriam Co.

Merriam-Webster, 1996. Merriam-Webster’s Collegiate Dictionary, 10th edn. Springfield, MA: G. and C. Merriam Co.

Yamazaki T, K I, Z Y, K S, M T, H Y, S I, T H, K K-, K O, T T, Y Y, 1997. Protein kinase A and protein kinase C synergistically activate the Raf-1 kinase/mitogen-activated protein kinase cascade in neonatal rat cardiomyocytes. J Mol Cell Cardiol 29: 2491–2501.

Zar JH, 1984. Biostatistical Analysis, 2nd edn. Englewood Cliffs, NJ: Prentice-Hall, Inc.
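
Illustrative sketch: power of the interaction test

The relation between statistical power and the Type II error discussed in the section on sample size can be illustrated with a short simulation. The Python sketch below is only an illustration under stated assumptions, and it is not the calculation behind the power value reported in Table 1(b). It uses a Monte Carlo estimate in place of an analytical power calculation. The function names (`interaction_f`, `simulated_power`), the cell means, and the standard deviation are all hypothetical and are not taken from the data of Yamazaki et al. (1997). The critical value 4.747 is the upper 5% point of F with 1 and 12 degrees of freedom, so the sketch applies only to a balanced 2×2 design with four observations per treatment combination.

```python
import random
import statistics

N_PER_CELL = 4       # observations per treatment combination (assumed, as in the example)
F_CRIT_1_12 = 4.747  # upper 5% point of F(1, 12); valid only for a 2x2 design with n = 4


def interaction_f(cells):
    """Return the interaction F statistic for a balanced 2x2 design.

    `cells` maps (a, b), with a, b in {0, 1}, to equally sized lists of observations.
    """
    n = len(cells[(0, 0)])
    means = {key: statistics.fmean(values) for key, values in cells.items()}
    # Interaction contrast: (both - A alone) - (B alone - control).
    contrast = means[(1, 1)] - means[(1, 0)] - means[(0, 1)] + means[(0, 0)]
    ss_interaction = n * contrast ** 2 / 4  # 1 degree of freedom
    ss_error = sum((x - means[key]) ** 2
                   for key, values in cells.items() for x in values)
    ms_error = ss_error / (4 * (n - 1))     # 4(n - 1) degrees of freedom
    return ss_interaction / ms_error


def simulated_power(cell_means, sd, trials=5000, seed=0):
    """Estimate the power of the interaction test by simulating normal data."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        cells = {key: [rng.gauss(mean, sd) for _ in range(N_PER_CELL)]
                 for key, mean in cell_means.items()}
        if interaction_f(cells) > F_CRIT_1_12:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    # Hypothetical cell means (arbitrary units): control, A alone, B alone, both.
    means = {(0, 0): 1.0, (1, 0): 1.5, (0, 1): 1.5, (1, 1): 2.5}
    power = simulated_power(means, sd=0.5)
    print(f"estimated power = {power:.3f}, Type II error = {1 - power:.3f}")
```

Changing the assumed cell means or standard deviation shows how a smaller interaction, or a larger random variation, lowers the power and raises the probability of falsely accepting the null hypothesis (1−power) for the same sample size.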

