critical levels and magnitude of effects


Acta Psychologica 55 (1984) 273-280

North-Holland

CRITICAL LEVELS AND MAGNITUDE OF EFFECTS

David NAVON *

University of Haifa, Israel

Accepted May 1983

Researchers sometimes report an effect of an experimental manipulation on a dependent variable in terms of the difference between levels of a certain nuisance variable required in the different experimental conditions to reach a given criterion of behavior. This method, however, may yield illusory impressions about magnitude of effects.

Suppose you are faced with the commonplace problem that you wish to study the effect of one variable on another when the latter is suspected to be affected also by a third one. Let us adhere to the convention of calling these variables the independent variable, the dependent variable, and the nuisance variable, respectively. For simplicity of exposition, let the independent variable have just two levels. A typical solution to the problem presented above is to control the nuisance variable somehow, or to let it vary and partial its effect out statistically. Every introductory book on experimental design lists a number of different ways to do so. The common feature of all these ways is that they aim at isolating the net contribution of the independent variable to the variability of the dependent variable.

However, there is one other procedure which employs a different logic. Let it be called the method of critical levels. Its basic characteristic is that the variability investigated is not that of the dependent variable but rather that of the nuisance variable: Rather than looking at the difference between groups (or experimental conditions) on the dependent variable within a given level, or across levels, of the nuisance variable, this method calls for inspecting the difference between the different levels of the nuisance variable required by each of the groups to attain a given value of the dependent variable. Let this value of the dependent variable be called the criterial value, and the corresponding levels of the nuisance variable be called critical levels (see illustration in fig. 1).

* Author’s address: David Navon, Dept. of Psychology, University of Haifa, Haifa 31999, Israel.

0001-6918/84/$3.00 © 1984, Elsevier Science Publishers B.V. (North-Holland)

A quick scan of the psychological literature reveals that this method is employed in various situations. I have chosen three examples, each from a different field, to illustrate its use.

Schneider and Shiffrin (1977) studied the effect of memory load (or task complexity) on visual search for target numerals in arrays (frames) of characters flashed briefly one after the other, and measured the probability of target detection. They describe one of their findings in the following words:

the estimated frame time needed to reach a given performance level (say 0.70) appears to range from about 60-800 milliseconds, depending on the load placed on the search process. Such a result is certainly among the largest selective attention effects ever to be reported (Schneider and Shiffrin 1977: 12).

Thus, instead of holding frame duration constant (or averaging across all durations) and reporting differences in accuracy, Schneider and Shiffrin elected to present the effect of load by holding performance constant and reporting differences among the critical durations corresponding to the different load conditions.

[Figure 1. Labels recoverable from the original: INDEPENDENT VARIABLE; CRITICAL LEVEL (A); CRITICAL LEVEL (B); NUISANCE VARIABLE.]

Fig. 1. An illustration for the rationale of the method of critical levels (assuming that the nuisance variable and the independent one do not interact in their effect on the dependent one).


A currently prevalent procedure in studies of performance of complex tasks is the adaptive technique, by which some difficulty variable is continually manipulated according to feedback from the performance of the subject until performance stabilizes. The final difficulty level at which performance stabilizes is taken as a measure of the subject’s ability. For example, in a study of manual tracking, North and Gopher (1976) compared flight instructors with flight students on the acceleration level of the moving target they could handle. This is another instance in which the dependent variable and the nuisance variable switch roles: What is being compared is not performance levels but rather critical acceleration levels.

Turgeon and Hill (1977) studied the relation between children’s age and their conceptual organization as measured by their response to reversal and half-reversal shifts following discrimination learning. The experimental task was to sort various sets of cards. Turgeon and Hill defined the occurrence of two consecutive almost errorless trials (15 out of 16 correct placements in a sorting trial) as the performance criterion and compared different age groups on mean number of trials to criterion. The direct measure would have been to compare age groups on performance on a given trial.

Many more examples exist, of course. Note that any comparison of sensory thresholds or of mental ages employs the method of critical levels. A biological counterpart is the assessment of the potencies of materials by means of comparing doses needed to obtain a certain reaction of some living subject (see Finney 1964).

So much for examples. The issue is whether a comparison of critical levels is as readily interpretable as the direct comparison of levels of the dependent variable.

We are often concerned not only with the existence of an effect, which is inferred from the results of significance tests, but also with its magnitude. But the magnitude of which effect? We are, of course, interested in the effect of the independent variable on the dependent one. However, the method of critical levels does not yield a direct answer about the difference between the means of the dependent variable corresponding to the experimental conditions or groups, but rather about the difference between the corresponding critical levels of the nuisance variable. We are often led to complete the picture by implicitly inferring the magnitude of the difference on the dependent variable from the magnitude of the observed difference on the nuisance variable.


Alas, such an inference may be fallacious. The reason is that the difference between the experimental conditions on the nuisance variable depends not only on the difference between them on the dependent variable, but also on the sensitivity of the dependent variable to the nuisance variable. Ironically, the more relevant the nuisance variable is, the smaller the expected difference between critical levels. Hence, a sure recipe to observe a large apparent difference between critical levels, if any effect on the dependent variable exists, is to select an extremely irrelevant nuisance variable on which to define critical levels.
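The dependence on sensitivity can be written out under one simple assumed model. The linear form, the common slope b (no interaction, as in fig. 1), and the symbols a_A, a_B, y_c and x* are illustrative assumptions introduced here, not notation from the paper.

```latex
% Assumed model (not from the paper): the dependent variable y is linear in
% the nuisance variable x, with a common slope b (no interaction) and
% condition-specific intercepts a_A and a_B.
\begin{align*}
y_g(x) &= a_g + b\,x, \qquad g \in \{A, B\},\quad b \neq 0 \\
x_g^{*} &= \frac{y_c - a_g}{b}
  \qquad \text{(critical level at the criterial value } y_c\text{)} \\
x_B^{*} - x_A^{*} &= \frac{a_A - a_B}{b}
\end{align*}
```

Under these assumptions the gap between critical levels is the difference on the dependent variable divided by the sensitivity b, so for a fixed difference a_A - a_B, halving |b| doubles the gap.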

Let me illustrate this by an example of a diabolic abuse of statistics. Suppose you are in the advertising business, and that your present assignment is to advertise a certain model of car (let us call it Swift), whose prominent feature is its good acceleration. You decide to make the public aware of this virtue of Swift cars by presenting some data comparing their performance with that of “the most highly-rated other model” (let us call it Fleet). Unfortunately for you, the difference in acceleration proper is not very impressive. So, in your efforts to make the comparison more convincing, you get the insight to report critical levels of some variable, namely the levels of some relevant variable required by the two cars to attain a given acceleration A_c. If you hesitate over which variable to pick, you should be advised to pick the one with the least effect on acceleration. For example, comparing the velocities of opposite wind despite which the two cars can reach acceleration A_c will probably do you very little good, since acceleration is very sensitive to opposite wind, so the difference between the two critical wind velocities must be fairly small (one unit in fig. 2A). On the other hand, you would do better to compare the cars with regard to the velocity of cross wind which they can withstand with acceleration A_c (a difference of 6 units in fig. 2B).
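The arithmetic behind this example can be sketched numerically. The linear model and every number below are hypothetical, chosen only so that the gaps come out near the 1 and 6 units mentioned above; none of them is taken from the paper, and the function names are illustrative.

```python
# Hypothetical model: acceleration = base_accel - sensitivity * wind_velocity.
# All constants are made up for illustration.

SWIFT_BASE = 3.6   # acceleration of Swift with no wind (arbitrary units)
FLEET_BASE = 3.0   # acceleration of Fleet with no wind
CRITERION = 2.4    # the criterial acceleration A_c

# How strongly each kind of wind reduces acceleration (assumed slopes).
SENSITIVITY = {"opposite wind": 0.6, "cross wind": 0.1}


def critical_wind(base_accel, sensitivity, criterion):
    """Wind velocity at which acceleration falls to the criterion."""
    return (base_accel - criterion) / sensitivity


def critical_level_gap(sensitivity):
    """Difference between the critical wind velocities of Swift and Fleet."""
    swift = critical_wind(SWIFT_BASE, sensitivity, CRITERION)
    fleet = critical_wind(FLEET_BASE, sensitivity, CRITERION)
    return swift - fleet


for wind, slope in SENSITIVITY.items():
    print(f"{wind}: gap between critical velocities = "
          f"{critical_level_gap(slope):.1f} units")
```

The difference in acceleration proper is the same in both cases (0.6 units in this sketch); only the sensitivity of acceleration to the chosen wind variable changes the size of the reported gap.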

Of course, unlike in this example, scientists usually select their statistical tools bona fide. But they should be aware of the possible consequences of their choice. As seen above, the magnitude of the apparent effect is subject to the way we choose to demonstrate it. An effect may loom large just because we express it in terms of a relatively irrelevant nuisance variable, so that the critical levels delimit a considerable proportion of the variable domain (or are judged fairly wide apart relative to some other anchor points along the continuum). For example, suppose accuracy of item recognition in a certain memory task was only mildly sensitive to a manipulation of sleep deprivation. Still, one may get the impression that the manipulation was potent if its effect is reported as a difference between, say, critical study-test intervals, and if, in addition, performance on that task is not very much affected by the duration of the study-test intervals within a reasonable range.

[Figure 2, two panels. Panel A: axis labelled VELOCITY OF OPPOSITE WIND, with V_FLEET and V_SWIFT marked. Panel B: axis labelled VELOCITY OF CROSS WIND, with V_FLEET and V_SWIFT marked. Curves labelled SWIFT and FLEET.]

Fig. 2. An illustration of the effect that the sensitivity of the dependent variable (here, car acceleration) to different nuisance variables (here, velocities of opposite and cross winds) might have on the difference between the critical levels compared, here V_SWIFT vs. V_FLEET.


Surely, statistical measures of magnitude, such as omega squared, can serve to restore the picture to its true proportions, since they take into account the variance among observations, which is affected by the sensitivity of the dependent variable to the nuisance variable in the same manner that the difference between means is. The problem is, however, that measures of magnitude are seldom reported, and it is not yet a matter of routine to report standard deviations. Even when they are reported, readers’ impressions of magnitudes are probably much more readily influenced by comparison with the absolute scale than with the variability along it. To get a feeling for the methodological intuitions of researchers, the reader is invited to quote or paraphrase the above quotation from Schneider and Shiffrin (1977), and try to elicit critical comments from colleagues. I tried it on several of my colleagues, each with quite good methodological background and experience, and having read, presumably, several hundreds of experimental papers. None felt uncomfortable about the judgment of effect magnitude without relying on any variability statistic. It seems that an effect which spans a wide range of the dependent measure is usually perceived as large, without further inquiry about measures of dispersion or magnitude.
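Why a variability-based measure restores the proportions can be sketched as follows. The sketch assumes a hypothetical linear, non-interacting model in which the spread of critical levels, like their mean difference, scales with the reciprocal of the sensitivity. It uses a simple standardized difference (mean gap divided by standard deviation) as a stand-in, not omega squared itself, and all names and numbers are illustrative.

```python
def summarize_critical_levels(effect_on_dv, sd_on_dv, sensitivity):
    """Return (raw gap, standardized gap) between critical levels.

    Assumes a hypothetical linear model in which a difference or a spread
    on the dependent variable maps onto the nuisance variable by division
    by the sensitivity (the slope of the dependent variable).
    """
    raw_gap = effect_on_dv / sensitivity
    sd_of_levels = sd_on_dv / sensitivity
    return raw_gap, raw_gap / sd_of_levels


# Same effect and same noise on the dependent variable (made-up values),
# viewed through a relevant and an irrelevant nuisance variable.
for sensitivity in (0.6, 0.1):
    raw, standardized = summarize_critical_levels(0.6, 1.2, sensitivity)
    print(f"sensitivity={sensitivity}: raw gap={raw:.1f}, "
          f"standardized gap={standardized:.2f}")
```

In this sketch the raw gap grows as the nuisance variable becomes less relevant, while the standardized gap stays put, which is the sense in which a measure that takes variance into account is immune to the choice of nuisance variable.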

The conclusion is that the method of critical levels should be used with caution, especially when the objective is not the detection of an effect but rather the estimation of its extent. The use of the method of critical levels is usually unwarranted when the criterial value of the dependent variable is chosen arbitrarily. Critical levels are more meaningful and even indispensable when they correspond to the attainment of a unique criterion or when they mark the termination of some stage. Some examples are: the age at which a baby just stands up, the rehearsal period that would bring about error-free recall, the shortest stimulus onset asynchrony (SOA) at which a masker ceases to have any disruptive effect on a target stimulus.

But note that even in that case results are highly susceptible to misinterpretation. For example, suppose it is found that the effect of sleep deprivation on visual detection threshold is considerable when the latter is measured in terms of critical intensity, but is small when critical exposure duration is being measured. One may be tempted to conclude that sleep deprivation greatly reduces brightness sensitivity but hardly affects rate of processing. However, such a hypothetical finding could be more prudently explained as a simple mathematical consequence of a putative differential sensitivity of visual detection to stimulus intensity and exposure duration.

Imagine now a researcher who trains subjects to estimate distances with and without perspective cues. He finds that the presence of perspective cues has a large effect on the minimal level of illumination which subjects could tolerate for accurate distance estimation. He is surprised later to discover that when illumination level is kept constant, the availability of perspective cues hardly affects accuracy at all. His puzzlement may be resolved if he is reminded that these two apparently discrepant findings are in truth manifestations of the same fact, only viewed from two opposite perspectives, which is the small sensitivity of accuracy of distance estimation to level of illumination.

The use of the method of critical levels can always be defended by arguing that the researcher is actually interested in the effect of the independent variable not on the variable on which the critical value is defined but rather on the variable on which critical levels are measured. In other words, the claim might be that the dependent variable is what the researcher decides to measure. This seems to me a shoddy approach to scientific inquiry. While psychologists do have some freedom (to their dismay, I presume) in the choice of operational measures for their dependent variables, their hypotheses, especially those derived from well-defined models, do not leave them much freedom (or so it should be) with regard to the selection of the dependent variable itself. True, sometimes the method of critical levels serves as an indirect way of studying effects on the variable on which critical levels are measured. For example, critical exposure duration (e.g., Schneider and Shiffrin 1977) may be used with the intent to estimate effects on speed of processing rather than on accuracy. In that case, the choice of the variable on which critical levels are measured is not arbitrary at all. Clearly, my criticism is restricted to cases in which the hypothesis does not assert anything about the variable on which critical levels are measured, but rather about the variable on which the criterial value is defined.

Another conceivable use of the method of critical levels is when the researcher is interested not just in the effect of the independent variable on the dependent one but rather in the interactive effect of the independent variable and the “nuisance” variable. For example, one might measure critical study-test intervals to explore the effect of sleep deprivation on the persistence of memory (viz., its sensitivity to elapsing time) rather than simply on memory strength. However, for this matter it is as a rule more advisable to plot performance as a function of the two variables (in this case, amount of sleep and study-test intervals). Two data-points can tell no more than two data-points can tell.

References

Finney, D.J., 1964. Statistical methods in biological assay (2nd ed.). London: Griffin.

North, R.A. and D. Gopher, 1976. Measures of attention as predictors of flight performance. Human Factors 18, 1-14.

Schneider, W. and R.M. Shiffrin, 1977. Controlled and automatic human information processing, I: Detection, search and attention. Psychological Review 84, 1-66.

Turgeon, V.F. and S.D. Hill, 1977. A developmental analysis of the formation and use of conceptual categories. Journal of Experimental Child Psychology 23, 108-116.