
A Boolean Map Theory of Visual Attention

Liqiang Huang and Harold Pashler
University of California, San Diego

A theory is presented that attempts to answer two questions. What visual contents can an observer consciously access at one moment? Answer: only one feature value (e.g., green) per dimension, but those feature values can be associated (as a group) with multiple spatially precise locations (comprising a single labeled Boolean map). How can an observer voluntarily select what to access? Answer: in one of two ways: (a) by selecting one feature value in one dimension (e.g., selecting the color red) or (b) by iteratively combining the output of (a) with a preexisting Boolean map via the Boolean operations of intersection and union. Boolean map theory offers a unified interpretation of a wide variety of visual attention phenomena usually treated in separate literatures. In so doing, it also illuminates the neglected phenomena of attention to structure.

Keywords: visual attention, consciousness, visual feature

Visual attention, in its most fundamental sense, is a selective visual process that governs access to consciousness. Therefore, a theory of visual attention may naturally start with two questions, addressing limitations in access and mechanisms of selection, respectively. These questions boil down to the following: What can an observer visually consciously access (or as we sometimes say, apprehend) at any given moment? And how do the mechanisms of visual selection govern the choice of what is accessed?

The tasks and limitations studied in the great majority of previous work in the visual attention field (e.g., visual search tasks; Duncan & Humphreys, 1989; Treisman & Gelade, 1980; Wolfe, 1994) relate to the process of selection because the key manipulations in these studies have involved varying the quantity and nature of distracting information (e.g., set size manipulations, feature search vs. conjunction search) in situations in which only one single attribute is to be accessed and reported. In such cases, separating the relevant information (target) from irrelevant information (distractors) is the chief difficulty posed by the task. Accessing the to-be-reported feature of the target, on the other hand, is a relatively minor aspect of the task and one whose difficulty is usually held constant across different conditions of a given study.

Attentional limitations in access have been the focus of only a few studies, most notably those of Duncan and colleagues dealing with a phenomenon that he termed target–target competition (Duncan, 1980a, 1980b). In Duncan’s experiments, the separation of relevant information from irrelevant information was deliberately made very easy; the results provided evidence that conscious apprehension of multiple bits of relevant information was subject to severe capacity limitations not involved in selection per se. The distinction between limitations in access and selection is concretely illustrated in Figure 1. An observer tries to perceive the color(s) of square(s) in brief exposures. The access is easy in the top row (only one color has to be accessed) but difficult in the bottom row (two colors have to be accessed). The selection is easy in the left column (no distractors) but difficult in the right column (many distractors). On the one hand, most visual search studies basically examine the situation depicted in the top right corner. On the other hand, Duncan’s (1980a, 1980b) studies pertained to the bottom left corner.¹

The literature on visual attention has made significant progress in characterizing the principles of selection, based upon 25 years of active research on visual search, much of it sparked by the pioneering work of Treisman and Gelade (1980; see Quinlan, 2003, for a recent review). On the other hand, limitations of access have been studied much less extensively, and here, findings have basically been restricted to showing that access is subject to severe capacity limitations (e.g., Bundesen, 1990; Duncan, 1980a, 1980b).

To see the sort of issues that remain unresolved in the field, consider Figure 2. What can a person consciously apprehend at any given instant when he or she tries to take in all the information within each of the panels of this figure? The answer is not obvious on the basis of existing theory and research. Feature-integration theory (Treisman & Gelade, 1980) might seem to speak to this question. After all, the theory states that features are processed in parallel and are simultaneously available, whereas attention is necessary to bind features from different dimensions together into one object. From that, one might infer that access to two features (e.g., two colors) can be achieved in parallel as long as they do not have to be bound with other feature dimensions. This conclusion is demonstrably wrong, as we show below. At the risk of somewhat oversimplifying the theory to be presented here, the answer to this question offered in the present article essentially boils down to this: What can be accessed at any given instant is only one feature value per dimension, with all these feature values (in their entirety) being perceived to occupy one or multiple locations.

Figure 1. Various displays with which an observer can carry out the task of perceiving the color of the squares. Two types of attentional limitations (selection, access) are illustrated. Columns indicate whether selection of the relevant elements (squares) is easy or difficult (based on presence of disks as distractors). Rows indicate whether access to the relevant information is easy or difficult (based on complexity of to-be-accessed information).

Figure 2. What information can be consciously accessed at any given instant when an observer attends to each of these three different panels? Boolean map theory claims that only one feature (e.g., one color) can be consciously accessed at one instant; therefore, it contends that the configuration of colors and respective locations in Panels b and c can be apprehended at a glance (because only one color is mapped to one or more locations), whereas the configuration in Panel a cannot be momentarily apprehended (because the two colors have to be apprehended one by one).

¹ This selection–access distinction is somehow similar to the concepts of capacity–selectivity widely used in the literature (e.g., Desimone & Duncan, 1995; Pashler, 1998) where capacity is related to limitations in access and selectivity is related to limitations in selection. However, the two have often been confounded; for example, in a very difficult visual search task, the challenge should be selectivity, but researchers often suggest that attentional capacity is limited in such tasks.

Liqiang Huang and Harold Pashler, Department of Psychology, University of California, San Diego.

This research was supported by National Institute of Mental Health Grant R01-MH45584 to Harold Pashler. Special thanks to Anne Treisman, whose comments and prior writings were of enormous assistance in helping us to formulate and present the theory described here. We also thank James Enns, Patrick Cavanagh, Jeremy Wolfe, Kyle Cave, Donald I. A. MacLeod, William Prinzmetal, Nancy Kanwisher, Steve Link, Alex Holcombe, and Edward Vul for very helpful comments and/or discussion.

Correspondence concerning this article should be addressed to Liqiang Huang, who is now at the Center for the Study of Brain, Mind and Behavior, Princeton University, Green Hall, Princeton, NJ 08544. E-mail: [email protected]

Psychological Review, 2007, Vol. 114, No. 3, 599–631. Copyright 2007 by the American Psychological Association. 0033-295X/07/$12.00 DOI: 10.1037/0033-295X.114.3.599

A clear appreciation of the distinction between selection and access is crucial to the present theory because they are typically not formally distinguished in the literature. For example, a question such as “Can we attend to two colors simultaneously?” can mean two entirely different things. When referring to access, it amounts to the question “Is it possible to detect (or perceive) two colors simultaneously?” When referring to selection, it is tantamount to asking, “Can one selection process be based on two colors at the same time?” The answers to these questions are potentially independent in the sense that one can envision theories that would embrace any of the four possible pairs of answers to these two binary questions. Moreover, as we discuss below, the notion of seeing a color can be analyzed in a number of different ways.

A Boolean Map Theory of Visual Attention

What Is a Boolean Map?

Before describing the Boolean map theory in a concrete fashion, we must explain what a Boolean map is. As hinted at above, it is a spatial representation that partitions a visual scene into two distinct and complementary regions: the region that is selected and the region that is not selected.² In Figure 3, if a Boolean map is created to represent only the red object or only the green object (two natural possibilities, but not the only ones), then the other object would be missing from the selected region of the Boolean map. If both are selected, then as far as the Boolean map is concerned, they become indistinguishable: No differentiation in the featural properties of the objects is possible with respect to color, shape, or any more abstract property. The word Boolean refers to the fact that the visual scene is divided into only two binary (i.e., Boolean) levels.

² In the present theoretical framework, a Boolean map is a collection of locations. It should be emphasized that those locations are regions exactly covered by relevant stimuli (i.e., one should imagine the map as being, as it were, shrink-wrapped to conform tightly to the object). For example, in Figure 3, a Boolean map of both objects covers precise regions, and thus, the shape information (square, ball) is also automatically included. This differs substantially from previous theories (e.g., feature-integration theory) in which location seems to be regarded as orthogonal to the form or shape of the object occupying the location (e.g., square, ball, as in Figure 3). Therefore we suspect the word location in previous theories has been taken to imply one single coordinate value describing the location of something like the centroid of the entire object.

Figure 3. Three possible labeled Boolean maps that could describe the display on the left (composed of a red disk and a green square). One map includes only the disk and indicates its shape and color (a red disk); another includes only the square and indicates its shape and color (a green square). The third map includes both the square and the disk and describes only the global shape, but neither of the two colors.

The Boolean map may be associated with what we call featural labels. It is a key claim of the theory that the Boolean map as a whole may have only a single featural label per dimension, and this featural label must provide an overall featural description of the entire region. That is, there is no mechanism for associating a featural property with any subset of the Boolean map smaller than the whole. For example, when a Boolean map of the display shown in Figure 3 encompasses both the red object and the green object, there may be a color label, but if so, this color label cannot provide access to redness or greenness. To access the redness, one would need to create a Boolean map that covers only the red object. However, there can be independent featural labels for different dimensions of a single map. For example, a single Boolean map could have redness as a color label and verticalness as an orientation label. For the sake of simplicity, this principle of one feature value per dimension at any instant is hereafter often referred to as one feature value at an instant, as a shorthand for the proposition that there can be a separate feature label (but only one) for each dimension and that the label cannot be associated with anything less than the Boolean map in its entirety.
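As an illustration only, the labeling constraint described in this paragraph can be sketched in Python. Everything named here (`DISPLAY`, `LabeledBooleanMap`, `label_map`, the location names) is hypothetical and not part of the theory's own notation. The sketch also assumes one particular reading of the text: when the selected region is not uniform on a dimension, the label for that dimension is modeled as `None`, meaning that no specific value is accessible.

```python
from dataclasses import dataclass, field

# Hypothetical two-object display in the spirit of Figure 3: each location
# carries one value per feature dimension (the values are made up).
DISPLAY = {
    "left": {"color": "red", "orientation": "vertical"},
    "right": {"color": "green", "orientation": "vertical"},
}


@dataclass
class LabeledBooleanMap:
    selected: frozenset  # locations inside the selected region
    labels: dict = field(default_factory=dict)  # at most one label per dimension


def label_map(selected, display):
    """Build a map over `selected` and label it as a whole."""
    labels = {}
    dimensions = {dim for loc in selected for dim in display[loc]}
    for dim in sorted(dimensions):
        values = {display[loc][dim] for loc in selected}
        # Assumption: a specific value is accessible only when it describes
        # the entire region; otherwise the label is modeled as None.
        labels[dim] = next(iter(values)) if len(values) == 1 else None
    return LabeledBooleanMap(frozenset(selected), labels)


both = label_map({"left", "right"}, DISPLAY)
only_red = label_map({"left"}, DISPLAY)
```

Under these assumptions, the map covering both objects carries an orientation label but no usable color label, whereas the map covering only the red object gives access to its redness.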

Also, at the risk of belaboring what may already be obvious, we should point out that the Boolean map is a format in which one feature value is tied to potentially multiple locations. These locations are encoded by the pattern of the Boolean map itself, so, unlike other feature values, multiple locations can be represented simultaneously by one Boolean map.

Boolean Map Is the Mechanism of Visual Access

The preceding paragraph describes the Boolean map as an abstract data structure. Naturally, we introduce the concept of Boolean map to argue that it corresponds to a real and fundamental aspect of human vision. The labeled Boolean map, it is argued here, corresponds to the information that can be consciously apprehended at one instant based on vision. To put it another way, an observer’s visual awareness corresponds to one and only one Boolean map at any given instant. This momentary conscious apprehension provides access to both the shape of the Boolean map (the set of potentially multiple location values) and the identity of associated feature labels and represents the latter as properties belonging to the former. Thus, to apprehend the redness of a red object, one must create a Boolean map that covers the red object. We contend that observers are perfectly able to construct a Boolean map that covers both a red object and a green object, but if this is done, neither redness nor greenness is consciously accessible at that instant (as in the bottom right example of Figure 3).

The account described above can be roughly summarized as three tenets for easier reference, though they should not be interpreted as isolated propositions. These are obligatory encoding of location (consciously accessed visual information is always indexed by location), single-feature access (only one feature value can be accessed at one time), and multiple-location access (multiple locations can be accessed at one time).

We have proposed the format of Boolean map as characterizing the content of conscious access at any one instant. This is an attempt to answer the first question about access raised in the beginning of this article (What can an observer visually consciously access at any given moment?). The second question described there, relating to selection (How do the mechanisms of visual selection govern the choice of what is accessed?), requires us to describe the possible ways in which a Boolean map can be created, and this is discussed in the next section. In short, then, Boolean map theory addresses the two questions about access and selection by (a) specifying the data format of a Boolean map and (b) constraining the possible ways of creating a Boolean map, respectively.

Given that feature-integration theory (and its subsequent offshoots) does not explicitly distinguish between selection and access and (at least, read narrowly) is restricted to characterizing selection, it is not a simple matter to relate these propositions to that tradition. A more directly relevant line of work is found in Duncan (1980a, 1980b), who, as already mentioned, documented the difficulty of simultaneously accessing two targets even when the selection process is easy. One of the ways that the theory presented here goes beyond Duncan’s analysis is in proposing that even though observers cannot simultaneously access two feature values, they can simultaneously access two or more location values.

Creation of Boolean Map

A second postulate of the present theory is that all top-down control over visual perception is accomplished by processes that trigger and guide the creation of a Boolean map. This creation is envisioned as occurring through one of two different possible routes. The first route is creating a Boolean map de novo by selecting one feature value from a single dimension (which could include location itself, through processes described below). Here, the location or locations that characterize the distribution of that feature value are included in the resulting Boolean map (e.g., selecting all red objects in a visual scene results in a labeled Boolean map extending over the region comprising red things). The second route is through combining the preexisting Boolean map with the output of the first route (selection based on one feature value) via the Boolean operations of intersection and union, subject to important limitations described below. The theory also makes the strong claim that these two routes exhaust the mechanisms of selecting one set of visual information or another and routing it to consciousness. From that, two additional important conclusions follow. First, only one feature value can be used to create a Boolean map at one time. Second, a Boolean map cannot be created by simultaneously combining information from more than one dimension (i.e., creation is always based on one dimension at one time). These claims obviously run counter to a significant amount of opinion in the field, and the reader may find their import rather vague and abstract at this point. Each of them will be elaborated upon, rendered more concrete, and defended below. To avoid misunderstandings, it should be noted that the Boolean operations postulated in our discussion of the mechanisms of Boolean operations are fully consistent with the claim that at any given instant there is only one Boolean map because new Boolean maps created through Boolean operations automatically replace any preexisting map. The claims above can be roughly summarized as two tenets: feature-by-feature selection (selection can be based on only one feature value at one time) and availability of Boolean operations (any selection that cannot be directly accomplished by selection based on one feature value must be created through Boolean operations).
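The two routes of map creation can be made concrete with a minimal Python sketch. It is offered as an illustration under stated assumptions, not as the authors' implementation: the display, the location names, and the function `select_feature` are hypothetical, Boolean maps are modeled as plain sets of locations, and the limitations on Boolean operations that the theory mentions are ignored.

```python
# Hypothetical four-object display; each location has one value per dimension.
DISPLAY = {
    "top_left": {"color": "red", "orientation": "vertical"},
    "top_right": {"color": "red", "orientation": "horizontal"},
    "bottom_left": {"color": "green", "orientation": "vertical"},
    "bottom_right": {"color": "green", "orientation": "horizontal"},
}


def select_feature(display, dimension, value):
    """Route (a): a map of all locations holding one value on one dimension."""
    return frozenset(
        loc for loc, features in display.items()
        if features.get(dimension) == value
    )


# Route (a): create a map de novo from a single feature value.
current_map = select_feature(DISPLAY, "color", "red")

# Route (b), intersection: combine the existing map with a new single-feature
# selection. The result replaces the old map, so only one map exists at a time.
red_vertical = current_map & select_feature(DISPLAY, "orientation", "vertical")

# Route (b), union: the same kind of combination using union instead.
red_or_vertical = current_map | select_feature(DISPLAY, "orientation", "vertical")
```

In this sketch a conjunction such as "red and vertical" is never selected in one step. It takes two single-feature selections joined by an intersection, which is the sense in which selection proceeds feature by feature.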

It should be noted that the Boolean map is a relatively spatially precise and potentially complex spatial representation that can encompass spatially disconnected regions. This is of course natural given that the Boolean map can be created by selecting a feature, and the distribution of features can obviously occupy disconnected regions within a display. Thus, the Boolean map theory implies that terms like spotlight or zoom lens are entirely inadequate, even as metaphors, to describe the character of visual attention (e.g., Driver & Baylis, 1989; see also LaBerge & Brown, 1989).

At the risk of stating what may now be obvious, a feature value used in governing the creation of a Boolean map can be voluntarily chosen from all the feature dimensions present in a particular display. For example, in a display in which objects vary in several color values and several orientation values, the Boolean map can be voluntarily created either by selecting a color or by selecting an orientation. It is important to note, however, that a Boolean map can also be created by selecting all the objects in a display. We have already seen an example of this in the selection of both objects in the lowermost Boolean map depicted in Figure 3.

Summary of the Boolean Map Theory

According to the theory examined here, there are two different kinds of visual attentional limitations. One is defined by constraints on the available ways to create a Boolean map (i.e., selection). The other is defined by the information that can be contained in a Boolean map (i.e., access). Describing these constraints will begin to flesh out our proposed answers to the two questions raised earlier: What can an observer visually consciously access at one moment, and how do observers select what to access?

Thus, Boolean map theory can be briefly encapsulated in two propositions.

1. Principle of access. A Boolean map is a spatial representation that divides the visual field into two discrete subsets: selected and not selected. This map can be associated with featural labels, but no more than one per dimension. A feature label must provide a single overall featural description of the entire region encompassed by the map. At any given instant, an observer’s conscious awareness of a visual scene can be represented by a single labeled Boolean map.

2. Principle of selection. All top-down control in visual perception is accomplished by directing the ways in which a Boolean map is created. This takes place in one of two ways: (a) by selecting one feature value in one dimension or (b) by combining the existing Boolean map with output of (a) through one of two Boolean operations (intersection and union).

To take it further, the essential points of Boolean map theory can be further condensed into the five tenets we have given above: obligatory encoding of location, single-feature access, multiple-location access, feature-by-feature selection, and availability of Boolean operations.

The above presentation of the theory seeks to characterize what is basically a data format underlying conscious access and the possible ways of generating that data format. The theory may also be viewed as a description of how selection can be undertaken and what results emerge from the process of selection. Though the substance is the same, the latter approach may strike some readers as more familiar. Again, the same five tenets figure in this more narrative mode of describing the theory, which runs as follows: The visual system can select one feature at one time (feature-by-feature selection), and the process yields a map of the spatial distribution of the feature (obligatory encoding of location) with the option of combining this map with a preexisting map (Boolean operations). In each selection process, only one feature value can be accessed from one dimension (single-feature access), but the map itself provides access to multiple locations (multiple-location access).

Figure 4 shows a schematic representation of Boolean map theory. Here, visual processing is divided into two stages: feature maps and Boolean map. Feature maps are generated in early vision (the two subroutines relating to feature maps, the feature-location routine and the location-feature routine, are described below). The Boolean map is the sole mechanism of conscious access, with creation of Boolean maps being the sole means of top-down control.

Some Additional Comments on Concepts Introduced Here

Feature maps. In the preceding discussion, we have used the terms feature and feature map. In using these concepts, we are of course borrowing ideas common throughout much of the vision literature. This assumes the existence of a set of mechanisms that can extract featural descriptions from visual input without the generation of a Boolean map (Treisman & Gelade, 1980). Presumably, the feature maps include those computing motion, size, color, spatial frequency, orientation, and perhaps others (Treisman & Gormican, 1988). However, it follows from what has been said already that those features cannot be consciously accessed until they are represented as labels on a Boolean map and that they participate in generating conscious experience only according to the two rules of Boolean map formation stated above. Also, to clarify terminology, in this article, the word feature is used to refer to feature values (e.g., red and green are features), whereas color and orientation are always referred to as dimensions.

There are two natural ways of using visual information in the feature maps, which we describe as two subroutines. The first takes as an input a featural value and returns a Boolean map describing all the locations at which that feature value is present (this is called the feature-location routine and abbreviated FL). This FL is somewhat similar to prior concepts such as feature-based attention or the guidance process in Wolfe’s guided search model (Kim & Cave, 1995; Wolfe, 1994) or Hoffman’s two-stage model (Hoffman, 1979). The second subroutine takes as input a location value and returns a featural value for that location (the location-feature routine, abbreviated LF).³ In Boolean map theory, the FL is responsible for creating the Boolean map according to a top-down control value (e.g., seeing everything red, seeing the object on the top left corner). The LF is responsible for creating labels from all the dimensions (i.e., allowing them to reach conscious access) once a Boolean map is created.

A formal representation of Boolean map theory. Here, we give a very simplified formal representation incorporating the basic aspects of Boolean map theory to help avoid the potential ambiguity present in any verbal presentation, such as the one offered above.

Definitions: $L$ is the location variable. $X$ and $Y$ are two feature variables. $L = l_1$ means the location variable is given the value $l_1$. $X = x_1$ means Feature $X$ is given the value $x_1$. The arrow symbol ($\to$) means returning the value.

So, the basic sequence of events in attentional selection of visual information is as follows:

1. Specifying one control value voluntarily, for example, $(x_{\text{control}})$ or $(l_{\text{control}})$. This value is used in FL below.

2. From this control value, a set of locations is retrieved from FL: $\mathrm{FL}(x_{\text{control}}) \to (l_1, l_2, l_3, \ldots)$, or $\mathrm{FL}(l_{\text{control}}) \to (l_1, l_2, l_3, \ldots)$.

3. The set of locations could be combined with the locations of the preexisting Boolean map through Boolean operations.

4. For each of the location values $(l_1, l_2, l_3, \ldots)$, a set of feature values is retrieved from LF: $\mathrm{LF}(l_1) \to [x(l_1), y(l_1)]$; $\mathrm{LF}(l_2) \to [x(l_2), y(l_2)]$; $\ldots$

5. Those feature values are used to compute the feature labels: $[x(l_1), x(l_2), \ldots] \to x_{\text{label}}$.

3 The effectiveness of these two subroutines plausibly depends on how information in the feature maps is organized. Only the LF would be possible in a map that represented only the feature value in each location, which seems to have been the conventional (though usually only implicit) conceptualization of a feature map in previous theorizing. By way of analogy, consider a telephone book that lists 100,000 names (corresponding to locations) in alphabetical order, providing a telephone number for each (corresponding to a featural value). This representation will support something akin to the LF: rapidly finding the telephone number (feature value) for each desired name (location). However, as everyone knows, it is difficult to find the name that goes with a particular telephone number, that is, an ordinary phone book does not support the FL. To effectively find a name for a particular number, one needs another book (a reverse index) that lists all the names in the order of telephone numbers. The FL plausibly depends on such a reverse index.

Figure 4. A schematic overview of Boolean map theory. The sensory analysis begins at the top through creation of the feature maps. Top-down control is (solely) implemented by giving commands to the feature-location routine to trigger creation of a Boolean map. The Boolean map (along with the feature labels sent back by the location-feature routine) constitutes the visual information that is consciously accessible at one moment.


When those feature values are all equal (i.e., selected region is homogeneous in that dimension), the label has an explicit value.⁴ If $x(l_1) = x_1$ and $x(l_2) = x_1$ $\ldots$, then $x_{\mathrm{label}} = x_1$.

Thus, the theory presented in this article can be summarized as follows:

1. Principle of access. Conscious access has the format of a labeled Boolean map: The set of consciously accessible information can be represented as $(l_1, l_2, l_3, \ldots; x_{\mathrm{label}}, y_{\mathrm{label}})$. In this format, there can be multiple location values, but only one feature value can be represented for each dimension.

2. Principle of selection. The Boolean map is created by selecting one feature value in one dimension at a time (e.g., $x_{\mathrm{control}}$ or $l_{\mathrm{control}}$), with the option of Boolean operations with the preexisting Boolean map.
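The sequence of events and the two principles above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy `display`, the function names `fl`, `lf`, and `labels`, and the use of `None` to stand for the unspecified label of a heterogeneous region are all hypothetical choices made for the example.

```python
# Toy display: location -> {dimension: feature value}. Contents are invented.
display = {
    (0, 0): {"color": "red", "orientation": "vertical"},
    (0, 1): {"color": "green", "orientation": "vertical"},
    (1, 0): {"color": "red", "orientation": "horizontal"},
    (1, 1): {"color": "green", "orientation": "vertical"},
}


def fl(dimension, value):
    """Step 2: retrieve the locations at which one feature value is present."""
    return frozenset(loc for loc, feats in display.items()
                     if feats[dimension] == value)


def lf(location):
    """Step 4: retrieve the feature values present at one location."""
    return display[location]


def labels(boolean_map):
    """Step 5: one label per dimension, explicit only if the region is homogeneous."""
    result = {}
    for dimension in ("color", "orientation"):
        values = {lf(loc)[dimension] for loc in boolean_map}
        # None is a placeholder: the theory leaves the heterogeneous case open.
        result[dimension] = values.pop() if len(values) == 1 else None
    return result


red = fl("color", "red")                    # select one feature value
vertical = fl("orientation", "vertical")

red_and_vertical = red & vertical           # step 3: intersection
red_or_green = red | fl("color", "green")   # step 3: union

print(labels(red))
print(labels(red_and_vertical))
print(labels(red_or_green))
```

In this sketch a Boolean map is simply a set of locations, so several locations can be held at once, while each dimension contributes at most one explicit label. Selecting the union of red and green items yields a map whose color label is no longer explicit, which mirrors the single-feature access claim.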

Revisiting the distinction between selection and access. Above, we suggested that although a distinction between selection and access had made a few cameo appearances in the literature on visual attention, the two types of account have largely been conflated in most theoretical writings. This point can be highlighted in a more concrete fashion given the concepts introduced in the previous section. The control value (feature or location) sent to FL is distinct from the conscious percept that is returned by LF. In both cases, we argue that there can be only one feature value (to be accessed/to be used to select) at one instant, as stated in the tenets of single-feature access and feature-by-feature selection, respectively. Although these two claims may at first sound redundant, they are logically quite distinct. That is, one can easily imagine a hypothetical theory that would maintain only one of them and not the other. For example, a theory might maintain feature-by-feature selection and claim that one can select all vertical objects and then have simultaneous access to all of their three colors. On the other hand, a theory might uphold single-feature access while maintaining that one can immediately perceive the spatial pattern of green and blue objects from a background of red and yellow objects.

Predictions from Boolean map theory. Here, we note a few of the most immediate and important predictions that follow from Boolean map theory. From the claim that conscious access has the format of a Boolean map $(l_1, l_2, l_3, \ldots; x_{\mathrm{label}}, y_{\mathrm{label}})$, one may infer two things. First, only one feature value can be accessed from each dimension at one instant (single-feature access), whereas multiple location values can be accessed simultaneously (multiple-location access). For example, in Figure 5a, accessing the colors of the balls requires a series of Boolean maps selecting individual balls, but accessing the locations of all the balls (i.e., the spatial configuration or pattern) requires only one Boolean map selecting all balls. Second, access to featural information is always accompanied by awareness of the location of the accessed feature (i.e., if an observer sees something, the observer always knows where it is: obligatory encoding of location). We should point out, in case it is not already self-evident, that both of these predictions are in fundamental disagreement with the most famous aspects of feature-integration theory. According to feature-integration theory, features are processed in parallel and are simultaneously available as long as they do not have to be integrated. Thus, access to multiple features can be achieved in parallel, and these do not have to be accompanied by location information. The first prediction also runs against ordinary intuition, which seems to suggest to those inclined to reflect on the matter that an observer can continuously perceive multiple features at the same time (e.g., in Figure 2a).

The claims about selection described above put two important constraints on the possible operations of selection. First, the feature-by-feature selection tenet rules out the idea of making a selection based on more than one feature value simultaneously (e.g., dividing color space into two parts and selecting one of them; Bauer, Jolicoeur, & Cowan, 1996; D’Zmura, 1991). Second, the availability of Boolean operations tenet claims that any selection that cannot be accomplished by selection based on a single feature is implemented by Boolean operations. Thus, it rules out the idea of weighting signals from multiple dimensions into one single scalar according to the behavioral goal (e.g., Wolfe, 1994). Both claims are elaborated upon and defended later in this article.

4 Boolean map theory does not specify what label would be used when the selected region is not homogeneous on one dimension. Even if such a label cannot provide conscious access to individual feature values, it is probably able to provide an overall statistical description of all the feature values of the selected region (Chong & Treisman, 2003, 2005).

Figure 5. Two demonstrations of the claim that the selected region itself can serve as data for visual analysis. Attending to the red, white, and yellow disks in Panel a, an observer can see their triangular arrangement. In Panel b, attending to the blue lines or the green lines, one sees a periodic pattern; attending to both the blue and the green lines, one sees a periodic pattern with twice the frequency.

Attention as structure and data. According to traditional theories of attention, preattentive visual machinery generates many visual encodings, and attention selects, enhances, or inhibits some of these. Selection of a region is thus viewed as a signal that controls processing rather than as data to be analyzed by visual processing routines. By contrast, in the theory suggested here, the spatial pattern of the Boolean map (i.e., the shape of the selected region) is data available to conscious access by itself.⁵

It is not difficult to construct simple displays that demonstrate the potential for such analysis. Consider Figure 5a. If cued to attend to any three balls (e.g., the red, white, and yellow ones), an observer can perceive the triangle with corners located at the balls after the balls have been selected in three steps. The location and shape of the triangle depend on which three colors are picked. Given the number of possible triangles, it seems unlikely that all combinations are analyzed as triangles by preattentive visual machinery prior to the selection. Therefore, perceiving that triangle requires spatial analysis of the pattern of the Boolean map itself rather than selection of visual information already computed in early vision. A similar example is given in Figure 5b. An observer can choose to attend to only blue lines or only green lines and see a relatively low spatial frequency pattern. Alternatively, the observer can choose to attend to both blue and green lines and see a pattern of twice the spatial frequency.
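The Figure 5b example can be made concrete with a small sketch in which the selected region itself is the data being analyzed. The layout below is an assumption made for illustration (eight evenly spaced lines alternating blue and green along one axis), and the names `line_colors`, `select`, and `spacing` are hypothetical.

```python
# Hypothetical 1-D layout: lines at x = 0, 2, 4, ..., 14, alternating blue and green.
line_colors = {x: ("blue" if (x // 2) % 2 == 0 else "green")
               for x in range(0, 16, 2)}


def select(color):
    """Boolean map (set of line positions) for one color."""
    return frozenset(x for x, c in line_colors.items() if c == color)


def spacing(boolean_map):
    """Distance between neighboring selected lines, if they are evenly spaced."""
    xs = sorted(boolean_map)
    gaps = {b - a for a, b in zip(xs, xs[1:])}
    if len(gaps) != 1:
        raise ValueError("selected lines are not evenly spaced")
    return gaps.pop()


blue = select("blue")
green = select("green")
both = blue | green   # union of the two single-color maps

print(spacing(blue), spacing(green), spacing(both))  # prints: 4 4 2
```

The spacing is computed from the set of selected positions alone, with no reference to color. Halving the spacing for the union corresponds to the doubled spatial frequency described for Figure 5b.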

Despite the ease with which examples of this kind can be created, they have rarely been discussed in past literature on attention. The first example mentioned above (a triangle from three selected balls) has sometimes been mentioned under the heading of visual interpolation. Besides that, the spatial analysis of the selected region has apparently been considered in depth only within a few scattered theoretical treatments: the study of visual routines (Ullman, 1984), attentional tracking (Pylyshyn & Storm, 1988; see also Yantis, 1992, for an explicit approach of shape analysis), motion perception (Lu & Sperling, 1995), and what Cavanagh, Labianca, and Thornton (2001) termed sprites. In each case, an intriguing phenomenon was noticed, and significant empirical findings were noted. However, the underlying principle of analysis of the selected region has not been elevated to any particularly important place in theorizing about visual attention. The present article contends that new insight into a variety of perceptual and attentional phenomena may be gained if this concept is elevated to a very prominent place.

Moreover, by conceptualizing attention as a spatial pattern that itself provides data, the theoretical account developed here draws attention to a broader range of phenomena than those upon which visual attention theories have traditionally been based. Most research has focused on visual search tasks in which a person scans a complex display with the goal of finding a (usually singular) target. The present account seeks to illuminate a wider array of ways in which attention enables the perception of visual structure. Of course, visual search can be described as a sort of structural analysis (discerning the presence of a target), but the structure detected usually involves only a small aspect of a display. This extension may depart from previous research but not from the intended domain of previous theories; for example, although feature-integration theory has popularized the visual search design, the theory itself has been built upon evidence from converging paradigms including illusory conjunction, texture segregation, object files, and so on and was never intended to be focused specifically on visual search.⁶

Boolean Map Theory Compared With Previous Theories

Many components of Boolean map theory resemble elements in previous theories. Table 1 lists some of the claims and other unique aspects in Boolean map theory and refers to some of the theories that these claims agree with or challenge. This table should help to place the present framework in the context of the literature.

In brief, Boolean map theory can be related to previous theorizing in the following very general way. Duncan (1980a, 1980b) distinguished two types of attentional limit: capacity (limitation of access) and selectivity (limitation of selection). Previous research in the area of visual attention (Treisman & Gelade, 1980; Wolfe, 1994) has made substantial progress in uncovering the principles of selection. However, these and other theories have not offered any qualitative or formal constraints on access, that is, on the information inherent in a person’s momentary visual awareness. In Boolean map theory, both access and selection are characterized using the concept of the Boolean map.

Boolean Map as the Mechanism of Visual Access

The previous section described the claims of Boolean map theory largely in the language of data structures because we believe this provides the clearest and most appropriate rubric for the theory. However, the reader probably found this relatively abstract formulation unfamiliar and perhaps even strange. Our goal in the next several sections of this article is to persuade the reader that although this formulation may be unusually abstract, it not only accounts for many concrete but hard-to-reconcile results within the familiar lines of visual attention research but also points to many new kinds of visual phenomena that have not been much explored but deserve exploration, especially those relating to the perception of structure in complex displays. As noted above, the theory proposed here assumes that a Boolean map is the only mechanism through which visual information (featural labels and shape of Boolean map) can be consciously accessed. We first describe evidence consistent with the basic hypothesis that only one feature value per dimension can be accessed from the Boolean map. Then, we show evidence that there can indeed be separate featural labels for different dimensions associated with a single map.

5 One might suggest that it would be simpler to assume that the shape of the passed information, instead of the pattern of the Boolean map, is being analyzed. For example, in Figure 5a, if a filter removes all the other balls except the three, the scene will appear to be like a triangle, identical to a shape analysis of the Boolean map. Although this point has some merit, a shape computation must be performed to make the three dissociated balls appear as a triangle. As we show below, the feature information is discarded in this shape analysis. So, this interpretation may in the end not be distinguishable from the proposed shape analysis of the Boolean map.

6 A. Treisman (personal communication, February 10, 2006).

Only One Feature Value Can Be Accessed at One Instant

As illustrated in Figure 3, given a display containing both a red and a green object, there are three possible Boolean maps that could be constructed. One is a Boolean map that selects only the red object. Another is a Boolean map that selects only the green object. The third is a Boolean map that encompasses both objects. If this third possibility is elected, then according to Boolean map theory, the individual colors cannot be accessed. Therefore, to access the properties of individual objects (e.g., to determine that the display contains both a red and a green ball), one would need to create two distinct Boolean maps in series. When that is not possible, observers will not be able to make any judgment about those individual features.

Structure of multicolor displays. As mentioned above, studies of visual attention have been overwhelmingly focused on visual search and a few other tasks. However, in daily life, people often view their environment without the goal of finding any particular single target object. Rather, they come to be aware of global structural information about a scene or display. Researchers have studied a number of phenomena that relate to the perception of structure, including global/local or Navon figures (large letters made out of small letters; Navon, 1977) and structure from motion (Koenderink & van Doorn, 1991). In this article, we focus on the role of visual attention in spatial transformation tasks (symmetry perception, matching, and mental rotation).

Most previous studies of symmetry perception have focused on symmetry of the spatial configuration of inherently Boolean patterns such as those created by distribution of black dots or lines, as in Figures 6a and 6b (e.g., Barlow & Reeves, 1979; Palmer & Hemenway, 1978; van der Helm & Leeuwenberg, 1996). By contrast, less is known about the perception of symmetry in displays involving multifeature variation, even though these forms of symmetry are evidently important in both nature and art (see Figures 6c and 6d).

The only two studies of symmetry perception that systematically manipulated surface features have come from our own laboratory (Huang & Pashler, 2002; Morales & Pashler, 1999). In the first of these studies, observers were required to judge symmetry in the arrangement of color in a regular grid pattern. That is, the observer decided whether each and every square in the grid had the same color as the corresponding square located equidistant across the axis of symmetry, with asymmetric patterns differing in the color of only one or two squares. The results led to the conclusion (which initially came as a surprise to us) that the symmetry judgment required a serial process in which different subfigures (subsets of the display having a given color, such as red or green) were assessed in series. One finding was that the greater the number of colors, the longer the response times (holding constant the size of the grid). Another, more telling result came from what was termed the ABBA/ABCD method (see Figure 7). Here, we compared responses to two different types of four-color displays in which two squares mismatched. In the ABBA condition, the two mismatched pairs involved only two colors (e.g., one red might be changed to green, and one green changed to red), whereas in the ABCD condition, the two mismatched pairs involved all four colors (e.g., one red changed to yellow, and one green changed to blue). Responses were faster in the ABCD condition, where all four colors were involved. This would be predicted by the hypothesis of sequential scanning through colored subfigures because any mismatch would become evident in the first subfigure checked in the ABCD condition, but not necessarily in the ABBA condition.
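The logic of the ABBA/ABCD comparison can be illustrated with a small simulation of serial subfigure scanning. This is a sketch under explicit assumptions rather than a model from the studies cited: the 4 x 4 grids are invented, symmetry is taken about the vertical midline, and the scanning order over colors is assumed to be random, so the result is averaged over all orders. All function names are hypothetical.

```python
from itertools import permutations


def subfigure(grid, color):
    """Set of (row, column) cells holding one color: a single-color Boolean map."""
    return frozenset((r, c) for r, row in enumerate(grid)
                     for c, cell in enumerate(row) if cell == color)


def is_mirror_symmetric(locations, width):
    """True if the set of cells equals its reflection about the vertical midline."""
    mirrored = frozenset((r, width - 1 - c) for r, c in locations)
    return mirrored == locations


def checks_until_mismatch(grid, order):
    """Number of subfigures examined before an asymmetric one is found."""
    width = len(grid[0])
    for n, color in enumerate(order, start=1):
        if not is_mirror_symmetric(subfigure(grid, color), width):
            return n
    return len(order)  # every subfigure was symmetric


def mean_checks(grid, colors="RYGB"):
    """Average number of checks over all possible scanning orders."""
    orders = list(permutations(colors))
    return sum(checks_until_mismatch(grid, o) for o in orders) / len(orders)


# Both grids derive from the symmetric pattern RYYR / GBBG / YRRY / BGGB.
abba = ["GYYR", "RBBG", "YRRY", "BGGB"]  # one red -> green, one green -> red
abcd = ["YYYR", "BBBG", "YRRY", "BGGB"]  # one red -> yellow, one green -> blue

print(mean_checks(abcd))  # every subfigure is asymmetric
print(mean_checks(abba))  # only the red and green subfigures are asymmetric
```

Under these assumptions the ABCD grid is rejected after a single check whatever color is examined first, whereas the ABBA grid needs 5/3 checks on average because the yellow and blue subfigures remain symmetric.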

Morales and Pashler (1999) concluded from their findings that symmetry detection machinery is inherently color-blind. In the present article, this conclusion is endorsed but fundamentally reinterpreted: now, it is contended that an iterative color-by-color Boolean mapping strategy and the underlying poverty of momentary visual awareness that it implies are in no way confined to, or indeed even particularly related to, symmetry perception per se. Instead, the serial scanning from color to color is the strategy of choice because a Boolean map is the only source of conscious access. Any visual perceptual process that is not already accomplished by early vision would be blind to the spatial arrangement of features in general and would need to rely upon the selection of only one feature at one time.

Table 1
Aspects of Boolean Map Theory

| Tenets and unique aspects | Previous similar claim | Previous opposite claim |
|---|---|---|
| Single-feature access (only one feature value can be consciously accessed at one instant) | | Treisman & Gelade (1980) |
| Obligatory encoding of location (location information is always selected simultaneously with feature values) | | Treisman & Gelade (1980) |
| Spatial pattern of selected region is analyzed | Lu & Sperling (1995), Yantis (1992) | |
| Early vision automatically computes a set of separate feature maps | Treisman & Gelade (1980) | |
| Feature-by-feature selection (feature-based selection can only be based on one feature value at one time, not a subset in the feature space) | | D’Zmura (1991) |
| A Boolean operation, not top-down salience, is the strategy of conducting conjunction search | | Wolfe (1994) |
| Distinguishing the attentional limit in access and selection | Duncan (1980a, 1980b) | |

If Boolean map theory is correct, it follows immediately that it should be impossible to perceive directly the spatial structure of a multicolor display, even if an assessment of symmetry is not required. For example, if an observer wants to apprehend the spatial arrangement of the colors in Figure 8, he or she has to construct four Boolean maps in series and carry out a serial scan in which the shape of these maps is assessed.

The first experiments to be described here extend these results from symmetry to the simpler and more fundamental task of pattern matching. Observers judged whether two simultaneously presented displays were identical or not (subject to rotation in one experiment). The details of methods and results are described in the Appendix.

As shown in Figure 8, in the repetition detection experiment (Experiment 1), the observer tried to judge if two color patterns were identical. In the rotation experiment (Experiment 2), the observer tried to judge if two color patterns were identical after 90° rotation. Both experiments used the ABCD–ABBA design from the symmetry studies (Huang & Pashler, 2002; Morales & Pashler, 1999). Performance was indeed significantly better in ABCD displays than ABBA displays in both repetition detection and rotation tasks, suggesting that the same color-to-color scanning principle operated in repetition detection (with or without the additional need for mental rotation).

Figure 9 illustrates the same point more informally. Each pattern is composed of 16 squares, each of a different color. Are the two patterns identical? Unlike in the displays in Figure 8, Boolean mapping should now be highly inefficient (fewer squares per Boolean map), making the comparison exceedingly laborious, as most observers seem to agree that it is.

The single-feature–multiple-locations format. The basic definition of Boolean map describes it as having a single-feature–multiple-locations format. Therefore, it follows that one can only access one feature value (e.g., the color of this spot is green) at one time, but one may simultaneously access multiple locations (e.g., there is a spot here and here and here). Most people seem to agree introspectively that if they have to perceive a multicolor pattern like those in Figure 8, they tend to sequentially shift from subfigure of one color to another. However, viewing a simpler pattern such as those in Figure 2a, many people intuit that they can see the two colors simultaneously. We suspect that a sequential strategy still operates⁷ but that the sequential steps are less readily introspectable in these cases, perhaps simply because they are faster.

Figure 6. Symmetry perception tasks. Panels a and b show monochrome (and thus inherently Boolean) displays used in most research in this area. Panels c and d show naturalistic color-symmetry displays. Photograph in Figure 6c copyright 2007 by Oscar Gutierrez (www.ogphoto.com). Used with permission.

Experiment 3 (see Figure 10a) was conducted to provide an exceptionally direct test of the theory with a method similar to that of Duncan (1980a, 1980b). Here, a two-color pattern was presented simultaneously in one condition and successively in another condition. In both conditions, the stimuli were masked after a very brief exposure. The participants were later shown a test color that could be one of the two colors from the display (half the time) or a completely new color (the other half of the time). If the two colors cannot be accessed at the same time, as Boolean map theory predicts, then the successive condition will have a substantial advantage on the test over the simultaneous condition (Shiffrin & Gardner, 1972). This prediction was confirmed. In addition, a model fitting suggested that the performance difference between simultaneous and successive conditions fit well with the predictions of a strictly sequential model. Experiment 4 (see Figure 10b) tested the issue of accessing two locations simultaneously, also in a successive/simultaneous method. Here, we found that participants did slightly better in the simultaneous condition than in the successive condition. The results of the successive/simultaneous comparison were significantly different in Experiment 3 and in Experiment 4. Taken together, then, these studies are consistent with the idea that people can indeed only access two colors in series, whereas they can access two locations simultaneously. These conclusions follow directly from the theory proposed here, and to our knowledge, they do not follow explicitly from any other theory in the literature.

Is it a problem for Boolean map theory that, when viewing a display like that of Figure 3 (containing a red disk and a green square), many observers claim to have a vivid subjective experience of seeing the two colors at the same time? Before assuming that this is a problem, one should note that the theory does not imply that what such an observer would experience would match the experience of viewing always just one object or of seeing two objects that are gray or some other uniform color. The Boolean map shown in the bottom of Figure 3 implies an awareness of two objects (and perhaps the presence of heterogeneity of color); what the viewer is claimed to lack is the simultaneous awareness of what colors are present and how they are bound to the shapes. However, that information can be immediately accessed by creating one of the two other Boolean maps shown in the figure. As Dennett (1991) has pointed out in discussing the fact that people tend to be oblivious to the blurriness of their peripheral vision, observers may be unable to distinguish between actually having certain information explicitly represented in awareness and having the ability to access that information quickly whenever they want it (see also O’Regan & Noe, 2001). Thus, if observers can rapidly access the redness of the disk and the greenness of the square in Figure 3 whenever they want to, as the current theory implies, that may suffice to make the observers report that they see both colors at the same time. Additionally, it should be noted that although the present theory challenges some commonsense ways of understanding conscious experience, experimental findings in this area probably make such challenges inevitable. The Boolean map theory proposed here seems to be in less stark conflict with ordinary introspections than are recent views of visual awareness prompted by findings from change detection and change blindness research (e.g., O’Regan & Noe, 2001; Rensink, 2000a, 2000b), a point that we return to below.

So far, we have discussed the issue of accessing two colors or locations in the sense of obtaining information about an individual element. Another, perhaps more important, perspective is that simultaneous accessing of multiple locations makes them a holistic pattern and no longer a mere collection of individual locations. A holistic pattern represents associations between locations in the (x, y) plane, reflecting their simultaneous presence in the display. This stands in stark contrast with the fate of features, where no such association can be represented. For example, colors are not organized into a pattern in the color space and thus must be perceived individually. This can be seen in Figure 11. In Figure 11a, one can easily verify that the spatial pattern of disks on the left has the same shape as the pattern of disks on the right.⁸ By contrast, to verify that the same set of colors is present in the two, one must rely upon a sequential checking (iteratively forming a Boolean map that spans the two displays, one consisting of a pair of yellow items, one of a pair of red items, and so on). The difference between location and other features is even more obvious with respect to their dependency upon each other. Shuffling the disks causes little harm to the comparison of spatial pattern but renders it substantially more difficult to verify the correspondence of colors (see Figure 11b). This makes sense if multiple locations can be perceived together as a holistic pattern, whereas features can only be perceived individually through their locations (see D. G. Watson, Maylor, & Bruce, 2005, for evidence that a sequential checking strategy is used in determining number of colors present in a display). Although this observation may seem intuitively obvious once pointed out, it is not clear to us how prior theoretical analyses would in any way entail it.

7 This is consistent with the recent findings of Howard and Holcombe (2007). Howard and Holcombe asked observers to keep monitoring the features or locations of a few objects that were continuously changing. Observers showed a lag in reporting spatial frequencies (they tended to report an “old” feature value rather than the latest feature value), and the lag increased monotonically with number of objects, as expected if the observers serially visited each object and could only report “cached content” from their last visit. More interestingly, no large lag increase was found for reporting locations, consistent with the central claim of Boolean map theory: One must access multiple feature values serially but can have simultaneous access to multiple locations.

Figure 7. Two different displays used in analyzing color-symmetry perception. In both the ABCD and ABBA displays, the patterns are not color symmetric. However, in the ABCD display, there are two mismatched pairs involving all four colors. In the ABBA display, there are two mismatched pairs, both involving the same two colors. According to Boolean map theory, the ABCD display should elicit quicker rejection because asymmetry is detected in whatever subfigure (red, yellow, green, or blue) happens to be considered first.

Depth is evidently an exception to the single-feature-access tenet. As shown in Figure 12, one can see multiple depth levels at one time, and they form a vivid pattern in three-dimensional space. As mentioned above, multiple locations can be accessed simultaneously. Given that depth is an aspect of location, this exception does not undermine but rather extends and reinforces the plausibility of the present analysis (while offering hints about what coordinate structure Boolean maps are likely to reside in).

Attentional tracking. Attentional tracking has been widely used to study visual attentional limits in general and object-based attention in particular. In this experimental paradigm, a large number of identical objects move randomly in the display. Some of them are highlighted at the beginning of the trial, and participants try to track them while they move. At the end of the trial, participants report identifying properties of the tracked objects. The most fundamental finding about attentional tracking is that participants can track about 4–5 items quite well (Pylyshyn & Storm, 1988).

8 Regarding Figure 11a, we have already, with Figure 5a, talked about the one Boolean map selecting all balls to access their spatial pattern (e.g., all locations). This is not inconsistent with the principle of selection that claims one can only select one feature value at one time. Presumably, in this case, selection is not based on color at all. Instead, it is done by sending a command like “all objects” to the FL routine, and the Boolean map created can provide access to its spatial pattern (but not access to color values of balls). This is natural and probably depends on the bottom-up salience that we discuss later.

Figure 8. Top left panel: matching task (“Are the two multicolor displays identical?”). Middle left panel: symmetry judgment (“Are the two multicolor displays mirror reflections of each other?”). Bottom left panel: mental rotation task (“Can the left display be rotated to become identical to the right display?”). Boolean map theory claims that such complex multicolored displays must be decomposed into colored subfigures (right panels) before structural analysis can be performed.


Attentional tracking provides a strong test for Boolean map theory because attentional tracking would appear to require the direct use of a Boolean map. In most visual attention studies, the task-relevant information is usually some property of the object, whereas in attentional tracking, the spatial distribution of attention plays a critical role in the task. Also, with static displays, the strict limit of one Boolean map at one time may be masked by the strategy of rapidly switching between one Boolean map and another, whereas in attentional tracking, objects are rapidly moving, presumably foiling any such strategy. A number of important studies from the tracking literature fit very well with the analysis of tracking offered by Boolean map theory.9

The most widely discussed account of tracking is the theory of indexing proposed by Pylyshyn (1989). According to this view, each object is listed and differentiated by a pointer, allowing even physically identical items to be treated as different objects. Boolean map theory offers a very different analysis: Because the nature of visual encoding is Boolean, objects are either selected or not selected. Therefore, observers should not be able to maintain any cognitive differentiation between two selected objects during the time all the objects are being tracked. Suppose each ball starts with a different digit in it, and the participants track four balls numbered 1–4. At some point, the digit is erased, and they keep tracking for some time after that. In the process of tracking, participants should readily be able to report which four balls were initially cued. However, they should be unable to report the identities of the objects (i.e., which is which). Note that this prediction goes against indexing theories (Pylyshyn, 1989; see also Kahneman & Treisman, 1984; Kahneman, Treisman, & Gibbs, 1992). However, it was recently confirmed by Pylyshyn (2004).
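The contrast between an indexed record and a Boolean map can be made concrete with a small sketch. The Python below is only an illustration under our own assumptions: the ball positions, the label numbers, and the helper name `boolean_map` are hypothetical, and swapping two balls stands in for the outcome of continuous motion.

```python
def boolean_map(tracked_positions):
    """Return a Boolean map as a frozenset of selected (x, y) locations.

    The map records only which locations are selected; it keeps no
    per-object labels (illustrative assumption, not the authors' code).
    """
    return frozenset(tracked_positions)


# Four cued balls, initially labeled 1-4 (positions are made up).
start = {1: (0, 0), 2: (5, 1), 3: (2, 7), 4: (9, 9)}

# The same four balls later, with balls 1 and 2 having traded places.
swapped = {1: (5, 1), 2: (0, 0), 3: (2, 7), 4: (9, 9)}

# An index-style record (label -> position) distinguishes the two states.
index_differs = start != swapped

# A Boolean map of the same two states does not.
maps_equal = boolean_map(start.values()) == boolean_map(swapped.values())

print(index_differs, maps_equal)
```

On this toy representation, the set of tracked locations is fully available while the "which is which" information is simply absent, which is the pattern the prediction above describes.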

In addition, the Boolean map theory predicts that participants will be poorly aware of any features of the objects besides their location. For example, when participants track 4 among 10 balls, the Boolean map will not encode the colors or other features of those balls. Two recent studies confirmed this prediction. Scholl, Pylyshyn, and Franconeri (2007) asked participants to track four items. When participants were tracking, one feature of an item was suddenly obscured, and the observer was asked to report that feature. Not surprisingly, participants’ memory about location was much better for tracked items than nontracked items. Remarkably, however, there was no difference between the tracked items and other items in the accuracy of reports on color and shape. Similarly, Saiki (2003) asked participants to keep track of three items that rotated in a predictable way: These three balls were evenly distributed in a circle and moved with equal angular velocity. Naturally, this is a very easy task. Participants were asked to report any sudden switch of color between two items. Performance was found to be very poor, suggesting that even if an observer is tracking only three items that rotate in a predictable way, the observer is hardly aware of their features.

Shape analysis of adjacent regions. Boolean map theory also makes strong predictions about the extraction of spatial information from two adjacent regions. As shown in Figure 13, when observers view an adjoining pair of regions, one red and one green, Boolean map theory predicts that the only useful option is to select one region or the other and process the shape of this region; if an observer selects both (which is also an option), then the difference between the two regions disappears, and no access is provided to shape information that depends upon the contour between them.
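The disappearance of the shared contour can be illustrated with a toy grid. This is a sketch under stated assumptions, not a model from the article: the 6 x 4 field, the staircase `border`, and the `contour` helper (cells with a 4-neighbor outside the map) are all hypothetical choices made for illustration.

```python
def contour(boolean_map):
    """Cells of the map that have at least one 4-neighbor outside the map."""
    edge = set()
    for (x, y) in boolean_map:
        neighbors = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        if any(n not in boolean_map for n in neighbors):
            edge.add((x, y))
    return edge


# A 6 x 4 field split by a staircase border: red on the left, green on
# the right. border[y] is the (made-up) column where row y turns green.
border = [2, 3, 2, 4]
red = {(x, y) for y in range(4) for x in range(6) if x < border[y]}
green = {(x, y) for y in range(4) for x in range(6) if x >= border[y]}

both = red | green  # selecting both regions at once

# Cell (2, 1) is red and touches green on its right side.
on_red_contour = (2, 1) in contour(red)
on_joint_contour = (2, 1) in contour(both)

print(on_red_contour, on_joint_contour)
```

With red alone selected, the staircase edge is part of the map's outline; with both regions selected, the map is a plain rectangle and the staircase is no longer represented anywhere in it.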

9 Because we were not aware of these studies when we derived the predictions from Boolean map theory, we view these studies as providing especially strong support for the theory; the reader may or may not agree.

Figure 9. Are the two displays identical? With 16 squares and 16 colors, the task is extremely laborious, according to Boolean map theory, because it cannot be efficiently decomposed into a small number of subfigures of a given color.

Most readers will immediately realize that what is discussed here is not a new phenomenon but rather the familiar phenomenon of figure–ground segmentation, which has been studied for a very long time (e.g., Rubin, 1921/2001). Given this literature, there is no need here to document the robustness of this phenomenon. However, there are some interesting points of contrast between the traditional interpretation, which has been formulated using the concept of figure–ground segmentation, and the interpretation provided by Boolean map theory.

The traditional interpretation is Bayesian in character: it states that figure–ground segmentation reflects the fact that it would be a great coincidence for two adjacent objects to happen to have outer contours that fit snugly into each other (e.g., Rock, 1983). Instead, it is overwhelmingly more likely that one of the curves only appears so because of occlusion (i.e., part of one object is actually hiding behind the other object). In recognition of this fact, the visual system determines which object is behind the other and declines to assign the object the shape dictated by the contour (in some formulations, it is represented as perceptually shapeless). This account attributes the failure to assign shape on one side not to any processing difficulty but rather to intelligent inference. We call it the intelligent contour removal account. Most work in the field has adopted this approach. One example is the parallel interactive model of configural analyses (PIMOCA; Peterson, de Gelder, Rapcsak, Gerhardstein, & Bachoud-Levi, 2000). This model assumes that different parts participate in perceptual organization in parallel but also tend to inhibit each other across edges. PIMOCA has been successful in explaining some pertinent phenomena (e.g., the role of familiar shapes).

Although the intelligent contour removal account has evident merit, its assumption about the direction of causality here may be questioned. We suggest that the human visual system cannot simultaneously process two abutting shapes because the spatial processing required for shape analysis relies on a Boolean map, and only as a consequence of this inability is a mechanism needed to decide which region is likely to be more important (figure). If the intelligent contour removal account is correct (at least in its strongest form), observers should be able to process the shapes on both sides when neither of the two regions appears to be behind the other. On the other hand, Boolean map theory predicts that even if neither of the two regions appears to be behind the other, the shapes on both sides can still not be accessed simultaneously.

Figure 10. Method of Experiments 3–4. In Experiment 3 (a: color–color experiment), two colors are presented either simultaneously or successively. The participants later decide whether a probe color (not shown here) was in the display or not. In Experiment 4 (b: location–location experiment), two locations are presented either simultaneously or successively. The participants later decide whether a probe square (not shown here) was in one of the locations shown or not. According to Boolean map theory, two features (e.g., colors) can only be accessed sequentially, predicting an advantage for successive displays in Experiment 3. On the other hand, two locations can be accessed in parallel, predicting no advantage for successive displays in Experiment 4. These predictions were confirmed.

As it happens, some features found in natural scenes do at least as good a job of illustrating the point as would an artificial display (while possibly also raising questions about the statistical assumptions underlying the intelligent removal interpretation). In the photographs seen in Figure 14, when inspecting the curves in the middle of the pictures (a ridge in Figure 14a and a crevice in Figure 14b), the reader will probably notice that other visual cues have been strong enough to overcome the tendency of figure–ground segmentation. That is to say, the visual system correctly determines that neither of the two sides is behind the other. However, at least judging from introspection, although it is easy to attend to both sides, extracting the shape of the center ridge or crevice can only be done from the left side or the right side, but never both.

This demonstration argues against the extreme version of the intelligent contour removal account. The account could doubtless be modified to explain this phenomenon. For example, even if the primary reason for assigning the shape information to only one region or the other is to deal with occlusion, this tendency may be misapplied to other situations in which occlusion is not present. In any case, the phenomenon is directly predicted by Boolean map theory and subsumed under a much more general principle.

Obligatory encoding of location. One basic tenet of Boolean map theory is that to access a feature, one must create a Boolean map encompassing a region comprising only that feature. Therefore, it is impossible to know the feature value of an object without also apprehending its location. This claim of Boolean map theory is similar to the claim emerging from research on detection of gratings at threshold, to the effect that signals are detected on location-labeled channels (A. B. Watson & Robson, 1981). The difference, however, is that according to the present view, the channel could be a potentially complex and rich spatial representation encompassing articulated and discontinuous regions.

Ordinary visual experience is surely consistent with this view. Whenever one sees a tree, a car, or a boy playing basketball, one simultaneously knows where it or he is with some reasonable precision. This is so intuitive that it may seem trivial. However, it is not at all self-evident and indeed, is not true of all sensory modalities; one can be perfectly aware of an auditory event such as a bird singing while having only a vague sense of its location (see Kubovy, 1981, 1988, and Kubovy & Van Valkenburg, 2001, for converging evidence that location is unique in vision, but not audition, and for stimulating discussion of this issue).

Indeed, feature-integration theory explicitly claims that features can be accessed without knowledge of their location (Treisman & Gelade, 1980). Although initial evidence appeared to favor this account, analyses that considered the role of guessing led to the opposite conclusion (Johnston & Pashler, 1990; for a recent review, see Quinlan, 2003; see also Driver & Vuilleumier, 2001). The literature is now generally consistent in implying that a feature cannot be accessed without simultaneous knowledge of its location.

No access to details within a Boolean map. Boolean map theory predicts that an observer will have no access to feature information that applies to only part of the Boolean map. When one selects the whole, one has no access to a feature description that pertains to only one part of this whole. For example, when one selects both objects in Figure 3, it is not possible to assign the color red to the whole Boolean map even if it is a single-feature value. This notion is consistent with the findings of He, Cavanagh, and Intriligator (1996). They found that even if the orientation of a crowded grating patch could not be explicitly reported, viewing the patch could nonetheless induce an orientation-specific adaptation effect, indicating that its orientation must be represented in the visual system. He et al. argued that this happens because attention cannot be allocated to the crowded individual grating.

Poverty of momentarily accessible information. One basic tenet of Boolean map theory is that observers can access only one single feature at a time. How can we reconcile that with the commonsense observation that people see so many things simultaneously? Perhaps people do not really have access to as much visual information as they are tempted to suppose.

In studies of change detection with intervening delays sufficient to produce flicker, it has been found that people detect only a small proportion of changes that are introduced into temporally successive displays separated by intervals of about 100 ms or longer (Levin & Simons, 1997; Pashler, 1988b; Rensink, 2000a, 2000b; Rensink, O’Regan, & Clark, 1997, 2000). As many researchers have pointed out, this appears to challenge the ordinary phenomenology of a rich visual awareness (although the term change blindness appears overdramatized, given that four to five items are often successfully processed in change detection designs; Pashler, 1988b). Regardless, change detection performance could either underestimate or overestimate the amount of information perceived at any given instant because the task may depend upon a memory that could encompass more or less than what is instantaneously perceivable.

Figure 11. In Panel a, comparing the shapes (i.e., set of relative locations) of two patterns is at least partially parallel, whereas comparing the set of colors seems to require sequential checking from color to color, based on introspective report. Shuffling the colors around among locations, as in Panel b, does not seem to affect the ease with which the shapes can be matched. However, shuffling the locations around noticeably impairs the ability to verify that the set of colors appearing in the left display of Panel b is identical to the set of colors appearing on the right.

One recent account of change detection (Rensink, 2000a, 2000b) proposed that observers access only one object at one time. By contrast, Boolean map theory claims that observers can access only one Boolean map, not one object, at a time. There is no widely accepted definition of one object. However, it seems to us that the stimuli in Experiment 3 (see Figure 10a, a four-segment wheel) would usually be regarded as one object and the stimuli in Experiment 4 (see Figure 10b, a two-dot pattern) would usually be regarded as two objects. As shown above, different parts of the four-segment wheel have to be accessed sequentially, whereas the two-dot pattern can be accessed all at once. Therefore, it seems more plausible to suppose that it is the Boolean map, not the object, that corresponds to what can be accessed at one moment (see also Jiang, Chun, & Olson, 2004). Naturally, readers may or may not agree with our opinion on what constitutes an object, but in any case, the concept of Boolean map is at least a more clearly defined concept. In sum, change detection findings point up the remarkable poverty of the momentarily accessible visual information. However, views derived from change detection research about what can be accessed at one moment (an object) are open to question, whereas the view that what can be accessed at one moment is a labeled Boolean map seems more consistent with the (limited) existing data.

Feature Labels for Different Dimensions

Above, we discussed the proposition that there can only be one label per map for any one dimension (e.g., color). In the case of different feature dimensions (e.g., color and orientation), can there be a separate featural label for each dimension in a single labeled Boolean map? Concretely, this question amounts to whether observers can perceive features of different dimensions simultaneously (e.g., the redness and verticalness of an object). This question can be addressed by showing participants a few objects that can vary in two dimensions. If performance is worse when they have to report feature values in both dimensions, then that would indicate a difficulty in attaching two labels to the same Boolean map. The number of objects has to be very small and the perceptual task has to be very difficult (e.g., very brief displays) to ensure that what is measured is a perceptual limit instead of a limit of visual working memory. A few pertinent studies exist (e.g., D. A. Allport, 1971; Duncan, 1984; Magnussen & Greenlee, 1997; see also Egeth, 1966), and they agree in finding no competition between different dimensions. Thus, the answer to our question above appears to be in the affirmative: A Boolean map can have labels for different dimensions.

Given this specification, Boolean map theory predicts the now-classic finding of the same-object advantage (Duncan, 1984): Perceiving two features of one object is easier than perceiving two features that belong to different objects. The reason is that only one Boolean map has to be created in the first case, whereas two Boolean maps have to be created in the second case.

Boolean map theory claims that what is commonly called object-based attention is just another manifestation of the nature of Boolean maps. For two features to be simultaneously accessible, the crucial requirement is that they reside in the same region, not that they belong to one continuous whole. For example, Boolean map theory predicts that if two features belong to different regions of one object (e.g., Figure 10a), there will be competition between them even if they belong to the same object from the perspective of perceptual structure, as confirmed by Experiment 3.

Figure 12. A demonstration that depth is an exception to the “one feature level at one time” principle of access, as illustrated in Figure 11. The black spots in various depth levels can be accessed simultaneously, and an observer can perceive a pattern in three-dimensional space.


The Creation of a Boolean Map

In this section, we first clarify the two principles that have been proposed to characterize the creation of a Boolean map: the selection from one dimension and the application of Boolean operations. Then, we defend the tenet of feature-by-feature selection. Next, we defend the claim of the availability of Boolean operations, arguing that a selection relying upon more than one dimension is solved by a Boolean intersection operation and not by the use of a top-down salience map. Finally, we argue that all top-down control is mediated by creation of Boolean maps.

Selection From One Dimension: Selection Based on One Feature Value

Creating a Boolean map from one feature value within one dimension is the simplest and most fundamental way of creating a Boolean map.10 There seems to be no question that this is possible when the feature space is clearly defined (e.g., selecting all red objects from among green ones), but we should briefly describe what can be counted as a feature dimension. As noted, we borrow the concept of feature maps from feature-integration theory (Treisman & Gelade, 1980; Treisman & Gormican, 1988; Treisman & Sato, 1990) and the guided search model (Wolfe, 1994). There is a certain amount of ambiguity in those theories as to what counts as a feature. It is not our purpose to shed new light on this issue; on the basis of prior work, a reasonably complete list might be color, orientation, size and spatial frequency, motion, depth, and perhaps certain aspects of shape (e.g., curvature, closure, digit/letter identity). Location itself can also be used for selection; for example, selecting a ball from an array of identical objects would require generating a Boolean map based on a specification of location.

In addition to the standard feature maps, we assume that there is also a bottom-up salience map. This map represents locations that are inherently salient based on a fixed set of computations, for example, those extracting local discrepancies between a feature and the surround (e.g., Itti, Koch, & Niebur, 1998). One such example, which we discussed earlier, is selecting all objects present in the display. Also, in Figure 15, it is fairly easy to see the pattern of all the nonblue colors because they are a local minority. This cannot simply be based on a strategy of excluding one color: It seems extremely difficult to exclude one color in Figure 16. The concepts of salience and salience maps have often been used in the previous literature to refer to a map that combines information from various underlying feature maps in a way that reflects the individual’s momentary goals and task set (e.g., Cave & Wolfe, 1990; Wolfe, 1994; Wolfe, Cave, & Franzel, 1989). For example, if a person searches for a red vertical object among green vertical objects and red horizontal objects, some theorists postulate a salience map that combines the redness and verticalness into a single scalar to assess the overall relevance of an object. This concept is termed top-down salience map in this article. This distinction is important because Boolean map theory denies the existence of a top-down salience map, and one goal of the present article is to persuade the reader that this reasonable-sounding notion is in fact highly questionable.

Can Observers Simultaneously Select Two Feature Values?

This question about the possibility of simultaneous selection of two feature values is very fundamental. However, it has never been directly addressed. The most relevant evidence available from the visual search literature suggests that one can simultaneously select two features if they are linearly separable in color space from all other colors in the display (see Bauer et al., 1996, and D’Zmura, 1991, for evidence that this constraint operates in disjunctive search for a single color-based visual search target). However, this does not appear to govern performance in more general cases of attention-to-structure tasks. In Figure 16, it appears very difficult to select union (green, blue) or union (red, yellow), even though they are easily separated in color space from the remaining colors. So, it seems that one cannot divide the color space into two parts and select one of them; one really has to base his or her selection on one color at one time.

10 One may point out that previous findings of automatic capture of attention when observers lack any explicit intention (Yantis & Jonides, 1984) are inconsistent with our notion that a Boolean map is always created from one feature value. However, there may well be one or more default strategies for creating Boolean maps (Folk, Remington, & Johnston, 1992, 1993; see also Pashler & Harris, 2001; Yantis, 1993): for example, creating a Boolean map from the bottom-up salience dimension (favoring the most salient object; Yantis & Egeth, 1999; Yantis & Johnson, 1990). Also, there is some reason to doubt that attention capture is really automatic (Koshino, Warner, & Juola, 1992; Warner, Juola, & Koshino, 1990).

Figure 13. When a red and a green region abut each other, Boolean map theory predicts that selection of both regions entails a Boolean map that cannot support analysis of the shape of the contour that separates the two.


As mentioned earlier, selection by location differs from other forms of feature-by-feature selection. For one thing, location-based selection is not limited to a small spot but can be also effectively applied to a region (i.e., the objects in that region) even when these are more or less widely distributed in space. The exact principles governing this location-based selection should be studied in the future, but for present purposes, we need only assume that slightly more flexibility exists in this process.

The preceding discussion points up the fact that questions about attention to structure are not trivial corollaries of the sorts of questions posed in visual search studies. The phenomenon also has potential implications for practical fields like information visualization, where the question of what can be seen at a glance (in cases much like Figure 16) is a critical factor in designing new technologies for data exploration and presentation (Bertin, 1983; Ware, 2000). It should also be noted that the knowledge gleaned from the literature on texture segregation is insufficient to characterize the limitations governing perception of structure. A standard texture-segregation task (Julesz, 1981) involves determining when one (usually convex and regular) region can be perceptually separated from the rest of the pattern. In Figure 17, texture segregation of blue and green from red and yellow is apparently fast and effortless, in clear contrast to the general difficulty revealed in attention to structure with the same color choices. For data visualization, the situation of Figure 16 is clearly more relevant than the situation in Figure 17. Thus, examining tasks such as those in Figure 16, along with more standard texture-segregation tasks, should help us begin to achieve a more complete understanding of attention to structure.

Boolean Operations as the Only Way of Combining Different Dimensions

The purpose of this section is to argue that the strategy described by the second principle, combining the output of one-dimension selection with the present Boolean map in Boolean operations, is the only way that selection can be governed by information from more than one dimension. To put it another way, a Boolean map can be created only by giving one control value at a time to the FL (sometimes applied iteratively, so as to modify the current Boolean map).

Figure 14. Shape extraction and figure–ground segmentation. Even when relative depth cues are absent, as in these photographs, the difficulty in assigning the central contour (a ridge in Panel a and a crevice in Panel b) to determine the shape of both surrounding regions appears to persist, as Boolean map theory would predict. Photo in Figure 14a copyright 2007 by Declan McCullagh/mccullagh.org. Used with permission.

The Boolean operations: Intersection or union. Boolean map theory contends that people are able to apply certain Boolean operations to modify existing Boolean maps. Using the operation of union means creating a new Boolean map consisting of everything that either (a) is presently attended or (b) satisfies some newly specified property. Using the operation of intersection means creating a new Boolean map consisting of everything that both (a) is presently attended and (b) satisfies some newly specified property. It should be mentioned that postulating these Boolean operations does not contradict the claim that at any given instant, there is only one Boolean map underlying momentary visual awareness because the new Boolean map is assumed automatically to replace the old one.
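The two operations, and the assumption that each new map replaces the old one, can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: the class name `OneMapAttention`, the toy display, and the use of "shape" as a selectable dimension are hypothetical and are not part of the theory's formal statement.

```python
class OneMapAttention:
    """Toy model that holds exactly one Boolean map at a time."""

    def __init__(self, display):
        self.display = display      # location -> {dimension: value}
        self.current = frozenset()  # the single current Boolean map

    def _feature_map(self, dimension, value):
        # Locations whose value on one dimension matches the request.
        return frozenset(
            loc for loc, feats in self.display.items()
            if feats[dimension] == value
        )

    def select(self, dimension, value):
        # Create a map from one feature value in one dimension.
        self.current = self._feature_map(dimension, value)

    def intersect(self, dimension, value):
        # Everything presently attended AND satisfying the new property.
        self.current = self.current & self._feature_map(dimension, value)

    def union(self, dimension, value):
        # Everything presently attended OR satisfying the new property.
        self.current = self.current | self._feature_map(dimension, value)


# A made-up four-item display.
display = {
    (0, 0): {"color": "red", "shape": "circle"},
    (1, 0): {"color": "red", "shape": "cross"},
    (2, 0): {"color": "green", "shape": "circle"},
    (3, 0): {"color": "red", "shape": "circle"},
}

attention = OneMapAttention(display)
attention.select("color", "red")
red_items = attention.current
attention.intersect("shape", "circle")
red_circles = attention.current
attention.union("color", "green")
red_circles_or_green = attention.current

print(sorted(red_items), sorted(red_circles), sorted(red_circles_or_green))
```

Each call overwrites `current`, so only one map exists at any moment. The sketch treats union as always available; it does not attempt to model any limits on when a union can actually be carried out.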

This potential to use Boolean operations to govern conscious access to structured displays can be readily demonstrated, although interestingly, it has not to our knowledge been remarked upon previously. In Figure 18a, the reader is invited to start by seeing all the red items and then to select from these just the circles. We contend that this is achieved by first creating a Boolean map consisting of all the red items and, from there, utilizing intersection, as described above, to create a new Boolean map consisting of just the red circles. From this new Boolean map, one can effectively extract spatial information: For example, an observer sees that the red circles are arranged in a symmetric pattern, whereas, for example, the red crosses or squares are not. As we argue below, the task of conjunction search (Treisman & Gelade, 1980) is also accomplished by intersection; only, in this case, just one item is left after the application of the second step.

In Figure 18b, most observers report that they can (with some effort) select the union of the red pattern on the left and the coarse texture pattern on the right. Alternatively, one can select the union of the red pattern on the left and the fine texture pattern on the right. In each case, the spatial arrangement of the union can be vividly perceived. Tasks such as selecting a few from many identical objects also seem to be accomplished by iterative union operations. The union operation seems to have one striking restriction. As shown in Figure 16, a union operation cannot be used to select union (red, yellow) together. We speculate that the union operation is effective only when the two sets to be combined are not spatially intertwined. This is perhaps because the union operation is implemented by connecting two global shape descriptions into one. Naturally, two spatially intertwined patterns will make a completely new shape instead of a simple connection of the two individual shapes, so the union operation is foiled.

Certainly, we do not intend to suggest that the visual system cancreate Boolean maps of unlimited complexity through repetitiveBoolean operations of intersection and union. It appears that thereis some limit as to how long and elaborate a sequence is possible.Plausibly, this is because the sequence of Boolean operations usedto create one Boolean map must be stored in some machinery(which we might clumsily refer to as the Boolean map creatingprocedure, or BMCP). For example, green3 intersection verticalis stored in the BMCP if one selects vertical objects among thegreen objects, or (x1, y1)3 union (x2, y2)3 union (x3, y3) is storedif one selects three objects sequentially by location. Plausibly, thisBMCP is a form of working memory, with very limited capacity,therefore allowing only a few intersection and union operations tobe composed in series (e.g., pulling together arbitrary objects

Figure 15. A symmetric structure is readily detected when only a fewelements differ from a homogeneous surround. This is interpreted asevidence that a bottom-up salience computation based on local differencescan specify a Boolean map without any single unifying feature.

Figure 16. Selecting two colors simultaneously. Selecting the union ofthe green and the blue or the union of the red and the yellow is verydifficult—despite the fact that the first pair of colors is linearly separablefrom the second pair in CIE color space.

Figure 17. The compact blue-or-green square is readily segregated fromthe red-and-yellow surround in the classic perceptual task used to studytexture segregation. The contrast between this successful texture segrega-tion and the poor perception of structure seen in Figure 16 is argued todemonstrate that perception of structure, as discussed in the present article,should not be equated with texture segregation.


through repetitive union operations can evidently only encompass 4–5 of them).
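The idea of a capacity-limited procedure store can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than part of the theory's formal statement: the class name `BooleanMap`, the helper `select_by_location`, and in particular the constant `BMCP_CAPACITY = 5`, which is only loosely motivated by the remark that repeated unions encompass about 4–5 objects.

```python
from dataclasses import dataclass

Location = tuple[int, int]

# Hypothetical capacity of the Boolean map creating procedure (BMCP).
# The text only suggests that about 4-5 objects can be pulled together.
BMCP_CAPACITY = 5


@dataclass(frozen=True)
class BooleanMap:
    """A set of selected locations plus the procedure that produced it."""
    locations: frozenset[Location]
    procedure: tuple[str, ...] = ()

    def _combine(self, op, label, other_locations):
        steps = self.procedure + (label,)
        if len(steps) > BMCP_CAPACITY:
            raise OverflowError("procedure exceeds the assumed BMCP capacity")
        return BooleanMap(frozenset(op(self.locations, other_locations)), steps)

    def intersect(self, label, other_locations):
        return self._combine(frozenset.intersection, f"intersection {label}", other_locations)

    def union(self, label, other_locations):
        return self._combine(frozenset.union, f"union {label}", other_locations)


def select_by_location(points):
    """Select objects one at a time by location, as in (x1, y1) -> union (x2, y2) ..."""
    first, *rest = points
    bmap = BooleanMap(frozenset([first]), (str(first),))
    for point in rest:
        bmap = bmap.union(str(point), [point])
    return bmap


five = select_by_location([(0, 0), (1, 2), (3, 1), (4, 4), (2, 5)])
print(len(five.locations), five.procedure)
```

In this sketch, asking `select_by_location` for a sixth object raises `OverflowError`, which is one crude way to express a hard limit on the length of the stored sequence; a graded limit would be equally compatible with the text.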

Boolean operations and top-down salience map: Two competing accounts. Boolean map theory claims that if a judgment requires combining information from more than one dimension, a Boolean map can be created from only one single-feature dimension at a time, and selection based on a combination of two dimensions is accomplished by the intersection operation mentioned above. This entails a subset search strategy for the classic color–form conjunction search task of Treisman and Gelade (1980). For example, if one wants to find a red vertical target, a Boolean map must be created to mark all the red objects. Then, this Boolean map can be combined with information from an orientation map via the intersection operation to generate a new Boolean map that contains the vertical object among them.
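The two-step subset search described above can be sketched in a few lines of Python. The display contents and the helper names `feature_map` and `subset_search` are invented for illustration; the only point is that the second feature map is intersected with an existing Boolean map rather than summed with another feature map.

```python
# Each item: location -> (color, orientation). The display is made up.
display = {
    (0, 0): ("red", "horizontal"),
    (1, 0): ("green", "vertical"),
    (2, 0): ("red", "vertical"),      # the conjunction target
    (0, 1): ("green", "vertical"),
    (1, 1): ("red", "horizontal"),
    (2, 1): ("green", "horizontal"),
}


def feature_map(display, dimension, value):
    """Locations of all items carrying one feature value on one dimension."""
    index = {"color": 0, "orientation": 1}[dimension]
    return {loc for loc, feats in display.items() if feats[index] == value}


def subset_search(display, color, orientation):
    # Step 1: create a Boolean map from a single feature value.
    boolean_map = feature_map(display, "color", color)
    # Step 2: intersect the existing Boolean map with a second feature map.
    boolean_map &= feature_map(display, "orientation", orientation)
    return boolean_map


print(subset_search(display, "red", "vertical"))  # {(2, 0)}
```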

Various previous theories (notably, Wolfe, 1994) have assumed that information from different feature dimensions is combined into a master salience map. This salience map is a combination of a bottom-up salience map and a top-down salience map. As mentioned above, the present formulation agrees that the bottom-up salience map plays a role in selection. The crucial difference between bottom-up salience and top-down salience is that the former is not task specific and is not under voluntary control. Thus, our claim can also be understood in this way: There is a salience map in visual processing, but it is strictly bottom-up and cannot be altered by top-down control, for example, by instructions or voluntary goals.

The crucial difference between intersection (subset search) and top-down salience is that in the intersection operation, the feature map is combined with the preexisting Boolean map, not directly with another feature map. One might think the difference is trivial. However, as we show below, even if these two models make relatively similar predictions about standard conjunction search, they make starkly different predictions in other situations.

In fact, there has for some years been empirical evidence in the literature pointing to the existence of a subset search strategy in conjunction search (Egeth, Virzi, & Garbart, 1984; see also Kaptein, Theeuwes, & van der Heijden, 1995). Egeth and colleagues (1984) showed that in this task, search is mainly restricted to target-color-bearing distractors and that search latencies are hardly affected by the number of target-orientation-bearing distractors. However, their study had the weakness of not controlling the relative strength of the feature difference in the two dimensions. Therefore, a master top-down salience map could also potentially explain the findings of Egeth and colleagues by maintaining that the top-down salience is much greater for a target-color distractor than for a target-orientation distractor, so that attention is always guided there.

Subset search is a necessary assumption, whereas top-down salience is not. Additional support for the subset search strategy is seen with displays such as those in Figure 19. It is easy to find the orientation singleton among just the green bars. This must be done by selecting green items first. If features are combined together to form a top-down salience map, then orientation signals

Figure 18. a: Boolean intersection task: first, select all the red items, then see the subset that are Os. b: Boolean union task: first, select the red, then additionally see the coarse texture on the right. Unlike Figure 16, the nonoverlapping union supports Boolean mapping (albeit with some modest but apparent difficulty).

Figure 19. Subset strategy in conjunction search. Task: Find the orientation singleton among just the green bars (the orientation of this singleton being unspecified). Boolean map theory predicts this can readily be achieved, whereas some other theories appear to predict the opposite.


will not be useful because it is not known in advance which orientation should be boosted. Wolfe (1994; see also Friedman-Hill & Wolfe, 1995) admitted the necessity of a subset search in such displays; he argued that this was one option, with another option being combining feature information into a top-down salience map. However, the evidence that has been adduced in support of this latter strategy seems open to question.

One example of such putative evidence is the finding of better performance in triple-conjunction search (e.g., finding a target that is red and horizontal and large) as compared with double-conjunction search (Wolfe, Cave, & Franzel, 1989). This advantage is also naturally explained by Boolean map theory. In a typical triple-conjunction search task, creating a Boolean map from any one feature dimension excludes two thirds of the distractors. In a typical double-conjunction search task, creating a Boolean map from one feature dimension filters out only half of the distractors. Thus, there are 50% more relevant items in double-conjunction search than in triple-conjunction search. In addition, in triple-conjunction search the participant only has to exploit any two of the three relevant feature dimensions. This wider range of strategy choices could also aid performance. Moreover, the local featural difference between items is larger in the case of triple-conjunction search, and thus, bottom-up salience could contribute to the performance. Wolfe (1994) also argued for the master salience map on the basis of a comparison of the slopes for target-present versus target-absent trials. However, if one assumes that observers apply the subset search strategy serially for different parts of the display, these patterns can be explained.
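The "50% more relevant items" figure follows from simple arithmetic, which the short Python sketch below spells out. It assumes, as an idealization not stated in this exact form in the text, that each distractor type shares exactly one feature with the target and that all distractor types are equally numerous; the function name and the display size of 36 are hypothetical.

```python
from fractions import Fraction


def fraction_remaining(n_distractor_types):
    """Fraction of distractors left after a Boolean map on one dimension.

    Assumes equal numbers of each distractor type, each type sharing
    exactly one feature dimension with the target.
    """
    return Fraction(1, n_distractor_types)


double = fraction_remaining(2)  # half the distractors survive
triple = fraction_remaining(3)  # one third of the distractors survive
ratio = double / triple

n_distractors = 36  # arbitrary display size for a concrete example
print(double * n_distractors, triple * n_distractors, ratio)  # 18 12 3/2
```

With 36 distractors, 18 remain relevant in the double-conjunction case against 12 in the triple-conjunction case, a ratio of 3/2.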

Taken together, the data show that subset search is a necessary notion, whereas the top-down salience map is an unnecessary notion. This does not directly disprove the possibility of top-down salience computation, of course, but does undermine the idea on grounds of parsimony. Below, we present evidence that more directly challenges the existence of a top-down salience map.

Trading one dimension against another. If signals from various dimensions can be combined into the one single top-down salience map, it ought to be possible to arrange for one dimension to trade off against another in a visual selection task. Concretely, this means that observers should be able to simultaneously pull out objects that have a specific color or a specific shape by assigning positive salience weights to that color and that shape. Is this in fact possible? When looking at Figure 20, observers seem to agree it is extremely difficult to perceive the pattern of union (red circles, green crosses), that is, disjunction. On the other hand, this difficulty does not happen simply because the red–green difference and circle–cross difference are not large enough, as it is fairly easy to perceive the pattern of (green circles), that is, conjunction. If top-down determination of salience is the underlying mechanism used to accomplish such selections, the salience difference between these two subsets ought to be the same regardless of which one is to be selected: In one case, green and circle would be boosted; in the other case, red and cross would be boosted. Therefore, the idea of a top-down salience map cannot account for the marked difference in the difficulty of selecting these two subsets. Boolean map theory, on the other hand, offers an explanation of this difference: The selection of (green circles) is carried out through intersection, whereas the selection of union (red circles, green crosses) is difficult because the union operation is not effective when the two sets intertwine. One may note, of course, that the selected set is homogeneous in the case of conjunction but heterogeneous in the case of disjunction and argue that this explains why the task is more difficult in the case of disjunction than conjunction. However, the very fact that homogeneity or heterogeneity of the actual feature values (and not just the description that governs the selection) makes such a big difference proves that the selection is implemented in a way that depends upon the actual feature values present in the display. There is no reason this should be the case according to the top-down salience map hypothesis. Even if the original feature values are heterogeneous in the case of disjunction, the top-down salience map should have combined them into one single scalar, and thus, the success of selection should be comparable for disjunction and conjunction.
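The symmetry argument can be checked with a toy additive salience model in Python. This is a sketch under stated assumptions: the display is taken to contain three item types (green circles, red circles, green crosses), which is inferred from the description of Figure 20 rather than given explicitly, and the unit weights and the function names `salience` and `separation` are invented.

```python
# Assumed item types in a display like Figure 20 (inferred, not confirmed).
ITEM_TYPES = [("green", "circle"), ("red", "circle"), ("green", "cross")]


def salience(item, weights):
    """Additive top-down salience: sum of the weights of an item's features."""
    return sum(weights.get(feature, 0) for feature in item)


def separation(selected, weights):
    """Gap between the least salient selected type and the most salient other type."""
    chosen = [salience(item, weights) for item in ITEM_TYPES if item in selected]
    others = [salience(item, weights) for item in ITEM_TYPES if item not in selected]
    return min(chosen) - max(others)


# Conjunction: boost green and circle to pull out the green circles.
conjunction_gap = separation({("green", "circle")}, {"green": 1, "circle": 1})

# Disjunction: boost red and cross to pull out red circles and green crosses.
disjunction_gap = separation(
    {("red", "circle"), ("green", "cross")}, {"red": 1, "cross": 1}
)

print(conjunction_gap, disjunction_gap)  # 1 1
```

Under this additive scheme the two selections are separated from the remaining items by the same margin, so the model by itself gives no reason for the disjunction to be harder than the conjunction.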

Similarly, Figure 21 illustrates an example in which the targets are defined as elements with a high combined size–luminance value (compared with the distractors). Both size and luminance vary in four levels, so the targets do not have any particular predetermined size or luminance values that can be searched for. However, if size and luminance signals could be added to generate one single top-down salience measure, the task should be relatively easy. By way of analogy, suppose one is reviewing files of graduate school applicants and wishes to prescreen by two factors: GRE scores and grade point average (GPA). The most convenient way to do this task is to compute a composite index (analogous to top-down salience) and rank them accordingly. If one is not able to do that, one may have to look for particular combinations (e.g., GPA > 3.0 & GRE > 1400; GPA > 3.3 & GRE > 1300; GPA > 2.7 & GRE > 1500), which is likely to be a very laborious process. Going back to the visual search task of Figure 21, if the top-down

Figure 20. It is easy to perceive the pattern of green circles, whereas it is extremely difficult to perceive the union of the red circles and green crosses. A top-down salience map, if it exists, should allow one to separate these two subsets equally well regardless of which one is selected. This challenges the view that the binding problem is solved by a master top-down salience map.


salience map allows one to search for a high combined size–luminance value, then the task should be relatively easy. If no such mechanism exists, however, one is forced to search for the particular pairs of size and luminance one after another, and the task should be very laborious. In the bottom panel of Figure 21, there are three targets; our informal observations suggest that observers do indeed find the task very difficult. We conclude, therefore, that when observers perform a conjunction search, they really have to search for a particular conjunction of feature values, not an abstract size–luminance value without concern for the concrete values of size and luminance.
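The contrast between a composite index and a search for particular pairs can be made concrete with a small Python sketch. The four integer levels, the cutoff `THRESHOLD = 7`, and both function names are assumptions chosen for illustration; they are not the stimulus values used in Figure 21.

```python
from itertools import product

LEVELS = range(1, 5)  # four levels each of size and luminance (arbitrary units)
THRESHOLD = 7         # hypothetical cutoff for a "high combined" value


def is_target_composite(size, luminance):
    """One comparison on a single summed scalar (the top-down salience idea)."""
    return size + luminance >= THRESHOLD


# Without a composite, the same target set must be listed pair by pair.
target_pairs = [
    (size, lum) for size, lum in product(LEVELS, LEVELS) if size + lum >= THRESHOLD
]


def is_target_enumerated(size, luminance):
    """Check each concrete (size, luminance) conjunction in turn."""
    return any((size, luminance) == pair for pair in target_pairs)


print(target_pairs)  # [(3, 4), (4, 3), (4, 4)]
```

Both functions pick out the same items; the difference lies in the number of separate checks, which grows with the number of qualifying pairs in the enumerated version and stays at one in the composite version.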

All these phenomena make good sense if there simply is no top-down salience map that can be used to allow two dimensions to trade off against each other. Of course, one might propose additional restrictions on the top-down salience map to account for these results. For example, one might assume that the top-down salience map works only in the case of conjunction and not in the case of disjunction. This would make the notion of top-down salience map very similar (for the purposes of this task) to the intersection operation in Boolean map theory (see also Treisman & Sato, 1990, for a revised version of feature-integration theory that has something in common with this view). The only remaining crucial difference is that such an account would propose that red–vertical targets are directly pulled out with no intermediate step, whereas Boolean map theory predicts that one has to first select everything red or everything vertical. We try to distinguish these two possibilities below.

Special difficulty for conjunction in brief displays. If a top-down salience map allows a conjunction to be directly selected much as is a feature, albeit less effectively, then the time course of conjunction search should be very similar to an inefficient feature search of similar difficulty. However, Boolean map theory predicts that the time course of conjunction search should have a distinctive property. The reason for this is that in conjunction search, the chief difficulty of the task lies in the fact that two Boolean maps have to be created successively. Therefore, the time course of conjunction search in a brief exposure should have a spoon shape, with performance increasing extremely slowly at the very beginning but then increasing sharply after a certain point. The reason is as follows: Conjunction search is a two-step process, in which the first step (creating the first Boolean map) provides almost no useful guidance for selecting a response in the task. On the other hand, even a difficult feature search is a one-step process in which useful information might be expected to emerge from the very outset (albeit slowly). By way of an analogy, a young person graduating from high school often has the option of going to college for 4 years and then finding a job. The 4 college years do not directly provide any income (analogous to the first step in conjunction search). The individual can also go directly to work, typically providing an immediate but smaller income (analogous to the difficult feature search). Thus, if one measures the accumulated income versus number of years after high school, the direct-to-work strategy (i.e., difficult feature search) yields greater benefits for the first years, but eventually the college strategy (i.e., conjunction search) will outstrip it. Naturally, the turning point varies from trial to trial, smoothing out any sharp elbows in the average data. Nevertheless, if one compares the time courses of conjunction search and an inefficient feature search of appropriate difficulty, one would expect a crossover interaction, with the performance of the conjunction search being worse at the very beginning but ultimately exceeding performance of an inefficient feature search.11 Experiment 5 confirms that the predicted crossover interaction can in fact be observed (see Figure 22). A top-down salience map, or any other mechanism that assumes the selection of a conjunction is functionally similar to but only less efficient than the selection of a feature, would not seem to predict this pattern.
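The predicted crossover can be illustrated with a toy Python model of the two time courses. All parameter values (rates, the 150 ms delay, chance level, ceiling) and the function names are invented for illustration and are not fitted to Experiment 5; the sketch shows only that a delayed but steeper process must cross a slow process that starts immediately.

```python
def feature_accuracy(t_ms, rate=0.0006, chance=0.5, ceiling=0.9):
    """One-step process: accuracy starts rising slowly from display onset."""
    return min(ceiling, chance + rate * t_ms)


def conjunction_accuracy(t_ms, delay_ms=150, rate=0.002, chance=0.5, ceiling=0.9):
    """Two-step process: nothing useful until the first Boolean map exists,
    then a faster rise (all values here are hypothetical)."""
    return min(ceiling, chance + rate * max(0, t_ms - delay_ms))


def crossover(durations):
    """First sampled duration at which conjunction search beats feature search."""
    for t_ms in durations:
        if conjunction_accuracy(t_ms) > feature_accuracy(t_ms):
            return t_ms
    return None


for t_ms in range(0, 501, 100):
    print(t_ms, round(feature_accuracy(t_ms), 2), round(conjunction_accuracy(t_ms), 2))
```

A single-step account of conjunction search would correspond to setting `delay_ms=0` with a lower rate, in which case the two curves would differ in slope but would not cross in this way.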

Recent studies on feature-based attention (Moore & Egeth, 1998; Shih & Sperling, 1996) also indicate that signals from different feature dimensions cannot be combined in very brief displays.12 For example, in Moore and Egeth (1998), participants searched for a digit in a display composed of equal numbers of red and green letters. In the constant condition, the color of the target was constant and known to participants. In the random condition, the color of the target was randomly specified on each trial. Not surprisingly, performance in the constant condition was better. If signals from different feature dimensions are combined into a top-down salience map, then this advantage presumably comes from giving some weight to the informative color dimension when

11 In the experiment described here, to match the overall performance, the feature search was deliberately chosen to involve much lower discriminability than the conjunction search, so the time course could be compared meaningfully. The classic finding of steeper slopes for conjunction search pertains to experiments in which the same feature values are used in both tasks.

12 Moore and Egeth (1998) did not interpret their result in this way. Their main purpose was to show that color-based attention does not facilitate the sensory processing of the assigned color directly but gives priority in visual search to the items of the assigned color.

Figure 21. The top panel defines targets and distractors. The reader is invited to search for these targets in the larger bottom panel. If a flexible top-down salience map can be constructed on the fly, then it should be possible to find targets defined as elements with high combined size–luminance values relative to distractors.


computing the overall top-down salience. For example, suppose a red letter has a value of 10, the red digit has a value of 20, and a green letter has a value of 0, each with some noise. This would obviously help the red digit win in the competition.

If so, one would naturally expect the advantage of knowledge about target color to occur even with very brief displays. However, Moore and Egeth (1998) found that performance in the constant condition and in the random condition was identical for very brief displays. One may say that the signal reflecting the color difference was too weak in very brief displays to guide attention effectively. However, the signal from the digit–letter identity difference was strong enough to produce reasonable performance (about 0.75). Thus it is implausible that color difference, which was obviously a much stronger signal in their experiment than the signal from digit–letter identity difference, would be too weak to make any difference at all. On the other hand, Boolean map theory explains the results naturally: There was not enough time with the brief displays to accomplish the two-step sequence described. Selection on the color dimension was simply abandoned because it alone was not sufficient to locate the target.

Dissociation between top-down salience and bottom-up salience. Various published data on the distinction between what have been termed singleton detection mode and feature detection mode (e.g., Bacon & Egeth, 1994) also raise problems for the concept of a top-down salience map. In most proposals for a master salience map, top-down salience and bottom-up salience are assumed jointly to determine the tendency of an object to attract attention. However, previous studies on the distinction between feature search mode and singleton search mode suggested that these two types of signals are processed in a fundamentally different way. Pashler (1988a; see also Theeuwes, 1992) showed that search for a singleton is disrupted by the presence of another irrelevant singleton. Bacon and Egeth (1994) convincingly demonstrated that this happens only when the visual search is executed in a singleton detection mode (looking for something that is different from the rest); when search is performed in feature detection mode (e.g., looking for red), the irrelevant singleton does not significantly disrupt performance. We interpret this as indicating that when participants are in singleton detection mode, the Boolean map is created from the bottom-up salience map, so an irrelevant singleton will disrupt search. However, when an observer is in feature detection mode, the Boolean map is created from the relevant feature map, and thus, the irrelevant singleton will not disrupt performance. If top-down salience and bottom-up salience are always combined into one map, then even feature search should always be disrupted by the presence of another irrelevant singleton. Naturally, one can assume that the top-down salience map and the bottom-up salience map are functionally dissociated from each other. Such a modification, however, throws further doubt on the plausibility of top-down salience.

All Top-Down Control Is Ultimately Attributable to Selection

We have conceptualized selection as involving the creation of a Boolean map, but that is still one step from warranting the claim that all top-down control is ultimately attributed to creation of a Boolean map: After all, there could be top-down control other than selection. We argue that such top-down controls are all ultimately mediated by creation of Boolean maps.13

One important case to consider is change of perceptual structure (i.e., imposing a different percept on the same stimuli). For example, switching from the vase to two faces is one well-known example. Boolean map theory predicts that one can only change perceptual structure by creating an appropriate Boolean map (i.e., imposing one or another perceptual structure is implemented by allocating spatial attention in one way or another, not through any separate mechanism; see Slotnick & Yantis, 2005; Tsal & Kolbet, 1985), but it does not necessarily dictate what available strategy will be used in any given instance. A natural idea would be that creating a Boolean map encompassing an element that is more meaningful in the desired perceptual structure but less meaningful in the present perceptual structure tends to force a switch to the desired structure.14

In the case of the face–vase demonstration, the pattern can be perceived as one vase or two faces placed on a background. The important fact is that if the display is now organized as two faces on a background, then the center region (vase-shaped) is not one element of such organization. Attending to that region forces the perceptual structure to switch the other way, in which the center red region is indeed an element. Therefore, selecting the vase-shaped region tends to make that region be figure, and vice versa. This is supported by empirical evidence (e.g., Vecera, Flevaris, & Filapek, 2004).

The same point seems to be true in the case of other reversible pictures. In the classic young woman–old woman reversible figure, the display is organized differently when it is viewed as old or young. If one wants to switch, for example, from the young woman to the old woman, one attends to the region that would be the face of the old woman but is the face and neck of the young woman (face evidently being a more meaningful part than face and neck).

13 There are other studies viewed as instances of top-down control. It seems to us these studies can be viewed as involving either switching between different strategies for the creation of a Boolean map (e.g., attentional set effect for spatial frequency: Davis & Graham, 1981; Davis, Kramer, & Graham, 1983) or selection in the competition for postperceptual processes (Maruff, Danckert, Camplin, & Currie, 1999; Remington & Folk, 2001). Therefore, they do not contradict our claim.

14 By using the word meaningful, we suggest that one part is functionally independent and functionally important. This ultimately implies a likelihood measurement from a Bayesian perspective.

[Figure 22 plot: accuracy (proportion correct, 0.5 to 0.9) versus stimulus exposure duration (0 to 500 ms), with separate curves for inefficient feature search and conjunction search.]

Figure 22. Proportion correct as a function of exposure duration of a search display (followed by mask) in Experiment 5. This experiment compared a color–form conjunction search and a relatively difficult feature search. Boolean map theory entails that two Boolean maps must be created in sequence to perform conjunction search, whereas in feature search, this is not necessary. Therefore, the conjunction search should take a much longer time to initiate. Thus, the theory predicts that a crossover interaction may occur for appropriate difficulty levels, as the results confirmed.

Similarly, with a Necker cube, attending to one location or another tends to force the cube to be perceived as one way or another (Kawabata, Yamagami, & Noaki, 1978; see Figure 23). This probably reflects the effect of creating a Boolean map to select part of the cube (e.g., surfaces, edges, or corners). For example, attending to the green surface in Figure 23b seems to have the opposite effect from attending to the green corner in Figure 23c. Therefore, the effect cannot be adequately explained by shifting fixation to one area or another because attending to both a green surface and a green corner would then involve shifting to the bottom left part of the cube. Selecting part of the cube (e.g., edges, corners, surfaces) with spatial attention (i.e., Boolean map) must be the crucial factor here. It would appear that the underlying rule may be that whatever part is selected by a Boolean map tends to be perceived as nearer to the observer (front side). One may surmise that an element in the front side is probably more meaningful than an element in the back side. Therefore it is also consistent with the notion that creating a Boolean map that is a more meaningful element of the desired perceptual structure but is less meaningful in the present perceptual structure tends to force the perceptual structure to switch to the desired structure.

This account might make reasonable ecological sense. If the pre-Boolean map vision achieved two equally likely perceptual structures and there was doubt which one was the correct interpretation of the external world, top-down control might select one of these two interpretations. If a Boolean map is created for one part that is more meaningful in Structure 1 but less meaningful in Structure 2, then the visual system might reasonably take that as a vote for Structure 1.15

In sum, in various reversible figures, imposing a new structure seems to depend on moving spatial attention (i.e., creating a Boolean map) to the pertinent part of the object, supporting our claim that top-down control is achieved by creation of a Boolean map.

15 This top-down influence probably has a noticeable effect only in the case of ambiguous perceptual structures, and this analysis should not be taken to imply that top-down control is either sufficient (i.e., that it can always decide the perceptual structure) or necessary (i.e., that reversals of perceptual structure do not occur spontaneously).

Figure 23. In a Necker cube, attending to one location or another (red vs. green dot) tends to force the cube to be perceived in one way or another (Panel a). This probably reflects the effect of creating a Boolean map to select part of the cube (e.g., surfaces, edges, or corners) instead of a spotlight covering the general region. Attending to the green surface (Panel b) seems often to have the opposite effect from attending to the green corner (Panel c).


Relation to Previous Theories

At the outset of this article, we remarked upon some differences between Boolean map theory and prior theories. The next section compares Boolean map theory with several previously proposed theories, briefly highlighting some of the similarities and differences. From the perspective of Boolean map theory, theories of visual attention can be roughly divided into two categories: theories illuminating selection and theories dealing with conscious access.

Theories Illuminating Selection

Boolean map theory obviously owes a great debt to feature-integration theory (Treisman & Gelade, 1980), chiefly but not only with respect to Treisman’s fundamental insight that the visual system has a severe problem in maintaining multiple associations between different spatial and featural representations. This, of course, is something that is often called the binding problem in the broadest sense. We have already described the two fundamental disagreements between Boolean map theory and feature-integration theory: First, Boolean map theory holds that only one feature value can be accessed at one instant, whereas feature-integration theory holds that multiple feature values are simultaneously available. This contradiction is due to a lack of explicit distinction between selection and access. The experiments presented by feature-integration theory prove only that feature values are simultaneously available for the mechanism of selection, in the sense that all of them are ready to be called upon. They are, however, as we have shown, not simultaneously available for conscious access in the sense that even if all of the feature values are ready to be called upon, only one can be actually pulled out. Second, Boolean map theory holds that a feature value is always accessed together with its location value(s), whereas feature-integration theory holds that a feature value can be accessed without its location value. On both issues, empirical evidence favoring Boolean map theory has been discussed. On the other hand, Boolean map theory also readily explains the original findings that feature-integration theory was constructed to explain. For example, in the case of the conjunction–feature search distinction, Boolean map theory predicts that conjunction search should generally be more difficult than feature search because an extra intersection operation is necessary.

The guided search model (Wolfe, 1994) is a major alternative to feature-integration theory proposed to address a similar set of questions. The guided search model incorporates a number of ideas that have both interesting resemblances to and points of contrast with Boolean map theory. Like Boolean map theory, guided search views selection as inherently location based. On the other hand, the guided search model shares the same ambiguity about access as feature-integration theory and therefore says little about tasks other than visual search. As for the strategy of exploiting different dimensions in selection, guided search claims that multiple separate feature dimensions can be summed into one scalar array (salience map), which in turn determines the allocation of visual attention. Boolean map theory denies that a simultaneous weighting of different dimensions is possible.

Boolean map theory also makes a strong claim that selection can be on only one feature value at one time. This question is not formally discussed by the major previous theories illuminating selection and appears to be a novel feature of the Boolean map theory.

Theories About Access

The main theories that explicitly discuss access are the theory of target–target competition (Duncan, 1980a, 1980b) and the biased competition model (Desimone & Duncan, 1995). Duncan (1980a, 1980b) addressed access very explicitly and clearly. Boolean map theory follows Duncan in postulating a highly limited capacity of visual conscious access. The major advance to which Boolean map theory aspires is in providing a formal specification of constraints on access. Duncan (1980a, 1980b) stated that access is subject to severe capacity limitations, whereas Boolean map theory seeks to characterize these limitations in the language of data structures.

The proposal of object-based attention (Duncan, 1984) highlights another side of visual attention, which feature-integration theory and other theories have failed to address. The major finding in Duncan’s studies is that feature values of different dimensions from the same object can be simultaneously accessed, showing little competition with each other. Moreover, studies of object-based attention have shown that attention cannot be viewed as a convex spotlight that moves from one location to another. Both conclusions follow directly from Boolean map theory. The major advance to be hoped from Boolean map theory here is that the theory offers a more explicit notion of what it means for features to be within an object, a notion that differs from conventional interpretations of objecthood (Scholl, 2001; Scholl & Pylyshyn, 1999; Scholl, Pylyshyn, & Feldman, 2001). When two features belong to different parts of a continuous perceptual whole, some may consider them to be within the same object. Boolean map theory instead claims two features can be accessed without competition only when they reside in the same spatial region. As shown earlier in this article (Experiments 3 and 4), this prediction of Boolean map theory receives empirical support.

The idea that observers can extract only surprisingly impoverished information from a glance at a scene is in some sense quite old: It is implied by the most widely discussed theories of visual attention (Broadbent, 1958; Duncan, 1980a, 1980b; Treisman & Gelade, 1980). However, the idea has been reinforced by the observation that people do a very poor job of detecting changes introduced over offsets of about 100 ms or longer whether the displays consist of arrays of familiar characters (Pashler, 1988b) or pictures of natural scenes (e.g., Rensink, 2000a, 2000b). A number of researchers, notably Rensink et al. (1997, 2000), have argued on this basis for the proposition that the conscious awareness of scenes is impoverished in content. Boolean map theory echoes this observation but places it within a perspective that also makes sense of the many ways in which people can discern relations holding between widely scattered elements or objects in a scene. Some writings inspired by change blindness would seem to imply that human visual experience is akin to the experience of looking at the world through a long cardboard tube. However, people demonstrably have a capacity for visual perception of structure that does not fit with such a view (indeed, many of the tasks described earlier simply could not be performed to any reasonable degree by a creature limited in this way). Boolean map theory offers a description of the representational content of human visual experience that may conform more closely to the peculiar combination of strengths and weaknesses that characterizes human perceptual abilities. The theory entails massive parallelism and enormous flexibility but also some striking representational limitations.

Boolean Map Theory Offers a Potential Unification of Previous Disparate Theories

We have shown that Boolean map theory provides a unified framework in which to view target–target competition (Duncan, 1980a, 1980b), feature integration (Treisman & Gelade, 1980), object-based attention (Duncan, 1984), and change blindness (Pashler, 1988b; Rensink, 2000a, 2000b). What makes this set of phenomena an interesting group of ideas to try to unify is the fact that they are usually viewed in connection with opposing theoretical ideas. Above, we have discussed how an explicit distinction between selection and access explains the apparent contradiction between the parallel feature processing in feature-integration theory and the strong representational constraints that (we contend) govern conscious access to features. For another example, the difficulty of conjunction search is typically explained by assuming that features are not automatically bound together, whereas the same-object advantage is taken as evidence that features are automatically bound together into perceptual objects. Likewise, this puzzling inconsistency is due to the lack of an explicit distinction between selection and access. Attention is necessary for binding in the sense that feature values from different dimensions cannot be directly combined before they are exploited by the mechanism of selection. However, they can nonetheless be represented simultaneously in a consciously accessed Boolean map (see also Nieuwenstein, Chun, van der Lubbe, & Hooge, 2005, for an interesting report on how a selection–access distinction can shed light on attentional blink). Boolean map theory provides a unified account that reconciles these findings.

General Discussion

We turn now to some conjectures and discussions that, though not integral to Boolean map theory, illustrate the potential of the approach to raise new questions and shed new light on some old and new issues in vision and visual cognition.

Visual Working Memory

It has long been recognized that visual working memory is closely connected to visual perception (Kosslyn, 1980; Phillips, 1974). Strictly speaking, Boolean map theory does not necessarily predict anything about visual working memory per se, but it would seem natural if Boolean map theory shed light upon the limitations and character of visual working memory. It has long been noted that the term memory does not necessarily entail a mechanism whose sole (or even primary) function is to retain information per se (A. Allport, 1980). Any mechanism that can be used to represent information over time can be called memory, and one way to retain spatial information over a short period would be to maintain a Boolean map to select those locations.

The apparent linkage between visual spatial working memory and visual attention proposed by Awh and Jonides (2001) would seem very compatible with this conjecture. These investigators found enhanced processing of new visual signals arriving at locations being maintained in a spatial working memory task (Awh, Jonides, & Reuter-Lorenz, 1998). On the other hand, moving attention away from memorized locations impaired memory for those locations (Smyth, 1996; Smyth & Scholey, 1994). Furthermore, it has been reported that visual spatial memory loads significantly disrupt visual search (Han & Kim, 2004; Woodman & Luck, 2004). These observations all suggest that maintaining visual spatial memory is at least partly accomplished by directing visual attention in an appropriate fashion (Awh & Jonides, 2001). This finding is obviously congenial to Boolean map theory with its contention that the distribution of spatial attention is data for spatial analysis.

What about visual (as against spatial) working memory? Visual working memory as usually assessed must include not only spatial information but also featural information (Phillips, 1974). Are the contents of visual working memory limited to one Boolean map? One map would not be sufficient to maintain anything more than the spatial distribution of a single-feature value. Performance levels in visual working memory tasks often show that people can maintain somewhat, but not a great deal, more information than that (e.g., Stefurak & Boynton, 1986; Wheeler & Treisman, 2002), and thus, visual working memory cannot be limited to just one Boolean map. One possibility is that the content of a few Boolean maps might be retained.¹⁶ This seems potentially consistent with the finding of Jiang, Olson, and Chun (2000), who showed that visual working memory is organized around spatial configuration and that this configuration is in some ways more primitive than feature information. For example, memory for a feature of one item is disrupted by a change in the location of other items, whereas a change in the features of other items has little effect on the memory of the location. Jiang and colleagues also speculated that items of the same color are probably represented together (see also Kanizsa, 1979). These observations would all be consistent with the idea that visual working memory is represented as a collection of several Boolean maps.

Object-Based Attention

Object-based attention has been an important topic in recent research on visual attention. One question that has been hotly debated is whether object-based attention is always mediated by spatial attention (as grouped array theory contends; see Kramer, Weber, & Watson, 1997) or not (as the spatially invariant account claims; see Vecera & Farah, 1994). We believe part of the ambiguity surrounding this question is due to the lack of an explicit distinction between selection and access. If the question is asked about access, Boolean map theory holds that all visual information (featural information, object identity, etc.) is obligatorily indexed by locations and that the person must also access these locations when the features are accessed. On this point, the current view is in agreement with grouped array theory. On the other hand, if the question is posed with respect to selection, a Boolean map can be created from various types of nonspatial cues, so, from this perspective, attention can be nonspatial. Plainly speaking, spatial locations are always part of what is selected, but they are not always part of the cue to select.

¹⁶ This statement is not in contradiction to the earlier statement that there is only one Boolean map at any instant. Here, we merely imply that visual working memory is organized in a format similar to that of the Boolean map. These “Boolean maps” do not provide conscious access.

Cuing and Processing Optimization

The term visual attention is often used to encompass two potentially distinct concepts. One of these relates to selection of visual information for conscious access, which is the focus of Boolean map theory. The other concept is that one act of selection may influence subsequent selection (and processing) of other visual information in close spatial proximity (e.g., cuing improves performance, Posner, 1980, or more directly, perceiving a particular object improves the perception of subsequent objects in the same location, Kim & Cave, 1995), something we term processing optimization.

Two examples may help clarify the difference between the Boolean map and processing optimization. First consider a cuing paradigm. An initial display is composed of one red dot and one green dot (both briefly presented), followed by a display showing two characters (one digit and one letter). Suppose the digit is usually in the location of the red dot and the letter is usually in the location of the green dot (with this reversed in a small percentage of trials, say, 10%). Assume further that the observer’s task is to make some judgment about the digit (e.g., is it odd or even?). In this situation, it would be common to regard the performance advantage in digit responses when the digit follows the red dot (cued) rather than when it follows the green dot (miscued) as a measure of attention to the red dot. This would correspond well to the concept of processing optimization. The Boolean map, on the other hand, corresponds to the mere fact that the red dot is seen. Therefore, if such an experiment revealed no difference as a function of whether the digit was cued versus miscued, one might, following conventional practice, conclude that attention was not paid to the red dot. This would be correct only with respect to processing optimization. However, if the observer saw the red dot (e.g., if the observer could state where the red dot was), then, according to the present theory, a Boolean map must have been created to encompass this dot, and in that sense, attention must have been paid to it.

In the above example of cuing, processing optimization corresponds to the conventional concept of attention. On the other hand, in the attentional tracking paradigm, participants attempt to track a few balls from a group of other, usually physically identical balls. By the conventions of the attentional tracking literature, attention to the tracked balls is said to be reflected in an ability to discriminate those attended balls from the others. This corresponds to the Boolean map. Processing optimization, on the other hand, would have to be measured otherwise. Suppose 10 characters suddenly appear in the location of 10 balls (one in each): If the characters in the positions formerly occupied by balls that were being tracked are perceived better, this would reflect processing optimization.

Equating attention with the Boolean map (i.e., selection and access) seems to be in line with some, but by no means all, prior usage (e.g., Duncan, 1980a, 1980b, 1984; Intriligator & Cavanagh, 2001; Lu & Sperling, 1995; Pylyshyn & Storm, 1988; Treisman & Gelade, 1980; Ullman, 1984). On the other hand, a large literature on visual attention relating to the cuing paradigm (e.g., Downing, 1988; Eriksen & St. James, 1986; Gobell, Tseng, & Sperling, 2004; Posner, 1980; Posner, Snyder, & Davidson, 1980; Yeshurun & Carrasco, 1998) has focused on processing optimization.

Though these aspects of attention seem clearly separable, to our knowledge, their general distinction has not been made explicitly in the literature, potentially causing some confusion. For example, although it is obvious that observers can perceive the spatial relationship between two spatially separate objects without bringing in objects between them, it has been considered a nonobvious question to ask whether attention can be paid to two spatially separate objects (e.g., Awh & Pashler, 2000; Bichot, Cave, & Pashler, 1999; Hahn & Kramer, 1998; Kramer & Hahn, 1995; McMains & Somers, 2004). Clearly, selection (and access) is what is at issue in the first case, whereas processing optimization is the topic in the second case. Also, in cuing paradigms, the unattended target is often detected only slightly more slowly and/or less accurately than the attended target. However, in many discussions, attention has been claimed to be a prerequisite for access to any visual information whatsoever (e.g., Rensink, 2000a, 2000b). Again, it appears that attention can only mean processing optimization in the first case and selection (and access) in the second case.

The present article is devoted exclusively to developing an account of selection and access, and for present purposes, we merely note that there is an important conceptual distinction between selection–access and processing optimization. Uncovering the empirical differences between the two (and we suspect there may be several important differences relating to time course, spatial precision, and other attributes) is a topic for future discussion and investigation.

Eye Movements and Boolean Map Theory

Attention and eye movements have a close functional relationship (for reviews, see Hoffman, 1998; Rayner, 1998). For example, a shift of visual attention to the target of an upcoming saccade seems to be required before the saccade can commence (Deubel & Schneider, 1996; Hoffman & Subramaniam, 1995). Consequently, as Liversedge and Findlay (2000) pointed out, efforts to model visual search that disregard the pattern of eye movements occurring during these tasks are probably passing up a potentially useful source of empirical constraints (see also Findlay, 1997; Scialfa & Joffe, 1998; Williams & Reingold, 2001; Zelinsky, 1996). For the same reason, exploring the pattern of eye movements occurring in the kinds of perception-of-structure tasks examined here might well shed light on the ideas proposed in the current article. One very basic question to be examined in that regard is whether eye fixations directly reflect the spatial distribution of the Boolean map hypothesized in this article. For example, when an observer attempts to select both objects in Figure 3, does the eye tend to fixate upon something like the centroid of the Boolean map (even when, as in the figure, this is not even part of the selected region)? If that or some other relatively direct relationship generally holds, then eye movements are likely to prove very useful in exploring and testing the current approach and some of the tasks examined here.
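The centroid conjecture can be made concrete with a small sketch. The Python fragment below builds a hypothetical Boolean map from two separated 3 × 3 objects on a grid (the coordinates and the helper name `centroid` are assumptions for illustration, not stimuli or code from the article) and shows that the centroid of the selected cells can fall in the unselected gap between them.

```python
def centroid(locations):
    """Mean (x, y) position of the selected cells."""
    n = len(locations)
    return (sum(x for x, _ in locations) / n,
            sum(y for _, y in locations) / n)


# Two hypothetical objects, each a 3 x 3 block of cells, with a gap between them.
left_object = {(x, y) for x in range(0, 3) for y in range(0, 3)}
right_object = {(x, y) for x in range(8, 11) for y in range(0, 3)}
selected = left_object | right_object   # Boolean map covering both objects

cx, cy = centroid(selected)
# The cell at the centroid lies between the objects and is not itself selected.
centroid_is_selected = (round(cx), round(cy)) in selected
```

An eye-movement study of the kind suggested above would, under these assumptions, compare fixation positions against the computed centroid rather than against either object alone.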

Perceptual Grouping

Boolean map theory may also pose a challenge to some widely assumed interpretations of classic grouping phenomena. In Figure 24, the display can be organized as columns according to Gestaltists’ proximity grouping principle or as rows according to the similarity grouping principle. It is conventionally assumed that similarity grouping means this: Each item is linked together with any neighboring elements that are very similar to it (Kubovy, Holcombe, & Wagemans, 1998; Kubovy & Wagemans, 1995). However, is this really what happens? In the case of proximity grouping (vertical organization), all five columns do indeed seem to be simultaneously perceivable. However, in the case of similarity grouping, most people report that they can see the three rows of crosses or the three rows of balls, but not all six rows at the same time. If one tries to select all six, then the rows turn into columns. All of these observations can be made sense of in the following way. Suppose there is no general process of similarity grouping in the sense of various objects around a display becoming linked with other objects on the basis of various kinds of shared features. Instead, in line with Boolean map theory, suppose that similarity-based grouping is achievable only through a two-step process: (a) forming a Boolean map composed of one subset of stimuli of a certain type (e.g., all the crosses) and (b) applying proximity grouping to the elements that are represented in the map. Within the Boolean map of this display, items are closer to their horizontal neighbors than their vertical neighbors, and thus, one (but not both) of the two horizontal three-stripe arrangements should become perceptually available. We suspect that future research can probably uncover objective ways of testing the kind of hierarchical iterative grouping that we are hypothesizing.
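The two-step account can be sketched in a few lines of Python. The layout below is an assumed stand-in for the display in Figure 24 (six rows by five columns, alternating rows of crosses and balls, with rows spaced more tightly than columns); the spacing values and the function names `select` and `proximity_grouping` are hypothetical, and the nearest-gap rule is only a crude proxy for proximity grouping.

```python
H_GAP, V_GAP = 1.5, 1.0   # assumed spacing: rows are closer together than columns

# Six rows by five columns; even rows hold crosses, odd rows hold balls.
display = [((col * H_GAP, row * V_GAP), "cross" if row % 2 == 0 else "ball")
           for row in range(6) for col in range(5)]


def select(items, shape):
    """Step (a): form a Boolean map holding only the locations of one shape."""
    return {pos for pos, s in items if s == shape}


def proximity_grouping(points):
    """Step (b): group along whichever direction has the nearer neighbors."""
    dx = min(abs(a[0] - b[0]) for a in points for b in points
             if a != b and a[1] == b[1])   # nearest horizontal neighbor
    dy = min(abs(a[1] - b[1]) for a in points for b in points
             if a != b and a[0] == b[0])   # nearest vertical neighbor
    return "columns" if dy < dx else "rows"


everything = {pos for pos, _ in display}
whole_display = proximity_grouping(everything)
crosses_only = proximity_grouping(select(display, "cross"))
balls_only = proximity_grouping(select(display, "ball"))
```

With these assumed spacings, the full display groups into columns, while either single-shape map groups into rows, because removing the alternate rows doubles the vertical gap inside the map.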

Neural Underpinnings of the Boolean Map

In recent years, many researchers have been seeking to identify neural correlates or underpinnings of visual attention (e.g., Boynton, 2005; Bundesen, Habekost, & Kyllingsbaek, 2005; Gandhi, Heeger, & Boynton, 1999; Kastner, De Weerd, Desimone, & Ungerleider, 1998; O’Craven, Downing, & Kanwisher, 1999; Yantis et al., 2002). Boolean map theory is formulated at an algorithmic level, and thus, it does not directly imply any particular neural implementation. Nevertheless, some potential connections seem worth exploring.

The access constraints posited by Boolean map theory might fit quite well with a strikingly simple (one might even say crudely simple) interpretation of the idea of distinct “what?” and “where?” pathways (Mishkin, Ungerleider, & Macko, 1983), combined with a very simple idea about how binding across multiple specialized brain areas might be achieved. Imagine, for example, that there are topographic maps in the dorsal visual processing stream (chiefly in the parietal lobe) that represent the location of attended inputs, and suppose that a complex but stable pattern of activity there can potentially represent a complex configuration of multiple spatial locations. Coexisting with such a pattern of activity, suppose that in the ventral stream, there are numerous specific areas that represent different featural properties of the attended inputs (e.g., what color? what orientation? etc.) and that a stable pattern of activity in any one of these areas can represent one (but only one) specific value along that dimension. Given these premises, if one hypothesizes that a conscious visual percept consists of a set of stable states within both the dorsal and the ventral streams, the combination of the two would have precisely the representational capacity postulated for the Boolean map data structure, if one assumes that there are no mechanisms for binding beyond co-occurrence of these states. It seems at least conceivable, therefore, that the linkages between Boolean map theory and visual neurophysiology might turn out to be more straightforward than the linkages have been for other theories of visual attention. This point is offered merely as a conjecture; a detailed analysis of how Boolean map theory might be implemented in the brain is obviously a large task going far beyond the scope of the present article.

Concluding Comments

The present article has outlined a novel approach to visual attention and visual awareness. The two basic principles of Boolean map theory, stated in the theory section above, are not repeated here. In addition to offering a potential unification of some distinct (and, in a few cases, seemingly contradictory) aspects of visual attention, the present theory has several distinctive features, a number of which invite further investigation:

1. The theory proposes a representational format for visual conscious access (single-feature–multiple-locations data structure). Previous theorizing on limitations in access has shown that visual conscious access is capacity limited (Duncan, 1980a, 1980b), but (as far as we know) the present article offers the first effort to describe a testable set of representational constraints that might govern this access. The analysis seems strong and counterintuitive in its contention that only one feature per dimension can be accessed at one instant and that merely seeing a simple red–green object (see Figure 10a) requires sequential construction of two representations.

2. In this article, we have proposed a rather different conception of the principles of selection than has been proposed before, especially with regard to the exploitation of information from different dimensions and based on different feature values. Unlike the question of conscious access, the mechanisms of selection have been the subject of extensive theory, from feature-integration theory to the guided search model, and the topic has inspired a large body of research. The potential reinterpretation of these phenomena proposed here suggests many new lines of investigation.

3. We have advocated (and provided some examples of) the use of novel tasks designed to demand attention to structure. This may draw new empirical attention to a broad topic that has been neglected because of what, in our view, has been an overly narrow focus on visual search. As a concomitant benefit, such research may promote the development of richer connections between the study of visual attention and the rapidly developing field of information visualization (see Ware, 2000, for an excellent overview of this field, with a focus on connections to perceptual science).

Figure 24. A probe of similarity grouping. Proximity grouping would favor a five-vertical-stripe organization, whereas similarity grouping, as traditionally understood, favors a six-horizontal-stripe organization. However, it seems that one can select either three horizontal stripes of balls or three horizontal stripes of crosses. Attempting to select both will obligatorily turn the organization into vertical stripes. We infer that similarity grouping operates by selecting one subset (e.g., all the crosses) to include in a Boolean map; proximity grouping is then applied to this map.

References

Allport, A. (1980). Patterns and actions: Cognitive mechanisms are content-specific. In G. Claxton (Ed.), Cognitive psychology: New directions (pp. 26–64). London: Routledge & Kegan Paul.

Allport, D. A. (1971). Parallel encoding within and between elementary stimulus dimensions. Perception & Psychophysics, 10, 104–108.

Awh, E., & Jonides, J. (2001). Overlapping mechanisms of attention and spatial working memory. Trends in Cognitive Sciences, 5, 119–126.

Awh, E., Jonides, J., & Reuter-Lorenz, P. A. (1998). Rehearsal in spatial working memory. Journal of Experimental Psychology: Human Perception and Performance, 24, 780–790.

Awh, E., & Pashler, H. (2000). Evidence for split attentional foci. Journal of Experimental Psychology: Human Perception and Performance, 26, 834–846.

Bacon, W. F., & Egeth, H. E. (1994). Overriding stimulus-driven attentional capture. Perception & Psychophysics, 55, 485–496.

Barlow, H. B., & Reeves, B. C. (1979). Versatility and absolute efficiency of detecting mirror symmetry in random dot displays. Vision Research, 19, 783–793.

Bauer, B., Jolicoeur, P., & Cowan, W. B. (1996). Visual search for colour targets that are or are not linearly separable from distractors. Vision Research, 36, 1439–1466.

Bertin, J. (1983). Semiology of graphics. Madison: University of Wisconsin Press.

Bichot, N. P., Cave, K. R., & Pashler, H. (1999). Visual selection mediated by location: Feature-based selection of noncontiguous locations. Perception & Psychophysics, 61, 403–423.

Boynton, G. (2005). Attention and visual perception. Current Opinion in Neurobiology, 15, 465–469.

Broadbent, D. E. (1958). Perception and communication. London: Pergamon Press.

Bundesen, C. (1990). A theory of visual attention. Psychological Review, 97, 523–547.

Bundesen, C., Habekost, T., & Kyllingsbaek, S. (2005). A neural theory of visual attention: Bridging cognition and neurophysiology. Psychological Review, 112, 291–328.

Cavanagh, P., Labianca, A. T., & Thornton, I. M. (2001). Attention-based visual routines: Sprites. Cognition, 80, 47–60.

Cave, K. R., & Wolfe, J. M. (1990). Modeling the role of parallel processing in visual search. Cognitive Psychology, 22, 225–271.

Chong, S. C., & Treisman, A. (2003). Representation of statistical properties. Vision Research, 43, 393–404.

Chong, S. C., & Treisman, A. (2005). Attentional spread in the statistical processing of visual displays. Perception & Psychophysics, 67, 1–13.

Davis, E. T., & Graham, N. (1981). Spatial frequency uncertainty effects in the detection of sinusoidal gratings. Vision Research, 21, 705–712.

Davis, E. T., Kramer, P., & Graham, N. (1983). Uncertainty about spatial frequency, spatial position, or contrast of visual patterns. Perception & Psychophysics, 33, 20–28.

Dennett, D. C. (1991). Consciousness explained. Boston: Little, Brown.

Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.

Deubel, H., & Schneider, W. X. (1996). Saccade target selection and object recognition: Evidence for a common attentional mechanism. Vision Research, 36, 1827–1837.

Downing, C. J. (1988). Expectancy and visual spatial attention: Effects on perceptual quality. Journal of Experimental Psychology: Human Perception and Performance, 14, 188–202.

Driver, J., & Baylis, G. C. (1989). Movement and visual attention: The spotlight metaphor breaks down. Journal of Experimental Psychology: Human Perception and Performance, 15, 448–456.

Driver, J., & Vuilleumier, P. (2001). Perceptual awareness and its loss in unilateral neglect and extinction. Cognition, 79, 39–88.

Duncan, J. (1980a). Demonstration of capacity limitation. Cognitive Psychology, 12, 75–96.

Duncan, J. (1980b). The locus of interference in the perception of simultaneous stimuli. Psychological Review, 87, 272–300.

Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental Psychology: General, 113, 501–517.

Duncan, J., & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychological Review, 96, 433–458.

D’Zmura, M. (1991). Color in visual search. Vision Research, 31, 951–966.

Egeth, H. E. (1966). Parallel versus serial processes in multidimensional stimulus discrimination. Perception & Psychophysics, 1, 245–252.

Egeth, H. E., Virzi, R. A., & Garbart, H. (1984). Searching for conjunctively defined targets. Journal of Experimental Psychology: Human Perception and Performance, 10, 32–39.

Eriksen, C. W., & St. James, J. D. (1986). Visual attention within and around the field of focal attention: A zoom lens model. Perception & Psychophysics, 40, 225–240.

Findlay, J. M. (1997). Saccade target selection during visual search. Vision Research, 37, 617–631.

Folk, C. L., Remington, R. W., & Johnston, J. C. (1992). Involuntary covert orienting is contingent on attentional control settings. Journal of Experimental Psychology: Human Perception and Performance, 18, 1030–1044.

Folk, C. L., Remington, R. W., & Johnston, J. C. (1993). Contingent attentional capture: A reply to Yantis (1993). Journal of Experimental Psychology: Human Perception and Performance, 19, 682–685.

Friedman-Hill, S., & Wolfe, J. M. (1995). Second-order parallel processing: Visual search for the odd item in a subset. Journal of Experimental Psychology: Human Perception and Performance, 21, 531–551.

Gandhi, S. P., Heeger, D. J., & Boynton, G. M. (1999). Spatial attention affects brain activity in human primary visual cortex. Proceedings of the National Academy of Sciences, USA, 96, 3314–3319.

Gobell, J. L., Tseng, C. H., & Sperling, G. (2004). The spatial distribution of visual attention. Vision Research, 44, 1273–1296.

Hahn, S., & Kramer, A. F. (1998). Further evidence for the division of attention among non-contiguous locations. Visual Cognition, 5, 217–256.

Han, S. H., & Kim, M. S. (2004). Visual search does not remain efficient when executive working memory is working. Psychological Science, 15, 623–628.

He, S., Cavanagh, P., & Intriligator, J. (1996, September 26). Attentional resolution and the locus of visual awareness. Nature, 383, 334–337.

Hoffman, J. E. (1979). A two-stage model of visual search. Perception & Psychophysics, 25, 319–327.

Hoffman, J. E. (1998). Visual attention and eye movements. In H. Pashler (Ed.), Attention (pp. 119–154). London: University College London Press.

Hoffman, J. E., & Subramaniam, B. (1995). The role of visual attention in saccadic eye movements. Perception & Psychophysics, 57, 787–795.

Howard, C., & Holcombe, A. O. (2007). Progressively poorer perceptual precision and progressively greater perceptual lag: Tracking the changing features of one, two, and four objects. Manuscript submitted for publication.

Huang, L., & Pashler, H. (2002). Symmetry detection and visual attention: A “binary-map” hypothesis. Vision Research, 42, 1421–1430.

Intriligator, J., & Cavanagh, P. (2001). The spatial resolution of visual attention. Cognitive Psychology, 43, 171–216.

Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 1254–1259.

Jiang, Y. H., Chun, M. M., & Olson, I. R. (2004). Perceptual grouping in change detection. Perception & Psychophysics, 66, 446–453.

Jiang, Y. H., Olson, I. R., & Chun, M. M. (2000). Organization of visual short-term memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 683–702.

Johnston, J. C., & Pashler, H. (1990). Close binding of identity and location in visual feature perception. Journal of Experimental Psychology: Human Perception and Performance, 16, 843–856.

Julesz, B. (1981, March 12). Textons, the elements of texture perception, and their interactions. Nature, 290, 91–97.

Kahneman, D., & Treisman, A. (1984). Changing views of attention and automaticity. In R. Parasuraman & D. A. Davis (Eds.), Varieties of attention (pp. 29–62). New York: Academic Press.

Kahneman, D., Treisman, A., & Gibbs, B. J. (1992). The reviewing of object files: Object-specific integration of information. Cognitive Psychology, 24, 175–219.

Kanizsa, G. (1979). Organization in vision: Essays on gestalt perception. New York: Praeger Publishers.

Kaptein, N. A., Theeuwes, J., & van der Heijden, A. H. C. (1995). Search for a conjunctively defined target can be selectively limited to a color-defined subset of elements. Journal of Experimental Psychology: Human Perception and Performance, 21, 1053–1069.

Kastner, S., De Weerd, P., Desimone, R., & Ungerleider, L. G. (1998, October 12). Mechanisms of directed attention in the human extrastriate cortex as revealed by functional MRI. Science, 282, 108–111.

Kawabata, N., Yamagami, K., & Noaki, M. (1978). Visual fixation points and depth perception. Vision Research, 18, 853–854.

Kim, M. S., & Cave, K. R. (1995). Spatial attention in visual search for features and feature conjunctions. Psychological Science, 6, 376–380.

Koenderink, J. J., & van Doorn, A. J. (1991). Affine structure from motion. Journal of the Optical Society of America: Optics, Image Science, and Vision, 8(A), 377–385.

Koshino, H., Warner, C. B., & Juola, J. F. (1992). Relative effectiveness of central, peripheral, and abrupt-onset cues in visual attention. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 45(A), 609–631.

Kosslyn, S. M. (1980). Image and mind. Cambridge, MA: Harvard University Press.

Kramer, A. F., & Hahn, S. (1995). Splitting the beam: Distribution of attention over noncontiguous regions of the visual field. Psychological Science, 6, 381–386.

Kramer, A. F., Weber, T. A., & Watson, S. E. (1997). Object-based attentional selection—Grouped arrays or spatially invariant representations? Comment on Vecera and Farah (1994). Journal of Experimental Psychology: General, 126, 3–13.

Kubovy, M. (1981). Psychology Survey, Vol. 2—Connolly, K. Contemporary Psychology, 26, 471–472.

Kubovy, M. (1988). Should we resist the seductiveness of the

space:time::vision:audition analogy? Journal of Experimental Psychol-ogy: Human Perception and Performance, 14, 318–320.

Kubovy, M., Holcombe, A. O., & Wagemans, J. (1998). On the lawfulness of grouping by proximity. Cognitive Psychology, 35, 71–98.

Kubovy, M., & Van Valkenburg, D. (2001). Auditory and visual objects. Cognition, 80, 97–126.

Kubovy, M., & Wagemans, J. (1995). Grouping by proximity and multistability in dot lattices: A quantitative gestalt theory. Psychological Science, 6, 225–234.

LaBerge, D., & Brown, V. (1989). Theory of attentional operations in shape identification. Psychological Review, 96, 101–124.

Levin, D. T., & Simons, D. J. (1997). Failure to detect changes to attended objects in motion pictures. Psychonomic Bulletin & Review, 4, 501–506.

Liversedge, S. P., & Findlay, J. M. (2000). Saccadic eye movements and cognition. Trends in Cognitive Science, 4, 6–14.

Lu, Z. L., & Sperling, G. (1995). The functional architecture of human visual motion perception. Vision Research, 35, 2697–2722.

Magnussen, S., & Greenlee, M. W. (1997). Competition and sharing of processing resources in visual discrimination. Journal of Experimental Psychology: Human Perception and Performance, 23, 1603–1616.

Maruff, P., Danckert, J., Camplin, G., & Currie, J. (1999). Behavioral goals constrain the selection of visual information. Psychological Science, 10, 522–525.

McMains, S. A., & Somers, D. C. (2004). Multiple spotlights of attentional selection in human visual cortex. Neuron, 42, 677–686.

Mishkin, M., Ungerleider, L. G., & Macko, K. A. (1983). Object vision and spatial vision: Two cortical pathways. Trends in Neurosciences, 6, 414–417.

Moore, C. M., & Egeth, H. (1998). How does feature-based attention affect visual processing? Journal of Experimental Psychology: Human Perception and Performance, 24, 1296–1310.

Morales, D., & Pashler, H. (1999, May 13). No role for colour in symmetry perception. Nature, 399, 115–116.

Navon, D. (1977). Forest before trees: Precedence of global features in visual perception. Cognitive Psychology, 9, 353–383.

Nieuwenstein, M. R., Chun, M. M., van der Lubbe, R. H. J., & Hooge, I. T. C. (2005). Delayed attentional engagement in the attentional blink. Journal of Experimental Psychology: Human Perception and Performance, 31, 1463–1475.

O’Craven, K. M., Downing, P. E., & Kanwisher, N. (1999, October 7). fMRI evidence for objects as the units of attentional selection. Nature, 401, 584–587.

O’Regan, J. K., & Noe, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24, 939–1031.

Palmer, S. E., & Hemenway, K. (1978). Orientation and symmetry: Effects of multiple, rotational, and near symmetries. Journal of Experimental Psychology: Human Perception and Performance, 4, 691–702.

Pashler, H. (1988a). Cross-dimensional interaction and texture segregation. Perception & Psychophysics, 43, 307–318.

Pashler, H. (1988b). Familiarity and visual change detection. Perception & Psychophysics, 44, 369–378.

Pashler, H. (1998). The psychology of attention. Cambridge, MA: MIT Press.

Pashler, H., & Harris, C. R. (2001). Spontaneous allocation of visual attention: Dominant role of uniqueness. Psychonomic Bulletin & Review, 8, 747–752.

Peterson, M. A., de Gelder, B., Rapcsak, S. Z., Gerhardstein, P. C., & Bachoud-Levi, A. C. (2000). Object memory effects on figure assignment: Conscious object recognition is not necessary or sufficient. Vision Research, 40, 1549–1567.

Phillips, W. A. (1974). Distinction between sensory storage and short-term visual memory. Perception & Psychophysics, 16, 283–290.

Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3–25.

Posner, M. I., Snyder, C. R. R., & Davidson, B. J. (1980). Attention and the detection of signals. Journal of Experimental Psychology: General, 109, 160–174.

Pylyshyn, Z. (1989). The role of location indexes in spatial perception: A sketch of the FINST spatial-index model. Cognition, 32, 65–97.

Pylyshyn, Z. W. (2004). Some puzzling findings in multiple object tracking: I. Tracking without keeping track of object identities. Visual Cognition, 11, 801–822.

Pylyshyn, Z. W., & Storm, R. W. (1988). Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vision, 3, 1–19.

Quinlan, P. T. (2003). Visual feature integration theory: Past, present, and future. Psychological Bulletin, 129, 643–673.

Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.

Remington, R. W., & Folk, C. L. (2001). A dissociation between attention and selection. Psychological Science, 12, 511–515.

Rensink, R. A. (2000a). The dynamic representation of scenes. Visual Cognition, 7, 17–42.

Rensink, R. A. (2000b). Seeing, sensing, and scrutinizing. Vision Research, 40, 1469–1487.

Rensink, R. A., O’Regan, J. K., & Clark, J. J. (1997). To see or not to see: The need for attention to perceive changes in scenes. Psychological Science, 8, 368–373.

Rensink, R. A., O’Regan, J. K., & Clark, J. J. (2000). On the failure to detect changes in scenes across brief interruptions. Visual Cognition, 7, 127–145.

Rock, I. (1983). The logic of perception. Cambridge, MA: MIT Press.

Rubin, E. (2001). Figure and ground. In S. Yantis (Ed.), Visual perception: Essential readings (pp. 225–229). Philadelphia: Psychology Press. (Original work published 1921)

Saiki, J. (2003). Feature binding in object-file representations of multiple moving items. Journal of Vision, 3, 6–21.

Scholl, B. J. (2001). Objects and attention: The state of the art. Cognition, 80, 1–46.

Scholl, B. J., & Pylyshyn, Z. W. (1999). Tracking multiple items through occlusion: Clues to visual objecthood. Cognitive Psychology, 38, 259–290.

Scholl, B. J., Pylyshyn, Z. W., & Feldman, J. (2001). What is a visual object? Evidence from target merging in multiple object tracking. Cognition, 80, 159–177.

Scholl, B. J., Pylyshyn, Z. W., & Franconeri, S. L. (2007). The relationship between property-encoding and object-based attention: Evidence from multiple object tracking. Manuscript submitted for publication.

Scialfa, C. T., & Joffe, K. M. (1998). Response times and eye movements in feature and conjunction search as a function of target eccentricity. Perception & Psychophysics, 1067–1082.

Shiffrin, R. M., & Gardner, G. T. (1972). Visual processing capacity and attentional control. Journal of Experimental Psychology, 93, 72–82.

Shih, S. I., & Sperling, G. (1996). Is there feature-based attentional selection in visual search? Journal of Experimental Psychology: Human Perception and Performance, 22, 758–779.

Slotnick, S. D., & Yantis, S. (2005). Common neural substrates for the control and effects of visual attention and perceptual bistability. Cognitive Brain Research, 24, 97–108.

Smyth, M. M. (1996). Interference with rehearsal in spatial working memory in the absence of eye movements. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 49(A), 940–949.

Smyth, M. M., & Scholey, K. A. (1994). Interference in immediate spatial memory. Memory & Cognition, 22, 1–13.

Stefurak, D. L., & Boynton, R. M. (1986). Independence of memory for categorically different colors and shapes. Perception & Psychophysics, 39, 164–174.

Theeuwes, J. (1992). Perceptual selectivity for color and form. Perception & Psychophysics, 51, 599–606.

Treisman, A., & Gormican, S. (1988). Feature analysis in early vision: Evidence from search asymmetries. Psychological Review, 95, 15–48.

Treisman, A., & Sato, S. (1990). Conjunction search revisited. Journal of Experimental Psychology: Human Perception and Performance, 16, 459–478.

Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12, 97–136.

Tsal, Y., & Kolbet, L. (1985). Disambiguating ambiguous figures by selective attention. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 37(A), 25–37.

Ullman, S. (1984). Visual routines. Cognition, 18, 97–159.

van der Helm, P. A., & Leeuwenberg, E. L. J. (1996). Goodness of visual regularities: A nontransformational approach. Psychological Review, 103, 429–456.

Vecera, S. P., & Farah, M. J. (1994). Does visual attention select objects or locations? Journal of Experimental Psychology: General, 123, 146–160.

Vecera, S. P., Flevaris, A. V., & Filapek, J. C. (2004). Exogenous spatial attention influences figure-ground assignment. Psychological Science, 15, 20–26.

Ware, C. (2000). Information visualization: Perception for design. San Francisco: Morgan Kaufmann.

Warner, C. B., Juola, J. F., & Koshino, H. (1990). Voluntary allocation versus automatic capture of visual attention. Perception & Psychophysics, 48, 243–251.

Watson, A. B., & Robson, J. G. (1981). Discrimination at threshold: Labeled detectors in human vision. Vision Research, 21, 1115–1122.

Watson, D. G., Maylor, E. A., & Bruce, L. A. M. (2005). The efficiency of feature-based subitization and counting. Journal of Experimental Psychology: Human Perception and Performance, 31, 1449–1462.

Wheeler, M. E., & Treisman, A. M. (2002). Binding in short-term visual memory. Journal of Experimental Psychology: General, 131, 48–64.

Williams, D. E., & Reingold, E. M. (2001). Preattentive guidance of eye movements during triple conjunction search tasks: The effects of feature discriminability and saccadic amplitude. Psychonomic Bulletin & Review, 8, 476–488.

Wolfe, J. M. (1994). Guided search 2.0: A revised model of visual search. Psychonomic Bulletin & Review, 1, 202–238.

Wolfe, J. M., Cave, K. R., & Franzel, S. L. (1989). Guided search: An alternative to the feature integration model for visual search. Journal of Experimental Psychology: Human Perception and Performance, 15, 419–433.

Woodman, G. F., & Luck, S. J. (2004). Visual search is slowed when visuospatial working memory is occupied. Psychonomic Bulletin & Review, 11, 269–274.

Yantis, S. (1992). Multielement visual tracking: Attention and perceptual organization. Cognitive Psychology, 24, 295–340.

Yantis, S. (1993). Stimulus-driven attentional capture and attentional control settings. Journal of Experimental Psychology: Human Perception and Performance, 19, 676–681.

Yantis, S., & Egeth, H. E. (1999). On the distinction between visual salience and stimulus-driven attentional capture. Journal of Experimental Psychology: Human Perception and Performance, 25, 661–676.

Yantis, S., & Johnson, D. N. (1990). Mechanisms of attentional priority. Journal of Experimental Psychology: Human Perception and Performance, 16, 812–825.

Yantis, S., & Jonides, J. (1984). Abrupt visual onsets and selective attention: Evidence from visual search. Journal of Experimental Psychology: Human Perception and Performance, 10, 601–621.

Yantis, S., Schwarzbach, J., Serences, J. T., Carlson, R. L., Steinmetz, M. A., Pekar, J. J., & Courtney, S. M. (2002). Transient neural activity in human parietal cortex during spatial attention shifts. Nature Neuroscience, 5, 995–1002.

Yeshurun, Y., & Carrasco, M. (1998, November 5). Attention improves or impairs visual performance by enhancing spatial resolution. Nature, 396, 72–75.

Zelinsky, G. (1996). Using eye saccades to assess the selectivity of search movements. Vision Research, 36, 2177–2187.

(Appendix follows)

Appendix

Experiments

General Method

Participants

Undergraduate students from the University of California, San Diego, participated in this project. All participants had normal or corrected-to-normal vision. There were 9 participants in Experiment 1, 8 participants in Experiment 2, 18 participants in Experiment 3, 16 participants in Experiment 4, and 16 participants in Experiment 5.

Apparatus

Stimuli were presented on a 1,024 pixel × 768 pixel color monitor. Participants viewed the displays from a distance of about 60 cm and entered responses using the keyboard. The program was written in Microsoft Visual Basic 6.0 and was run in Microsoft Windows XP, using timing routines tested with the Blackbox Toolkit.

Procedure

Each trial began with a small white fixation cross presented for 400 ms in the center of the screen. After a short blank interval (400 ms), the stimuli were presented. Participants made the appropriate decision and responded by pressing one of two adjacent keys (j and k) with fingers of the right hand after all stimuli had been presented (in some experiments, there was more than one frame). In Experiments 1–2, the stimuli remained in the display until response, and participants were asked to respond as accurately and quickly as possible. In Experiments 3–5, the stimuli were masked (details below), and participants were asked to respond as accurately as possible (unspeeded response). A tone sounded to indicate whether the response was correct, and the next trial began 400 ms later.

Experiments 1–2: Repetition and Rotation of Color Patterns

Method

Sample displays in the ABBA and ABCD conditions from Experiment 1 are shown above in Figure 7. In Experiment 1 (repetition experiment), two 4 × 4 color patterns were presented against a gray background. They were each 5.6 cm × 5.6 cm (each square 1.4 cm × 1.4 cm). The distance between their centers was 8.4 cm (gap = 2.8 cm). We used the four colors red, green, blue, and yellow. The two patterns might be identical or different, each with a probability of 50%. Participants responded by pressing one of two keys (j for different, k for identical). When the patterns were identical, the four colors were each used four times. When they were different, two positions with different colors were randomly chosen from the display. There were two types of display. In the ABBA condition, the two changed squares switched color with each other. In the ABCD condition, the two changed squares were each changed to a new color. For example (see Figure 7), suppose that two positions were chosen and that they were red and green. In the ABBA condition, the red square was changed to green, and the green square was changed to red. In the ABCD condition, the red square was changed to yellow, and the green square was changed to blue.

In Experiment 2 (rotation experiment), everything was the same as in Experiment 1, except that the right pattern was rotated 90° clockwise or anticlockwise and the participants judged whether the patterns were identical before the rotation. The direction of rotation was constant across a whole block and alternated from block to block. The direction of rotation was explicitly given before each block.

In both the repetition experiment and the rotation experiment, participants finished five blocks (one block had 100 trials in the repetition experiment and 50 trials in the rotation experiment). The first block was excluded from data analysis as practice.

Results

The logic of this method is as follows: In the ABCD condition, the Boolean map of any color will suffice to reveal that the two patterns are different. In the ABBA condition, only the Boolean maps of the two swapped colors will reveal that the two patterns are different. Therefore, if participants are serially scanning from one color to another, they need to check only one color to detect a difference in the ABCD condition, whereas in the ABBA condition they sometimes have to check two or even three colors. Thus, response times should be longer in the ABBA condition. If, instead, participants are comparing the entire patterns in parallel, the ABBA and ABCD conditions contain equivalent amounts of difference (two changed squares each) and should yield equivalent performance.
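The scanning logic can be illustrated with a minimal Python sketch. It is not the authors' experimental code (which was written in Visual Basic); the function names, the flat 16-cell representation of each 4 × 4 pattern, the rule that ABCD assigns the two unused colors one each, and the random order in which colors are visited are all assumptions made for illustration. For each color, the sketch compares the set of positions holding that color in the two patterns (that color's Boolean map) and counts how many colors are visited before a mismatch is found.

```python
import random

COLORS = ["red", "green", "blue", "yellow"]


def make_base_pattern(rng):
    """Return a 4 x 4 pattern as a flat list of 16 cells, each color used four times."""
    cells = COLORS * 4
    rng.shuffle(cells)
    return cells


def make_changed_pattern(base, condition, rng):
    """Change two differently colored cells following the ABBA or ABCD rule."""
    while True:
        i, j = rng.sample(range(16), 2)
        if base[i] != base[j]:
            break
    changed = list(base)
    if condition == "ABBA":
        # The two chosen squares swap colors with each other.
        changed[i], changed[j] = base[j], base[i]
    elif condition == "ABCD":
        # Assumption: each chosen square takes one of the two colors not involved.
        unused = [c for c in COLORS if c not in (base[i], base[j])]
        rng.shuffle(unused)
        changed[i], changed[j] = unused
    else:
        raise ValueError(f"unknown condition: {condition}")
    return changed


def boolean_map(pattern, color):
    """Set of cell indices that hold the given color."""
    return frozenset(k for k, c in enumerate(pattern) if c == color)


def colors_checked(left, right, rng):
    """Colors visited, in random order, until a mismatching Boolean map is found."""
    order = list(COLORS)
    rng.shuffle(order)
    for n, color in enumerate(order, start=1):
        if boolean_map(left, color) != boolean_map(right, color):
            return n
    return len(order)  # every map matched: the patterns are identical


if __name__ == "__main__":
    rng = random.Random(0)
    base = make_base_pattern(rng)
    for condition in ("ABBA", "ABCD"):
        other = make_changed_pattern(base, condition, rng)
        print(condition, colors_checked(base, other, rng))
```

Under these assumptions, an ABCD change alters the maps of all four colors, so the first color visited always reveals the difference, whereas an ABBA change alters only two maps, so one, two, or three colors may have to be visited.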

The response time of the ABBA condition was longer than that of the ABCD condition in both the repetition experiment (2,100 ms vs. 1,905 ms, difference = 195 ms), F(1, 8) = 11.96, p < .01, and the rotation experiment (3,463 ms vs. 2,659 ms, difference = 804 ms), F(1, 7) = 16.76, p < .01. Consistently, the error rates of the ABBA condition were also higher in both the repetition experiment (7.7% vs. 5.3%), F(1, 8) = 4.44, p < .025, and the rotation experiment (14.4% vs. 5.5%), F(1, 7) = 11.74, p < .02. Therefore, the results of Experiments 1–2 fit the claim of Boolean map theory that the comparison is being done by a serial scanning strategy from color to color rather than by a parallel comparison of the entire patterns.

Experiments 3–4: Simultaneous Access to Two Color Values or Two Location Values

Method

Sample displays and procedures are illustrated above in Figure 10. In Experiment 3 (see Figure 10a), a wheel (diameter = 4.2 cm) was divided into four quadrants. The top left and bottom right quadrants shared one color, and the bottom left and top right quadrants shared one color. Four colors (red, green, blue, yellow) were used in Experiment 3, and in each trial, two of the four were randomly picked to constitute the wheel, and one of the four was randomly chosen to be the probe color, which was presented as a square (0.78 cm × 0.78 cm) in the center of the screen. The two colors of the wheel were presented either successively (interframe interval = 700 ms) or simultaneously. In both cases, the displays were rapidly masked (duration of mask = 200 ms) after a very brief exposure. The duration of stimuli exposure was adjusted for each participant to achieve a moderate performance (ranging from 23 ms to 61 ms, with an average of 42 ms). The probe color was presented 700 ms after the offset of the last frame (i.e., the only frame in the simultaneous condition and the second frame in the successive condition) and remained present until response. In Experiment 3, participants were required to perceive the two colors in the wheel and later to determine if the probe color was one of them.

In Experiment 4 (see Figure 10b), yellow squares could be presented in four locations (four quadrants) and were divided into two pairs (one pair was top left and bottom right; the other pair was bottom left and top right). In each trial, one square was randomly picked from each pair to constitute the stimulus, and one of the four was randomly chosen to be the probe location. Squares measured 0.78 cm × 0.78 cm each and were 0.91 cm off the center of the display both vertically and horizontally. The two squares were presented either successively (interframe interval = 700 ms) or simultaneously. In both cases, the displays were rapidly masked (duration of mask = 200 ms) after a very brief exposure. The duration of stimuli exposure was adjusted for each participant to achieve a moderate performance (ranging from 12 ms to 50 ms, with an average of 30 ms). The probe location was presented 700 ms after the offset of the last frame (i.e., the only frame in the simultaneous condition and the second frame in the successive condition) and remained present until response. In Experiment 4, participants were required to perceive the locations of the two squares and later to determine if the probe location was one of them.

In both Experiments 3 and 4, participants performed 10 blocks of 50 trials. In each block, the trials switched between simultaneous and successive presentations. The first block was excluded from data analysis as practice.

Results

In Experiment 3, the average accuracy in the successive condition was significantly better than that in the simultaneous condition (successive condition 0.734 vs. simultaneous condition 0.665), F(1, 17) = 41.23, p < .0001. In Experiment 4, the average accuracy in the successive condition was slightly worse than in the simultaneous condition (successive condition 0.743 vs. simultaneous condition 0.770), F(1, 15) = 3.93, p < .1. The interaction was significant, F(1, 32) = 31.17, p < .0001. These results suggest that there is a severe difficulty in accessing two colors simultaneously, whereas there is no such difficulty in accessing two locations simultaneously.

A model was constructed to ask how sequential the color access really was in Experiment 3. Assume that the performance is limited by the strength of the early sensory signal and also the restriction of visual access. To simplify the question, we assumed a dichotomy whereby, in a certain percentage of trials (p), the signal is strong enough to allow conscious access, and in the rest (1 − p), it is not. We also assumed that the strength of early sensory signal is independent for each color. In the successive condition, each color is perceived in p trials; therefore, the accuracy should be 0.5 + p/2. In the simultaneous condition, the early sensory signal allows zero colors to be perceived in (1 − p)(1 − p) trials, one color to be perceived in 2p(1 − p) trials, and two colors to be perceived in p² trials. The restriction of conscious access applies to the last case (two colors), so only one color can be perceived in those trials. Taken together, in the simultaneous condition, one color can be perceived in 2p − p² trials; therefore, each of the two colors can be perceived in p − p²/2 trials. That corresponds to an accuracy of 0.5 + (p/2) − (p²/4). The best fitting value of p is 0.456. Given that, the predicted accuracy for the successive and simultaneous conditions is 0.728 and 0.676, respectively, and very closely resembles the actual data of 0.734 and 0.665. In short, the significant but not enormous difference between performance in the simultaneous and successive conditions is consistent with strictly sequential access to colors.

Experiment 5: Time Course of Feature Search and Conjunction Search
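The model's predictions can be reproduced with a short Python sketch. The accuracy expressions, 0.5 + p/2 for successive presentation and 0.5 + p/2 − p²/4 for simultaneous presentation, follow the description above on the reading that a perceived color yields a correct answer and an unperceived color yields a chance guess. The appendix does not state how p was fitted; the least-squares grid search below, along with the function names and step size, is an assumption made for illustration.

```python
def successive_accuracy(p):
    """Predicted accuracy when each color is perceived on a proportion p of trials."""
    return 0.5 + p / 2


def simultaneous_accuracy(p):
    """Predicted accuracy when at most one of the two colors can be accessed."""
    # Each color is accessed on p - p**2 / 2 of trials; otherwise the answer is a guess.
    return 0.5 + p / 2 - p * p / 4


def fit_p(obs_successive, obs_simultaneous, step=0.001):
    """Grid-search p in [0, 1] minimizing summed squared error (assumed criterion)."""
    best_p, best_err = 0.0, float("inf")
    n_steps = int(round(1 / step))
    for k in range(n_steps + 1):
        p = k * step
        err = ((successive_accuracy(p) - obs_successive) ** 2
               + (simultaneous_accuracy(p) - obs_simultaneous) ** 2)
        if err < best_err:
            best_p, best_err = p, err
    return best_p


if __name__ == "__main__":
    # Observed accuracies reported for Experiment 3.
    p = fit_p(obs_successive=0.734, obs_simultaneous=0.665)
    print(f"p = {p:.3f}")
    print(f"successive = {successive_accuracy(p):.3f}")
    print(f"simultaneous = {simultaneous_accuracy(p):.3f}")
```

With the observed accuracies of 0.734 (successive) and 0.665 (simultaneous), this criterion lands close to the reported p of 0.456, which in turn gives predictions of about 0.728 and 0.676.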

Method

In Experiment 5, participants searched for a red vertical target either from a uniform array of yellowish red vertical bars (feature search task) or an array of red horizontal bars and green vertical bars (conjunction search task). In each display, 32 items (including a target) were placed in arrays of nine columns and four rows, with no item placed in the center column (i.e., two 4 × 4 arrays with a gap in the center). The distances between centers of items were 1.30 cm both vertically and horizontally. Each bar was 0.78 cm long and 0.16 cm wide. In each trial, there was always one target. Participants decided whether it was on the left side (left four columns) or right side (right four columns) of the display and responded. The display was presented for 100 ms, 200 ms, or 400 ms before being masked. In Experiment 5, participants performed 12 blocks of 80 trials. The first two blocks were excluded from data analysis as practice. The blocks alternated between feature search blocks and conjunction search blocks, and the starting block was balanced across participants.
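As a rough illustration of the display geometry, the following Python sketch lays out the 32 item centers on a grid of nine columns and four rows with the center column left empty. Centering the grid on the screen origin, the coordinate convention (centimeters, x increasing to the right), and the function names are assumptions; the appendix specifies only the grid dimensions and the 1.30 cm spacing.

```python
def item_positions(spacing_cm=1.30, n_cols=9, n_rows=4):
    """Return (x, y) centers in cm for every grid cell except the center column."""
    center_col = n_cols // 2
    positions = []
    for row in range(n_rows):
        for col in range(n_cols):
            if col == center_col:
                continue  # no item is placed in the center column
            x = (col - (n_cols - 1) / 2) * spacing_cm
            y = (row - (n_rows - 1) / 2) * spacing_cm
            positions.append((x, y))
    return positions


def side_of(x):
    """Classify a horizontal position as the left or right half of the display."""
    return "left" if x < 0 else "right"


if __name__ == "__main__":
    positions = item_positions()
    n_left = sum(1 for x, _ in positions if side_of(x) == "left")
    print(len(positions), n_left, len(positions) - n_left)
```

The layout yields two 4 × 4 arrays of 16 positions each, so a target drawn uniformly from the 32 positions is equally likely to fall on either side.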

Results

In Experiment 5 (see Figure 22, above), the performance of conjunction search was inferior to feature search at short exposure but superior at long exposure; the interaction was significant, F(1, 15) = 16.39, p < .002. This distinctive pattern suggests that conjunction search relies upon a dimension-to-dimension subset search strategy, not a top-down salience map (see text for details).

Received November 3, 2005
Revision received December 28, 2006
Accepted January 2, 2007
