
What is thinking?

Thinking/reasoning is the process by which we go beyond the information given (beyond what we see or are told).

● Distinguish between representations involved in the course of actively reasoning and those that constitute “standing knowledge.” Only active representations are referred to as thought. These are sometimes viewed as being in “working memory” (STM). Such active representations take part in reasoning, solving problems, and drawing inferences from standing knowledge. “Standing knowledge” is said to be in “long-term memory” (LTM).

● The BIG questions: Do we think in language (e.g., English)? Do we think in pictures? Both, or neither?

Desiderata for a form of representation

● A format for representing thoughts must meet certain conditions (Fodor & Pylyshyn, 1988):

1) The capacity to think is productive (there is no limit to how many distinct thoughts the competence encompasses); therefore thoughts are built from a finite set of concepts.

2) The capacity to represent and to draw inferences is systematic: if we have the capacity to think certain thoughts, then we also have the capacity to think other related thoughts (see the sketch after this list).

3) Thoughts may be false, but they are not ambiguous to the thinker. When sentences are ambiguous it is because they express several possible unambiguous thoughts.
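As a rough illustration of productivity and systematicity (a sketch of my own, not any claim about how these properties are actually implemented), the following hypothetical Python fragment builds an unbounded set of structured "thoughts" from a finite stock of concepts; having the constituents snow, crows, white, and black automatically gives the system the capacity to form the related combinations.

```python
# Illustrative sketch only: a finite conceptual base combined by rules
# yields an unbounded, systematically related set of structured "thoughts".
from itertools import product

SUBJECTS = ["snow", "crows", "grass"]      # finite stock of concepts
PREDICATES = ["white", "black", "green"]

def atomic_thoughts():
    """Every subject can combine with every predicate (systematicity)."""
    return [("IS", s, p) for s, p in product(SUBJECTS, PREDICATES)]

def conjoin(t1, t2):
    """Thoughts compose recursively, so there is no upper bound (productivity)."""
    return ("AND", t1, t2)

if __name__ == "__main__":
    atoms = atomic_thoughts()
    print(("IS", "snow", "white") in atoms)   # True
    print(("IS", "snow", "black") in atoms)   # True: the related thought comes for free
    bigger = conjoin(atoms[0], conjoin(atoms[1], atoms[2]))
    print(bigger)                             # arbitrarily deep structures
```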

Thinking in words

● Our experience of “thinking in words” is that of carrying on an inner dialogue with ourselves. But consider a typical fragment of such a dialogue. It conforms to Gricean principles of discourse: Be maximally informative – don’t say things that your audience already knows.

● Since our mental dialogue follows these maxims it is clear that much is left unsaid. But according to the view that one thinks in words, if it is unsaid then it is also unthought! Or, conversely, if it was thought, it was not thought in words. It was left to the imagination of the listener – but according to this view there is no room for imagining something other than what was said.

● Thoughts experienced as inner dialogue grossly underdetermine what is thought, so words cannot be the vehicle of thought.

As I sit here thinking about what I will say in this lecture, I observe myself thinking,

“I’d better find a concrete example of this or nobody will understand what I mean, and then they certainly will not believe it!”

If this was my thought, then what did I mean by “example” or “this”? And who was I referring to when I said “nobody”? Was there a presupposition that I wanted to persuade someone?

Obviously I knew what I meant, but how was this knowledge represented? Not in words, since I cannot find it anywhere in my consciousness. And if it was there in unconscious words, it would still have the same properties of anaphora, ambiguity, presupposition, and entailment, since those are inherent in natural language.

Lingua Mentis

● The representation of thoughts needs to meet the four conditions just listed (finite conceptual base, productivity, systematicity, freedom from ambiguity).

● For that reason, thought requires a format similar to a logical calculus (or LF). Call it the Language of Thought (LOT), after Fodor’s famous 1975 book. This is not to say that reasoning cannot use other forms of representation in addition to LOT. Because LOT appears ill suited to representing magnitudes, the proposal that there is an additional (perhaps analog) form of representation is attractive. But none proposed so far is satisfactory – perhaps because the notion of an analog is ill defined.

Representational and inferential systematicity

● Representational systematicity (Fodor & Pylyshyn, 1988) refers to the fact that if you can think certain thoughts then you have the capacity to think an indefinite number of other related thoughts: e.g., if you can think both that snow is white and that crows are black, then you have the concepts snow, crow, white, and black, which gives you the capacity also to think that snow is black and that crows are white.

● Inferential or rule systematicity (Pylyshyn, 1984, Chapter 3) refers to the fact that for representations to enter into rules, the representations must have the relevant distinct constituents. For a rule of inference such as “From P ∨ Q and Not-Q infer P,” the parts P and Q have to be explicitly recognizable. The same is true of if-then rules. Suppose a system’s behavior is expressed by a pair of rules such as (1) if Q1 and Q2 hold, then execute action A1, and (2) if Q1 and Q3 hold, then execute action A2. The three distinct conditions Q1, Q2, and Q3 must be constituents of a representation of the state of the system to which a rule applies. The rule could not be expressed by a representation that fuses the conditions, as connectionist models do (with Qn ≡ F[Q1, Q2]).
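As an illustrative sketch (my own, not Pylyshyn's formalism), the following Python fragment shows why rules like (1) and (2) need the conditions Q1, Q2, Q3 to be explicit, separable constituents of the represented state: the rule-matcher tests each constituent by name, which it could not do if the state were a single fused value F[Q1, Q2].

```python
# Sketch of inferential (rule) systematicity: rules can only apply if the
# conditions they mention are explicit constituents of the represented state.

state = {"Q1": True, "Q2": False, "Q3": True}   # constituent structure

RULES = [
    ({"Q1", "Q2"}, "A1"),   # (1) if Q1 and Q2 hold, execute action A1
    ({"Q1", "Q3"}, "A2"),   # (2) if Q1 and Q3 hold, execute action A2
]

def applicable_actions(state):
    """Match each rule by checking its named constituents in the state."""
    return [action for conds, action in RULES
            if all(state.get(c, False) for c in conds)]

print(applicable_actions(state))   # ['A2']

# A fused representation, e.g. a single number n = F(Q1, Q2), exposes no
# constituents named Q1 or Q2, so the same rules cannot even be stated over it.
```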

We often have strong experiences about the steps we go through in solving a problem:

But does that tell us how we solve it?

● Arnheim’s Visual Thinking (1969): Rudolf Arnheim claims that when people solve visual analogy problems, they go through a sequence of mental states that involve a “rich and dazzling experience” of “instability,” of “fleeting, elusive resemblances,” and of being “suddenly struck by” some perceptual relationships. If this is true, does it explain how we solve the problems?

● What steps do you go through in understanding language?
● How do you experience thinking about numbers?

What is 9 + 7? What is 6 x 8? Which is larger, 379 or 397?

● Daniel Tammet, autistic savant (Ramachandran’s Science video). What does the description of Daniel’s experience tell us?

Thinking in pictures (or pictures + words)

● There is a large literature on scientific discovery that credits images as the cause of the discovery (the benzene ring)

● Are pictures better than words for expressing thoughts or for creating new thoughts? Why are images often cited as the format for non-verbal or intuitive thoughts?

● To understand how pictures or words could serve as the basis for encoding thoughts, we need to understand the assumptions underlying the claim that thoughts are encoded in pictures or words. What’s missing is an understanding of the distinction between form and content, which itself rests on another distinction central to cognitive science – the distinction between architecture and representation (more on this later).

The problem of experiential access to mental contents and processes

● What does how things look tell you about the contents of your mental representation? Must there be a representation corresponding to an appearance? What do the changing contents of your conscious experience tell you about the changing representations that your mind goes through? Does it provide a trace of the process?

● What do the contents of experience tell you about how you make decisions or solve problems? (Example later)

● Does a description of your experience provide the basis for an account that is explanatory?

How well do you know your own mind?

Why suppose that thoughts might be represented in the form of pictures?

● Over 65% of the cortex is devoted to vision

● Most of our knowledge of the world comes through vision

● If we have a visual module, why not use it to encode/decode thoughts?

There are many questions about what goes on when we have the experience of “seeing with the mind’s eye”

● Is mental imagery a special form of thought? If so, in what way is it special? Are mental images sensory-based and modality-specific? Are mental images like pictures? In what respect? Are images different from other forms of thought? Do they, for example, resemble what they represent?

● Does mental imagery use the visual system? If so, what does that tell us about the format of images?

● Is there neurophysiological evidence for a pictorial “display” in visual cortex? What if a display were found in human visual cortex?

● These questions will be addressed this week (and maybe even next week)

● But if mental imagery is to be thought of as being closely related to vision, we first need to ask some questions about what vision is like.

● First we need to recognize that what drives the imagery-vision parallel is the similar phenomenology, and yet the phenomenology of vision is very misleading.

The phenomenology of seeing (including its completeness, its filled-out fine details, and its panoramic scope) turns out to be an illusion!

● We see far less, and with far less detail and scope, than our phenomenology suggests

● Objectively, outside a small region called the fovea, we are colorblind and our sight is so bad we are legally blind. The rest of the visual field is seriously distorted and even in the fovea not all colors are in focus at once.

● More importantly, we register surprisingly little of what is in our field of view. Despite the subjective impression that we can see a great deal that we cannot report, recent evidence suggests that we cannot even tell when things change as we watch.

What do we take in when we see?

● What we actually take in functionally depends on:

Whether you are asking about the preconceptual information or the conceptual (seeing-as) information. Even pre-conceptual information is impoverished and built up over time; we will see later that it consists primarily in individuating and keeping track of individual objects.

Whether the information was attended or not. Although unattended information is not entirely screened out, it is certainly curtailed and sometimes even inhibited.

Examples of attentional inhibition

● Negative Priming (Treisman & DeSchepper, 1996).

Is there a figure on the right that is the same as the figure on the left? When the figure on the left is one that had appeared as an ignored figure on the right, RT is long and accuracy poor. This “negative priming” effect persisted over 200 intervening trials and lasted for a month!

The effect of attention on whether objects are perceived/encoded: Inattentional Blindness (Mack, A., & Rock, I. (1998). Inattentional Blindness. Cambridge, MA: MIT Press)

Inattentional Blindness

● The background task is to report which of two arms of the + is longer. One critical trial per subject, after about 3–4 background trials. Another “critical” trial is presented as a divided-attention control.

● 25% of subjects failed to see the square when it was presented in the parafovea (2° from fixation).

● But 65% failed to see it when it was at fixation!

● When the background task cross was made 10% as large, Inattentional Blindness increased from 25% to 66%.

● It is not known whether this IB is due to concentration of attention at the primary task, or whether there is inhibition of outside regions.

Where does this leave us?

● Given the examples of memory errors, should we conclude that seeing is a process of constructing conceptual descriptions?

● Most cognitive scientists and AI people would say yes, although there would be several types of exception. There remains the possibility that for very short durations (e.g., 0.25 sec) there is a form of representation very like visual persistence – sometimes called ‘iconic storage’ (Sperling, 1960).

From a neuroscience perspective there is evidence of a neural representation in early vision – in primary visual cortex – that is retinotopic and therefore “pictorial.” Doesn’t this suggest that a ‘picture’ is available in the brain in vision? We shall see later that this evidence is misleading and does not support a picture theory of vision or of visual memory. A major theme of later lectures will be to show that an important mechanism of vision is not conceptual but causal: Visual Indexes. Many people continue to hold a version of the “picture theory” of mental representation in mental imagery. More on this later.

Architecture and Process

● We now come to the most important distinction of all – that between behavior attributable to the architecture of a system and that attributable to properties of the things that are represented. Without this distinction we cannot distinguish between phenomena that reveal the nature of the system and phenomena that reflect the effects of external variables

● So here is an example to make the point

An illustrative example: Behavior of a mystery box

What does this behavior pattern tell us about the nature of the box?

[Figure: the mystery box’s behavior pattern plotted over time]

The moral of this example: Regularities in behavior may be due to one of two very different causes:

1. The inherent nature of the system (its relatively fixed structure), or

2. The nature of what the system represents (what it “knows”).

The main difference between picture-theorists and the rest of us (me) is in how we answer the following question:

● Do experimental findings on mental imagery (such as those I will review) tell us anything about the properties of a special imagery architecture? Or do they tell us about the knowledge that people have about how things would look if they were actually to see them (together with some common psychophysical skills)? While these are the main alternatives, there are also other reasons why experiments come out as they do. Notice that the architecture alternative includes properties of the format adopted in a particular domain of representation – e.g., the Morse code used by the code-box in our example.

The imagery debate

Imagine various events unfolding before your “mind’s eye” –

● Imagine a bicyclist racing up a hill. Down a hill?

● Imagine turning a large heavy wheel. A light wheel.

● Imagine a baseball being hit. What shape trajectory does it trace out? Where would you run to catch it?

● Imagine a coin dropping and whirling on its edge as it eventually settles. Describe how it behaves.

● Imagine a heavy ball (a shot-put) being dropped at the same time as a light ball (a tennis ball). Indicate when they hit the floor. Repeat for different heights.

● Form a vivid auditory image of Alfred Brendel playing the Minute Waltz so you hear every note clearly. How long will it take? Why?

Examples to probe your intuition

What color do you see when two color filters overlap?

Conservation of volume example

A basic mistake: Failure to distinguish between properties of the world being represented and properties of the representation or of the representational medium

“Representation of object O with property P”

is ambiguous between these two parsings:

Representation of (object O with property P)
vs.
(Representation of object O) with property P

(A small sketch of the two parsings follows.)
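A hypothetical Python sketch of the two parsings (illustrative only; the class and variable names are my own): in the first, the smallness is part of the represented content; in the second, it is a property of the representation itself – echoing the later contrast between an image of a small X and a small image of X.

```python
# Illustrative sketch of the two parsings of
# "representation of object O with property P".
from dataclasses import dataclass

@dataclass
class Content:
    obj: str
    props: tuple          # properties attributed to the represented object

@dataclass
class Representation:
    content: Content
    medium_props: tuple   # properties of the representation itself (its "vehicle")

# Parsing 1: Representation of (object O with property P)
image_of_small_mouse = Representation(Content("mouse", ("small",)), ())

# Parsing 2: (Representation of object O) with property P
small_image_of_mouse = Representation(Content("mouse", ()), ("small",))

# The two are easily conflated in talk of "small images", but they are distinct:
print(image_of_small_mouse == small_image_of_mouse)   # False
```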

Why do things happen the way they do in your imagination?

● Is it because of the format of your image or your cognitive architecture? Or because of what you know? Did it reveal a capacity of mind? Or was it because you made it do what it did?

● Can you make your image have any properties you choose? Or behave in any way you want? Why not? How about imagining an object from all directions at once, or from no particular direction? How about imagining a 4-dimensional object? Can you imagine a printed letter which is neither upper nor lower case? A triangle that is not a particular type?

Major Question 1: Constraints imposed by imagery

More demonstrations of the relation between vision and imagery

● Images constructed from descriptions: the D-J example(s); the two-parallelogram example
● Amodal completion
● Reconstruals: Slezak

Can images be visually reinterpreted?

● There have been many claims that people can visually reinterpret images. These have all been cases where one could easily figure out what the combined image would look like without actually seeing it (e.g., the J–D superposition).

● Peterson’s careful examination of visual “reconstruals” showed (contrary to her own conclusion) that images are never ambiguous (no Necker cube or figure-ground reversals), and when new construals were achieved from images they were quite different from the ones achieved in vision (more variable, more guessing from cues, etc.).

● The best evidence comes from a philosopher (Slezak, 1992, 1995)

Slezak figures

Pick one (or two) of these animals and memorize what they look like. Now rotate it in your mind by 90 degrees clockwise and see what it looks like.

Slezak figures rotated 90°

Do this imagery exercise: Imagine a parallelogram like this one

Now imagine an identical parallelogram directly below this one

Connect each corner of the top parallelogram with the corresponding corner of the bottom parallelogram

What do you see when you imagine the connections? Did the imagined shape look (and change) like the one you see now?

Amodal completion by imagery?

Is this what you saw?

Continue….

1. Are images spatial – i.e. do they have spatial properties such as size, distance, and relations such as above, next-to, in-between? Do the axioms of Euclidean geometry and measure theory apply to them?

a) ab + bc ≥ ac and ab = ba (see the sketch after this list)

b) If ∠abc = 90°, then ab² + bc² = ac²

2. If yes, what would that entail about how they must be instantiated in the brain?

a) Could they be analogue? What constraints does that impose?

b) Is the space 2-D or 3D?

3. Might they be in some “functional space” – i.e., behave as though they were spatial without having to be in real physical brain-space? What does that entail?
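As a concrete reminder of what those axioms demand, here is a small illustrative Python check (my own example, not part of the lecture): any genuinely spatial layout of points automatically satisfies the metric and Pythagorean relations in 1(a) and 1(b); the question is what, if anything, enforces these for a mental image.

```python
# Illustrative check: properties that come "for free" in a real spatial layout.
from math import dist, isclose

a, b, c = (0.0, 0.0), (3.0, 0.0), (3.0, 4.0)   # the angle at b is 90 degrees

# 1(a): symmetry and the triangle inequality hold for any points whatsoever.
assert isclose(dist(a, b), dist(b, a))
assert dist(a, b) + dist(b, c) >= dist(a, c)

# 1(b): with a right angle at b, the Pythagorean relation holds.
assert isclose(dist(a, b)**2 + dist(b, c)**2, dist(a, c)**2)   # 9 + 16 == 25
print("metric axioms are satisfied by the coordinates themselves")
```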

Images and space

Do mental images have size?

1. Imagine a mouse across the room so its image size (% of your image display it occupies) is small

2. Now imagine it close to you so it fills your image display

Of these two conditions, in which does it take longer to answer “can you see the mouse’s whiskers?”

3. Imagine a horse. How close can you come to the image before it starts to overflow your image display? Repeat with a toaster, a table, a person’s face, etc

Do mental images have size?

Imagine a very small mouse. Can you see its whiskers? Now imagine a huge mouse. Can you see its whiskers?

Which is faster?

Image of (small X) vs Small(image of X)

Mental rotation

Time to judge whether (a)-(b) or (b)-(c) are the same except for orientation increases linearly with the angle between them (Shepard & Metzler, 1971)

Imagine this shape rotating in 3D

When you make it rotate in your mind, does it seem to retain its rigid 3D shape without re-computing it?

The missing bit of logic:

● What is assumed in the case of mental rotation?

● According to Prinz (2002), p. 118,

“If visual-image rotation uses a spatial medium of the kind Kosslyn envisions, then images must traverse intermediate positions when they rotate from one position to another. The propositional system can be designed to represent intermediate positions during rotation, but that is not obligatory.”

● But what makes this obligatory in “functional Space”?

How are these ‘assumptions’ realized?

● Assumptions such as rigidity must therefore be a property inherent in the architecture (the ‘display’).

● That raises the question of what kind of architecture could possibly enforce rigidity of form. This brings us back to the proposed architecture – a physical display. Notice, however, that such a display, by itself, does not rigidly maintain the shape as orientation is changed. There is evidence that rotation is incremental, not holistic, and is dependent on the complexity of the form and the task. Also, such rigidity could not be part of the architecture of an imagery module because we can easily imagine situations in which rigidity does not hold (e.g., imagine a rotating snake!). (See the sketch below.)
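To make the point concrete, here is a minimal illustrative Python sketch (my own, not from the lecture): in a display of discrete points or cells, nothing about the medium keeps a shape rigid under rotation. Rigidity holds only because each point is explicitly recomputed with the rotation formula, i.e., it is enforced by the process, not by the ‘display’.

```python
# Sketch: rotating a shape in a point/cell "display" preserves rigidity only
# because the program recomputes every point with the rotation equations.
from math import radians, sin, cos, dist

shape = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]   # a rigid rectangle

def rotate(points, degrees):
    """Apply x' = x*cos(t) - y*sin(t), y' = x*sin(t) + y*cos(t) to each point."""
    t = radians(degrees)
    return [(x * cos(t) - y * sin(t), x * sin(t) + y * cos(t)) for x, y in points]

rotated = rotate(shape, 35)

# Pairwise distances are preserved -- but only because we applied the formula;
# the "display" itself would happily hold any distorted set of points instead.
print(all(abs(dist(shape[i], shape[j]) - dist(rotated[i], rotated[j])) < 1e-9
          for i in range(4) for j in range(4)))   # True
```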

Mental Scanning

● Some hundreds of experiments have now been done demonstrating that it takes longer to scan attention between places that are further apart in the imagined scene. In fact, the relation between time and distance is linear.

● These have been reviewed and described in: Denis, M., & Kosslyn, S. M. (1999). Scanning visual mental images: A window on the mind. Cahiers de Psychologie Cognitive / Current Psychology of Cognition, 18(4), 409-465.

Rarely cited are experiments by Pylyshyn & Bannon, which I will summarize for you.

Studies of mental scanning: Do they show that images have metrical space?

(Pylyshyn & Bannon. See Pylyshyn, 1981)

Conclusion: The image scanning effect is Cognitively Penetrable, i.e., it depends on goals and beliefs, or on Tacit Knowledge.

What is assumed in imagist explanations of mental scanning?

● In actual vision, it takes longer to scan a longer distance because real distance, real motion, and real time are involved; therefore this equation holds due to natural law:

Time = distance / speed

But what ensures that a corresponding relation holds in an image?

The obvious answer is: Because the image is laid out in real space!

But what if that option is closed for empirical reasons?

● Imagists appeal to a “Functional Space,” which they liken to a matrix data structure in which some pairs of cells are closer and others further away, and to move from one to another it is natural that you pass through intermediate cells.

● Question: What makes these sorts of properties “natural” in a matrix data structure? (See the sketch below.)
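A minimal illustrative sketch (my own, in Python): in an ordinary matrix nothing forces an access routine to visit intermediate cells or to take time proportional to index distance; if a "scan" shows those properties, it is because the programmer stipulated them, which is exactly the worry raised above.

```python
# Sketch: a matrix ("functional space") has no built-in metric obligations.
import numpy as np

space = np.zeros((10, 10))
space[1, 1] = 1.0   # imagined object A
space[8, 7] = 2.0   # imagined object B

# Access routine 1: jump straight from A to B -- perfectly legal, one step.
value_at_B = space[8, 7]

# Access routine 2: a "scan" that visits intermediate cells and pays one time
# unit per cell.  The time = distance/speed relation appears only because this
# loop was written to enforce it.
def scan(space, start, end, speed=1.0):
    (r0, c0), (r1, c1) = start, end
    steps = max(abs(r1 - r0), abs(c1 - c0))
    for i in range(steps + 1):                  # visit intermediate cells
        r = round(r0 + (r1 - r0) * i / steps)
        c = round(c0 + (c1 - c0) * i / steps)
        _ = space[r, c]
    return steps / speed                        # stipulated "scan time"

print(scan(space, (1, 1), (8, 7)))   # time grows with index distance -- by fiat
```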

The central problem with imagistic explanations…

Kosslyn view: Images as depictive representations

● “A depictive representation is a type of picture, which specifies the locations and values of configurations of points in a space. For example, a drawing of a ball on a box would be a depictive representation.

● The space in which the points appear need not be physical…, but can be like an array in a computer, which specifies spatial relations purely functionally. That is, the physical locations in the computer of each point in an array are not themselves arranged in an array; it is only by virtue of how this information is ‘read’ and processed that it comes to function as if it were arranged into an array (with some points being close, some far, some falling along a diagonal, and so on).

● Depictive representations convey meaning via their resemblance to an object, with parts of the representation corresponding to parts of the object… When a depictive representation is used, not only is the shape of the represented parts immediately available to appropriate processes, but so is the shape of the empty space … Moreover, one cannot represent a shape in a depictive representation without also specifying a size and orientation….” (Kosslyn, 1994, p. 5)

Thou shalt not cheat

● There is no natural law or principle that requires the representations of time, distance, and speed to be related according to the motion equation. You could equally easily imagine an object moving instantly or according to any motion relation you like, since it is your image!

● There are two possible answers to why the relation

Time = Representation of distance / Representation of speed

typically holds in an image-scanning task:

1. Because subjects have tacit knowledge that this is what would happen if they viewed a real display, or

2. Because the matrix is taken to be a simulation of a real-world display, as it often is in computer science

Thou shalt not cheat

● What happens in ALL imagist accounts of phenomena, including mental scanning and mental rotation, is that imagists assume that images have the properties of real space in order to provide a principled explanation, and then retreat to some “functional” or not-quite-real space when it is pointed out that they are assuming that images are laid out in real brain space.

● This happens with mental rotation as well, even though it is an involuntary and universal way of solving the rotated-figure task so long as the task involves tokens of enantiomorphs.

● Experiments have shown that: no rotation occurs if the figures have landmarks or asymmetries that can be used to identify them; records of eye movements show that mental rotation is done incrementally, not as the holistic rotation that is experienced; and the “rate of rotation” depends on the conceptual complexity of the recognition task, so it is not a result of the architecture.

A final point…

● In Kosslyn, Thompson & Ganis (2007) the authors cite Ned Block to the effect that one does not need an actual 2D surface, so long as the connections upstream from the cortical surface can decode certain pairs of neurons in terms of their imagined distance. Think of long stretchy axons going from a 2D surface to subsequent processes. Imagine that the neurons are randomly moved around so they are no longer on a 2D layout. As long as the connections remain fixed, it will still behave as though there was a 2D surface.

● Call this the “encrypted 2D layout” version of literal space.

The encrypted-spatial layout alternative

● By itself the encrypted-layout alternative will not do, because without referring to the original locations, the relation between pairs of neurons and scan time is not principled. In the end the only principle we have is Time = distance/speed, so unless the upstream system decrypts the neuron locations into their original 2D surface locations, the explanation for the increase in time with increased imagined distance remains a mere stipulation. It stipulates, but does not explain, why it takes longer to scan between two points that are further apart in the imagined layout, or why scanning between them requires that one visit ‘intermediate’ locations along the way.

● But this is what we need to explain! So long as what we have is a stipulation, we can apply it to any form of representation! What was a principled explanation with the literal 2D display has now been given up for a mere statement of how it shall be.

The ‘Imagery Debate’ Redux

● According to Kosslyn there have been 3 stages in the debate over the nature of mental images:

1. The first was concerned with the role of images in learning and memory (Paivio’s Dual Code theory). While influential at the time it has now been largely abandoned except for a few recidivists like Barsalou;

2. The second stage involved the study of metrical-spatial properties of images and the parallels between vision and imagery, as assessed by reaction time measures;

3. Finally, we now have the discovery of brain mechanisms underlying visual imagining, and with it the ‘resolution of the imagery debate’.

Mental imagery and neuroscience

● Neuroanatomical evidence for a retinotopic display in the earliest visual area of the brain (V1)

● Neural imaging data showing V1 is more active during mental imagery than during other forms of thought. Also, the form of activity differs for small vs. large images in the way that it differs when viewing small and large displays

● Transcranial magnetic stimulation (TMS) of visual areas interferes more with imagery than other forms of thought

● Clinical cases show that visual and image impairment tend to be similar (Bisiach, Farah)

● More recent psychophysical measures of imagery show parallels with comparable measures of vision, and these can be related to the receptive cells in V1

Neuroscience has shown that the retinal pattern of activation is displayed on the surface of the cortex

Tootell, R. B., Silverman, M. S., Switkes, E., & de Valois, R. L. (1982). Deoxyglucose analysis of retinotopic organization in primate striate cortex. Science, 218, 902-904.

There is a topographical projection of retinal activity on the visual cortex of the cat and monkey.

Examples of problems of drawing conclusions about mental imagery from neuroscience data

1. The capacities for imagery and for vision are known to be independent. Also, all imagery results are observed in the blind.

2. Cortical topography is 2-D, but mental images are 3-D – all phenomena (e.g. rotation) occur in depth as well as in the plane.

3. Patterns in the visual cortex are in retinal coordinates, whereas images are in world coordinates: your image stays fixed in the room when you move your eyes or turn your head or even walk around the room.

4. Accessing information from an image is very different from accessing it from the perceived world. Order of access from images is highly constrained. Conceptual rather than graphical properties are relevant to image complexity (e.g., in mental rotation).

Problems with drawing conclusions about mental images from the neuroscience evidence

5. Retinal and cortical images are subject to Emmert’s Law, whereas mental images are not;

6. The signature properties of vision (e.g. spontaneous 3D interpretation, automatic reversals, apparent motion, motion aftereffects, and many other phenomena) are absent in images;

7. A cortical display account of most imagery findings is incompatible with the cognitive penetrability of mental imagery phenomena, such as scanning and image size effects;

8. The fact that the Mind’s Eye is so much like a real eye (e.g., oblique effect, resolution fall-off) should serve to warn us that we may be studying what observers know about how the world looks to them, rather than what form their images take.

Problems with drawing conclusions about mental images from the neuroscience evidence

9. Many clinical cases can be explained by appeal to tacit knowledge and attention. The ‘tunnel effect’ found in vision and imagery (Farah) is likely due to the patient knowing what things now looked like to her post-surgery.

Hemispatial neglect seems to be a deficit in attention, which also explains the “representational neglect” in imagery reported by Bisiach.

A recent study shows that imaginal neglect does not appear if patients have their eyes closed. This fits well with the account I will offer, in which the spatial character of mental images derives from concurrently perceived space.

An over-arching consideration:

What if colored three-dimensional images were found in visual cortex? What would that tell you about the role of mental images in reasoning?

Would this require a homunculus?

Is this a straw man?

Should we welcome back the homunculus?

● In the limit, if the visual cortex contained the contents of one’s conscious experience in imagery, we would need an interpreter to “see” this display in visual cortex

● But we will never have to face this prospect because many experiments show that the contents of mental images are conceptual (or, as Kosslyn puts it, are ‘predigested’).

● And finally, it is clear that you can make your image do whatever you want and have whatever properties you wish. There are no known constraints on mental images that cannot be attributed to lack of knowledge of the imagined situation (e.g., imagining a 4-dimensional block).

All currently claimed properties of mental images are cognitively penetrable.

One view: Images as depictive representations

● “A depictive representation is a type of picture, which specifies the locations and values of configurations of points in a space. For example, a drawing of a ball on a box would be a depictive representation.

● The space in which the points appear need not be physical…, but can be like an array in a computer, which specifies spatial relations purely functionally. That is, the physical locations in the computer of each point in an array are not themselves arranged in an array; it is only by virtue of how this information is ‘read’ and processed that it comes to function as if it were arranged into an array (with some points being close, some far, some falling along a diagonal, and so on).

● Depictive representations convey meaning via their resemblance to an object, with parts of the representation corresponding to parts of the object… When a depictive representation is used, not only is the shape of the represented parts immediately available to appropriate processes, but so is the shape of the empty space … Moreover, one cannot represent a shape in a depictive representation without also specifying a size and orientation….” (Kosslyn, 1994, p. 5)

In what sense are mental images spatial?

This is the most challenging and the most thoroughly researched question in the last 20 years of research on mental imagery. It is also the most seductive because of the phenomenology associated with imagining.

Many experiments have been devoted to asking whether imagery involves the visual system:

Does mental imagery use the visual system?

This question is asked because of the expectation that it will cast some light on the format of mental images, and in particular that it will tell us something about why images appear to have spatial properties.

But I will suggest that a positive answer to this question actually speaks against the hypothesis that images are laid out in some space inside the head.

Vision is involved when images are superimposed onto visual displays

Many experiments show that when you project an image onto a display, the image acts very much like a superimposed display: Shepard & Podgorny; Hayes; Bernbaum & Chung … Interference effects (Brooks); interaction with the motor system (Finke, Tluka); superposition yields some visual illusions.

Maybe all imagery phenomena are like this! Mental scanning and superposition: we only need a pairing of a few perceived objects with imagined ones, and the mechanism for such pairings may be the FINST index.

Visual illusions with projected images

Bernbaum & Chung (1981)

Shepard & Podgorny experiment

Both when the displays are seen and when the F is imagined, RT to detect whether the dot was on the F is fastest when the dot is at the vertex of the F, then when on an arm of the F, then when far away from the F – and slowest when one square off the F.

Differences between vision and visual imagery in the control of motor actions

● Imagery clearly has some connection to motor control – you can point to things in your image. This may be why images feel spatial

● Finke showed that you could get adaptation with imagined hand position that was similar to adaptation to displacing prism goggles

● You can also get Stimulus-Response compatibility effects between the location of a stimulus in an image and the location of the response button in space

● Both these findings provide support for the view that the spatial character of images comes from their projection onto a visual scene.

S-R Compatibility effect with a visual display

The Simon effect: It is faster to make a response in the direction of an attended object than in another direction.

Response for A is faster when YES is on the left in these displays.

S-R Compatibility effect with a recalled (mental) display

RT is faster when the A is recalled (imagined) as being on the left

The same RT pattern occurs for a recalled display as for a perceived one

The spatial-metrical character of images

● A number of experiments have been cited as showing that images must actually have metrical properties, particularly spatial ones (not just represent metrical properties, but have them).

● The most commonly cited ones are experiments involving: image size; mental scanning across a mental image; mental rotation of images.

Where do we stand?

● It seems that a literal picture-in-the-brain theory is untenable for many reasons – including the major empirical differences between mental images and cortical images. A serious problem with any format-based explanation of mental imagery is the cognitive penetrability of many of the imagery experiments.

● Is there a middle ground between a view of mental images as pictorial/spatial and a view that says the pictoriality of images is an illusion that arises from the similarity of the experience of imaging and of seeing? How do we explain the spatial character of images – the fact that they seem to be laid out in space? How do we explain the fact that images look like what they represent?

What is the alternative?

● Neither seeing nor imaging entails the existence of something pictorial. The notion of a “picture” only arises because viewing a (literal) picture produces a similar experience to viewing scenes (that is why pictures were thought to be “magic”).

And yet there is something spatial about perception in general (visual, auditory, proprioceptive, …). Where does that come from? And does that hold the secret to understanding the spatiality of mental images?

The spatial character of images and the spatial nature of the world

● For an answer to what is spatial in imagery we need to look into what is spatial about perception. This is a nontrivial question about which we have some interesting ideas – some of which come from (of all places) Gibsonian influences on what has been called the Situated Vision movement in Cognitive Science and Robotics.

What does it mean to say that we perceive the world as spatial?

● When we “notice” new properties in a scene, they are consistent with what we noticed earlier, in terms of the axioms of geometry and the laws of physics. Our noticings are generally “monotonic”

● We can examine a scene in any order and we can re-examine parts of it because its relevant properties generally do not change. Unlike imagining what would happen, seeing does not run into a frame problem

● We can navigate through the world we perceive. But what exactly does this mean? We can engage in “reactive planning” and other forms of “situated” behaviors that require contact with the world.

● Rather than representing metrical space itself, we deploy mechanisms for re-examining objects in the world (“world as external memory”).

Aside on the ‘situated cognition’ movement

● It has a cult following, but like many cults, there is some truth to it.

● The idea is that we do not represent the spatial layout of a visual scene (except in the most sparse and coarse manner). To do so would not only greatly tax memory, but would be redundant. We do not need to represent the scene in detail if we can return to it for further information when we need it – and there is evidence that we do just that. But how do we do it? Example of saccadic integration and deictic reference

Now even more of an aside on Visual Indexing (FINST) Theory: What it’s like and why we need it

Visual Indexes as Demonstrative Reference

● Visual indexes are a mechanism for referring to (or pointing to) visual objects without first having represented their properties: They are a direct referential link between a mental token and a preconceptual individual object.

● They are needed to specify where to assign focal attention

● They are needed to evaluate multiple-argument predicates: all arguments in a visual predicate must be bound to objects before the predicates can be evaluated (see the sketch below).

● They allow the external environment to be used instead of memory in carrying out tasks (e.g., Ballard’s copying task).

● They allow us to get around having to assume a metrical image to explain trans-saccadic integration, visual stability and many spatial imagery phenomena.
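As a purely illustrative caricature (my own Python, not the FINST theory itself), the idea can be pictured as keeping a handful of bare references to scene objects – references that individuate and track without storing the objects' properties – and evaluating multi-argument predicates only over objects that such indexes are bound to.

```python
# Caricature of visual indexes (FINSTs): bare references that individuate and
# track a few scene objects without encoding their properties.

class SceneObject:
    """Something out in the (simulated) world; its properties live there, not in the index."""
    def __init__(self, x, y, color):
        self.x, self.y, self.color = x, y, color

scene = [SceneObject(1, 2, "red"), SceneObject(5, 1, "green"), SceneObject(4, 7, "red")]

# An index is just a pointer to an object -- no description is stored with it.
indexes = {"FINST1": scene[0], "FINST2": scene[2]}

def collinear(a, b, c):
    """A multi-argument visual predicate; it can only be evaluated once its
    arguments are bound to actual objects (here, via the indexes)."""
    return (b.x - a.x) * (c.y - a.y) == (c.x - a.x) * (b.y - a.y)

# Properties are queried from the world when needed ("world as external memory"),
# rather than being read off a stored internal picture.
a, c = indexes["FINST1"], indexes["FINST2"]
print(collinear(a, scene[1], c))
print(a.color)   # looked up on demand from the (perceived) object
```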

The spatiality of images is inherited from the spatiality of the seen world

● If we can find the properties that give perceived space its spatial character, then maybe our mental representations can exploit these spatial properties when we superimpose an image on a perceived scene. Examples: scanning, visual-motor adaptation

● This is the proposal: The representations underlying mental images achieve their spatial properties by being associated with real perceived objects or locations. This is how they inherit the essential Euclidean character of space.

● Our “sense of space” is extremely accurate even without vision, and can plausibly be used during mental imaging.

The spatiality of images and the spatiality of the world in which they are situated

If we think of projecting images onto a perceived scene as involving binding objects of thought to objects of perception, we can explain:

The sense in which objects in a mental image can be in spatial relations to one another

Why images are in allocentric coordinates (where is your image?) – they make use of coordinate updating with voluntary sightless egomotion

Why Finke was able to show visual-motor adaptation results with imagined hand positions

Why we get some induced visual illusions

Why we get S-R compatibility findings

Why we sometimes observe hemispatial neglect with mental images (Bisiach)

A final point: Why do mental images look like what they represent?

● More important: What kind of a fact is this? Is it a conceptual or an empirical fact that imagined things look like what they are images of? Could an image of X look like something quite different from X? Is there a possible world in which your image of your dog looked like a soup bowl? Or even in which your image of Tom looked like an image of his twin brother?

● “Looks like” is a problematic notion because it bridges from an experience to a description. Wittgenstein’s example: Why does it “look like” the sun is going around the earth rather than that the earth is rotating?

But that doesn’t explain why we can solve mental geometry problems more easily by imagining the figures!

● There are many problems that you can solve much more easily when you imagine a layout than when you do not.

● In fact, many instances of solving problems by imagining a layout seem very similar to how one would solve them if one had pencil and paper.

● What we need to understand is what happens in the visual case, in order to see how images can help this process in the absence of a real diagram.

How do real visual displays help thinking?

How do diagrams, graphs, charts, maps and other visual objects help us to reason and to solve problems? The question of why visual aids help is nontrivial; my Seeing & Visualizing, chapter 8, contains some speculative discussion. E.g., they allow the visual system to:

• make certain kinds of inferences just by looking

• make use of visual demonstratives to offload some of the memory*

• Physical displays embody the axioms of measure theory and of geometry, so they don’t need to be explicitly expressed in reasoning

The big question is whether (and how) any of these advantages carry over to imaginal thinking! Do mental images have some (or any) of the critical properties that make diagrams helpful in reasoning?

Visual inferences?

● If we recall a visual display it is because we have encoded enough information about its visual-geometrical properties that we can meet some criteria, e.g., we can draw it. But there are innumerably many ways to encode this information that are sufficient for the task (e.g., by encoding pairwise spatial relations, global spatial relations, and so on). For many properties the task of translating from one form to another is much more difficult than the task of visually encoding it – the translation constitutes visual inference.

● The visual system generalizes from particular instances as part of its object-recognition skill (all recognition is recognition-as and therefore assumes generalization from tokens to types). It is also very good at noticing certain properties (e.g., relative sizes, deviations from square or circle, collinearity, inside, and so on). These capabilities can be exploited in graphical layouts.

An example in which an image helps thinking that is primarily logical

● Three-term series problems: John is taller than Mary and John is shorter than Fred. Who is tallest, who is shortest?

● A common way to solve this problem is by using “spatial paralogic”: construct a list; read off Tallest at the top and Shortest at the bottom (see the sketch below).

● What does this assume about the image format/architecture?

[Figure: the three names – John, Mary, Fred – arranged as a vertical list]
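Here is a small illustrative Python sketch (mine, not from the lecture) of "spatial paralogic": each premise is used to insert a name into an ordered list, and "tallest"/"shortest" are then simply read off the ends – which is exactly where the assumptions listed on the next slide do their work.

```python
# Sketch of "spatial paralogic" for three-term series problems:
# build an ordered list from the premises, then read answers off the ends.

def solve(premises):
    """premises: list of (taller, shorter) pairs, e.g. ('John', 'Mary')."""
    order = []                                   # top (tallest) ... bottom (shortest)
    for taller, shorter in premises:
        for name in (taller, shorter):
            if name not in order:
                order.append(name)
        # keep the pair consistent with the premise
        if order.index(taller) > order.index(shorter):
            order.remove(taller)
            order.insert(order.index(shorter), taller)
    return order

order = solve([("John", "Mary"), ("Fred", "John")])   # John > Mary, Fred > John
print(order)          # ['Fred', 'John', 'Mary']
print("tallest:", order[0], " shortest:", order[-1])
# Note: reading off 'Fred is taller than Mary' is valid only because "taller"
# happens to map consistently onto the vertical order -- see the assumptions below.
```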

Some assumptions about the image medium

● When two items are entered in the image, their relative locations remain fixed despite certain operations on the image (moves)

● When a third item is entered with a certain spatial relation to an earlier one, the relation between the first two may remain unchanged

● The spatial relations in the image continue to be a correct model of the intended relation (e.g., “taller”) even for relations between pairs not originally intended – e.g., reading off ‘Fred is taller than Mary’ is a valid inference. But note that many pair-wise relations do not validly map onto the vertical dimension: e.g., married to, loves, stands next to, …

● These image assumptions are true of a picture drawn on a rigid surface, but why must they be true in the mind? (The Frame Problem in robotics; also the relevance problem in inference.)


A few difficulties with this view

● Indeterminacies are a problem: John is taller than Mary; Mary is shorter than Fred. Are Fred and John the same height? What do we do when we discover that John is shorter than Fred? Spatial location alone does not help – we still need symbols such as the arrow.

● Johnson-Laird: Reasoning by model. J-L found that the difficulty of the problem increases when more than one possible spatial model applies.


What is assumed? The fact that interspersing the plate between the spoon and the knife does not alter the previously encoded relation between spoon and knife, and between the knife and the cup, is attributable to the user’s knowledge of both the formal relation “to the right of” (e.g., that it survives interpolation) and the cohesion of the knife-cup pair, which were moved together because their relation was previously specified. Properties such as cohesion can be easily seen in a diagram, where all that is required in order to place the plate to the right of the spoon is to move the knife-cup column over by one. But rigidity of form is not a property of images; one would have to compute what happens to parts of a pattern when other parts are moved (cf. mental rotation).

What is the least you need to assume?

● Relative spatial relations need to be represented in some manner that is invariant over some transformations

● If spatial relations between individual items are represented in some way, recognizing them may require perceptual (not necessarily visual) pattern recognition. How this can happen without an internal display is the subject of Chapter 5 of the Things and Places book.

Example: Memorize this map so you can draw it accurately

From your memory:

● Which groups of 3 or more locations are collinear?
● Which locations are midway between two others?
● Which locations are closest to the center of the island?
● Which pairs of locations are at the same latitude?
● Which is the top-most (bottom-most) location?

If you could draw the map from memory using whatever properties you noticed and encoded, you could easily answer the questions by looking at your drawing – even if you had not encoded the relations in the queries.
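An illustrative Python sketch of the point being made (my own; the landmark names and coordinates are made up): once a layout is externalized as actual coordinates, queries you never encoded – collinearity, same latitude, nearness to the center – can be answered by straightforward computation "by looking," rather than having to be stored in advance.

```python
# Sketch: queries that can be "read off" an externalized layout even though
# none of these relations was explicitly encoded when the map was memorized.
from itertools import combinations
from math import dist

locations = {          # hypothetical island landmarks (x, y)
    "beach": (1, 1), "hut": (3, 3), "well": (5, 5),
    "tree": (2, 6), "rock": (7, 6),
}

def collinear(p, q, r, tol=1e-9):
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) < tol

# Which triples of locations are collinear?
print([trio for trio in combinations(locations, 3)
       if collinear(*(locations[n] for n in trio))])      # only beach-hut-well

# Which pairs are at the same latitude (same y)?
print([(a, b) for a, b in combinations(locations, 2)
       if locations[a][1] == locations[b][1]])            # tree-rock

# Which location is closest to the center of the layout?
cx = sum(x for x, _ in locations.values()) / len(locations)
cy = sum(y for _, y in locations.values()) / len(locations)
print(min(locations, key=lambda n: dist(locations[n], (cx, cy))))
```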

Draw a rectangle. Draw a line from each of the bottom corners to a point on the opposite vertical side. Do these two lines intersect? Is the point of intersection of the two lines below or above the midpoint? Does it depend on the particular rectangle you drew? (A worked computation follows the figure.)

[Figure: rectangle with corners labeled A, B, C, D, marked points m and m′ on the vertical sides, and coordinates x, y]
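As a worked illustration (my own construction; the lecture's figure may label things differently), the following Python fragment computes the intersection of the two lines for a rectangle of width w and height h, with the lines drawn from the bottom corners to points at heights a and b on the opposite vertical sides. For points at the top corners (a = b = h) the intersection sits exactly at half the height, it moves lower as a and b decrease, and its height does not depend on the width.

```python
# Worked sketch: where do the two lines cross in a w-by-h rectangle?
# Line 1: from bottom-left (0, 0) to the point (w, a) on the right side.
# Line 2: from bottom-right (w, 0) to the point (0, b) on the left side.

def intersection(w, a, b):
    """Crossing point of the two lines; note the height y is independent of w."""
    # Line 1: y = (a / w) * x ; Line 2: y = b - (b / w) * x
    x = b * w / (a + b)
    y = a * b / (a + b)
    return x, y

w, h = 4.0, 2.0
print(intersection(w, h, h))          # (2.0, 1.0): exactly at half the height h/2
print(intersection(w, 1.5, 1.0))      # y = 0.6 < h/2: lower when a, b < h
print(intersection(10.0, 1.5, 1.0))   # same y = 0.6: the width does not matter
```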

Which properties of a real diagram also hold for a mental diagram?

● A mental “diagram” does not have any of the properties that a real diagram gets from being on a rigid 2D surface.

● When you imagine 3 points on a line, labeled A, B, and C, must B be between A and C? What makes that so? Is the distance AC greater than the distance AB or BC?

● When you imagine drawing point C after having drawn points A and B, must the relation between A and B remain unchanged (e.g., the distance between them, their qualitative relation such as above or below)? Why?

● These questions raise what is known as the frame problem in Artificial Intelligence. If you plan a sequence of actions, how do you know which properties of the world a particular action will change and which it will not, given that there are an unlimited number of properties and connections in the world?

What happens when we fail to make the represented-representation distinction