vision and visual neuroscience ii - mit9.520/spring09/classes/class22_ts... · 2010. 1. 22. ·...

Vision and visual neuroscience II

Thomas Serre & Tomaso Poggio

McGovern Institute for Brain Research

Department of Brain & Cognitive Sciences

Massachusetts Institute of Technology

Past lecture

Problem of visual recognition and visual cortex

Historical background

Neurons and areas in the visual system

Feedforward hierarchical models

Hierarchical anatomical organization

Felleman & van Essen 1991

source: Jim DiCarlo

Object recognition in the visual cortex

Ventral visual stream

source: Jim DiCarlo



Hierarchical architecture: latencies, anatomy, function

source: Jim DiCarlo


Hubel & Wiesel 1959, 1962, 1965, 1968

Nobel prize 1981

simple cells

complex cells


Kobatake & Tanaka 1994

see also Oram & Perrett 1993; Sheinberg & Logothetis 1996; Gallant et al 1996; Riesenhuber & Poggio 1999

Gradual increase in complexity of preferred stimulus

Parallel increase in invariance properties (position and scale) of neurons

Rapid recognition: monkey electrophysiology

Hung* Kreiman* Poggio & DiCarlo 2005

Robust invariant readout of category information from small population of neurons

Single spikes after response onset carry most of the information

Rapid recognition: human behavior

Thorpe et al 1996

Computational considerations

Simple units: template matching, Gaussian-like tuning (~ “AND”)

Complex units: invariance, max-like operation (~ “OR”)

Riesenhuber & Poggio 1999 (building on Fukushima 1980 and Hubel & Wiesel 1962)
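The two operations can be illustrated with a minimal Python sketch. It assumes a Gaussian function of the distance between input and template for the simple unit and a plain maximum for the complex unit; the function names, the toy template and the value of sigma are hypothetical and chosen only for illustration.

```python
import math

def simple_unit(x, w, sigma=1.0):
    """Gaussian-like tuning: the response is 1 when input x equals template w
    and falls off with the squared distance between them."""
    sq_dist = sum((wj - xj) ** 2 for wj, xj in zip(w, x))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def complex_unit(afferent_responses):
    """Max-like pooling: the response equals that of the strongest afferent."""
    return max(afferent_responses)

# Hypothetical toy template and three candidate inputs.
template = [1.0, 0.0, 1.0]
inputs = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]

s = [simple_unit(x, template) for x in inputs]  # one tuned response per input
c = complex_unit(s)                             # pooled response
print(s, c)
```

The simple unit responds strongly only when all components of its template are present together (hence the "AND" reading), while the complex unit responds as soon as any one of its afferents does (hence "OR").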

[Figure: model of the ventral stream mapped onto cortical areas. Legend: simple cells (tuning), complex cells (MAX); main routes and bypass routes. Pathways: dorsal stream ('where' pathway) and ventral stream ('what' pathway), from V1 and V2 through V4, PIT and AIT to prefrontal cortex. Model layers: S1, C1, S2, C2, S2b, C2b, S3, C3, S4 and classification units (animal vs. non-animal). RF sizes listed: 0.2° - 1.1°, 0.4° - 1.6°, 0.6° - 2.4°, 1.1° - 3.0°, 0.9° - 4.4°, 1.2° - 3.2°, and 7° for S4, C2b and C3. Numbers of units per layer range from 10^0 to 10^7. Side labels: increase in complexity (number of subunits), RF size and invariance; unsupervised task-independent learning; supervised task-dependent learning.]

(Riesenhuber & Poggio 1999 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva & Poggio 2007)

✦ V1:
• Simple and complex cells tuning properties (Schiller et al 1976; Hubel & Wiesel 1965; DeValois et al 1982)
• MAX operation in subset of complex cells (Lampl et al 2004)

✦ V4:
• Tuning for two-bar stimuli (Reynolds Chelazzi & Desimone 1999)
• MAX operation (Gawne et al 2002)
• Two-spot interaction (Freiwald et al 2005)
• Tuning for boundary conformation (Pasupathy & Connor 2001)
• Tuning for Cartesian and non-Cartesian gratings (Gallant et al 1996)

✦ IT:
• Tuning and invariance properties (Logothetis et al 1995)
• Differential role of IT and PFC in categorization (Freedman et al 2001 2002 2003)
• Read out data (Hung Kreiman Poggio & DiCarlo 2005)
• Average effect in IT (Zoccolan Cox & DiCarlo 2005; Zoccolan Kouh Poggio & DiCarlo in press)

✦ Human behavior:
• Rapid animal categorization (Serre Oliva Poggio 2007)

This lecture

1. Learning a loose hierarchy of image fragments
   The algorithm
   Recognition in the real-world

2. Rapid recognition and feedforward processing
   Predicting human performance
   “Clutter problem”

3. Beyond feedforward processing
   Top-down cortical feedback and attention to solve the “clutter problem”
   Predicting human eye movements

S1 units

Gabor filters

Parameters fit to V1 data (Serre & Riesenhuber 2004)

17 spatial frequencies (= scales)

4 orientations
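A bank of this shape can be sketched in Python with NumPy. The filter sizes, wavelengths, envelope widths, aspect ratio and the choice of orientations (0, 45, 90 and 135 degrees) below are placeholders chosen for illustration; they are not the values fitted to V1 data in Serre & Riesenhuber 2004.

```python
import numpy as np

def gabor(size, wavelength, sigma, theta, gamma=0.3):
    """Even-phase Gabor filter on a size x size grid (size must be odd)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)     # coordinate along the carrier
    yr = -x * np.sin(theta) + y * np.cos(theta)    # coordinate across the carrier
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    g = envelope * np.cos(2.0 * np.pi * xr / wavelength)
    g -= g.mean()                  # zero mean: no response to uniform input
    return g / np.linalg.norm(g)   # unit norm

# Hypothetical parameter schedule (assumption, not the fitted values).
sizes = list(range(7, 40, 2))                 # 17 filter sizes, one per scale
thetas = [k * np.pi / 4 for k in range(4)]    # 4 orientations
bank = {(s, k): gabor(s, wavelength=0.8 * s, sigma=0.4 * s, theta=t)
        for s in sizes for k, t in enumerate(thetas)}
print(len(bank))
```

Each entry of `bank` is indexed by (filter size, orientation index), so an S1 map for one scale and orientation would be obtained by filtering the image with the corresponding entry.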

C1 units

Local max over pool of S1 cells

Increase in tolerance to position (and in RF size)

Increase in tolerance to scale
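The pooling step can be sketched as follows, assuming the S1 maps of one orientation at neighbouring scales have already been brought to a common shape. The pool size, stride and the function name `c1_pool` are hypothetical.

```python
import numpy as np

def c1_pool(s1_maps, pool=4, stride=2):
    """Max over a band of S1 maps (same orientation, neighbouring scales),
    followed by a local max over pool x pool spatial neighbourhoods."""
    band = np.max(np.stack(s1_maps), axis=0)      # tolerance to scale
    h, w = band.shape
    rows = range(0, h - pool + 1, stride)
    cols = range(0, w - pool + 1, stride)
    return np.array([[band[r:r + pool, c:c + pool].max() for c in cols]
                     for r in rows])              # tolerance to position

# Toy S1 maps: a single active unit, then the same unit shifted by one pixel.
a = np.zeros((8, 8)); a[3, 3] = 1.0
b = np.zeros((8, 8)); b[2, 2] = 1.0
print(np.array_equal(c1_pool([a]), c1_pool([b])))
```

In this toy case the one-pixel shift leaves the C1 output unchanged, which is the sense in which the local max buys tolerance to position; pooling over the stacked maps does the same for scale.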

Receptive field sizes
                Model                 Cortex                 References
simple cells    0.2° - 1.1°           ≈ 0.1° - 1.0°          [Schiller et al., 1976e; Hubel and Wiesel, 1965]
complex cells   0.4° - 1.6°           ≈ 0.2° - 2.0°

Peak frequencies (cycles / deg)
                Model                 Cortex                 References
simple cells    range: 1.6 - 9.8      bulk ≈ 1.0 - 4.0       [DeValois et al., 1982a]
                mean/med: 3.7/2.8     mean: ≈ 2.2
                                      range: ≈ 0.5 - 8.0
complex cells   range: 1.8 - 7.8      bulk ≈ 2.0 - 5.6
                mean/med: 3.9/3.2     mean: 3.2
                                      range: ≈ 0.5 - 8.0

Frequency bandwidth at 50% amplitude (cycles / deg)
                Model                 Cortex                 References
simple cells    range: 1.1 - 1.8      bulk ≈ 1.0 - 1.5       [DeValois et al., 1982a]
                med: ≈ 1.45           med: ≈ 1.45
                                      range: ≈ 0.4 - 2.6
complex cells   range: 1.5 - 2.0      bulk ≈ 1.0 - 2.0
                med: 1.6              med: 1.6
                                      range: ≈ 0.4 - 2.6

Frequency bandwidth at 71% amplitude (index)
                Model                 Cortex                 References
simple cells    range: 44 - 58        bulk ≈ 40 - 70         [Schiller et al., 1976d]
                med: 55
complex cells   range: 40 - 50        bulk ≈ 40 - 60
                med: 48

Orientation bandwidth at 50% amplitude (degrees)
                Model                 Cortex                 References
simple cells    range: 38° - 49°      -                      [DeValois et al., 1982b]
                med: 44°
complex cells   range: 27° - 33°      bulk ≈ 20° - 90°
                med: 43°              med: 44°

Orientation bandwidth at 71% amplitude (degrees)
                Model                 Cortex                 References
simple cells    range: 27° - 33°      bulk ≈ 20° - 70°       [Schiller et al., 1976c]
                med: 30°
complex cells   range: 27° - 33°      bulk ≈ 20° - 90°
                med: 31°

Serre & Riesenhuber 2004

[Figure: response (0 to 1) as a function of orientation (in degrees) for three stimuli: optimal bar, edge, grating]

S2 units

Features of moderate complexity (n ~ 1,000 types)

Combination of V1-like complex units at different orientations

Synaptic weights w learned from natural images

5-10 subunits chosen at random from all possible afferents (~100-1,000)

S2 units

Neurons in monkey visual area V2 encode combinations of orientations
Akiyuki Anzai, Xinmiao Peng & David C Van Essen
Nature Neuroscience 10, 1313 - 1321 (2007) / Published online: 16 September 2007 | doi:10.1038/nn1975

[Figure from Anzai et al: panels a to f showing example V1 and V2 neurons (peak rates between 11 and 32 spikes per s). Labels: stronger facilitation, stronger suppression, homogeneous fields, cross-orientation fields.]

C2 units

Same selectivity as S2 units but increased tolerance to position and size of preferred stimulus

Local pooling over S2 units with same selectivity but slightly different positions and scales

S2 units in V2 and C2 in V4?

Beyond C2 units

Units increasingly complex and invariant

S3/C3 units:
Combination of V4-like units with different selectivities
Dictionary of ~1,000 features = num. columns in IT (Fujita 1992)

S4 units:
View-tuned units (imprinted with part of the training set, e.g. animal and non-animal images, but still unsupervised)
Tuning and invariance properties agree with IT data (Logothetis Pauls & Poggio 1995)

So why hierarchies?

Idea 1: Built-in invariance to 2D transformations (rotation and scale)

Idea 2: Generic features shared between multiple categories

Overall, both ideas reduce the “sample complexity”, i.e. the number of training examples needed to learn a task.

[Figure: the model diagram again, annotated by cortical area (V1, V2, V4, PIT, AIT, PFC)]

Features of increasing complexity and tolerance to position and scale

View-based object representation, but tolerant to position, scale and small rotations

Task-specific categorization circuits

[Figure: the model diagram again, annotated by cortical area (V1, V2, V4, PIT, AIT, PFC) with the evidence for adult plasticity at each stage, graded as very likely, likely, or limited evidence]

supervised learning from a handful of training examples ~ linear perceptron
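As a rough illustration of what a linear perceptron trained on a handful of examples looks like, here is a minimal sketch of the classic perceptron update rule. The two-dimensional feature vectors and their labels are hypothetical stand-ins for the responses of the top model layer; nothing here is taken from the actual classifier used with the model.

```python
def train_perceptron(features, labels, epochs=20, lr=1.0):
    """Classic perceptron rule; labels are +1 or -1."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(features, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if t * score <= 0:                       # misclassified: move the boundary
                w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                b += lr * t
    return w, b

def classify(x, w, b):
    """Sign of the linear score: +1 (e.g. animal) or -1 (non-animal)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Hypothetical toy data: four training examples with two features each.
features = [[0.9, 0.1], [0.8, 0.3], [0.1, 0.9], [0.2, 0.7]]
labels = [1, 1, -1, -1]
w, b = train_perceptron(features, labels)
print([classify(x, w, b) for x in features])
```

Because the heavy lifting (tuning and invariance) is done by the unsupervised layers below, the supervised stage in this picture only has to find a linear boundary in the space of top-layer responses.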

[Figure: the model diagram again, annotated by cortical area (V1, V2, V4, PIT, AIT, PFC)]

unsupervised developmental-like learning stage

Columns in the cortex

Layers of the model are organized in columns

Each model unit is equivalent to ~100 IF (~1 column of cortex)

Each hypercolumn contains the same basic dictionary of features and is replicated at all positions and scales


Learning is sequential

Start with layer S2/C2 then S2b/C2b and S3/C3

Pick one unit in layer Sk

Select random set of inputs from retinotopically organized afferents

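[Figure: a unit in layer Sk with output y, receiving inputs x1, x2, x3, ..., xj, ..., xp from layer Ck-1 through weights w1, w2, w3, ...]

Imprint with random patch of natural image: w = x

Response of the imprinted unit:

$$ y = \exp\left( -\frac{1}{2\sigma^2} \sum_{j=1}^{n} (w_j - x_j)^2 \right) $$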

✦ We learn ~1,000 units this way and then move to the next layer

✦ Learning follows a long tradition of researchers who have argued that the visual system may be adapted to the statistics of the natural environment (Attneave 1954; Barlow 1961; Atick 1992; Ruderman 1994; Simoncelli & Olshausen 2001)

✦ Here we assume the input image moves (shifting and looming) so that the selectivity of the imprinted units gets replicated at all positions and scales
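The imprinting procedure described above (pick a unit, select a random set of afferents, set w = x from a random patch, respond with a Gaussian of the distance) can be sketched in Python with NumPy. The array layout of the C-layer maps, the patch size, the random toy inputs and all function names are assumptions made for illustration; real inputs would be C-layer responses to natural images.

```python
import numpy as np

def imprint_units(c_maps, n_units, patch=3, min_sub=5, max_sub=10, seed=0):
    """Imprint S_k units from random patches of C_{k-1} responses.

    c_maps: list of arrays of shape (n_features, height, width), one per image
    (hypothetical layout). Each unit is returned as (afferent indices, weights).
    """
    rng = np.random.default_rng(seed)
    units = []
    for _ in range(n_units):
        cm = c_maps[rng.integers(len(c_maps))]        # random image
        _, h, w = cm.shape
        r = int(rng.integers(0, h - patch + 1))       # random position
        c = int(rng.integers(0, w - patch + 1))
        x = cm[:, r:r + patch, c:c + patch].ravel()   # all possible afferents
        n_sub = int(rng.integers(min_sub, max_sub + 1))
        idx = rng.choice(x.size, size=n_sub, replace=False)
        units.append((idx, x[idx].copy()))            # imprint: w = x
    return units

def s_response(x_patch, unit, sigma=1.0):
    """y = exp(-sum_j (w_j - x_j)^2 / (2 sigma^2)) over the unit's afferents."""
    idx, w = unit
    x = np.asarray(x_patch).ravel()[idx]
    return float(np.exp(-np.sum((w - x) ** 2) / (2.0 * sigma ** 2)))

# Toy stand-in for C-layer responses to three images (random numbers).
toy_rng = np.random.default_rng(1)
c_maps = [toy_rng.random((4, 8, 8)) for _ in range(3)]
units = imprint_units(c_maps, n_units=10)
print(len(units), [len(idx) for idx, _ in units])
```

Replicating each imprinted unit at all positions and scales then amounts to evaluating `s_response` on every patch of every C-layer map with the same (indices, weights) pair.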


Learning invariances

w/ T. Masquelier & S. Thorpe (CNRS, France)

see also (Foldiak 1991; Perrett et al 1984; Wallis & Rolls, 1997; Einhauser et al 2002; Wiskott & Sejnowski 2002; Spratling 2005)

✦ Simple cells learn correlation in space (at the same time)

✦ Complex cells learn correlation in time

movie courtesy of Wolfgang Einhauser
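One way to make "complex cells learn correlation in time" concrete is a trace rule in the spirit of Foldiak 1991, sketched below. This is an illustrative assumption, not necessarily the rule used in the work with Masquelier & Thorpe; the learning rate, the trace constant and the toy input (a bar sweeping across four S1 positions) are hypothetical.

```python
def trace_rule(sequence, w, alpha=0.2, delta=0.5):
    """Hebbian learning gated by a temporal trace of the unit's own activity.

    trace(t) = (1 - delta) * trace(t-1) + delta * y(t)
    w_i     <- w_i + alpha * trace(t) * (x_i(t) - w_i)
    """
    w = list(w)
    trace = 0.0
    for x in sequence:
        y = sum(wi * xi for wi, xi in zip(w, x))       # current response
        trace = (1.0 - delta) * trace + delta * y      # memory of recent activity
        w = [wi + alpha * trace * (xi - wi) for wi, xi in zip(w, x)]
    return w

# A bar of the preferred orientation sweeps across 4 S1 positions, 50 times.
sweep = [[1.0 if i == pos else 0.0 for i in range(4)] for pos in range(4)]
movie = sweep * 50

w_start = [1.0, 0.0, 0.0, 0.0]                       # wired to one position only
w_trace = trace_rule(movie, w_start)                 # with temporal trace
w_no_trace = trace_rule(movie, w_start, delta=1.0)   # trace equals current response
print(w_trace, w_no_trace)
```

With the trace, activity evoked at the first position is still remembered when the bar reaches the next one, so the unit acquires weights to all four positions. With delta = 1 the unit has no memory and stays wired to its original position.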


S1 units

C1 unit

Learning a dictionary of shape-components in the visual cortex

Learning frequent image features during development

Object categories share reusable features

Large redundant vocabulary for implicit geometry


“critical” feature columns in IT

(Tanaka, 1996)



✦ Pre-attentive processing:
• “Loose collection of basic features” (Wolfe & Bennett 1997)
• “Unbound features” (Treisman et al)

✦ Computer vision:
• Component-based > holistic representation (Perona et al 1995, 1996, 2000; Heisele Serre & Poggio 2001, 2002)
• Features of intermediate complexity are optimal (Ullman, 2002)
• Bag of features (Csurka et al 2004; Sivic et al 2005; Sudderth et al 2005)

C2 vs. IT neurons

Model data: Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005

Experimental data: Hung* Kreiman* Poggio & DiCarlo 2005

[Figure: classification performance (0 to 1) of IT neurons vs. the model. Train condition: size 3.4°, position center. Test conditions (size, position): 3.4°, center; 1.7°, center; 6.8°, center; 3.4°, 2° horz.; 3.4°, 4° horz.]

Application to computer vision

Bio-motivated computer vision

Computer vision system based on the response properties of neurons in the ventral stream of the visual cortex

Serre Wolf & Poggio 2005; Wolf & Bileschi 2006; Serre et al 2007

Scene parsing and object recognition

Bio-motivated computer vision

Jhuang Serre Wolf & Poggio 2007

Action recognition in video sequences: motion-sensitive MT-like units

Recognition accuracy

Dataset        Dollar et al ‘05    model    chance
KTH Human      81.3%               91.6%    16.7%
Weiz. Human    86.7%               96.3%    11.1%
UCSD Mice      75.6%               79.0%    20.0%

★ Cross-validation: 2/3 training, 1/3 testing, 10 repeats

Jhuang Serre Wolf & Poggio ICCV’07
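The evaluation protocol (2/3 training, 1/3 testing, 10 repeats) can be sketched as repeated random hold-out. Whether the original splits were drawn uniformly at random or grouped in some other way (for example by subject) is not stated on the slide, so the uniform shuffle below is an assumption, as are the toy data and the nearest-mean classifier used to exercise it.

```python
import random

def repeated_holdout(samples, labels, fit, predict,
                     n_repeats=10, train_frac=2 / 3, seed=0):
    """Average test accuracy over n_repeats random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_train = round(train_frac * len(indices))
    accuracies = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, samples[i]) == labels[i] for i in test)
        accuracies.append(hits / len(test))
    return sum(accuracies) / n_repeats, accuracies

# Hypothetical toy problem: two well separated 1-D clusters, nearest-mean classifier.
def fit_means(xs, ys):
    return {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c) for c in set(ys)}

def predict_nearest(means, x):
    return min(means, key=lambda c: abs(means[c] - x))

samples = [0.1 * i for i in range(15)] + [5.0 + 0.1 * i for i in range(15)]
labels = [0] * 15 + [1] * 15
mean_acc, accs = repeated_holdout(samples, labels, fit_means, predict_nearest)
print(mean_acc, len(accs))
```

Reporting the mean over the repeats, as in the table above, reduces the dependence of the accuracy estimate on any single split.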

Automatic recognition of rodent behavior

Performance

human agreement     72%
proposed system     71%
commercial system   56%
chance              12%

Serre Jhuang Garrote Poggio Steele in prep

This lecture: 2. Rapid recognition and feedforward processing







Database collected by Torralba & Oliva (2003)

Image categories: Head, Close-body, Medium-body, Far-body

Image sets: animals, natural distractors, artificial distractors

[Figure: model performance (d', axis from 1.0 to 2.6) for the Head, Close-body, Medium-body and Far-body conditions]

Model (82% correct)

Serre Oliva Poggio 2007
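The performance measure d' used on the plot is the standard sensitivity index from signal detection theory, the difference between the z-transformed hit rate and false-alarm rate. A minimal Python sketch follows; the rates passed in at the end are hypothetical, since the slide reports only the overall percent correct.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index: d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf          # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical rates for an animal vs. non-animal task.
print(d_prime(0.82, 0.18))
```

Unlike percent correct, d' separates sensitivity from response bias: an observer who says "animal" more often raises both hits and false alarms, which leaves d' roughly unchanged.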

“Clutter effect”

High performance (~90%) when the maximal amount of information is present, in the absence of clutter

Performance decreases (~74%) with increasing amount of clutter

Limitation of the feedforward model is compatible with the decrease in response in V4 (Reynolds Chelazzi & Desimone 1999) and IT in the presence of clutter (Zoccolan, Cox, DiCarlo, 2005; Zoccolan, Kouh, Poggio, DiCarlo, in sub; Rolls, Aggelopoulos, Zheng, 2003)

Animal present or not?

Paradigm: image (20 ms), image-mask interval (30 ms ISI), mask of 1/f noise (80 ms)

(Thorpe et al 1996; Van Rullen & Koch 2003; Bacon-Mace et al 2005)

Same effect for human observers!

[Figure: performance (d', axis from 1.0 to 2.6) for the Head, Close-body, Medium-body and Far-body conditions: model (82% correct) vs. human observers (80% correct, n=24)]

Image-by-image correlation:

Heads: ρ=0.71

Close-body: ρ=0.84

Medium-body: ρ=0.71

Far-body: ρ=0.60
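An image-by-image correlation of this kind compares, for every image, a model score with the fraction of human observers who responded "animal". The slide does not say which coefficient ρ denotes; the sketch below assumes a Pearson correlation, and the per-image numbers are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical per-image values: model confidence and human "animal" response rate.
model_scores = [0.9, 0.7, 0.4, 0.8, 0.2, 0.6]
human_rates = [0.95, 0.60, 0.50, 0.85, 0.10, 0.70]
print(pearson(model_scores, human_rates))
```

A high value means that the images the model finds hard are also the ones human observers tend to get wrong, which is a stronger test than matching the average accuracy.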

Further comparisons

Model predicts level of performance on rotated images (90 deg and inversion)

Show MATLAB demo

This lecture: 3. Beyond feedforward processing







Spatial attention solves the “clutter problem”

see also Broadbent 1952 1954; Treisman 1960; Treisman & Gelade 1980; Duncan & Desimone 1995; Wolfe, 1997; and many others

Problem: How to know where to attend?




[Figure labels: foreground, background]


Science 22 April 2005: Vol. 308, no. 5721, pp. 529 - 534

Parallel and Serial Neural Mechanisms for Visual Search in Macaque Area V4

Narcisse P. Bichot, Andrew F. Rossi, Robert Desimone



Answer: Parallel feature-based attention

Parallel feature-based attention modulation

[Figure: normalized spike activity (0 to 2) vs. time from fixation (0 to 200 ms), two panels]

... the preferred feature was cued (22). Neurons responded better to their preferred feature in the RF compared to nonpreferred features (Fig. 3, A and B) (color, P < 0.01; shape, P < 0.001). In the key test, we found that responses were enhanced if the distracter in the RF was of the neuron’s preferred color and it was also the same color (but, by design, not the same shape) as the color-shape conjunction target (Fig. 3A) (P = 0.002). In other words, the distracter shared in the bias for the target stimulus if it shared one of its features, consistent with the predictions of parallel search models. The median enhancement was 8%, with more than 86% of the neurons having a larger response when the RF stimulus shared a feature with the searched-for target (chi-square, P < 0.005). There was also an enhancement of the response when the shape of the distracter matched the shape of the color-shape conjunction target, consistent with parallel models, but this enhancement was smaller and developed later than the color-related enhancement (Fig. 3, A and B). When the RF distracter was of the preferred feature, shape-related enhancement was not significant in the same time interval as that used in the feature search task, but it became significant ≈150 ms after fixation onset (P = 0.035). This is consistent with the behavioral evidence described above, that the monkey used the color feature more than the shape feature in guiding its search to the color-shape conjunction target (fig. S2B). The LFP magnitude (Fig. 3, C and D) and power were not modulated by stimulus or cue features in the conjunction task.

There was also significant enhancement of the spike-field coherence in the gamma band when the RF distracter had the neuron’s preferred feature and that feature was in common with the target for either a color (Fig. 3E) (P < 10⁻⁵) or shape (Fig. 3F) (P < 0.001) match. The enhancement in the latter case was smaller, again consistent with the monkey’s behavioral bias in favor of using color information. The median enhancement of coherence with a color match was 22%, with 97% of spike-LFP pairs showing an increase (chi-square, P < 10⁻⁵), and the median enhancement with a shape match was 17%, with 78% of spike-LFP pairs showing an increase (P < 0.002). Thus, the top-down bias in visual search is not limited to cases in which the RF stimulus is the search target but instead applies to any stimulus, even a distracter, that contains a feature relevant to the search, consistent with parallel models. It is also consistent with the results from the feature search task, in which we found that enhancement occurred for colors that were similar to the target color. Both results potentially explain why search is often more difficult when the distracters share features with the target, as in some forms of conjunction search (8).

Serial selection during search. Finally, although we have emphasized the evidence for parallel mechanisms in search, the task necessarily had a spatial attention (serial) component to it, in that the animals made several saccades to stimuli in the array while searching for the targets. To test for spatial attention effects on responses, we compared responses and spike-field synchronization to a stimulus in the RF when either it was selected for a saccade or the saccade was made to a stimulus outside the RF (Fig. 4).

Selecting the RF stimulus for a saccade led to an enhancement of the neuronal response across the population (Fig. 5A) (population median enhancement of 36%, P < 10⁻⁵, ...

Test for serial (spatial) selection

[Figure 4 panels: RF stimulus is target of saccade vs. RF stimulus is not target of saccade; labels: SACCADE, RF, FIX]

Fig. 4. Illustration of the saccade enhancement analysis. We compared neuronal measures when the monkey made a saccade to an RF stimulus versus a saccade away from the RF. In this display, fixating the purple cross, for example, brings the green star into the neuron’s RF. We would then compare neuronal responses when the green star in the RF was the target of the saccade, to those when the saccade target was to a stimulus outside the RF, e.g., the orange A. Activity was analyzed from the time the purple cross was fixated to when the next saccade was initiated.

[Figure 3 panels A to F: color effect and shape effect on normalized spike activity and normalized LFP vs. time from fixation (ms), and on spike-field coherence vs. frequency (Hz)]

Fig. 3. (A to F) Feature-related enhancement of neuronal activity and synchronization during conjunction search. Conventions are as given in Fig. 2.


Serial spatial attention modulation

[Figure: normalized spike activity (0 to 2) vs. time from fixation (0 to 200 ms); conditions: attend within RF, attend away from RF]

Attention as Bayesian inference

Chikkerur Serre & Poggio in prep

see also Rao 2005; Lee & Mumford 2003

[Figure: hierarchy V2, V4/PIT, IT, PFC, built up in steps: feature-based attention first, then spatial attention with FEF/LIP added]

[Figure: graphical model with nodes O, Fi, Fli, I, L, N; labels: location priors, object priors]
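To give a flavour of how priors over objects and locations could combine with image evidence, here is a deliberately simplified Python sketch of a posterior over locations. It is not the model of Chikkerur, Serre & Poggio; it assumes only that object identity O and location L are independent a priori and that a likelihood P(I | O, L) is available. All numbers and names are hypothetical.

```python
def location_posterior(location_prior, object_prior, likelihood):
    """P(L | I) proportional to P(L) * sum_O P(O) * P(I | O, L).

    likelihood[o][l] is the (hypothetical) probability of the image evidence
    if object o were at location l.
    """
    unnorm = [
        p_l * sum(p_o * likelihood[o][l] for o, p_o in enumerate(object_prior))
        for l, p_l in enumerate(location_prior)
    ]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Toy scene: 3 locations, 2 candidate objects (evidence for object 0 at
# location 0 and for object 1 at location 2).
likelihood = [[0.8, 0.1, 0.1],
              [0.1, 0.1, 0.8]]
flat3, flat2 = [1 / 3] * 3, [0.5, 0.5]

no_attention = location_posterior(flat3, flat2, likelihood)
feature_based = location_posterior(flat3, [0.9, 0.1], likelihood)   # object prior favours object 0
spatial = location_posterior([0.1, 0.1, 0.8], flat2, likelihood)    # location prior favours location 2
print(no_attention, feature_based, spatial)
```

In this reading, feature-based attention corresponds to sharpening the object prior, which boosts every location whose features match the target in parallel, while spatial attention corresponds to sharpening the location prior.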
