vision and visual neuroscience ii - mit9.520/spring09/classes/class22_ts... · 2010. 1. 22. ·...
![Page 1: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/1.jpg)
Vision and visual neuroscience II
Thomas Serre & Tomaso Poggio
McGovern Institute for Brain Research
Department of Brain & Cognitive Sciences
Massachusetts Institute of Technology
![Page 2: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/2.jpg)
Past lecture
Problem of visual recognition and visual cortex
Historical background
Neurons and areas in the visual system
Feedforward hierarchical models
![Page 3: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/3.jpg)
Hierarchical anatomical organization
Felleman & van Essen 1991
![Page 4: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/4.jpg)
Object recognition in the visual cortex
Ventral visual stream
Hierarchical architecture: latencies, anatomy, function
source: Jim DiCarlo
![Page 10: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/10.jpg)
Object recognition in the visual cortex
Hubel & Wiesel 1959, 1962, 1965, 1968
Nobel prize 1981
simple cells
complex cells
![Page 11: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/11.jpg)
Object recognition in the visual cortex
Kobatake & Tanaka 1994
see also Oram & Perrett 1993; Sheinberg & Logothetis 1996; Gallant et al 1996; Riesenhuber & Poggio 1999
gradual increase in complexity of preferred stimulus
![Page 12: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/12.jpg)
Object recognition in the visual cortex
Parallel increase in invariance properties (position and scale) of neurons
Kobatake & Tanaka 1994
see also Oram & Perrett 1993; Sheinberg & Logothetis 1996; Gallant et al 1996; Riesenhuber & Poggio 1999
![Page 13: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/13.jpg)
Rapid recognition: monkey electrophysiology
Hung* Kreiman* Poggio & DiCarlo 2005
Robust invariant readout of category information from small population of neurons
Single spikes after response onset carry most of the information
![Page 14: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/14.jpg)
Thorpe et al ‘96
Rapid recognition: human behavior
![Page 15: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/15.jpg)
Computational considerations
Simple units: template matching, Gaussian-like tuning, ~ “AND”
Complex units: invariance, max-like operation, ~ “OR”
Riesenhuber & Poggio 1999 (building on Fukushima 1980 and Hubel & Wiesel 1962)
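The two operations named on this slide can be illustrated with a minimal sketch. The function names, the `sigma` value and the toy vectors are assumptions made for illustration; they are not taken from the lecture.

```python
import math

def tuning(x, w, sigma=1.0):
    """Simple-unit sketch: Gaussian-like match between input x and template w.

    The response is high only if all components match (an "AND"-like behavior).
    """
    dist2 = sum((wj - xj) ** 2 for wj, xj in zip(w, x))
    return math.exp(-dist2 / (2.0 * sigma ** 2))

def max_pool(responses):
    """Complex-unit sketch: max-like pooling over afferents ("OR"-like)."""
    return max(responses)

template = [1.0, 0.0, 1.0]                      # hypothetical stored template
match = tuning([1.0, 0.0, 1.0], template)       # every component matches
partial = tuning([1.0, 0.0, 0.0], template)     # one mismatch lowers the response
pooled = max_pool([partial, match, 0.1])        # strongest afferent wins
```

The max passes on the strongest afferent regardless of which one it is, which is what gives the complex unit its tolerance.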
![Page 16: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/16.jpg)
[Figure: model architecture alongside the ventral stream. Cortical areas: V1, V2, V4, PIT, AIT and prefrontal cortex on the ventral ('what') pathway, with the dorsal ('where') pathway shown for reference. Model layers: S1, C1, S2, C2, S2b, C2b, S3, C3, S4 and classification units (animal vs. non-animal). Simple cells perform tuning, complex cells perform MAX; main routes and bypass routes are marked. RF sizes grow from 0.2°-1.1° (S1) and 0.4°-1.6° (C1) to about 7° (S4, C2b, C3). Complexity (number of subunits), RF size and invariance increase along the hierarchy. Learning is unsupervised and task-independent in the lower layers, supervised and task-dependent at the top.]
(Riesenhuber & Poggio 1999 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva & Poggio 2007)
✦ V1:
• Simple and complex cells tuning properties (Schiller et al 1976; Hubel & Wiesel 1965; De Valois et al 1982)
• MAX operation in subset of complex cells (Lampl et al 2004)
✦ V4:
• Tuning for two-bar stimuli (Reynolds Chelazzi & Desimone 1999)
• MAX operation (Gawne et al 2002)
• Two-spot interaction (Freiwald et al 2005)
• Tuning for boundary conformation (Pasupathy & Connor 2001)
• Tuning for Cartesian and non-Cartesian gratings (Gallant et al 1996)
✦ IT:
• Tuning and invariance properties (Logothetis et al 1995)
• Differential role of IT and PFC in categorization (Freedman et al 2001 2002 2003)
• Read out data (Hung Kreiman Poggio & DiCarlo 2005)
• Average effect in IT (Zoccolan Cox & DiCarlo 2005; Zoccolan Kouh Poggio & DiCarlo in press)
✦ Human behavior:
• Rapid animal categorization (Serre Oliva & Poggio 2007)
![Page 17: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/17.jpg)
This lecture
1. Learning a loose hierarchy of image fragments
   The algorithm
   Recognition in the real-world
2. Rapid recognition and feedforward processing:
   Predicting human performance
   “Clutter problem”
3. Beyond feedforward processing:
   Top-down cortical feedback and attention to solve the “clutter problem”
   Predicting human eye movements
![Page 22: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/22.jpg)
S1 units
Gabor filters
Parameters fit to V1 data (Serre & Riesenhuber 2004)
17 spatial frequencies (= scales)
4 orientations
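A bank of Gabor filters with 17 scales and 4 orientations could be built roughly as follows. The slide gives only the counts, so everything else here is an assumption: the filter sizes (7 to 39 pixels in steps of 2), the rule tying `wavelength` and `sigma` to filter size, and the aspect ratio `gamma`. These are illustrative placeholders, not the values fitted to V1 data.

```python
import math

def gabor(size, theta, wavelength, sigma, gamma=0.3):
    """Return a size x size Gabor kernel (list of rows) at orientation theta."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's preferred orientation.
            x0 = x * math.cos(theta) + y * math.sin(theta)
            y0 = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(x0 ** 2 + (gamma * y0) ** 2) / (2.0 * sigma ** 2))
            row.append(envelope * math.cos(2.0 * math.pi * x0 / wavelength))
        kernel.append(row)
    return kernel

sizes = list(range(7, 41, 2))                        # 17 assumed filter sizes
orientations = [k * math.pi / 4 for k in range(4)]   # 0, 45, 90, 135 degrees
bank = {
    (s, k): gabor(s, th, wavelength=s / 2.0, sigma=s / 4.0)  # placeholder scaling
    for s in sizes
    for k, th in enumerate(orientations)
}
```

An S1 response map would then be obtained by filtering the image with each kernel in `bank`.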
![Page 23: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/23.jpg)
C1 units
Increase in tolerance to position (and in RF size)
Local max over pool of S1 cells
![Page 24: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/24.jpg)
C1 units
Increase in tolerance to scale
Local max over pool of S1 cells
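The local max over a pool of S1 cells, taken across neighboring positions and adjacent scales, might look like the sketch below. It assumes the S1 maps for the pooled scales have already been brought to the same grid size, and the pool size of 2 is arbitrary; both are simplifications chosen for illustration.

```python
def c1_pool(s1_maps, pool=2):
    """Max over a pool x pool neighborhood and over all maps in s1_maps.

    s1_maps: list of 2D response maps (same orientation, adjacent scales),
    all with the same shape.
    """
    rows, cols = len(s1_maps[0]), len(s1_maps[0][0])
    out = []
    for r in range(0, rows, pool):
        out_row = []
        for c in range(0, cols, pool):
            out_row.append(max(
                m[i][j]
                for m in s1_maps
                for i in range(r, min(r + pool, rows))
                for j in range(c, min(c + pool, cols))
            ))
        out.append(out_row)
    return out

# Two toy S1 maps standing in for two adjacent scales.
scale_a = [[0.1, 0.9, 0.0, 0.0],
           [0.0, 0.2, 0.0, 0.3]]
scale_b = [[0.0, 0.0, 0.0, 0.7],
           [0.4, 0.0, 0.0, 0.0]]
c1 = c1_pool([scale_a, scale_b], pool=2)
```

Each C1 value is driven by whichever S1 cell in its pool responds most, so small shifts or size changes of the stimulus leave the output unchanged.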
![Page 25: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/25.jpg)
| Property | Cells | Model | Cortex | References |
|---|---|---|---|---|
| Receptive field sizes | simple | 0.2°-1.1° | ≈ 0.1°-1.0° | Schiller et al., 1976e; Hubel and Wiesel, 1965 |
| | complex | 0.4°-1.6° | ≈ 0.2°-2.0° | |
| Peak frequencies (cycles/deg) | simple | range: 1.6-9.8; mean/med: 3.7/2.8 | bulk ≈ 1.0-4.0; mean ≈ 2.2; range ≈ 0.5-8.0 | DeValois et al., 1982a |
| | complex | range: 1.8-7.8; mean/med: 3.9/3.2 | bulk ≈ 2.0-5.6; mean: 3.2; range ≈ 0.5-8.0 | |
| Frequency bandwidth at 50% amplitude (cycles/deg) | simple | range: 1.1-1.8; med ≈ 1.45 | bulk ≈ 1.0-1.5; med ≈ 1.45; range ≈ 0.4-2.6 | DeValois et al., 1982a |
| | complex | range: 1.5-2.0; med: 1.6 | bulk ≈ 1.0-2.0; med: 1.6; range ≈ 0.4-2.6 | |
| Frequency bandwidth at 71% amplitude (index) | simple | range: 44-58; med: 55 | bulk ≈ 40-70 | Schiller et al., 1976d |
| | complex | range: 40-50; med: 48 | bulk ≈ 40-60 | |
| Orientation bandwidth at 50% amplitude (octaves) | simple | range: 38°-49°; med: 44° | n/a | DeValois et al., 1982b |
| | complex | range: 27°-33°; med: 43° | bulk ≈ 20°-90°; med: 44° | |
| Orientation bandwidth at 71% amplitude (octaves) | simple | range: 27°-33°; med: 30° | bulk ≈ 20°-70° | Schiller et al., 1976c |
| | complex | range: 27°-33°; med: 31° | bulk ≈ 20°-90° | |

Serre & Riesenhuber 2004
[Figure: response (0 to 1) vs. orientation (in degrees) for three stimuli: optimal bar, edge, grating]
![Page 26: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/26.jpg)
S2 units
Features of moderate complexity (n~1,000 types)
Combination of V1-like complex units at different orientations
Synaptic weights w learned from natural images
5-10 subunits chosen at random from all possible afferents (~100-1,000)
![Page 27: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/27.jpg)
S2 units
[Figure labels: stronger facilitation, stronger suppression; homogeneous fields, cross-orientation fields]
![Page 28: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/28.jpg)
Neurons in monkey visual area V2 encode combinations of orientations
Akiyuki Anzai, Xinmiao Peng & David C Van Essen
Nature Neuroscience 10, 1313-1321 (2007) / Published online: 16 September 2007 | doi:10.1038/nn1975
[Figure: panels a-f, maps for one V1 and five V2 example neurons, labelled 11 to 32 spikes per s; axes in degrees of visual angle]
![Page 29: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/29.jpg)
C2 units
Same selectivity as S2 units but increased tolerance to position and size of preferred stimulus
Local pooling over S2 units with same selectivity but slightly different positions and scales
S2 units in V2 and C2 in V4?
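The way pooling over S2 units with the same selectivity yields position tolerance can be seen in a one-dimensional toy example. The 1D signals, the template and `sigma` are invented for illustration, and the pool here spans all positions rather than a local neighborhood, which is a simplification.

```python
import math

def s2_map(signal, template, sigma=0.5):
    """Response of one Gaussian-tuned S2 template replicated at every position."""
    n = len(template)
    out = []
    for start in range(len(signal) - n + 1):
        d2 = sum((t - signal[start + k]) ** 2 for k, t in enumerate(template))
        out.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    return out

def c2_response(signal, template):
    """C2 sketch: max over S2 units sharing the same selectivity."""
    return max(s2_map(signal, template))

template = [0.0, 1.0, 0.5]                          # hypothetical preferred pattern
stimulus = [0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0]      # pattern near the left
shifted = [0.0, 0.0, 0.0, 0.0, 1.0, 0.5, 0.0]       # same pattern, moved right
```

The S2 maps for `stimulus` and `shifted` peak at different positions, while `c2_response` returns the same value for both.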
![Page 30: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/30.jpg)
Beyond C2 units
Units increasingly complex and invariant
S3/C3 units:
Combination of V4-like units with different selectivities
Dictionary of ~1,000 features = num. columns in IT (Fujita 1992)
S4 units:
View-tuned units (imprinted with part of the training set, e.g. animal and non-animal images, but still unsupervised)
Tuning and invariance properties agree with IT data (Logothetis Pauls & Poggio 1995)
![Page 31: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/31.jpg)
So why hierarchies?
Idea 1: Built-in invariance to 2D transformations (rotation and scale)
Idea 2: Generic features shared between multiple categories
Overall, both reduce “sample complexity”, i.e. the number of training examples needed to learn a task.
![Page 32: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/32.jpg)
[Figure: model architecture diagram mapped onto areas V1, V2, V4, PIT, AIT and PFC, with these annotations:]
features of increasing complexity and tolerance to position and scale
view-based object representation, but tolerant to position, scale and small rotations
Task-specific = categorization circuits
![Page 33: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/33.jpg)
Evidence for adult plasticity
[Figure: model architecture diagram with areas V1, V2, V4, PIT, AIT and PFC graded by strength of evidence: very likely, likely, limited evidence]
![Page 34: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/34.jpg)
supervised learning from a handful of training examples ~ linear perceptron
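The supervised stage named here, roughly a linear perceptron trained on a handful of examples, could be sketched as follows. The two-dimensional feature vectors stand in for the much larger vector of top-layer model responses and are invented for illustration; the classic perceptron update is an assumption about which linear rule to show, not a statement of what the model uses.

```python
def train_perceptron(features, labels, epochs=20, lr=1.0):
    """Classic perceptron rule; labels are +1 (animal) or -1 (non-animal)."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:                       # misclassified: update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Hypothetical feature vectors for four training images.
features = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.9], [0.1, 0.7]]
labels = [1, 1, -1, -1]
w, b = train_perceptron(features, labels)
```

Only `w` and `b` are learned with labels; the features themselves come from the unsupervised layers below.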
[Figure: model architecture diagram mapped onto areas V1, V2, V4, PIT, AIT and PFC; supervised task-dependent learning at the top, unsupervised task-independent learning in the layers below]
unsupervised developmental-like learning stage
![Page 35: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/35.jpg)
Columns in the cortex
![Page 36: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/36.jpg)
Layers of the model are organized in columns
Each model unit is equivalent to ~100 IF (~1 column of cortex)
Each hypercolumn contains the same basic dictionary of features and is replicated at all positions and scales
![Page 37: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/37.jpg)
Learning is sequential
Start with layer S2/C2 then S2b/C2b and S3/C3
Pick one unit in layer Sk
Select random set of inputs from retinotopically organized afferents
[Figure: one unit in layer Sk connected by weights w1, w2, w3 to afferents in layer Ck-1]
![Page 38: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/38.jpg)
Imprint with random patch of natural image: w = x
[Figure: unit in layer Sk with output y and weights w1, w2, w3, receiving inputs x1, x2, x3, ..., xj, ..., xp from layer Ck-1]
![Page 39: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/39.jpg)
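[Figure: unit in layer Sk with output y and weights w1, w2, w3, receiving inputs x1, x2, x3, ..., xj, ..., xp from layer Ck-1]

$$y = \exp\left(-\frac{1}{2\sigma^2}\sum_{j=1}^{n}(w_j - x_j)^2\right)$$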
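Imprinting (w = x on a random subset of afferents) followed by the Gaussian response can be sketched as below. The afferent count of 100, the choice of 8 subunits, `sigma` and the random vectors standing in for patches of C-layer activity are all assumptions chosen for illustration.

```python
import math
import random

def imprint_unit(c_patch, n_subunits, rng):
    """Pick afferents at random and store their current activity as weights (w = x)."""
    idx = sorted(rng.sample(range(len(c_patch)), n_subunits))
    weights = [c_patch[i] for i in idx]
    return idx, weights

def unit_response(c_patch, unit, sigma=1.0):
    """y = exp(-sum_j (w_j - x_j)^2 / (2 sigma^2)) over the selected afferents."""
    idx, weights = unit
    d2 = sum((w - c_patch[i]) ** 2 for i, w in zip(idx, weights))
    return math.exp(-d2 / (2.0 * sigma ** 2))

rng = random.Random(0)
patch = [rng.random() for _ in range(100)]      # stand-in for one image patch
unit = imprint_unit(patch, n_subunits=8, rng=rng)
other = [rng.random() for _ in range(100)]      # a different patch
```

The unit responds maximally (y = 1) to the patch it was imprinted with and less to any other patch.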
![Page 40: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/40.jpg)
✦ We learn ~1,000 units this way and then move to the next layer
✦ Learning follows a long tradition of researchers who have argued that the visual system may be adapted to the statistics of the natural environment (Attneave 1954; Barlow 1961; Atick 1992; Ruderman 1994; Simoncelli & Olshausen 2001)
✦ Here we assume the input image moves (shifting and looming) so that the selectivity of the imprinted units gets replicated at all positions and scales
![Page 41: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/41.jpg)
Learning invariances
w/ T. Masquelier & S. Thorpe (CNRS, France)
see also (Foldiak 1991; Perrett et al 1984; Wallis & Rolls, 1997; Einhauser et al 2002; Wiskott & Sejnowski 2002; Spratling 2005)
✦ Simple cells learn correlation in space (at the same time)
✦ Complex cells learn correlation in time
movie courtesy of Wolfgang Einhauser
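One way to learn from correlation in time is the trace rule of Foldiak (1991), cited above. The sketch below uses that rule purely as an illustration; it is not claimed to be the method of Masquelier & Thorpe, and the learning rate `alpha`, trace constant `delta` and the three-frame "sweeping bar" input are invented.

```python
def trace_rule(frames, w, alpha=0.5, delta=0.5):
    """Hebbian update gated by a temporally smoothed (trace) activity.

    frames: input vectors in the order they are seen.
    """
    trace = 0.0
    for x in frames:
        y = sum(wi * xi for wi, xi in zip(w, x))          # current activity
        trace = (1.0 - delta) * trace + delta * y         # memory of recent activity
        w = [wi + alpha * trace * (xi - wi) for wi, xi in zip(w, x)]
    return w

# A bar sweeping across three positions; the unit starts tuned to position 0 only.
frames = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
w0 = [1.0, 0.0, 0.0]
w1 = trace_rule(frames, w0)
```

Because the trace is still active when the bar reaches the next position, the unit acquires weights to inputs that follow each other in time.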
![Page 43: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/43.jpg)
Learning a dictionary of shape-components in the visual cortex
Learning frequent image features during development
Object categories share reusable features
Large redundant vocabulary for implicit geometry
![Page 47: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/47.jpg)
Learning a dictionary of shape-components in visual cortex
“critical” feature columns in IT (Tanaka, 1996)
✦ Pre-attentive processing:
• “Loose collection of basic features” (Wolfe & Bennett 1997)
• “Unbound features” (Treisman et al)
✦ Computer vision:
• Component-based > holistic representation (Perona et al 1995, 1996, 2000; Heisele Serre & Poggio 2001, 2002)
• Features of intermediate complexity are optimal (Ullman, 2002)
• Bag of features (Csurka et al 2004; Sivic et al 2005; Sudderth et al 2005)
![Page 50: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/50.jpg)
C2 vs. IT neurons
Model data: Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005 Experimental data: Hung* Kreiman* Poggio & DiCarlo 2005
[Figure: classification performance (0 to 1) for IT and Model. TRAIN: size 3.4°, position center. TEST (size, position): 3.4° center; 1.7° center; 6.8° center; 3.4°, 2° horz.; 3.4°, 4° horz.]
![Page 51: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/51.jpg)
Application to computer vision
![Page 52: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/52.jpg)
Bio-motivated computer vision
Computer vision system based on the response properties of neurons in the ventral stream of the visual cortex
Scene parsing and object recognition
Serre Wolf & Poggio 2005; Wolf & Bileschi 2006; Serre et al 2007
![Page 53: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/53.jpg)
Bio-motivated computer vision
Action recognition in video sequences
motion-sensitive MT-like units
Jhuang Serre Wolf & Poggio 2007
![Page 55: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/55.jpg)
Recognition accuracy

| Dataset | Dollar et al ’05 | model | chance |
| --- | --- | --- | --- |
| KTH Human | 81.3% | 91.6% | 16.7% |
| Weiz. Human | 86.7% | 96.3% | 11.1% |
| UCSD Mice | 75.6% | 79.0% | 20.0% |

★ Cross-validation: 2/3 training, 1/3 testing, 10 repeats
Jhuang Serre Wolf & Poggio ICCV’07
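The evaluation protocol named on this slide (random 2/3 training, 1/3 testing splits, repeated 10 times and averaged) can be sketched as follows. This is a minimal illustration, not the authors' code: `repeated_split_accuracy`, `nearest_mean_trainer`, `chance_level`, and the toy data are all hypothetical names and values, and the nearest-mean classifier merely stands in for the actual recognition system.

```python
import random


def repeated_split_accuracy(samples, labels, train_fn, repeats=10,
                            train_fraction=2 / 3, seed=0):
    """Mean test accuracy over `repeats` random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    cut = round(train_fraction * len(indices))  # size of the training set
    accuracies = []
    for _ in range(repeats):
        rng.shuffle(indices)
        train_idx, test_idx = indices[:cut], indices[cut:]
        # train_fn returns a function mapping one sample to a predicted label
        predict = train_fn([samples[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        hits = sum(predict(samples[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies)


def nearest_mean_trainer(train_x, train_y):
    """Hypothetical stand-in classifier: pick the class with the closest mean."""
    grouped = {}
    for x, y in zip(train_x, train_y):
        grouped.setdefault(y, []).append(x)
    means = {y: sum(v) / len(v) for y, v in grouped.items()}
    return lambda x: min(means, key=lambda y: abs(x - means[y]))


def chance_level(n_classes):
    """Accuracy of uniform random guessing among equally likely classes."""
    return 1.0 / n_classes


# Toy 1-D data with two well-separated classes (made up for illustration).
toy_x = [0.0, 0.5, 1.0, 0.2, 0.8, 0.3, 10.0, 10.5, 9.5, 10.2, 9.8, 10.1]
toy_y = [0] * 6 + [1] * 6
mean_accuracy = repeated_split_accuracy(toy_x, toy_y, nearest_mean_trainer)
```

The chance column in the table is consistent with uniform guessing: 1/6 ≈ 16.7%, 1/9 ≈ 11.1%, and 1/5 = 20.0%, which would correspond to 6, 9, and 5 classes respectively.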
![Page 56: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/56.jpg)
Automatic recognition of rodent behavior
Serre Jhuang Garrote Poggio Steele in prep
![Page 57: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/57.jpg)
Automatic recognition of rodent behavior
Performance:
human agreement 72%
proposed system 71%
commercial system 56%
chance 12%
Serre Jhuang Garrote Poggio Steele in prep
![Page 58: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/58.jpg)
This lecture
1. Learning a loose hierarchy of image fragments
The algorithm
Recognition in the real-world
2. Rapid recognition and feedforward processing:
Predicting human performance
“Clutter problem”
3. Beyond feedforward processing:
Top-down cortical feedback and attention to solve the “clutter problem”
Predicting human eye movements
![Page 59: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/59.jpg)
![Page 60: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/60.jpg)
Database collected by Torralba & Oliva (2003)
Head Close-body Medium-body Far-body
Animals
Natural distractors
Artificial distractors
![Page 61: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/61.jpg)
[Figure: Performance (d') for Head, Close-body, Medium-body and Far-body images; y-axis from 1.0 to 2.6]
Model (82% correct)
Serre Oliva Poggio 2007
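Performance in these plots is reported as d', the standard signal-detection sensitivity index. A minimal sketch of how d' is computed from a hit rate and a false-alarm rate is given below. The function name `d_prime`, the clipping constant `eps`, and the example rates are illustrative assumptions, not values taken from the study.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate, eps=1e-3):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    def clip(p):
        # Keep rates away from 0 and 1, where the z-transform is infinite.
        return min(max(p, eps), 1.0 - eps)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(clip(hit_rate)) - z(clip(false_alarm_rate))


# Hypothetical rates: "animal" reported on 85% of animal images
# and on 20% of distractor images.
example = d_prime(0.85, 0.20)
```

Unlike percent correct, d' separates sensitivity from response bias, which is one reason it is convenient for comparing a model with human observers.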
![Page 62: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/62.jpg)
High performance (~90%) when maximal amount of information is present, in the absence of clutter
Performance decreases (~74%) with increasing amount of clutter
Limitation of feedforward model compatible with decrease in response in V4 (Reynolds, Chelazzi & Desimone 1999) and IT in the presence of clutter (Zoccolan, Cox & DiCarlo 2005; Zoccolan, Kouh, Poggio & DiCarlo, in sub; Rolls, Aggelopoulos & Zheng 2003)
“Clutter effect”
[Figure: Performance (d') for Head, Close-body, Medium-body and Far-body images; y-axis from 1.0 to 2.6]
Model (82% correct)
![Page 63: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/63.jpg)
Animal present or not?
Image: 20 ms
Image-mask interval: 30 ms ISI
Mask (1/f noise): 80 ms
(Thorpe et al 1996; Van Rullen & Koch 2003; Bacon-Mace et al 2005)
![Page 64: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/64.jpg)
Same effect for human observers!
[Figure: Performance (d') for Head, Close-body, Medium-body and Far-body images, model vs. human observers; y-axis from 1.0 to 2.6]
Model (82% correct)
Human observers (80% correct, n=24)
Serre Oliva Poggio 2007
![Page 65: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/65.jpg)
Image-by-image correlation:
Heads: ρ=0.71
Close-body: ρ=0.84
Medium-body: ρ=0.71
Far-body: ρ=0.60
Model predicts level of performance on rotated images (90 deg and inversion)
Further comparisons
Serre Oliva Poggio 2007
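An image-by-image comparison of this kind can be sketched as a correlation between per-image model scores and per-image human scores. The slide does not say which coefficient ρ denotes; the sketch below assumes a Pearson correlation. The function name `pearson` and all scores are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical per-image scores: model confidence that an animal is present,
# and the fraction of observers who answered "animal" for the same image.
model_scores = [0.9, 0.7, 0.4, 0.8, 0.2]
human_scores = [0.95, 0.6, 0.5, 0.7, 0.3]
rho = pearson(model_scores, human_scores)
```

A high ρ here would mean that the model and the observers tend to find the same individual images easy or hard, which is a stronger statement than matching average accuracy.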
![Page 66: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/66.jpg)
Show MATLAB demo
![Page 67: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/67.jpg)
This lecture
1. Learning a loose hierarchy of image fragments
The algorithm
Recognition in the real-world
2. Rapid recognition and feedforward processing:
Predicting human performance
“Clutter problem”
3. Beyond feedforward processing:
Top-down cortical feedback and attention to solve the “clutter problem”
Predicting human eye movements
![Page 68: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/68.jpg)
![Page 69: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/69.jpg)
Spatial attention solves the “clutter problem”
see also Broadbent 1952, 1954; Treisman 1960; Treisman & Gelade 1980; Duncan & Desimone 1995; Wolfe 1997; and many others
Problem: How to know where to attend?
![Page 70: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/70.jpg)
Spatial attention solves the “clutter problem”
see also Broadbent 1952, 1954; Treisman 1960; Treisman & Gelade 1980; Duncan & Desimone 1995; Wolfe 1997; and many others
Problem: How to know where to attend?
foreground
![Page 71: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/71.jpg)
Spatial attention solves the “clutter problem”
see also Broadbent 1952, 1954; Treisman 1960; Treisman & Gelade 1980; Duncan & Desimone 1995; Wolfe 1997; and many others
Problem: How to know where to attend?
foreground
background
![Page 72: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/72.jpg)
Spatial attention solves the “clutter problem”
see also Broadbent 1952, 1954; Treisman 1960; Treisman & Gelade 1980; Duncan & Desimone 1995; Wolfe 1997; and many others
Problem: How to know where to attend?
foreground
background
![Page 73: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/73.jpg)
see also Broadbent 1952, 1954; Treisman 1960; Treisman & Gelade 1980; Duncan & Desimone 1995; Wolfe 1997; and many others
Science 22 April 2005: Vol. 308, no. 5721, pp. 529-534
Parallel and Serial Neural Mechanisms for Visual Search in Macaque Area V4
Narcisse P. Bichot, Andrew F. Rossi, Robert Desimone
Spatial attention solves the “clutter problem”
![Page 74: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/74.jpg)
see also Broadbent 1952, 1954; Treisman 1960; Treisman & Gelade 1980; Duncan & Desimone 1995; Wolfe 1997; and many others
Science 22 April 2005: Vol. 308, no. 5721, pp. 529-534
Parallel and Serial Neural Mechanisms for Visual Search in Macaque Area V4
Narcisse P. Bichot, Andrew F. Rossi, Robert Desimone
Spatial attention solves the “clutter problem”
Answer: Parallel feature-based attention
![Page 75: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/75.jpg)
Parallel feature-based attention modulation
[Figure: normalized spike activity (0 to 2) vs. time from fixation (0 to 200 ms), two panels]
![Page 76: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/76.jpg)
the preferred feature was cued (22). Neurons responded better to their preferred feature in the RF compared to nonpreferred features (Fig. 3, A and B) (color, P < 0.01; shape, P < 0.001). In the key test, we found that responses were enhanced if the distracter in the RF was of the neuron’s preferred color and it was also the same color (but, by design, not the same shape) as the color-shape conjunction target (Fig. 3A) (P = 0.002). In other words, the distracter shared in the bias for the target stimulus if it shared one of its features, consistent with the predictions of parallel search models. The median enhancement was 8%, with more than 86% of the neurons having a larger response when the RF stimulus shared a feature with the searched-for target (chi-square, P < 0.005). There was also an enhancement of the response when the shape of the distracter matched the shape of the color-shape conjunction target, consistent with parallel models, but this enhancement was smaller and developed later than the color-related enhancement (Fig. 3, A and B). When the RF distracter was of the preferred feature, shape-related enhancement was not significant in the same time interval as that used in the feature search task, but it became significant ~150 ms after fixation onset (P = 0.035). This is consistent with the behavioral evidence described above, that the monkey used the color feature more than the shape feature in guiding its search to the color-shape conjunction target (fig. S2B). The LFP magnitude (Fig. 3, C and D) and power were not modulated by stimulus or cue features in the conjunction task.
There was also significant enhancement of the spike-field coherence in the gamma band when the RF distracter had the neuron’s preferred feature and that feature was in common with the target for either a color (Fig. 3E) (P < 10⁻⁵) or shape (Fig. 3F) (P < 0.001) match. The enhancement in the latter case was smaller, again consistent with the monkey’s behavioral bias in favor of using color information. The median enhancement of coherence with a color match was 22%, with 97% of spike-LFP pairs showing an increase (chi-square, P < 10⁻⁵), and the median enhancement with a shape match was 17%, with 78% of spike-LFP pairs showing an increase (P < 0.002). Thus, the top-down bias in visual search is not limited to cases in which the RF stimulus is the search target but instead applies to any stimulus, even a distracter, that contains a feature relevant to the search, consistent with parallel models. It is also consistent with the results from the feature search task, in which we found that enhancement occurred for colors that were similar to the target color. Both results potentially explain why search is often more difficult when the distracters share features with the target, as in some forms of conjunction search (8).
Serial selection during search. Finally, although we have emphasized the evidence for parallel mechanisms in search, the task necessarily had a spatial attention (serial) component to it, in that the animals made several saccades to stimuli in the array while searching for the targets. To test for spatial attention effects on responses, we compared responses and spike-field synchronization to a stimulus in the RF when either it was selected for a saccade or the saccade was made to a stimulus outside the RF (Fig. 4).
Selecting the RF stimulus for a saccade led to an enhancement of the neuronal response across the population (Fig. 5A) (population median enhancement of 36%, P < 10⁻⁵,
Test for serial (spatial) selection: RF stimulus is target of saccade vs. RF stimulus is not target of saccade
Fig. 4. Illustration of the saccade enhancement analysis. We compared neuronal measures when the monkey made a saccade to an RF stimulus versus a saccade away from the RF. In this display, fixating the purple cross, for example, brings the green star into the neuron’s RF. We would then compare neuronal responses when the green star in the RF was the target of the saccade, to those when the saccade target was to a stimulus outside the RF, e.g., the orange A. Activity was analyzed from the time the purple cross was fixated to when the next saccade was initiated.
[Figure 3 panels: COLOR EFFECT (A, C, E) and SHAPE EFFECT (B, D, F); A, B: normalized spike activity vs. time from fixation (ms); C, D: normalized LFP vs. time from fixation (ms); E, F: spike-field coherence vs. frequency (Hz)]
Fig. 3. (A to F) Feature-related enhancement of neuronal activity and synchronization during conjunction search. Conventions are as given in Fig. 2.
Serial spatial attention modulation
[Figure: normalized spike activity (0 to 2) vs. time from fixation (0 to 200 ms); conditions: attend within RF, attend away from RF]
![Page 77: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/77.jpg)
Attention as Bayesian inference
see also Rao 2005; Lee & Mumford 2003
Chikkerur Serre & Poggio in prep
PFC
IT
V4/PIT
V2
![Page 78: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/78.jpg)
Attention as Bayesian inference
see also Rao 2005; Lee & Mumford 2003
Chikkerur Serre & Poggio in prep
PFC
IT
V4/PIT
V2
feature-based attention
![Page 79: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/79.jpg)
Attention as Bayesian inference
see also Rao 2005; Lee & Mumford 2003
Chikkerur Serre & Poggio in prep
PFC
IT
V4/PIT
V2
FEF/LIP
spatial attention
feature-based attention
![Page 80: Vision and visual neuroscience II - MIT9.520/spring09/Classes/Class22_ts... · 2010. 1. 22. · Animal vs. non-animal Complex cells Tuning Simple cells MAX Main routes Bypass routes](https://reader034.vdocuments.site/reader034/viewer/2022051810/6018c5f5b6443718964f400b/html5/thumbnails/80.jpg)
Attention as Bayesian inference
see also Rao 2005; Lee & Mumford 2003
Chikkerur Serre & Poggio in prep
PFC
IT
V4/PIT
V2
FEF/LIP
spatial attention
feature-based attention
[Diagram: graphical model with nodes O, Fi, Fli, I and L, the label N, and the annotations "object priors" and "location priors"]
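To make the idea of attention as Bayesian inference concrete, here is a toy sketch in which a posterior over locations combines a location prior (spatial attention) with a prior over features (feature-based attention). It is only an illustration under simplifying assumptions and is not the model of Chikkerur, Serre & Poggio: the function `location_posterior`, the helper `argmax`, the two features, and all numbers are hypothetical.

```python
def location_posterior(location_prior, feature_prior, feature_maps):
    """Posterior over locations, proportional to
    P(location) * sum_f P(feature f) * evidence for feature f at that location.

    feature_maps[f][l] is the bottom-up evidence for feature f at location l.
    """
    scores = []
    for loc, p_loc in enumerate(location_prior):
        evidence = sum(p_f * feature_maps[f][loc]
                       for f, p_f in enumerate(feature_prior))
        scores.append(p_loc * evidence)
    total = sum(scores)
    return [s / total for s in scores]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Toy scene: 3 locations, 2 features (say "red" and "vertical").
FEATURE_MAPS = [
    [0.9, 0.1, 0.1],  # "red" evidence is strong at location 0
    [0.1, 0.8, 0.1],  # "vertical" evidence is strong at location 1
]
UNIFORM_LOCATIONS = [1 / 3, 1 / 3, 1 / 3]

# Feature-based attention: the feature prior favors "red" or "vertical".
look_for_red = location_posterior(UNIFORM_LOCATIONS, [0.9, 0.1], FEATURE_MAPS)
look_for_vertical = location_posterior(UNIFORM_LOCATIONS, [0.1, 0.9], FEATURE_MAPS)

# Spatial attention: a strong location prior on location 2, no feature preference.
attend_location_2 = location_posterior([0.05, 0.05, 0.9], [0.5, 0.5], FEATURE_MAPS)
```

In this toy setting, changing the feature prior moves the posterior peak to the location containing the searched-for feature, while a strong location prior can override weak bottom-up evidence.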