slides09 fop10 v1&objects - university of...
Post on 19-Mar-2018
Vision III: From Early Processing to Object Perception
Chapter 10 in Chaudhuri
Overview of Topics
• Beyond the retina: 2 pathways to V1
• Subcortical structures (LGN & SC)
• Primary visual cortex: Lines, direction, colour
• M vs. P pathways: Movement vs. Particulars
• Object & Face recognition
How We See Things (short version)
• RGCs: Dot detectors
• V1: Line orientation, motion, colour
• V4: Shapes
• Temporal lobe: Objects & Faces
Fundamental Concept: Brain Organization Schemes
Dorsal vs. Ventral Streams
• Dorsal stream mainly involved in motion processing (the “where pathway”)
• Ventral mainly involved in processing identity of objects (the “what pathway”)
• Will focus on ventral here, which takes most of its input from P pathway (more later)
From Eye To Brain
Contralaterality in Vision
• One might think info from left eye goes to right brain & vice versa, but no. Instead...
• Information from left half of visual field goes (first) to right half of brain & vice versa
• That is, everything to the left of what you’re fixating goes to the right hemisphere (first), and vice versa.
Contralaterality in Vision
• Nasal halves of retinas (close to nose):
• Capture light from temporal half of visual field
• Send signals across to contralateral side of brain
• Temporal halves of retinas (close to temples):
• Capture light from nasal half of visual field
• Send signals along to ipsilateral side of brain
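The routing rules above can be captured in a small lookup. The following is a minimal Python sketch, not part of the original slides; the function name `route` and its string arguments are illustrative assumptions.

```python
def route(eye, field_half):
    """Return (retina_half, hemisphere) for light entering one eye.

    eye        -- 'left' or 'right'
    field_half -- 'nasal' or 'temporal' half of that eye's visual field
    """
    opposite = {"left": "right", "right": "left"}
    if eye not in opposite:
        raise ValueError("eye must be 'left' or 'right'")
    if field_half == "temporal":
        # Temporal field -> nasal retina -> crosses to the contralateral side
        return "nasal", opposite[eye]
    if field_half == "nasal":
        # Nasal field -> temporal retina -> stays on the ipsilateral side
        return "temporal", eye
    raise ValueError("field_half must be 'nasal' or 'temporal'")


# Both eyes send the left half of the visual field to the right hemisphere.
print(route("left", "temporal"))   # left eye, left visual field
print(route("right", "nasal"))     # right eye, left visual field
```

Because the left eye's temporal field and the right eye's nasal field are both the left visual field, both calls report the right hemisphere, which is the point made on the previous slide.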
[Figure: pathway from eye to brain, labelled Optic Tract, Optic Radiations, and Primary Visual Cortex (V1). Nasal visual field → temporal retina → ipsilateral hemisphere. Temporal visual field → nasal retina → contralateral hemisphere.]
Foveal Representation
• Is the fovea split, with info from each half carried to separate hemispheres?
• No, instead it is represented in both hemispheres
• Evidence for this is seen in “foveal sparing”, i.e., continued visual function in fovea after loss of a visual hemifield due to stroke
Questions
• Light from the temporal visual field of the right eye falls on the ________ half of the retina, which sends information to the ______ side of the brain
• What are the two large streams of visual information that exit V1 and go to the rest of the brain?
Two Pathways From Eye To Cortex
• Geniculocortical pathway:
• Lateral Geniculate Nucleus (LGN) of thalamus to V1
• ≈90% of RGC outputs
• Tectopulvinar pathway:
• Superior colliculus (aka “tectum”) to Pulvinar nucleus to visual cortex (many parts)
• ≈10% of RGC outputs
The LGN
• As we’ve seen, thalamus has nuclei for early sensory processing of all sensory modalities (except smell).
• LGN is the one for vision
• A knee-shaped part of the thalamus
• Has a six-layered structure
• Does early visual processing (centre-surround RFs)
• Good example of a module that is organized in layers and columns
The LGN
• LGN has left and right halves.
• Each half receives signals from right and left eyes
• Layers 2, 3, & 5 receive input from the ipsilateral eye
• Layers 1, 4, & 6 receive input from the contralateral eye
• C I I C I C “See I? I see! I see!”
• 1+4 = 6, not true, so “contra”; 2+3 = 5, true, so “ipsi”
The LGN
• LGN layers 1 & 2 are magnocellular, with large neurones
• Part of the M-pathway, responsible for motion
• LGN layers 3-6 are parvocellular, with small neurones
• Part of the P-pathway, responsible for colour and detail
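The layer assignments on the last two slides fit in one small table. The sketch below is an illustrative Python encoding of them, not something from the slides; the names `LGN_LAYERS` and `layers_for` are made up for this example.

```python
# Layer facts as stated on the slides:
#   layers 2, 3, 5 -> ipsilateral eye; layers 1, 4, 6 -> contralateral eye
#   layers 1-2 -> magnocellular; layers 3-6 -> parvocellular
LGN_LAYERS = {
    layer: {
        "eye": "ipsilateral" if layer in (2, 3, 5) else "contralateral",
        "pathway": "magno" if layer <= 2 else "parvo",
    }
    for layer in range(1, 7)
}


def layers_for(lgn_side, eye_side):
    """List the layers of one LGN ('left'/'right') that are driven by one eye."""
    wanted = "ipsilateral" if lgn_side == eye_side else "contralateral"
    return [n for n, info in LGN_LAYERS.items() if info["eye"] == wanted]


print(layers_for("left", "left"))
print([n for n, info in LGN_LAYERS.items() if info["pathway"] == "magno"])
```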
• Laterality:
• Red: Receive signals from the ipsilateral eye.
• Blue: Receive signals from the contralateral eye.
• Pathways (aka Channels)
• Solid: parvocellular layers.
• Dotted: magnocellular layers
The LGN
Visual Processing in LGN
• LGN neurones have centre-surround receptive fields, just like retinal ganglion cells
• However, LGN cells are more “selective”, possibly representing some signal processing to reduce noise and produce sharper tuning than RGCs.
• These “cleaned” signals are sent on to V1
Visual Processing in LGN
• The LGN’s connection to V1 goes both ways, with a descending fibre tract
• This has been found to play a role in attentional modulation
• Example: After I ask “Are there any questions?”, likely some of my magno LGN cells are more active (those for upward motion)
The LGN’s Retinotopic Map
• Each LGN layer is organized according to a Retinotopic Map
• Map: Each place on the retina corresponds to a place on the LGN in a systematic fashion
• Retinotopic: Neurones that are adjacent in LGN have RFs that are adjacent on the retina
Adjacent spots in the visual field correspond to adjacent spots on the retina, which in turn correspond to adjacent spots in the LGN
How do we know this?
• Single-cell recording experiments in monkeys
• Recorded from individual neurones with a very fine electrode.
• For example, our electrode might penetrate the LGN parallel to its surface, thus staying in the same layer.
• Measuring response of LGN neurones, and moving systematically across surface of LGN, we find the receptive fields move systematically across retina.
LGN Location Columns
• Single-cell recording experiments in monkeys
• If we instead move the electrode down through the LGN layers, we find the RFs are all in the same place
• That is, the LGN is organized into location columns.
• i.e., there are columns of neurones that all process information from the same location on retina
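The two electrode experiments can be mimicked with a toy model. This is only a sketch under a strong simplifying assumption (the retinotopic map is treated as the identity on a small grid); the function `rf_location` is hypothetical and not from the slides.

```python
def rf_location(layer, row, col):
    """Receptive-field location (row, col on a toy retina grid) of an LGN neurone.

    Toy assumption: the retinotopic map is the identity, so the RF location
    depends on position across the LGN surface but not on depth (layer).
    """
    if not 1 <= layer <= 6:
        raise ValueError("LGN layers are numbered 1-6")
    return (row, col)


# Electrode moving parallel to the surface, staying within layer 3:
across = [rf_location(3, 0, col) for col in range(4)]
# Electrode moving down through all six layers at one surface position:
down = [rf_location(layer, 0, 2) for layer in range(1, 7)]

print(across)  # RFs shift systematically across the retina (retinotopic map)
print(down)    # RFs all fall at the same retinal location (a location column)
```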
Questions
• Which layers of the right LGN receive inputs from the left eye?
• What do we mean when we say the LGN has location columns?
• What do we mean when we say the LGN has a retinotopic layout?
Superior Colliculus
• Small branch from the optic tract goes to SC
• Has retinotopic map of contralateral visual field
• Signals go from SC to another thalamic nucleus, the pulvinar
• From there, they go to many parts of the visual cortex
Superior Colliculus
• SC receives descending signals from visual, auditory and somatosensory cortices
• SC integrates these to coordinate eye and body movements toward stimuli
• Example: You hear a loud sound or feel a tap on your shoulder and look automatically in that direction
V1
• Primary visual cortex = striate cortex = Brodmann Area 17 = Visual Receiving Area = V1
• The first cortical area for visual processing
• The best-understood part of the cortex, thanks to work by such luminaries as Hubel & Wiesel, who won a Nobel Prize for research on V1
V1
• Organizational aspects:
• 6 layered structure
• Retinotopic map
• Ocular dominance columns
• Orientation selectivity columns
• Cytochrome oxidase blobs
• Location hypercolumns
Layers of V1
Layer 1: No neurones, just fibres from neurones below
Layers 2-3: Communicate horizontally with other visual cortical areas
Layer 4: Receives inputs from LGN, subdivided into 4A, 4B, 4Cα (receives magno inputs) & 4Cβ (receives parvo inputs)
Signals are then sent up/down from here to other layers
Layers 5-6: Send descending communications back to subcortical areas (LGN and SC)
Retinotopic Layout of V1
• RFs of adjacent V1 neurones are adjacent on the retina
• But, this retinotopic mapping is distorted relative to the surface area of the retina
• Foveal Magnification: Far more V1 neurones have RFs in the fovea than in the periphery
• i.e., foveal RGCs innervate a far larger area of V1 than one would predict based on the area of the fovea
[Figure: Foveal Magnification, with panels “On The Retina” and “In V1”]
Foveal Magnification
• Why does V1 exhibit foveal magnification?
• One, there are simply more RGCs per unit area of fovea than in the peripheral retina
• Two, each foveal RGC innervates more cortical neurones
• Presumably this allows for more complex and precise processing of visual information from fovea
Questions
• What is the role of the SC in vision?
• Which layer of V1 receives signals from LGN?
• What is foveal magnification? Why does it occur?
Binocularity & Ocular Dominance
• At the LGN, all neurones are monocular
• However, at V1, the majority are binocular, taking inputs from both eyes
• This is the beginning of stereoscopic depth perception
Ocular Dominance
• Most V1 neurones are binocular
• But most show some preference for one eye or the other
• The preference varies systematically across the surface of the cortex
How Ocular Dominance Comes About
Orientation Selectivity
• Most V1 neurones have elongated receptive fields
• These are ON/OFF or OFF/ON, like RGCs
• Each neurone responds best to a line of light (ON/OFF) or dark (OFF/ON) of a given orientation
Orientation Selectivity
• How does orientation selectivity in V1 neurones arise?
• Hubel & Wiesel proposed the model below, where several LGN cells, having RFs lined up on the retina, feed into one V1 cell
[Figure: How V1 cells are wired to RGCs to produce oriented receptive fields. Retinal ganglion cells with centre-surround (+/−) receptive fields feed LGN cells, which converge on a V1 simple cell whose receptive field has elongated + and − subregions. A second version shows the same wiring with reversed (−/+) polarity.]
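The convergence scheme described above can be tried out numerically. The sketch below is a toy illustration, not the model from the slides: it assumes each LGN receptive field is a difference of Gaussians (a common simplification), sums the cells linearly, and uses made-up grid sizes and widths. All names (`dog`, `LGN_CENTRES`, `simple_cell_response`, `bar`) are hypothetical.

```python
import math


def dog(dx, dy, sigma_c=1.0, sigma_s=2.0):
    """Centre-surround (ON-centre) weight: difference of two Gaussians."""
    r2 = dx * dx + dy * dy
    centre = math.exp(-r2 / (2 * sigma_c ** 2)) / (2 * math.pi * sigma_c ** 2)
    surround = math.exp(-r2 / (2 * sigma_s ** 2)) / (2 * math.pi * sigma_s ** 2)
    return centre - surround


SIZE = 21                                             # toy retina is SIZE x SIZE pixels
LGN_CENTRES = [(x, 10) for x in (6, 8, 10, 12, 14)]   # RFs lined up in a horizontal row


def simple_cell_response(image):
    """Sum the inputs of the aligned LGN cells, then rectify at zero."""
    total = 0.0
    for cx, cy in LGN_CENTRES:
        for y in range(SIZE):
            for x in range(SIZE):
                total += dog(x - cx, y - cy) * image[y][x]
    return max(0.0, total)


def bar(orientation):
    """A 3-pixel-wide bar of light through the middle of the toy retina."""
    if orientation == "horizontal":
        return [[1.0 if abs(y - 10) <= 1 else 0.0 for x in range(SIZE)]
                for y in range(SIZE)]
    return [[1.0 if abs(x - 10) <= 1 else 0.0 for x in range(SIZE)]
            for y in range(SIZE)]


print(simple_cell_response(bar("horizontal")))  # bar along the row of RFs
print(simple_cell_response(bar("vertical")))    # bar across the row of RFs
```

The bar aligned with the row of receptive fields covers every centre at once, whereas the orthogonal bar covers one centre and falls on the surrounds of the others, so the first response should come out clearly larger than the second.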
Directional Motion Selectivity
• Hubel & Wiesel also found V1 cells sensitive to the direction of motion of the stimulus
• These are the first stage in our ability to process moving stimuli
[Figure: Schematic of a Reichardt Motion Detector, showing a striate cortex simple cell, a delaying interneuron, and a striate cortex motion-sensitive cell (Reichardt detector)]
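A direction-selective unit of this kind can be sketched in a few lines. The code below assumes the classic delay-and-multiply form of a Reichardt half-detector; the slide's schematic may differ in detail, and the function name, signals, and delay value are illustrative assumptions.

```python
def reichardt_half_detector(a, b, delay):
    """Correlate the delayed signal from receptor A with the signal from B.

    a, b  -- equal-length lists of receptor responses over time
    delay -- number of time steps by which A's signal is delayed
    """
    if len(a) != len(b):
        raise ValueError("signals must have equal length")
    return sum(a[t - delay] * b[t] for t in range(delay, len(a)))


# A spot of light passes receptor A at t=2 and receptor B at t=4 (A -> B motion).
a_then_b = ([0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 1, 0])
# The same spot moving the other way passes B first, then A.
b_then_a = ([0, 0, 0, 0, 1, 0], [0, 0, 1, 0, 0, 0])

print(reichardt_half_detector(*a_then_b, delay=2))  # preferred direction
print(reichardt_half_detector(*b_then_a, delay=2))  # opposite (null) direction
```

The delay plays the role of the delaying interneuron: the detector responds only when the time the stimulus takes to travel from A to B matches the delay, which makes it selective for both direction and speed.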
Questions
• What are three stimulus characteristics that V1 neurones are tuned to?
• True or false: Orientation-selective cells are all ON/OFF (i.e., excitatory centre/inhibitory surround)?
Organization of V1
• How are the various cells in V1 organized?
• Hubel & Wiesel proposed the “ice cube” model, whereby orientation and ocular dominance columns varied independently
• Also proposed the location hypercolumn, which is a set of all orientation columns and two ocular dominance columns
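The hypercolumn definition above amounts to a cross product of eye preference and orientation. The sketch below enumerates one toy hypercolumn; the number of orientation columns (12, i.e. 15-degree steps) is an arbitrary assumption for illustration, and `location_hypercolumn` is a hypothetical name.

```python
from itertools import product


def location_hypercolumn(n_orientations=12):
    """Enumerate the columns in one toy hypercolumn as (eye, orientation in degrees)."""
    step = 180 / n_orientations          # orientations repeat every 180 degrees
    orientations = [i * step for i in range(n_orientations)]
    # Eye preference and orientation vary independently, so take every pairing.
    return list(product(("left", "right"), orientations))


columns = location_hypercolumn()
print(len(columns))   # 2 ocular dominance columns x 12 orientation columns
print(columns[:3])
```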
[Figure: Ice Cube Model of V1, shown with the corresponding region of the retina]
Ice Cube Model of V1
• While it’s accurate as far as it goes, the ice cube model is not complete
• Motion direction selectivity is not incorporated, nor are colour processing, spatial frequency, or M vs. P channels.
Cytochrome Oxidase Blobs
• Interlaced with location hypercolumns are another set of columns that show high neural activity
• Once were thought to be involved in colour processing and called “colour blobs”
• But instead they seem to integrate info from M&P cells
M vs. P Channels
• As noted earlier, two channels start in retina:
• Magno = movement
• Parvo = particulars
• These continue on through V1 to higher visual areas
M vs. P: Anatomical Separation
               M                    P
Retina         Parasol RGCs         Midget RGCs
LGN            Layers 1-2           Layers 3-6
V1, layer 4    4Cα                  4Cβ
V1, blobs      blob & interblob     interblob only
Extrastriate   MT                   V4
Pathways       Dorsal (“where”)     Ventral (“what”)
(Side labels on the slide: Complete Separation; Partial Separation)
M vs. P: Functional Differences
Ultimately...
• M channel projects more to MT (motion processing area) and then to the dorsal stream (= “where pathway”)
• P channel projects more to V4 (form processing area) and then to the ventral stream (= “what pathway”)
• We will now take a closer look at the latter
Questions
• What is a location hypercolumn in V1?
• The M & P pathways are completely segregated up to what point in the visual system?
The Ventral Stream
• Consists of a network of areas, mostly in the inferior temporal (IT) area, that engage in high-level vision
• IT cortex can be divided into 3 zones:
• Posterior IT (PIT): Complex form processing
• Central IT (CIT): View invariant processing
• Anterior IT (AIT): Individuation / configuration / shape-invariant processing
Complex Form Processing Tasks (PIT/CIT)
• Differentiating illumination edges from reflectance edges.
• Inverse Projection: Determining 3D shape from 2D information
• Segmentation: Differentiating objects from background and each other.
• Viewpoint invariance: Objects look different from different viewpoints
• Shape invariance: Some objects, especially living things, change shape but nonetheless are recognized as the same object.
• Completion: Objects are often partially occluded, how do we complete the view of a partially-viewed object?
It is difficult for a computer program (but easy for us) to determine which changes in lightness in this scene are due to properties of different parts of the scene, and which are due to changes in illumination.
[Figure: scene with one edge labelled “Illumination Edge” and another labelled “Reflectance Edge”]
Light Comes From Above
One assumption the visual system makes in differentiating shadow from reflectance change is that light comes from above. This has been true over our evolutionary history.
The Light-from-above Assumption
Shadows Have Fuzzy Edges
Another assumption the visual system makes is that shadows have fuzzy edges (penumbras).
Inverse Projection Problem
An infinite number of objects can create the same image on the retina. How do we know which one is out there?
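The ambiguity can be seen in a one-line projection formula. The sketch below assumes an idealized pinhole model of the eye (a simplification, since the real eye has a lens); the function `project` and the `focal_length` parameter are illustrative and not from the slides.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto a 2D image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the eye (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)      # twice as far away and twice as far off-axis

print(project(near))
print(project(far))        # same image point as `near`
```

Any point along the same line of sight lands on the same image location, so depth cannot be recovered from the image alone. That is why the visual system has to fall back on heuristics.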
Inverse Projection Rules
• The brain uses heuristics--rules of thumb that aren’t always true--to solve the otherwise impossible inverse projection problem. E.g.,
• “A straight line in the 2D image on the retina is a straight line in 3D reality”
• “If the tips of two lines meet in 2D, assume they meet at their tips in 3D reality”
Heuristics
Both of these assumptions hold true in most cases, but not all. Both are part of a more general rule that the visual system interprets the 3D world in a “stable” way, meaning that it will not change with slight changes in POV.
Inverse Projection Fail
• http://tinyurl.com/7duz9zb
• Your brain makes assumptions about how 2D projections arise from 3D objects
• In the case of the Devil’s Triangle, they not only fail to give you the correct interpretation, they actively prevent you from getting it!
Inverse Projection Epic Fail!
• Segmentation: Which parts of a scene belong to which objects? What is object vs. background?
• Part of the solution involves gestalt rules such as smoothness heuristics, but part is simple experience.
• Here Magritte messes with our segmentation heuristic by violating both gestalt rules and our experiences.
Segmentation
• Gestalt heuristics play a role in segmentation:
• Smoothness: Take the interpretation with the least sharp turns
• Pragnanz (simplicity): Take the interpretation with the fewest objects and types of objects
Smoothness Heuristic Fail
Perceptual Segregation: Figure and Ground
Figure-Ground Segmentation
• Heuristics used to determine which area is figure:
• Figures are located in the lower part of scene
• Figures are symmetrical
• Figures are small, backgrounds are large
• Figures are vertical
• Elements that are “meaningful” (i.e., have been seen as figures before) are figures
Figure-Ground Segmentation in V1
• Recordings from V1 in the monkey cortex show:
• Response to area that is figure
• No response to area that is ground
• This result is important because:
• V1 neurones are early in the nervous system
• It reveals both “feedforward” and “feedback” processing in the system
How a neurone in V1 responds to stimuli presented to its receptive field (green rectangle).
(a) The neurone responds when the stimulus on the receptive field is figure.
(b) No response when the same pattern on the receptive field is not figure!
Questions
• What are some heuristics the brain uses...
• ...to distinguish luminance changes from lightness changes?
• ...to solve the inverse projection problem?
• ...to solve the segmentation problem?
Viewpoint Invariance
“Ceci ne sont pas des pipes”
Viewpoint Invariance
• We recognize objects from different viewpoints, even though the pattern of light on the retina changes.
• i.e., there is a ∞-to-one relation between light patterns and objects (and vice versa).
• Biederman’s RBC and Tarr et al.’s view-based recognition models both tried to account for this
• We will see that this is another example of thesis-antithesis-synthesis
Human Viewpoint Invariance is Imperfect
Quick: Which pairs show two views of the same object?
[Figure: object pairs labelled A, B, C]
Two Viewpoints on Viewpoint Invariance
• Structural-description models: 3D object representations are based on combinations of 3D volumetric primitives
• Image-description models: Ability to identify 3D objects comes from sets of stored 2D images from different perspectives
Structural-Description Models
• Marr’s model proposed a sequence of events using simple geometrical features:
• Edges (detected via V1 neurones)
• View-invariant features such as parallel lines, curve polarity, angle type. (PIT, maybe?)
• Geometrical shapes (again, PIT?)
• Relations between geometric shapes (CIT?).
Structural-Description Models
• Recognition-by-components theory by Biederman (developed from Marr’s ideas)
• Volumetric primitives are called geons
• Theory proposes there are 36 geons that combine to make all 3-D objects
• Geons include cylinders, rectangular solids, pyramids, etc.
Geons & Objects
Structural-Description Models
• Properties of geons
• View-invariant: They can be recognized from almost any viewpoint (except rare “accidental” viewpoints)
• Discriminability: They can be easily distinguished from one another.
• Principle of componential recovery: the ability to recognize an object if we can identify its geons
It is difficult to identify the object behind the mask because the corners and curves that allow extraction of geons have been obscured.
Now that it is possible to identify geons, the object can be identified
Image-Description Models
• In contrast to structural-description models, image-description models claim that:
• Ability to identify 3-D objects comes from stored 2-D viewpoints from different perspectives.
• Evidence for this comes from novel object studies:
• For a familiar object, view invariance occurs
• For a novel object, view invariance does not occur
• Shows that an observer must have the different viewpoints encoded before recognition can occur from all viewpoints
Psychophysical curve showing that a monkey is better at identifying the view of the object that was presented during training (arrow). No view invariance.
Synthesis
• Tjan & Legge (1998):
• View-invariant performance found for simple objects (e.g., geometrical shapes)
• But not for complex objects (e.g., amoeboids, bent paper-clip objects, etc.)
• Complexity defined quantitatively via an ideal observer algorithm
• Recent models incorporate both image-description and structural-description aspects.
Completion
• Completion of partially-viewed objects is based on gestalt heuristics such as smoothness and pragnanz
• But, as with all heuristics, these sometimes fail.
• Sometimes we complete things that aren’t there...
• Experience obviously plays an important role
You Complete Me
Shape Invariance
We have little trouble recognizing this bird as such, despite its many changes in shape
Shape Invariance
A similar problem arises with facial expression
Questions
• Which theory best explains our ability to recognize objects from many views, structural-description models or image-description models?
• What does completion refer to in vision?
Face Perception
• Face processing is a highly complex visual task
• Faces are quite uniform, but we individuate them with ease
• Facial expressions are subtle variations in face shape, but we decode them with ease
• Thought to be subserved by a network of brain areas, some of which are in AIT
“Your face is the same as everybody has – the two eyes... nose in the middle, mouth under. It's always the same. Now if you had the two eyes on the same side of the nose, for instance – or the mouth at the top – that would be some help.”
- Humpty Dumpty, Through the Looking Glass
My Clones?
Prosopagnosia
• An inability to recognize faces
• Often arises after damage to the AIT, specifically the fusiform face area (FFA)
• Specific to faces, recognition of other objects is unimpaired
Face-Selective Neurones
In areas of monkey cortex homologous to FFA, we find cells that respond specifically to faces
Perceptual Differences Between Faces & Objects
• A number of phenomena suggest that faces are processed in a qualitatively different way than other objects
• Inversion has a disproportionate effect on face recognition
• Inversion seems to disrupt configural processing in faces but not objects
• Composite effects exist for faces but not objects
[Figure: Face Inversion, with examples labelled “A little harder” and “A lot harder”]
Inversion Disrupts Configural Processing
Schwaninger, A., Carbon, C.C., & Leder, H. (2003). Expert face processing: Specialization and constraints. In G. Schwarzer & H. Leder (Eds.), Development of Face Processing, pp. 81-97. Göttingen: Hogrefe.
Composite Face Effect
Questions
• What is prosopagnosia?
• When does it occur?
• What is the composite face effect?