sensory and motor systems (g80.2202) psychophysics of early & mid-level vision instructor: nava...

Post on 20-Dec-2015


Sensory and Motor Systems (G80.2202)

Psychophysics of Early & Mid-level Vision

Instructor: Nava Rubin

Early Psychophysics (history-wise and level-wise)

Weber’s Law (1834)

Let i denote stimulation intensity, and let Δi denote the minimal increase in intensity that an observer can detect; the following holds:

Δi / i = constant

Fechner’s insight (1860): this corresponds to ‘constant increments of sensation’, ΔS. Therefore:

ΔS = k Δi / i

From here we can deduce the form of S(i): integrate, S = k log(i) + C

[Figure: plot of S against i with the increments Δi and ΔS marked; axis ticks run from 0 to 10000 (horizontal) and from 0 to 32 (vertical).]

Gustav Fechner (1801–1887)

Ernst Weber (1795–1878)

Weber–Fechner law: S = k ln(i) [sensation ∝ log(intensity of stimulation)]

Example of a behavioral derivation of a neurally-based law (measured physiologically only later*)
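As a small numerical illustration of the two laws, the sketch below checks that a just-detectable increment Δi, which grows in proportion to i, always corresponds to the same increment of sensation under S = k ln(i). The Weber fraction 0.1 and k = 1 are assumed values chosen for illustration, not figures from the lecture, and C is taken as 0.

```python
import math

WEBER_FRACTION = 0.1   # assumed value of the constant Δi / i (illustrative only)
K = 1.0                # assumed scale factor k in S = k ln(i)

def jnd(i, weber_fraction=WEBER_FRACTION):
    """Smallest detectable increment Δi at intensity i, per Weber's law."""
    return weber_fraction * i

def sensation(i, k=K):
    """Fechner's integrated form S = k ln(i), taking the constant C as 0."""
    return k * math.log(i)

# One just-noticeable step at three very different intensities.
for i in (10.0, 100.0, 1000.0):
    step = jnd(i)
    delta_s = sensation(i + step) - sensation(i)
    print(f"i = {i:7.1f}   Δi = {step:6.1f}   ΔS = {delta_s:.4f}")
```

The printed Δi values differ by a factor of 100 across the three rows, while ΔS is the same in each row, namely k ln(1 + Δi/i).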


Luminance Gain Control (aka “light adaptation”)

Since luminance can potentially vary over an extremely wide range, the visual system (specifically, the RGCs) adjusts its sensitivity to match the locally prevalent luminance. This is done by roughly dividing the (within-RF) luminance by the local mean luminance of the immediate surround (a few degrees outside the RF).

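[Figure (schematic): receptive field with “center” and “surround”. Plot of sensitivity and normalized RGC response (0 to 1) against luminance in the center of the RF (ticks 1, 10, 100, 1000), with one curve per background luminance: 100, 1000, 10,000.]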

http://www.uni-mannheim.de/fakul/psycho/irtel/cvd/C4700.html
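A minimal sketch of the divisive idea described above: dividing the center luminance by the local mean makes the output depend on the center-to-background ratio and not on absolute luminance. The function names are hypothetical, and the saturating form c / (c + b) is one common textbook choice assumed here to get a response between 0 and 1; it is not taken from the slide.

```python
def gain_controlled_signal(center_lum, surround_mean_lum):
    """Divide the within-RF luminance by the local mean luminance."""
    return center_lum / surround_mean_lum

def normalized_response(center_lum, surround_mean_lum):
    """Assumed saturating form: 0.5 when center equals background, approaching 1."""
    return center_lum / (center_lum + surround_mean_lum)

# The same 2:1 center-to-background ratio at three background levels.
for background in (100.0, 1000.0, 10000.0):
    center = 2.0 * background
    print(
        f"background = {background:8.1f}   "
        f"ratio = {gain_controlled_signal(center, background):.2f}   "
        f"response = {normalized_response(center, background):.3f}"
    )
```

Under these assumptions all three rows print the same ratio and the same response, even though the absolute luminances span two orders of magnitude.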

Contrast Gain Control

Contrast gain control begins in the retina and is strengthened at subsequent stages of the visual system. It roughly divides the responses by a measure that grows with the locally prevalent root-mean-square (r.m.s.) contrast, or the standard deviation of the stimulus luminance divided by the mean luminance.
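The r.m.s. contrast defined above can be computed directly, and the divisive step can be sketched in the same way. In the sketch below, the luminance samples, the constant `sigma`, and the function `contrast_gain_controlled` are hypothetical choices for illustration; the slide only states that the divisor grows with local r.m.s. contrast.

```python
import math

def rms_contrast(luminances):
    """Standard deviation of the luminance divided by the mean luminance."""
    mean = sum(luminances) / len(luminances)
    variance = sum((x - mean) ** 2 for x in luminances) / len(luminances)
    return math.sqrt(variance) / mean

def contrast_gain_controlled(raw_response, local_rms, sigma=0.1):
    """Divide a response by a measure that grows with local r.m.s. contrast.

    sigma is an assumed constant that keeps the divisor non-zero.
    """
    return raw_response / (sigma + local_rms)

low = [90.0, 110.0, 90.0, 110.0]     # mean 100, standard deviation 10
high = [50.0, 150.0, 50.0, 150.0]    # mean 100, standard deviation 50

for name, patch in (("low", low), ("high", high)):
    c = rms_contrast(patch)
    # Take the raw response to be proportional to contrast (an assumption).
    print(f"{name:4s}  rms contrast = {c:.2f}   after gain control = "
          f"{contrast_gain_controlled(c, c):.3f}")
```

With these numbers the raw contrast grows fivefold between the two patches, while the gain-controlled output grows by much less, which is the compressive behavior the division is meant to produce.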

Young & Helmholtz: Experiments in Additive Colors

(schematic)

“The Trichromatic Theory of Color”

Color metamers:

Two different spectral distributions that produce the same perceived color (in a given observer); equivalently, two different spectral distributions that produce the same stimulation of the L, M and S cones (of a given observer).

Example: yellow (~570 nm light, or a mixture of red and green light)

[Figure labels: ‘S’, ‘M’, ‘L’ cones]
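The yellow example can be sketched numerically. The code below uses made-up Gaussian cone sensitivity curves (the peak wavelengths and widths in `CONES` are rough assumptions, not measured data) and solves a 2x2 linear system for the red and green intensities whose L and M cone responses equal those produced by 570 nm light alone. In this toy model the S cone response is close to zero for all three lights, so the match is approximate for S and exact for L and M.

```python
import math

# Hypothetical Gaussian cone sensitivities: (peak wavelength in nm, width in nm).
CONES = {"L": (560.0, 50.0), "M": (530.0, 50.0), "S": (440.0, 25.0)}

def cone_sensitivity(cone, wavelength_nm):
    peak, width = CONES[cone]
    return math.exp(-((wavelength_nm - peak) ** 2) / (2.0 * width ** 2))

def cone_responses(spectrum):
    """spectrum maps wavelength (nm) to intensity; returns L, M and S responses."""
    return {
        cone: sum(w * cone_sensitivity(cone, lam) for lam, w in spectrum.items())
        for cone in CONES
    }

YELLOW, RED, GREEN = 570.0, 650.0, 530.0
target = cone_responses({YELLOW: 1.0})

# Solve for the red and green intensities that match L and M (Cramer's rule).
a, b = cone_sensitivity("L", RED), cone_sensitivity("L", GREEN)
c, d = cone_sensitivity("M", RED), cone_sensitivity("M", GREEN)
det = a * d - b * c
w_red = (target["L"] * d - b * target["M"]) / det
w_green = (a * target["M"] - c * target["L"]) / det

mixture = cone_responses({RED: w_red, GREEN: w_green})
print("weights (red, green):", round(w_red, 3), round(w_green, 3))
print("570 nm light  :", {k: round(v, 4) for k, v in target.items()})
print("red+green mix :", {k: round(v, 4) for k, v in mixture.items()})
```

The two spectra share no wavelengths, yet the printed cone responses agree, which is what makes them metamers for this hypothetical observer.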

Ewald Hering (1834–1918):

- Why does red produce a greenish after-effect? (and vice versa)

- Why does yellow produce a bluish after-effect? (and vice versa)

- Why do we perceive the superposition of ‘basic’ colors as “white”?

- What does ‘white’ mean? (Is it a property of the ‘outside’ world, or a property of our perceptual machinery?)

Hering’s Theory: The visual system generates color signals in opponent pairs (yellow-blue, red-green, white-black). At the time, many saw it as competing with the trichromatic theory, but Hering held that both theories could be valid. We now know he was correct: the two theories describe visual processes that occur at different levels. Neural experiments confirmed this only much later in the twentieth century.

Color adaptation (‘after-effects’): a demo

The Atomistic Approach to Psychophysics: the search for “atoms” of perception

Wilhelm Wundt (1832–1920)

Limitations of the “atomistic” approach: Color

Color Constancy

Color Constancy: the tendency of surfaces to preserve their perceived color even when the spectrum of light coming from them changes dramatically (because of a change in the spectrum of the light they are reflecting)

E.H. Adelson, MIT

Limitations of the “atomistic” approach: Brightness

Fergus Campbell (1924–1993)

The Atomistic Approach, take 2: explaining visual perceptual phenomena with independent filters / channels

Detection Thresholds and Linear Systems Analysis

Graham N & Nachmias J (1971), Detection of grating patterns containing two spatial frequencies: a comparison of single-channel and multiple-channels models. Vision Research 11(3) 251-9.

(script 1)

Detection Thresholds and Linear Systems Analysis

[Figure panels: components; ‘single channel’ prediction; ‘multi-channel’ prediction]

Results:

• The appeal of the ‘multiple channel’ approach is tightly linked to the expectation that the response of the system to a ‘compound’ stimulus could be predicted from its response to the constituent components (e.g., in the case of a visual pattern, from the response to its Fourier components).

• Such a system is called a linear system: R(A + B + C …) = R(A) + R(B) + R(C) + …

• Another way to put it is that the channels are expected to be non-interacting, i.e. that the response of one channel (to its own component) does not depend on the input to the other channels.

• How valid is this expectation [assumption] for sensation and perception?
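The superposition property R(A + B) = R(A) + R(B) can be checked numerically. The sketch below compares a linear channel (a small smoothing filter) with the same channel followed by squaring; the kernel, the two sinusoidal components, and the function names are hypothetical choices for illustration and are not the stimuli of Graham & Nachmias.

```python
import math

def linear_channel(signal, kernel=(0.25, 0.5, 0.25)):
    """A linear filter: each output is a fixed weighted sum of neighboring inputs."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def squaring_channel(signal):
    """The same filter followed by squaring, a simple non-linear system."""
    return [r * r for r in linear_channel(signal)]

N = 64
A = [math.sin(2 * math.pi * 2 * x / N) for x in range(N)]   # component f
B = [math.sin(2 * math.pi * 6 * x / N) for x in range(N)]   # component 3f
compound = [a + b for a, b in zip(A, B)]

def max_superposition_error(system):
    """Largest difference between R(A + B) and R(A) + R(B)."""
    whole = system(compound)
    parts = [ra + rb for ra, rb in zip(system(A), system(B))]
    return max(abs(w - p) for w, p in zip(whole, parts))

print("linear channel  :", max_superposition_error(linear_channel))
print("squaring channel:", max_superposition_error(squaring_channel))
```

For the linear channel the error is at the level of floating-point rounding; for the squaring channel it is large, because the cross term between the two components cannot be predicted from the responses to each component alone.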

(script 2)

Limitations of Linear Systems Approach: the role of relative phase

Piotrowski LN & Campbell FW (1982), A demonstration of the visual importance and flexibility of spatial-frequency amplitude and phase. Perception 11(3) 337-46.

Why is relative phase so crucial to appearance? Hint: what is the Fourier spectrum of an edge? (and of 1/f noise?... )

And why is threshold-detection nonetheless linear???
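One way to explore the hint is to build two waveforms with identical amplitude spectra and different phase spectra. The sketch below sums odd harmonics with amplitudes 1/k (the amplitude spectrum of a square-wave edge, up to a scale factor). With all phases aligned the sum approximates a sharp-edged square wave; with an arbitrary non-linear phase pattern (here k² radians, a hypothetical choice) the power is unchanged while the waveform is different. The sample count and number of harmonics are illustrative assumptions.

```python
import math

N = 256                        # samples per period (illustrative choice)
HARMONICS = range(1, 32, 2)    # odd harmonics 1, 3, ..., 31

def synthesize(phase_of):
    """Sum odd harmonics with 1/k amplitudes and a given phase per harmonic."""
    return [
        sum(math.sin(2 * math.pi * k * x / N + phase_of(k)) / k for k in HARMONICS)
        for x in range(N)
    ]

def power(wave):
    """Mean squared value; depends only on the amplitude spectrum."""
    return sum(v * v for v in wave) / len(wave)

def largest_step(wave):
    """Largest sample-to-sample change (circular), a crude edge-sharpness measure."""
    return max(abs(wave[(i + 1) % N] - wave[i]) for i in range(N))

aligned = synthesize(lambda k: 0.0)              # phases aligned at the edge
scrambled = synthesize(lambda k: float(k * k))   # arbitrary non-linear phases

for name, wave in (("aligned", aligned), ("scrambled", scrambled)):
    print(f"{name:9s}  power = {power(wave):.6f}   largest step = {largest_step(wave):.4f}")
```

The two rows report the same power, so any measure based only on the amplitude spectrum cannot tell the waveforms apart; the largest-step column can be compared to see how the phase pattern affects edge sharpness.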

Different levels of visual processing [(i) the boundaries are not 100% sharp; (ii) there is no universal agreement on definitions]

Low-level: processes that are achieved by an array of filters that have relatively small receptive fields, tile the visual field (w/ overlap), and are non- or minimally-interacting. Examples: center-surround contrast detection in LGN; orientation selectivity in V1.

Mid-level: processes that (i) group visual information about surface fragments that are disjoint in space and/or time; (ii) segment visual information into separate spatial and temporal entities.

High-level: visual recognition processes; rely on prior knowledge of specific objects or classes of objects (their visual properties, semantic and/or lexical knowledge).

Example . . . . .

(Adapted from Lorenceau and Shiffrar 1992)

More lines

One Object or Two Sets of Lines?

Will dots help?

Show All?

Mid-level visual processing: revisit definition

Low-level: processes that are achieved by an array of filters that have relatively small receptive fields, tile the visual field (w/ overlap), and are non- or minimally-interacting. Examples: center-surround contrast detection in LGN; orientation selectivity in V1.

Mid-level: processes that (i) group visual information about surface fragments that are disjoint in space and/or time*; (ii) segment visual information into separate spatial and temporal entities. Requires compilation of visual information from spatially and/or temporally disparate sources. A.k.a. “Perceptual Organization”; “Gestalt processing”; …

* Note: at the earliest stages of the visual pathway (i.e., the retina), even physically contiguous surface portions may not be represented as a unitary entity (‘thing’), and therefore an overall change in neural representation may need to occur in cortex.

High-level: visual recognition processes; rely on prior knowledge of specific objects or classes of objects (their visual properties, semantic and/or lexical knowledge).

The Gestalt Psychology Movement (Wertheimer, Köhler, Koffka, …):

Perceptions are Gestalts** -- “a whole that is more than the sum of its parts”

put differently: PERCEPTION IS FUNDAMENTALLY NON-LINEAR (the atomistic approach is doomed)

Emphasis on “perceptual organization”

Sensory and Motor Systems (G80.2202)

Psychophysics of Early & Mid-level Vision

Part II

Motion Integration and Segmentation: Plaids (Wallach 1935, 1976; Adelson & Movshon 1982; Hupé & Rubin 2003)


Reminder: show diff Alphas

(Rubin and Albert, VSS 2001)

Edges in Motion: Segmentation & integration in real-world images

Local velocity measurements are ambiguous …

Global Motion Processing:

“The aperture problem” Marr & Ullman (1981)

… is present not only for straight lines:

It is really just a subset of …

…and do not convey veridical information about the object’s global motion.

?

“The correspondence problem” Ullman (1979)

“The identity problem” Wallach (1935, 1976)
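The ambiguity of local velocity measurements can be made concrete with a little vector arithmetic. In the sketch below, a local detector viewing a straight edge is assumed to measure only the velocity component along the edge normal; two edges of the same rigidly moving object then yield different local readings. Combining the two constraints recovers the true velocity. This combination step is the "intersection of constraints" construction associated with Adelson & Movshon (1982) and is brought in here for illustration; the slides do not spell it out. The function names and the example velocity are hypothetical.

```python
import math

def unit_normal(edge_orientation_deg):
    """Unit vector perpendicular to an edge with the given orientation."""
    theta = math.radians(edge_orientation_deg)
    return (-math.sin(theta), math.cos(theta))

def normal_speed(velocity, normal):
    """What a local detector can measure: the velocity component along the normal."""
    return velocity[0] * normal[0] + velocity[1] * normal[1]

def intersect_constraints(n1, s1, n2, s2):
    """Solve v.n1 = s1 and v.n2 = s2 for the 2D velocity v (Cramer's rule)."""
    det = n1[0] * n2[1] - n1[1] * n2[0]
    if abs(det) < 1e-12:
        raise ValueError("parallel edges leave the velocity ambiguous")
    vx = (s1 * n2[1] - n1[1] * s2) / det
    vy = (n1[0] * s2 - n2[0] * s1) / det
    return (vx, vy)

true_velocity = (1.0, 0.0)                      # object drifts rightward
n1, n2 = unit_normal(45.0), unit_normal(135.0)  # two differently oriented edges
s1, s2 = normal_speed(true_velocity, n1), normal_speed(true_velocity, n2)

# Each local reading, taken alone, points along its own edge normal.
local_1 = (s1 * n1[0], s1 * n1[1])
local_2 = (s2 * n2[0], s2 * n2[1])
recovered = intersect_constraints(n1, s1, n2, s2)

print("local reading, edge 1:", local_1)
print("local reading, edge 2:", local_2)
print("combined estimate    :", recovered)
```

Neither local reading equals the true velocity, and they disagree with each other, while the combined estimate matches it. A single edge, or two parallel edges, leaves the system under-determined, which is the aperture problem in algebraic form.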

(From Pack et al. 2003)

1D and 2D motion cues

Using short-bar stimuli and a reverse-correlation technique, Pack et al. (2003) showed that the responses of end-stopped cells in V1 reliably signal the 2D motion direction of a bar’s endpoints, regardless of its orientation (i.e., these cells do not suffer from “the aperture problem”).

end-stopped cell:

non-end-stopped cell:

Motion Integration Motion Segmentation

Back to Plaid Perception :

“Coherency” “Transparency”

Back to Plaid Perception :

2D motion signals 1D motion signals

(Demo adapted from Lorenceau & Shiffrar, VR 1992)

More lines

Back to ‘One Object, or Two Pairs of Lines?’

Scene Segmentation affects the assignment of local motion cues as ‘intrinsic’ vs. ‘extrinsic’

Will dots help?

Show All?

(After Shimojo, Silverman & Nakayama, VR 1989)
