a factor model for learning higher-order features in natural images
yan karklin (nyu + hhmi)
joint work with mike lewicki (case western reserve u)
icml workshop on learning feature hierarchies - june 18, 2009
[diagram: the early visual pathway (retina → LGN → V1 → V2)]
non-linear processing in the early visual system
[figure: cat LGN (Carandini 2004): estimated linear receptive field and contrast/size response function]
[figure: cat LGN responses (DeAngelis, Ohzawa, Freeman 1995); cell (V1) vs. model]

divisive normalization (Schwartz and Simoncelli, 2001): the squared output of a linear filter is divided by a weighted sum of the squared outputs of neighboring filters plus a constant, giving the normalized response

r_i = (w_i^T x)^2 / (sigma^2 + sum_j c_ij (w_j^T x)^2)
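As a concrete reference point, here is a minimal numpy sketch of divisive normalization in this form; the pool weights c and the constant sigma are illustrative choices, not values from the talk.

```python
import numpy as np

def divisive_normalization(responses, weights, sigma=1.0):
    """Divisive normalization in the Schwartz & Simoncelli (2001) style.

    responses : (n,) linear filter outputs w_i^T x
    weights   : (n, n) normalization pool weights c_ij (illustrative)
    sigma     : semi-saturation constant
    """
    sq = responses ** 2
    pool = sigma ** 2 + weights @ sq   # weighted sum of squared neighbor responses
    return sq / pool                   # normalized responses r_i

# usage: 10 filters with a uniform normalization pool
r = np.random.randn(10)
c = np.full((10, 10), 0.1)
print(divisive_normalization(r, c))
```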
non-linear statistical dependencies in natural images
a generative model for covariance matrices
multivariate Gaussian model: x | y ~ N(0, C)
the latent variables y specify the covariance C
a distributed code

if only we could find a basis for covariance matrices...

C = y_1 A_1 + y_2 A_2 + y_3 A_3 + y_4 A_4 + . . .

but a linear combination of basis matrices is not guaranteed to be positive definite, so use the matrix-logarithm transform instead:

log C = y_1 A_1 + y_2 A_2 + y_3 A_3 + y_4 A_4 + . . .

(the matrix exponential of any symmetric matrix is symmetric positive definite, so every setting of the y_j's yields a valid covariance)
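A minimal sketch of this construction, assuming the log C = sum_j y_j A_j form above; the dimensions and the random symmetric basis are illustrative.

```python
import numpy as np
from scipy.linalg import expm

def covariance_from_latents(y, A):
    """Build a covariance as the matrix exponential of a linear combination:
    log C = sum_j y_j A_j, so C = expm(sum_j y_j A_j).

    y : (J,) latent variables
    A : (J, D, D) symmetric basis matrices
    The matrix exponential of a symmetric matrix is always symmetric
    positive definite, so C is a valid covariance for any y.
    """
    logC = np.tensordot(y, A, axes=1)   # sum_j y_j A_j, shape (D, D)
    return expm(logC)

# usage: random symmetric basis, D=4, J=3 (illustrative sizes)
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 4, 4))
A = (M + M.transpose(0, 2, 1)) / 2      # symmetrize each basis matrix
y = rng.standard_normal(3)
C = covariance_from_latents(y, A)
print(np.linalg.eigvalsh(C))            # all eigenvalues positive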
a compact parameterization for cov-components

find common directions of change in variation/correlation

[figure: example patches: bark, foliage]

each cov-component is built from a shared set of vectors: A_j = sum_k w_jk b_k b_k^T

the vectors b_k allow a more compact description of the basis functions; the b_k's are unit norm (the weights w_jk can absorb any scaling)
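A sketch of this parameterization, assuming A_j = sum_k w_jk b_k b_k^T with unit-norm b_k as described; the array sizes below are illustrative (the talk used 1000 b_k's and 150 y_j's on 20x20 patches).

```python
import numpy as np

def cov_components(W, B):
    """Build cov-components A_j = sum_k w_jk b_k b_k^T from shared vectors.

    W : (J, K) weights w_jk
    B : (K, D) rows are the shared vectors b_k
    returns (J, D, D) array of cov-components
    """
    B = B / np.linalg.norm(B, axis=1, keepdims=True)   # enforce unit norm
    outer = np.einsum('kd,ke->kde', B, B)              # b_k b_k^T for each k
    return np.einsum('jk,kde->jde', W, outer)          # weighted sums

# usage with small illustrative sizes: J=5 components, K=12 vectors, D=8
rng = np.random.default_rng(1)
A = cov_components(rng.standard_normal((5, 12)), rng.standard_normal((12, 8)))
print(A.shape)   # (5, 8, 8)
```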
the full model

[diagram: two-layer generative model: latent variables y_j weight the cov-components A_j (built from the shared vectors b_k), which together define the covariance of the observed image patch]

normalization by the inferred covariance matrix yields a normalized response, the model's analogue of divisive normalization
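A sketch of that normalization step, assuming it amounts to whitening the patch by the inferred covariance, x_norm = C^{-1/2} x; the symmetric inverse square root used here is one natural choice.

```python
import numpy as np

def whiten_by_covariance(x, C):
    """Normalize a patch by the inferred covariance: x_norm = C^{-1/2} x.

    Uses the symmetric inverse square root of the positive definite C;
    this is the matrix analogue of dividing by a scalar contrast estimate.
    """
    vals, vecs = np.linalg.eigh(C)
    C_inv_sqrt = (vecs * vals ** -0.5) @ vecs.T   # V diag(vals^-1/2) V^T
    return C_inv_sqrt @ x

# usage with a random positive definite covariance (illustrative)
rng = np.random.default_rng(3)
M = rng.standard_normal((6, 6))
C = M @ M.T + 6 * np.eye(6)
x = rng.standard_normal(6)
print(whiten_by_covariance(x, C))
```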
training the model on natural images
learning details:
• gradient ascent on data likelihood
• 20x20 image patches from ~100 natural images
• 1000 b_k's
• 150 y_j's
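As an illustration of the likelihood being ascended, here is a toy sketch that infers the latents y for a single patch. It is not the talk's training procedure: it uses scipy's general-purpose optimizer with numerical gradients, omits any prior on y, and does not learn the A_j's.

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

def neg_log_lik(y, x, A):
    """-log p(x | y) for x ~ N(0, C) with log C = sum_j y_j A_j,
    up to additive constants."""
    C = expm(np.tensordot(y, A, axes=1))
    _, logdet = np.linalg.slogdet(C)
    return 0.5 * (x @ np.linalg.solve(C, x) + logdet)

# infer latents for one toy patch (D=6 pixels, J=4 latents, illustrative)
rng = np.random.default_rng(2)
M = rng.standard_normal((4, 6, 6))
A = (M + M.transpose(0, 2, 1)) / 2
x = rng.standard_normal(6)
res = minimize(neg_log_lik, np.zeros(4), args=(x, A))
print(res.x)   # inferred y
```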
learned vectors

[figure: learned b_k vectors; black: 25 example features, grey: all 1000 features in the model]

one unit encodes global image contrast
[figure: weights w_{105,:} of the contrast-encoding unit; original patches vs. the same patches after normalization by the inferred contrast]
[figure: weight patterns w_{j,:} for example units j = 105, 85, 57, 11, 12, 125, 90, 144, 27, 37, 78, 66]
[figure: original patches; after normalization by inferred contrast; after normalization by inverse covariance]
Restricted Boltzmann Machines

[diagram: RBM with visible units v and hidden units h]

gated Restricted Boltzmann Machines (Memisevic and Hinton, CVPR 07), trained on image sequences

[diagram: gated RBM: two frames x and y of an image sequence (t), both Gaussian (pixels), gated by binary hidden units h]
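For reference, a sketch of the gated RBM's three-way energy (Memisevic and Hinton, CVPR 07), with bias terms omitted for brevity. The binary hidden units gate multiplicative interactions between the two frames, which lets the model encode the transformation relating them.

```python
import numpy as np

def gated_rbm_energy(x, y, h, W):
    """Three-way energy of a gated RBM, bias terms omitted:
    E(x, y, h) = -sum_{ijk} W_ijk x_i y_j h_k

    x, y : real-valued (Gaussian) pixel vectors, two frames of a sequence
    h    : binary hidden vector
    W    : (I, J, K) weight tensor gating pairwise x-y interactions
    """
    return -np.einsum('ijk,i,j,k->', W, x, y, h)
```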
log-covariance factor model, trained on image sequences
summary
• modeling non-linear dependencies motivated by statistical patterns
• higher-order models can capture context
• normalization by local “whitening” removes most structure
questions
• joint model for normalized response and normalization (context) signal?
• how to extend to entire image?
• where are these computations found in the brain?