image manifolds 16-721: learning-based methods in vision alexei efros, cmu, spring 2007 © a.a....

47
Image Manifolds 16-721: Learning-based Methods in Visi Alexei Efros, CMU, Spring 20 © A.A. Efros th slides by Dave Thompson

Post on 15-Jan-2016

221 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Image Manifolds

16-721: Learning-based Methods in VisionAlexei Efros, CMU, Spring 2007

© A.A. Efros

With slides by Dave Thompson

Page 2: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Images as Vectors

=

m

n

n*m

Page 3: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Importance of Alignment

=

m

n

n*m

=

n*m

=?

Page 4: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Text Synthesis

[Shannon,’48] proposed a way to generate English-looking text using N-grams:• Assume a generalized Markov model

• Use a large text to compute prob. distributions of each letter given N-1 previous letters

• Starting from a seed repeatedly sample this Markov chain to generate new letters

• Also works for whole words

WE NEED TO EAT CAKE

Page 5: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Mark V. Shaney (Bell Labs)

Results (using alt.singles corpus):• “As I've commented before, really relating to

someone involves standing next to impossible.”

• “One morning I shot an elephant in my arms and kissed him.”

• “I spent an interesting evening recently with a grain of salt”

Page 6: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Video TexturesVideo Textures

Arno Schödl

Richard Szeliski

David Salesin

Irfan Essa

Microsoft Research, Georgia Tech

Page 7: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Video texturesVideo textures

Page 8: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Our approachOur approach

• How do we find good transitions?

Page 9: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Finding good transitions Finding good transitions

• Compute L2 distance Di, j between all frames

Similar frames make good transitions

frame ivs.

frame j

Page 10: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Markov chain representationMarkov chain representation

2 3 41

Similar frames make good transitions

Page 11: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Transition costs Transition costs

• Transition from i to j if successor of i is similar to j

• Cost function: Cij = Di+1, j

• i

j

i+ 1

j-1

i j D i+ 1, j

Page 12: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Transition probabilitiesTransition probabilities

•Probability for transition Pij inversely related

to cost:•Pij ~ exp ( – Cij / 2 )

high low

Page 13: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Preserving dynamicsPreserving dynamics

Page 14: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Preserving dynamics Preserving dynamics

Page 15: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Preserving dynamics Preserving dynamics

• Cost for transition ij

• Cij = wk Di+k+1, j+kk = -N

N -1

i

j j+ 1

i+ 1 i+ 2

j-1j-2

i jD i, j - 1 D Di+ 1, j i+ 2, j+ 1

i-1

D i-1, j-2

Page 16: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Preserving dynamics – effect Preserving dynamics – effect

• Cost for transition ij

• Cij = wk Di+k+1, j+kk = -N

N -1

Page 17: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Video sprite extractionVideo sprite extraction

blue screen m attingand velocity estim ation

Page 18: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

C i j = + angle C i j

vector tom ouse pointer

S im ilarity term Contro l term

velocity vector

Animation{ {

Video sprite controlVideo sprite control

• Augmented transition cost:

Page 19: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Interactive fishInteractive fish

Page 20: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Advanced Perception David R. Thompson

manifold learning with applications to object recognition

Page 21: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

plenoptic function

manifolds in vision

Page 22: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

appearance variation

manifolds in vision

images from hormel corp.

Page 23: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

deformation

manifolds in vision

images from www.golfswingphotos.com

Page 24: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Find a low-D basis for describing high-D data. X ~= X' S.T. dim(X') << dim(X)

uncovers the intrinsic dimensionality

manifold learning

Page 25: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

If we knew all pairwise distances…

Chicago Raleigh Boston Seattle S.F. Austin Orlando

Chicago 0

Raleigh 641 0

Boston 851 608 0

Seattle 1733 2363 2488 0

S.F. 1855 2406 2696 684 0

Austin 972 1167 1691 1764 1495 0

Orlando 994 520 1105 2565 2458 1015 0

Distances calculated with geobytes.com/CityDistanceTool

Page 26: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Multidimensional Scaling (MDS) For n data points, and a distance matrix D,

Dij =

...we can construct a m-dimensional space to preserve inter-point distances by using the top eigenvectors of D scaled by their eigenvalues

j

i

Page 27: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

MDS result in 2D

Page 28: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Actual plot of cities

Page 29: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Don’t know distances

Page 30: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Don’t know distnaces

Page 31: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

1. data compression

2. “curse of dimensionality”

3. de-noising

4. visualization

5. reasonable distance metrics

why do manifold learning?

Page 32: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

reasonable distance metrics

?

Page 33: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

reasonable distance metrics

?

linear interpolation

Page 34: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

reasonable distance metrics

?

manifold interpolation

Page 35: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Isomap for images

Build a data graph G. Vertices: images (u,v) is an edge iff SSD(u,v) is small For any two images, we approximate the

distance between them with the “shortest path” on G

Page 36: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Isomap

1. Build a sparse graph with K-nearest neighbors

Dg =

(distance matrix issparse)

Page 37: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Isomap

2. Infer other interpoint distances by finding shortest paths on the graph (Dijkstra's algorithm).

Dg =

Page 38: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Isomap shortest-distance on a graph is easy to compute

Page 39: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Isomap results: hands

Page 40: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

- preserves global structure

- few free parameters

- sensitive to noise, noise edges

- computationally expensive (dense matrix eigen-reduction)

Isomap: pro and con

Page 41: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Leakage problem

Page 42: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Find a mapping to preserve local linear relationships between neighbors

Locally Linear Embedding

Page 43: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

Locally Linear Embedding

Page 44: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

1. Find weight matrix W of linear coefficients:

Enforce sum-to-one constraint.

LLE: Two key steps

Page 45: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

2. Find projected vectors Y to minimize reconstruction error

must solve for whole dataset simultaneously

LLE: Two key steps

Page 46: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

LLE: Result preserves local topology

PCA

LLE

Page 47: Image Manifolds 16-721: Learning-based Methods in Vision Alexei Efros, CMU, Spring 2007 © A.A. Efros With slides by Dave Thompson

- no local minima, one free parameter - incremental & fast

- simple linear algebra operations

- can distort global structure

LLE: pro and con