
Large-Scale Face Manifold Learning

Sanjiv Kumar

Google Research New York, NY

* Joint work with A. Talwalkar, H. Rowley and M. Mohri


Face Manifold Learning

Figure: 50 x 50 pixel faces versus 50 x 50 pixel random images, each a point in ℝ^2500.

Space of face images significantly smaller than 256^2500

Want to recover the underlying (possibly nonlinear) space ! (Dimensionality Reduction)


Dimensionality Reduction

•  Linear Techniques
   –  PCA, Classical MDS
   –  Assume data lies in a subspace
   –  Directions of maximum variance

•  Nonlinear Techniques
   –  Manifold learning methods
      •  LLE [Roweis & Saul ’00]
      •  ISOMAP [Tenenbaum et al. ’00]
      •  Laplacian Eigenmaps [Belkin & Niyogi ’01]
   –  Assume local linearity of data
   –  Need densely sampled data as input

Bottleneck: Computational Complexity ≈ O(n^3) !


Outline

•  Manifold Learning
   –  ISOMAP

•  Approximate Spectral Decomposition
   –  Nystrom and Column-Sampling approximations

•  Large-scale Manifold learning
   –  18M face images from the web
   –  Largest study so far: ~270K points

•  People Hopper – A Social Application on Orkut


ISOMAP [Tenenbaum et al., ’00]

•  Find the low-dimensional representation that best preserves geodesic distances between points


Recovers true manifold asymptotically !

(Equation not recovered; its terms were labeled "Output co-ordinates" and "Geodesic distance".)


ISOMAP [Tenenbaum et al., ’00]

Given n input images:

•  Find t nearest neighbors for each image : O(n^2)

•  Find shortest path distance for every (i, j), Δ_ij : O(n^2 log n)

•  Construct n × n matrix G with entries as centered Δ_ij^2
   –  G ~ 18M x 18M dense matrix

•  Optimal k reduced dims: U_k Σ_k^{1/2}, with U_k the top k eigenvectors and Σ_k the top k eigenvalues of G : O(n^3) !
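The ISOMAP steps above can be sketched in NumPy on a small dense problem. This is an illustration only, not the pipeline used in the talk: the function name `isomap`, the brute-force neighbor search, and the Floyd-Warshall loop (O(n^3), chosen for brevity over the O(n^2 log n) shortest-path cost quoted on the slide) are all assumptions made for the example.

```python
import numpy as np

def isomap(X, t=5, k=2):
    """Toy dense ISOMAP: X is (n, d); returns an (n, k) embedding."""
    n = X.shape[0]
    # Pairwise Euclidean distances, O(n^2) memory.
    sq = np.sum(X ** 2, axis=1)
    D = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0))
    # t-nearest-neighbor graph; inf marks "no edge".
    D_nn = D.copy()
    np.fill_diagonal(D_nn, np.inf)          # exclude self-matches
    nbrs = np.argsort(D_nn, axis=1)[:, :t]
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    rows = np.repeat(np.arange(n), t)
    graph[rows, nbrs.ravel()] = D[rows, nbrs.ravel()]
    graph = np.minimum(graph, graph.T)      # make the graph undirected
    # All-pairs shortest paths (Floyd-Warshall, vectorized per pivot).
    for m in range(n):
        graph = np.minimum(graph, graph[:, [m]] + graph[[m], :])
    if not np.all(np.isfinite(graph)):
        raise ValueError("neighborhood graph is not connected; increase t")
    # Double-center the squared geodesic distances to get G.
    J = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * J @ (graph ** 2) @ J
    # Top-k eigenpairs; embedding is U_k * sqrt(Sigma_k).
    evals, evecs = np.linalg.eigh(G)
    top = np.argsort(evals)[::-1][:k]
    lam = np.maximum(evals[top], 0.0)       # G need not be exactly PSD
    return evecs[:, top] * np.sqrt(lam)
```

For points sampled along a straight line, the geodesic and Euclidean distances coincide, so a one-dimensional embedding should reproduce the original spacing up to sign and shift.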


Spectral Decomposition

•  Need to do eigen-decomposition of symmetric positive semi-definite matrix G [n × n] : O(n^3)

•  For n = 18M, G ≈ 1300 TB
   –  ~100,000 x 12GB RAM machines

•  Iterative methods [Golub & Van Loan, ’83][Gorrell, ’06]
   –  Jacobi, Arnoldi, Hebbian
   –  Need matrix-vector products and several passes over data
   –  Not suitable for large dense matrices

•  Sampling-based methods
   –  Column-Sampling Approximation [Frieze et al., ’98]
   –  Nystrom Approximation [Williams & Seeger, ’00]

Relationship and comparative performance?
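The storage figure can be checked with a few lines of arithmetic. The sketch assumes 4 bytes per entry (single precision) and decimal units (1 TB = 10^12 bytes); neither is stated on the slide, but together they land close to the quoted ≈ 1300 TB and ~100,000 machines.

```python
# Back-of-the-envelope storage estimate for a dense n x n matrix G.
n = 18_000_000            # number of face images (from the slides)
bytes_per_entry = 4       # assumption: float32 entries
total_bytes = n * n * bytes_per_entry

terabytes = total_bytes / 1e12          # assumption: 1 TB = 10^12 bytes
machines_12gb = total_bytes / 12e9      # machines with 12 GB of RAM each

print(f"G needs about {terabytes:.0f} TB")
print(f"that is about {machines_12gb:,.0f} machines with 12 GB RAM")
```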


Approximate Spectral Decomposition

•  Sample l columns randomly without replacement
   –  C: the n × l matrix of sampled columns of G
   –  W: the l × l block of C at the sampled rows

•  Column-Sampling Approximation – SVD of C [Frieze et al., ’98]

•  Nystrom Approximation – SVD of W [Williams & Seeger, ’00][Drineas & Mahoney, ’05]
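A minimal sketch of the sampling step, assuming G is only available one column at a time so that the full n × n matrix is never formed. The callback `column_fn` and the function name `sample_columns` are hypothetical names introduced for this example.

```python
import numpy as np

def sample_columns(column_fn, n, l, rng):
    """Draw l column indices uniformly without replacement and build C and W.

    column_fn(j) must return column j of the symmetric n x n matrix G
    as a length-n array (hypothetical callback for this sketch).
    """
    idx = rng.choice(n, size=l, replace=False)
    C = np.column_stack([column_fn(j) for j in idx])   # n x l
    W = C[idx, :]                                      # l x l block at sampled rows
    return idx, C, W
```

Both approximations then work from C and W alone, which is what makes the O(n^2) storage of G avoidable.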


Column-Sampling Approximation


(Equations not recovered. Surviving annotations: matrix sizes [n × l] and [l × l]; costs O(nl^2) ! and O(l^3) !)
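Since the slide equations did not survive extraction, the following sketch assumes the usual column-sampling form: the left singular vectors of C serve as approximate eigenvectors of G, and the singular values of C scaled by sqrt(n/l) serve as approximate eigenvalues. The function name and the route through the l × l matrix C^T C are choices made for this example; the two costs in the comments mirror the O(nl^2) and O(l^3) annotations.

```python
import numpy as np

def column_sampling_eig(C, k):
    """Approximate top-k eigenpairs of G from its n x l column sample C."""
    n, l = C.shape
    gram = C.T @ C                               # l x l, costs O(n l^2)
    evals, V = np.linalg.eigh(gram)              # costs O(l^3)
    top = np.argsort(evals)[::-1][:k]
    sing = np.sqrt(np.maximum(evals[top], 0.0))  # singular values of C
    if np.any(sing == 0.0):
        raise ValueError("C has fewer than k nonzero singular values")
    U = (C @ V[:, top]) / sing                   # left singular vectors of C
    return np.sqrt(n / l) * sing, U
```

Because U comes from an SVD, its columns are orthonormal by construction.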


Nystrom Approximation

(Figure and equations not recovered; the figure showed the block C of l sampled columns. Surviving annotations: O(l^3) ! and Not Orthonormal !)
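With the slide equations lost, this sketch assumes the usual Nystrom form: decompose W = U_W Σ_W U_W^T, take (n/l) Σ_W as approximate eigenvalues and sqrt(l/n) C U_W Σ_W^{-1} as approximate eigenvectors, and reconstruct G as C W_k^+ C^T. The function name `nystrom` and its return layout are assumptions made for the example.

```python
import numpy as np

def nystrom(C, idx, k):
    """Nystrom approximation from the n x l column sample C of G.

    idx holds the indices of the sampled columns, in the same order
    as the columns of C.
    """
    n, l = C.shape
    W = C[idx, :]                              # l x l block at sampled rows
    evals, U_w = np.linalg.eigh(W)             # costs O(l^3)
    top = np.argsort(evals)[::-1][:k]
    lam, U_w = evals[top], U_w[:, top]
    if np.any(lam <= 0.0):
        raise ValueError("W has fewer than k positive eigenvalues")
    approx_evals = (n / l) * lam
    CU = C @ U_w                               # n x k
    approx_evecs = np.sqrt(l / n) * CU / lam   # generally not orthonormal
    G_hat = (CU / lam) @ CU.T                  # C W_k^+ C^T
    return approx_evals, approx_evecs, G_hat
```

When G has rank k and the sampled block W also has rank k, the reconstruction is exact; in general the extrapolated eigenvectors are not orthonormal, which is the point flagged on the slide.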


Nystrom Vs Column-Sampling

•  Experimental Comparison [Kumar, Mohri & Talwalkar, ICML ’09]
   –  A random set of 7K face images
   –  Eigenvalues, eigenvectors, and low-rank approximations


Eigenvalues Comparison

Figure: % deviation from exact.


Eigenvectors Comparison

Figure: principal angle with exact.


Low-Rank Approximations

Nystrom gives better reconstruction than Col-Sampling !



Orthogonalized Nystrom

Nystrom-orthogonal gives worse reconstruction than Nystrom !


Low-Rank Approximations Matrix Projection


$\tilde{G}_{nys} = C \left( \frac{l}{n} W^{-2} \right) C^{T} G$

$\tilde{G}_{col} = C \left( C^{T} C \right)^{-1} C^{T} G$
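The two matrix-projection formulas can be compared numerically on a small example. The sketch below is an illustration under stated assumptions: the function names are made up, a least-squares solve stands in for the explicit inverse of C^T C, and a pseudo-inverse stands in for W^{-1}.

```python
import numpy as np

def project_column_sampling(C, G):
    """C (C^T C)^{-1} C^T G, computed as a least-squares fit of G onto span(C)."""
    coeffs, *_ = np.linalg.lstsq(C, G, rcond=None)
    return C @ coeffs

def project_nystrom(C, W, G):
    """C ((l/n) W^{-2}) C^T G, using a pseudo-inverse of W."""
    n, l = C.shape
    W_inv = np.linalg.pinv(W)
    return C @ ((l / n) * (W_inv @ W_inv)) @ (C.T @ G)

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 12))
G = A @ A.T + 0.1 * np.eye(30)            # small symmetric PSD test matrix
idx = rng.choice(30, size=8, replace=False)
C, W = G[:, idx], G[np.ix_(idx, idx)]

err_col = np.linalg.norm(G - project_column_sampling(C, G))
err_nys = np.linalg.norm(G - project_nystrom(C, W, G))
print(f"Frobenius error  col-sampling: {err_col:.4f}  Nystrom: {err_nys:.4f}")
```

The column-sampling expression is the orthogonal projection of G onto the column space of C, while the Nystrom expression also lies in that column space, so the column-sampling error in Frobenius norm can never exceed the Nystrom one for the same C.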


Col-Sampling gives better Reconstruction than Nystrom !


–  Theoretical guarantees in special cases [Kumar et al., ICML ’09]


How many columns are needed?

Figure: columns needed to get 75% relative accuracy.

•  Sampling Methods
   –  Theoretical analysis of uniform sampling method
   –  Adaptive sampling methods
   –  Ensemble sampling methods

[Deshpande et al., FOCS ’06] [Kumar et al., ICML ’09] [Kumar et al., AISTATS ’09] [Kumar et al., NIPS ’09]


So Far …

•  Manifold Learning
   –  ISOMAP

•  Approximate Spectral Decomposition
   –  Nystrom and Column-Sampling approximations

•  Large-scale Face Manifold learning
   –  18M face images from the web

•  People Hopper – A Social Application on Orkut


Large-Scale Face Manifold Learning

•  Construct Web dataset
   –  Extracted 18M faces from 2.5B internet images
   –  ~15 hours on 500 machines
   –  Faces normalized to zero mean and unit variance

•  Graph construction
   –  Exact search ~3 months (on 500 machines)
   –  Approx Nearest Neighbor – Spill Trees (5 NN, ~2 days) [Liu et al., ’04] [Talwalkar, Kumar & Rowley, CVPR ’08]
   –  New methods for hashing based kNN search [CVPR ’10] [ICML ’10] [ICML ’11]
   –  Less than 5 hours!


Neighborhood Graph Construction

•  Connect each node (face) with its neighbors

•  Is the graph connected?
   –  Depth-First-Search to find largest connected component
   –  10 minutes on a single machine
   –  Largest component depends on number of NN ( t )
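The connectivity check can be sketched with an iterative depth-first search. The adjacency-dictionary input and the function name `largest_component` are assumptions for this example; the graph is taken to be undirected, with every node present as a key.

```python
def largest_component(adj):
    """Return the sorted node list of the largest connected component.

    adj maps each node to an iterable of its neighbors (undirected graph).
    """
    seen = set()
    best = []
    for start in adj:
        if start in seen:
            continue
        # Iterative DFS from an unvisited node collects one component.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            u = stack.pop()
            component.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        if len(component) > len(best):
            best = component
    return sorted(best)
```

An explicit stack avoids Python's recursion limit, which matters once the graph has more than a few thousand nodes on a single path.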


Samples from connected components

From Largest Component

From Smaller Components


Graph Manipulation

•  Approximating Geodesics
   –  Shortest paths between pairs of face images
   –  Computing for all pairs infeasible: O(n^2 log n) !

•  Key Idea: Need only a few columns of G for sampling-based decomposition
   –  require shortest paths between a few ( l ) nodes and all other nodes
   –  1 hour on 500 machines (l = 10K)

•  Computing Embeddings (k = 100)
   –  Nystrom: 1.5 hours, 500 machines
   –  Col-Sampling: 6 hours, 500 machines
   –  Projections: 15 mins, 500 machines
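The key idea, running single-source shortest paths from only the l sampled nodes, can be sketched as follows. This is a single-machine illustration with hypothetical names (`dijkstra`, `geodesic_columns`); it returns raw geodesic distances and leaves out the centering step as well as the distributed implementation used in the talk.

```python
import heapq
import numpy as np

def dijkstra(adj, source):
    """Shortest-path distances from source; adj[u] is a list of (v, weight)."""
    dist = [float("inf")] * len(adj)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def geodesic_columns(adj, sampled):
    """n x l matrix whose j-th column holds distances from sampled[j] to all nodes."""
    return np.array([dijkstra(adj, s) for s in sampled]).T
```

Only l single-source runs are needed instead of n, which is what brings the cost down from the all-pairs figure.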


18M-Manifold in 2D

Nystrom Isomap


Shortest Paths on Manifold

18M samples not enough!


Summary

•  Large-scale nonlinear dimensionality reduction using manifold learning on 18M face images

•  Fast approximate SVD based on sampling methods

•  Open Questions
   –  Does a manifold really exist, or does the data form clusters in low-dimensional subspaces?
   –  How much data is really enough?


People Hopper

•  A fun social application on Orkut

•  Face manifold constructed with Orkut database
   –  Extracted 13M faces from about 146M profile images
   –  ~3 days on 50 machines
   –  Color face image (40x48 pixels) → 5760-dim vector (40 × 48 × 3)
   –  Faces normalized to zero mean and unit variance in intensity space

•  Shortest path search using bidirectional Dijkstra

•  Users can opt-out
   –  Daily incremental graph update
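The shortest-path search between two faces can be sketched with a textbook bidirectional Dijkstra on an undirected weighted graph. The slides do not describe the production implementation, so the adjacency-list layout and the function name are assumptions; the sketch returns only the path length, not the sequence of faces.

```python
import heapq

def bidirectional_dijkstra(adj, source, target):
    """Shortest-path length between source and target.

    adj[u] is a list of (v, weight) pairs of an undirected graph.
    Returns float('inf') if the two nodes are not connected.
    """
    if source == target:
        return 0.0
    dist = [{source: 0.0}, {target: 0.0}]        # forward / backward labels
    heaps = [[(0.0, source)], [(0.0, target)]]
    done = [set(), set()]
    best = float("inf")
    while heaps[0] and heaps[1]:
        # No undiscovered path can beat `best` once the two frontiers
        # together are at least that far out.
        if heaps[0][0][0] + heaps[1][0][0] >= best:
            break
        side = 0 if heaps[0][0][0] <= heaps[1][0][0] else 1
        d, u = heapq.heappop(heaps[side])
        if u in done[side]:
            continue                              # stale heap entry
        done[side].add(u)
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[side].get(v, float("inf")):
                dist[side][v] = nd
                heapq.heappush(heaps[side], (nd, v))
            # A path through edge (u, v) exists if the other search reached v.
            if v in dist[1 - side]:
                best = min(best, nd + dist[1 - side][v])
    return best
```

Searching from both ends keeps each frontier small, which is the usual reason to prefer this variant for point-to-point queries on large graphs.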


People Hopper Interface


From the Blogs


CMU-PIE Dataset

•  68 people, 13 poses, 43 illuminations, 4 expressions

•  35,247 faces detected by a face detector

•  Classification and clustering on poses


Clustering

•  K-means clustering after transformation (k = 100)
   –  K fixed to be the same as number of classes

•  Two metrics
   –  Purity - points within a cluster come from the same class
   –  Accuracy - points from a class form a single cluster

Matrix G is not guaranteed to be positive semi-definite in Isomap !
   –  Nystrom: EVD of W (can ignore negative eigenvalues)
   –  Col-sampling: SVD of C (signs are lost) !
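The two clustering metrics can be made concrete under one plausible reading. Purity below is the standard definition (each cluster votes for its dominant class). The slide gives only a one-line description of accuracy, so the class-side mirror used here (each class votes for its dominant cluster) is an assumption, and `class_accuracy` is a name made up for this sketch.

```python
from collections import Counter

def purity(classes, clusters):
    """Fraction of points that belong to the dominant class of their cluster."""
    by_cluster = {}
    for cls, clu in zip(classes, clusters):
        by_cluster.setdefault(clu, Counter())[cls] += 1
    return sum(max(c.values()) for c in by_cluster.values()) / len(classes)

def class_accuracy(classes, clusters):
    """Fraction of points that fall in the dominant cluster of their class."""
    return purity(clusters, classes)      # same computation with roles swapped
```

The two can differ sharply: splitting every point into its own cluster gives perfect purity but poor accuracy, which is why the slide reports both.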


Optimal 2D embeddings


Laplacian Eigenmaps [Belkin & Niyogi, ’01]

Minimize weighted distances between neighbors

•  Find t nearest neighbors for each image : O(n^2)

•  Compute weight matrix W (formula not recovered)

•  Compute normalized Laplacian G (formula not recovered)

•  Optimal k reduced dims: U_k, the bottom eigenvectors of G : O(n^3)
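Because the weight and Laplacian formulas were lost in extraction, this sketch fills them with common choices that should be read as assumptions: heat-kernel weights exp(-||x_i - x_j||^2 / (2 σ^2)) on the t-nearest-neighbor graph, and the symmetric normalized Laplacian G = I - D^{-1/2} W D^{-1/2} with D the diagonal degree matrix. The function name and the `sigma` parameter are hypothetical.

```python
import numpy as np

def laplacian_eigenmaps(X, t=5, k=2, sigma=1.0):
    """Toy dense Laplacian Eigenmaps: returns (eigenvalues, (n, k) embedding)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # t nearest neighbors of each point, excluding the point itself.
    D2_nn = D2.copy()
    np.fill_diagonal(D2_nn, np.inf)
    nbrs = np.argsort(D2_nn, axis=1)[:, :t]
    # Heat-kernel weights on neighbor edges, then symmetrize.
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), t)
    cols = nbrs.ravel()
    W[rows, cols] = np.exp(-D2[rows, cols] / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)
    # Symmetric normalized Laplacian.
    inv_sqrt_deg = 1.0 / np.sqrt(W.sum(axis=1))
    G = np.eye(n) - inv_sqrt_deg[:, None] * W * inv_sqrt_deg[None, :]
    # Bottom eigenvectors, skipping the trivial one at eigenvalue ~0.
    evals, evecs = np.linalg.eigh(G)      # ascending order
    return evals[1:k + 1], evecs[:, 1:k + 1]
```

Skipping the first eigenvector is the usual convention for a connected graph, where it carries no information about the data.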


Different Sampling Procedures