Advanced Machine Learning & Perception
DESCRIPTION
Advanced Machine Learning & Perception. Instructor: Tony Jebara. Topic 12: Manifold Learning (Unsupervised). Beyond Principal Components Analysis (PCA), Multidimensional Scaling (MDS), Generative Topographic Map (GTM), Locally Linear Embedding (LLE), Convex Invariance Learning (CoIL).
TRANSCRIPT
Tony Jebara, Columbia University
Advanced Machine Learning & Perception
Instructor: Tony Jebara
Topic 12
•Manifold Learning (Unsupervised)
•Beyond Principal Components Analysis (PCA)
•Multidimensional Scaling (MDS)
•Generative Topographic Map (GTM)
•Locally Linear Embedding (LLE)
•Convex Invariance Learning (CoIL)
•Kernel PCA (KPCA)
Manifolds
•Data is often embedded in a lower dimensional space
•Consider an image of a face being translated from left-to-right
•How to capture the true coordinates of the data on the manifold or embedding space and represent it compactly?
•Open problem: many possible approaches…
•PCA: linear manifold
•MDS: get inter-point distances, find 2D data with the same distances
•LLE: mimic neighborhoods using low dimensional vectors
•GTM: fit a grid of Gaussians to data via a nonlinear warp
•Linear after nonlinear normalization/invariance of data
•Linear in Hilbert space (kernels)
•The translated image is a transformation of the original: $\vec{x}_t = T_t\,\vec{x}_0$
Principal Components Analysis
•If we have eigenvectors, mean and coefficients:
  $\vec{x}_i \approx \vec{\mu} + \sum_j c_{ij}\,\vec{v}_j$
•Getting eigenvectors (i.e. approximating the covariance): $\Sigma = V \Lambda V^T$, i.e.
  $\begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \Sigma_{13} \\ \Sigma_{12} & \Sigma_{22} & \Sigma_{23} \\ \Sigma_{13} & \Sigma_{23} & \Sigma_{33} \end{bmatrix} = \begin{bmatrix} \vec{v}_1 & \vec{v}_2 & \vec{v}_3 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix} \begin{bmatrix} \vec{v}_1^T \\ \vec{v}_2^T \\ \vec{v}_3^T \end{bmatrix}$
•Eigenvectors are orthonormal: $\vec{v}_i^T \vec{v}_j = \delta_{ij}$
•In the coordinates of v, the Gaussian is diagonal, cov = $\Lambda$
•All eigenvalues are non-negative: $\lambda_i \geq 0$
•Higher eigenvalues are higher variance, use those first: $\lambda_1 \geq \lambda_2 \geq \lambda_3 \geq \lambda_4 \geq \cdots$
•To compute the coefficients: $c_{ij} = (\vec{x}_i - \vec{\mu})^T \vec{v}_j$
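The recipe above (mean, covariance eigenvectors sorted by eigenvalue, coefficients by projection) can be sketched in numpy; the function name `pca` and the return convention are my own choices, not from the slides:

```python
import numpy as np

def pca(X, d):
    """Project rows of X (N x D) onto the top-d principal components."""
    mu = X.mean(axis=0)
    Xc = X - mu                                # center the data
    cov = Xc.T @ Xc / len(X)                   # approximate the covariance
    lam, V = np.linalg.eigh(cov)               # eigh: ascending eigenvalues, orthonormal V
    order = np.argsort(lam)[::-1]              # highest variance first
    lam, V = lam[order], V[:, order]
    C = Xc @ V[:, :d]                          # coefficients c_ij = (x_i - mu)^T v_j
    return mu, V[:, :d], lam[:d], C

# Reconstruction: x_i ~ mu + sum_j c_ij v_j, i.e. mu + C @ V.T
```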
Multidimensional Scaling (MDS)
•Idea: capture only the distances between points X in the original space
•Construct another set of low dim or 2D Y points having the same distances
•A Dissimilarity d(x,y) is a function of two objects x and y such that
  $d(x,y) \geq 0$
  $d(x,x) = 0$
  $d(x,y) = d(y,x)$
•A Metric also has to satisfy the triangle inequality:
  $d(x,z) \leq d(x,y) + d(y,z)$
•Standard example: Euclidean l2 metric
  $d(x,y) = \tfrac{1}{2} \left\| x - y \right\|^2$
•Assume for N objects, we compute a dissimilarity matrix which tells us how far apart they are:
  $\Delta_{ij} = d(X_i, X_j)$
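The dissimilarity matrix under the slide's $\frac{1}{2}\|x-y\|^2$ metric can be computed in one vectorized pass; the helper name `dissimilarity_matrix` is my own:

```python
import numpy as np

def dissimilarity_matrix(X):
    """Delta_ij = 0.5 * ||x_i - x_j||^2 for the rows of X (N x D)."""
    sq = (X ** 2).sum(axis=1)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, computed for all pairs at once
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return 0.5 * np.maximum(D2, 0.0)           # clip tiny negatives from round-off
```

The result is symmetric with a zero diagonal, matching the dissimilarity axioms above.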
Multidimensional Scaling
•Given dissimilarity $\Delta$ between original X points under the original d() metric, find Y points with dissimilarity D under another d'() metric such that D is similar to $\Delta$:
  $\Delta_{ij} = d(X_i, X_j) \qquad D_{ij} = d'(Y_i, Y_j)$
•Want to find Y's that minimize some difference from D to $\Delta$
•E.g. Least Squares Stress: $\mathrm{Stress}(Y_1, \ldots, Y_N) = \sum_{ij} \left( D_{ij} - \Delta_{ij} \right)^2$
•E.g. Invariant Stress: $\mathrm{InvStress} = \frac{\mathrm{Stress}(Y)}{\sum_{i<j} D_{ij}^2}$
•E.g. Sammon Mapping: $\sum_{ij} \frac{\left( D_{ij} - \Delta_{ij} \right)^2}{\Delta_{ij}}$
•E.g. Strain: $\mathrm{trace}\left( J (D^2 - \Delta^2) J (D^2 - \Delta^2) \right)$ where $J = I - \frac{1}{N} \vec{1} \vec{1}^T$
•Some are global, some are local; minimize by gradient descent
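As a sketch of the gradient-descent route, here is least-squares stress minimized over 2D points, using plain Euclidean distance for d'(); the function name, learning rate, and step count are illustrative choices, not from the lecture:

```python
import numpy as np

def mds_stress(delta, dim=2, steps=500, lr=0.01, seed=0):
    """Gradient descent on least-squares stress sum_ij (D_ij - Delta_ij)^2."""
    n = delta.shape[0]
    Y = np.random.RandomState(seed).randn(n, dim)  # random initial embedding
    for _ in range(steps):
        diff = Y[:, None, :] - Y[None, :, :]       # (n, n, dim) pairwise y_i - y_j
        D = np.sqrt((diff ** 2).sum(-1) + 1e-12)   # current embedded distances
        np.fill_diagonal(D, 1.0)                   # avoid divide-by-zero on the diagonal
        resid = D - delta
        np.fill_diagonal(resid, 0.0)
        # d Stress / d y_i = 4 * sum_j (D_ij - Delta_ij) (y_i - y_j) / D_ij
        grad = 4.0 * (resid / D)[:, :, None] * diff
        Y -= lr * grad.sum(axis=1)
    return Y
```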
•Have distances from cities to cities; these lie on the surface of a sphere (Earth) in 3D space
•Reconstructed 2D points on a plane capture the essential properties (poles?)
MDS Example 3D to 2D
•More elaborate example
•Have a correlation matrix between crimes. These are of arbitrary dimensionality.
•Hack: convert correlation to dissimilarity and show the reconstructed Y
MDS Example Multi-D to 2D
Locally Linear Embedding
•Instead of distance, look at the neighborhood of each point. Preserve the reconstruction of each point from its neighbors in low dim
•Find K nearest neighbors for each point
•Describe the neighborhood as the best weights on the neighbors to reconstruct the point:
  $\varepsilon(W) = \sum_i \left\| \vec{X}_i - \sum_j W_{ij} \vec{X}_j \right\|^2 \quad \text{subject to} \quad \sum_j W_{ij} = 1 \ \forall i$
•Find best vectors that still have the same weights:
  $\Phi(Y) = \sum_i \left\| \vec{Y}_i - \sum_j W_{ij} \vec{Y}_j \right\|^2 \quad \text{subject to} \quad E\{Y\} = 0, \ \mathrm{Cov}\{Y\} = I$
Why?
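The first step above, finding each point's K nearest neighbors, can be done from the pairwise squared distances; the helper name `knn_indices` is my own:

```python
import numpy as np

def knn_indices(X, K):
    """Indices of the K nearest neighbors of each row of X (excluding itself)."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                     # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :K]             # K closest, nearest first
```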
Locally Linear Embedding
•Finding W's (convex combination of weights on neighbors):
  $\varepsilon(W) = \sum_i \varepsilon^i(W^{i\cdot}) \quad \text{where} \quad \varepsilon^i(W^{i\cdot}) = \left\| \vec{X}_i - \sum_j W_{ij} \vec{X}_j \right\|^2$
  $\varepsilon^i(W^{i\cdot}) = \left\| \vec{X}_i - \sum_j W_{ij} \vec{X}_j \right\|^2 = \left\| \sum_j W_{ij} \left( \vec{X}_i - \vec{X}_j \right) \right\|^2$
  $= \sum_{jk} W_{ij} W_{ik} \left( \vec{X}_i - \vec{X}_j \right)^T \left( \vec{X}_i - \vec{X}_k \right)$
  $= \sum_{jk} W_{ij} W_{ik} C_{jk} \quad \text{and recall} \quad \sum_j W_{ij} = 1$
•Minimize each row's error with a Lagrange multiplier enforcing the sum-to-one constraint:
  $W^{i\cdot *} = \arg\min_w \tfrac{1}{2} w^T C w - \lambda \left( \vec{1}^T w - 1 \right)$
  1) Take derivative & set to 0: $C w - \lambda \vec{1} = 0$
  2) Solve linear system: $w = \lambda C^{-1} \vec{1}$
  3) Find $\lambda$: $\vec{1}^T w = 1 \;\Rightarrow\; \lambda = \frac{1}{\vec{1}^T C^{-1} \vec{1}}$
  4) Find w
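Steps 1-4 amount to solving $Cw = \vec{1}$ and rescaling so the weights sum to one (the normalization absorbs $\lambda$). A sketch, where the ridge term `reg` is a common stabilizer for singular C (e.g. more neighbors than dimensions) rather than something from the slides:

```python
import numpy as np

def lle_weights(X, i, neighbors, reg=1e-3):
    """Reconstruction weights for point i from its neighbors, summing to one."""
    Z = X[neighbors] - X[i]                     # rows are X_j - X_i
    C = Z @ Z.T                                 # local Gram matrix C_jk
    C = C + reg * np.trace(C) * np.eye(len(neighbors))   # regularize if C is singular
    w = np.linalg.solve(C, np.ones(len(neighbors)))      # solve C w = 1
    return w / w.sum()                          # enforce sum_j w_j = 1
```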
Locally Linear Embedding
•Finding Y's (new low-D points that agree with the W's):
  $\Phi(Y) = \sum_i \left\| \vec{Y}_i - \sum_j W_{ij} \vec{Y}_j \right\|^2$
  $= \sum_i \left( \vec{Y}_i - \sum_j W_{ij} \vec{Y}_j \right)^T \left( \vec{Y}_i - \sum_k W_{ik} \vec{Y}_k \right)$
  $= \sum_i \left( \vec{Y}_i^T \vec{Y}_i - \sum_j W_{ij} \vec{Y}_i^T \vec{Y}_j - \sum_k W_{ik} \vec{Y}_k^T \vec{Y}_i + \sum_{jk} W_{ij} W_{ik} \vec{Y}_j^T \vec{Y}_k \right)$
  $= \sum_{jk} \left( \delta_{jk} - W_{jk} - W_{kj} + \sum_i W_{ij} W_{ik} \right) \vec{Y}_j^T \vec{Y}_k$
  $= \sum_{jk} M_{jk} \vec{Y}_j^T \vec{Y}_k \quad \text{subject to } Y \text{ being white}$
•Solve for Y as the bottom d+1 eigenvectors of M
•Plot the Y values
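Since $M_{jk} = \delta_{jk} - W_{jk} - W_{kj} + \sum_i W_{ij} W_{ik}$ is exactly $(I-W)^T(I-W)$, the embedding step can be sketched as below; the $\sqrt{N}$ scaling (to satisfy the whiteness constraint) and the function name are my own choices:

```python
import numpy as np

def lle_embed(W, d):
    """Embed via the bottom eigenvectors of M = (I-W)^T (I-W)."""
    n = W.shape[0]
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    lam, V = np.linalg.eigh(M)          # eigenvalues in ascending order
    # Because rows of W sum to one, the very bottom eigenvector is the
    # constant vector (eigenvalue 0); discard it and keep the next d.
    return V[:, 1:d + 1] * np.sqrt(n)   # scale so Cov(Y) ~ I
```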
LLE Examples
•Original X data are raw images
•Dots are reconstructed two-dimensional Y points
[Figure: a raw image mapped to its low-dimensional coordinate vector]
LLEs
•Top = PCA
•Bottom = LLE
Generative Topographic Map
•A principled alternative to the Kohonen map
•Forms a generative model of the manifold. Can sample from it, etc.
•Find a nonlinear mapping y() from a 2D grid of Gaussians.
•Pick params W of the mapping such that the mapped Gaussians in data space maximize the likelihood of the observed data.
•Have two spaces, the data space t (old notation was X) and the hidden latent space x (old notation was Y).
•The mapping goes from latent space to observed space:
  $\vec{t}_i \approx y(\vec{x}_i, W)$
GTM as a Grid of Gaussians
•We choose our priors and conditionals for all variables of interest
•Assume Gaussian noise on the y() mapping:
  $p(t \mid x, W, \beta) = \left( \frac{\beta}{2\pi} \right)^{D/2} \exp\left( -\frac{\beta}{2} \left\| y(x, W) - t \right\|^2 \right)$
•Assume our prior latent variables are a grid model equally spaced in latent space:
  $p(x) = \frac{1}{K} \sum_{k=1}^{K} \delta(x - x_k)$
•Can now write out the full likelihood:
  $L(W, \beta) = \sum_{n=1}^{N} \log p(t_n \mid W, \beta) = \sum_{n=1}^{N} \log \int p(t_n \mid x, W, \beta)\, p(x)\, dx$
GTM Distribution Model
•Integrating over delta functions makes a summation:
  $L(W, \beta) = \sum_{n=1}^{N} \log \int p(t_n \mid x, W, \beta)\, p(x)\, dx = \sum_{n=1}^{N} \log \frac{1}{K} \sum_{k=1}^{K} p(t_n \mid x_k, W, \beta)$
•Note the log-sum; need to apply EM to maximize
•Also, use the following parametric (linear in the basis) form of the mapping:
  $y(x, W) = W \phi(x)$
•Examples of manifolds for randomly chosen W mappings
•Typically, we are given the data and want to find the maximum likelihood mapping W for it…
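The log-sum objective above can be evaluated directly (this is the quantity EM would increase); a sketch with $y(x,W)=W\phi(x)$, where `Phi` holds the basis vectors $\phi(x_k)$ of the grid as rows and the function name is my own:

```python
import numpy as np

def gtm_loglik(T, Phi, W, beta):
    """L(W,beta) = sum_n log (1/K) sum_k p(t_n | x_k, W, beta).
    T: data (N x D), Phi: basis at grid points (K x M), W: mapping (D x M)."""
    Y = Phi @ W.T                                         # grid centers in data space (K x D)
    D = T.shape[1]
    d2 = ((T[:, None, :] - Y[None, :, :]) ** 2).sum(-1)   # ||t_n - y(x_k,W)||^2, (N x K)
    log_p = 0.5 * D * np.log(beta / (2 * np.pi)) - 0.5 * beta * d2
    # log-sum-exp over the K mixture components, each with weight 1/K
    m = log_p.max(axis=1, keepdims=True)
    lse = m.squeeze(1) + np.log(np.exp(log_p - m).sum(axis=1))
    return (lse - np.log(Phi.shape[0])).sum()
```

Exponentiating `log_p - lse` per row would give the EM responsibilities of each grid Gaussian for each data point.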
GTM Examples
•Recover the non-linear manifold by warping the grid with the W params
•Synthetic Example: Left = Initialized, Right = Converged
•Real Example: Oil Data, 3 Classes. Left = GTM, Right = PCA