
A Geometric Perspective on Machine Learning

何晓飞 (Xiaofei He), College of Computer Science, Zhejiang University

1

Machine Learning: the problem

Information (training data) → f

f: X → Y. X and Y are usually considered as Euclidean spaces.

2

Manifold Learning: geometric perspective

The data space may not be a Euclidean space, but a nonlinear manifold.

Instead of:
☒ Euclidean distance
☒ f is defined on Euclidean space
☒ ambient dimension

use:
☑ geodesic distance
☑ f is defined on nonlinear manifold
☑ manifold dimension

3

Manifold Learning: the challenges

The manifold is unknown! We have only samples!

How do we know whether M is a sphere, a torus, or something else?

How do we compute distances on M?

This is unknown: the manifold M. This is what we have: sample points drawn from it.

Tools involved: Topology, Geometry, Functional Analysis

4

Manifold Learning: current solution

Find a Euclidean embedding, and then perform traditional learning algorithms in the Euclidean space.

5
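As an illustration of this embed-then-learn pipeline, here is a minimal sketch assuming scikit-learn is available; Isomap stands in for whichever embedding algorithm one prefers, and the data and labels are toy placeholders:

```python
# Minimal sketch of the embed-then-learn pipeline. Isomap is an assumed
# choice of manifold learner; any Euclidean-embedding method would do.
import numpy as np
from sklearn.manifold import Isomap
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))          # toy high-dimensional samples
y = (X[:, 0] > 0).astype(int)           # toy labels

# Step 1: find a Euclidean embedding of the (assumed) manifold.
X_low = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

# Step 2: run a traditional learner in the embedded Euclidean space.
clf = KNeighborsClassifier(n_neighbors=5).fit(X_low, y)
print(clf.score(X_low, y))
```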

Simplicity

6

Simplicity

7

Simplicity is relative

8

Manifold-based Dimensionality Reduction

Given high-dimensional data sampled from a low-dimensional manifold, how do we compute a faithful embedding?

How do we find the mapping function f?

How do we efficiently find the projective function f?

9

A Good Mapping Function

If x_i and x_j are close to each other, we hope that f(x_i) and f(x_j) preserve the local structure (distance, similarity, …).

k-nearest neighbor graph: connect each point to its k closest neighbors (see the sketch after this slide).

Objective function: different algorithms have different concerns.

10
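For concreteness, a minimal numpy sketch of such a graph; the heat-kernel weighting is one common choice (the slide does not fix a particular weighting):

```python
# Minimal sketch: k-nearest neighbor graph with heat-kernel weights.
# W_ij = exp(-||xi - xj||^2 / t) is the weighting used by LPP-style
# methods; it is our assumption here, not fixed by the slide.
import numpy as np

def knn_graph(X, k=5, t=1.0):
    n = X.shape[0]
    # pairwise squared Euclidean distances
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        # k nearest neighbors of point i, excluding the point itself
        nbrs = np.argsort(sq[i])[1:k + 1]
        W[i, nbrs] = np.exp(-sq[i, nbrs] / t)
    return np.maximum(W, W.T)   # symmetrize: keep an edge if either end has it

X = np.random.rand(100, 3)
W = knn_graph(X, k=5)
```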

Locality Preserving Projections

Principle: if x_i and x_j are close, then their maps y_i and y_j are also close.

Mathematical formulation: minimize the integral of the gradient of f,

$\min_f \int_{\mathcal{M}} \|\nabla f\|^2$

Stokes' Theorem:

$\int_{\mathcal{M}} \|\nabla f\|^2 = \int_{\mathcal{M}} f \, \Delta f$

LPP finds a linear approximation to the nonlinear manifold while preserving the local geometric structure.

11–14
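A compact sketch of the discrete LPP computation, assuming the standard formulation from the LPP paper (the slides omit the equations): minimize the graph-smoothness term subject to a scale constraint, solved as a generalized eigenproblem.

```python
# Sketch of Locality Preserving Projections. With rows of X as samples,
# the discrete objective is  min_a a^T (X^T L X) a  s.t.  a^T (X^T D X) a = 1,
# where W is the k-NN weight matrix, D its degree matrix, and L = D - W.
import numpy as np
from scipy.linalg import eigh

def lpp(X, W, n_components=2):
    D = np.diag(W.sum(axis=1))
    L = D - W                                     # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-9 * np.eye(X.shape[1])   # small jitter for stability
    vals, vecs = eigh(A, B)                       # generalized eigenproblem
    return vecs[:, :n_components]                 # smallest eigenvalues first

# usage: P = lpp(X, W); Y = X @ P   # linear, locality-preserving projection
```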

Manifold of Face Images

Expression (Sad >>> Happy)

Pose (Right >>> Left)

15

Manifold of Handwritten Digits

Thickness

Slant

16

Active and Semi-Supervised Learning: A Geometric Perspective

Linear Regression Model

Learning target: $y = \mathbf{w}^T \mathbf{x} + \epsilon$

Training examples: $\{(\mathbf{z}_i, y_i)\}_{i=1}^{k}$

17

Generalization Error

Goal of Regression: obtain a learned function that minimizes the generalization error (the expected error on unseen test inputs).

Maximum Likelihood Estimate

18
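For reference, under the usual Gaussian-noise assumption (implicit on the slide), the maximum likelihood estimate coincides with ordinary least squares; with the measured points as the columns of Z,

$\hat{\mathbf{w}} = \arg\min_{\mathbf{w}} \sum_{i=1}^{k} \left(\mathbf{w}^T \mathbf{z}_i - y_i\right)^2 = \left(Z Z^T\right)^{-1} Z \mathbf{y}$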

Gauss-Markov Theorem

For a given x, the expected prediction error is

$E\left[(\hat{y}(\mathbf{x}) - y)^2\right] = \sigma^2 + \sigma^2\, \mathbf{x}^T \left(Z Z^T\right)^{-1} \mathbf{x}$

[Figure: two fits of the same data; the stable, low-variance fit is labeled "Good!", the high-variance fit "Bad!"]

19–20

Experimental Design Methods

Three of the most common scalar measures of the size of the parameter covariance matrix Cov(w):

A-optimal Design: trace of Cov(w).
D-optimal Design: determinant of Cov(w).
E-optimal Design: maximum eigenvalue of Cov(w).

Disadvantage: these methods fail to take unmeasured (unlabeled) data points into account.

21
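A small sketch computing these three criteria for a given covariance matrix (the function name and API are ours, for illustration):

```python
# The three classical experimental-design criteria applied to a
# parameter covariance matrix.
import numpy as np

def design_criteria(cov):
    return {
        "A-optimality": np.trace(cov),                  # total (average) variance
        "D-optimality": np.linalg.det(cov),             # volume of confidence ellipsoid
        "E-optimality": np.linalg.eigvalsh(cov).max(),  # worst-case direction
    }

cov = np.array([[2.0, 0.5],
                [0.5, 1.0]])
print(design_criteria(cov))
```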

Manifold Regularization: Semi-Supervised Setting

Measured (labeled) points: discriminant structure. Unmeasured (unlabeled) points: geometrical structure.

[Figures: the same data labeled by random labeling, by active learning, and by active learning + semi-supervised learning]

22–24

Unlabeled Data to Estimate Geometry

Measured (labeled) points: discriminant structure.

Unmeasured (unlabeled) points: geometrical structure.

Compute the nearest neighbor graph G.

25–31

Laplacian Regularized Least Squares (Belkin and Niyogi, 2006)

Linear objective function:

$J(\mathbf{w}) = \sum_{i=1}^{k} \left(\mathbf{w}^T \mathbf{z}_i - y_i\right)^2 + \lambda_1 \mathbf{w}^T X L X^T \mathbf{w} + \lambda_2 \|\mathbf{w}\|^2$

Solution:

$\hat{\mathbf{w}} = \left(Z Z^T + \lambda_1 X L X^T + \lambda_2 I\right)^{-1} Z \mathbf{y}$

32
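A minimal sketch of this closed-form solution (variable names are ours; Z holds the k measured points as columns, X all m points as columns):

```python
# Closed-form Laplacian Regularized Least Squares, mirroring the solution
# on this slide: w = (Z Z^T + l1 X L X^T + l2 I)^{-1} Z y.
import numpy as np

def laprls(Z, y, X, L, lam1=0.1, lam2=0.01):
    d = X.shape[0]
    H = Z @ Z.T + lam1 * (X @ L @ X.T) + lam2 * np.eye(d)
    return np.linalg.solve(H, Z @ y)   # solves H w = Z y

# usage sketch: W from a k-NN graph over all m points, L = D - W,
# w_hat = laprls(Z, y, X, L); prediction: y_new = w_hat @ x_new
```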

Active Learning

How do we find the most representative points on the manifold?

33

Active Learning

Objective: guide the selection of the subset of data points that gives the greatest amount of information.

Experimental design: select samples to label.

Manifold Regularized Experimental Design: it shares the same objective function as Laplacian Regularized Least Squares, simultaneously minimizing the least squares error on the measured samples and preserving the local geometrical structure of the data space.

34

Analysis of Bias and Variance

With $Cov(\mathbf{y}) = \sigma^2 I$, define $H = Z Z^T + \lambda_1 X L X^T + \lambda_2 I$, so that

$\hat{\mathbf{w}} = H^{-1} Z \mathbf{y}, \qquad Cov(\hat{\mathbf{w}}) = \sigma^2 H^{-1} Z Z^T H^{-1}$

To make the estimator as stable as possible, the size of this covariance matrix should be as small as possible.

D-optimality: minimize the determinant of the covariance matrix.

35

Manifold Regularized Experimental Design: the algorithm

Select the first data point $\mathbf{z}_1$ such that $\mathbf{z}_1^T \left(\lambda_1 X L X^T + \lambda_2 I\right)^{-1} \mathbf{z}_1$ is maximized, and set $H_1 = \mathbf{z}_1 \mathbf{z}_1^T + \lambda_1 X L X^T + \lambda_2 I$.

Suppose k points have been selected; choose the (k+1)-th point such that

$\mathbf{z}_{k+1} = \arg\max_{\mathbf{z}} \ \mathbf{z}^T H_k^{-1} \mathbf{z}$

Update

$H_{k+1}^{-1} = H_k^{-1} - \dfrac{H_k^{-1} \mathbf{z}_{k+1} \mathbf{z}_{k+1}^T H_k^{-1}}{1 + \mathbf{z}_{k+1}^T H_k^{-1} \mathbf{z}_{k+1}}$

where $\mathbf{z}_1, \ldots, \mathbf{z}_k$ are selected from $\{\mathbf{x}_1, \ldots, \mathbf{x}_m\}$.

36
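A compact numpy sketch of this greedy loop, maintaining H⁻¹ with the Sherman-Morrison rank-one update shown above (implementation details and names are ours):

```python
# Greedy selection for Manifold Regularized Experimental Design.
# X holds the m candidate points as columns; L is the m x m graph Laplacian.
import numpy as np

def mred_select(X, L, n_select, lam1=0.1, lam2=0.01):
    d, m = X.shape
    H_inv = np.linalg.inv(lam1 * (X @ L @ X.T) + lam2 * np.eye(d))
    chosen = []
    for _ in range(n_select):
        scores = np.sum(X * (H_inv @ X), axis=0)   # z^T H^{-1} z per candidate
        scores[chosen] = -np.inf                   # never pick a point twice
        i = int(np.argmax(scores))
        chosen.append(i)
        z = X[:, i:i + 1]
        Hz = H_inv @ z
        H_inv -= (Hz @ Hz.T) / (1.0 + float(z.T @ Hz))  # Sherman-Morrison
    return chosen
```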

Nonlinear Generalization in RKHS

Consider the feature space F induced by some nonlinear mapping φ, with $\langle \phi(\mathbf{x}_i), \phi(\mathbf{x}_j) \rangle = K(\mathbf{x}_i, \mathbf{x}_j)$.

K(·, ·): positive semi-definite kernel function.

Regression model in RKHS: $y = \boldsymbol{\nu}^T \phi(\mathbf{x}), \ \boldsymbol{\nu} \in \mathcal{F}$, with $\boldsymbol{\nu} = \sum_{i=1}^{m} \alpha_i \phi(\mathbf{x}_i)$.

Objective function in RKHS:

$J_{LapRLS}(\boldsymbol{\nu}) = \sum_{i=1}^{k} \left(\boldsymbol{\nu}^T \phi(\mathbf{z}_i) - y_i\right)^2 + \frac{\lambda_1}{2} \sum_{i,j=1}^{m} \left(\boldsymbol{\nu}^T \phi(\mathbf{x}_i) - \boldsymbol{\nu}^T \phi(\mathbf{x}_j)\right)^2 S_{ij} + \lambda_2 \|\boldsymbol{\nu}\|_{\mathcal{F}}^2$

37
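Since ν expands over the mapped data, everything reduces to kernel evaluations; a tiny sketch with a Gaussian kernel (the specific kernel is our assumption):

```python
# With v = sum_i alpha_i phi(x_i), prediction needs only kernel values:
# y(x) = sum_i alpha_i K(x_i, x). The Gaussian kernel is one valid choice.
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # A: (n, d), B: (p, d) -> (n, p) kernel matrix
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def predict(alpha, X_train, X_test, sigma=1.0):
    return gaussian_kernel(X_test, X_train, sigma) @ alpha
```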

Kernel Graph Regularized Experimental Design

$Cov(\boldsymbol{\alpha}) = \sigma^2 \left(K_{XZ} K_{ZX} + \lambda_1 K_{XX} L K_{XX} + \lambda_2 K_{XX}\right)^{-1}$

D-optimality criterion: $\max_{Z = (\mathbf{z}_1, \ldots, \mathbf{z}_k)} \left| K_{XZ} K_{ZX} + \lambda_1 K_{XX} L K_{XX} + \lambda_2 K_{XX} \right|$, where $\mathbf{z}_1, \ldots, \mathbf{z}_k$ are selected from $\{\mathbf{x}_1, \ldots, \mathbf{x}_m\}$.

Select the first data point $\mathbf{v}_1$ such that $\mathbf{v}_1^T \left(\lambda_1 K_{XX} L K_{XX} + \lambda_2 K_{XX}\right)^{-1} \mathbf{v}_1$ is maximized, and set $M_1 = \mathbf{v}_1 \mathbf{v}_1^T + \lambda_1 K_{XX} L K_{XX} + \lambda_2 K_{XX}$.

Suppose k points have been selected; choose the (k+1)-th point such that $\mathbf{v}_{k+1} = \arg\max_{\mathbf{v}} \ \mathbf{v}^T M_k^{-1} \mathbf{v}$.

Update

$M_{k+1}^{-1} = M_k^{-1} - \dfrac{M_k^{-1} \mathbf{v}_{k+1} \mathbf{v}_{k+1}^T M_k^{-1}}{1 + \mathbf{v}_{k+1}^T M_k^{-1} \mathbf{v}_{k+1}}$

38

A Synthetic Example

A-optimal Design versus Laplacian Regularized Optimal Design

39–40

Application to image/video compression

41

Video compression

42

Topology

Can we always map a manifold to a Euclidean space without changing its topology?

43

Topology

Simplicial Complex
Homology Group
Betti Numbers
Euler Characteristic
Good Cover
Sample Points
Homotopy
Number of components, dimension, …

44

Topology

The Euler Characteristic is a topological invariant, a number that describes one aspect of a topological space’s shape or structure.

[Figure: example shapes with their Euler characteristics]

The Euler characteristic of Euclidean space is 1!

45
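For a finite simplicial complex the Euler characteristic is simply the alternating sum of simplex counts; a tiny sketch (the sphere and torus triangulation counts are standard):

```python
# Euler characteristic of a finite simplicial complex:
# chi = n_0 - n_1 + n_2 - ...  (alternating sum of simplex counts per dimension)
def euler_characteristic(counts):
    return sum((-1) ** k * n for k, n in enumerate(counts))

print(euler_characteristic([4, 6, 4]))    # boundary of a tetrahedron (sphere): 2
print(euler_characteristic([7, 21, 14]))  # minimal 7-vertex torus triangulation: 0
```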

Challenges

Insufficient sample points
Choosing a suitable radius
How to identify noisy holes (user interaction?)

[Figure: a noisy hole; homotopy versus homeomorphism]

46

Q & A

47
