Presentation - SVM & Kernel Methods - May 2009


TRANSCRIPT

Support Vector Machines and Kernel Methods

by Lucian Huluta

06/15/2009


Support Vector Machine (SVM)

What is a Support Vector Machine?

A statistical tool, essentially used for NONLINEAR classification/regression.

A SUPERVISED LEARNING mechanism, like neural networks.

A quick and adaptive method for PATTERN ANALYSIS.

A fast and flexible approach for learning COMPLEX SYSTEMS.

2

Support Vector Machine (SVM)

Strengths

Few parameters required for tuning the learning machine

Learning involves optimisation of a convex function

It scales relatively well to high-dimensional data

Weaknesses

Training on large data sets is still difficult

Need to choose a “good” kernel function

3

SVM: linear classification

Binary classification problem: training pairs $(\mathbf{x}_i, y_i)$, $y_i \in \{+1, -1\}$, where $\mathbf{x} \in \mathbb{R}^m$ is the input space.

Decision function: $D(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b$

$\mathbf{w}$ - weights ($m$-dimensional vector), $b$ - bias

More than one solution for the decision function!

4
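A minimal sketch of a linear SVM classifier in Python (not part of the original slides); scikit-learn is used and the toy data points are invented purely for illustration. It recovers the weight vector w and bias b of the decision function above.

```python
# Minimal linear SVM sketch (illustration only; toy data is made up).
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in R^2, labels y in {-1, +1}
X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],
              [4.0, 4.0], [5.0, 4.5], [4.5, 5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6)  # very large C approximates a hard margin
clf.fit(X, y)

w = clf.coef_[0]        # weight vector w of D(x) = w^T x + b
b = clf.intercept_[0]   # bias b
print("w =", w, "b =", b)
print("predictions:", clf.predict(X))
```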

SVM: Generalization capacity

[Figure: generalization region and generalization ability of different separating hyperplanes]

5

SVM: Hard margin

Training data must satisfy:

$y_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1, \quad i = 1, \dots, M$

Quadratic optimization problem:

minimize $\quad \frac{1}{2}\|\mathbf{w}\|^2$

subject to the constraint: $\quad y_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1, \quad i = 1, \dots, M$

6

SVM: Primal form

Convert the constrained problem into an unconstrained problem:

$L(\mathbf{w}, b, \boldsymbol{\alpha}) = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^{M} \alpha_i \left[ y_i(\mathbf{w}^T \mathbf{x}_i + b) - 1 \right]$

where $\alpha_i \ge 0$ are the nonnegative Lagrange multipliers.

Solving $\partial L / \partial \mathbf{w} = 0$ and $\partial L / \partial b = 0$, we obtain:

$\mathbf{w} = \sum_{i=1}^{M} \alpha_i y_i \mathbf{x}_i, \qquad \sum_{i=1}^{M} \alpha_i y_i = 0$

7

SVM: Dual form

The dual form of the cost function consists only of inner products. Solve the following QP problem:

maximize $\quad Q(\boldsymbol{\alpha}) = \sum_{i=1}^{M} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{M} \alpha_i \alpha_j y_i y_j \, \mathbf{x}_i^T \mathbf{x}_j$

subject to $\quad \sum_{i=1}^{M} \alpha_i y_i = 0, \quad \alpha_i \ge 0, \quad i = 1, \dots, M$

The resulting SVM is called a hard-margin support vector machine.

8
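A small numerical sketch of the dual QP above, solved here with a general-purpose optimizer (SciPy) rather than a dedicated QP package; the toy data and the solver choice are assumptions made only for illustration.

```python
# Solve the hard-margin dual QP: maximize sum(a) - 0.5 * a^T H a
# subject to sum(a_i * y_i) = 0 and a_i >= 0.  (Illustration only.)
import numpy as np
from scipy.optimize import minimize

X = np.array([[1.0, 1.0], [2.0, 1.5], [4.0, 4.0], [5.0, 4.5]])
y = np.array([-1.0, -1.0, 1.0, 1.0])

H = (y[:, None] * X) @ (y[:, None] * X).T   # H_ij = y_i y_j x_i^T x_j

def neg_dual(a):                            # minimize the negative dual objective
    return 0.5 * a @ H @ a - a.sum()

cons = {"type": "eq", "fun": lambda a: a @ y}   # sum(a_i y_i) = 0
bnds = [(0, None)] * len(y)                     # a_i >= 0
res = minimize(neg_dual, np.zeros(len(y)), bounds=bnds, constraints=cons)

a = res.x
w = ((a * y)[:, None] * X).sum(axis=0)          # w = sum_i a_i y_i x_i
sv = a > 1e-6                                   # indices of support vectors
b = np.mean(y[sv] - X[sv] @ w)                  # bias from the KKT conditions
print("alpha =", np.round(a, 4), "w =", w, "b =", b)
```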

SVM: L1-soft margin problem

The modified QP minimizes the following cost function:

$\frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{M} \xi_i$

subject to the constraints:

$y_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, M$

$C$: trade-off between the maximization of the margin and the minimization of the classification error.

9
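A short illustrative sketch of how the trade-off parameter C behaves in practice, using scikit-learn's SVC; the toy data and the C values are arbitrary and not taken from the slides.

```python
# C trades off margin maximization against training error
# (toy data and C values are for illustration only).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # small C: wide margin, more slack; large C: narrow margin, fewer errors
    print(f"C={C:<6} support vectors={clf.n_support_.sum()} "
          f"train accuracy={clf.score(X, y):.2f}")
```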

SVM: L2-soft margin problem

The modified QP minimizes the following cost function:

$\frac{1}{2}\|\mathbf{w}\|^2 + \frac{C}{2} \sum_{i=1}^{M} \xi_i^2$

subject to the constraints:

$y_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1 - \xi_i, \quad i = 1, \dots, M$

10

SVM: Regression

Decision function: $f(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b$

We assume that all the training data are within a tube of radius $\varepsilon$ around the regression function (the $\varepsilon$-insensitive loss function).

Slack variables: $\xi_i = \max\{0,\; y_i - f(\mathbf{x}_i) - \varepsilon\}, \qquad \xi_i^{*} = \max\{0,\; f(\mathbf{x}_i) - y_i - \varepsilon\}$

11

SVM: Regression

Cost function with slack variables:

$\frac{1}{2}\|\mathbf{w}\|^2 + \frac{C}{p} \sum_{i=1}^{M} \left( \xi_i^p + \xi_i^{*p} \right)$

If $p = 1$: L1 soft-margin; if $p = 2$: L2 soft-margin.

subject to the constraints:

$y_i - \mathbf{w}^T \mathbf{x}_i - b \le \varepsilon + \xi_i, \qquad \mathbf{w}^T \mathbf{x}_i + b - y_i \le \varepsilon + \xi_i^{*}, \qquad \xi_i, \xi_i^{*} \ge 0$

12
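A brief sketch of ε-insensitive support vector regression using scikit-learn's SVR; the sine-wave data and the parameter values are invented for illustration only.

```python
# Epsilon-insensitive SV regression sketch (toy data, arbitrary parameters).
import numpy as np
from sklearn.svm import SVR

X = np.linspace(0, 2 * np.pi, 50).reshape(-1, 1)
y = np.sin(X).ravel()

# epsilon is the radius of the insensitive tube, C the trade-off parameter
reg = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)
print("number of support vectors:", len(reg.support_))
print("prediction at x=1.0:", reg.predict([[1.0]]))
```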

SVM: Linear inseparability

1. Data are NOT linearly separable.

2. The feature space is HIGH DIMENSIONAL, hence the QP takes a long time to solve.

3. Nonlinear function approximation problems can NOT be solved.

13

SVM: Linear inseparability

If the feature space is a Hilbert space, i.e., a space where the inner product applies, we can simplify the optimization problem by a TRICK!!!

14

The Kernel “trick”

Kernel trick = a method for using a linear classifier algorithm to solve a non-linear problem by choosing appropriate KERNEL FUNCTIONS.

The kernel trick avoids explicitly computing the inner product of two vectors in the feature space.

15

Numerical Example

Consider a two-dimensional input space together with the feature map:

$\phi(\mathbf{x}) = \left( x_1^2, \; \sqrt{2}\, x_1 x_2, \; x_2^2 \right)$

Kernel function:

$k(\mathbf{x}, \mathbf{z}) = \langle \phi(\mathbf{x}), \phi(\mathbf{z}) \rangle = (\mathbf{x}^T \mathbf{z})^2$

16
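A quick numerical check of the identity above, assuming the quadratic feature map shown on the slide; the example vectors are chosen arbitrarily.

```python
# Verify <phi(x), phi(z)> == (x^T z)^2 for the quadratic feature map.
import numpy as np

def phi(v):
    x1, x2 = v
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

lhs = phi(x) @ phi(z)       # inner product in feature space
rhs = (x @ z) ** 2          # kernel evaluated directly in input space
print(lhs, rhs)             # both evaluate to approximately 1.0
```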

SVM with Kernel: Steps

1. Choose a kernel function: $K(\mathbf{x}_i, \mathbf{x}_j)$

2. Maximize: $Q(\boldsymbol{\alpha}) = \sum_{i=1}^{M} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{M} \alpha_i \alpha_j y_i y_j K(\mathbf{x}_i, \mathbf{x}_j)$ subject to $\sum_{i=1}^{M} \alpha_i y_i = 0$, $0 \le \alpha_i \le C$

3. Compute the bias term: $b = y_k - \sum_{i} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}_k)$ for any support vector $\mathbf{x}_k$

4. Classify data using the decision function: $D(\mathbf{x}) = \operatorname{sign}\left( \sum_{i} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b \right)$

17
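A sketch of how these four steps map onto a library implementation (scikit-learn's SVC, used here purely as an illustration; the toy XOR-like data are made up).

```python
# Mapping the four kernel-SVM steps onto a library call (illustration only).
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([-1, -1, 1, 1])                 # XOR-like, not linearly separable

clf = SVC(kernel="rbf", gamma=1.0, C=10.0)   # step 1: choose the kernel
clf.fit(X, y)                                # step 2: maximize the dual QP
b = clf.intercept_[0]                        # step 3: the bias term b
alpha_y = clf.dual_coef_[0]                  # alpha_i * y_i of the support vectors
print("decision values:", clf.decision_function(X))   # step 4: D(x)
print("predicted labels:", clf.predict(X))
```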

Kernels

Linear: $K(\mathbf{x}, \mathbf{z}) = \mathbf{x}^T \mathbf{z}$

Polynomial: $K(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^T \mathbf{z} + 1)^d$

Radial Basis Function: $K(\mathbf{x}, \mathbf{z}) = \exp\left( -\gamma \|\mathbf{x} - \mathbf{z}\|^2 \right)$

Others: design kernels suitable for the target application

18
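The three kernels from the slide written out as plain Python functions; the parameter values d, c, and gamma are arbitrary defaults, not values from the presentation.

```python
# The three standard kernels from the slide as plain functions.
import numpy as np

def linear_kernel(x, z):
    return x @ z

def polynomial_kernel(x, z, d=3, c=1.0):
    return (x @ z + c) ** d

def rbf_kernel(x, z, gamma=0.5):
    return np.exp(-gamma * np.sum((x - z) ** 2))

x, z = np.array([1.0, 2.0]), np.array([0.5, -1.0])
print(linear_kernel(x, z), polynomial_kernel(x, z), rbf_kernel(x, z))
```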

Demo

To see the video demo, please visit this link.

19

Applications of SVM

Vast number of applications:

Breast cancer diagnosis and prognosis

Handwritten digit recognition

On-line handwriting recognition

Text categorization

3-D object recognition problems

Function approximation and regression

Detection of remote protein homologies

Gene expression

…and fault diagnosis in chemical processes

20

Current developments: SVM

Application aspects of SVM – Belousov et al., 2002, Journal of Chemometrics

About kernel latent variables approaches and SVM – Czekaj et al., 2005, Journal of Chemometrics

Kernel-based orthogonal projections to latent structures – Rantalainen et al., 2007, Journal of Chemometrics

Performance assessment of a novel fault diagnosis system based on SVM – Yelamos et al., 2009, Computers and Chemical Engineering

SVM and its application in chemistry – Li et al., 2009, Chemometrics and Intelligent Laboratory Systems

21

Current developments: SVM

Identification of MIMO Hammerstein systems with LS-SVM – Goethals et al., 2005, Automatica

An online support vector machine for abnormal event detection – Davy et al., 2006, Signal Processing

Support vector machine for quality monitoring in a plastic injection molding process – Ribeiro, 2005, IEEE Systems, Man, and Cybernetics

Fault prediction for nonlinear system based on Hammerstein model and LS-SVM – Jiang et al., 2009, IFAC Safeprocess

22

My future work

Finish my diploma project

Study the role of various “tuning” parameters on classification results

Apply SVM to the Tennessee Eastman benchmark, which involves 20 pre-defined faults

Apply the SVM-based classification algorithm to a small academic example

Study support vector machine based classification for “one-against-one” and “one-against-all” problems

23

Thank you!

24

Questions & Answers
