TRANSCRIPT
A Practical Guide to Support Vector Classification
Department of Computer Science, Kim Myungjae
Table of Contents
Introduction
Data Preprocessing
Model Selection
Experiments
Introduction

Support Vector Machine
◦ SVM (Support vector machine)
◦ Training set of instance-label pairs (x_i, y_i), i = 1, ..., l, where x_i ∈ R^n and y_i ∈ {1, -1}
◦ Objective function

  \min_{w, b, \xi} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i

  subject to

  y_i (w^T x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0
Dual space form
◦ Objective function

  maximize  L(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j x_i^T x_j

  subject to  \sum_{i=1}^{l} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C
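As a quick sanity check on the dual constraints, scikit-learn's SVC (a library choice assumed for this sketch; the slides do not mention it) exposes y_i α_i for the support vectors as `dual_coef_`. Those coefficients should sum to zero and stay within ±C:

```python
import numpy as np
from sklearn.svm import SVC

# A tiny linearly separable toy set (assumed for illustration only).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [2.0, 2.0], [2.0, 3.0], [3.0, 2.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

C = 1.0
clf = SVC(kernel="linear", C=C).fit(X, y)

# dual_coef_ holds y_i * alpha_i for each support vector.
coef = clf.dual_coef_.ravel()

print(coef.sum())  # ~0 within floating-point error (sum_i alpha_i y_i = 0)
print(np.abs(coef).max())  # at most C, since 0 <= alpha_i <= C
```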
Nonlinear SVM
◦ Kernel method
   Training vectors x_i mapped into a higher dimensional space (maybe infinite dimensional) by a mapping function \phi : R^n \to H
◦ Objective function

  L(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
◦ Kernel function  K(x_i, x_j) = \phi(x_i)^T \phi(x_j)
   Linear:  K(x_i, x_j) = x_i^T x_j
   Polynomial:  K(x_i, x_j) = (\gamma x_i^T x_j + r)^d, \; \gamma > 0
   Radial basis function:  K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2), \; \gamma > 0
   Sigmoid:  K(x_i, x_j) = \tanh(\gamma x_i^T x_j + r)
◦ \gamma, r, d are kernel parameters
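The four kernels can be written directly in NumPy; this is an illustrative sketch (the function names and default parameter values are mine, not from the slides):

```python
import numpy as np

def linear(xi, xj):
    return xi @ xj

def polynomial(xi, xj, gamma=1.0, r=0.0, d=3):
    return (gamma * (xi @ xj) + r) ** d

def rbf(xi, xj, gamma=1.0):
    return np.exp(-gamma * np.sum((xi - xj) ** 2))

def sigmoid(xi, xj, gamma=1.0, r=0.0):
    return np.tanh(gamma * (xi @ xj) + r)

xi = np.array([1.0, 2.0])
xj = np.array([1.0, 2.0])
# The RBF kernel of a point with itself is exp(0) = 1.
print(rbf(xi, xj))  # 1.0
print(linear(xi, xj))  # 5.0
```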
Example
◦ Data url: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/data/

Application      #training data   #testing data   #features   #classes
Astroparticle    3,089            4,000           4           2
Bioinformatics   391              0               20          3
Vehicle          1,243            41              21          2
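The datasets at the URL above are in LIBSVM's sparse text format, one `label index:value index:value ...` line per instance. A minimal pure-Python reader, written here to illustrate the format rather than taken from any package:

```python
def read_libsvm_line(line):
    """Parse one 'label idx:val idx:val ...' line into (label, {idx: val})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        idx, val = item.split(":")
        features[int(idx)] = float(val)
    return label, features

# Two hypothetical instances in LIBSVM format (values made up).
sample = ["1 1:2.617300e+01 2:5.886700e+01",
          "-1 1:5.707397e+01 3:2.221404e+00"]
data = [read_libsvm_line(l) for l in sample]
print(data[0])  # (1.0, {1: 26.173, 2: 58.867})
```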
Proposed Procedure
◦ Transform data to the format of an SVM package
◦ Conduct simple scaling on the data
◦ Consider the RBF kernel  K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)
◦ Use cross-validation to find the best parameters C and \gamma
◦ Use the best C and \gamma to train the whole training set
◦ Test
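The whole procedure can be sketched with scikit-learn (the library choice, the synthetic dataset, and the exact parameter ranges are assumptions of this sketch; the slides use LIBSVM's own tools):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for a training set (assumption for this sketch).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Scale each attribute, use the RBF kernel, and grid-search (C, gamma)
# with cross-validation.
pipe = Pipeline([("scale", MinMaxScaler(feature_range=(-1, 1))),
                 ("svm", SVC(kernel="rbf"))])
grid = {"svm__C": [2.0 ** k for k in range(-5, 16, 2)],
        "svm__gamma": [2.0 ** k for k in range(-15, 4, 2)]}
search = GridSearchCV(pipe, grid, cv=5).fit(X, y)

# The model refit with the best (C, gamma) on the whole training set
# is available as search.best_estimator_, ready for testing.
print(search.best_params_)
```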
Data Preprocessing

Categorical Feature
◦ Example: a three-category attribute such as {red, green, blue} can be represented as (0, 0, 1), (0, 1, 0), and (1, 0, 0)

Scaling
◦ Scaling before applying SVM is very important.
◦ Linearly scale each attribute to the range [-1, +1] or [0, 1].
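Both preprocessing steps fit in a few lines of NumPy; this is a sketch (the ordering of the three binary attributes and the column layout are my choices):

```python
import numpy as np

# Categorical feature: encode {red, green, blue} as 3 binary attributes.
categories = ["red", "green", "blue"]
def one_hot(value):
    return [1.0 if value == c else 0.0 for c in categories]

print(one_hot("green"))  # [0.0, 1.0, 0.0]

# Scaling: linearly map one attribute (column) to [-1, +1].
def scale_column(col, lo=-1.0, hi=1.0):
    cmin, cmax = col.min(), col.max()
    return lo + (hi - lo) * (col - cmin) / (cmax - cmin)

col = np.array([0.0, 5.0, 10.0])
print(scale_column(col))  # [-1.  0.  1.]
```

The same (cmin, cmax) found on the training set should also be applied to the test set, so both are scaled consistently.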
Model Selection

RBF kernel
◦ The RBF kernel is a reasonable first choice
◦ Nonlinearly maps samples into a higher dimensional space
◦ Few hyperparameters, which influences the complexity of model selection
◦ Fewer numerical difficulties
Cross-validation
◦ Find a good (C, \gamma)
◦ Avoid the overfitting problem
◦ v-fold cross-validation
   Divide the training set into v subsets of equal size
   Sequentially, one subset is tested using the classifier trained on the remaining v-1 subsets.
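The v-fold procedure above in a minimal NumPy sketch (the fold-splitting details and the toy majority-class scorer are my own, for illustration):

```python
import numpy as np

def v_fold_indices(n, v):
    """Split indices 0..n-1 into v roughly equal folds."""
    return np.array_split(np.arange(n), v)

def cross_validate(X, y, v, train_and_score):
    """Sequentially test each fold on a model trained on the other v-1."""
    folds = v_fold_indices(len(y), v)
    scores = []
    for i in range(v):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(v) if j != i])
        scores.append(train_and_score(X[train_idx], y[train_idx],
                                      X[test_idx], y[test_idx]))
    return np.mean(scores)

# Toy "classifier" (assumption): always predict the majority training label.
def majority(Xtr, ytr, Xte, yte):
    pred = 1 if (ytr == 1).sum() >= (ytr == -1).sum() else -1
    return float(np.mean(yte == pred))

X = np.arange(20).reshape(10, 2).astype(float)
y = np.array([1, 1, 1, 1, 1, 1, -1, -1, -1, -1])
print(cross_validate(X, y, 5, majority))  # 0.6
```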
Grid-search
◦ Try various pairs of (C, \gamma)
◦ Find a good parameter pair, for example over C = 2^{-5}, 2^{-3}, ..., 2^{15} and \gamma = 2^{-15}, 2^{-13}, ..., 2^{3}
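Enumerating that exponential grid and selecting the best-scoring pair takes only a few lines; the scoring function below is a toy surrogate (an assumption of this sketch), standing in for real cross-validation accuracy:

```python
# Enumerate the (C, gamma) pairs from the slide: exponents step by 2.
C_grid = [2.0 ** k for k in range(-5, 16, 2)]      # 2^-5, 2^-3, ..., 2^15
gamma_grid = [2.0 ** k for k in range(-15, 4, 2)]  # 2^-15, 2^-13, ..., 2^3
pairs = [(C, g) for C in C_grid for g in gamma_grid]
print(len(pairs))  # 110

# Toy surrogate score (assumption): peaks at C = 8, gamma = 0.5.
# In practice, plug in v-fold cross-validation accuracy here.
def cv_accuracy(C, gamma):
    return -((C - 8.0) ** 2 + (gamma - 0.5) ** 2)

best_C, best_gamma = max(pairs, key=lambda p: cv_accuracy(*p))
print(best_C, best_gamma)  # 8.0 0.5
```

The coarse grid can then be refined around the best pair with a finer exponent step.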
Grid-search
[figure: cross-validation accuracy contours over the (C, \gamma) grid]
Experiments

Astroparticle Physics
◦ original accuracy: 66.925 %
◦ after scaling: 96.15 %
◦ after grid-search: 96.875 % (3875/4000)
Bioinformatics
◦ original cross-validation accuracy: 56.5217 %
◦ after scaling, cross-validation accuracy: 78.5166 %
◦ after grid-search: 85.1662 %
Vehicle
◦ original accuracy: 2.433902 %
◦ after scaling: 12.1951 %
◦ after grid-search: 87.8049 % (36/41)
REFERENCES
◦ libSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
◦ A Training Algorithm for Optimal Margin Classifiers, Bernhard E. Boser, Isabelle M. Guyon, Vladimir N. Vapnik
◦ A Practical Guide to Support Vector Classification
◦ Course textbook
end of pages