TRANSCRIPT
A Practical Guide to Support Vector Classification
Department of Computer Science, Kim Myungjae
Table of Contents
Introduction
Data Preprocessing
Model Selection
Experiments
Introduction

Support Vector Machine
◦ SVM (Support vector machine)
◦ Training set of instance-label pairs (x_i, y_i), i = 1, ..., l, where x_i ∈ R^n and y_i ∈ {1, -1}
◦ Objective function

  \min_{w, b, \xi} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i

  subject to

  y_i (w^T x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0
Dual space form
◦ Objective function

  maximize  L(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j x_i^T x_j

  subject to  \sum_{i=1}^{l} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C
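As a quick sanity check on the dual constraints, scikit-learn's SVC (a library choice assumed for this sketch; the slides do not mention it) exposes y_i α_i for the support vectors as `dual_coef_`. Those coefficients should sum to zero and stay within ±C:

```python
import numpy as np
from sklearn.svm import SVC

# A tiny linearly separable toy set (assumed for illustration only).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [2.0, 2.0], [2.0, 3.0], [3.0, 2.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

C = 1.0
clf = SVC(kernel="linear", C=C).fit(X, y)

# dual_coef_ holds y_i * alpha_i for each support vector.
coef = clf.dual_coef_.ravel()

print(coef.sum())  # ~0 within floating-point error (sum_i alpha_i y_i = 0)
print(np.abs(coef).max())  # at most C, since 0 <= alpha_i <= C
```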
Nonlinear SVM
◦ Kernel method
   Training vectors x_i mapped into a higher dimensional space (maybe infinite dimensional) by a mapping function \phi : R^n \to H
◦ Objective function

  L(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
◦ Kernel function  K(x_i, x_j) = \phi(x_i)^T \phi(x_j)
   Linear:  K(x_i, x_j) = x_i^T x_j
   Polynomial:  K(x_i, x_j) = (\gamma x_i^T x_j + r)^d, \; \gamma > 0
   Radial basis function:  K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2), \; \gamma > 0
   Sigmoid:  K(x_i, x_j) = \tanh(\gamma x_i^T x_j + r)
◦ \gamma, r, d are kernel parameters
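The four kernels can be written directly in NumPy; this is an illustrative sketch (the function names and default parameter values are mine, not from the slides):

```python
import numpy as np

def linear(xi, xj):
    return xi @ xj

def polynomial(xi, xj, gamma=1.0, r=0.0, d=3):
    return (gamma * (xi @ xj) + r) ** d

def rbf(xi, xj, gamma=1.0):
    return np.exp(-gamma * np.sum((xi - xj) ** 2))

def sigmoid(xi, xj, gamma=1.0, r=0.0):
    return np.tanh(gamma * (xi @ xj) + r)

xi = np.array([1.0, 2.0])
xj = np.array([1.0, 2.0])
# The RBF kernel of a point with itself is exp(0) = 1.
print(rbf(xi, xj))  # 1.0
print(linear(xi, xj))  # 5.0
```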
Example
◦ Data url: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/data/

Application      #training data   #testing data   #features   #classes
Astroparticle    3,089            4,000           4           2
Bioinformatics   391              0               20          3
Vehicle          1,243            41              21          2
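The datasets at the URL above are in LIBSVM's sparse text format, one `label index:value index:value ...` line per instance. A minimal pure-Python reader, written here to illustrate the format rather than taken from any package:

```python
def read_libsvm_line(line):
    """Parse one 'label idx:val idx:val ...' line into (label, {idx: val})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        idx, val = item.split(":")
        features[int(idx)] = float(val)
    return label, features

# Two hypothetical instances in LIBSVM format (values made up).
sample = ["1 1:2.617300e+01 2:5.886700e+01",
          "-1 1:5.707397e+01 3:2.221404e+00"]
data = [read_libsvm_line(l) for l in sample]
print(data[0])  # (1.0, {1: 26.173, 2: 58.867})
```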
Proposed Procedure
◦ Transform data to the format of an SVM package
◦ Conduct simple scaling on the data
◦ Consider the RBF kernel  K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)
◦ Use cross-validation to find the best parameters C and \gamma
◦ Use the best C and \gamma to train the whole training set
◦ Test
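The whole procedure can be sketched with scikit-learn (the library choice, the synthetic dataset, and the exact parameter ranges are assumptions of this sketch; the slides use LIBSVM's own tools):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for a training set (assumption for this sketch).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Scale each attribute, use the RBF kernel, and grid-search (C, gamma)
# with cross-validation.
pipe = Pipeline([("scale", MinMaxScaler(feature_range=(-1, 1))),
                 ("svm", SVC(kernel="rbf"))])
grid = {"svm__C": [2.0 ** k for k in range(-5, 16, 2)],
        "svm__gamma": [2.0 ** k for k in range(-15, 4, 2)]}
search = GridSearchCV(pipe, grid, cv=5).fit(X, y)

# The model refit with the best (C, gamma) on the whole training set
# is available as search.best_estimator_, ready for testing.
print(search.best_params_)
```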
Data Preprocessing

Categorical Feature
◦ Example: a three-category attribute such as {red, green, blue} can be represented as (0, 0, 1), (0, 1, 0), and (1, 0, 0)

Scaling
◦ Scaling before applying SVM is very important.
◦ Linearly scale each attribute to the range [-1, +1] or [0, 1].
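Both preprocessing steps fit in a few lines of NumPy; this is a sketch (the ordering of the three binary attributes and the column layout are my choices):

```python
import numpy as np

# Categorical feature: encode {red, green, blue} as 3 binary attributes.
categories = ["red", "green", "blue"]
def one_hot(value):
    return [1.0 if value == c else 0.0 for c in categories]

print(one_hot("green"))  # [0.0, 1.0, 0.0]

# Scaling: linearly map one attribute (column) to [-1, +1].
def scale_column(col, lo=-1.0, hi=1.0):
    cmin, cmax = col.min(), col.max()
    return lo + (hi - lo) * (col - cmin) / (cmax - cmin)

col = np.array([0.0, 5.0, 10.0])
print(scale_column(col))  # [-1.  0.  1.]
```

The same (cmin, cmax) found on the training set should also be applied to the test set, so both are scaled consistently.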
Model Selection

RBF kernel
◦ The RBF kernel is a reasonable first choice
◦ Nonlinearly maps samples into a higher dimensional space
◦ Few hyperparameters, which influences the complexity of model selection
◦ Fewer numerical difficulties
Cross-validation
◦ Find a good (C, \gamma)
◦ Avoid the overfitting problem
◦ v-fold cross-validation
   Divide the training set into v subsets of equal size
   Sequentially, one subset is tested using the classifier trained on the remaining v-1 subsets.
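The v-fold procedure above in a minimal NumPy sketch (the fold-splitting details and the toy majority-class scorer are my own, for illustration):

```python
import numpy as np

def v_fold_indices(n, v):
    """Split indices 0..n-1 into v roughly equal folds."""
    return np.array_split(np.arange(n), v)

def cross_validate(X, y, v, train_and_score):
    """Sequentially test each fold on a model trained on the other v-1."""
    folds = v_fold_indices(len(y), v)
    scores = []
    for i in range(v):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(v) if j != i])
        scores.append(train_and_score(X[train_idx], y[train_idx],
                                      X[test_idx], y[test_idx]))
    return np.mean(scores)

# Toy "classifier" (assumption): always predict the majority training label.
def majority(Xtr, ytr, Xte, yte):
    pred = 1 if (ytr == 1).sum() >= (ytr == -1).sum() else -1
    return float(np.mean(yte == pred))

X = np.arange(20).reshape(10, 2).astype(float)
y = np.array([1, 1, 1, 1, 1, 1, -1, -1, -1, -1])
print(cross_validate(X, y, 5, majority))  # 0.6
```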
Grid-search
◦ Try various pairs of (C, \gamma)
◦ Find a good parameter pair, for example over C = 2^{-5}, 2^{-3}, ..., 2^{15} and \gamma = 2^{-15}, 2^{-13}, ..., 2^{3}
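Enumerating that exponential grid and selecting the best-scoring pair takes only a few lines; the scoring function below is a toy surrogate (an assumption of this sketch), standing in for real cross-validation accuracy:

```python
# Enumerate the (C, gamma) pairs from the slide: exponents step by 2.
C_grid = [2.0 ** k for k in range(-5, 16, 2)]      # 2^-5, 2^-3, ..., 2^15
gamma_grid = [2.0 ** k for k in range(-15, 4, 2)]  # 2^-15, 2^-13, ..., 2^3
pairs = [(C, g) for C in C_grid for g in gamma_grid]
print(len(pairs))  # 110

# Toy surrogate score (assumption): peaks at C = 8, gamma = 0.5.
# In practice, plug in v-fold cross-validation accuracy here.
def cv_accuracy(C, gamma):
    return -((C - 8.0) ** 2 + (gamma - 0.5) ** 2)

best_C, best_gamma = max(pairs, key=lambda p: cv_accuracy(*p))
print(best_C, best_gamma)  # 8.0 0.5
```

The coarse grid can then be refined around the best pair with a finer exponent step.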
Grid-search
[figure: cross-validation accuracy contours over the (C, \gamma) grid]
Experiments

Astroparticle Physics
◦ original accuracy: 66.925 %
◦ after scaling: 96.15 %
◦ after grid-search: 96.875 % (3875/4000)
Bioinformatics
◦ original cross-validation accuracy: 56.5217 %
◦ after scaling, cross-validation accuracy: 78.5166 %
◦ after grid-search: 85.1662 %
Vehicle
◦ original accuracy: 2.433902 %
◦ after scaling: 12.1951 %
◦ after grid-search: 87.8049 % (36/41)
REFERENCES
◦ libSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
◦ A Training Algorithm for Optimal Margin Classifiers, Bernhard E. Boser, Isabelle M. Guyon, Vladimir N. Vapnik
◦ A Practical Guide to Support Vector Classification
◦ Course textbook
end of pages