Liang Tian - Learning From Data Through Support Vector Machines
TRANSCRIPT
Learning From Data Through Support Vector Machines
Liang Tian
Lane Department of Computer Science& Electrical Engineering
West Virginia University
November 16, 2004
CPE 520 – Neural Networks
Neural Networks – BP Learning
P. Klinkhachorn. CpE520 Lecture Notes, CSEE Dept, West Virginia University.
BP Learning Procedure
Classifier
A. Moore. Lecture Notes, School of Computer Science, CMU, http://www.cs.cmu.edu/~awm/tutorials.
Classifier
[Figure: decision boundaries found by an SVM (margin shown) and by an MLP]
Classifier
- An MLP stops training when all points are correctly classified
- The decision surface may not be optimized
- The generalization error may not be minimized
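To make the margin idea concrete, here is a minimal sketch (my own illustration, not from the slides; the separating hyperplanes and the data points are hypothetical) of the geometric margin that the SVM maximizes and that plain MLP training ignores:

```python
import math

def geometric_margin(w, b, points):
    """Smallest signed distance y * (w.x + b) / ||w|| over labeled points.

    points: list of (x, y) with x a tuple of floats and y in {-1, +1}.
    A larger margin means the decision surface sits farther from the
    nearest training point; SVM training maximizes this quantity.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in points
    )

# Two separators that both classify the data correctly, but with
# different margins -- an MLP may stop at either one.
data = [((0.0, 1.0), +1), ((0.0, -1.0), -1)]
wide = geometric_margin((0.0, 1.0), 0.0, data)    # boundary y = 0
narrow = geometric_margin((0.0, 1.0), 0.5, data)  # boundary y = -0.5
```

Both boundaries give zero training error, yet only the first maximizes the distance to the nearest point.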
Local Minima
S. Bengio. An Introduction to Statistical Machine Learning – Neural Networks. IDIAP. Available at http://www.idiap.ch/~bengio, May 2003.
MLP – gradient descent learning – non-linear optimization – local minima
SVM Classification
R. Collobert. An Introduction to Statistical Machine Learning – Support Vector Machines. IDIAP. Available at http://www.idiap.ch/~collober, Jan. 2003.
Margin Maximization
Correct Classification
SVM Classification
- Classic non-linear optimization problem with inequality constraints
- Solved by maximizing the dual variables of the Lagrange function, subject to constraints
SVM Classification
- The solutions for the Lagrange multipliers αi determine the parameters w and b
- The final decision hyperplane is an indicator function (similar to the weighted sum in an MLP)
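The indicator function can be sketched in code (an illustration of the idea, not code from the talk; the support vectors, multipliers αi, bias b, and the linear kernel are all hypothetical stand-ins):

```python
def svm_decide(x, support_vectors, labels, alphas, b, kernel):
    """Indicator function: sign of the kernel-weighted sum over the
    support vectors, f(x) = sign(sum_i alpha_i * y_i * K(x_i, x) + b).
    Only training points with alpha_i > 0 (the support vectors)
    contribute -- the analogue of the weighted sum in an MLP.
    """
    s = b
    for sv, y, a in zip(support_vectors, labels, alphas):
        s += a * y * kernel(sv, x)
    return 1 if s >= 0 else -1

def linear_kernel(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical 1-D example: support vectors at -1 and +1, equal weights.
pred = svm_decide((2.0,), [(-1.0,), (1.0,)], [-1, 1], [0.5, 0.5], 0.0,
                  linear_kernel)
```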
SVM Classification
If the data are not linearly separable, it is easier to separate the two classes by projecting the data into a higher-dimensional space.
SVM Classification
[Figure: network view of the SVM. Inputs x1, x2 are mapped by basis functions φi(x) into the Z-space; the output is y = Σ wi φi(x) + b = wᵀφ(x), where b = w0 is the bias.]
SVM Classification
Problem? The computation is discouraging if the dimensionality of the Z-space is very large.
Introducing kernel functions simplifies the computation:
- The kernel function is evaluated in the input space
- It bypasses the high dimensionality of the feature space
- Common kernels are the polynomial and the Gaussian
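As a numeric sanity check (my own sketch, not from the slides), the degree-2 polynomial kernel K(u, v) = (1 + u·v)² equals an inner product in a 6-dimensional Z-space, so K can be evaluated in the input space without ever building φ explicitly:

```python
import math

def poly_kernel(u, v):
    """K(u, v) = (1 + u.v)^2, evaluated entirely in the 2-D input space."""
    return (1.0 + u[0] * v[0] + u[1] * v[1]) ** 2

def phi(x):
    """Explicit feature map of the degree-2 polynomial kernel: the
    6-dimensional Z-space that the kernel trick never needs to form."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return (1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

u, v = (1.0, 2.0), (3.0, -1.0)
in_input_space = poly_kernel(u, v)      # (1 + 3 - 2)^2 = 4.0
in_feature_space = dot(phi(u), phi(v))  # the same value, via phi
```

The two numbers agree, which is exactly why a large (even infinite, for the Gaussian) Z-space is no obstacle.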
SVM Learning Example
Classic XOR Problem
Polynomial kernel function K
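A quick sketch of why the polynomial kernel handles XOR (my own illustration, with a hypothetical {-1, +1} input encoding where the label is +1 when the two inputs agree): the degree-2 feature map contains an x1·x2 coordinate, and along that single Z-space coordinate the four XOR points become linearly separable.

```python
def phi_cross_term(x):
    """The sqrt(2)*x1*x2 coordinate of the degree-2 polynomial feature
    map -- the Z-space direction along which XOR becomes separable."""
    return 2.0 ** 0.5 * x[0] * x[1]

# XOR-type problem with inputs in {-1, +1}: label +1 when inputs agree.
# No straight line in the input plane separates the two classes.
xor_data = [((-1, -1), +1), ((-1, +1), -1), ((+1, -1), -1), ((+1, +1), +1)]

# In Z-space the single hyperplane "cross term > 0" classifies all four.
preds = [1 if phi_cross_term(x) > 0 else -1 for x, y in xor_data]
```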
SVM Learning Procedure
Step 1: Select the kernel function
Step 2: Present the inputs and desired outputs
Step 3: Solve for the Lagrange multipliers αi through an optimization problem
Step 4: Obtain the decision indicator function
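The four steps can be sketched end to end. This is an illustration only: a real SVM solver handles the bias term and its equality constraint (e.g. with SMO or a QP package), whereas this sketch drops the bias so that simple projected gradient ascent on the dual suffices; the kernel, data, C, and learning rate are hypothetical choices.

```python
def train_svm(xs, ys, kernel, C=10.0, lr=0.05, steps=500):
    """Steps 1-4 on a tiny data set.  Simplification: no bias term, so
    the dual has only box constraints 0 <= alpha_i <= C and plain
    projected gradient ascent converges."""
    n = len(xs)
    K = [[kernel(xs[i], xs[j]) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(steps):                 # Step 3: solve for the alphas
        for i in range(n):
            grad = 1.0 - ys[i] * sum(alpha[j] * ys[j] * K[i][j]
                                     for j in range(n))
            alpha[i] = min(C, max(0.0, alpha[i] + lr * grad))

    def decide(x):                         # Step 4: indicator function
        s = sum(a * y * kernel(xi, x) for a, y, xi in zip(alpha, ys, xs))
        return 1 if s >= 0 else -1

    return alpha, decide

linear = lambda u, v: u * v                       # Step 1: pick a kernel
xs, ys = [-2.0, -1.0, 1.0, 2.0], [-1, -1, 1, 1]   # Step 2: inputs/targets
alpha, decide = train_svm(xs, ys, linear)
preds = [decide(x) for x in xs]
```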
BP Learning Procedure
SVM vs. NN
V. Kecman. Learning and Soft Computing. MIT Press, Cambridge, MA, 2001. ISBN: 0-262-11255-8.
This is an NN
SVM vs. NN
This is an SVM
SVM vs. NN
There is NO difference in structure.
HOWEVER
There is an important difference in LEARNING!
SVM vs. NN
- SVM is a novel type of machine learning algorithm developed by V. Vapnik [1].
- SVM minimizes an upper bound on the generalization error.
- Conventional neural networks only minimize the error on the training data.
- SVM yields a unique, global solution and avoids being trapped at local minima.
[1] V. Vapnik. The Nature of Statistical Learning Theory. Springer, N.Y., 1995. ISBN: 0-387-94559-8.
SVM Applications
Muller et al. "An introduction to kernel-based learning algorithms", IEEE Trans. NN, 12(2), 2001, pp. 181-201.
OCR (error rate 0.6%)
SVM Applications
DNA Data Analysis
SVM Applications
Tax and Duin, "Outliers and data descriptions", Proceedings of the 7th Annual Conference of the Advanced School for Computing and Imaging, 2001, pp. 234-241.
Single-Class Classification
Two Types of Problems
Regression and Classification
SVM Regression
SVM Regression
Approximating a set of l pairs of training patterns, the SVM model used for function approximation is

f(x) = wᵀΦ(x) + b

where Φ(x) is the high-dimensional feature space that is nonlinearly mapped from the input space x.
L. Tian and A. Noore, “A novel approach for short-term load forecasting using support vector machines,” International Journal of Neural Systems, vol. 14, no. 5, Oct. 2004.
SVM Regression
w and b can be estimated by minimizing the following regularized risk function (Vapnik's linear loss function with an ε-insensitivity zone):
SVM Regression
The weight-vector norm ‖w‖² constrains the model structure capacity in order to obtain better generalization performance.
C is the regularization constant, representing the trade-off between the approximation error and the model structure.
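The regularized risk being minimized can be sketched numerically (my own illustration; the linear toy model, data, C, and ε are hypothetical, standing in for the feature-space model):

```python
def eps_insensitive_loss(y, f, eps):
    """Vapnik's linear loss: zero inside the eps-tube, linear outside."""
    return max(0.0, abs(y - f) - eps)

def regularized_risk(w, b, data, C, eps):
    """R = 0.5 * ||w||^2 + C * sum of eps-insensitive losses, for a
    1-D linear model f(x) = w*x + b (a toy stand-in for the
    feature-space model f(x) = w . phi(x) + b)."""
    norm_sq = w * w
    loss = sum(eps_insensitive_loss(y, w * x + b, eps) for x, y in data)
    return 0.5 * norm_sq + C * loss

data = [(0.0, 0.1), (1.0, 1.0), (2.0, 1.9)]   # nearly on the line y = x
risk = regularized_risk(1.0, 0.0, data, C=10.0, eps=0.1)
```

Every residual here falls inside the ε-tube, so only the capacity term ½‖w‖² contributes to the risk; a larger C would penalize points outside the tube more heavily.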
SVM Regression
Minimizing the risk objective function R
SVM Regression
Then, the solution is given in the form

f(x) = Σi (αi – αi*) K(xi, x) + b

Training examples with (αi – αi*) ≠ 0 are support vectors.
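This solution form can be sketched directly (an illustration only; the multipliers, bias, and linear kernel are hypothetical values, not the result of an actual optimization):

```python
def svr_predict(x, xs, alpha, alpha_star, b, kernel):
    """f(x) = sum_i (alpha_i - alpha_i*) * K(x_i, x) + b.  Points whose
    multiplier difference is zero drop out of the sum: only the
    support vectors influence the prediction."""
    return b + sum((a - a_s) * kernel(xi, x)
                   for xi, a, a_s in zip(xs, alpha, alpha_star))

linear = lambda u, v: u * v
xs = [1.0, 2.0, 3.0]
alpha      = [0.5, 0.0, 0.0]
alpha_star = [0.0, 0.0, 0.25]   # x = 2.0 has zero difference: not a SV
pred = svr_predict(4.0, xs, alpha, alpha_star, 0.1, linear)
```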
SVM Regression
The Lagrange multipliers can be obtained by maximizing the form:
Regression Application 1: Short-Term Load Forecasting
Regression Application 2: Software Reliability Prediction
Regression Application 2: Software Reliability Prediction
Average Error = 1.20%
L. Tian and A. Noore, “On-line software reliability prediction: An approach based on support vector machines,” International Journal of Reliability, Quality and Safety Engineering, submitted and under revision.
Parameter Selection
Cao and Tay, “Support vector machine with adaptive parameters in financial time series forecasting”, IEEE Trans. NN, 14(6), Nov. 2003, pp. 1506-1518.
Summary
- Both NN and SVM learn from experimental data.
- Both NN and SVM are universal approximators.
- After learning, both NN and SVM have the same mathematical model and graphical representation.
- The only difference is the learning method:
  - NN – gradient descent
  - SVM – solving a quadratic programming problem
SVM Research Issues
- Speeding up learning time when the data set is large:
  - Chunking, using subsets of the data
  - Improving the optimization techniques
- Parameter selection and optimization
- Modified and adaptive SVMs and other variations
References and Further Reading
[1] V. Vapnik. The Nature of Statistical Learning Theory. Springer, N.Y., 1995. ISBN: 0-387-94559-8.
[2] S. Bengio. An Introduction to Statistical Machine Learning – Neural Networks. IDIAP. Available at http://www.idiap.ch/~bengio, May 2003.
[3] V. Kecman. Learning and Soft Computing. MIT Press, Cambridge, MA, 2001. ISBN: 0-262-11255-8.
[4] R. Collobert. An Introduction to Statistical Machine Learning – Support Vector Machines. IDIAP. Available at http://www.idiap.ch/~collober, Jan. 2003.
[5] L. Tian and A. Noore. "A novel approach for short-term load forecasting using support vector machines." International Journal of Neural Systems, vol. 14, no. 5, Oct. 2004.
[6] L. Tian and A. Noore. "On-line software reliability prediction: An approach based on support vector machines." International Journal of Reliability, Quality and Safety Engineering, submitted and under revision.
[7] V. Vapnik. Statistical Learning Theory. John Wiley & Sons, 1998. ISBN: 0-471-03003-1.
[8] http://www.kernel-machines
[9] http://www.support-vector.ws
Thank You!
Questions and Comments?