Particle Swarm Optimization in Machine Learning
TRANSCRIPT
Introduction · Application to training of MLP · Application to training of SNN · Application to clustering · Application to full model selection · Conclusions
Particle Swarm Optimization in Machine Learning
Michał Okulewicz, Julian Zubek
Institute of Computer Science, Polish Academy of Sciences
Statistical Machine Learning, 9 January 2014
Michał Okulewicz, Julian Zubek · PSO in ML
Presentation Plan
1. Introduction
2. Application to training of MLP
3. Application to training of SNN
4. Application to clustering
5. Application to full model selection
Machine Learning task · Multilayer Perceptron · Spiking Neural Networks · Clustering · PSO Algorithm
General machine learning task
Machine learning algorithm
ML algorithm = family of models + model selection
The family of considered models is called the hypothesis space.
Choosing the best model is an optimization problem.
Note: some methods do not describe the hypothesis function explicitly (e.g. kNN).
Optimization in machine learning
Decision Tree
Hypothesis space: all possible partitions by trees.
Model selection: multistep greedy search optimizing the Gini coefficient at each split.
Optimization in machine learning
Linear regression
Hypothesis space:
\( y = \beta_0 + \beta_1 x \)
Model selection: ordinary least squares estimator (closed form).
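As a minimal illustration of the closed-form estimator, the OLS coefficients can be computed directly from the normal equations (the toy data below is invented here for illustration, not taken from the slides):

```python
import numpy as np

# Toy data: roughly y = 2 + 3x with noise (illustrative values only).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.9, 8.2, 10.8, 14.1])

# Design matrix with an intercept column; solve (X^T X) beta = X^T y.
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # [2.04, 2.99]
```

No iteration is needed: the optimum of the squared-error objective has an exact algebraic solution, which is why linear regression does not call for a stochastic optimizer.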
Logistic regression
Hypothesis space:
\( \pi(x) = \frac{1}{1 + \exp\!\big(-(\beta_0 + \beta_1 x)\big)} \)
Model selection: Newton's method (iterative root finding).
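A minimal sketch of Newton's iterative root finding, the workhorse behind logistic-regression fitting (shown here on a simple scalar function rather than the logistic likelihood, purely to illustrate the iteration):

```python
def newton(f, df, x0, iters=20):
    """Newton's method: repeatedly apply x <- x - f(x) / f'(x)."""
    x = x0
    for _ in range(iters):
        x -= f(x) / df(x)
    return x

# Find the root of f(x) = x^2 - 2, i.e. sqrt(2).
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
print(root)  # ~1.4142135623730951
```

In logistic regression the same scheme is applied to the gradient of the log-likelihood, with the Hessian playing the role of the derivative.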
Optimization in machine learning
Multi-layered perceptron
Hypothesis space:
\[ y(x) = \sigma\Big(\theta_{00} + \theta_{01}^{T}\big[\sigma\big(\theta_{10} + \theta_{11}^{T}[\sigma(\theta_{20} + \theta_{21} x), \ldots]^{T}\big), \ldots\big]^{T}\Big) \]
Model selection: backpropagation (gradient-descent optimization).
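The nested-sigmoid hypothesis above is just a forward pass through affine layers. A sketch (layer shapes and random weights here are arbitrary illustrative choices, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, layers):
    """Forward pass of an MLP: each layer is a (bias_vector, weight_matrix)
    pair, with the sigmoid applied after every affine map."""
    a = x
    for b, W in layers:
        a = sigmoid(b + W @ a)
    return a

# A 3-input, one-hidden-layer (4 units), single-output network.
rng = np.random.default_rng(0)
layers = [(rng.normal(size=4), rng.normal(size=(4, 3))),
          (rng.normal(size=1), rng.normal(size=(1, 4)))]
y = mlp_forward(np.array([0.5, -1.0, 2.0]), layers)
assert 0.0 < y[0] < 1.0  # a sigmoid output always lies in (0, 1)
```

Training searches over all the entries of the bias vectors and weight matrices at once, which is exactly the high-dimensional continuous space PSO can operate on.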
MultiLayer Perceptron
Interpretation:
- Artificial Neural Network modelling a neural system.
- A stack of logistic regression models.
Applications:
- Classification, regression.
Problems:
- The standard backpropagation algorithm might get stuck in local minima (restarts needed).
- Needs tuning of a learning rate.
- Backpropagation is unsuitable for more than 2 hidden layers.
Spiking Neural Network
Interpretation:
- Artificial Neural Network taking into account the timing of inputs.
- A set of differential equations for computing the membrane potential of a neuron.
Applications:
- Sequence (time-series) analysis.
- Pattern recognition.
Problems:
- Tuning of the parameters is not easy (STDP and ReSuMe algorithms train only the weights, but not the recovery time, increase time, and initial potential of the neurons).
Selected types of clustering
- Similarity-based clustering with defined K.
- Capacitated clustering (possibly with maximum K).
- Cost-based clustering (possibly with maximum K and limited cluster capacity).
When is global stochastic search feasible?
- The search space is very large.
- The objective function has multiple minima.
- The function gradient is unknown.
- Non-standard evaluation criteria.
- It is easy to overfit.

"[In machine learning] it appears to be better not to optimize!" (Tom Dietterich, 1995)
Particle Swarm Optimization
- A continuous, iterative, global optimization metaheuristic.
- Utilizes the idea of Swarm Intelligence.
- Optimization is performed by a set of simple beings called particles.
- Each particle has a current location, a velocity, and a memory of the best visited location.
- Particles communicate their best visited location to the set of their neighbours.
Particle Swarm Optimization
1. Initialize swarm.
2. Evaluate particles.
3. Each particle in parallel: update velocity, then update position.
4. [STOP conditions not met] go to step 2; [STOP conditions met] return the best visited location.
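The loop above can be sketched as a minimal swarm (a generic inertia-weight PSO with a global-best topology; parameter values, bounds, and names are my own illustrative choices, not the slides' standard variant):

```python
import random

def pso(f, dim, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO sketch: returns the best position found for objective f."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]       # each particle's best visited location
    gbest = min(pbest, key=f)[:]      # best location in the (global) neighbourhood
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if f(pos[i]) < f(pbest[i]):
                pbest[i] = pos[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest

# Minimize the 2-D sphere function; the optimum is at the origin.
best = pso(lambda x: sum(v * v for v in x), dim=2)
```

The only interface PSO needs is a black-box evaluation f(position), which is why it transfers so easily between MLP weights, SNN parameters, and cluster centres.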
SPSO 2007
t-th iteration for the i-th particle:

\begin{align}
v_i^{(t+1)} &= c_1 u^{(1)}_{U[0;1]} \left(x^{(best)}_{n[i]} - x_i^{(t)}\right) + c_2 u^{(2)}_{U[0;1]} \left(x^{(best)}_{i} - x_i^{(t)}\right) + \omega\, v_i^{(t)} \tag{1} \\
x_i^{(t+1)} &= x_i^{(t)} + v_i^{(t+1)} \tag{2}
\end{align}

where \(u^{(1)}, u^{(2)}\) are drawn component-wise from \(U[0;1]\), \(x^{(best)}_{n[i]}\) is the best location found in the i-th particle's neighbourhood, and \(x^{(best)}_{i}\) is the particle's own best visited location.
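A sketch of the SPSO 2007 update for a single particle (the coefficient values c1 = c2 ≈ 1.193 and ω ≈ 0.721 are Clerc's standard settings; the function and variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)

def spso2007_step(x, v, p_best, n_best, c1=1.193, c2=1.193, omega=0.721):
    """One SPSO 2007 step for a single particle (Eqs. 1-2).
    x, v   -- current position and velocity
    p_best -- the particle's own best visited location
    n_best -- best location communicated by its neighbours"""
    u1 = rng.uniform(0.0, 1.0, size=x.shape)  # component-wise U[0,1] draws
    u2 = rng.uniform(0.0, 1.0, size=x.shape)
    v_new = c1 * u1 * (n_best - x) + c2 * u2 * (p_best - x) + omega * v
    return x + v_new, v_new

# Demo: a particle at the origin attracted toward (1, 1).
x, v = spso2007_step(np.zeros(2), np.zeros(2),
                     p_best=np.ones(2), n_best=np.ones(2))
```

Note that the random draws are per-component, so the update direction is not simply a blend of the two attractor directions.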
SPSO 2011
t-th iteration for the i-th particle:

\begin{align}
g_i^{(t)} &= \frac{1}{3}\left[3 x_i^{(t)} + c_1\left(x^{(best)}_{n[i]} - x_i^{(t)}\right) + c_2\left(x^{(best)}_{i} - x_i^{(t)}\right)\right] \tag{3} \\
x'^{(t)}_i &\sim \mathrm{unif}\, B_i\!\left(g_i^{(t)},\ \left\|x_i^{(t)} - g_i^{(t)}\right\|\right) \nonumber \\
v_i^{(t+1)} &= \omega\, v_i^{(t)} + x'^{(t)}_i - x_i^{(t)} \tag{4} \\
x_i^{(t+1)} &= x_i^{(t)} + v_i^{(t+1)} \tag{5}
\end{align}

where \(x'^{(t)}_i\) is sampled uniformly from the ball centred at \(g_i^{(t)}\) with radius \(\|x_i^{(t)} - g_i^{(t)}\|\).
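A sketch of one SPSO 2011 step (names and the ball-sampling scheme are my own choices; this version samples uniformly from the ball via a random direction and a U^(1/d)-scaled radius, whereas Clerc's reference code draws the radius uniformly):

```python
import numpy as np

rng = np.random.default_rng(0)

def spso2011_step(x, v, p_best, n_best, c1=1.193, c2=1.193, omega=0.721):
    """One SPSO 2011 step for a single particle (Eqs. 3-5)."""
    # Eq. 3: gravity centre of x and the two attractor points.
    g = (x + (x + c1 * (n_best - x)) + (x + c2 * (p_best - x))) / 3.0
    # Sample x' uniformly from the ball B(g, ||x - g||).
    direction = rng.normal(size=x.shape)
    direction /= np.linalg.norm(direction)
    radius = np.linalg.norm(g - x) * rng.uniform(0.0, 1.0) ** (1.0 / x.size)
    x_prime = g + radius * direction
    v_new = omega * v + x_prime - x     # Eq. 4
    return x + v_new, v_new             # Eq. 5

# Demo: a particle at the origin attracted toward (1, 1).
x1, v1 = spso2011_step(np.zeros(2), np.zeros(2),
                       p_best=np.ones(2), n_best=np.ones(2))
```

The geometric reformulation makes the update rotation-invariant, unlike the per-component draws of SPSO 2007.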
Simulation on Rastrigin's function for SPSO 2007
Simulation on Rastrigin's function for SPSO 2011
MLP: PSO + SGD BP
Task: minimize MSE.
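The two-stage idea, a short PSO run to pick a good starting point, followed by gradient descent, can be sketched on a toy one-parameter model (the data, parameters, and model are invented here; a real MLP replaces the single weight with the full weight vector):

```python
import random

def mse(w, data):
    """MSE of a one-parameter model y = w*x (toy stand-in for an MLP)."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # generated by y = 2x

# Stage 1: a short PSO run over the weight to find a good initial value.
rng = random.Random(0)
pos = [rng.uniform(-10, 10) for _ in range(10)]
vel = [0.0] * 10
pbest = pos[:]
gbest = min(pbest, key=lambda w: mse(w, data))
for _ in range(30):
    for i in range(10):
        r1, r2 = rng.random(), rng.random()
        vel[i] = (0.7 * vel[i] + 1.5 * r1 * (pbest[i] - pos[i])
                  + 1.5 * r2 * (gbest - pos[i]))
        pos[i] += vel[i]
        if mse(pos[i], data) < mse(pbest[i], data):
            pbest[i] = pos[i]
    gbest = min(pbest + [gbest], key=lambda w: mse(w, data))

# Stage 2: refine the PSO result with plain gradient descent on the MSE.
w = gbest
for _ in range(100):
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= 0.05 * grad
```

The point of the hybrid is that PSO's global search sidesteps bad basins, while gradient descent then converges quickly and precisely inside the chosen basin.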
[Figure: six panels of log(MSE) vs. iteration, each comparing training with and without initial PSO, for the runs irisF2_, glassF2_, thyroidF2_, gsmF2_40_agg_mean, wifiF2_40_agg_mean, and wifiF2_40_minus.]
Task: minimize the AUC of the absolute value of the membrane potential of the Similarity Measure Neuron.
K-means clustering
Task: minimize the distance from cluster centres to the points belonging to the clusters.
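For PSO-based clustering, a natural fitness function encodes all K cluster centres as one particle position and sums the distance from each point to its nearest centre (this encoding is a common choice, sketched here with my own names and toy data):

```python
import numpy as np

def clustering_fitness(position, points, k):
    """PSO clustering fitness: the particle position is k cluster centres
    concatenated into one flat vector; the fitness is the summed distance
    of every point to its nearest centre."""
    centres = position.reshape(k, -1)
    dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
    return dists.min(axis=1).sum()

# Two tight pairs of points; centres at the pair midpoints score best.
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
good = np.array([0.05, 0.0, 5.05, 5.0])   # centres (0.05, 0) and (5.05, 5)
bad = np.zeros(4)                          # both centres at the origin
assert clustering_fitness(good, points, k=2) < clustering_fitness(bad, points, k=2)
```

Unlike Lloyd's K-means iterations, this treats the whole centre configuration as a single point in a 2K-dimensional continuous space, which is exactly what a swarm explores.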
DVRP: an example of cost-based clustering
Task: minimize the total route length.
Full model selection
Standard machine learning
Model selection: tuning parameters of a function of a given class.

Meta-learning
Full model selection:
- Choosing the preprocessing algorithm and its parameters.
- Choosing the feature selection strategy and its parameters.
- Choosing the machine learning algorithm and its parameters.
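To run PSO over such discrete pipeline choices, a particle's real-valued position must be decoded into concrete components. A hypothetical encoding (all component names, the mapping, and the parameter layout below are my own illustration, not the scheme of Escalante et al.):

```python
# Hypothetical search space for a full pipeline.
PREPROCESSORS = ["none", "standardize", "pca"]
SELECTORS = ["none", "top_k_variance"]
CLASSIFIERS = ["knn", "svm", "tree"]

def decode(position):
    """Map a real-valued particle position to a full pipeline:
    position[0..2] pick the discrete components, position[3] is a
    continuous hyperparameter passed to the chosen classifier."""
    pre = PREPROCESSORS[int(position[0] * len(PREPROCESSORS)) % len(PREPROCESSORS)]
    sel = SELECTORS[int(position[1] * len(SELECTORS)) % len(SELECTORS)]
    clf = CLASSIFIERS[int(position[2] * len(CLASSIFIERS)) % len(CLASSIFIERS)]
    return {"preprocess": pre, "select": sel, "classifier": clf,
            "hyperparam": max(0.0, position[3])}

pipeline = decode([0.4, 0.9, 0.7, 1.5])
```

The fitness of a position is then the cross-validated score of the decoded pipeline, so the swarm searches preprocessing, feature selection, and learner choices jointly.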
PSO in full model selection
Particle Swarm Model Selection (H. J. Escalante, M. Montes, E. Sucar, 2009):
- Implemented on top of the Challenge Learning Object Package (CLOP) for MATLAB.
- Used in the Agnostic Learning vs. Prior Knowledge 2007 challenge (75 competitors):
  - 8th place overall,
  - 5th place among agnostic methods,
  - 2nd place among methods utilizing only standard CLOP algorithms.
- 2-6 hours needed for each dataset from the competition.
Conclusions
- PSO (or possibly other metaheuristics) can be applied to the fitting and selection of ML models.
- In the standard approach it does not usually beat well-known specific training algorithms.
- It can be used for mathematically non-trivial models (SNN) or with non-standard fitness functions (we could more easily optimize for business criteria).
Bibliography I
Maurice Clerc.
Standard PSO 2006, 2011, 2012.
Ioan Cristian Trelea.
The particle swarm optimization algorithm: convergence analysis and parameter selection. Information Processing Letters, 85(6):317-325, 2003.
X. Cui, T.E. Potok, and P. Palathingal.
Document clustering using particle swarm optimization. In Swarm Intelligence Symposium, 2005 (SIS 2005), Proceedings 2005 IEEE, pages 185-191, June 2005.
Hugo Jair Escalante, Manuel Montes, Luis Enrique Sucar, Isabelle Guyon, and Amir Saffari.
Particle swarm model selection. In JMLR, Special Topic on Model Selection, pages 405-440, 2009.
Yuan-wei Jing, Tao Ren, and Yu-cheng Zhou.
Neural network training using PSO algorithm in ATM traffic control. In Intelligent Control and Automation, pages 341-350. Springer, 2006.
Bibliography II
Jan Karwowski, Michał Okulewicz, and Jarosław Legierski.
Application of particle swarm optimization algorithm to neural network training process in the localization of the mobile terminal. In Lazaros Iliadis, Harris Papadopoulos, and Chrisina Jayne, editors, Engineering Applications of Neural Networks, volume 383 of Communications in Computer and Information Science, pages 122-131. Springer Berlin Heidelberg, 2013.
J. Kennedy and R. Eberhart.
Particle swarm optimization. Proceedings of IEEE International Conference on Neural Networks, IV, pages 1942-1948, 1995.
Hongbo Liu, Bo Li, Xiukun Wang, Ye Ji, and Yiyuan Tang.
Survival density particle swarm optimization for neural network training. In Advances in Neural Networks - ISNN 2004, pages 332-337. Springer, 2004.
Michael Meissner, Michael Schmuker, and Gisbert Schneider.
Optimized Particle Swarm Optimization (OPSO) and its application to artificial neural network training. BMC Bioinformatics, 7(1):125, 2006.
Ammar Mohemmed, Satoshi Matsuda, Stefan Schliebs, Kshitij Dhoble, and Nikola Kasabov.
Optimization of spiking neural networks with dynamic synapses for spike sequence generation using PSO. In IJCNN, pages 2969-2974. IEEE, 2011.
Bibliography III
Ben Niu and Li Li.
A hybrid particle swarm optimization for feed-forward neural network training. In Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence, pages 494-501. Springer, 2008.
Michał Okulewicz and Jacek Mańdziuk.
Application of particle swarm optimization algorithm to dynamic vehicle routing problem. In Leszek Rutkowski, Marcin Korytkowski, Rafał Scherer, Ryszard Tadeusiewicz, Lotfi A. Zadeh, and Jacek M. Zurada, editors, Artificial Intelligence and Soft Computing, volume 7895 of Lecture Notes in Computer Science, pages 547-558. Springer Berlin Heidelberg, 2013.
Xiaorong Pu, Zhongjie Fang, and Yongguo Liu.
Multilayer perceptron networks training using particle swarm optimization with minimum velocity constraints. In Advances in Neural Networks - ISNN 2007, pages 237-245. Springer, 2007.
Y. Shi and R.C. Eberhart.
A modified particle swarm optimizer. Proceedings of IEEE International Conference on Evolutionary Computation, pages 69-73, 1998.
Y. Shi and R.C. Eberhart.
Parameter selection in particle swarm optimization. Proceedings of Evolutionary Programming VII (EP98), pages 591-600, 1998.