Regression and Classification: An Artificial Neural Network Approach
TRANSCRIPT
Welcome to my presentation on
Regression and Classification: An Artificial Neural Network Approach
Presented by Md. Menhazul Abedin
Research student, Dept. of Statistics
University of Rajshahi, Rajshahi-6205
Dedication
• This presentation is dedicated to my honorable supervisor
05/02/2023 2
Three pioneers of ANN
Warren McCulloch, Walter Pitts, and Frank Rosenblatt
Outlines
• Motivation/Why this study?
• Objectives
• Methodology
• Findings
• Conclusion
• Limitations
• Areas of further research
Motivation/Why this study?
• Data come in many forms: vectors, matrices, sound, images, waves, strings, text, etc.
• How to analyze them? This has been a challenge for several decades.
Objectives
• To study neural network as a technique for regression and classification.
• To compare neural network with classical regression and classification techniques.
• To study the limitations of neural network.
• Structure of a neuron
What is ANN?
• Biological neural network
• Artificial neural network
• How many hidden layers should be considered? More hidden layers approximate nonlinearity better.
• More hidden layers need more time to converge.
• Weights are adjusted by an iterative method (backpropagation).
• Analogy between biological and artificial neural networks
Historical Background of Artificial Neural Network
• In 1943, neurophysiologist Warren McCulloch and mathematician Walter Pitts wrote a paper on how neurons might work.
• In 1949, Donald Hebb wrote The Organization of Behavior (on the ways in which humans learn).
• M. Minsky (1951) built a reinforcement-based network learning system.
• F. Rosenblatt (1958) built the first practical Artificial Neural Network (ANN), the perceptron.
• B. Widrow & M. E. Hoff (1960) introduced an adaptive perceptron-like network using the Least Mean Square (LMS) error algorithm.
• 1969: Marvin Minsky and Seymour Papert showed that the perceptron model is not capable of representing many important problems.
• 1973: Christoph von der Malsburg used a neuron model that was nonlinear and biologically more motivated.
• 1974: Paul Werbos developed a learning procedure called backpropagation of error.
Historical Background of Artificial Neural Network
• 1986: The application area of MLP networks remained rather limited until the breakthrough when a general backpropagation algorithm for the multi-layered perceptron was introduced by Rumelhart and McClelland.
• 1988: Radial Basis Function (RBF) networks were first introduced by Broomhead & Lowe. Although the basic idea of RBF had been developed about 30 years earlier under the name "method of potential functions", the work by Broomhead & Lowe opened a new frontier in the neural network community.
ANN regression
• A linear activation function in the output layer gives continuous values.
ANN classification
• For two classes: sigmoid function (output > 0.5 → one class, output < 0.5 → the other class).
• For more classes: softmax function (gives a probability for each class).
• The tanh function may also be used as an activation function.
Activation functions
• Linear function: f(η) = η.
• Sigmoid function: f(η) = 1/(1 + e^{-η}), where η = xθ.
• Softmax function: f(η)_c = e^{η_c} / Σ_k e^{η_k}.
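As an illustration, the three activation functions can be written in a few lines of NumPy (a minimal sketch; the function names are my own):

```python
import numpy as np

def linear(eta):
    # Identity activation: the output-layer choice for regression (continuous values).
    return eta

def sigmoid(eta):
    # Maps any real eta to (0, 1): the output-layer choice for two-class problems.
    return 1.0 / (1.0 + np.exp(-eta))

def softmax(eta):
    # Maps a score vector to a probability distribution over classes.
    e = np.exp(eta - np.max(eta))  # subtract the max for numerical stability
    return e / e.sum()

print(linear(1.5))                         # 1.5
print(sigmoid(0.0))                        # 0.5
print(softmax(np.array([2.0, 1.0, 0.1])))  # three probabilities summing to 1
```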
Construction of cost function: sigmoid formulation

The perceptron learning model specifies the probability of a binary output y_i ∈ {0, 1} given the input x_i as follows:

p(y_i \mid x_i, w) = \mathrm{Ber}(y_i \mid \mathrm{sigm}(x_i, w)), \quad \text{where } \mathrm{sigm}(x_i, w) = \frac{1}{1 + e^{-x_i w}}

p(y \mid X, w) = \prod_{i=1}^{n} \mathrm{Ber}(y_i \mid \mathrm{sigm}(x_i, w)) = \prod_{i=1}^{n} \left(\frac{1}{1 + e^{-x_i w}}\right)^{y_i} \left(\frac{1}{1 + e^{x_i w}}\right)^{1 - y_i}

\mu_i \equiv p(y_i = 1 \mid x_i, w) = \frac{1}{1 + e^{-x_i w}}

Cost function:

c(w) = -\log p(y \mid X, w) = -\sum_{i=1}^{n} \left[ y_i \log \mu_i + (1 - y_i) \log(1 - \mu_i) \right]

This is the cross-entropy. The decision boundary is x_i w = 0.
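The cross-entropy cost translates directly into code. A minimal sketch (the toy data are made up for illustration):

```python
import numpy as np

def sigm(X, w):
    # sigm(x_i, w) = 1 / (1 + exp(-x_i w)), evaluated row by row
    return 1.0 / (1.0 + np.exp(-X @ w))

def cross_entropy(w, X, y):
    # c(w) = -sum_i [ y_i log mu_i + (1 - y_i) log(1 - mu_i) ]
    mu = sigm(X, w)
    return -np.sum(y * np.log(mu) + (1 - y) * np.log(1 - mu))

# Toy data: an intercept column plus two inputs (assumed for illustration).
X = np.array([[1.0,  0.5,  1.2],
              [1.0, -1.0,  0.3],
              [1.0,  2.0, -0.7]])
y = np.array([1.0, 0.0, 1.0])

print(cross_entropy(np.zeros(3), X, y))  # n log 2 = 3 log 2 ≈ 2.079 at w = 0
```

At w = 0 every μ_i is 0.5, so the cost is n log 2, a useful sanity check before optimizing.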
Softmax formulation

[Diagram: inputs x_{i1}, x_{i2} and a bias (+1) feed two summation units u_{i1}, u_{i2} through weights w_{11}, w_{21}, w_{12}, w_{22} and biases b_1 = w_{10}, b_2 = w_{20}, followed by a softmax layer.]

With two classes the softmax reduces to the sigmoid, \mathrm{sigm}(x_i, w) = \frac{1}{1 + e^{-x_i w}}, and the class probabilities are

\mu_{i1} = \frac{e^{x_i w_1}}{e^{x_i w_1} + e^{x_i w_2}}, \qquad \mu_{i2} = \frac{e^{x_i w_2}}{e^{x_i w_1} + e^{x_i w_2}}, \qquad \mu_{i1} + \mu_{i2} = 1
Construction of cost function: softmax formulation

Indicator: I_c(y_i) = 1 if y_i = c, and 0 otherwise.

p(y_i \mid x_i, w) = \mu_{i1}^{I_0(y_i)} \, \mu_{i2}^{I_1(y_i)}

p(y \mid X, w) = \prod_{i=1}^{n} \mu_{i1}^{I_0(y_i)} \, \mu_{i2}^{I_1(y_i)}

p(y_i \mid x_i, w) = \begin{cases} \dfrac{e^{x_i w_1}}{e^{x_i w_1} + e^{x_i w_2}} & \text{if } y_i = 0 \\[1ex] \dfrac{e^{x_i w_2}}{e^{x_i w_1} + e^{x_i w_2}} & \text{if } y_i = 1 \end{cases}

c(w) = -\log p(y \mid X, w) = -\sum_{i=1}^{n} \left[ I_0(y_i) \log \mu_{i1} + I_1(y_i) \log \mu_{i2} \right]

Pipeline: X → Linear layer → Log-softmax layer → NLL → c(w)
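The softmax negative log-likelihood can be sketched the same way (toy two-class data and zero initial weights, assumed for illustration):

```python
import numpy as np

def softmax_probs(X, W):
    # mu_ic = exp(x_i w_c) / sum_k exp(x_i w_k); one weight column per class
    scores = X @ W
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def softmax_cost(W, X, y):
    # c(W) = -sum_i [ I_0(y_i) log mu_i1 + I_1(y_i) log mu_i2 ]
    # The fancy index picks out mu_{i, y_i}, the true-class probability,
    # which is exactly what the indicator terms do.
    mu = softmax_probs(X, W)
    return -np.sum(np.log(mu[np.arange(len(y)), y]))

# Toy two-class data (assumed): intercept column plus one input.
X = np.array([[1.0, 0.4], [1.0, -0.9], [1.0, 1.5]])
y = np.array([0, 1, 0])       # class labels coded 0/1
W = np.zeros((2, 2))

print(softmax_cost(W, X, y))  # 3 log 2 ≈ 2.079 at W = 0
```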
Weight update (Backpropagation)
• Take the derivative of the cost with respect to the inputs, layer by layer.
• Information flows forward from the input X to the cost c (forward message).
• The error propagates backward (backward message), and each layer updates its weights.
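A minimal one-hidden-layer backpropagation loop, illustrating the forward and backward messages (network sizes, learning rate and toy data are all assumptions for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: 8 cases, 3 inputs, target = sum of the inputs.
X = rng.normal(size=(8, 3))
y = X.sum(axis=1, keepdims=True)

W1 = rng.normal(scale=0.1, size=(3, 4))  # input -> hidden (sigmoid units)
W2 = rng.normal(scale=0.1, size=(4, 1))  # hidden -> linear output
lr = 0.1
losses = []

for _ in range(200):
    # Forward message: information flows input -> hidden -> output -> cost.
    h = 1.0 / (1.0 + np.exp(-X @ W1))
    yhat = h @ W2
    losses.append(float(np.mean((yhat - y) ** 2)))
    # Backward message: the error propagates layer-wise via the chain rule,
    # and each layer updates its own weights.
    d_out = (yhat - y) / len(X)              # (scaled) squared-error gradient
    d_h = (d_out @ W2.T) * h * (1.0 - h)     # gradient through the sigmoid
    W2 -= lr * h.T @ d_out
    W1 -= lr * X.T @ d_h

print(losses[0], losses[-1])  # the training error shrinks as weights update
```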
Optimization
Our goal is to optimize the cost function. Different optimization techniques:
• Gradient descent algorithm
• Newton's algorithm
• Stochastic gradient descent (SGD)
• Online learning, batch & mini-batch optimization
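Batch gradient descent and SGD differ only in how much data each weight update sees. A sketch on a toy least-squares problem (sizes, seeds and learning rates are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
w_true = np.array([2.0, -1.0])
y = X @ w_true                      # noiseless toy targets

def grad(w, Xb, yb):
    # Gradient of the mean squared error on a batch of any size.
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

# Batch gradient descent: every update sees the full data set.
w_gd = np.zeros(2)
for _ in range(200):
    w_gd -= 0.1 * grad(w_gd, X, y)

# Stochastic gradient descent: each update sees a single case
# (mini-batch SGD would use a small random subset instead).
w_sgd = np.zeros(2)
for _ in range(5):                  # 5 passes (epochs) over the data
    for i in rng.permutation(len(y)):
        w_sgd -= 0.05 * grad(w_sgd, X[i:i+1], y[i:i+1])

print(w_gd.round(3), w_sgd.round(3))  # both approach w_true = [2, -1]
```

Batch updates are stable but expensive per step; SGD updates are cheap and noisy, which is why mini-batches are the usual compromise.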
Regression (Findings)
• Data sets used = 7 (Regression = 4, Classification = 3)
• Pharmaceuticals data:
Size 26
No. of variables 4 (one dependent and three independent)
Outlier Present (6th, 10th, and 26th)
Autocorrelation Absence
Multicollinearity Absence
Normality Present
Data type Real
Cross validation LOOCV
Applied methods Linear model, Polynomial & ANN
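Leave-one-out cross-validation (LOOCV), used for this data set, fits the model n times, each time predicting the single held-out case. A sketch with an ordinary least-squares fit on simulated data of the same size (the data themselves are assumptions, not the pharmaceuticals data):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 26                               # same size as the pharmaceuticals data

# Simulated stand-in: intercept plus three independent variables.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=n)

errors = []
for i in range(n):
    keep = np.arange(n) != i         # leave case i out
    w, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    errors.append(float((y[i] - X[i] @ w) ** 2))

print(np.mean(errors))               # LOOCV estimate of prediction error
```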
Regression (cont…)
• ANN is the best regression model
Regression (cont…)
• Yacht Hydrodynamics data:
Size 308
No. of variables 7 (one dependent and six independent)
Outlier Absence
Autocorrelation Absence
Multicollinearity Absence
Normality Absence (Clustered)
Data type Real
Cross validation Training set and test set
Applied methods Linear model, Polynomial & ANN
• Results of the Yacht Hydrodynamics data
• 100 repeats with different training and test sets
• Box plots of the test errors give a sense of the error variation
• ANN is the best regression model
Regression (cont…)
• Simulated data-1
Size 1000
No. of variables 10 (one dependent and nine independent)
Outlier Absence
Autocorrelation Absence
Multicollinearity Absence
Normality Present
Data type Real
Cross validation Training set and test set
Applied methods Linear model & ANN
• Results of Simulated data-1
• 100 repeats with different training and test sets
• Box plots of the test errors give a sense of the error variation
• ANN is the best regression model
Regression (cont…)
• Simulated data-2
Size 20000
No. of variables 20 (one dependent and nineteen independent)
Outlier Absence
Autocorrelation Absence
Multicollinearity Strong multicollinearity
Normality Present
Data type Real
Cross validation Training set and test set
Applied methods Linear model & ANN
• Results of Simulated data-2
• 100 repeats with different training and test sets
• Box plots of the test errors give a sense of the error variation
• ANN is the best regression model
Classification• IRIS data
Size 150
No. of variables 5 (one dependent and four independent)
No. of class Three (Setosa, Versicolor, Virginica)
Type Balanced
Data type Real
Cross validation LOOCV
Applied methods Logistic, LDA, QDA, KNN, NB & ANN
Classification (cont…)
• Results
• ANN is the best classifier
Methods Classification rate Misclassification rate
Logistic 0.98 0.02
LDA 0.98 0.02
QDA 0.98 0.02
KNN 0.95 0.05
NB 0.95 0.05
ANN 0.99 0.01
Classification (cont…)
• Fertility data
Size 100
No. of variables 5 (one dependent and four independent)
No. of class Two (Normal & Altered)
Type Imbalanced
Data type Real
Cross validation LOOCV
Applied methods Logistic, LDA, KNN, NB & ANN
Classification (cont…)
• Results
• ANN is the best classifier
Methods Accuracy Sensitivity Specificity PPV NPV
Logistic 0.84 0.87 0.00 0.96 0.00
LDA 0.83 0.95 0.00 0.87 0.00
KNN 0.81 0.90 0.16 0.88 0.20
NB 0.82 0.94 0.00 0.87 0.00
ANN 0.88 0.95 0.34 0.91 0.50
Classification (cont…)
• Leukemia data
Size 72
No. of variables 7130 (one dependent and 7129 independent)
No. of class Two (ALL & AML)
Type Balanced
Data type Real
Cross validation LOOCV
Applied methods Logistic, LDA, QDA, KNN, NB & ANN
Classification (cont…)
• Results
• ANN is the best classifier
Methods Accuracy Sensitivity Specificity
Logistic 0.47 0.62 0.31
LDA 0.62 0.68 0.52
QDA 0.65 1.00 0.00
KNN 0.54 0.65 0.32
NB 0.65 1.00 0.00
ANN 0.64 0.68 0.56
Conclusion
• In all cases, ANN is the best.
Data Problems ANN Status
Pharmaceuticals Outlier Best regression model
Yacht hydro: Clustered Best regression model
Simulated data-1 Fresh Best regression model
simulated data-2 Strong multicollinearity Best regression model
IRIS Balanced Best classifier
Fertility Imbalanced Best classifier
Leukemia Large (7129 variables) Best classifier
Limitations
• Backpropagation gives no guarantee of reaching the absolute minimum.
• The VC dimension is unclear.
• Weights are initialized randomly, so the result is not unique.
• If some weights are zero, the network doesn't converge.
• Computation of confidence intervals is very hard.
• Doesn't perform t-tests or F-tests.
Areas of further research
• Robust, generalized ridge, principal component, latent root, lasso and stepwise regression
• Multivariate regression, time series analysis
• Application of artificial neural networks to unsupervised learning
• Study of semi-supervised learning
• Comparative study with other machine learning and data mining techniques
• Improvement of the backpropagation algorithm
THANK YOU ALL