Multilayer feed-forward artificial neural networks for class-modeling, F. Marini, A. Magrì, R. Bucci
TRANSCRIPT
Multilayer feed-forward artificial neural networks for class-modeling
F. Marini, A. Magrì, R. Bucci
Dept. of Chemistry - University of Rome "La Sapienza"
The starting question…
Although the literature on NNs has grown significantly, no paper considers the possibility of performing class-modeling.
[Bar chart: ANN papers published, 1982-2002; yearly counts rising from 1 (1982) to 4916 (2001-2002)]
Class modeling: what…
• Class modeling considers one class at a time
• Any object can then belong or not belong to that specific class model
• As a consequence, any object can be assigned to only one class, to more than one class, or to no class at all
[Figure: classification vs. class modeling]
…and why
• Flexibility
• Additional information:
  – sensitivity: fraction of samples from category X accepted by the model of category X
  – specificity: fraction of samples from category Y (or Z, W…) rejected by the model of category X
• No need to rebuild the existing models each time a new category is added
• A less equivocal answer to the question: "are the analytical data compatible with the product being X, as declared?"
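The two figures of merit defined above are easy to compute once a category model has produced accept/reject decisions. A minimal sketch, assuming a hypothetical boolean array `accepted` (decisions of the model of category X) and an array of true labels:

```python
import numpy as np

def sensitivity(accepted, labels, category):
    """Fraction of samples from `category` accepted by that category's model."""
    own = labels == category
    return accepted[own].mean()

def specificity(accepted, labels, category):
    """Fraction of samples from other categories rejected by that model."""
    other = labels != category
    return (~accepted[other]).mean()

# toy decisions of the model of category "X" (illustrative, not from the talk)
labels = np.array(["X", "X", "X", "Y", "Y", "Y"])
accepted = np.array([True, True, False, False, False, True])
print(sensitivity(accepted, labels, "X"))  # 2/3: two of three X samples accepted
print(specificity(accepted, labels, "X"))  # 2/3: two of three Y samples rejected
```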
A first step forward
• A particular kind of NN, after suitable modifications, could be used for performing class-modeling (Anal. Chim. Acta, 544 (2005), 306):
  – Kohonen SOM
  – Addition of dummy random vectors to the training set
  – Computation of a suitable (non-parametric) probability distribution after mapping on the 2D Kohonen layer
  – Definition of the category space based on this distribution
In this communication…
…the possibility of using a different type of neural network (multilayer feed-forward) to operate class-modeling is studied
– How to?
– Examples
NN: a mathematical approach
• From a computational point of view, ANNs represent a way to operate a non-linear functional mapping between an input and an output space:

y = f(x)
• This functional relation is expressed in an implicit way (via a combination of suitably weighted non-linear functions, in the case of MLF-NN)
• ANNs are usually represented as groups of elementary computational units (neurons) performing simultaneously the same operations.
• Types of NN differ in how neurons are grouped and how they operate
Multilayer feed-forward NN
• Individual processing units are organized in three types of layer: input, hidden and output
• All neurons within the same layer operate simultaneously
[Diagram: three-layer network; input nodes x1…x5, hidden layer, output nodes y1…y4]
The artificial neuron
[Diagram: hidden neuron k combining inputs x1, x2, x3 through weights w1k, w2k, w3k and transfer function f()]

z_k = f( Σ_i w_ik x_i + w_0k )
The artificial neuron
[Diagram: output neuron j combining hidden outputs z1, z2, z3 through weights w1j, w2j, w3j and transfer function f()]

y_j = f( Σ_k w_kj z_k + w_0j ) = f( Σ_k w_kj f( Σ_i w_ik x_i + w_0k ) + w_0j )
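The two layer equations can be sketched as a NumPy forward pass. The sizes match the five-input, four-output diagram shown earlier; the random weights and the sigmoid transfer function are illustrative assumptions, not values from the talk:

```python
import numpy as np

def f(a):
    """Sigmoid transfer function (a common choice for MLF networks)."""
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
n_inp, n_hid, n_out = 5, 3, 4           # sizes as in the diagram (illustrative)
W1 = rng.normal(size=(n_hid, n_inp))    # weights w_ik, input -> hidden
b1 = rng.normal(size=n_hid)             # biases w_0k
W2 = rng.normal(size=(n_out, n_hid))    # weights w_kj, hidden -> output
b2 = rng.normal(size=n_out)             # biases w_0j

x = rng.normal(size=n_inp)
z = f(W1 @ x + b1)   # z_k = f(sum_i w_ik x_i + w_0k): whole layer at once
y = f(W2 @ z + b2)   # y_j = f(sum_k w_kj z_k + w_0j)
print(y.shape)       # (4,)
```

The matrix products make explicit that all neurons within a layer operate simultaneously, as stated above.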
Training
• Iterative variation of connection weights, to minimize an error criterion.
• Usually, the backpropagation algorithm is used:

Δw_ij^P(t) = −η ∂E^P/∂w_ij + μ Δw_ij^P(t−1)
MLF class-modeling: what to do?
• The model for each category has to be built using only training samples from that category
• A suitable definition of the category space is needed
Somewhere to start from
When targets are equal to input values, hidden nodes could be thought of as a sort of non-linear principal components
[Diagram: autoassociative Input-Hidden-Input network (x1, x2, x3, …, xj, …, xm); plot of output value of hidden node 1 vs. output value of hidden node 2]
…and a first ending point
• For each category, a neural network model is computed, providing the input vector also as the desired target vector: Ninp-Nhid-Ninp
• The number of hidden nodes is estimated by LOO-CV (minimum reconstruction error in prediction)
• The optimized model is then used to predict unknown samples:
  – The sample is presented to the network
  – The vector of predicted responses (which is an estimate of the original input vector) is computed
  – The prediction error is calculated and compared to the average prediction error for samples belonging to the category (as in SIMCA).
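The decision step above can be sketched as follows, assuming a trained Ninp-Nhid-Ninp network is available as a function `net` (here replaced by a toy stand-in that shrinks samples toward the category mean, purely for illustration):

```python
import numpy as np

def reconstruction_error(net, x):
    """Squared prediction error between a sample and its network estimate."""
    x_hat = net(x)                  # predicted responses ~ estimate of the input
    return np.sum((x - x_hat) ** 2)

def category_error(net, X_train):
    """Average prediction error over the category's own training samples."""
    return np.mean([reconstruction_error(net, x) for x in X_train])

# toy stand-in for a trained autoassociative network (assumption, not the
# actual MLF model): it reconstructs samples by shrinking toward the mean
mean = np.array([1.0, 2.0])
net = lambda x: mean + 0.9 * (x - mean)

X_train = np.array([[1.1, 2.0], [0.9, 1.9], [1.0, 2.2]])
s0 = category_error(net, X_train)
e = reconstruction_error(net, np.array([5.0, 5.0]))
print(e > s0)   # a far-away sample reconstructs much worse -> True
```

As in SIMCA, a sample whose error is much larger than the category's typical error falls outside the category space.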
NN-CM in practice
• Separate category autoscaling
• For each category C, train the network on the category training set X_train,C to obtain the weights W_C, the number of hidden nodes N_hid,C and the mean reconstruction variance s_0,C²
• For each test sample i:

x̂_i,test = f(x_i,test; W_C)
s_i,C² = (x_i,test − x̂_i,test)ᵀ (x_i,test − x̂_i,test) / N_V

• F_i,C = s_i,C² / s_0,C²
• If p(F ≥ F_i,C) is lower than a predefined threshold, the sample is rejected by the category model.
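The acceptance rule compares the variance ratio F_i,C with an F distribution. A minimal sketch with SciPy; the variances, degrees of freedom and significance level below are illustrative assumptions:

```python
from scipy.stats import f as f_dist

s_iC2, s_0C2 = 4.0, 1.0           # hypothetical sample and category variances
F_iC = s_iC2 / s_0C2              # F_iC = s_iC^2 / s_0C^2
dof1, dof2 = 10, 50               # assumed degrees of freedom
p = f_dist.sf(F_iC, dof1, dof2)   # p(F >= F_iC), survival function
alpha = 0.05                      # assumed threshold
print("rejected" if p < alpha else "accepted")
```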
The classical X-OR
• 200 training samples:
  – 100 class 1
  – 100 class 2
• 200 test samples:
  – 100 class 1
  – 100 class 2
• 3 hidden neurons for each category
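An X-OR-like two-class data set can be simulated as follows. The exact layout used in the talk is not given, so the quadrant assignment (class 1 in the (+,+)/(−,−) quadrants, class 2 in (+,−)/(−,+)) and the noise level are assumptions; only the 100-samples-per-class count comes from the slide:

```python
import numpy as np

rng = np.random.default_rng(1)

def xor_class(n, sign_pairs):
    """Draw n points in the quadrants given by sign_pairs (assumed layout)."""
    quads = rng.integers(len(sign_pairs), size=n)
    pts = np.abs(rng.normal(loc=1.0, scale=0.3, size=(n, 2)))
    return pts * np.array(sign_pairs)[quads]

class1 = xor_class(100, [(+1, +1), (-1, -1)])   # opposite quadrants
class2 = xor_class(100, [(+1, -1), (-1, +1)])
print(class1.shape, class2.shape)               # (100, 2) (100, 2)
```

This layout is not linearly separable, which is why it is a classical stress test for class-modeling methods.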
Results
• Sensitivity: 100% class 1, 100% class 2
• Specificity:
  – 75% class 1 vs class 2
  – 67% class 2 vs class 1
• Prediction ability:
  – 87% class 1
  – 83% class 2
  – 85% overall
• These results are significantly better than with SIMCA and UNEQ (specificities lower than 30% and classification only slightly higher than 60%)
CM of honey samples
• 76 samples of honey from 6 different botanical origins (honeydew, wildflower, sulla, heather, eucalyptus and chestnut)
• 11-13 samples per class
• 2 input variables: specific rotation; total acidity
• Despite the small number of samples, a good NN model was obtained (2 hidden neurons for each class)
• Possibility of drawing a Coomans' plot
Further work and Conclusions
• A novel approach to class-modeling based on multilayer feed-forward NNs was presented
• Preliminary results seem to indicate its usefulness in cases where traditional class-modeling fails
• The effect of training-set size should be further investigated (our "small" data set was too good to be used for obtaining a definitive answer)
• We are analyzing other "exotic" data sets for classification where traditional methods fail.