Multilayer feed-forward artificial neural networks for class-modeling
F. Marini, A. Magrì, R. Bucci
Dept. of Chemistry - University of Rome “La Sapienza”



The starting question….

Although the literature on NNs has grown significantly, no paper considers the possibility of performing class-modeling

[Bar chart: ANN papers published per year, 1982-2002, rising from 1 (1982) to 4916 (2001-2002)]

class modeling: what….

• Class modeling considers one class at a time
• Any object can then belong or not to that specific class model
• As a consequence, any object can be assigned to only one class, to more than one class, or to no class at all

[Diagram contrasting classification with class modeling]

….and why

• Flexibility
• Additional information:
– sensitivity: fraction of samples from category X accepted by the model of category X
– specificity: fraction of samples from category Y (or Z, W, …) refused by the model of category X
• No need to rebuild the existing models each time a new category is added
• A less equivocal answer to the question: “are the analytical data compatible with the product being X as declared?”

A first step forward

• A particular kind of NN, after suitable modifications, could be used for performing class-modeling (Anal. Chim. Acta, 544 (2005), 306):
– Kohonen SOM
– Addition of dummy random vectors to the training set
– Computation of a suitable (non-parametric) probability distribution after mapping on the 2D Kohonen layer
– Definition of the category space based on this distribution

In this communication…

…the possibility of using a different type of neural network (multilayer feed-forward) to perform class-modeling is studied
– How?
– Examples

Just a few words about NN

[Quotation slide, attributed to Sophocles]

NN: a mathematical approach

• From a computational point of view, ANNs represent a way to operate a non-linear functional mapping between an input and an output space:

y = f(x)

• This functional relation is expressed in an implicit way (in the case of MLF-NN, via a combination of suitably weighted non-linear functions)
• ANNs are usually represented as groups of elementary computational units (neurons) performing the same operations simultaneously
• Types of NN differ in how the neurons are grouped and how they operate

Multilayer feed-forward NN

• Individual processing units are organized in three types of layer: input, hidden and output
• All neurons within the same layer operate simultaneously

[Diagram: three-layer network with inputs x1 … x5, a hidden layer, and outputs y1 … y4]

The artificial neuron

[Diagram: hidden neuron k receives inputs x1, x2, x3 through weights w1k, w2k, w3k and applies the transfer function f() to produce zk]

z_k = f( Σ_i w_ik x_i + w_0k )

The artificial neuron

[Diagram: output neuron j receives hidden activations z1, z2, z3 through weights w1j, w2j, w3j and applies f() to produce yj]

y_j = f( Σ_k w_kj z_k + w_0j ) = f( Σ_k w_kj f( Σ_i w_ik x_i + w_0k ) + w_0j )
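The two-step forward pass above (each hidden neuron computing z_k = f(Σ_i w_ik x_i + w_0k), each output neuron computing y_j = f(Σ_k w_kj z_k + w_0j)) can be sketched in a few lines; Python and the tanh transfer function are illustrative assumptions, since the slides only require a suitable non-linear function:

```python
import numpy as np

def forward(x, W_hid, w0_hid, W_out, w0_out, f=np.tanh):
    """One forward pass through a single-hidden-layer MLF network.

    W_hid : (n_hid, n_in) weights, w0_hid : (n_hid,) biases
    W_out : (n_out, n_hid) weights, w0_out : (n_out,) biases
    The transfer function f (tanh here) is an assumption.
    """
    z = f(W_hid @ x + w0_hid)   # hidden activations z_k
    y = f(W_out @ z + w0_out)   # outputs y_j
    return y
```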

Training

• Iterative variation of the connection weights, to minimize an error criterion
• Usually, the backpropagation algorithm is used:

Δw_ij^P(t) = −η ∂E^P/∂w_ij + μ Δw_ij^P(t−1)

MLF class-modeling: what to do?

• The model for each category has to be built using only training samples from that category
• Suitable definition of the category space

Somewhere to start from

When the targets are set equal to the input values, the hidden nodes can be thought of as a sort of non-linear principal components

[Diagram: autoassociative input-hidden-input network mapping x1 … xm onto themselves; scores plot of the output values of hidden nodes 1 and 2]

… and a first ending point

• For each category, a neural network model is computed, providing the input vector also as the desired target vector:

Ninp-Nhid-Ninp

• The number of hidden neurons is estimated by LOO-CV (minimum reconstruction error in prediction)
• The optimized model is then used to predict unknown samples:
– The sample is presented to the network
– The vector of predicted responses (which is an estimate of the original input vector) is computed
– The prediction error is calculated and compared to the average prediction error for samples belonging to the category (as in SIMCA)
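The leave-one-out selection of the hidden-layer size can be sketched as below; `train_fn` and `predict_fn` are hypothetical stand-ins for fitting and applying the Ninp-Nhid-Ninp autoassociative network, since the slides do not fix a training implementation:

```python
import numpy as np

def loo_cv_error(X, n_hid, train_fn, predict_fn):
    """Leave-one-out mean reconstruction error for a candidate number
    of hidden neurons. train_fn(X_train, n_hid) fits a model;
    predict_fn(model, x) returns the reconstructed sample."""
    errs = []
    for i in range(len(X)):
        mask = np.arange(len(X)) != i          # leave sample i out
        model = train_fn(X[mask], n_hid)
        x_hat = predict_fn(model, X[i])
        errs.append(np.mean((x_hat - X[i]) ** 2))
    return float(np.mean(errs))

def choose_n_hidden(X, candidates, train_fn, predict_fn):
    """Pick the hidden-layer size with minimum LOO reconstruction error."""
    return min(candidates,
               key=lambda h: loo_cv_error(X, h, train_fn, predict_fn))
```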

NN-CM in practice

• Separate category autoscaling
• For each category C, a network is trained and the mean class reconstruction variance is computed:

W_C = NN(X_train,C ; N_hid,C) → s²_0,C

• Each test sample i is reconstructed by the category network and its residual variance computed:

x̂_i,test = f(x_i,test ; W_C)
s²_i,C = (x̂_i,test − x_i,test)ᵀ (x̂_i,test − x_i,test) / N_V

• The two variances are compared by an F-ratio:

F_i,C = s²_i,C / s²_0,C

• If p(F ≥ F_i,C) is lower than a predefined threshold, the sample is refused by the category model.
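The acceptance rule above can be sketched with SciPy's F distribution; the degrees of freedom passed by the caller are an assumption, since the slides do not spell them out:

```python
from scipy.stats import f as f_dist

def is_member(s_i_sq, s0_sq, df_sample, df_class, alpha=0.05):
    """SIMCA-like acceptance test: compute F_i = s_i^2 / s0^2 and
    refuse the sample when p(F >= F_i) falls below alpha.
    df_sample/df_class are the degrees of freedom of the two
    variance estimates (left to the caller; an assumption here)."""
    F_i = s_i_sq / s0_sq
    p = f_dist.sf(F_i, df_sample, df_class)   # survival function: p(F >= F_i)
    return bool(p >= alpha)
```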

A couple of examples

The classical X-OR
• 200 training samples:
– 100 class 1
– 100 class 2
• 200 test samples:
– 100 class 1
– 100 class 2
• 3 hidden neurons for each category
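A toy X-OR data set of this shape can be generated as below; the corner positions, noise level and seed are illustrative assumptions, since the slides do not describe how the samples were drawn:

```python
import numpy as np

def make_xor(n_per_class=100, noise=0.1, seed=0):
    """Two classes of 2-D points around the corners of the unit square:
    class 1 near (0,0) and (1,1), class 2 near (0,1) and (1,0).
    Noise level and seed are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    c1 = np.vstack([rng.normal(c, noise, (n_per_class // 2, 2))
                    for c in ([0, 0], [1, 1])])
    c2 = np.vstack([rng.normal(c, noise, (n_per_class // 2, 2))
                    for c in ([0, 1], [1, 0])])
    return c1, c2

X1, X2 = make_xor()
```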

Results
• Sensitivity:
– 100% class 1, 100% class 2
• Specificity:
– 75% class 1 vs class 2
– 67% class 2 vs class 1
• Prediction ability:
– 87% class 1
– 83% class 2
– 85% overall
• These results are significantly better than with SIMCA and UNEQ (specificities lower than 30% and classification only slightly higher than 60%)

A very small data-set: honey

CM of honey samples

• 76 samples of honey from 6 different botanical origins (honeydew, wildflower, sulla, heather, eucalyptus and chestnut)
• 11-13 samples per class
• 2 input variables: specific rotation; total acidity
• Despite the small number of samples, a good NN model was obtained (2 hidden neurons for each class)
• Possibility of drawing a Coomans’ plot

Further work and Conclusions

• A novel approach to class-modeling based on multilayer feed-forward NNs was presented
• Preliminary results seem to indicate its usefulness in cases where traditional class-modeling fails
• The effect of the training-set dimension should be further investigated (our “small” data set was too good to be used for obtaining a definitive answer)
• We are analyzing other “exotic” data sets for classification where traditional methods fail.

Acknowledgements

• Prof. Jure Zupan, Slovenia