structure of artificial neural networksplima/shortcourse/... · 2020-01-22 · structure of...

STRUCTURE OF ARTIFICIAL NEURAL NETWORKS

Neurons are organized in layers. Different layers may perform differentkinds of transformations on their inputs. Signals travel from the first(input), to the last (output) layer.Neural networks have been used on a variety of tasks, including computervision, speech recognition, machine translation, social network filtering.

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 1 / 15

STRUCTURE OF ARTIFICIAL NEURAL NETWORKS

Biological neuron (1)- dendrites (2), cell body (3); Artificial neuron (4)-acumulator (5), connection (6); Artificial neural network (7).


NEURAL NETWORKS AS MATHEMATICAL TOOLS

Neural network = Learning machineThe knowledge acquired from the samples can be generalized to arbitrarydata.How to build a neural network?Use statistical learning (includes optimization, approximation theory,probability theory,...).X - input space; Y - output space;h: - X → YKnowing the output of h for a finite set of samples S ⊂ X , we want tocompute its value for every x ∈ X .The criterium for choosing h is the minimization of risk functional R(h).(probability of getting a wrong answer).Usually we cannot evaluate the exact risk functional, but only a certainempirical risk functional Remp(h).Instead R(h) we minimize Remp(h).


NEURAL NETWORKS IN IMAGE PROCESSING

Suppose you want to use an artificial neural network to distinguish imagesof cats from dogs.

First you must have a certain set of images (samples) which arerecognized as cats of dogs.

The neural network is trained by processing these images andcomparing the obtained answers with the correct ones.

Based on this processing the network creates a rule to distinguish catsfrom dogs, which is stored in the artificial neurons.The more samplesare used, the more reliable is the rule.

Each time an arbitrary image enters the network an answer isobtained applying the rule.


NEURAL NETWORKS IN IMAGE PROCESSING

Training process (9-12): Rules are created based on previously classifiedimages. Application (13): the network is tested using new images.


NEURAL NETWORKS IN ECONOMICS

Example

R. Heibrich, M. Keilbach, T. Graepe, P. Bollmann-Sdorra, K. Obermayer,Neural Networks in Economics.http://www.researchgate.net/publication/249900701Task : evaluate the reliability of bank customers concerning paying back agiven loan.X - set of bank clients; x = (x1, x2, ..., xn), ∀x ∈ X (age, sex, income,...)Y = {−1, 1} (the answer is 1, if the client is reliable, and −1, if he is notreliable.)Set of samples (training set): S = {(xt, yt) xt ∈ X , yt ∈ Y , t = 1, ..., l}Objective: define h : X → Y of a certain type, such that h defines thereliability of any customer x ∈ X . h is called a classifier.



Risk functional: R(h) =∫XY L(y , h(x))PXY dxdy ,

where L- loss function ,

L(y , y ′) =

{0, ify = y ′;1, ify 6= y ′.

PXY - probability that a random client is reliable (y = 1) or unreliable(y = −1).The goal is to minimize R(h).Empirical risk functional:

Remp(h) =1

l ∑(xt,yt )∈S

L(yt , h(xt)) =1

l ∑(xt,yt )∈S

(h(xt)− yt)2/4.

Since we cannot in general evaluate R(h), we replace the minimization ofR(h) by the minimization of Remp(h).



h∗ - minimizer of R; hl - minimizer of Remp.According to the principle of empirical risk minimization as l (cardinal ofS) tends to infinity, |Remp(h)− R(h)| → 0, and therefore if l is largeenough hl is a good approximation h∗.This problem can be solved by different types of neural networks,depending on the form of the classifier h.Let’s consider here the simplest type of neural network, called perceptron.In this case, we have only one hidden layer. The scheme of the perceptronis as follows:

input layer: X = (x1, ..., xn); dim(X )=n.

hidden layer: α = (α1, ..., αn); r =dim(α) = n.

output layer: y ∈ {−1, 1};dim(Y )=1 (the output consists only ofone number).


Scheme of a perceptron



Each classifier h is defined by a n-dimensional vector α = (α1, ..., αn),αi ∈ IR:

h(x, α) = sign

(n

∑i=1

αixi

)= sign(〈α, x〉).

Given a training example (xt, yt) we say thatxt is classified, if h(xt, α) = yt ;xt is misclassified, if h(xt, α) 6= yt .Our goal is that all the samples are classified.



In terms of α, the empirical risk functional Remp(h) is defined as

Remp(h) =1

l ∑(xt,yt )∈S

(h(xt)− yt)2/4 =

1

l ∑(xt,yt )∈S

(sign(〈α, xt〉)− yt)2 /4.

In order to minimize Remp(h) we use the following iterative process:

Set initial value α0 ∈ IRn (randomly chosen);

While misclassified training examples exist

compute h(xt, αt);

If xt is misclassified

then αt+1 = αt − 12 (sign(〈α, xt〉)− yt) xt ;EndIf

End While



Besides perceptron, different kinds of neural networks could be applied.

Multilayer Perceptron

In this case, the classificator has the form:

h(x, β, γ) = f2(f1(x, β), γ),

where

f1(x, β) = (g1(〈x, β1〉), ..., (g1(〈x, βr 〉)), f2(z, γ) = g2(〈z, γ〉);

here g1 and g2 are sigmoidal functions; β1, ...βr , γ are vectors that shouldbe optimized.



Radial Basis Functions

h(x, β, γ, σ) = f2(f1(x, β, σ), γ),

where

f1(x, β, σ) = (g1(x, β1, σ1), ..., (g1(x, βr , σr )), f2(z, γ) = g2(〈z, γ〉);

here g1(x , β, σ) = exp(− ‖x−β‖2

2σ2

); g2 is a sigmoidal function.

β1, ...βr , σ1, ..., σr , γ are vectors that should be optimized.

In all the different types of neural networks we have to minimize theemiprical risk function using an iterative procedure.


BIBLIOGRAPHY

1 Bear, Mark F., Berry W. Connors, and Michael A. Paradiso (Eds.),Neuroscience: exploring the brain, Philadelphia, PA, USA, 2016.

2 Bishop, Christopher, Pattern Recognition and Machine Learning, NewYork, Springer Verlag, 2006.

3 Cerebro, ed. by Fundacao Calouste Gulbenkian, Lisbon, 2019.

4 De Felipe, Javier, and Larry Swanson, The Beautiful Brain: theDrawings of Ramon y Cajal, New York, Abbrams and Chronicle Book,2017.

5 Greenstein, B., and Greenstein, A., Color Atlas of Neuroscience,Neuroanathomy and Neurophysiology, Stutgart, George ThiemeVerlag, 2000. 2006/08/29/the-discovery-of-the-neuron.

6 R. Heibrich, M. Keilbach, T. Graepe, P. Bollmann-Sdorra, K.Obermayer, Neural Networks in Economics,http://www.researchgate.net/publication/249900701.


BIBLIOGRAPHY

7 Mo Constandi, The Discovery of Neuron,https://neurophilosophy.wordpress.com/,2006.

8 Netter, F., Craig, J., Perkins, J., Hansen, T., and Koeppen, B., AtlasNeuroanathomy and Neurophysiology, Ebook: Icon CostumCommunication, 2002.

9 Nieuwenhuys,R., Donkelaar, and Nicholson, C., The Central NervousSystem of Vertebrates, Berlin, Spriger,1998.


structure of artificial neural networksplima/shortcourse/... · 2020-01-22 · structure of...

Documents