structure of artificial neural networksplima/shortcourse/... · 2020-01-22 · structure of...

15
STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Neurons are organized in layers. Different layers may perform different kinds of transformations on their inputs. Signals travel from the first (input), to the last (output) layer. Neural networks have been used on a variety of tasks, including computer vision, speech recognition, machine translation, social network filtering. Pedro Miguel Lima ( CENTRO DE MATEM ATICA COMPUTACIONAL E ESTOC Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery January 22, 2020 1 / 15

Upload: others

Post on 31-Jul-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

STRUCTURE OF ARTIFICIAL NEURAL NETWORKS

Neurons are organized in layers. Different layers may perform differentkinds of transformations on their inputs. Signals travel from the first(input), to the last (output) layer.Neural networks have been used on a variety of tasks, including computervision, speech recognition, machine translation, social network filtering.

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 1 / 15

Page 2: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

STRUCTURE OF ARTIFICIAL NEURAL NETWORKS

Biological neuron (1)- dendrites (2), cell body (3); Artificial neuron (4)-acumulator (5), connection (6); Artificial neural network (7).

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 2 / 15

Page 3: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

NEURAL NETWORKS AS MATHEMATICAL TOOLS

Neural network = Learning machineThe knowledge acquired from the samples can be generalized to arbitrarydata.How to build a neural network?Use statistical learning (includes optimization, approximation theory,probability theory,...).X - input space; Y - output space;h: - X → YKnowing the output of h for a finite set of samples S ⊂ X , we want tocompute its value for every x ∈ X .The criterium for choosing h is the minimization of risk functional R(h).(probability of getting a wrong answer).Usually we cannot evaluate the exact risk functional, but only a certainempirical risk functional Remp(h).Instead R(h) we minimize Remp(h).

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 3 / 15

Page 4: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

NEURAL NETWORKS IN IMAGE PROCESSING

Suppose you want to use an artificial neural network to distinguish imagesof cats from dogs.

First you must have a certain set of images (samples) which arerecognized as cats of dogs.

The neural network is trained by processing these images andcomparing the obtained answers with the correct ones.

Based on this processing the network creates a rule to distinguish catsfrom dogs, which is stored in the artificial neurons.The more samplesare used, the more reliable is the rule.

Each time an arbitrary image enters the network an answer isobtained applying the rule.

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 4 / 15

Page 5: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

NEURAL NETWORKS IN IMAGE PROCESSING

Training process (9-12): Rules are created based on previously classifiedimages. Application (13): the network is tested using new images.

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 5 / 15

Page 6: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

NEURAL NETWORKS IN ECONOMICS

Example

R. Heibrich, M. Keilbach, T. Graepe, P. Bollmann-Sdorra, K. Obermayer,Neural Networks in Economics.http://www.researchgate.net/publication/249900701Task : evaluate the reliability of bank customers concerning paying back agiven loan.X - set of bank clients; x = (x1, x2, ..., xn), ∀x ∈ X (age, sex, income,...)Y = {−1, 1} (the answer is 1, if the client is reliable, and −1, if he is notreliable.)Set of samples (training set): S = {(xt, yt) xt ∈ X , yt ∈ Y , t = 1, ..., l}Objective: define h : X → Y of a certain type, such that h defines thereliability of any customer x ∈ X . h is called a classifier.

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 6 / 15

Page 7: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

NEURAL NETWORKS IN ECONOMICS

Risk functional: R(h) =∫XY L(y , h(x))PXY dxdy ,

where L- loss function ,

L(y , y ′) =

{0, ify = y ′;1, ify 6= y ′.

PXY - probability that a random client is reliable (y = 1) or unreliable(y = −1).The goal is to minimize R(h).Empirical risk functional:

Remp(h) =1

l ∑(xt,yt )∈S

L(yt , h(xt)) =1

l ∑(xt,yt )∈S

(h(xt)− yt)2/4.

Since we cannot in general evaluate R(h), we replace the minimization ofR(h) by the minimization of Remp(h).

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 7 / 15

Page 8: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

NEURAL NETWORKS IN ECONOMICS

h∗ - minimizer of R; hl - minimizer of Remp.According to the principle of empirical risk minimization as l (cardinal ofS) tends to infinity, |Remp(h)− R(h)| → 0, and therefore if l is largeenough hl is a good approximation h∗.This problem can be solved by different types of neural networks,depending on the form of the classifier h.Let’s consider here the simplest type of neural network, called perceptron.In this case, we have only one hidden layer. The scheme of the perceptronis as follows:

input layer: X = (x1, ..., xn); dim(X )=n.

hidden layer: α = (α1, ..., αn); r =dim(α) = n.

output layer: y ∈ {−1, 1};dim(Y )=1 (the output consists only ofone number).

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 8 / 15

Page 9: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

Scheme of a perceptron

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 9 / 15

Page 10: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

NEURAL NETWORKS IN ECONOMICS

Each classifier h is defined by a n-dimensional vector α = (α1, ..., αn),αi ∈ IR:

h(x, α) = sign

(n

∑i=1

αixi

)= sign(〈α, x〉).

Given a training example (xt, yt) we say thatxt is classified, if h(xt, α) = yt ;xt is misclassified, if h(xt, α) 6= yt .Our goal is that all the samples are classified.

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 10 / 15

Page 11: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

NEURAL NETWORKS IN ECONOMICS

In terms of α, the empirical risk functional Remp(h) is defined as

Remp(h) =1

l ∑(xt,yt )∈S

(h(xt)− yt)2/4 =

1

l ∑(xt,yt )∈S

(sign(〈α, xt〉)− yt)2 /4.

In order to minimize Remp(h) we use the following iterative process:

Set initial value α0 ∈ IRn (randomly chosen);

While misclassified training examples exist

compute h(xt, αt);

If xt is misclassified

then αt+1 = αt − 12 (sign(〈α, xt〉)− yt) xt ;EndIf

End While

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 11 / 15

Page 12: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

NEURAL NETWORKS IN ECONOMICS

Besides perceptron, different kinds of neural networks could be applied.

Multilayer Perceptron

In this case, the classificator has the form:

h(x, β, γ) = f2(f1(x, β), γ),

where

f1(x, β) = (g1(〈x, β1〉), ..., (g1(〈x, βr 〉)), f2(z, γ) = g2(〈z, γ〉);

here g1 and g2 are sigmoidal functions; β1, ...βr , γ are vectors that shouldbe optimized.

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 12 / 15

Page 13: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

NEURAL NETWORKS IN ECONOMICS

Radial Basis Functions

h(x, β, γ, σ) = f2(f1(x, β, σ), γ),

where

f1(x, β, σ) = (g1(x, β1, σ1), ..., (g1(x, βr , σr )), f2(z, γ) = g2(〈z, γ〉);

here g1(x , β, σ) = exp(− ‖x−β‖2

2σ2

); g2 is a sigmoidal function.

β1, ...βr , σ1, ..., σr , γ are vectors that should be optimized.

In all the different types of neural networks we have to minimize theemiprical risk function using an iterative procedure.

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 13 / 15

Page 14: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

BIBLIOGRAPHY

1 Bear, Mark F., Berry W. Connors, and Michael A. Paradiso (Eds.),Neuroscience: exploring the brain, Philadelphia, PA, USA, 2016.

2 Bishop, Christopher, Pattern Recognition and Machine Learning, NewYork, Springer Verlag, 2006.

3 Cerebro, ed. by Fundacao Calouste Gulbenkian, Lisbon, 2019.

4 De Felipe, Javier, and Larry Swanson, The Beautiful Brain: theDrawings of Ramon y Cajal, New York, Abbrams and Chronicle Book,2017.

5 Greenstein, B., and Greenstein, A., Color Atlas of Neuroscience,Neuroanathomy and Neurophysiology, Stutgart, George ThiemeVerlag, 2000. 2006/08/29/the-discovery-of-the-neuron.

6 R. Heibrich, M. Keilbach, T. Graepe, P. Bollmann-Sdorra, K.Obermayer, Neural Networks in Economics,http://www.researchgate.net/publication/249900701.

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 14 / 15

Page 15: STRUCTURE OF ARTIFICIAL NEURAL NETWORKSplima/ShortCourse/... · 2020-01-22 · STRUCTURE OF ARTIFICIAL NEURAL NETWORKS Biological neuron (1)- dendrites (2), cell body (3);Arti cial

BIBLIOGRAPHY

7 Mo Constandi, The Discovery of Neuron,https://neurophilosophy.wordpress.com/,2006.

8 Netter, F., Craig, J., Perkins, J., Hansen, T., and Koeppen, B., AtlasNeuroanathomy and Neurophysiology, Ebook: Icon CostumCommunication, 2002.

9 Nieuwenhuys,R., Donkelaar, and Nicholson, C., The Central NervousSystem of Vertebrates, Berlin, Spriger,1998.

Pedro Miguel Lima ( CENTRO DE MATEMATICA COMPUTACIONAL E ESTOCASTICA INSTITUTO SUPERIOR TECNICO UNIVERSIDADE DE LISBOA PORTUGAL)Mathematical Models in Neuroscience and their Applications Lecture 1 - from the discovery of neuron to neural networksJanuary 22, 2020 15 / 15