TRANSCRIPT — vasighi/courses/ann97win/ann02.pdf

Part 2
Artificial Neural Networks
Introduction — Artificial Neuron Model

The following simplified model of real neurons is also known as a Threshold Logic Unit.

Biological Terminology        Artificial Neural Network Terminology
Neuron                        Node / Unit / Cell / Neurode
Synapse                       Connection / Edge / Link
Synaptic Efficiency           Connection Strength / Weight
Firing Frequency              Node Output

[Figure: the McCulloch-Pitts neuron (1943) — inputs x1, x2, …, xn enter the body of the neuron through weights w1, w2, …, wn; the body applies f and produces out]
Introduction — McCulloch-Pitts neuron

[Figure: inputs x1, …, xn with weights w1, …, wn feeding the activation f to give out]

The neuron output may be written as:

out = f(w1·x1 + … + wn·xn) = f( Σ_{i=1..n} wi·xi ) = f(wᵀx)

where net = Σ_{i=1..n} wi·xi = wᵀx, so out = f(net).
Introduction — McCulloch-Pitts neuron — Activation functions

Typical activation functions used are:

Step functions (hard-limiting activation functions)

Bipolar binary:   f(net) = sgn(net) = { +1 if net ≥ θ;  −1 if net < θ }
Unipolar binary:  f(net) = sgn(net) = {  1 if net ≥ θ;   0 if net < θ }

• The step function is very easy to implement.
• The output cannot grow to values of excessively high magnitude.
• The outputs of the step function may be interpreted as class identifiers.

[Figure: plots of f(net) versus net for the bipolar and unipolar step functions]
Introduction — McCulloch-Pitts neuron — Activation functions

Typical activation functions used are:

Ramp functions

Bipolar binary:   f(net) = {  1 if net ≥ 1;  −1 if net ≤ −1;  net otherwise }
Unipolar binary:  f(net) = {  1 if net ≥ 1;   0 if net ≤ 0;   net otherwise }

[Figure: plots of f(net) versus net for the bipolar and unipolar ramp functions]
Introduction — McCulloch-Pitts neuron — Activation functions

Typical activation functions used are:

Sigmoid functions

Hyperbolic tangent sigmoid:  f(net) = 2 / (1 + exp(−2·net)) − 1
Log-sigmoid:                 f(net) = 1 / (1 + exp(−net))

Sigmoid functions are:
• real-valued
• differentiable
Introduction — McCulloch-Pitts neuron — Activation functions

Typical activation functions used are:

Gaussian functions

f(net) = ( 1 / (√(2π)·σ) ) · exp( −(1/2)·((net − μ)/σ)² )

[Figure: bell-shaped plot of f(net) versus net]
Introduction — McCulloch-Pitts neuron — Activation functions

A) Show that the following identity is true:

(1 − tanh x) / (1 + tanh x) = e^(−2x)

B) Write a simple Matlab m-file and plot the following activation functions:
 o A sigmoid function.
 o A Gaussian function.
Suppose the range is −5 to 5. Try to show both plots in a single figure using the subplot function.
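The exercise asks for Matlab; as a rough sketch of the same computations in Python instead (an assumption, not part of the course materials), part A's identity can be checked numerically and part B's two functions sampled over −5 to 5:

```python
import math

def logsig(net):
    # Log-sigmoid: f(net) = 1 / (1 + exp(-net))
    return 1.0 / (1.0 + math.exp(-net))

def gaussian(net, mu=0.0, sigma=1.0):
    # Gaussian: f(net) = 1/(sqrt(2*pi)*sigma) * exp(-((net - mu)/sigma)**2 / 2)
    return math.exp(-0.5 * ((net - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

# Part A: verify (1 - tanh x) / (1 + tanh x) = exp(-2x) at a few points
for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    lhs = (1.0 - math.tanh(x)) / (1.0 + math.tanh(x))
    assert abs(lhs - math.exp(-2.0 * x)) < 1e-9

# Part B: sample both functions over the range -5 to 5
xs = [-5.0 + 0.1 * i for i in range(101)]
sig_vals = [logsig(x) for x in xs]
gauss_vals = [gaussian(x) for x in xs]
print(round(logsig(0.0), 4), round(gaussian(0.0), 4))  # 0.5 0.3989
```

Replacing the final sampling step with Matlab's plot and subplot calls gives the single figure the exercise requests.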
Introduction — Perceptron

An arrangement of one input layer of McCulloch-Pitts (M-P) neurons feeding forward to one output layer of McCulloch-Pitts neurons is known as a Perceptron.

[Figure: inputs x1, …, xn, each connected through weights to every output unit o1, o2, …, om]

sgn(net) = { 1 if net ≥ θ;  0 if net < θ }

out_j = sgn( Σ_{i=1..n} x_i·w_ij − θ_j )

where w_ij are the connection weights between layer i and layer j.
We can use McCulloch-Pitts neurons to implement the basic logic gates:

NOT gate (one input, e.g. weight w = −1 with threshold θ = −0.5):

input   Output
0       1
1       0

AND gate:

input   Output
0 0     0
0 1     0
1 0     0
1 1     1

The required outputs give the constraints:

w1·0 + w2·0 − θ < 0  ⇒  θ > 0
w1·0 + w2·1 − θ < 0  ⇒  w2 < θ
w1·1 + w2·0 − θ < 0  ⇒  w1 < θ
w1·1 + w2·1 − θ ≥ 0  ⇒  w1 + w2 ≥ θ
Introduction — Perceptron

We can use McCulloch-Pitts neurons to implement the basic logic gates:

OR gate:

input   Output
0 0     0
0 1     1
1 0     1
1 1     1

The required outputs give the constraints:

w1·0 + w2·0 − θ < 0  ⇒  θ > 0
w1·0 + w2·1 − θ ≥ 0  ⇒  w2 ≥ θ
w1·1 + w2·0 − θ ≥ 0  ⇒  w1 ≥ θ
w1·1 + w2·1 − θ ≥ 0  ⇒  w1 + w2 ≥ θ

These are easy to satisfy, e.g. w1 = w2 = 2 with any threshold 0 < θ ≤ 2.
XOR gate:

input   Output
0 0     0
0 1     1
1 0     1
1 1     0

The required outputs give the constraints:

w1·0 + w2·0 − θ < 0  ⇒  θ > 0
w1·0 + w2·1 − θ ≥ 0  ⇒  w2 ≥ θ
w1·1 + w2·0 − θ ≥ 0  ⇒  w1 ≥ θ
w1·1 + w2·1 − θ < 0  ⇒  w1 + w2 < θ

These constraints are contradictory: w1 ≥ θ and w2 ≥ θ with θ > 0 force w1 + w2 ≥ 2θ > θ, so no single M-P neuron can implement XOR.
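A quick way to sanity-check these constraint derivations is to code the unipolar M-P unit directly. The weight/threshold values below are sample solutions of the constraints above (the NOT parameters are one workable choice, not quoted from the slides):

```python
def mp_neuron(weights, theta):
    # Unipolar M-P unit: output 1 when the weighted input sum reaches theta
    def fire(*inputs):
        net = sum(w * x for w, x in zip(weights, inputs))
        return 1 if net >= theta else 0
    return fire

NOT_gate = mp_neuron([-1.0], theta=-0.5)     # needs -1 < theta <= 0
AND_gate = mp_neuron([1.5, 1.5], theta=3.0)  # w1 < theta, w2 < theta, w1 + w2 >= theta
OR_gate = mp_neuron([2.0, 2.0], theta=1.0)   # w1 >= theta, w2 >= theta, theta > 0

for a in (0, 1):
    assert NOT_gate(a) == 1 - a
    for b in (0, 1):
        assert AND_gate(a, b) == (a & b)
        assert OR_gate(a, b) == (a | b)
print("NOT, AND and OR all reproduced")
```

No weight choice makes the same unit pass the XOR truth table, as the contradiction above shows.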
AND

[Figure: the four input points (0,0), (0,1), (1,0) in one class and (1,1) in the other, with a single unit taking inputs x and y through weights w1, w2 and producing out = sgn(w1·x + w2·y − θ)]

The decision boundary w1·x + w2·y − θ = 0 can be rewritten as:

y = −(w1/w2)·x + θ/w2

For example, (w1, w2) = (1.5, 1.5) with θ = 3 implements AND; the constraints also admit unequal weights such as (w1, w2) = (1.7, 1.2) with a threshold in (1.7, 2.9].
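The boundary-line form can be checked numerically for the equal-weight AND solution (w1 = w2 = 1.5, θ = 3): the boundary is the line x + y = 2, and firing corresponds to lying on or above it.

```python
w1, w2, theta = 1.5, 1.5, 3.0  # the AND solution from the text

def boundary_y(x):
    # w1*x + w2*y - theta = 0  rearranged to  y = -(w1/w2)*x + theta/w2
    return -(w1 / w2) * x + theta / w2

for (x, y), label in {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}.items():
    out = 1 if w1 * x + w2 * y >= theta else 0
    assert out == label
    # Firing corresponds exactly to lying on or above the boundary line
    assert (y >= boundary_y(x)) == bool(label)
print("boundary y = -x + 2 separates AND correctly")
```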
[Figure: the four input points plotted for AND, OR, and XOR — for AND and OR a single straight line separates the two classes; for XOR no such line exists, hence the "?"]

It was the proof by Minsky & Papert in 1969 that Perceptrons could only learn linearly separable functions that led to a decline in neural network research until the mid 1980s, when it was shown that other network architectures could learn these types of functions.
Introduction Perceptron
It would simplify the mathematics if we could treat the neuron threshold as if it were just another connection weight:

Σ_{i=1..n} x_i·w_ij − θ_j = x1·w1j + x2·w2j + x3·w3j + … + xn·wnj − θ_j

It is easy to see that if we define w0j = −θj and x0 = 1 then this becomes:

Σ_{i=1..n} x_i·w_ij − θ_j = x1·w1j + x2·w2j + … + xn·wnj + x0·w0j = Σ_{i=0..n} x_i·w_ij

We just have to include an extra input unit with activation x0 = 1, and then we only need to compute "weights", and no explicit thresholds.

[Figure: a unit with inputs x1, …, xn plus a bias input fixed at 1, weights w1, w2, …, and output out]
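The equivalence can be demonstrated in a few lines of Python (a sketch; the AND weights reuse the example from the earlier slides):

```python
import itertools

def out_with_threshold(x, w, theta):
    # Original form: compare the weighted sum against an explicit threshold
    net = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if net >= theta else 0

def out_with_bias(x, w, theta):
    # Bias trick: prepend x0 = 1 with weight w0 = -theta, compare against 0
    net = -theta * 1.0 + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if net >= 0 else 0

w, theta = [1.5, 1.5], 3.0  # the AND unit from the earlier slides
for x in itertools.product((0, 1), repeat=2):
    assert out_with_threshold(x, w, theta) == out_with_bias(x, w, theta)
print("explicit threshold and bias weight agree on all inputs")
```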
Introduction — Neural Network Architectures

Now we will discuss artificial neural network architectures, some of which derive inspiration from biological neural networks.

Fully connected networks: every node is connected to every node, and these connections may be either excitatory (positive weights), inhibitory (negative weights), or irrelevant (almost-zero weights).
• large number of parameters
• difficult learning
• biologically implausible
Layered networks

These are networks in which nodes are partitioned into subsets called layers, with no connections leading from layer j to layer k if j > k. Connections, with arbitrary weights, may exist from any node in layer i to any node in layer j for j ≥ i; in particular, intra-layer connections may exist.
Layered networks → Acyclic networks

In acyclic networks there are no intra-layer connections. In other words, a connection may exist between any node in layer i and any node in layer j for j > i, but no connection is allowed when i = j.
Layered networks → Acyclic networks → Feed-forward networks

This is a subclass of acyclic networks in which a connection is allowed from a node in layer i only to nodes in layer i + 1. These networks can be described by a sequence of numbers indicating the number of nodes in each layer, e.g. a feed-forward 3-2-3-2 network. Networks that do contain cycles are called recurrent neural networks.
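A feed-forward 3-2-3-2 network like the one described above can be sketched as a plain forward pass (an illustrative sketch with random weights and log-sigmoid units; the slides do not prescribe an implementation):

```python
import math
import random

def forward(layer_sizes, x, seed=0):
    """Forward pass through a feed-forward net given as a list of layer
    sizes, e.g. [3, 2, 3, 2]; nodes connect only to the next layer."""
    rng = random.Random(seed)
    activations = list(x)
    for k in range(len(layer_sizes) - 1):
        next_act = []
        for _ in range(layer_sizes[k + 1]):
            # One bias weight plus one weight per incoming connection
            bias = rng.uniform(-1.0, 1.0)
            net = bias + sum(rng.uniform(-1.0, 1.0) * a for a in activations)
            next_act.append(1.0 / (1.0 + math.exp(-net)))  # log-sigmoid unit
        activations = next_act
    return activations

out = forward([3, 2, 3, 2], [0.5, -1.0, 2.0])
print(len(out))  # the final layer has 2 nodes
```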
Introduction — Perceptron — Example

A typical neural network application is classification.

Mass  Speed  Class
1.0   0.1    Bomber
2.0   0.2    Bomber
0.1   0.3    Fighter
2.0   0.3    Bomber
0.2   0.4    Fighter
3.0   0.4    Bomber
0.1   0.5    Fighter
1.5   0.5    Bomber
0.5   0.6    Fighter
1.6   0.7    Fighter

How do we construct a neural network that can classify airplanes?
General Procedure for Building Neural Networks
1. Understand and specify your problem in terms of
inputs and required outputs
2. Take the simplest form of network you think
might be able to solve your problem
3. Try to find appropriate connection weights
(including neuron thresholds)
4. Make sure that the network works on its
training and test data.
5. If the network doesn’t perform well enough, go
back to stage 3 and try harder
6. If the network still doesn’t perform well enough,
go back to stage 2
Introduction — Perceptron — Example

A typical neural network application is classification. How do we construct a neural network that can classify the airplanes in the table above?
• The inputs can be direct encodings of the masses and speeds.
• We have two classes, so just one output unit is enough, with activation 1 for 'fighter' and 0 for 'bomber' (or vice versa).
• The simplest network to try first is a simple Perceptron.
[Figure: a single unit with inputs Mass and Speed plus a bias input, weights w1, w2, and w0]

Class = sgn( w0 + w1·Mass + w2·Speed )
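With the data above, hand-picked weights are enough to show a single Perceptron suffices. The particular values w0 = 0, w1 = −0.4, w2 = 1 (boundary Speed = 0.4·Mass) are an illustrative assumption, not the course's chosen solution:

```python
# (mass, speed) -> class, encoded as 1 for Fighter and 0 for Bomber
data = [
    (1.0, 0.1, 0), (2.0, 0.2, 0), (0.1, 0.3, 1), (2.0, 0.3, 0),
    (0.2, 0.4, 1), (3.0, 0.4, 0), (0.1, 0.5, 1), (1.5, 0.5, 0),
    (0.5, 0.6, 1), (1.6, 0.7, 1),
]

w0, w1, w2 = 0.0, -0.4, 1.0  # hypothetical weights: fighter iff Speed > 0.4 * Mass

def classify(mass, speed):
    # Class = sgn(w0 + w1*Mass + w2*Speed), with sgn giving 1 for positive net
    return 1 if w0 + w1 * mass + w2 * speed > 0 else 0

assert all(classify(m, s) == c for m, s, c in data)
print("all 10 airplanes classified correctly")
```

Note that even the heavy fighter (1.6, 0.7) lands on the correct side of this boundary, so the data set is linearly separable.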
Neural Network Toolbox

You can experiment with a two-element neuron by running the demo nnd2n2.

You might also want to try the demo nnd4db. It allows you to pick new input vectors and change the decision boundary to monitor weight and bias changes.