Artificial Neural Networks (vasighi/courses/ann97win/ann02.pdf)

Part 2 Artificial Neural Networks


Page 1

Part 2

Artificial Neural Networks

Page 2

Introduction Artificial Neuron Model

The following simplified model of real neurons is also known as a Threshold Logic Unit.

Biological Terminology     Artificial Neural Network Terminology
Neuron                     Node / Unit / Cell / Neurode
Synapse                    Connection / Edge / Link
Synaptic Efficiency        Connection Strength / Weight
Firing Frequency           Node Output

[Figure: McCulloch-Pitts neuron (1943). Inputs x1, x2, ..., xn are weighted by w1, w2, ..., wn and summed in the body of the neuron; the activation function f produces the output out.]

Page 3

Introduction


The neuron output may be written as:

$out = f(w_1 x_1 + \cdots + w_n x_n) = f\left(\sum_{i=1}^{n} w_i x_i\right) = f(\mathbf{w}^T \mathbf{x})$

where $net = \sum_{i=1}^{n} w_i x_i = \mathbf{w}^T \mathbf{x}$.

McCulloch-Pitts neuron
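As a minimal MATLAB sketch of this formula (the weight and input values are illustrative, not from the slides):

    % Output of a single McCulloch-Pitts neuron: out = f(w'x)
    w = [0.5; -0.3; 0.8];            % connection weights w1..wn (example values)
    x = [1; 0; 1];                   % inputs x1..xn (example values)
    net = w' * x;                    % net = sum_i wi*xi = w'x
    f   = @(net) double(net >= 0);   % unipolar hard limiter with theta = 0
    out = f(net)                     % here net = 1.3, so out = 1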

Page 4

Introduction

Typical activation functions used are:

Step functions

Bipolar binary: $f(net) = \operatorname{sgn}(net) = \begin{cases} +1, & net \geq \theta \\ -1, & net < \theta \end{cases}$

Unipolar binary: $f(net) = \operatorname{sgn}(net) = \begin{cases} 1, & net \geq \theta \\ 0, & net < \theta \end{cases}$

These are hard-limiting activation functions.

• The step function is very easy to implement.
• The output cannot grow to values of excessively high magnitude.
• The outputs of the step function may be interpreted as class identifiers.

McCulloch-Pitts neuron Activation functions

[Plots: f(net) versus net for the bipolar and unipolar step functions.]
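A short MATLAB sketch of both step functions, assuming theta = 0 (any threshold works the same way):

    % Hard-limiting (step) activations with threshold theta
    theta = 0;                                   % threshold (assumed value)
    f_bipolar  = @(net) 2*(net >= theta) - 1;    % +1 for net >= theta, else -1
    f_unipolar = @(net) double(net >= theta);    %  1 for net >= theta, else  0
    net = linspace(-2, 2, 401);
    plot(net, f_bipolar(net), net, f_unipolar(net));
    legend('bipolar binary', 'unipolar binary'); xlabel('net'); ylabel('f(net)');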

Page 5

Introduction

Typical activation functions used are:

Ramp functions

Unipolar ramp: $f(net) = \begin{cases} 1, & net \geq 1 \\ net, & 0 < net < 1 \\ 0, & net \leq 0 \end{cases}$

Bipolar ramp: $f(net) = \begin{cases} 1, & net \geq 1 \\ net, & -1 < net < 1 \\ -1, & net \leq -1 \end{cases}$

McCulloch-Pitts neuron Activation functions

[Plots: f(net) versus net for the two ramp functions.]
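The same functions as MATLAB one-liners (a sketch; min/max clipping is one of several equivalent ways to write the ramp):

    % Ramp activations: linear between the saturation limits
    f_unipolar = @(net) min(max(net, 0), 1);     % 0 / net / 1
    f_bipolar  = @(net) min(max(net, -1), 1);    % -1 / net / +1
    net = linspace(-2, 2, 401);
    plot(net, f_bipolar(net), net, f_unipolar(net));
    legend('bipolar ramp', 'unipolar ramp'); xlabel('net'); ylabel('f(net)');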

Page 6

Introduction

Typical activation functions used are:

Sigmoid functions

Hyperbolic tangent sigmoid: $f(net) = \frac{2}{1 + \exp(-2\,net)} - 1$

Logistic (unipolar) sigmoid: $f(net) = \frac{1}{1 + \exp(-net)}$

Sigmoid functions are:

• real-valued
• differentiable

McCulloch-Pitts neuron Activation functions

Page 7

Introduction McCulloch-Pitts neuron Activation functions

Typical activation functions used are:

Gaussian functions

$f(net) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{1}{2}\left(\frac{net - \mu}{\sigma}\right)^2\right)$

[Plot: the bell-shaped curve f(net) versus net.]

Page 8

A) Show that the following equation is true:

$\frac{1 - \tanh x}{1 + \tanh x} = e^{-2x}$

B) Write a simple Matlab m-file that plots the following activation functions:

o A sigmoid function.
o A Gaussian function.

Suppose the range is -5 to 5. Try to show both plots in a single figure using the subplot function.

Introduction McCulloch-Pitts neuron Activation functions
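A minimal sketch of an m-file for part B, with a numeric spot-check of the identity in part A (the Gaussian parameters mu = 0, sigma = 1 are assumed choices):

    % Part A, numeric spot-check of (1 - tanh x)/(1 + tanh x) = exp(-2x)
    x = linspace(-2, 2, 9);
    disp(max(abs((1 - tanh(x)) ./ (1 + tanh(x)) - exp(-2*x))))   % ~0, i.e. equal

    % Part B: sigmoid and Gaussian over [-5, 5], both in one figure
    net   = linspace(-5, 5, 201);
    sig   = 1 ./ (1 + exp(-net));                 % unipolar sigmoid
    gauss = exp(-net.^2 / 2) / sqrt(2*pi);        % Gaussian with mu = 0, sigma = 1
    subplot(2, 1, 1); plot(net, sig);   title('Sigmoid');  xlabel('net');
    subplot(2, 1, 2); plot(net, gauss); title('Gaussian'); xlabel('net');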

Page 9

Introduction Perceptron

An arrangement of one input layer of McCulloch-Pitts (M-P) neurons feeding forward to one output layer of McCulloch-Pitts neurons is known as a Perceptron.

[Figure: a single-layer perceptron; inputs x1, x2, ..., xn, each connected through weights to every output node o1, o2, ..., om.]

$\operatorname{sgn}(net) = \begin{cases} 1, & net \geq \theta \\ 0, & net < \theta \end{cases}$

$out_j = \operatorname{sgn}\left(\sum_{i=1}^{n} x_i w_{ij} - \theta_j\right)$

$w_{ij}$: connection weights between layer i and j.
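A minimal MATLAB sketch of this layer equation; the weights, thresholds, and input pattern are made-up example values:

    % One-layer perceptron with n = 3 inputs and m = 2 output nodes
    W = [0.2 -0.5; 0.7 0.1; -0.3 0.4];   % W(i,j): weight from input i to output j
    theta = [0.1; -0.2];                 % one threshold per output node
    x = [1; 0; 1];                       % an input pattern
    out = double(W' * x - theta >= 0)    % out_j = sgn(sum_i xi*wij - theta_j)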

Page 10

We can use McCulloch-Pitts neurons to implement the basic logic gates:

NOT gate (a single input with weight w = -1): input 0 gives net = 0, input 1 gives net = -1, so with a threshold between -1 and 0 (e.g. θ = -0.5) the unit outputs 1 and 0 respectively.

input | output
  0   |   1
  1   |   0

AND gate (two inputs with weights w1 and w2, threshold θ):

x1 x2 | output
 0  0 |   0
 0  1 |   0
 1  0 |   0
 1  1 |   1

Writing out $\operatorname{sgn}(x_1 w_1 + x_2 w_2 - \theta)$ for each row gives the constraints:

$0 \cdot w_1 + 0 \cdot w_2 - \theta < 0 \;\Rightarrow\; \theta > 0$
$0 \cdot w_1 + 1 \cdot w_2 - \theta < 0 \;\Rightarrow\; w_2 < \theta$
$1 \cdot w_1 + 0 \cdot w_2 - \theta < 0 \;\Rightarrow\; w_1 < \theta$
$1 \cdot w_1 + 1 \cdot w_2 - \theta \geq 0 \;\Rightarrow\; w_1 + w_2 \geq \theta$

Introduction Perceptron
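A quick MATLAB check of one weight choice that satisfies these constraints (the particular values w1 = w2 = 1, θ = 1.5 are assumptions; any values meeting the inequalities work):

    % Check a weight choice satisfying the AND constraints
    % (theta > 0, w1 < theta, w2 < theta, w1 + w2 >= theta)
    w = [1; 1]; theta = 1.5;             % assumed values; many others work
    X = [0 0; 0 1; 1 0; 1 1];            % the four input patterns, one per row
    out = double(X * w - theta >= 0)'    % prints 0 0 0 1: the AND truth table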

Page 11

Introduction Perceptron

We can use McCulloch-Pitts neurons to implement the basic logic gates:

OR gate:

x1 x2 | output
 0  0 |   0
 0  1 |   1
 1  0 |   1
 1  1 |   1

The constraints are:

$0 \cdot w_1 + 0 \cdot w_2 - \theta < 0 \;\Rightarrow\; \theta > 0$
$0 \cdot w_1 + 1 \cdot w_2 - \theta \geq 0 \;\Rightarrow\; w_2 \geq \theta$
$1 \cdot w_1 + 0 \cdot w_2 - \theta \geq 0 \;\Rightarrow\; w_1 \geq \theta$
$1 \cdot w_1 + 1 \cdot w_2 - \theta \geq 0 \;\Rightarrow\; w_1 + w_2 \geq \theta$

XOR gate:

x1 x2 | output
 0  0 |   0
 0  1 |   1
 1  0 |   1
 1  1 |   0

The same procedure now yields $\theta > 0$, $w_2 \geq \theta$, $w_1 \geq \theta$, and $w_1 + w_2 < \theta$. The last constraint contradicts the others: $w_1 \geq \theta$ and $w_2 \geq \theta$ with $\theta > 0$ force $w_1 + w_2 \geq 2\theta > \theta$, so no single M-P neuron can implement XOR.
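A small MATLAB sketch that makes the contradiction concrete: a brute-force scan over a grid of weights and thresholds (the grid itself is an arbitrary choice) finds no single neuron that reproduces XOR:

    % Brute-force search over a weight grid: nothing reproduces XOR
    target = [0 1 1 0];                  % XOR outputs for (0,0),(0,1),(1,0),(1,1)
    X = [0 0; 0 1; 1 0; 1 1];
    found = false;
    for w1 = -2:0.1:2
      for w2 = -2:0.1:2
        for th = -2:0.1:2
          if isequal(double(X*[w1; w2] - th >= 0)', target)
            found = true;
          end
        end
      end
    end
    found                                % stays false: XOR is not linearly separable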

Page 12

AND

[Figure: the four input points (0,0), (0,1), (1,0), (1,1) in the input plane, and a neuron with inputs x and y, weights w1 and w2, and output out = sgn(w1 x + w2 y - θ).]

The decision boundary is at:

$w_1 x + w_2 y - \theta = 0 \quad\Rightarrow\quad y = -\frac{w_1}{w_2}\,x + \frac{\theta}{w_2}$

Any weights satisfying the AND constraints give a valid boundary, for example $(w_1, w_2) = (1.5, 1.5)$ or $(w_1, w_2) = (1.7, 1.2)$; the weight vector $(w_1, w_2)$ is perpendicular to the boundary line.

Introduction Perceptron
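A MATLAB sketch plotting this boundary for the first example weight vector; the value θ = 2 is an assumption, chosen to satisfy the AND constraints:

    % Decision boundary for AND with (w1,w2) = (1.5,1.5); theta = 2 assumed
    w1 = 1.5; w2 = 1.5; theta = 2;
    x = linspace(-0.5, 1.5, 50);
    plot(x, -(w1/w2)*x + theta/w2); hold on
    plot([0 0 1], [0 1 0], 'o');         % the three inputs mapped to 0
    plot(1, 1, 'x');                     % the input (1,1) mapped to 1
    xlabel('x'); ylabel('y'); hold off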

Page 13

[Figure: the input points (0,0), (0,1), (1,0), (1,1) plotted for AND, OR, and XOR. A single straight line separates the two classes for AND and for OR, but for XOR no such line exists: ?]

It was the proof by Minsky & Papert in 1969 that Perceptrons could only learn linearly separable functions that led to a decline in neural network research until the mid-1980s, when it was shown that other network architectures could learn these types of functions.

Introduction Perceptron

Page 14

It would simplify the mathematics if we could treat the neuron threshold as if it were just another connection weight:

$\sum_{i=1}^{n} x_i w_{ij} - \theta_j = x_1 w_{1j} + x_2 w_{2j} + x_3 w_{3j} + \dots + x_n w_{nj} - \theta_j$

It is easy to see that if we define $w_{0j} = -\theta_j$ and $x_0 = 1$ then this becomes:

$\sum_{i=1}^{n} x_i w_{ij} - \theta_j = x_1 w_{1j} + x_2 w_{2j} + \dots + x_n w_{nj} + x_0 w_{0j} = \sum_{i=0}^{n} x_i w_{ij}$

We just have to include an extra input unit with activation x0 = 1; then we only need to compute "weights", and no explicit thresholds.

[Figure: a neuron with inputs x1, ..., xn plus an extra bias input fixed at 1, each connected through a weight to the output unit.]

Introduction Perceptron

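A short MATLAB check of the trick (the numeric values are assumed examples):

    % The threshold folded into the weights: w0 = -theta, x0 = 1
    w = [0.5; -0.3; 0.8]; theta = 0.4; x = [1; 0; 1];   % example values
    net_explicit  = w' * x - theta;      % threshold handled separately
    w_aug = [-theta; w];                 % prepend w0 = -theta
    x_aug = [1; x];                      % prepend the bias input x0 = 1
    net_augmented = w_aug' * x_aug;      % same value, no explicit threshold
    disp([net_explicit net_augmented])   % both are 0.9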

Page 15

Now we will discuss artificial neural network architectures, some of which derive inspiration from biological neural networks.

Fully connected networks

Every node is connected to every node, and these connections may be either excitatory (positive weights), inhibitory (negative weights), or irrelevant (almost zero weights).

• Large number of parameters
• Difficult learning
• Biologically implausible

Introduction Neural Networks Architectures

Page 16

Layered networks

These are networks in which the nodes are partitioned into subsets called layers, with no connections leading from layer j to layer k if j > k. Connections, with arbitrary weights, may exist from any node in layer i to any node in layer j for j ≥ i, so intra-layer connections may exist.

Introduction Neural Networks Architectures

Page 17

Layered networks Acyclic networks

In acyclic networks there are no intra-layer connections. In other words, a connection may exist between any node in layer i and any node in layer j for j > i, but a connection is not allowed for i = j.

Introduction Neural Networks Architectures

Page 18

Introduction Neural Networks Architectures

Layered networks Acyclic networks Feed forward networks

This is a subclass of acyclic networks in which a connection is allowed from a node in layer i only to nodes in layer i + 1. These networks can be described by a sequence of numbers indicating the number of nodes in each layer.

[Figure: a feed forward 3-2-3-2 network.]

Networks that do contain cycles (feedback connections) are, by contrast, called recurrent neural networks.
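A minimal MATLAB sketch of a forward pass through such a 3-2-3-2 network; the random weights and the choice of a sigmoid activation are assumptions for illustration:

    % Forward pass through a feed forward 3-2-3-2 network
    sizes = [3 2 3 2];                   % number of nodes in each layer
    f = @(net) 1 ./ (1 + exp(-net));     % activation function (assumed sigmoid)
    a = rand(sizes(1), 1);               % input activations
    for k = 1:numel(sizes)-1
        W = randn(sizes(k), sizes(k+1)); % W(i,j): node i, layer k -> node j, layer k+1
        a = f(W' * a);                   % activations of layer k+1
    end
    a                                    % the two network outputs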

Page 19

Introduction Perceptron Example

A typical neural network application is classification.

Mass  Speed  Class
1.0   0.1    Bomber
2.0   0.2    Bomber
0.1   0.3    Fighter
2.0   0.3    Bomber
0.2   0.4    Fighter
3.0   0.4    Bomber
0.1   0.5    Fighter
1.5   0.5    Bomber
0.5   0.6    Fighter
1.6   0.7    Fighter

How do we construct a neural network that can classify airplanes?

General Procedure for Building Neural Networks

1. Understand and specify your problem in terms of inputs and required outputs.
2. Take the simplest form of network you think might be able to solve your problem.
3. Try to find appropriate connection weights (including neuron thresholds).
4. Make sure that the network works on its training and test data.
5. If the network doesn't perform well enough, go back to stage 3 and try harder.
6. If the network still doesn't perform well enough, go back to stage 2.

Page 20

Introduction Perceptron Example

How do we construct a neural network that can classify the airplanes in the table above?

• The inputs can be direct encodings of the masses and speeds.
• We have two classes, so just one output unit is enough, with activation 1 for 'fighter' and 0 for 'bomber' (or vice versa).
• The simplest network to try first is a simple Perceptron.

[Figure: a perceptron with inputs Mass and Speed, a bias input, weights w1, w2, and w0, and a sgn output unit giving Class.]

$Class = \operatorname{sgn}(w_0 + w_1 \cdot Mass + w_2 \cdot Speed)$
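As a sketch of the end result, the following MATLAB fragment encodes the table and a hand-picked set of weights. The weight values are an assumption (chosen by eye so that the line speed = 0.2·mass + 0.25 separates the classes); finding such weights automatically is the subject of learning rules:

    % The aircraft table as a matrix, plus hand-picked perceptron weights
    data = [1.0 0.1; 2.0 0.2; 0.1 0.3; 2.0 0.3; 0.2 0.4; ...
            3.0 0.4; 0.1 0.5; 1.5 0.5; 0.5 0.6; 1.6 0.7];
    labels = [0; 0; 1; 0; 1; 0; 1; 0; 1; 1];   % 1 = Fighter, 0 = Bomber
    w0 = -0.25; w1 = -0.2; w2 = 1;             % bias and weights (assumed)
    class = double(w0 + w1*data(:,1) + w2*data(:,2) >= 0);
    sum(class == labels)                       % 10: all patterns classified correctly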

Page 21

You can experiment with a two-element neuron by running: nnd2n2

Neural Network Toolbox

Page 22

Neural Network Toolbox

You might want to try the demo nnd4db. It allows you to pick new input vectors and change the decision boundary while monitoring the weight and bias changes.
