Neural Networks

Upload: anila

Post on 05-Jan-2016

TRANSCRIPT

Page 1: Neural  Networks

Neural Networks

Page 2: Neural  Networks

What are they

• Models of the human brain used for computational purposes

• Brain is made up of many interconnected neurons

Page 3: Neural  Networks

What is a neuron

Page 4: Neural  Networks

Components of biological neuron

• Dendrites – serve as inputs

• Soma – the cell body of the neuron, which contains the nucleus

• Nucleus – the processing component of the neuron

• Axon – along which the output travels

• Synapse – the ending across whose gap connections are made to other neurons

Page 5: Neural  Networks

How does it work

• Signals move from neuron to neuron via electrochemical reactions. The synapses release a chemical transmitter which enters the dendrite. This raises or lowers the electrical potential of the cell body.

• The soma sums the inputs it receives and once a threshold level is reached an electrical impulse is sent down the axon (often known as firing).

• These impulses eventually reach synapses and the cycle continues.

Page 6: Neural  Networks

Synapses

• Synapses which raise the potential within a cell body are called excitatory. Synapses which lower the potential are called inhibitory.

• It has been found that synapses exhibit plasticity. This means that long-term changes in the strengths of the connections can be formed depending on the firing patterns of other neurons. This is thought to be the basis for learning in our brains.

Page 7: Neural  Networks

Artificial model of neuron

Page 8: Neural  Networks

Diagram

• aj : activation value of unit j

• wj,i : weight on the link from unit j to unit i

• ini : weighted sum of inputs to unit i

• ai : activation value of unit i (also known as the output value)

• g : activation function

Page 9: Neural  Networks

How does this work

• A neuron is connected to other neurons via its input and output links. Each incoming neuron has an activation value and each connection has a weight associated with it.

• The neuron sums the incoming weighted values and this value is input to an activation function. The output of the activation function is the output from the neuron.
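The sum-then-activate step just described can be sketched as follows. This is a minimal illustration, not code from the slides; the weights and inputs are arbitrary, and the sigmoid (introduced later in the deck) is used as the activation function g.

```python
import math

def neuron_output(activations, weights):
    """One artificial neuron: weighted sum of the incoming
    activation values, passed through an activation function."""
    total = sum(a * w for a, w in zip(activations, weights))  # in_i = sum_j wj,i * aj
    return 1.0 / (1.0 + math.exp(-total))                     # ai = g(in_i), sigmoid here

# Example: a neuron with two incoming links
print(neuron_output([1.0, 0.0], [0.5, -0.5]))  # sigmoid(0.5) ≈ 0.622
```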

Page 10: Neural  Networks

Common Activation Functions

Page 11: Neural  Networks

Some common activation functions in more detail

• These functions can be defined as follows:

• Stept(x) = 1 if x >= t, else 0

• Sign(x) = +1 if x >= 0, else -1

• Sigmoid(x) = 1/(1 + e^-x)

• On occasion an identity function is also used (i.e. where the input to the neuron becomes the output). This function is normally used in the input layer, where the inputs to the neural network are passed into the network unchanged.
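The four definitions above translate directly into code. A small sketch (function names are mine, chosen to mirror the slide's notation):

```python
import math

def step(x, t=0.0):
    """Step_t(x): 1 if x >= t, else 0."""
    return 1 if x >= t else 0

def sign(x):
    """Sign(x): +1 if x >= 0, else -1."""
    return 1 if x >= 0 else -1

def sigmoid(x):
    """Sigmoid(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def identity(x):
    """Identity: the input becomes the output (input-layer use)."""
    return x

print(step(0.3), sign(-2.0), sigmoid(0.0))  # 1 -1 0.5
```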

Page 12: Neural  Networks

A brief history of Neural Networks

• In 1943 two scientists, Warren McCulloch and Walter Pitts, proposed the first artificial model of a biological neuron [McC]. This synthetic neuron is still the basis for most of today’s neural networks.

• Rosenblatt then came up with his two-layered perceptron, whose limitations were subsequently exposed by Minsky and Papert, which led to a huge decline in funding and interest in neural networks.

Page 13: Neural  Networks

The bleak years

• During this period, even though there was a lack of funding and interest in neural networks, a small number of researchers continued to investigate the potential of neural models.

• A number of papers were published, but none had any great impact. Many of these reports concentrated on the potential of neural networks for aiding in the explanation of biological behaviour (e.g. [Mal], [Bro], [Mar], [Bie], [Coo]).

• Others focused on real world implementations. In 1972 Teuvo Kohonen and James A. Anderson independently proposed the same model for associative memory [Koh], [An1]

• In 1976 Marr and Poggio applied a neural network to a realistic problem in computational vision, stereopsis [Mar]. Other projects included [Lit], [Gr1], [Gr2], [Ama], [An2], [McC].

Page 14: Neural  Networks

The Discovery of Backpropagation

• The backpropagation learning algorithm was developed independently by Rumelhart [Ru1], [Ru2], Le Cun [Cun] and Parker [Par] in 1986.

• It was subsequently discovered that the algorithm had also been described by Paul Werbos in his Harvard Ph.D thesis in 1974 [Wer].

• Error backpropagation networks are the most widely used neural network model as they can be applied to almost any problem that requires pattern mapping.

• It was the discovery of this paradigm that brought neural networks out of the research area and into real world implementation.

Page 15: Neural  Networks

Interest in neural networks differs according to profession.

• Neurobiologists and psychologists -understanding our brain

• Engineers and physicists - a tool to recognise patterns in noisy data (see the Ts at right)

• Business analysts and engineers -a tool for modelling data

• Computer scientists and mathematicians - networks offer an alternative model of computing: machines that may be taught rather than programmed

• Artificial intelligence researchers, cognitive scientists and philosophers - subsymbolic processing (reasoning with patterns, not symbols)

Page 16: Neural  Networks

Backpropagation Network Architecture

• A backpropagation network typically consists of three or more layers of nodes.

• The first layer is known as the input layer and the last layer is known as the output layer.

• Any layers of nodes in between the input and output layers are known as hidden layers.

• Each unit in a layer is connected to every unit in the next layer. There are no connections between units within the same layer.
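This layered, fully connected structure can be sketched as one weight matrix per pair of adjacent layers. A minimal illustration (the helper name and layer sizes are mine, not from the slides):

```python
import random

def make_network(layer_sizes):
    """Fully connected feedforward net: one weight matrix per pair of
    adjacent layers; every unit connects to every unit in the next
    layer, and there are no connections within a layer."""
    weights = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # n_in x n_out matrix of small random initial weights
        weights.append([[random.uniform(-0.5, 0.5) for _ in range(n_out)]
                        for _ in range(n_in)])
    return weights

# A 2-input, 2-hidden-unit, 1-output backpropagation network
w = make_network([2, 2, 1])
print(len(w), len(w[0]), len(w[0][0]))  # 2 weight matrices; the first is 2x2
```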

Page 17: Neural  Networks

Backpropagation

[Diagram: INPUT flows forward through the Input, Hidden and Output layers; ERROR propagates backward through them]

Page 18: Neural  Networks

Operation of the network

• The operation of the network consists of a forward pass of the input through the network (forward propagation) and then a backward pass of an error value which is used in the weight modification (Backward Propagation)

Page 19: Neural  Networks

Forward Propagation

• A forward propagation step is initiated when an input pattern is presented to the network.

• No processing is performed at the input layer. The pattern is propagated forward to the next layer, and each node in this layer performs a weighted sum of all its inputs.

• After this sum has been calculated, a function is used to compute the unit’s output.
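The forward pass described in these three bullets can be sketched as follows. The weights are illustrative values of my own, not the example network used later in the deck:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(pattern, weight_matrices):
    """Forward propagation: no processing at the input layer; each
    node in later layers takes a weighted sum of the previous layer's
    outputs, then applies the activation function."""
    activations = list(pattern)  # input layer: the pattern, unchanged
    for W in weight_matrices:    # one matrix per layer-to-layer step
        activations = [sigmoid(sum(a * W[j][i] for j, a in enumerate(activations)))
                       for i in range(len(W[0]))]
    return activations           # the output layer's activation pattern

# Two inputs, a 2-unit hidden layer, one output unit (illustrative weights)
W1 = [[1.0, -1.0], [-1.0, 1.0]]
W2 = [[1.0], [-1.0]]
print(forward([1, 0], [W1, W2]))
```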

Page 20: Neural  Networks
Page 21: Neural  Networks

Example XOR

Page 22: Neural  Networks
Page 23: Neural  Networks

Layers of the Network

The Input Layer

• The input layer of a backpropagation network acts solely as a buffer to hold the patterns being presented to the network. Each node in the input layer corresponds to one entry in the pattern. No processing is done at the input layer. The pattern is fed forward from the input layer to the next layer.

Page 24: Neural  Networks

The Hidden Layers

• It is the hidden layers which give the backpropagation network its exceptional computational abilities.

• The units in the hidden layers act as “feature detectors”. They extract information from the input patterns which can be used to distinguish between particular classes. The network creates its own internal representation of the data.

Page 25: Neural  Networks

The Output Layer

• The output layer of a network uses the response of the feature detectors in the hidden layer. Each unit in the output layer emphasises each feature according to the values of the connecting weights. The pattern of activation at this layer is taken as the network’s response.

Page 26: Neural  Networks

The sigmoid function

• The function used to perform this operation is the sigmoid function, F(x) = 1/(1 + e^-x).

• The main reason why this particular function is chosen is that its derivative, which is used in the learning law, is easily computed.

• The result obtained after applying this function to the net input is taken to be the node’s output value.

• This process is continued until the pattern has been propagated through the entire network and reaches the output layer.

• The activation pattern at the output layer is taken as the network’s result.
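The "easily computed derivative" claim above rests on the identity F'(x) = F(x)(1 - F(x)): once a node's output has been computed in the forward pass, the derivative needed by the learning law is almost free. A small check of the identity:

```python
import math

def sigmoid(x):
    """F(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_deriv(x):
    """F'(x) = F(x) * (1 - F(x)): the derivative reuses the
    already-computed node output, which is why the sigmoid is
    convenient in the learning law."""
    fx = sigmoid(x)
    return fx * (1.0 - fx)

# Compare against a numerical central-difference derivative
h = 1e-6
numeric = (sigmoid(0.5 + h) - sigmoid(0.5 - h)) / (2 * h)
print(abs(sigmoid_deriv(0.5) - numeric) < 1e-9)  # True
```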

Page 27: Neural  Networks

Linear Separability and the XOR Problem

• Consider two-input patterns being classified into two classes.

• Each point, marked with one symbol per class, represents a pattern with a pair of input values. Each pattern is classified into one of two classes.

• Notice that these classes can be separated with a single line. They are known as linearly separable patterns.

• Linear separability refers to the fact that classes of patterns with n-dimensional input vectors can be separated with a single decision surface. In the case above, the line represents the decision surface.

Page 28: Neural  Networks

Diagram

Page 29: Neural  Networks

Xor

• The classic example of a linearly inseparable pattern is the logical exclusive-OR (XOR) function. The next figure illustrates that the two XOR classes, 0 (black dots) and 1 (white dots), cannot be separated with a single line.
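The inseparability claim is easy to probe by brute force: a single threshold unit draws the line w1*x1 + w2*x2 + b = 0, and no choice of (w1, w2, b) puts the two XOR classes on opposite sides. A small sketch of that search (the grid and helper name are mine):

```python
from itertools import product

# XOR truth table: inputs -> class
xor = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def separates(w1, w2, b):
    """Does the line w1*x1 + w2*x2 + b = 0 put class 1 strictly on
    one side and class 0 on the other (a single threshold unit)?"""
    return all((w1 * x1 + w2 * x2 + b > 0) == (cls == 1)
               for (x1, x2), cls in xor.items())

# Coarse grid search over candidate lines: none separates XOR
grid = [x / 4 for x in range(-8, 9)]
found = any(separates(w1, w2, b) for w1, w2, b in product(grid, repeat=3))
print(found)  # False: no single line separates XOR
```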

Page 30: Neural  Networks

XOR linearly inseparable

Page 31: Neural  Networks

The Significance of This

• XOR is separable in 3 dimensions but obviously not in 2.

• So many classifiers will need more than 2 layers to classify such patterns.

• Minsky and Papert pointed out that, as far as they could see, perceptrons of 2 layers could not learn such problems in 3 dimensions or more.

• Because so many problems are like XOR, these leading figures of AI concluded that neural networks had limited applicability.

Page 32: Neural  Networks

But they were wrong

• Backpropagation showed that neural networks could learn in 3 and more dimensions

• However, such was the stature of this pair that their verdict impacted negatively on neural network research for two decades.

• The work of Werbos, Parker and Rumelhart eventually proved them wrong; by 1987 working multilayer networks were learning, and they have since become a huge industry.

Page 33: Neural  Networks

Backward Propagation

• The first step in the backpropagation stage is the calculation of the error between the network’s result and the desired response. This occurs when the forward propagation phase is completed.

• Each processing unit in the output layer is compared to its corresponding entry in the desired pattern and an error is calculated for each node in the output layer.

• The weights are then modified for all of the connections going into the output layer.

• Next, the error is backpropagated to the hidden layers and by using the generalised delta rule, the weights are adjusted for all connections going into the hidden layer.

• The procedure continues until the last layer of weights has been modified. The forward and backward propagation phases are repeated until the network's output is equal to the desired result.
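The whole forward/backward loop described above can be sketched end to end. This is a minimal illustration, not the deck's worked example: it uses the common textbook sign convention (error term taken as target minus output), adds bias weights, and trains a 2-2-1 network on XOR. The initial weights and the epoch count are arbitrary choices of mine.

```python
import math, random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# 2-2-1 network with small random weights and biases (illustrative)
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
b1 = [random.uniform(-1, 1) for _ in range(2)]
W2 = [random.uniform(-1, 1) for _ in range(2)]
b2 = random.uniform(-1, 1)

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR
alpha = 0.5  # learning rate

def total_error():
    err = 0.0
    for x, t in data:
        h = [sigmoid(x[0]*W1[0][j] + x[1]*W1[1][j] + b1[j]) for j in range(2)]
        o = sigmoid(h[0]*W2[0] + h[1]*W2[1] + b2)
        err += (t - o) ** 2
    return err

before = total_error()
for _ in range(5000):
    for x, t in data:
        # forward pass
        h = [sigmoid(x[0]*W1[0][j] + x[1]*W1[1][j] + b1[j]) for j in range(2)]
        o = sigmoid(h[0]*W2[0] + h[1]*W2[1] + b2)
        # output-layer error term (generalised delta rule, t - o form)
        d_out = o * (1 - o) * (t - o)
        # hidden-layer error terms, backpropagated through W2
        d_hid = [h[j] * (1 - h[j]) * W2[j] * d_out for j in range(2)]
        # adjust the weights into the output layer, then the hidden layer
        for j in range(2):
            W2[j] += alpha * h[j] * d_out
        b2 += alpha * d_out
        for i in range(2):
            for j in range(2):
                W1[i][j] += alpha * x[i] * d_hid[j]
        for j in range(2):
            b1[j] += alpha * d_hid[j]

print(before, "->", total_error())  # the error shrinks as the network learns
```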

Page 34: Neural  Networks

The Backpropagation Learning Law

• The Learning Law used is known as the Generalised Delta Rule.

• It allows for the adjustment of the weights in the hidden layer, a feat deemed impossible by Minsky and Papert.

• It uses the derivative of the activation function of nodes (which in most cases is the sigmoid function) to determine the extent of the adjustment to the weights connecting to the hidden layers.

• In other words, the network learns from its errors, using the difference between expected and actual results (the error) to make adjustments.
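In symbols, the generalised delta rule described above is commonly written as follows. This is the textbook form, with the error factor taken as target minus output; note that the worked example later in this deck takes output minus target, so its d values carry the opposite sign.

```latex
\delta_k = O_k\,(1 - O_k)\,(T_k - O_k)                % error term for output unit k
\delta_j = h_j\,(1 - h_j)\sum_{k} w_{jk}\,\delta_k    % error term for hidden unit j
\Delta w_{jk} = \alpha\, h_j\, \delta_k               % adjustment of the weight from j to k
```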

Page 35: Neural  Networks

Example

• Calculate the weight adjustments in the following network, where the target values are {1, 1} and the learning rate is 1.

Page 36: Neural  Networks

Sample Neural Network

Page 37: Neural  Networks

Hidden Layer Computation

• Xi = i·W1, with inputs i = {1, 0} and first-layer weights W1 = [[1, -1], [-1, 1]]

• Xi1 = 1 × 1 + 0 × (-1) = 1

• Xi2 = 1 × (-1) + 0 × 1 = -1

• Xi = {Xi1, Xi2} = {1, -1}

where F(x) = 1/(1 + e^-x) is the activation function applied at each node.

Page 38: Neural  Networks

• h = F(Xi)

• h1 = F(Xi1) = F(1) = 1/(1 + e^-1) = 0.73

• h2 = F(Xi2) = F(-1) = 1/(1 + e^1) = 0.27

Page 39: Neural  Networks

Output Layer Computation

• X = h·W2

• X1 = 0.73 × (-1) + 0.27 × 0 = -0.73

• X2 = 0.73 × 0 + 0.27 × (-1) = -0.27

• X = {X1, X2} = {-0.73, -0.27}

Page 40: Neural  Networks

• O = F(X)

• O1 = F(X1) = 1/(1 + e^-0.73) = 0.68

• O2 = F(X2) = 1/(1 + e^-0.27) = 0.57

Page 41: Neural  Networks

Error

• Using the outputs rounded to 0.7 and 0.6, the error term for each output unit is d = O(1 - O)(O - T):

• d1 = 0.7(1 - 0.7)(0.7 - 1) = 0.7(0.3)(-0.3) = -0.063

• d2 = 0.6(1 - 0.6)(0.6 - 1) = 0.6(0.4)(-0.4) = -0.096

Page 42: Neural  Networks

Error Calculation

• e = h(1 - h)·W2·d, computed elementwise for the hidden units:

• e1 = h1(1 - h1)(W11·d1 + W12·d2)

• e2 = h2(1 - h2)(W21·d1 + W22·d2)

Page 43: Neural  Networks

Another Way to write the error

• e1 = h1(1 - h1)(W11·d1 + W12·d2)
• e2 = h2(1 - h2)(W21·d1 + W22·d2)

• e1 = 0.73(1 - 0.73) × ((-1 × -0.063) + (0 × -0.096))
• e2 = 0.27(1 - 0.27) × ((0 × -0.063) + (-1 × -0.096))

• e1 = 0.73 × 0.27 × 0.063 = 0.1971 × 0.063 = 0.0124
• e2 = 0.27 × 0.73 × 0.096 = 0.1971 × 0.096 = 0.0189

• In general, for hidden unit h: e_h = h_h(1 - h_h) Σk∈outputs W_hk d_k

Page 44: Neural  Networks

Weight Adjustment

• ΔW2_t = α·h·d + Θ·ΔW2_(t-1)

• where α = 1 and the Θ term carries over a fraction of the previous weight change (momentum)

• h·d is the outer product of the hidden outputs with the output error terms:

• h·d = [0.73; 0.27] × [-0.063  -0.096]

• = [0.73 × (-0.063)   0.73 × (-0.096)]
    [0.27 × (-0.063)   0.27 × (-0.096)]

Page 45: Neural  Networks

Weight Change

• With α = 1 and no previous weight change, ΔW2 = h·d:

• ΔW2 = [-0.046  -0.070]
        [-0.017  -0.026]
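The output-layer part of this worked example can be checked in a few lines. The sketch below uses the deck's own rounded figures, h = (0.73, 0.27) and O = (0.7, 0.6), its targets T = (1, 1), and its sign convention d = O(1 - O)(O - T):

```python
# Reproduce the worked example's output-layer numbers.
h = [0.73, 0.27]   # hidden-layer outputs from the example
O = [0.7, 0.6]     # network outputs, rounded as in the example
T = [1.0, 1.0]     # target values

# Error term per output unit, with the deck's d = O(1 - O)(O - T) convention
d = [o * (1 - o) * (o - t) for o, t in zip(O, T)]
print(d)  # approximately [-0.063, -0.096]

# Output-layer weight adjustment: the outer product h x d (alpha = 1)
dW2 = [[hi * dk for dk in d] for hi in h]
for row in dW2:
    print([round(v, 3) for v in row])
# [-0.046, -0.07]
# [-0.017, -0.026]
```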