Lecture 1: Introduction to ANN
TRANSCRIPT
-
8/10/2019 Lecture1 Introduction to ANN
Introduction to Artificial Neural Networks
-
Contents
Fundamental Concepts of ANNs
Basic Models and Learning Rules
  Neuron Models
  ANN Structures
  Learning
Distributed Representations
Conclusions
-
Introduction to Artificial Neural Networks
Fundamental Concepts of ANNs
-
What is an ANN? Why ANN?
ANN: Artificial Neural Network.
A new generation of information-processing system that simulates human brain behavior.
-
Applications
Pattern Matching
Pattern Recognition
Associative Memory (Content-Addressable Memory)
Function Approximation
Learning
Optimization
Vector Quantization
Data Clustering
...
-
Applications
Traditional computers are inefficient at these tasks, even though their raw computation speed is higher.
-
The Configuration of ANNs
An ANN consists of a large number of interconnected processing elements called neurons.
A human brain consists of ~10^11 neurons of many different types.
How does an ANN work? Through collective behavior.
-
The Biological Neuron
-
The Biological Neuron
Synapses are excitatory or inhibitory.
-
The Artificial Neuron
[Figure: neuron i with inputs x1, x2, ..., xm, weights wi1, wi2, ..., wim, integration function f(.), activation function a(.), threshold θi, output yi]
-
The Artificial Neuron
[Figure: neuron i with inputs x1, ..., xm, weights wi1, ..., wim, f(.), a(.), threshold θi, output yi]
  f_i = Σ_{j=1}^{m} w_ij · x_j − θ_i
  y_i(t+1) = a(f_i)
  a(f) = 1 if f ≥ 0; 0 otherwise
-
The Artificial Neuron
[Figure: neuron i with inputs x1, ..., xm, weights wi1, ..., wim, f(.), a(.), threshold θi, output yi]
  wij > 0: excitatory connection
  wij < 0: inhibitory connection
  wij = 0: no connection
-
The Artificial Neuron
Proposed by McCulloch and Pitts [1943]: the M-P neuron.
-
What can be done by M-P neurons?
A hard limiter. A binary threshold unit. Hyperspace separation.
  y = 1 if f = w1 · x1 + w2 · x2 − θ ≥ 0; 0 otherwise
[Figure: the line w1 · x1 + w2 · x2 − θ = 0 separates the (x1, x2) plane into the regions y = 1 and y = 0]
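A minimal sketch of an M-P neuron as a binary threshold unit (illustrative; the function name mp_neuron is ours, not from the lecture). With weights w1 = w2 = 1 and θ = 2 it computes logical AND, i.e. it separates the plane along the line x1 + x2 − 2 = 0:

```python
def mp_neuron(x, w, theta):
    """McCulloch-Pitts neuron: fires (returns 1) when the weighted
    sum of inputs reaches the threshold theta, else returns 0."""
    f = sum(wi * xi for wi, xi in zip(w, x)) - theta
    return 1 if f >= 0 else 0

# Logical AND: only the input (1, 1) lies on the firing side of the line.
print(mp_neuron([1, 1], [1, 1], 2))  # → 1
print(mp_neuron([1, 0], [1, 1], 2))  # → 0
```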
-
Three Basic Entities of ANN Models
Models of neurons or PEs.
Models of synaptic interconnections and structures.
Training or learning rules.
-
Introduction to Artificial Neural Networks
Basic Models and Learning Rules
  Neuron Models
  ANN Structures
  Learning
-
Processing Elements
[Figure: neuron i with integration function f(.), activation function a(.), threshold θi]
What integration functions may we have?
What activation functions may we have?
Extensions of M-P neurons.
-
Integration Functions
M-P neuron (linear):
  f_i = net_i = Σ_{j=1}^{m} w_ij · x_j − θ_i
Quadratic function:
  f_i = Σ_{j=1}^{m} w_ij · x_j^2 − θ_i
Spherical function:
  f_i = Σ_{j=1}^{m} (x_j − w_ij)^2 − θ_i
Polynomial function:
  f_i = Σ_{j=1}^{m} Σ_{k=1}^{m} w_ijk · x_j^{α_j} · x_k^{α_k} − θ_i
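The integration functions above can be sketched directly in Python (an illustrative sketch; the function names are ours, and the spherical function is read as a squared distance of the input from the weight vector):

```python
def linear(x, w, theta):
    # M-P neuron: f = sum_j w_j * x_j - theta
    return sum(wj * xj for wj, xj in zip(w, x)) - theta

def quadratic(x, w, theta):
    # f = sum_j w_j * x_j**2 - theta
    return sum(wj * xj ** 2 for wj, xj in zip(w, x)) - theta

def spherical(x, w, theta):
    # f = sum_j (x_j - w_j)**2 - theta: squared distance from the center w
    return sum((xj - wj) ** 2 for wj, xj in zip(w, x)) - theta

x, w = [1.0, 2.0], [0.5, 0.5]
print(linear(x, w, 0.0))     # → 1.5
print(quadratic(x, w, 0.0))  # → 2.5
print(spherical(x, w, 0.0))  # → 2.5
```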
-
Activation Functions
M-P neuron (step function):
  a(f) = 1 if f ≥ 0; 0 otherwise
[Plot: a jumps from 0 to 1 at f = 0]
-
Activation Functions
Hard limiter (threshold function):
  a(f) = sgn(f) = +1 if f ≥ 0; −1 if f < 0
[Plot: a jumps from −1 to +1 at f = 0]
-
Activation Functions
Ramp function:
  a(f) = 1 if f ≥ 1; f if 0 ≤ f < 1; 0 if f < 0
[Plot: a rises linearly from 0 to 1 over 0 ≤ f ≤ 1]
-
Activation Functions
Bipolar sigmoid function:
  a(f) = 2 / (1 + e^{−λf}) − 1
[Plot: smooth curve from −1 to +1 over f ∈ [−4, 4]]
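The activation functions of the last few slides can be collected into one sketch (illustrative; the function names are ours, and λ is passed as lam):

```python
import math

def step(f):                 # M-P neuron
    return 1 if f >= 0 else 0

def hard_limiter(f):         # sgn
    return 1 if f >= 0 else -1

def ramp(f):                 # clipped linear
    return 1.0 if f >= 1 else (f if f >= 0 else 0.0)

def unipolar_sigmoid(f, lam=1.0):   # range (0, 1)
    return 1.0 / (1.0 + math.exp(-lam * f))

def bipolar_sigmoid(f, lam=1.0):    # range (-1, 1)
    return 2.0 / (1.0 + math.exp(-lam * f)) - 1.0

print(step(-0.5), hard_limiter(-0.5), ramp(0.5))  # → 0 -1 0.5
print(bipolar_sigmoid(0.0))                       # → 0.0
```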
-
Example: Activation Surfaces
[Figure: the (x, y) plane partitioned by three lines L1, L2, L3]
-
Example: Activation Surfaces
Three M-P neurons, one per line:
  L1: x − 1 = 0        (weights (1, 0), θ1 = 1)
  L2: y − 1 = 0        (weights (0, 1), θ2 = 1)
  L3: −x − y + 4 = 0   (weights (−1, −1), θ3 = −4)
-
Example: Activation Surfaces
[Figure: a fourth neuron L4 combines the outputs of L1, L2, L3 into z; z = 1 inside the triangular region, z = 0 outside]
-
Example: Activation Surfaces
L4 combines the three outputs with weights (1, 1, 1) and θ4 = 2.5, so z = 1 only where all three neurons fire.
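The whole construction can be checked numerically. A sketch using the weights read off the slides (the exact values are our reconstruction of the garbled figures, so treat them as an assumption):

```python
def mp(x, w, theta):
    # McCulloch-Pitts unit with a step activation
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) - theta >= 0 else 0

def z(x, y):
    l1 = mp([x, y], [1, 0], 1)     # fires when x >= 1
    l2 = mp([x, y], [0, 1], 1)     # fires when y >= 1
    l3 = mp([x, y], [-1, -1], -4)  # fires when x + y <= 4
    # Output unit L4: fires only when all three half-planes agree,
    # i.e. when l1 + l2 + l3 >= 2.5
    return mp([l1, l2, l3], [1, 1, 1], 2.5)

print(z(1.5, 1.5))  # → 1 (inside the triangle)
print(z(3.0, 3.0))  # → 0 (outside: x + y > 4)
```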
-
Example: Activation Surfaces
M-P neuron (step function):
  a(f) = 1 if f ≥ 0; 0 otherwise
[Figure: with step activations, the surface of z is a sharp-edged plateau over the triangular region]
-
Example: Activation Surfaces
Unipolar sigmoid function:
  a(f) = 1 / (1 + e^{−λf})
[Figure: surfaces of z for λ = 2, 3, 5, 10; larger λ gives sharper edges]
-
Introduction to Artificial Neural Networks
Basic Models and Learning Rules
  Neuron Models
  ANN Structures
  Learning
-
ANN Structure (Connections)
-
Single-Layer Feedforward Networks
[Figure: inputs x1, x2, ..., xm fully connected through weights w11, ..., wnm to output neurons y1, y2, ..., yn]
-
Multilayer Feedforward Networks
[Figure: input layer x1, x2, ..., xm; hidden layer(s); output layer y1, y2, ..., yn]
-
Multilayer Feedforward Networks
Pattern recognition: Input → Analysis → Classification → Output.
Learning: where does the knowledge come from?
-
Single Node with Feedback to Itself
[Figure: one node whose output feeds back to its own input through a feedback loop]
-
Single-Layer Recurrent Networks
[Figure: inputs x1, x2, ..., xm and outputs y1, y2, ..., yn, with feedback connections among the output neurons]
-
Multilayer Recurrent Networks
[Figure: inputs x1, x2, x3 and outputs y1, y2, y3, with feedback connections between layers]
-
Introduction to Artificial Neural Networks
Basic Models and Learning Rules
  Neuron Models
  ANN Structures
  Learning
-
Learning
Consider an ANN with n neurons, each with m adaptive weights.
Weight matrix:
  W = [w_1^T]   [w11 w12 ... w1m]
      [w_2^T] = [w21 w22 ... w2m]
      [ ... ]   [ ...           ]
      [w_n^T]   [wn1 wn2 ... wnm]
-
Learning
The goal is to learn the weight matrix W. How?
-
Learning Rules
Supervised learning
Reinforcement learning
Unsupervised learning
-
Supervised Learning
Learning with a teacher. Learning by examples.
Training set:
  T = {(x(1), d(1)), (x(2), d(2)), ..., (x(k), d(k)), ...}
-
Supervised Learning
[Figure: input x feeds the ANN (weights W), producing y; an error-signal generator compares y with the desired output d and feeds the error back to adjust W]
  T = {(x(1), d(1)), (x(2), d(2)), ..., (x(k), d(k)), ...}
-
Reinforcement Learning
Learning with a critic. Learning by comments.
-
Reinforcement Learning
[Figure: input x feeds the ANN (weights W), producing y; a critic-signal generator evaluates y and feeds a reinforcement signal back to adjust W]
-
Unsupervised Learning
Self-organizing.
Clustering: form proper clusters by discovering the similarities and dissimilarities among objects.
-
Unsupervised Learning
[Figure: input x feeds the ANN (weights W), producing y; no teacher or critic signal is provided]
-
The General Weight Learning Rule
[Figure: neuron i with inputs x1, x2, ..., x_{m−1}, weights wi1, wi2, ..., wi,m−1, threshold θi, output yi]
Input:  net_i = Σ_{j=1}^{m−1} w_ij · x_j − θ_i
Output: y_i = a(net_i)
-
The General Weight Learning Rule
We want to learn the weights and the bias.
-
The General Weight Learning Rule
We want to learn the weights and the bias.
Let x_m = −1 and w_im = θ_i. Then:
  net_i = Σ_{j=1}^{m} w_ij · x_j
-
The General Weight Learning Rule
[Figure: the augmented neuron with the extra input x_m = −1 and weight w_im = θ_i, so the threshold is learned like any other weight]
  net_i = Σ_{j=1}^{m} w_ij · x_j
-
The General Weight Learning Rule
With x_m = −1 and w_im = θ_i:
  net_i = Σ_{j=1}^{m} w_ij · x_j
  w_i = (wi1, wi2, ..., wim)^T
We want to learn w_i: Δw_i(t) = ?
-
The General Weight Learning Rule
[Figure: the neuron (weights w_i) maps input x to output y_i; a learning-signal generator computes r from w_i, x, and the desired output d_i]
  r = f_r(w_i, x, d_i)
-
The General Weight Learning Rule
  r = f_r(w_i, x, d_i)
  Δw_i(t) = η · r · x(t)
-
The General Weight Learning Rule
  r = f_r(w_i, x, d_i)       (learning signal)
  Δw_i(t) = η · r · x(t)     (η: the learning rate)
-
The General Weight Learning Rule
Discrete-time weight modification rule:
  w_i(t+1) = w_i(t) + η · f_r(w_i(t), x(t), d_i(t)) · x(t)
Continuous-time weight modification rule:
  dw_i(t)/dt = η · r · x(t)
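The discrete-time rule can be sketched as follows (an illustrative sketch; the perceptron-style learning signal r = d − y is just one concrete choice of f_r, and the function names are ours):

```python
def update(w, x, d, eta, learning_signal):
    """One step of the general rule w(t+1) = w(t) + eta * r * x(t),
    where r = f_r(w, x, d) comes from the learning-signal generator."""
    r = learning_signal(w, x, d)
    return [wj + eta * r * xj for wj, xj in zip(w, x)]

def perceptron_signal(w, x, d):
    # One possible f_r: r = d - y, with y the step-activation output.
    y = 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else 0
    return d - y

w = [0.0, 0.0, 0.0]   # last component plays the role of the bias weight
x = [1.0, 1.0, -1.0]  # x_m = -1 absorbs the threshold
w = update(w, x, 0, 0.5, perceptron_signal)  # desired output d = 0
print(w)  # → [-0.5, -0.5, 0.5]
```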
-
Hebb's Learning Law
Hebb [1949] hypothesized that when an axonal input from neuron A repeatedly or persistently causes neuron B to immediately emit a pulse (fire), the efficacy of that axonal input (its ability to help neuron B fire in the future) is somehow increased.
Hebb's learning rule is an unsupervised learning rule.
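Under Hebb's law the learning signal is simply the neuron's own output, r = y_i, so no teacher signal is needed. A minimal sketch (illustrative; function names ours, with a sgn activation chosen for the example):

```python
def hebb_update(w, x, eta, activation):
    """Hebbian rule: the learning signal is the neuron's own output,
    r = y_i, so Delta w = eta * y * x (no desired output d needed)."""
    net = sum(wj * xj for wj, xj in zip(w, x))
    y = activation(net)
    return [wj + eta * y * xj for wj, xj in zip(w, x)]

def sgn(f):
    return 1 if f >= 0 else -1

w = [1.0, -1.0]
w = hebb_update(w, [1.0, -2.0], 0.1, sgn)  # net = 3, so y = +1
print(w)  # → [1.1, -1.2]
```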
-
Introduction to Artificial Neural Networks
Distributed Representations
-
Distributed Representations
Distributed representation: an entity is represented by a pattern of activity distributed over many PEs; each PE is involved in representing many different entities.
Local representation: each entity is represented by one PE.
-
Example
[Table: +/_ activity patterns over PEs P0-P15 for Dog, Cat, and Bread; the Dog row is + _ + + _ _ _ _ + + + + + _ _ _]
Acts as a content-addressable memory.
-
Advantages
Acts as a content-addressable memory: presenting a partial pattern over P0-P15 retrieves the best-matching stored entity. What is this?
[Table: a partial cue containing only a few + entries]
Makes induction easy.
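The retrieval-from-partial-cue behavior can be sketched as a nearest-pattern lookup. Only the Dog row below is taken from the slide; the Cat and Bread rows are hypothetical stand-ins, and the function name recall is ours:

```python
# Stored +/- activity patterns over 16 PEs. Dog is the row shown on the
# slide; Cat and Bread are made-up placeholders for illustration only.
patterns = {
    "Dog":   [+1, -1, +1, +1, -1, -1, -1, -1, +1, +1, +1, +1, +1, -1, -1, -1],
    "Cat":   [+1, -1, +1, -1, +1, -1, -1, -1, +1, +1, -1, +1, -1, -1, +1, -1],
    "Bread": [-1, +1, -1, -1, -1, +1, +1, -1, -1, -1, +1, -1, -1, +1, +1, +1],
}

def recall(cue):
    """Content-addressable recall: 0 marks PEs not specified by the cue;
    return the stored pattern with the largest overlap (dot product)."""
    return max(patterns, key=lambda name:
               sum(c * p for c, p in zip(cue, patterns[name])))

# A partial cue that agrees with Dog on the four PEs it specifies.
cue = [+1, 0, +1, +1, 0, 0, 0, 0, +1, 0, 0, 0, 0, 0, 0, 0]
print(recall(cue))  # → Dog
```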
-
Advantages
Acts as a content-addressable memory.
Makes induction easy: a dog has 4 legs; how many legs does Fido have? Fido's pattern overlaps Dog's, so Dog's properties are induced for Fido.
[Table: Fido's activity pattern over P0-P15, overlapping Dog's]
-
Advantages
Acts as a content-addressable memory.
Makes induction easy.
Makes the creation of new entities or concepts easy (without allocating new hardware): add Doughnut by changing weights.
  Doughnut: + + _ _ _ + + _ + _ _ _ + + + _
-
Advantages
Acts as a content-addressable memory.
Makes induction easy.
Makes the creation of new entities or concepts easy (without allocating new hardware).
Fault tolerance: the breakdown of some PEs does not cause problems.
-
Disadvantages
How to understand it? How to modify it?
Learning procedures are required.