Introduction to Artificial Intelligence
COMP307 Neural Network 1: perceptron
Yi Mei, [email protected]


Page 1: Introduction to Artificial Intelligence

Introduction to Artificial Intelligence

COMP307 Neural Network 1: perceptron

Yi Mei, [email protected]

Page 2:

Outline
• Why ANN?
• Origin
• Perceptron
• Perceptron learning
• What the perceptron can (and cannot) learn
• Extending the perceptron

Page 3:

Why ANN?

• Killer applications in many areas
  – Computer vision / image processing
  – Game playing (AlphaGo, Watson, …)
  – Big data

Page 4:

Origin
• The human brain shows amazing capability in
  – Learning
  – Perception
  – Adaptability
  – …
• Goal: simulate the human brain to achieve these functionalities
• ANNs are models built for this purpose

Page 5:

Origin


Page 6:

Artificial Neuron

z_j = \sum_{i=1}^{m} w_{ji} x_i + b_j

y_j = \varphi(z_j)
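The two formulas above can be sketched in a few lines of Python (a minimal illustration; the function names are my own, not from the slides):

```python
# Forward pass of one artificial neuron:
# z_j = sum_{i=1..m} w_ji * x_i + b_j, then y_j = phi(z_j).
def neuron_output(weights, inputs, bias, phi):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return phi(z)

# Example with a step activation: z = 0.5*1.0 + (-0.2)*2.0 + 0.1 = 0.2 > 0
step = lambda z: 1 if z > 0 else 0
print(neuron_output([0.5, -0.2], [1.0, 2.0], 0.1, step))  # 1
```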

Page 7:

Activation Functions

Threshold:
y_j = \begin{cases} 1, & \text{if } z_j > 0, \\ 0, & \text{otherwise} \end{cases}

Sigmoid:
y_j = \frac{1}{1 + e^{-z_j}}
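Both activation functions can be written directly in Python (names are illustrative):

```python
import math

def threshold(z):
    # Step function: binary output, as used by the perceptron.
    return 1 if z > 0 else 0

def sigmoid(z):
    # Smooth, differentiable output in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

print(threshold(0.3), threshold(-0.3))  # 1 0
print(sigmoid(0.0))                     # 0.5
```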

Page 8:

Perceptron
• A special type of artificial neuron
  – Real-valued inputs
  – Binary output
  – Threshold activation function

y_j = \begin{cases} 1, & \text{if } \sum_{i=1}^{m} w_{ji} x_i + b_j > 0, \\ 0, & \text{otherwise} \end{cases}

Example (two inputs, unit weights, zero bias):
y = \begin{cases} 1, & \text{if } x_1 + x_2 > 0, \\ 0, & \text{otherwise} \end{cases}
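The concrete two-input case above (weights w_1 = w_2 = 1, bias 0) behaves like this (a minimal sketch):

```python
def perceptron_example(x1, x2):
    # y = 1 if x1 + x2 > 0, else 0  (w1 = w2 = 1, b = 0)
    return 1 if x1 + x2 > 0 else 0

print(perceptron_example(0.5, 0.2))   # 1
print(perceptron_example(-1.0, 0.5))  # 0
```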

Page 9:

Perceptron
• Performs linear classification
  – 2 inputs: the decision boundary is a line
  – 3 inputs: the decision boundary is a plane
• Supports online learning
  – Update w_ji and b_j as new examples arrive

Page 10:

Learning Perceptron
• How do we get the optimal weights and bias?
• Only accuracy is considered
  – Optimal if 100% accuracy on the training set
  – There can be many optimal solutions
• To simplify notation, transform the bias into a weight: w_{j0} = b_j, with x_0 = 1 always

y_j = \begin{cases} 1, & \text{if } \sum_{i=1}^{m} w_{ji} x_i + b_j > 0, \\ 0, & \text{otherwise} \end{cases}

b_j = w_{j0} \cdot 1 = w_{j0} x_0

y_j = \begin{cases} 1, & \text{if } \sum_{i=0}^{m} w_{ji} x_i > 0, \\ 0, & \text{otherwise} \end{cases}
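The bias-folding trick can be verified numerically: prepending a constant input x_0 = 1 and setting w_0 = b gives the same weighted sum (the values below are arbitrary):

```python
w, b = [2.0, -1.0], 0.5     # original weights and bias
x = [1.0, 3.0]              # input vector

# Original form: z = sum_i w_i * x_i + b
z_original = sum(wi * xi for wi, xi in zip(w, x)) + b

# Transformed form: x_0 = 1 carries the bias as weight w_0 = b
z_folded = sum(wi * xi for wi, xi in zip([b] + w, [1.0] + x))

print(z_original, z_folded)  # -0.5 -0.5
```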

Page 11:

Learning Perceptron
• Idea
  – Initialise weights and threshold randomly (or all zeros)
  – Given a new example (x_1, x_2, …, x_m, d)
    • Input feature vector: x_1, x_2, …, x_m
    • Desired output (class label): d
    • Predicted output: y
  – If y = 0 and d = 1:
    • increase b = w_0; increase w_i for each positive x_i; decrease w_i for each negative x_i
  – If y = 1 and d = 0:
    • decrease b = w_0; decrease w_i for each positive x_i; increase w_i for each negative x_i
  – Repeat the process for each new example until the desired behaviour is achieved

y = \begin{cases} 1, & \text{if } \sum_{i=0}^{m} w_i x_i > 0, \\ 0, & \text{otherwise} \end{cases}

Page 12:

Learning Perceptron
• Implementation
  – Initialise weights and threshold randomly (or all zeros)
  – Given a new example (x_1, x_2, …, x_m, d)
    • Input feature vector: x_1, x_2, …, x_m
    • Desired output (class label): d
    • Predicted output: y
  – Update each weight by
    w_i \leftarrow w_i + \eta (d - y) x_i, \quad i = 0, 1, 2, \ldots, m
    where \eta \in [0, 1] is the learning rate
  – Repeat the process for each new example until the desired behaviour is achieved

y = \begin{cases} 1, & \text{if } \sum_{i=0}^{m} w_i x_i > 0, \\ 0, & \text{otherwise} \end{cases}
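The update rule can be wrapped in a small training loop. This is a sketch under my own choices (an AND dataset, η = 0.25, zero initialisation), not code from the lecture:

```python
def train_perceptron(examples, eta=0.25, epochs=20):
    """examples: list of (x, d) pairs, with x_0 = 1 already prepended."""
    w = [0.0] * len(examples[0][0])   # all weights (incl. bias w_0) start at zero
    for _ in range(epochs):
        for x, d in examples:
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
            # w_i <- w_i + eta * (d - y) * x_i
            w = [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]
    return w

# Logical AND is linearly separable, so learning converges.
data = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]
w = train_perceptron(data)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0 for x, _ in data]
print(preds)  # [0, 0, 0, 1]
```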

Page 13:

Problem with Perceptron
• What can the perceptron learn?
• Perceptron convergence theorem: the perceptron learning algorithm converges (in a finite number of updates) if and only if the training set is linearly separable.
• It cannot learn XOR (Minsky and Papert, 1969)

Page 14:

Multi-Layer Perceptron
• Add one hidden node between the inputs and the output

XOR truth table:

x1 | x2 | y (class)
 0 |  0 | 0
 1 |  0 | 1
 0 |  1 | 1
 1 |  1 | 0

Threshold unit:
y = \begin{cases} 1, & \text{if } \sum_i w_i x_i - T > 0, \\ 0, & \text{otherwise} \end{cases}
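Adding one hidden threshold node is enough to realise the XOR table above. The weights below are one hand-picked solution (my own, not from the slides): the hidden node computes AND, and the output node subtracts it from an OR-like sum:

```python
def step(z):
    return 1 if z > 0 else 0

def xor_mlp(x1, x2):
    h = step(x1 + x2 - 1.5)             # hidden node: fires only for (1, 1), i.e. AND
    return step(x1 + x2 - 2 * h - 0.5)  # output: OR minus twice the AND -> XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))  # 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0
```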

Page 15:

Neural Network
• Add more hidden layers
• Add more nodes to each hidden layer


Multilayer Perceptron (Neural Networks)
• Change one or two layers of nodes to three or more layers
  – Multilayer perceptron (MLP)
  – Feed-forward neural networks
  – Standard feed-forward networks: nodes in neighbouring layers are fully connected

[Figure: Input layer → Hidden layer → Hidden layer → Output layer]

  – Input layer: input patterns/features
  – Output layer: output patterns/class labels
  – Hidden layer(s): high-level features

Page 16:

Summary
• ANN has many killer applications
  – Image analysis, game playing, …
• Perceptron: the simplest neural network
• How to learn a perceptron
• Limitations of the perceptron, and the general neural network
• Reading: textbook Section 20.5 (2nd edition) or Section 18.7 (3rd edition), or web materials