egr 301 artificial neural networks - clark science centerjcardell/courses/egr301/... · 1 egr 301...

34
EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to use a backpropagation, feed- forward ANN. 2. Acquire some insight into how they work, their limitations, etc.

Upload: others

Post on 04-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

1

EGR 301

Artificial Neural

NetworksProf. Glenn Ellis

Spring 2005

Objectives

1. Ability to use a backpropagation, feed-forward ANN.

2. Acquire some insight into how they work, their limitations, etc.

Page 2: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

2

How do we teach a child to differentiate cats from dogs?

Expert Systems

Teach rules

Cats say meow. Dogs say woof.

Examples

MedicineWater treatment

Page 3: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

3

2. Test Show new pictures

Iterate

ANNs

1. Train Show exampleCompare child’s and actual answer Reward/Correct

Iterate

3. ApplyInteract with cats and dogs

2. Test Show new pictures Iterate

ANNs

1. Train Show exampleCompare child’s and actual answer Reward/Correct

Iterate

3. ApplyInteract with cats and dogs

1a. ValidatePre-test to see if we should stop training.

Page 4: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

4

Note

Need training, validation, test sets.

Ann as good as data set.

Ann learns relationships.

What can go wrong?

Bad ANN

Error in dataset

Not enough data

Not enough independent data

Not random sample

Apply outside domain

Page 5: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

5

What can go wrong?

Bad ANN

Error in dataset

Not enough data

Not enough independent data

Not random sample

Apply outside domain

What can go wrong?

Bad ANN

Error in dataset

Not enough data

Not enough independent data

Not random sample

Apply outside domain

Page 6: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

6

What can go wrong?

Bad ANN

Error in dataset

Not enough data

Not enough independent data

Not random sample

Apply outside domain

ANNs solve some classical AI problems

Pattern recognition

100 step constraint

Graceful degradation

Multiple soft constraints

Knowledge relevance

Page 7: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

7

Credit Card Application

How do we create an expert system?

Credit Card ApplicationExpert System – Interview experts and

decide on rules. Apply rules.

Page 8: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

8

Credit Card Application

How do we create an ANN?

Credit Card Application

1. Get data.

2. ANNs – Train, test and apply.

Page 9: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

9

Credit Card Application

Any ethical concerns?

Neuron: Gathers signals from synapses, processes, sends output

w1

w2

w3

b

Gather weighted inputs

Transfer function, usually sigmoid

I = Σwixi + b f(I) = (1+e-I)-1

f(I)x2

x1

x3

Page 10: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

10

What does sigmoid function look like?

f(I) = (1+e-I)-1

I

F(I)

What does sigmoid function look like?

f(I) = (1+e-I)-1

I

F(I)

10.50

Page 11: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

11

Create an ANN to check credit.

I

Inputs: ???

Outputs: ???

Notes on hidden layer

I

May have many layers.

Allows deeper (non-linear) learning.

Sees weighted inputs.

Get # of layers and neurons by trial and error, genetic algorithms, etc..

ROT for starting: # hidden neurons = (# inputs + # outputs) / 2.

Page 12: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

12

-8.0

+8.0 -5.4+6.1

+8 -8

-3.7

+3.7 -3

Not XOR

0,0 1 1,1 1 0,1 0 1,0 0

Do these weights work?

Where is the knowledge?

How do we get it?

I

Page 13: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

13

Supervised Training

I

1. Show ANN inputs

2. Compute output(s)

3. Compute error, Σ(output – target)2

4. Is error small enough? If yes, stop.

5. No, adjust weights (using backpropagation) and go back to (1).

How do we know it has learned something?

Page 14: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

14

y

x

Fit a line to this data.

Human attempt

y

x

Page 15: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

15

ANN after lots of training

y

x

What does this mean?

y

x

Generalized (some error)

Memorized (little error, overtrained)

Page 16: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

16

How do we know when to stop if this graph is in 130 dimensions?

y

x

Generalized (some error)

Memorized (little error, overtrained)

Test it on data it hasn’t seen.

y

x

Generalized (some error)

Memorized (little error, overtrained)

Page 17: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

17

error

# iterations

testing

training

Early Stopping

But this is sort of cheating, how?

error

# iterations

testing

training

Page 18: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

18

Is there over-training in this example?

MATLAB example with overtraining

f(t)ANN f(t) + noise

Page 19: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

19

With early stopping

Overtrained

What if it doesn’t do well in testing?

1. Overtrained

2. No underlying relationship Potsdam Water Treatment, Stamford Wastewater Treatment Plant

3. ANN can’t learn it.

4. Insufficient data.

Page 20: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

20

What if it doesn’t do well in testing?

1. Overtrained

2. No underlying relationship Potsdam Water Treatment, Stamford Wastewater Treatment Plant

3. ANN can’t learn it.

4. Insufficient data.Relate to dog/cat.

Cure.

With limited data, how much should be used for training and testing?

Answer: depends

What does putting more data into the training set get us?

What does putting more data into the testing set get us?

Page 21: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

21

With limited data, how much should be used for training and testing?

Answer: depends

What does putting more data into the training set get us? Higher chance that it learns.

What does putting more data into the testing set get us? Higher confidence that it has learned.

ROT

10 – 20 independent data points for each i/o neuron.

Page 22: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

22

ROT

10 – 20 independent data points for each i/o neuron.

ROT

10 – 20 independent data points for each i/o neuron.

90% of data in training set

10% of data in testing set

Page 23: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

23

-8.0

+8.0 -5.4+6.1

+8 -8

-3.7

+3.7 -3

Not XOR

0,0 1 1,1 1 0,1 0 1,0 0

Explain the knowledge it contains.

What can we do?

Vary one variable at a time and see how the output changes.

Page 24: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

24

Main Applications

1. Pattern recognition

Train by looking at many patterns

Examples: writing, speech, objects, seismograms

2. Function estimation

y

x

Y = f(x)

Page 25: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

25

2. Function estimation

y

x

Y = f(x)

2. Function estimation

y

x1

Y = f(x1,x2)

x2

Page 26: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

26

2. Function estimation

Y1-100 = f(x1-100)

Example: Fiber-reinforced concrete beams

Caesar's Palace

Page 27: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

27

Example: Fiber-reinforced concrete beams

13 variables

(dimensions, loading, material variables)

Strength

Most accurate method in world 10 years ago.

I have been doing this all of my life, and that damned thing knows more than I do.

Page 28: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

28

Most accurate method in world 10 years ago.

But, they’ll never use it.

GeographyGradeSexMinoritySSAT scoresInterview scoresLegacy

Boarding School Admissions

Admit

Waitlist

Reject

Page 29: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

29

Boarding School Admissions

Results

Highly accurate

Most important factor?GeographyGradeSexMinoritySSAT scoresInterview scoresLegacy

Boarding School Admissions

Results

Highly accurate

Most important factor?GeographyGradeSexMinoritySSAT scoresInterview scoresLegacy

Page 30: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

30

Ozone Water Disinfection

Dosing

Environmental conditions

Virus conc.

Results

More efficient than EPA techniques

Published in: Environmental Engineering ScienceFlorida AI International Conference

Size (square ft., #bathrooms, #bedrooms, #garages)Style (3 styles)Land (acres, pool, courts, lakefront, oceanfront)Location (9 neighborhoods)

Price ($)

Real Estate

Page 31: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

31

0

4

8

12

0 4 8 12

actual price (million $)

pred

icte

d pr

ice

(mill

ion

$)Test Set

Applications

Detect price trends

Isolate variables (value of saltwater frontage?)

Relate to secondary markets

Predict home improvement value

Appraisals

Page 32: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

32

Back Propagation Training

∆wij = - k Ewij

Go in direction to minimize error.

Learning rate

Change in error with respect to weight.

wij

E

If we start here, which way will the weight change? Where do we want to go? What problems may occur?

Start

Page 33: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

33

wij

E

If we start here, which way will the weight change?Where do we want to go? What problems may occur?

Negative slope, positive

weight change.

wij

E

If we start here, which way will the weight change? Where do we want to go? What problems may occur?

Local minimum

Wrong learning rate.

∆wij

Page 34: EGR 301 Artificial Neural Networks - Clark Science Centerjcardell/Courses/EGR301/... · 1 EGR 301 Artificial Neural Networks Prof. Glenn Ellis Spring 2005 Objectives 1. Ability to

34

If we start here, which way will the weight change? Where do we want to go? What problems may occur?

100X magnification

Add Momentum

∆wij(n) = - k Ewij

+ α ∆wij(n-1)

where 0 < α < 1

Advice

1. Start with a low learning rate.

2. More complicated architectures need lower learning rates.

3. Need momentum to get out of oscillations.

4. Over specified networks will get confused.