
Page 1:

Neural Networks: Representation

Non-linear hypotheses

Machine Learning

Page 2:

Non-linear Classification

[Figure: positive and negative training examples on the x1, x2 plane; the decision boundary is non-linear]

Page 3:

Computer Vision: Car detection

Testing: What is this?

[Figure: training images grouped into "Cars" and "Not a car"]

Page 4:

Learning Algorithm

[Figure: labeled "Cars" and “Non”-Cars raw images are fed to a learning algorithm; each image is summarized by its pixel 1 and pixel 2 intensities and plotted in the (pixel 1, pixel 2) plane]

Page 5:

[Figure: more labeled "Cars" and “Non”-Cars examples plotted by their pixel 1 and pixel 2 intensities and passed to the learning algorithm]

Page 6:

[Figure: "Cars" and “Non”-Cars examples plotted by their pixel 1 and pixel 2 intensities]

50 x 50 pixel images → 2500 pixels (7500 if RGB)

$x = [\text{pixel 1 intensity};\ \text{pixel 2 intensity};\ \ldots;\ \text{pixel 2500 intensity}]$

Quadratic features (all pairwise products $x_i \times x_j$): ≈3 million features
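As a quick check of that count, a minimal Octave sketch (counting all products $x_i x_j$ with $i \le j$, i.e. including squared terms):

n = 2500;                          % pixels in a 50 x 50 grayscale image
num_quadratic = n * (n + 1) / 2    % = 3,126,250, i.e. roughly 3 million features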

Page 7:

Page 8:

Page 9:

Neural Networks: Representation

Machine Learning

Model representation I

Page 10:

Neural Networks

Origins: algorithms that try to mimic the brain. Very widely used in the 80s and early 90s; popularity diminished in the late 90s. Recent resurgence: state-of-the-art technique for many applications.

Page 11:

Neuron in the brain

Page 12:

Neurons in the brain

[Credit: US National Institutes of Health, National Institute on Aging]

Page 13:

Neuron model: Logistic unit

Sigmoid (logistic) activation function: $g(z) = \dfrac{1}{1 + e^{-z}}$, so the unit outputs $h_\theta(x) = g(\theta^T x) = \dfrac{1}{1 + e^{-\theta^T x}}$.
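As an illustration, a single logistic unit in Octave (the input and weight values below are made up, not from the slide):

g = @(z) 1 ./ (1 + exp(-z));      % sigmoid (logistic) activation
x = [1; 0.5; -1.2; 3.0];          % inputs, with bias unit x0 = 1
theta = [-1.5; 2.0; 0.3; 0.8];    % weights ("parameters")
h = g(theta' * x)                 % output ("activation") of the unit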

Page 14:

Neural Network

$a_i^{(j)}$ = "activation" of unit $i$ in layer $j$
$\Theta^{(j)}$ = matrix of weights controlling the function mapping from layer $j$ to layer $j+1$

If the network has $s_j$ units in layer $j$ and $s_{j+1}$ units in layer $j+1$, then $\Theta^{(j)}$ will be of dimension $s_{j+1} \times (s_j + 1)$.
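A minimal Octave sketch of that dimension rule (the layer sizes are made-up examples):

s_j = 3;  s_j1 = 5;                 % units in layer j and layer j+1
Theta_j = zeros(s_j1, s_j + 1);     % the +1 column multiplies the bias unit
size(Theta_j)                       % prints: 5 4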

Page 15:

Page 16:

Neural Networks: Representation

Model representation II

Machine Learning

Page 17:

Forward propagation: Vectorized implementation

$z^{(2)} = \Theta^{(1)} x$,  $a^{(2)} = g(z^{(2)})$

Add $a_0^{(2)} = 1$.

$z^{(3)} = \Theta^{(2)} a^{(2)}$,  $h_\Theta(x) = a^{(3)} = g(z^{(3)})$
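A minimal Octave sketch of the vectorized steps above, with made-up sizes and random weights (purely illustrative):

g = @(z) 1 ./ (1 + exp(-z));     % sigmoid activation
x      = [1; 0.7; -0.4; 2.1];    % input with bias unit x0 = 1
Theta1 = randn(4, 4);            % maps layer 1 (3 units + bias) to layer 2 (4 units)
Theta2 = randn(1, 5);            % maps layer 2 (4 units + bias) to layer 3 (1 unit)

z2 = Theta1 * x;
a2 = [1; g(z2)];                 % add a0^(2) = 1
z3 = Theta2 * a2;
h  = g(z3)                       % h_Theta(x)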

Page 18:

Neural Network learning its own features

[Figure: three-layer network; Layer 1 (inputs), Layer 2 (hidden), Layer 3 (output)]

Page 19:

Other network architectures

[Figure: network with Layer 1, Layer 2, Layer 3, Layer 4 (two hidden layers)]

Page 20:

Page 21:

Neural Networks: Representation

Examples and intuitions I

Machine Learning

Page 22:

Non-linear classification example: XOR/XNOR

$x_1$, $x_2$ are binary (0 or 1).

$y = x_1\ \text{XNOR}\ x_2$

[Figure: positive and negative examples on the $x_1$, $x_2$ plane, requiring a non-linear decision boundary]

Page 23:

Simple example: AND

$h_\Theta(x) = g(-30 + 20x_1 + 20x_2)$

x1  x2  h_Θ(x)
0   0   g(-30) ≈ 0
0   1   g(-10) ≈ 0
1   0   g(-10) ≈ 0
1   1   g(10)  ≈ 1

Page 24:

Example: OR function

$h_\Theta(x) = g(-10 + 20x_1 + 20x_2)$

x1  x2  h_Θ(x)
0   0   g(-10) ≈ 0
0   1   g(10)  ≈ 1
1   0   g(10)  ≈ 1
1   1   g(30)  ≈ 1
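A minimal Octave check of the two units above, using the weights shown on the AND and OR slides:

g   = @(z) 1 ./ (1 + exp(-z));
AND = @(x1, x2) g(-30 + 20*x1 + 20*x2);
OR  = @(x1, x2) g(-10 + 20*x1 + 20*x2);
for x1 = 0:1
  for x2 = 0:1
    fprintf('%d %d | AND %.4f  OR %.4f\n', x1, x2, AND(x1, x2), OR(x1, x2));
  end
end
                                  % outputs match the boolean truth tables to ~1e-4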

Page 25:

Page 26:

Neural Networks: Representation

Examples and intuitions II

Machine Learning

Page 27:

Negation (NOT $x_1$):

$h_\Theta(x) = g(10 - 20x_1)$

x1  h_Θ(x)
0   g(10)  ≈ 1
1   g(-10) ≈ 0

Page 28:

Putting it together: $x_1$ XNOR $x_2$

Hidden unit $a_1^{(2)} = x_1$ AND $x_2$ (weights $-30, 20, 20$)
Hidden unit $a_2^{(2)} =$ (NOT $x_1$) AND (NOT $x_2$) (weights $10, -20, -20$)
Output $h_\Theta(x) = a_1^{(2)}$ OR $a_2^{(2)}$ (weights $-10, 20, 20$)

x1  x2  a1(2)  a2(2)  h_Θ(x)
0   0   0      1      1
0   1   0      0      0
1   0   0      0      0
1   1   1      0      1
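A minimal Octave sketch of this two-layer XNOR network, using the weights listed above:

g = @(z) 1 ./ (1 + exp(-z));
Theta1 = [-30  20  20;            % a1^(2): x1 AND x2
           10 -20 -20];           % a2^(2): (NOT x1) AND (NOT x2)
Theta2 = [-10  20  20];           % output: a1^(2) OR a2^(2)
for x1 = 0:1
  for x2 = 0:1
    a = g(Theta1 * [1; x1; x2]);  % hidden-layer activations
    h = g(Theta2 * [1; a]);       % output = x1 XNOR x2
    fprintf('%d %d -> %.4f\n', x1, x2, h);
  end
end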

Page 29:

Neural Network intuition

[Figure: four-layer network; Layer 1 through Layer 4]

Page 30:

Handwritten digit classification

[Courtesy of Yann LeCun]

Page 31:

Handwritten digit classification

[Courtesy of Yann LeCun]

Page 32:

Page 33:

Neural Networks: Representation

Multi-class classification

Machine Learning

Page 34:

Multiple output units: One-vs-all.

Pedestrian, Car, Motorcycle, Truck

Want $h_\Theta(x) \approx [1;0;0;0]$ when pedestrian, $h_\Theta(x) \approx [0;1;0;0]$ when car, $h_\Theta(x) \approx [0;0;1;0]$ when motorcycle, etc.

Page 35:

Multiple output units: One-vs-all.

Want $h_\Theta(x) \approx [1;0;0;0]$ when pedestrian, $[0;1;0;0]$ when car, $[0;0;1;0]$ when motorcycle, etc.

Training set: $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})$

$y^{(i)}$ is one of $[1;0;0;0]$, $[0;1;0;0]$, $[0;0;1;0]$, $[0;0;0;1]$
(pedestrian, car, motorcycle, truck)
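A minimal Octave sketch of these one-vs-all targets (the raw labels below are made up):

num_labels = 4;                 % pedestrian, car, motorcycle, truck
labels = [2; 1; 4; 3];          % example raw labels for m = 4 training examples
I = eye(num_labels);
Y = I(labels, :)                % row i is y^(i)', e.g. [0 1 0 0] for "car"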

Page 36:

Page 37:

Neural Networks: Learning

Cost function

Machine Learning

Page 38:

Neural Network (Classification)

Binary classification: $y \in \{0, 1\}$; 1 output unit.

[Figure: four-layer network; Layer 1 through Layer 4]

Multi-class classification ($K$ classes): $y \in \mathbb{R}^K$; $K$ output units.

$L$ = total no. of layers in the network
$s_l$ = no. of units (not counting the bias unit) in layer $l$

E.g. pedestrian, car, motorcycle, truck encoded as $[1;0;0;0]$, $[0;1;0;0]$, $[0;0;1;0]$, $[0;0;0;1]$.

Page 39:

Cost function

Logistic regression:

$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$

Neural network ($h_\Theta(x) \in \mathbb{R}^K$, with $(h_\Theta(x))_k$ the $k$-th output):

$J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[ y_k^{(i)} \log (h_\Theta(x^{(i)}))_k + (1 - y_k^{(i)}) \log\left(1 - (h_\Theta(x^{(i)}))_k\right) \right] + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}} \left(\Theta_{ji}^{(l)}\right)^2$
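A minimal Octave sketch of the unregularized part of the neural-network cost (H and Y below are made-up example values; the regularization term would be added separately):

m = 3;                                   % training examples
Y = [1 0 0; 0 1 0; 0 0 1; 0 0 0];        % K x m one-hot labels (K = 4)
H = [0.90 0.10 0.20;
     0.05 0.80 0.10;
     0.03 0.05 0.60;
     0.02 0.05 0.10];                    % K x m network outputs h_Theta(x^(i))
J = -(1/m) * sum(sum(Y .* log(H) + (1 - Y) .* log(1 - H)))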

Page 40:

Page 41:

Neural Networks: Learning

Gradient descent

Machine Learning

Page 42:

Have some function $J(\theta_0, \theta_1)$

Want $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$

Outline:

• Start with some $\theta_0, \theta_1$

• Keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$ until we hopefully end up at a minimum

Page 43:

[Figure: surface plot of $J(\theta_0, \theta_1)$ over the $(\theta_0, \theta_1)$ plane]

Page 44:

[Figure: another surface plot of $J(\theta_0, \theta_1)$ over the $(\theta_0, \theta_1)$ plane]

Page 45:

Gradient descent algorithm
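For reference, the standard gradient descent update (with learning rate $\alpha$, updating every $\theta_j$ simultaneously):

repeat until convergence {
    $\theta_j := \theta_j - \alpha \dfrac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$   (for $j = 0$ and $j = 1$)
}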

Page 46:

Page 47:

Neural Networks: Learning

Backpropagation algorithm

Machine Learning

Page 48:

Gradient computation

Need code to compute: $J(\Theta)$ and the partial derivatives $\dfrac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta)$

Page 49:

Backpropagation algorithm

Training set $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}$

Set $\Delta_{ij}^{(l)} = 0$ (for all $l, i, j$).

For $i = 1$ to $m$:

  Set $a^{(1)} = x^{(i)}$

  Perform forward propagation to compute $a^{(l)}$ for $l = 2, 3, \ldots, L$

  Using $y^{(i)}$, compute $\delta^{(L)} = a^{(L)} - y^{(i)}$

  Compute $\delta^{(L-1)}, \delta^{(L-2)}, \ldots, \delta^{(2)}$

  $\Delta_{ij}^{(l)} := \Delta_{ij}^{(l)} + a_j^{(l)} \delta_i^{(l+1)}$

$D_{ij}^{(l)} := \frac{1}{m} \Delta_{ij}^{(l)} + \lambda \Theta_{ij}^{(l)}$ if $j \neq 0$;  $D_{ij}^{(l)} := \frac{1}{m} \Delta_{ij}^{(l)}$ if $j = 0$
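A minimal Octave sketch of these steps for a network with one hidden layer (all sizes, data, and weights below are made-up illustrations; regularization is omitted):

g = @(z) 1 ./ (1 + exp(-z));
m = 5;  n = 3;  hidden = 4;  K = 2;
X = randn(m, n);                            % made-up inputs
labels = randi(K, m, 1);
I = eye(K);
Y = I(labels, :);                           % m x K one-hot labels
Theta1 = randn(hidden, n + 1) * 0.1;        % layer 1 -> layer 2
Theta2 = randn(K, hidden + 1) * 0.1;        % layer 2 -> layer 3
Delta1 = zeros(size(Theta1));
Delta2 = zeros(size(Theta2));

for i = 1:m
  a1 = [1; X(i, :)'];                       % input with bias unit
  z2 = Theta1 * a1;   a2 = [1; g(z2)];      % hidden layer with bias unit
  z3 = Theta2 * a2;   a3 = g(z3);           % output layer
  d3 = a3 - Y(i, :)';                       % delta^(3) = a^(3) - y^(i)
  d2 = (Theta2(:, 2:end)' * d3) .* (g(z2) .* (1 - g(z2)));   % delta^(2)
  Delta2 = Delta2 + d3 * a2';
  Delta1 = Delta1 + d2 * a1';
end
D1 = Delta1 / m;   D2 = Delta2 / m;         % unregularized gradients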

Page 50:

Page 51:

Neural Networks: Learning

Random initialization

Machine Learning

Page 52:

Initial value of $\Theta$

For gradient descent and advanced optimization methods, we need an initial value for initialTheta:

optTheta = fminunc(@costFunction, initialTheta, options)

Consider gradient descent. Set initialTheta = zeros(n,1)?

Page 53:

Zero initialization

After each update, the parameters corresponding to the inputs going into each of the two hidden units are identical, so both hidden units keep computing the same function of the inputs.

Page 54:

Random initialization: Symmetry breaking

Initialize each $\Theta_{ij}^{(l)}$ to a random value in $[-\epsilon, \epsilon]$ (i.e. $-\epsilon \le \Theta_{ij}^{(l)} \le \epsilon$). E.g.

Theta1 = rand(10,11) * (2*INIT_EPSILON) - INIT_EPSILON;
Theta2 = rand(1,11) * (2*INIT_EPSILON) - INIT_EPSILON;

Page 55:

Page 56:

Neural Networks: Learning

Putting it together

Machine Learning

Page 57:

Training a neural network

Pick a network architecture (connectivity pattern between neurons)

No. of input units: dimension of features
No. of output units: number of classes
Reasonable default: 1 hidden layer, or if >1 hidden layer, have the same no. of hidden units in every layer (usually the more the better)

Page 58:

Training a neural network

1. Randomly initialize weights
2. Implement forward propagation to get $h_\Theta(x^{(i)})$ for any $x^{(i)}$
3. Implement code to compute the cost function $J(\Theta)$
4. Implement backprop to compute the partial derivatives $\frac{\partial}{\partial \Theta_{jk}^{(l)}} J(\Theta)$

for i = 1:m
  Perform forward propagation and backpropagation using example $(x^{(i)}, y^{(i)})$
  (Get activations $a^{(l)}$ and delta terms $\delta^{(l)}$ for $l = 2, \ldots, L$).

Page 59:

Training a neural network

5. Use gradient checking to compare $\frac{\partial}{\partial \Theta_{jk}^{(l)}} J(\Theta)$ computed using backpropagation vs. a numerical estimate of the gradient of $J(\Theta)$. Then disable the gradient checking code.

6. Use gradient descent or an advanced optimization method with backpropagation to try to minimize $J(\Theta)$ as a function of the parameters $\Theta$.
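A minimal Octave sketch of the numerical estimate in step 5 (costFunc and theta below are stand-ins, not the functions from these slides):

costFunc = @(t) sum(t .^ 2);     % stand-in for the real cost; its gradient is 2*t
theta    = [1; -2; 0.5];         % stand-in for the unrolled parameter vector
EPSILON  = 1e-4;
numgrad  = zeros(size(theta));
for p = 1:numel(theta)
  perturb = zeros(size(theta));
  perturb(p) = EPSILON;
  numgrad(p) = (costFunc(theta + perturb) - costFunc(theta - perturb)) / (2 * EPSILON);
end
numgrad                          % ~[2; -4; 1], matching the analytic gradient
% In practice, compare numgrad against the backpropagation gradient, then
% disable this check: it is far too slow to run during training.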

Page 60:

Page 61:

Page 62:

Neural Networks: Learning

Backpropagation example: Autonomous driving

Machine Learning

Page 63:

[Courtesy of Dean Pomerleau]

Page 64:

[Courtesy of Dean Pomerleau]

Page 65:

Page 66:

Neural Networks: Feature Learning

Autoencoder

Machine Learning

Page 67:

Autoencoders (and sparsity)

Page 68:

Sparse autoencoders

Page 69:

Unsupervised Feature Learning/Self-taught learning

Page 70:

Unsupervised pre-training + Fine-tuning

Page 71:

Page 72:

Neural Networks: Feature Learning

Deep Learning

Machine Learning

Page 73:

Unsupervised pre-training + Fine-tuning

Page 74:

Page 75: