Artificial Neural Network
TRANSCRIPT
3
Introduction
History
1943: M-P neuron
1957: Perceptron
1974: Back Propagation
2006: Deep Neural Network
5
Neural Network
Single Layer Perceptron
[Figure: single-layer perceptron, inputs $x_1, \dots, x_i, \dots, x_n$ with weights $w_1, \dots, w_i, \dots, w_n$, a constant input 1 weighted by the bias $b_1$, a summing junction $s$, and an activation $f$ producing the output $y_1$]

$$s = \sum_{i=1}^{n} x_i w_i + b_1, \qquad y = f(s)$$

where $f$ is the activation function (a step function in the classical perceptron).
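The weighted sum and activation above can be sketched in a few lines (a minimal sketch; the unit-step choice for $f$ is an assumption, since the slide only names the activation):

```python
def perceptron(x, w, b):
    """Single-layer perceptron: s = sum_i x_i * w_i + b, then y = f(s)."""
    s = sum(xi * wi for xi, wi in zip(x, w)) + b   # weighted sum plus bias
    return 1 if s >= 0 else 0                      # f: unit step activation (assumed)

perceptron([1, 0], [0.5, 0.5], -0.25)  # s = 0.25 >= 0, so y = 1
```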
6
Neural Network
Single Layer Perceptron
• Example: AND

[Figure: two-input perceptron, inputs $x_1, x_2$, weights $w_{11}, w_{12}$, bias $b_1$, summing junction $s$, activation $f$, output $y_1$]

$$y_1 = f(u_1), \qquad u_1 = w_{11}x_1 + w_{12}x_2 + b_1$$

The decision boundary $u_1 = 0$ is the line
$$x_2 = -\frac{w_{11}}{w_{12}}x_1 - \frac{b_1}{w_{12}} \;\Rightarrow\; x_2 = -x_1 + 1.5$$
(here $w_{11} = w_{12} = 1$ and $b_1 = -1.5$), which separates the AND-true point $(1,1)$ from the other three inputs.
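With the weights implied by the boundary $x_2 = -x_1 + 1.5$ (namely $w_{11} = w_{12} = 1$, $b_1 = -1.5$), the perceptron reproduces AND; a quick check:

```python
def and_perceptron(x1, x2, w11=1.0, w12=1.0, b1=-1.5):
    s = w11 * x1 + w12 * x2 + b1        # u1 = w11*x1 + w12*x2 + b1
    return 1 if s >= 0 else 0           # step activation

# Only (1, 1) lies on the positive side of the line x2 = -x1 + 1.5.
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), and_perceptron(x1, x2))  # prints 0, 0, 0, 1
```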
7
Neural Network
Single Layer Perceptron
• Example: XOR Problem (Minsky, M.; Papert, S. (1969). "Perceptrons: An Introduction to Computational Geometry". MIT Press.)

[Figure: the same two-input perceptron, inputs $x_1, x_2$, weights $w_{11}, w_{12}$, bias $b_1$, output $y_1$]

XOR is not linearly separable: no single line $w_{11}x_1 + w_{12}x_2 + b_1 = 0$ puts $(0,1)$ and $(1,0)$ on one side and $(0,0)$ and $(1,1)$ on the other, so a single-layer perceptron cannot learn it.
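The impossibility can be demonstrated numerically with a brute-force search (an illustrative sketch; the grid range is arbitrary):

```python
import itertools

step = lambda s: 1 if s >= 0 else 0
xor = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Try every weight/bias combination on a coarse grid: no line separates
# XOR's classes, so no setting classifies all four points correctly.
grid = [i / 10 for i in range(-20, 21)]          # -2.0 .. 2.0 in steps of 0.1
best = max(
    sum(step(w1 * x1 + w2 * x2 + b) == y for (x1, x2), y in xor.items())
    for w1, w2, b in itertools.product(grid, repeat=3)
)
print(best)  # 3: at most three of the four XOR points, never all four
```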
8
Neural Network
Multi Layer Perceptron (MLP)

[Figure: MLP with input layer $i$ ($x_1, \dots, x_i, \dots, x_{N_I}$), hidden layer $j$ (weights $w^1_{ij}$, bias $b_j$, outputs $y^1_j$), and output layer $k$ (weights $w^2_{jk}$, bias $b_k$, outputs $y^2_1, \dots, y^2_k, \dots, y^2_{N_O}$)]
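A forward pass through this two-layer structure can be sketched as follows (sigmoid activation is an assumption; the slides leave $f$ unspecified, but the later $f'(s)$ terms presuppose a differentiable choice):

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def mlp_forward(x, w1, b1, w2, b2):
    """x: inputs (length N_I); w1[i][j]: input->hidden; w2[j][k]: hidden->output."""
    # Hidden layer: y1_j = f(sum_i w1_ij * x_i + b_j)
    y1 = [sigmoid(sum(w1[i][j] * x[i] for i in range(len(x))) + b1[j])
          for j in range(len(b1))]
    # Output layer: y2_k = f(sum_j w2_jk * y1_j + b_k)
    y2 = [sigmoid(sum(w2[j][k] * y1[j] for j in range(len(y1))) + b2[k])
          for k in range(len(b2))]
    return y1, y2

# Illustrative weights: 2 inputs, 2 hidden units, 1 output.
y1, y2 = mlp_forward([1.0, 0.0],
                     [[0.5, -0.5], [0.3, 0.8]], [0.1, -0.1],
                     [[1.0], [-1.0]], [0.0])
```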
10
Back Propagation
Back Propagation (Werbos: 1974, Parker: 1982)

[Figure: the same MLP, now with desired outputs $y_{d_k}$ compared against the network outputs $y_k$ to form the errors $e_k$ that are propagated backward]
11
Back Propagation
Back Propagation (Werbos: 1974, Parker: 1982)
• Concept of gradient descent algorithm

Measurement process:
$$e(t) = y_d(t) - y(t)$$

Update process:
$$w(t+1) = w(t) + \Delta w(t)$$
12
Back Propagation
Back Propagation (Werbos: 1974, Parker: 1982)
• Gradient descent algorithm

[Figure: error curve $E(w)$ with successive weights $w_0, w_1, w_2$ descending toward the minimum $E_{min}$ at $w^*$]

Objective function:
$$E = \frac{1}{2}\sum_{k=1}^{N_O} e_k^2$$

The gradient:
$$\Delta w(t) = -\eta\,\frac{\partial E}{\partial w}$$

Weight update:
$$w(t+1) = w(t) + \Delta w(t)$$
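The descent down the error curve can be illustrated on a one-dimensional quadratic (an illustrative toy objective, not the network's actual $E$):

```python
# Gradient descent on E(w) = 0.5 * (w - w_star)**2, whose minimum is at w_star.
eta, w_star = 0.1, 3.0
w = 0.0                                   # w0: arbitrary starting point
for _ in range(200):
    grad = w - w_star                     # dE/dw
    w = w + (-eta * grad)                 # w(t+1) = w(t) + Δw(t), Δw = -η dE/dw
print(w)                                  # approaches w_star = 3.0
```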
13
Back Propagation
Back Propagation (Werbos: 1974, Parker: 1982)
• Example
14
Back Propagation
Back Propagation (Werbos: 1974, Parker: 1982)

Update process (with momentum term $\alpha\,\Delta w(t-1)$):
$$w(t+1) = w(t) + \Delta w(t) + \alpha\,\Delta w(t-1)$$

Measurement process:
$$\Delta w_{jk}(t) = \eta\, e_k f'(s_k)\, y_j$$
$$\Delta b_k(t) = \eta\, e_k f'(s_k)$$
$$\Delta w_{ij}(t) = \eta\, f'(s_j)\, x_i \sum_{k=1}^{N_O} e_k f'(s_k)\, w_{jk}$$
$$\Delta b_j(t) = \eta\, f'(s_j) \sum_{k=1}^{N_O} e_k f'(s_k)\, w_{jk}$$
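These four update equations can be transcribed directly into code (a sketch: sigmoid is assumed for $f$ since the slides leave the activation open, and the name `bp_deltas` is illustrative):

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def fprime(s):
    y = sigmoid(s)
    return y * (1.0 - y)                 # derivative of the sigmoid, f'(s)

def bp_deltas(x, s_hidden, y_hidden, s_out, e, w2, eta):
    """One application of the measurement-process equations above.
    x: inputs; s_*: pre-activations; y_hidden: hidden outputs;
    e: output errors e_k = yd_k - y_k; w2[j][k]: hidden->output weights."""
    NO, NH, NI = len(e), len(y_hidden), len(x)
    # Δw_jk = η e_k f'(s_k) y_j  and  Δb_k = η e_k f'(s_k)
    dw2 = [[eta * e[k] * fprime(s_out[k]) * y_hidden[j] for k in range(NO)]
           for j in range(NH)]
    db2 = [eta * e[k] * fprime(s_out[k]) for k in range(NO)]
    # Back-propagated sum: Σ_k e_k f'(s_k) w_jk
    back = [sum(e[k] * fprime(s_out[k]) * w2[j][k] for k in range(NO))
            for j in range(NH)]
    # Δw_ij = η f'(s_j) x_i Σ_k ...  and  Δb_j = η f'(s_j) Σ_k ...
    dw1 = [[eta * fprime(s_hidden[j]) * x[i] * back[j] for j in range(NH)]
           for i in range(NI)]
    db1 = [eta * fprime(s_hidden[j]) * back[j] for j in range(NH)]
    return dw1, db1, dw2, db2
```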
16
Implementation
XOR Classification ($\eta = 0.9$)
- Parameter
- Output
- Classification
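A complete training run matching this slide can be sketched as below (assumptions: a 2-2-1 architecture, sigmoid activations, online updates, and no momentum term; only $\eta = 0.9$ comes from the slide):

```python
import math, random

random.seed(0)
eta = 0.9                                    # learning rate from the slide

def f(s):                                    # sigmoid activation (assumed)
    return 1.0 / (1.0 + math.exp(-s))

# 2-2-1 network with small random initial weights (architecture assumed).
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
b1 = [random.uniform(-1, 1) for _ in range(2)]
w2 = [[random.uniform(-1, 1)] for _ in range(2)]
b2 = [random.uniform(-1, 1)]

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def forward(x):
    s1 = [w1[0][j] * x[0] + w1[1][j] * x[1] + b1[j] for j in range(2)]
    y1 = [f(s) for s in s1]
    s2 = [sum(w2[j][0] * y1[j] for j in range(2)) + b2[0]]
    return y1, [f(s) for s in s2]

def total_error():                           # E = 1/2 Σ e_k^2 over all patterns
    return sum((yd - forward(x)[1][0]) ** 2 for x, yd in data) / 2

e0 = total_error()
for _ in range(5000):
    for x, yd in data:
        y1, y2 = forward(x)
        ek = yd - y2[0]
        dk = ek * y2[0] * (1 - y2[0])        # e_k f'(s_k)
        for j in range(2):                   # hidden deltas use the old w2
            dj = y1[j] * (1 - y1[j]) * dk * w2[j][0]
            for i in range(2):
                w1[i][j] += eta * dj * x[i]  # Δw_ij = η f'(s_j) x_i Σ_k ...
            b1[j] += eta * dj
        for j in range(2):
            w2[j][0] += eta * dk * y1[j]     # Δw_jk = η e_k f'(s_k) y_j
        b2[0] += eta * dk

print(e0, total_error())                     # initial vs. final squared error
```

The error typically drops toward zero; for some initializations training stalls in a local minimum, which is exactly the problem the Deep Learning slides raise next.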
20
Deep Learning
Local minima problem
- Local minima: gradient descent can settle in a poor minimum of E instead of reaching E_min
- Unsupervised learning => pre-training (layer-wise initialization before supervised fine-tuning)
1943MP
1957P
1974BP
2006DNN
21
Deep Learning
Deep Neural Network
Leon A. Gatys, Alexander S. Ecker, Matthias Bethge. "A Neural Algorithm of Artistic Style".