Recurrent Neural Network (遞迴式神經網路)
TRANSCRIPT
RNN
蔡炎龍, Department of Applied Mathematics, National Chengchi University
AI Academy (人工智慧學校)
1. The three major neural networks
The three heavyweights of deep learning: the standard NN, the CNN, and the RNN.
Honestly, no kidding.
The standard neural network.
Convolutional Neural Network (CNN)

Each filter looks for one feature: it sweeps over every position and records the "feature strength" around that point. (Diagram: an input scanned by filter 1 and filter 2.)
Recurrent Neural Network (RNN)

$h_t^1 = \sigma\left(w_1 x_t^1 + w_2 x_t^2 + w_3 h_{t-1}^3 + w_4 h_{t-1}^4 + b_1\right)$

The previous step's outputs are fed back in as part of this step's inputs.
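To make the formula concrete, here is a minimal NumPy sketch of one such cell step; the weights, inputs, and previous outputs are made-up values for illustration, not taken from the slides.

```python
# One step of a vanilla RNN cell: h_t = sigma(w . [x_t, h_{t-1}] + b).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x_t    = np.array([0.5, -0.2])            # this step's inputs (x_t^1, x_t^2)
h_prev = np.array([0.1, 0.3])             # last step's outputs (h_{t-1}^3, h_{t-1}^4)
w      = np.array([1.0, -1.0, 0.5, 0.5])  # w_1 .. w_4 (made-up values)
b      = 0.1                              # b_1

h_t = sigmoid(w @ np.concatenate([x_t, h_prev]) + b)
print(h_t)  # a single scalar: this cell's output at time t
```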
In the end, we just want to learn a function f.
Step 1: Ask a question first. I see an animal in the wild and want to know what it is.
Step 2: Cast it in the form of a function: f(photo) = "Formosan black bear".
Step 3: Prepare training data, e.g. (photo, "Formosan black bear"), (photo, "python"), ...

Training data: $(x_1, y_1), \ldots, (x_k, y_k)$; test data: $(x_{k+1}, y_{k+1}), \ldots, (x_n, y_n)$.
Step 4: Design our neural network. Besides the standard NN, CNN, and RNN, there are also reinforcement learning, generative adversarial networks (GAN), VAE, and capsule networks.
Once the network is built, only some parameters are left to tune: $\theta = \{w_i, b_j\}$. Deciding the values of these parameters produces a concrete function.
Step 5: Learning. We send our training data through the network, adjust the parameters, and use a loss function to see how far we are from the observed values.
Basically this is gradient descent; because of the structure of neural networks, it goes by the name backpropagation.
2. An introduction to RNNs
Recurrent Neural Network
An RNN is a neural network with memory.
In an ordinary neural network, one input has no relation to the next...
An RNN quietly takes the previous output and uses it as part of the current input.
Many people draw it like this.
What one RNN cell actually does:

$h_t^1 = \sigma\left(w_1 x_t^1 + w_2 x_t^2 + w_3 h_{t-1}^3 + w_4 h_{t-1}^4 + b_1\right)$

Note that each cell's output is still a scalar.
3. Applications of RNNs
Chatbots: f(current character) = next character.
Chatbot application (diagram: the input characters 字1, 字2, ..., then EOS go in; the replies 回1, 回2, ..., 回k, then EOS come out one at a time).

Note that in this setup, neither the inputs nor the outputs have a fixed length!
In fact the input need not be text: video (one image after another) works too. The output can still be text; the most common use is having the computer describe what happens in the video.
Applications of the same form: translation; video captioning (generating a description of a video); generating a passage of text; completing a half-drawn picture.
Andrej Karpathy generated text imitating "Stacks", an introduction to algebraic geometry: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
PANDARUS: Alas, I think he shall come approached within a day, when the little wheat becomes what is never eaten; who is his chain of death and his subjects, I should not sleep.

Second Senator: They are away from these miseries produced in my heart; when I perish I should bury, strong, the earth and the thoughts of many states.

Computer-imitated Shakespeare.
Predicting MLB players' home-run counts (by 孫瑄正, 邢恒毅, 劉亮緯, 曾煜祐, 李錦藤).
f(a player's year t−1 statistics) = home runs in year t, using 15 features: [Age, G, PA, AB, R, H, 2B, 3B, HR, RBI, SB, BB, SO, OPS+, TB].
Using an LSTM: feed in 10 years of data and guess the next year, with only one LSTM layer!
Don't guess the exact number; guessing a range is enough! Five bins: 0-9, 10-19, 20-29, 30-39, 40+.
One-hot encoding: the five bins 0-9, 10-19, 20-29, 30-39, 40+ are numbered 1 through 5, so the bin 10-19 becomes

$\begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$
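A quick sketch of this encoding with Keras's to_categorical (assuming Keras 2, with the bins indexed 0 through 4):

```python
# One-hot encode the bin "10-19" (index 1) among five bins.
from keras.utils import to_categorical

bins = ["0-9", "10-19", "20-29", "30-39", "40+"]
label = bins.index("10-19")                  # -> 1
print(to_categorical(label, num_classes=5))  # -> [0. 1. 0. 0. 0.]
```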
2017 predictions (made in June 2016):

Mike Trout (LAA): predicted 30-39, actual 33
Mookie Betts (BOS): predicted 20-29, actual 24
Jose Altuve (HOU): predicted 20-29, actual 24
Kris Bryant (CHC): predicted 30-39 (second highest: 20-29), actual 29
Daniel Murphy (WSH): predicted 20-29, actual 23
Corey Seager (LAD): predicted 20-29, actual 22
4. What a basic RNN looks like
Note: to keep things easy to follow, we will use simplified diagrams. Keep in mind that every input is a vector and comes with weights, while every output is a scalar. The input might look like this:

$x_t = \begin{bmatrix} x_{t,1} \\ x_{t,2} \\ \vdots \\ x_{t,n} \end{bmatrix}$
What the actual input looks like (same simplified-diagram caveat as above).
Likewise for h.
RNN cell number k: for the k-th cell in a layer there is exactly one output, while the input may be a vector.
How the connections really look: note that an RNN cell passes its memory on to the other RNN cells in the same layer.
The output of a single cell in a standard RNN.
5. An example even more toy than a toy
Let's build a standard RNN application: predicting the next character of a sentence. The setup: a neural network built with an RNN maps a character to the (next) character.
The character set we consider has only three characters: "好", "一", "點". They can generate "sentences" such as "好一點", "好好", "一點一點點", and so on.
Encode the character set {"好", "一", "點"} with fancy one-hot encoding.
The architecture.
Input: "好一點".
The architecture (diagram: the toy network with example parameter values 1, 1, 1, −1, −1, −1, 1, 1 and 0.2, 0.2, 0.3, 0.7).
(Diagram: the first character goes in; the values 0.5, 0.5, 0.5, 0.3, 0.2, 1, −1, 0.3, 1 appear on the cells and connections.)
(Same diagram: after the first input, the three cell outputs are 0.82086199, 0.8602775, 1.16130798.)
Concept: softmax. The raw outputs 0.82086199, 0.8602775, 1.16130798 are not probabilities; we want values that add up to 1. Softmax does exactly that: $z_i \mapsto e^{z_i} \big/ \sum_j e^{z_j}$.
Next, feed in the second character.
(Diagram: the same architecture and parameter values as before, now processing the second character.)
After the second input, the three cell outputs become 0.8970793, 0.75862845, 1.40826454.
Applying softmax once more, the RNN now outputs

$\begin{bmatrix} 0.28264779 \\ 0.24610313 \\ 0.47124908 \end{bmatrix}$
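As a sanity check, a small NumPy computation reproduces that vector from the pre-softmax outputs above:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

out = softmax(np.array([0.8970793, 0.75862845, 1.40826454]))
print(out)        # -> [0.28264779 0.24610313 0.47124908]
print(out.sum())  # -> 1.0, as desired
```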
6. Problems with RNNs
Vanishing gradients.
7. LSTM
LSTM, Long Short-Term Memory: the ace reliever of the RNN family.
LSTM cell number k: it adds an extra "cell state".
Key concept: the gate, a control valve.
Key point: a gate outputs a single number between 0 and 1 (via a sigmoid); it merely decides how far the "valve" opens.
An LSTM has three gates.
The forget gate, the input gate, and the output gate.
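For reference, a conventional write-up of the three gates' equations (these standard formulas are an addition here, not shown as text on the slides; $\odot$ denotes elementwise multiplication):

$$
\begin{aligned}
f_t &= \sigma(W_f\,[h_{t-1}, x_t] + b_f) && \text{forget gate}\\
i_t &= \sigma(W_i\,[h_{t-1}, x_t] + b_i) && \text{input gate}\\
o_t &= \sigma(W_o\,[h_{t-1}, x_t] + b_o) && \text{output gate}\\
\tilde{c}_t &= \tanh(W_c\,[h_{t-1}, x_t] + b_c) && \text{candidate cell state}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{new cell state}\\
h_t &= o_t \odot \tanh(c_t) && \text{output}
\end{aligned}
$$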
A quick aside: tanh and sigmoid.
(Plot: tanh, with values ranging from −1 to 1.)
(Plot: the sigmoid, with values ranging from 0 to 1.)

$\sigma(x) = \dfrac{1}{1 + e^{-x}}$
LSTM once more, cell number k: the extra cell state belongs to this cell alone, but the cell's output is still shared with the rest of its layer.
The "new" state information.
Does it really have to be this complicated? Can't "how much to forget" and "how much to remember" be decided together...?
8. GRU
A simplified version of the LSTM.
Only two gates are kept (even though the name says "gated").
The memory (update) gate and the reset gate.
The input can choose to disregard the past.
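For reference, a conventional write-up of the GRU updates (again an addition, not from the slides; $z_t$ plays the memory/update role and $r_t$ the reset role):

$$
\begin{aligned}
z_t &= \sigma(W_z\,[h_{t-1}, x_t])\\
r_t &= \sigma(W_r\,[h_{t-1}, x_t])\\
\tilde{h}_t &= \tanh(W\,[r_t \odot h_{t-1},\, x_t])\\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
$$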
Key point: the name "RNN". These days "RNN" really covers the original RNN, LSTM, GRU, and other variants. To single out the original one, we usually call it the vanilla RNN; in Keras it is SimpleRNN.
Hands-on: sentiment analysis.
9. IMDB reviews
Positive review or negative review?
Tip: specifying the backend. In a Jupyter notebook we can tell Keras which backend to use:

%env KERAS_BACKEND=tensorflow
Tip: loading the IMDB dataset. Keras ships the IMDB reviews among its built-in datasets:

from keras.datasets import imdb
Key point: loading the data. Datasets Keras has prepared are all read in with load_data.

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=10000)

Here we restrict to the 10,000 most common words. The training and test sets each contain a generous 25,000 reviews.
Explanation: what the data looks like. One input sample:

[1, 1230, 3765, 566, 97, 189, 102, 86, ...]

Each number stands for one word; the whole list is one review.
Caution: review lengths. The reviews obviously differ in length!

for i in range(10):
    print(len(x_train[i]), end=', ')

Output: 218, 189, 141, 550, 147, 43, 123, 562, 233, 130
10. Input preprocessing
Key point: input preprocessing.

from keras.preprocessing import sequence
Key point: pad or truncate every review to the same length.

x_train = sequence.pad_sequences(x_train, maxlen=100)
x_test = sequence.pad_sequences(x_test, maxlen=100)
Key point: the lengths are now all the same!

x_train.shape

Output: (25000, 100)
11. Building the RNN
Deciding the architecture: first compress the 10,000-dimensional word space down to 128 dimensions, then use a layer of 128 LSTM cells, and finally send the single output out through a sigmoid.
Key point: import the functions.

from keras.models import Sequential
from keras.layers import Dense, Embedding
from keras.layers import LSTM
Key point: open a new neural network.

model = Sequential()
Key point: the Embedding layer.

model.add(Embedding(10000, 128))
Key point: the LSTM layer.

model.add(LSTM(150))
Key point: the Dense (output) layer.

model.add(Dense(1, activation='sigmoid'))
12. Compiling
Key point: compile.

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
13. Training
Key point: train.

model.fit(x_train, y_train, batch_size=32, epochs=15)
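Putting all the pieces above together, the whole example as one runnable sketch (same layers and settings as on the slides, assuming the classic Keras API):

```python
from keras.datasets import imdb
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM

# Load the data, keeping the 10,000 most common words.
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=10000)
x_train = sequence.pad_sequences(x_train, maxlen=100)
x_test = sequence.pad_sequences(x_test, maxlen=100)

model = Sequential()
model.add(Embedding(10000, 128))           # 10,000 word ids -> 128 dims
model.add(LSTM(150))                       # a single LSTM layer
model.add(Dense(1, activation='sigmoid'))  # positive vs. negative
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=15)
```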
14. Results
Key point: evaluate on the test data.

score = model.evaluate(x_test, y_test)
15. Results (continued)
Key point: the scores.

print('Test loss:', score[0])
print('Test accuracy:', score[1])
Key point: save the results.

model_json = model.to_json()
open('imdb_model_architecture.json', 'w').write(model_json)
model.save_weights('imdb_model_weights.h5')
Tuning.
16. A review of the basics
Learning means sending our training data through the network and adjusting the parameters $\theta = \{w_i, b_j\}$, then using a loss function to see how far we are from the observed values.
To simplify, first think of L as a function of just one variable w.
(Plot: the loss curve, with the current value of w marked.) How do we walk to the minimum?
I know: the point a should move to the right!
But how does the computer "see" that?
The tangent line is the key!
Tangent slope < 0.
Tangent slope > 0.
The tangent slope points toward the (local) maximum!!
Notation: the tangent slope at the point a is written $L'(a)$.
The tangent slope is a rate of change.
For an arbitrary w, we write the tangent-slope function using the derivative notation $L'(w)$. Concise is stylish!
Its sign points toward the (local) maximum.
Key point: moving toward the (local) minimum. Since the slope points toward the maximum, to move w toward the minimum the new w should become $w - L'(w)$.
Key point, by example: if we are currently at w = a, adjust to $a - L'(a)$.
(Diagram: the adjustment step along the curve, with the tangent slope marked.)
Sometimes it overshoots!
Key point: to avoid adjusting too much in one step, we multiply by a small number called the learning rate η: $w \leftarrow w - \eta\, L'(w)$.
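A tiny sketch of this update rule on a made-up one-variable loss (the loss $L(w) = (w-3)^2$ is an assumption for illustration, not from the slides):

```python
# Gradient descent: w <- w - eta * L'(w), on L(w) = (w - 3)**2.
def dL(w):
    return 2 * (w - 3)

w, eta = 0.0, 0.1
for _ in range(50):
    w = w - eta * dL(w)
print(w)  # close to the minimizer w = 3
```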
Definition: partial derivative. For a function L(w₁, w₂, b₁) of three variables, the partial derivative $\frac{\partial L}{\partial w_1}$ is the derivative taken with only w₁ treated as a variable, holding w₂ and b₁ constant.
Definition, continued: $\frac{\partial L}{\partial w_2}$ and $\frac{\partial L}{\partial b_1}$ are defined the same way.
Notation: the gradient. The gradient of L collects the partial derivatives: $\left( \frac{\partial L}{\partial w_1}, \frac{\partial L}{\partial w_2}, \frac{\partial L}{\partial b_1} \right)$.
Naturally the gradient gets a cool symbol: $\nabla L$.
Our rule for adjusting w₁, w₂, b₁ then becomes $\theta \leftarrow \theta - \eta \nabla L(\theta)$.
As everyone knows, this is called gradient descent (梯度下降).
Basically, if you shrink the learning rate you can expect training to converge, provided you have plenty of time...
17. Basic statistics
Assumption: everyone surely already knows the mean.
Example: computing a mean.

$\mu = 4\cdot\tfrac{1}{5} + 4\cdot\tfrac{1}{5} + 8\cdot\tfrac{1}{5} + 5\cdot\tfrac{1}{5} + 9\cdot\tfrac{1}{5}$
Example, regrouped:

$\mu = 4\cdot\tfrac{2}{5} + 8\cdot\tfrac{1}{5} + 5\cdot\tfrac{1}{5} + 9\cdot\tfrac{1}{5}$

Each fraction is the probability with which that number appears.
We can think of X as a random variable whose possible values are 4, 8, 5, and 9.
In fact we are simply computing this random variable's expected value.
Key point: computing the variance. Everyone surely also remembers that the variance is the average of the squared distances from the mean:

$\dfrac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$
Key point: viewed through random variables again, it takes this fancy-looking form: $\operatorname{Var}(X) = E\left[(X - \mu)^2\right]$.
Key property of the variance: if X and Y are two independent random variables, then $\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)$.
If X and Y are not only independent but also both have mean 0, we get the even flashier $\operatorname{Var}(XY) = \operatorname{Var}(X)\operatorname{Var}(Y)$.
Standard deviation: the variance of X is therefore often written $\sigma^2$, making $\sigma$ the standard deviation.
Standard normal distribution: the normal distribution with mean 0 and variance 1. If a random variable X follows it, we write $X \sim N(0, 1)$.
Key point: with NumPy. NumPy has a command that draws a number at random from the standard normal distribution:

randn()

It lives under np.random.
Take n samples:

randn(n)
To change the mean to µ and the standard deviation to σ:

randn(n)*σ + µ
Conversely, if X follows a normal distribution with mean µ and standard deviation σ, we convert it to a standard normal with:

(X - µ)/σ
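A quick check of both directions of this scaling:

```python
import numpy as np
from numpy.random import randn

x = randn(100000) * 2.0 + 5.0   # now mean ~5, standard deviation ~2
print(x.mean(), x.std())
z = (x - 5.0) / 2.0             # standardized back to ~N(0, 1)
print(z.mean(), z.std())
```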
18. Before tuning: the things that matter more than tuning
Ask a good question. The problem you want to solve can usually be posed in different ways; the first formulation almost always needs adjusting, and you must consider whether enough data is available.
Build your neural network.
19. Initialization
Basically, choosing the weights more or less at random is fine, although we can make it sound quite scholarly.
Key point: principles for choosing the weights and other parameters. Don't set them all to 0 (the reason should be obvious), and take smallish values. That counts as scholarly!?
Key point: match the initializer to the activation.

| activation | initializer |
| --- | --- |
| sigmoid, tanh | Xavier |
| ReLU | He |
| SELU | LeCun |
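A hedged sketch of wiring these choices up in Keras (assuming Keras 2's kernel_initializer argument; the layer sizes are made up):

```python
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(64, activation='tanh', input_dim=20,
                kernel_initializer='glorot_normal'))   # Xavier
model.add(Dense(64, activation='relu',
                kernel_initializer='he_normal'))       # He
model.add(Dense(64, activation='selu',
                kernel_initializer='lecun_normal'))    # LeCun
model.add(Dense(1, activation='sigmoid'))
```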
This looks very scholarly, but it only says that the variance should not keep growing from layer to layer (so that no value blows up right at the start).
Suppose one layer of our network has $n = n_{\text{in}}$ inputs, with the weights written $W_1, \ldots, W_n$ and the inputs written $X_1, \ldots, X_n$.
The total input this layer receives is $Y = W_1 X_1 + W_2 X_2 + \cdots + W_n X_n$.
If everything that should be independent is independent, and all the means are 0, we have $\operatorname{Var}(Y) = \sum_i \operatorname{Var}(W_i)\operatorname{Var}(X_i)$. If the $W_i$ and the $X_i$ are each essentially identical, say $\operatorname{Var}(W_i) = \operatorname{Var}(W)$ and $\operatorname{Var}(X_i) = \operatorname{Var}(X)$, then:
$\operatorname{Var}(Y) = n \operatorname{Var}(W) \operatorname{Var}(X)$

That is, the larger n is, the larger the variance...
So we can draw the weights from a distribution with mean 0 and variance $1/n$: Xavier initialization (original paper: http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf).
ReLU: for ReLU, being too close to 0 is actually bad.
It turns out a bit larger is all it takes: He initialization, with variance $2/n$ (https://arxiv.org/abs/1502.01852).
SELU: LeCun initialization (https://arxiv.org/abs/1706.02515), with upper and lower limits imposed so the values stay within a fixed range.
20. Regularization
Dropout (diagram: the network with dropout applied, randomly dropping units during training).
L2 regularization: add a penalty to the loss, i.e. loss function + $\lambda \sum_i w_i^2$.
Key point: setting a regularizer. In Keras a regularizer can be set on each layer: for example, after importing l2 from keras.regularizers, add the following argument to a layer (this is the old Keras 1 spelling; in Keras 2 it is called kernel_regularizer):

W_regularizer=l2(0.01)
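A minimal sketch with the Keras 2 spelling (the layer sizes are made up):

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.regularizers import l2

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=20,
                kernel_regularizer=l2(0.01)))  # adds 0.01 * sum(w^2) to the loss
model.add(Dense(1, activation='sigmoid'))
```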
Usually you pick just one of dropout and L2 regularization.
21. Optimizers
To simplify notation, set

$g_t = \nabla_\theta L(\theta_t)$

where $\theta_t$ is the vector formed by the current parameter values.
Standard gradient descent: $\theta_{t+1} = \theta_t - \eta\, g_t$.
Two directions for improvement: momentum, which stabilizes and accelerates the direction, and a variable-speed learning rate.
Concept: momentum. Imagine standing at some point high on the loss-function mountain; the negative gradient is the steepest downhill direction you can see from that point.
The trouble is that once you follow that direction to another point, you may find the direction has changed, so you veer off again... when in fact, had you kept walking the original way, you might simply have reached the bottom!
Concept: momentum.

$v_t = \gamma v_{t-1} + \eta g_t$
$\theta_{t+1} = \theta_t - v_t$

If $\gamma = 0$, this is just the original gradient descent.
Concept: a learning-rate gearbox. A very natural idea: the learning rate should probably get smaller as we approach the target, hence the various "deceleration" schemes.
Key point: Adam. If you don't want to hunt for the single best method, you can usually just hand the job to Adam, which combines momentum with deceleration:

$m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t$
$v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2$
Key point: Adam's bias corrections.

$\hat{m}_t = \dfrac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \dfrac{v_t}{1 - \beta_2^t}$
Key point: Adam's update.

$\theta_{t+1} = \theta_t - \dfrac{\eta}{\sqrt{\hat{v}_t} + \varepsilon}\, \hat{m}_t$

The original paper's settings: $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\varepsilon = 10^{-8}$.
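A direct transcription of the three Adam slides on the same made-up loss, with the paper's default settings:

```python
import numpy as np

def dL(w):
    return 2 * (w - 3)

w, m, v = 0.0, 0.0, 0.0
eta, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8
for t in range(1, 501):
    g = dL(w)
    m = beta1 * m + (1 - beta1) * g      # momentum-like first moment
    v = beta2 * v + (1 - beta2) * g**2   # deceleration-like second moment
    m_hat = m / (1 - beta1**t)           # bias corrections
    v_hat = v / (1 - beta2**t)
    w = w - eta * m_hat / (np.sqrt(v_hat) + eps)
print(w)  # near the minimizer w = 3
```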
22. Batch Normalization
Applying some kind of normalization to the inputs is important.
Reason 1: no single feature gets to dominate.
Reason 2: at least the theoretical analysis becomes relatively easy (ideally with variance 1).
Can we then apply some normalization to the inputs of a layer in the middle as well?
The answer is yes, and it is called Batch Normalization.
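A minimal sketch of inserting it between layers in Keras (the layer sizes are made up):

```python
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=20))
model.add(BatchNormalization())  # normalize this intermediate layer's outputs
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
```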
23. ResNet: deep architectures
We really can do deep learning.
(Diagram: a residual block, where the input x is carried around the layers by a shortcut connection.)
24. SELU: throw everything in
$\operatorname{selu}(x) = \lambda \cdot \begin{cases} x & \text{if } x > 0 \\ \alpha e^{x} - \alpha & \text{if } x \le 0 \end{cases}$

The flashiest part is that the parameters are in fact constants:

$\alpha = 1.6732632423543772848170429916717$
$\lambda = 1.0507009873554804934193349852946$
Key point: usage notes. Initialize with LeCun normal (weights with mean 0 and variance $1/n$), and standardize the input data to mean 0 and variance 1 as well.
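A sketch following these notes (assuming Keras 2 provides the 'selu' activation and 'lecun_normal' initializer; the sizes are made up):

```python
from keras.models import Sequential
from keras.layers import Dense

# Standardize the inputs to mean 0, variance 1 before feeding them in.
model = Sequential()
model.add(Dense(128, activation='selu', input_dim=20,
                kernel_initializer='lecun_normal'))
model.add(Dense(128, activation='selu',
                kernel_initializer='lecun_normal'))
model.add(Dense(1, activation='sigmoid'))
```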
With this you can do very deep deep learning!