Computational Linguistics Week 10: Neural Sequence Modeling

Mark Chang

Upload: mark-chang

Post on 10-Feb-2017


TRANSCRIPT

Page 1: Computational Linguistics week 10

Computational Linguistics Week 10

Neural Sequence Modeling

Mark Chang

Page 2: Computational Linguistics week 10

Outline

• Recurrent Neural Networks
• Long Short-Term Memory
• Neural Turing Machine
• Applications

Page 3: Computational Linguistics week 10

Recurrent  Neural  Networks  

Page 4: Computational Linguistics week 10

Short-Term Memory

Example: reading the classical poem line 白日依山盡,黃河入海流 ("The white sun sets behind the mountain; the Yellow River flows into the sea") one character at a time, the context seen so far grows at every step:

白日
白日依
白日依山
…

Page 5: Computational Linguistics week 10

Short-Term Memory

(Figure: a feedforward network with inputs x1, x2, weights W1, W2, bias weight Wb, and output y is applied to each character separately: 白 → n(白), 日 → n(日). Each output depends only on the current input, so nothing carries over between characters.)

Page 6: Computational Linguistics week 10

Recurrent Neural Network

(Figure: the previous output is fed back in, so the representation composes over the sequence: n(白); then 日 → n(n(白), 日); then 依 → n(n(n(白), 日), 依).)

Page 7: Computational Linguistics week 10

From Neural Networks to Deep Learning

Feedforward Neural Network → Recurrent Neural Network → Long Short-Term Memory → Neural Turing Machine

Page 8: Computational Linguistics week 10

Recurrent Neural Network

$n_{in,t} = w_c x_t + w_p n_{out,t-1} + w_b$

$n_{out,t} = \frac{1}{1 + e^{-n_{in,t}}}$

The previous time step's $n_{out}$ is fed back as part of this time step's $n_{in}$.
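The recurrence above can be run directly as code. A minimal scalar sketch (the weight and input values here are made up for illustration, not taken from the slides):

```python
import math

def rnn_step(x_t, n_out_prev, w_c, w_p, w_b):
    """One recurrent step: n_in,t = w_c*x_t + w_p*n_out,t-1 + w_b,
    then n_out,t = sigmoid(n_in,t)."""
    n_in_t = w_c * x_t + w_p * n_out_prev + w_b
    return 1.0 / (1.0 + math.exp(-n_in_t))

# Unroll over a short sequence: each step's output is fed back in.
n_out = 0.0
for x in [1.0, 0.0, 1.0]:
    n_out = rnn_step(x, n_out, w_c=0.5, w_p=0.5, w_b=-0.25)
```

Because `n_out` is threaded through the loop, the final value depends on the whole sequence, not just the last input.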

Page 9: Computational Linguistics week 10

Recurrent Neural Network

(Figure: the RNN unrolled over time, with inputs x0, x1, x2, …, xt and outputs y0, y1, y2, …, yt.)

Page 10: Computational Linguistics week 10

Recurrent Neural Network

(Figure: the unrolled RNN with inputs x0, x1, …, xt−1, xt and outputs y0, y1, …, yt−1, yt, and the hidden state carried from each step to the next.)

Page 11: Computational Linguistics week 10

Backward Propagation Through Time

t = 0:

$\delta_{in,0} = \frac{\partial J}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} = \delta_{out,0} \frac{\partial n_{out,0}}{\partial n_{in,0}}$

t = 1:

$\delta_{in,0} = \frac{\partial J}{\partial n_{out,1}} \frac{\partial n_{out,1}}{\partial n_{in,1}} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} = \delta_{out,1} \frac{\partial n_{out,1}}{\partial n_{in,1}} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} = \delta_{in,1} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} = \delta_{out,0} \frac{\partial n_{out,0}}{\partial n_{in,0}}$

Page 12: Computational Linguistics week 10

Backward Propagation Through Time

$\delta_{in,s} = \begin{cases} \dfrac{\partial J}{\partial n_{out,s}} \dfrac{\partial n_{out,s}}{\partial n_{in,s}} & \text{if } s = t \\[2mm] \delta_{in,s+1} \dfrac{\partial n_{in,s+1}}{\partial n_{out,s}} \dfrac{\partial n_{out,s}}{\partial n_{in,s}} & \text{otherwise} \end{cases}$

Base case: $\delta_{in,t} = \frac{\partial J}{\partial n_{out,t}} \frac{\partial n_{out,t}}{\partial n_{in,t}}$

Recursive case: $\delta_{in,s} = \delta_{in,s+1} \frac{\partial n_{in,s+1}}{\partial n_{out,s}} \frac{\partial n_{out,s}}{\partial n_{in,s}}$

http://cpmarkchang.logdown.com/posts/278457-neural-network-recurrent-neural-network
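The backward recursion translates directly into a loop over time steps. A sketch for the scalar network, where the sigmoid gives ∂n_out/∂n_in = n_out(1 − n_out) and ∂n_in,s+1/∂n_out,s is the recurrent weight w_p (the function and variable names are mine, not the slides'):

```python
def bptt_deltas(dJ_dnout_t, n_out, w_p):
    """Compute delta_in,s for s = t down to 0.
    n_out: list of sigmoid outputs n_out,0 .. n_out,t.
    For the sigmoid, d n_out / d n_in = n_out * (1 - n_out);
    d n_in,s+1 / d n_out,s is the recurrent weight w_p."""
    t = len(n_out) - 1
    deltas = [0.0] * (t + 1)
    # base case (s = t)
    deltas[t] = dJ_dnout_t * n_out[t] * (1.0 - n_out[t])
    # recursive case (otherwise), walking backwards through time
    for s in range(t - 1, -1, -1):
        deltas[s] = deltas[s + 1] * w_p * n_out[s] * (1.0 - n_out[s])
    return deltas
```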

Page 13: Computational Linguistics week 10

Deep RNN

(Figure: a deep RNN with several stacked recurrent layers; inputs x0, x1, …, xt−1, xt and outputs y0, y1, …, yt−1, yt.)

Page 14: Computational Linguistics week 10

Bi-Directional RNN

(Figure: two recurrent chains, one running forward and one running backward over the inputs x0, x1, …, xt−1, xt; both chains feed each output y0, y1, …, yt−1, yt.)

Page 15: Computational Linguistics week 10

Long Short-Term Memory

Page 16: Computational Linguistics week 10

Vanishing Gradient Problem

$\delta_{in,0} = \delta_{out,t} \frac{\partial n_{out,t}}{\partial n_{in,t}} \frac{\partial n_{in,t}}{\partial n_{out,t-1}} \cdots \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}}$

Each sigmoid factor $\partial n_{out}/\partial n_{in} = n_{out}(1 - n_{out}) \le 1/4$, so the product, and with it the gradient reaching the early time steps, shrinks rapidly as $t$ grows.
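A quick numeric illustration of how fast the product shrinks: the sigmoid derivative is at most 1/4, so each backward step multiplies the delta by at most 0.25·w_p (the weight value here is illustrative):

```python
# Each backward step through time multiplies the delta by
# (d n_out / d n_in) * w_p, and the sigmoid derivative is at most 0.25.
w_p = 1.0
grad = 1.0
for _ in range(50):        # 50 time steps
    grad *= 0.25 * w_p     # best-case sigmoid derivative
# grad is now astronomically small: the early time steps get almost
# no learning signal.
```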

Page 17: Computational Linguistics week 10

Long Short-Term Memory

(Figure: the LSTM cell. Input x_t feeds the gate unit k (producing k_out) and the gate signals C_write, C_forget, C_read; the gated input m_in,t enters the memory cell, which combines it with the previous value m_out,t−1 under the forget gate to give m_out,t; the squashed memory n_out is gated by C_read to produce the cell output C_out and finally y_t.)

Page 18: Computational Linguistics week 10

Long Short-Term Memory

Input value: C_in
Read gate: C_read, forget gate: C_forget, write gate: C_write
Output value: C_out

Page 19: Computational Linguistics week 10

Long Short-Term Memory

• Write gate C_write: controls whether the memory cell may be written.

$C_{write} = \mathrm{sigmoid}(w_{cw,x} x_t + w_{cw,y} y_{t-1} + w_{cw,b})$

$k_{out} = \mathrm{sigmoid}(w_{k,x} x_t + w_{k,b})$

$m_{in,t} = k_{out} \, C_{write}$

Page 20: Computational Linguistics week 10

Long Short-Term Memory

• Forget gate C_forget: controls whether the previous memory value is kept.

$C_{forget} = \mathrm{sigmoid}(w_{cf,x} x_t + w_{cf,y} y_{t-1} + w_{cf,b})$

$m_{out,t} = m_{in,t} + C_{forget} \, m_{out,t-1}$

Page 21: Computational Linguistics week 10

Long Short-Term Memory

• Read gate C_read: controls whether the memory may be read.

$C_{read} = \mathrm{sigmoid}(w_{cr,x} x_t + w_{cr,y} y_{t-1} + w_{cr,b})$

$n_{out} = \mathrm{sigmoid}(m_{out,t})$

$C_{out} = C_{read} \, n_{out}$
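Putting the three gates together, one full cell update can be sketched as scalar code (the weight-dictionary keys are my own naming of the slide's subscripts):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, y_prev, m_prev, w):
    """One step of the LSTM cell described above; all values are scalars."""
    c_write  = sigmoid(w['cw_x'] * x_t + w['cw_y'] * y_prev + w['cw_b'])
    c_forget = sigmoid(w['cf_x'] * x_t + w['cf_y'] * y_prev + w['cf_b'])
    c_read   = sigmoid(w['cr_x'] * x_t + w['cr_y'] * y_prev + w['cr_b'])
    k_out    = sigmoid(w['k_x'] * x_t + w['k_b'])
    m_in_t  = k_out * c_write               # gated write
    m_out_t = m_in_t + c_forget * m_prev    # keep or forget old memory
    n_out   = sigmoid(m_out_t)
    c_out   = c_read * n_out                # gated read
    return c_out, m_out_t
```

Note the additive memory update `m_in_t + c_forget * m_prev`: it is this sum, rather than a repeated multiplication, that lets gradients survive over long spans.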

Page 22: Computational Linguistics week 10

Training: Backward Propagation

http://www.felixgers.de/papers/phd.pdf

$m_{out,t} = m_{in,t} + C_{forget} \, m_{out,t-1}, \qquad m_{in,t} = k_{out} \, C_{write}$

$\frac{\partial m_{out,t}}{\partial w_{k,x}} = \frac{\partial m_{in,t}}{\partial w_{k,x}} + C_{forget} \frac{\partial m_{out,t-1}}{\partial w_{k,x}} = C_{write} \frac{\partial k_{out}}{\partial w_{k,x}} + C_{forget} \frac{\partial m_{out,t-1}}{\partial w_{k,x}}$

Page 23: Computational Linguistics week 10

Long Short-Term Memory

https://class.coursera.org/neuralnets-2012-001/lecture/95

Page 24: Computational Linguistics week 10

Neural  Turing  Machine  

Page 25: Computational Linguistics week 10

Neural Turing Machine

(Figure: a controller sits between Input and Output and drives a Read/Write Head that accesses an external Memory.)

Page 26: Computational Linguistics week 10

Memory

(Figure: memory is a grid. Columns 0, 1, …, i, …, n are memory addresses; each column is a memory block; rows 0, …, j, …, m run over the block length.)

Page 27: Computational Linguistics week 10

Read Operation

Memory: $M$; head location: $w$, with $\sum_i w(i) = 1$ and $0 \le w(i) \le 1 \; \forall i$.

Read vector: $r \leftarrow \sum_i w(i) M(i)$

Example: with $w = (0.9, 0.1, 0, \ldots, 0)$ over blocks $M(0) = (1, 1, 2)^T$ and $M(1) = (2, 1, 4)^T$:

$\begin{bmatrix} r_0 \\ r_1 \\ r_2 \end{bmatrix} = \begin{bmatrix} 1 \cdot 0.9 + 2 \cdot 0.1 \\ 1 \cdot 0.9 + 1 \cdot 0.1 \\ 2 \cdot 0.9 + 4 \cdot 0.1 \end{bmatrix} = \begin{bmatrix} 1.1 \\ 1.0 \\ 2.2 \end{bmatrix}$
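The read is just a weighted sum over memory columns; with NumPy it reproduces the slide's numbers (only the three columns shown on the slide are kept here):

```python
import numpy as np

# Columns of M are the memory blocks M(i).
M = np.array([[1.0, 2.0, 3.0],
              [1.0, 1.0, 2.0],
              [2.0, 4.0, 1.0]])
w = np.array([0.9, 0.1, 0.0])   # head location; sums to 1
r = M @ w                        # r = sum_i w(i) * M(i)
```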

Page 28: Computational Linguistics week 10

Erase Operation

Memory: $M$; head location: $w$; erase vector: $e$, with $0 \le e(j) \le 1 \; \forall j$.

$M(i) \leftarrow (1 - w(i)\, e) \odot M(i)$ (elementwise within each block)

Example: with $w = (0.9, 0.1, 0, \ldots)$ and $e = (1, 0, 1)^T$:

$M = \begin{bmatrix} 1(1-0.9) & 2(1-0.1) & 3 & \ldots \\ 1 & 1 & 2 & \ldots \\ 2(1-0.9) & 4(1-0.1) & 1 & \ldots \end{bmatrix} = \begin{bmatrix} 0.1 & 1.8 & 3 & \ldots \\ 1 & 1 & 2 & \ldots \\ 0.2 & 3.6 & 1 & \ldots \end{bmatrix}$
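The erase update can be applied to every column at once with an outer product (a NumPy sketch of the slide's example):

```python
import numpy as np

M = np.array([[1.0, 2.0, 3.0],
              [1.0, 1.0, 2.0],
              [2.0, 4.0, 1.0]])
w = np.array([0.9, 0.1, 0.0])   # head location over columns
e = np.array([1.0, 0.0, 1.0])   # erase vector over rows
# M(i) <- (1 - w(i) * e) ⊙ M(i), for every column i at once:
M = (1.0 - np.outer(e, w)) * M
```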

Page 29: Computational Linguistics week 10

Add Operation

Memory: $M$; head location: $w$; add vector: $a$.

$M(i) \leftarrow M(i) + w(i)\, a$

Example: with $w = (0.9, 0.1, 0, \ldots)$ and $a = (1, 1, 0)^T$:

$M = \begin{bmatrix} 0.1 + 0.9 & 1.8 + 0.1 & 3 & \ldots \\ 1.0 + 0.9 & 1.0 + 0.1 & 2 & \ldots \\ 0.2 & 3.6 & 1 & \ldots \end{bmatrix} = \begin{bmatrix} 1.0 & 1.9 & 3 & \ldots \\ 1.9 & 1.1 & 2 & \ldots \\ 0.2 & 3.6 & 1 & \ldots \end{bmatrix}$
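Like the erase step, the add step is one outer product over the whole memory (continuing from the memory the erase example left behind):

```python
import numpy as np

M = np.array([[0.1, 1.8, 3.0],   # memory as left after the erase step
              [1.0, 1.0, 2.0],
              [0.2, 3.6, 1.0]])
w = np.array([0.9, 0.1, 0.0])    # head location over columns
a = np.array([1.0, 1.0, 0.0])    # add vector over rows
# M(i) <- M(i) + w(i) * a, for every column i at once:
M = M + np.outer(a, w)
```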

Page 30: Computational Linguistics week 10

Controller

(Figure: the controller receives the Input and the read vector r, and emits the Output together with the signals that drive the head: head location w, erase vector e, and add vector a.)

Addressing mechanisms and their parameters:
• Content addressing: memory key k
• Interpolation: g
• Convolutional shift: s
• Sharpening: γ

Page 31: Computational Linguistics week 10

(Figure: a worked addressing example. Previous state: head location w_{t−1} = (0.9, 0.1, 0, 0, 0, 0) over memory M. Controller outputs: memory key k = (2, 3, 1)^T, β = 50, g = 0.5, s = (0, 0, 1), γ = 50. The new head location is computed in four stages, Content Addressing → Interpolation → Convolutional Shift → Sharpening, ending at w = (0, 0, 0, 1, 0, 0).)

Page 32: Computational Linguistics week 10

Content Addressing

$K[u, v] = \frac{u \cdot v}{|u| \, |v|}$

$w(i) \leftarrow \frac{e^{\beta K[k, M(i)]}}{\sum_j e^{\beta K[k, M(j)]}}$

Find the locations in memory M whose content is closest to the key k; the parameter β adjusts how concentrated the resulting weighting is. In the example, β = 50 gives a nearly one-hot w, β = 5 gives the softer (.15, .10, .47, .08, .13, .17), and β = 0 gives the uniform (.16, .16, .16, .16, .16, .16).
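The softmax-over-cosine-similarity rule is a few lines of NumPy. This sketch reuses the small 3-column memory from the read example, with a key chosen to match one column exactly (the function name is mine):

```python
import numpy as np

def content_address(M, k, beta):
    """w(i) ∝ exp(beta * K[k, M(i)]), K = cosine similarity;
    columns of M are the memory blocks."""
    K = (k @ M) / (np.linalg.norm(k) * np.linalg.norm(M, axis=0))
    ex = np.exp(beta * K)
    return ex / ex.sum()

M = np.array([[1.0, 2.0, 3.0],
              [1.0, 1.0, 2.0],
              [2.0, 4.0, 1.0]])
k = np.array([3.0, 2.0, 1.0])                # equal to column 2 of M
w_sharp = content_address(M, k, beta=50.0)   # nearly one-hot at column 2
w_flat  = content_address(M, k, beta=0.0)    # uniform over columns
```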

Page 33: Computational Linguistics week 10

Interpolation

$w_t \leftarrow g\, w_t + (1 - g)\, w_{t-1}$

Combine the current head location w_t with the previous one w_{t−1}; the parameter g sets the ratio between them. In the example, with w_{t−1} = (0.9, 0.1, 0, 0, 0, 0): g = 1 keeps w_t unchanged, g = 0 returns w_{t−1}, and g = 0.5 averages the two, blending w_t = (0, 0, 1, 0, 0, 0) into (0.45, 0.05, 0.5, 0, 0, 0).

Page 34: Computational Linguistics week 10

Convolutional Shift

$w(i) \leftarrow \sum_j w(j)\, s(i - j)$, i.e. $w(i) \leftarrow w(i-1)\, s(1) + w(i)\, s(0) + w(i+1)\, s(-1)$

Shift the values of w; the parameter s (a distribution over the offsets −1, 0, 1) sets the direction of the shift. Starting from w = (.45, .05, .50, 0, 0, 0): s = (0, 1, 0) leaves w unchanged; s = (0, 0, 1) shifts it by one position to (0, .45, .05, .50, 0, 0); s = (.5, 0, .5) spreads it both ways to (.025, .475, .025, .25, 0, .225).
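The shift is a circular convolution of w with s; a sketch that reproduces the three example cases (the function name is mine):

```python
import numpy as np

def conv_shift(w, s):
    """Circular convolution: w(i) <- sum_j w(j) * s(i - j),
    with s giving the weights for offsets (-1, 0, +1)."""
    n = len(w)
    out = np.zeros(n)
    for i in range(n):
        for offset, s_val in zip((-1, 0, 1), s):
            out[i] += w[(i - offset) % n] * s_val
    return out

w = np.array([0.45, 0.05, 0.50, 0.0, 0.0, 0.0])
```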

Page 35: Computational Linguistics week 10

Sharpening

$w(i) \leftarrow \frac{w(i)^{\gamma}}{\sum_j w(j)^{\gamma}}$

Make the values in w more concentrated (or more spread out); the parameter γ adjusts the concentration. Starting from w = (0, .45, .05, .50, 0, 0): γ = 50 gives (0, 0, 0, 1, 0, 0), γ = 5 gives (0, .37, 0, .62, 0, 0), and γ = 0 gives the uniform (.16, .16, .16, .16, .16, .16).
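Sharpening is a one-line normalized power. Note that NumPy evaluates 0.0 ** 0.0 as 1.0, which is exactly what makes γ = 0 give the uniform distribution shown on the slide (the function name is mine):

```python
import numpy as np

def sharpen(w, gamma):
    """w(i) <- w(i)**gamma / sum_j w(j)**gamma."""
    p = np.asarray(w, dtype=float) ** gamma
    return p / p.sum()

w = np.array([0.0, 0.45, 0.05, 0.50, 0.0, 0.0])
```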

Page 36: Computational Linguistics week 10

Experiment: Repeat Copy

https://github.com/fumin/ntm

Page 37: Computational Linguistics week 10

Evolution of Recurrent Neural Networks

• Recurrent Neural Network: short-term memory.
• Long Short-Term Memory: can control reading from and writing to memory.
• Neural Turing Machine: can control the position of the memory read/write head even more flexibly.

Page 38: Computational Linguistics week 10

Applications

Page 39: Computational Linguistics week 10

Machine Translation

http://arxiv.org/pdf/1409.3215.pdf

A B C -> W X Y Z

Page 40: Computational Linguistics week 10

Chinese Word Segmentation

http://arxiv.org/pdf/1602.04874v1.pdf

Page 41: Computational Linguistics week 10

Chinese Poetry Generation

http://emnlp2014.org/papers/pdf/EMNLP2014074.pdf

Page 42: Computational Linguistics week 10

Image Caption Generation

http://arxiv.org/pdf/1411.4555v2.pdf

Page 43: Computational Linguistics week 10

Visual Question Answering

http://arxiv.org/pdf/1505.00468v6.pdf

Page 44: Computational Linguistics week 10

Further Reading

• The Unreasonable Effectiveness of Recurrent Neural Networks – http://karpathy.github.io/2015/05/21/rnn-effectiveness/
• Understanding LSTM Networks – http://colah.github.io/posts/2015-08-Understanding-LSTMs/
• Recurrent Neural Networks – http://cpmarkchang.logdown.com/posts/278457-neural-network-recurrent-neural-network
• Neural Turing Machine – http://cpmarkchang.logdown.com/posts/279710-neural-network-neural-turing-machine