![Page 1: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/1.jpg)
Variational Attention for Sequence-to-Sequence Models
Source: COLING 2018
Speaker: Ya-Fang, Hsiao
Advisor: Jia-Ling, Koh
Date: 2020/01/03
![Page 2: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/2.jpg)
PART
![Page 3: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/3.jpg)
Introduction
Deterministic: Auto-Encoder (DAE), Encoder-Decoder (DED)
Variational: Auto-Encoder (VAE), Encoder-Decoder (VED)
![Page 4: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/4.jpg)
PART
![Page 5: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/5.jpg)
Variational Autoencoder
[Bowman et al. 2016] Generating Sentences from a Continuous Space
$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - \mathrm{KL}\left(q_\phi(z|x) \,\|\, p(z)\right)$$
First term: data likelihood under the posterior (cross entropy)
Second term: KL divergence of the posterior from the prior
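The two terms of the VAE objective can be sketched numerically. This is a minimal illustration, not the presenters' or Bowman et al.'s code. It assumes a diagonal Gaussian posterior $q_\phi(z|x) = \mathcal{N}(\mu, \mathrm{diag}(\sigma^2))$ and a standard normal prior $p(z) = \mathcal{N}(0, I)$, for which the KL term has a closed form. The function names and the `log_likelihood` argument (a stand-in for a decoder's $\log p_\theta(x|z)$) are hypothetical.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ).

    Assumption: diagonal Gaussian posterior and standard normal prior.
    """
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def elbo(log_likelihood, mu, log_var):
    """L = E_q[log p(x|z)] - KL(q(z|x) || p(z)).

    `log_likelihood` is a hypothetical single-sample estimate of the
    expectation, as it would be returned by a decoder.
    """
    return log_likelihood - kl_to_standard_normal(mu, log_var)

# Toy usage: 2-dimensional latent, made-up reconstruction log-likelihood.
mu = np.array([1.0, 0.0])
log_var = np.zeros(2)
z = reparameterize(mu, log_var, np.random.default_rng(0))
objective = elbo(-3.0, mu, log_var)
```

When the posterior equals the prior (zero mean, unit variance) the KL term vanishes and the objective reduces to the reconstruction term alone.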
![Page 6: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/6.jpg)
Variational Seq2Seq model
Bypassing phenomenon
[Figure: four panels labeled A, B, C, D]
![Page 7: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/7.jpg)
PART
![Page 8: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/8.jpg)
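VED+VAttn
Variational Attention for Variational Encoder Decoder
$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi(z, a|x)}\left[\log p_\theta(y|z, a)\right] - \mathrm{KL}\left(q_\phi(z, a|x) \,\|\, p(z, a)\right)$$
$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi^{(z)}(z|x)\, q_\phi^{(a)}(a|x)}\left[\log p_\theta(y|z, a)\right] - \mathrm{KL}\left(q_\phi^{(z)}(z|x) \,\|\, p(z)\right) - \mathrm{KL}\left(q_\phi^{(a)}(a|x) \,\|\, p(a)\right)$$
Priors for $p(a)$:
1. $\mathcal{N}(0, I)$
2. $\mathcal{N}(\bar{h}^{src}, I)$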
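The effect of the two candidate priors on the attention KL term can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the attention posterior is taken to be a diagonal Gaussian, $\bar{h}^{src}$ is read as the mean of the source hidden states, and the function name and the toy values of `h_src`, `mu_a`, and `log_var_a` are hypothetical.

```python
import numpy as np

def kl_to_unit_gaussian(mu, log_var, prior_mean):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(prior_mean, I) ).

    Assumption: diagonal Gaussian posterior, identity-covariance prior.
    """
    return 0.5 * np.sum(np.exp(log_var) + (mu - prior_mean) ** 2 - 1.0 - log_var)

# Hypothetical source hidden states: 3 positions, 2 dimensions each.
h_src = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [2.0, 2.0]])
h_bar = h_src.mean(axis=0)  # assumed meaning of h-bar^{src}: mean over positions

# Hypothetical posterior over the attention vector a.
mu_a = np.array([1.0, 1.0])
log_var_a = np.zeros(2)

kl_prior_1 = kl_to_unit_gaussian(mu_a, log_var_a, np.zeros(2))  # prior 1: N(0, I)
kl_prior_2 = kl_to_unit_gaussian(mu_a, log_var_a, h_bar)        # prior 2: N(h_bar, I)
```

In this toy case the posterior mean coincides with the mean hidden state, so the second prior gives a smaller KL penalty than the standard normal prior.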
![Page 9: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/9.jpg)
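VED+VAttn
Variational Attention for Variational Encoder Decoder
$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi^{(z)}(z|x)\, q_\phi^{(a)}(a|x)}\left[\log p_\theta(y|z, a)\right] - \lambda_{KL}\left[\mathrm{KL}\left(q_\phi^{(z)}(z|x) \,\|\, p(z)\right) + \gamma_a\, \mathrm{KL}\left(q_\phi^{(a)}(a|x) \,\|\, p(a)\right)\right]$$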
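A small worked example of how the two weights enter the objective, assuming $\lambda_{KL}$ scales both KL terms and $\gamma_a$ additionally scales the attention KL. The function name and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def weighted_objective(log_likelihood, kl_z, kl_a, lambda_kl, gamma_a):
    """L = E[log p(y|z,a)] - lambda_kl * (KL_z + gamma_a * KL_a).

    All inputs are plain floats; in a real model they would come from
    the decoder and from the posteriors over z and a.
    """
    return log_likelihood - lambda_kl * (kl_z + gamma_a * kl_a)

# With both weights at 1 the plain (unweighted) objective is recovered.
unweighted = weighted_objective(-10.0, 2.0, 4.0, lambda_kl=1.0, gamma_a=1.0)

# Down-weighting the KL terms reduces their penalty.
weighted = weighted_objective(-10.0, 2.0, 4.0, lambda_kl=0.5, gamma_a=0.1)
```

Setting `lambda_kl=1.0` and `gamma_a=1.0` gives back the sum of the two unweighted KL terms, which is a convenient sanity check.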
![Page 10: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/10.jpg)
PART
![Page 11: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/11.jpg)
Question Generation
Stanford Question Answering Dataset (SQuAD; Rajpurkar et al., 2016)
![Page 12: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/12.jpg)
Question Generation
Stanford Question Answering Dataset (SQuAD; Rajpurkar et al., 2016)
![Page 13: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/13.jpg)
Question Generation
Stanford Question Answering Dataset (SQuAD; Rajpurkar et al., 2016)
![Page 14: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/14.jpg)
Case study
![Page 15: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/15.jpg)
PART
![Page 16: Variational Attention - NTNU184pc128.csie.ntnu.edu.tw/presentation/20-01-03/Variational Attenti… · Variational Autoencoder [Bowman et al. 2016] Generating Sentences from a Continuous](https://reader034.vdocuments.site/reader034/viewer/2022042804/5f57ceff87a43a0e97634dfa/html5/thumbnails/16.jpg)
Variational attention is used to address the bypassing phenomenon.
Generated sentences are more diverse while retaining high quality.