Neural Machine Translation
TRANSCRIPT
![Page 1: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/1.jpg)
Neural Machine Translation
Thang Luong
Lecture @ CS224D
Spring 2016
(Special thanks to Chris Manning for feedback!)
![Page 2: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/2.jpg)
7 billion people, 7000 languages
5/19/16 Thang Luong - Neural Machine Translation
![Page 3: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/3.jpg)
A universal translator
(The Babel Fish from “The Hitchhiker's Guide to the Galaxy”)
![Page 4: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/4.jpg)
Machine vs. Human Translation
“Nevertheless, within the discipline of machine learning, a specialization called deep learning has been developed. Its distinguishing feature is that it is inspired by neurobiology. Deep learning deals with computational elements which allow a network of artificial neurons to learn a model of the human brain.”
Faithful translation
• Grammatically incorrect.
• Bad word choices.
• Bad sentence structures.
![Page 7: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/7.jpg)
Machine vs. Human Translation
“However, in machine learning a specialization called deep learning has emerged. It can be recognized by its distinctive neurobiological influence. Deep learning is centered around networks of artificial neurons which can learn models of the human brain.”
Fluent translation
A big gap!
![Page 8: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/8.jpg)
How has MT evolved?
![Page 9: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/9.jpg)
Phrase-based MT
(Brown et al., 1993; Koehn et al., 2003; Och & Ney, 2004)
Example: She loves cute cats ↦ Elle aime les chats mignons
• Break sentences into chunks.
• Translation model: look up phrase translations.
• Language model: tie phrases together.
Translate locally. LM uses only target words.
![Page 10: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/10.jpg)
Joint Neural Language Model
• Conditioned on source words (Devlin et al., 2014).
• Still translate locally.
Source: … allés pour une promenade le long de la rive
Target: We went for a stroll along the South Bank
Glosses of the source words: walk, along, bank (river)
MT systems become more complex!
![Page 11: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/11.jpg)
Neural Machine Translation to the rescue!
(Sutskever et al., 2014; Cho et al., 2014)
I am a student → NMT → Je suis étudiant
• Sequence-to-sequence: translate globally.
• End-to-end: simple & generalizable.
Let's find out!
![Page 12: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/12.jpg)
Outline
• Basic NMT
  – RNN recap.
  – Encoder-decoder.
  – Training.
  – Testing.
• Advanced NMT
![Page 13: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/13.jpg)
Recurrent Neural Networks (RNNs)
input: I am a student
RNNs to represent sequences!
Figure labels: previous hidden state h_{t-1}, current hidden state h_t, input x_t.
(Picture adapted from Andrej Karpathy)
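The recurrence in the figure can be sketched in a few lines. This is a minimal illustration, assuming the common tanh formulation h_t = tanh(W_hh h_{t-1} + W_xh x_t + b); the names `matvec`, `rnn_step`, and `run_rnn`, and all the numbers, are hypothetical and not taken from the slides.

```python
import math

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_step(h_prev, x_t, W_hh, W_xh, b):
    """One vanilla RNN step: h_t = tanh(W_hh h_{t-1} + W_xh x_t + b)."""
    rec = matvec(W_hh, h_prev)
    inp = matvec(W_xh, x_t)
    return [math.tanh(r + i + bias) for r, i, bias in zip(rec, inp, b)]

def run_rnn(inputs, W_hh, W_xh, b):
    """Feed the inputs one at a time, starting from a zero state."""
    h = [0.0] * len(b)
    states = []
    for x_t in inputs:
        h = rnn_step(h, x_t, W_hh, W_xh, b)
        states.append(h)
    return states

# Toy 2-dimensional example; weights and inputs are arbitrary.
W_hh = [[0.5, -0.1], [0.2, 0.3]]
W_xh = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
# Stand-ins for the embeddings of "I am a student".
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
states = run_rnn(inputs, W_hh, W_xh, b)
```

The last entry of `states` is the fixed-size vector that summarizes the whole input sequence.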
![Page 15: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/15.jpg)
Neural Machine Translation (NMT)
• Recurrent Neural Networks:
  – Model P(target | source) directly.
  – Can be trained end-to-end.
Figure, built up one step at a time:
  – Input row: I am a student _ Je suis étudiant
  – Output row: Je suis étudiant _
  – Encoder: the network over the source words. Decoder: the network over the target words.
  – Boundary marker: _
![Page 25: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/25.jpg)
Word Embeddings
• One for each language: can learn from scratch.
Figure labels: source embeddings, target embeddings.
![Page 26: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/26.jpg)
Recurrent Connections
Initial states
• Often set to 0.
![Page 27: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/27.jpg)
Recurrent Connections
• Different parameters: {1st layer, 2nd layer} x {encoder, decoder}.
Highlighted in turn: encoder 1st layer, encoder 2nd layer, decoder 1st layer, decoder 2nd layer.
![Page 31: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/31.jpg)
Recurrent Units
• Vanilla RNN: vanishing gradient problem!
• LSTM: C'mon, it's been around for 20 years!
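The vanishing gradient problem can be seen numerically in a one-dimensional RNN. The sketch below is an illustration under assumptions, not the slide's own formulation: it uses the scalar recurrence h_t = tanh(w * h_{t-1} + x_t), and the function name `gradient_through_time` is hypothetical.

```python
import math

def gradient_through_time(w, inputs):
    """Return |d h_t / d h_0| after each step of h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad, history = 0.0, 1.0, []
    for x_t in inputs:
        h = math.tanh(w * h + x_t)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
        history.append(abs(grad))
    return history

# With |w| < 1 every factor is below 1, so the product shrinks geometrically.
history = gradient_through_time(w=0.9, inputs=[0.5] * 30)
```

After 30 steps the gradient with respect to the initial state is a tiny fraction of its first-step value, which is the behaviour that gated units such as the LSTM are designed to counter.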
![Page 32: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/32.jpg)
Softmax: vectors ↦ categories
• Hidden states ↦ scores ↦ probabilities.
Figure labels: softmax parameters (|V| rows) x target hidden state = scores; exp & normalize turns scores into probs, e.g. P(Je | …).
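The path from hidden state to probabilities can be written out directly. This is a small sketch with a 4-word toy vocabulary; `W_softmax`, `h_target`, and their values are made up for illustration.

```python
import math

def softmax(scores):
    """Exponentiate and normalize; subtracting the max keeps exp() from overflowing."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_word_probs(W_softmax, h_target):
    """Scores = softmax parameters (|V| x d) times the target hidden state (d)."""
    scores = [sum(w * h for w, h in zip(row, h_target)) for row in W_softmax]
    return softmax(scores)

vocab = ["Je", "suis", "étudiant", "_"]  # toy target vocabulary, |V| = 4
W_softmax = [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.5], [0.0, 0.0]]  # arbitrary values
h_target = [1.0, 0.5]  # arbitrary target hidden state
probs = next_word_probs(W_softmax, h_target)
best = vocab[probs.index(max(probs))]
```

With a real vocabulary the matrix has |V| rows, so this matrix-vector product is typically the most expensive part of each decoder step.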
![Page 34: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/34.jpg)
Training Loss
• Maximize P(target | source).
• Sum of all individual losses: -log P(Je), -log P(suis), -log P(étudiant), -log P(_).
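Because P(target | source) factors into one conditional probability per target word, maximizing it is the same as minimizing the sum of the per-word negative log-probabilities. The sketch below assumes the model's per-step distributions are already available; the numbers in `step_probs` are invented.

```python
import math

def sentence_loss(step_probs, target_words):
    """Sum of -log P(word) over the target sentence, boundary marker included."""
    return sum(-math.log(probs[word]) for probs, word in zip(step_probs, target_words))

# Hypothetical model outputs: one distribution per decoder step.
step_probs = [
    {"Je": 0.7, "suis": 0.1, "étudiant": 0.1, "_": 0.1},
    {"Je": 0.1, "suis": 0.6, "étudiant": 0.2, "_": 0.1},
    {"Je": 0.1, "suis": 0.1, "étudiant": 0.5, "_": 0.3},
    {"Je": 0.05, "suis": 0.05, "étudiant": 0.1, "_": 0.8},
]
target = ["Je", "suis", "étudiant", "_"]
loss = sentence_loss(step_probs, target)

# The same quantity computed from the product of the four probabilities.
joint = 0.7 * 0.6 * 0.5 * 0.8
```

Here `loss` equals `-math.log(joint)` up to rounding, which is why the slide can speak of either maximizing the probability or summing the losses.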
![Page 39: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/39.jpg)
Backpropagation Through Time
Init to 0.
Gradients flow backward from each loss in turn: -log P(_), then -log P(étudiant), then -log P(suis), and so on.
RNN gradients are accumulated.
![Page 45: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/45.jpg)
Training vs. Testing
• Training
  – Correct translations are available.
• Testing
  – Only source sentences are given.
![Page 46: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/46.jpg)
Testing
• Feed the most likely word.
Simple beam-search decoders!
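The two decoding strategies can be compared on a toy example. In this sketch `toy_model` is a hypothetical lookup table standing in for the trained decoder (it ignores the source sentence entirely), and the probabilities are chosen so that greedy decoding and beam search disagree.

```python
import math

EOS = "_"  # boundary marker

def toy_model(prefix):
    """Hypothetical decoder: next-word distribution given the words produced so far."""
    table = {
        (): {"Un": 0.6, "Je": 0.4},
        ("Un",): {"étudiant": 0.5, "élève": 0.5},
        ("Je",): {"suis": 0.9, "vais": 0.1},
        ("Je", "suis"): {"étudiant": 0.9, EOS: 0.1},
    }
    return table.get(tuple(prefix), {EOS: 1.0})

def greedy_decode(model, max_len=10):
    """Feed back the single most likely word at every step."""
    words = []
    while len(words) < max_len:
        probs = model(words)
        word = max(probs, key=probs.get)
        words.append(word)
        if word == EOS:
            break
    return words

def beam_search(model, beam_size=2, max_len=10):
    """Keep the beam_size best partial translations at every step."""
    beams = [(0.0, [])]  # (log-probability, words)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, words in beams:
            for word, p in model(words).items():
                candidates.append((score + math.log(p), words + [word]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, words in candidates[:beam_size]:
            (finished if words[-1] == EOS else beams).append((score, words))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    return max(finished, key=lambda c: c[0])[1]

greedy = greedy_decode(toy_model)
beam = beam_search(toy_model, beam_size=2)
```

Greedy decoding commits to "Un" because it is the most likely first word, while a beam of size 2 keeps "Je" alive and ends up with a more probable full sentence.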
![Page 51: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/51.jpg)
English-French WMT'14 results (BLEU)
• SOTA SMT (Durrani+, 2014): 37
• Avg SMT (Schwenk, 2014): 33.3
• NMT (Sutskever+, 2014): 34.8
• SMT + NMT rescore (Sutskever+, 2014): 36.5
2 decades of research vs. 1-2 years of research
![Page 53: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/53.jpg)
Encoder-decoder Variants
• (Sutskever et al., 2014) and my NMT models: encoder Deep LSTM, decoder Deep LSTM.
• (Cho et al., 2014), (Bahdanau et al., 2015), (Jean et al., 2015): encoder (Bidirectional) GRU, decoder GRU.
• (Kalchbrenner & Blunsom, 2013): encoder CNN, decoder (Inverse CNN) RNN.
Next, advanced NMT!
![Page 54: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/54.jpg)
Break time: when MT fails…
• Deep fried baby
• Meat muscle stupid bean sprouts
• Sale of chicken murder
• Go back toward your behind
![Page 55: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/55.jpg)
Limitations
• #1: the vocabulary size problem
  – Goal: extend the vocabulary coverage.
• #2: the sentence length problem
  – Goal: translate long sentences better.
• #3: the language complexity problem
  – Goal: handle more language variations.
![Page 56: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/56.jpg)
Advancing NMT
• #1: the vocabulary size problem
  – Sol: “copy” mechanism.
  – Example: The <unk> portico in <unk> ↦ Le <unk> <unk> de <unk>; the source unknowns are ecotax and Pont-de-Buis, the target unknowns are portique, écotaxe, and Pont-de-Buis.
• #2: the sentence length problem
  – Sol: attention mechanism.
• #3: the language complexity problem
  – Sol: character-level translation.
![Page 57: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/57.jpg)
CV vs. NLP
• Computer Vision: 1K categories (e.g. “cat”).
• NLP: 1M categories (e.g. “mat” after “The cat sat on a”).
![Page 58: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/58.jpg)
#1 The Vocabulary Size Problem
• Word generation problem
  – Vocabs are modest: 50K.
  – Simple softmax: GPU friendliness.
Original pair: The ecotax portico in Pont-de-Buis ↦ Le portique écotaxe de Pont-de-Buis
With a limited vocabulary: The <unk> portico in <unk> ↦ Le <unk> <unk> de <unk>
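How rare words turn into <unk> can be shown with a short sketch. The three-sentence corpus and the vocabulary size of 3 are toy assumptions (the slide mentions 50K), and the function names are hypothetical.

```python
from collections import Counter

UNK = "<unk>"

def build_vocab(sentences, max_size):
    """Keep the max_size most frequent words; everything else will map to <unk>."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return {word for word, _ in counts.most_common(max_size)}

def apply_vocab(sentence, vocab):
    """Replace every out-of-vocabulary word with the <unk> symbol."""
    return [word if word in vocab else UNK for word in sentence]

corpus = [
    ["The", "portico", "in", "Paris"],
    ["The", "portico", "in", "Lyon"],
    ["The", "ecotax", "portico", "in", "Pont-de-Buis"],
]
vocab = build_vocab(corpus, max_size=3)
clipped = apply_vocab(corpus[2], vocab)
```

Once the words are collapsed this way, the model has no means of telling which rare word a given <unk> stood for, which is the problem the copy mechanism addresses.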
![Page 59: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/59.jpg)
• Propose “copy” mechanisms for <unk>.
• Simple & effective
  – Treat any NMT as a black box.
  – Annotate training data.
  – Post-process translations.
Thang Luong*, Ilya Sutskever*, Quoc Le*, Oriol Vinyals, and Wojciech Zaremba. Addressing the Rare Word Problem in Neural Machine Translation. ACL 2015.
SOTA for English-French translation.
![Page 60: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/60.jpg)
Our approach – training annotation
Original pair: The ecotax portico in Pont-de-Buis ↦ Le portique écotaxe de Pont-de-Buis
• Learn alignments.
• Add relative positions.
Annotated pair: The <unk> portico in <unk> ↦ Le unk1 unk-1 de unk0
“Attention” for rare words
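The annotation step can be reproduced on the slide's example. This is a sketch under stated assumptions: the word alignment is given as a dictionary from target position to source position, the offset is written as source position minus target position (the sign convention that matches the slide's unk1, unk-1, unk0), unaligned rare words fall back to a plain <unk>, and the toy vocabularies are invented.

```python
UNK = "<unk>"

def annotate_pair(source, target, alignment, src_vocab, tgt_vocab):
    """Replace rare words; rare target words record the offset to their aligned source word.

    alignment maps a target position j to a source position i; the offset
    is written as d = i - j (assumed sign convention, chosen to match the slide).
    """
    new_source = [w if w in src_vocab else UNK for w in source]
    new_target = []
    for j, word in enumerate(target):
        if word in tgt_vocab:
            new_target.append(word)
        elif j in alignment:
            new_target.append(f"unk{alignment[j] - j}")
        else:
            new_target.append(UNK)  # assumption: no alignment, no position
    return new_source, new_target

source = ["The", "ecotax", "portico", "in", "Pont-de-Buis"]
target = ["Le", "portique", "écotaxe", "de", "Pont-de-Buis"]
alignment = {0: 0, 1: 2, 2: 1, 3: 3, 4: 4}  # target index -> source index
src_vocab = {"The", "portico", "in"}        # toy vocabularies
tgt_vocab = {"Le", "de"}
new_source, new_target = annotate_pair(source, target, alignment, src_vocab, tgt_vocab)
```

Training on pairs annotated like this lets an otherwise unchanged NMT model learn to emit a pointer along with each unknown word.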
![Page 61: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/61.jpg)
Our approach – post-process
Test sentence: The <unk> portico in <unk> (original words: ecotax, Pont-de-Buis)
Translation: Le portique unk-1 de unk0
Post-edit translation: Le portique écotaxe de Pont-de-Buis
• Dictionary translation: ecotax ↦ écotaxe.
• Identity copy: Pont-de-Buis ↦ Pont-de-Buis.
Orthogonal to large-vocab techniques.
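The post-processing step is a simple lookup. The sketch below assumes the same offset convention as the slide's example (source position = target position + offset) and a one-entry toy dictionary; `replace_unks` is a hypothetical name.

```python
import re

UNK_PATTERN = re.compile(r"^unk(-?\d+)$")

def replace_unks(source_words, target_tokens, dictionary):
    """Replace each positional unk token using the aligned source word.

    Uses the dictionary translation when there is one, and copies the
    source word unchanged otherwise (identity copy).
    """
    output = []
    for j, token in enumerate(target_tokens):
        match = UNK_PATTERN.match(token)
        if match is None:
            output.append(token)
            continue
        i = j + int(match.group(1))  # assumed convention: source index = target index + offset
        if 0 <= i < len(source_words):
            src = source_words[i]
            output.append(dictionary.get(src, src))
        else:
            output.append(token)  # offset points outside the sentence; leave as is
    return output

source = ["The", "ecotax", "portico", "in", "Pont-de-Buis"]
translation = ["Le", "portique", "unk-1", "de", "unk0"]
dictionary = {"ecotax": "écotaxe"}  # toy dictionary
result = replace_unks(source, translation, dictionary)
```

Since the function only reads the model's output tokens, it can sit behind any NMT system, which is the sense in which the slide treats the model as a black box.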
![Page 64: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/64.jpg)
Effects of Translating Rare Words
[Figure: BLEU (y-axis, 25 to 40) against sentences ordered by average frequency rank (x-axis). Legend: Durrani et al. (37.0), Sutskever et al. (34.8), this work (37.5).]
First SOTA NMT system!
![Page 65: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/65.jpg)
Sample translations
• Predict long-distance alignments well.
  – Correct: cataract vs. cataracte.
source: An additional 2600 operations including orthopedic and cataract surgery will help clear a backlog.
human: 2600 opérations supplémentaires, notamment dans le domaine de la chirurgie orthopédique et de la cataracte, aideront à rattraper le retard.
trans: En outre, unk1 opérations supplémentaires, dont la chirurgie unk5 et la unk6, permettront de résorber l'arriéré.
trans+unk: En outre, 2600 opérations supplémentaires, dont la chirurgie orthopédiques et la cataracte, permettront de résorber l'arriéré.
![Page 66: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/66.jpg)
Sample translations
• Translate long sentences well.
  – Correct: JPMorgan vs. JPMorgan.
source: This trader, Richard Usher, left RBS in 2010 and is understand to have be given leave from his current position as European head of forex spot trading at JPMorgan.
human: Ce trader, Richard Usher, a quitté RBS en 2010 et aurait été mis suspendu de son poste de responsable européen du trading au comptant pour les devises chez JPMorgan.
trans: Ce unk0, Richard unk0, a quitté unk1 en 2010 et a compris qu'il est autorisé à quitter son poste actuel en tant que leader européen du marché des points de vente au unk5.
trans+unk: Ce négociateur, Richard Usher, a quitté RBS en 2010 et a compris qu'il est autorisé à quitter son poste actuel en tant que leader européen du marché des points de vente au JPMorgan.
![Page 67: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/67.jpg)
Sample translations
• Incorrect alignment prediction: was – était vs. abandonnait.
source: But concerns have grown after Mr Mazanga was quoted as saying Renamo was abandoning the 1992 peace accord.
human: Mais l'inquiétude a grandi après que M. Mazanga a déclaré que la Renamo abandonnait l'accord de paix de 1992.
trans: Mais les inquiétudes se sont accrues après que M. unkpos3 a déclaré que la unk3 unk3 l'accord de paix de 1992.
trans+unk: Mais les inquiétudes se sont accrues après que M. Mazanga a déclaré que la Renamo était l'accord de paix de 1992.
![Page 68: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/68.jpg)
Advancing NMT
• #1: the vocabulary size problem
  – Sol: “copy” mechanism.
• #2: the sentence length problem
  – Sol: attention mechanism.
• #3: the language complexity problem
  – Sol: character-level translation.
![Page 69: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/69.jpg)
#2 The Sentence Length Problem
Problem: sentence meaning is represented by a fixed-dimensional vector.
• Translation quality degrades with long sentences.
![Page 70: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/70.jpg)
You can't cram the meaning of a whole %&!$# sentence into a single $&!#* vector!
(Adapted from Kyunghyun Cho's talk)
![Page 71: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/71.jpg)
Attention Mechanism
• Solution: random access memory
  – Retrieve as needed.
Figure label: pool of source states.
![Page 72: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/72.jpg)
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural Machine Translation by Jointly Learning to Translate and Align. ICLR 2015.
Figure labels: with attention, without attention.
![Page 73: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/73.jpg)
Attention Mechanism
A simplified version of (Bahdanau et al., 2015)
Figure: inputs I am a student _ Je; output suis; labels: attention layer, context vector.
![Page 74: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/74.jpg)
Attention Mechanism – Scoring
• Compare target and source hidden states.
Example scores shown in the figure: 1, 3, 5, 1.
![Page 78: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/78.jpg)
Attention Mechanism – Normalization
• Convert into alignment weights.
Example weights shown in the figure: 0.1, 0.3, 0.5, 0.1.
![Page 79: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/79.jpg)
Attention Mechanism – Context vector
• Build context vector: weighted average.
![Page 80: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/80.jpg)
Attention Mechanism – Hidden state
• Compute the next hidden state.
![Page 81: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/81.jpg)
Attention Mechanism – Predict
• Predict the next word.
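The attention steps (scoring, normalization, context vector, hidden state) can be sketched end to end. The slides do not fix the scoring function or how the context is combined with the decoder state, so this sketch assumes one common choice: a dot-product score, a softmax for normalization, and tanh(W_c [context; h_target]) for the combined state. All names and numbers are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(h_target, source_states):
    """Return (alignment weights, context vector) for one decoder step."""
    # 1. Scoring: compare the target state with every source state (assumed: dot product).
    scores = [dot(h_target, h_s) for h_s in source_states]
    # 2. Normalization: turn scores into alignment weights that sum to 1.
    weights = softmax(scores)
    # 3. Context vector: weighted average of the source states.
    dim = len(source_states[0])
    context = [sum(w * h_s[k] for w, h_s in zip(weights, source_states))
               for k in range(dim)]
    return weights, context

def attentional_state(h_target, context, W_c):
    """4. Hidden state: tanh(W_c [context; h_target]) (assumed combination)."""
    concat = context + h_target
    return [math.tanh(dot(row, concat)) for row in W_c]

# Toy source states for "I am a student" and a toy decoder state.
source_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
h_target = [0.0, 2.0]
W_c = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]  # arbitrary 2 x 4 matrix
weights, context = attention(h_target, source_states)
h_tilde = attentional_state(h_target, context, W_c)
```

The resulting `h_tilde` would then go through the softmax layer to predict the next word, so the decoder no longer depends on a single fixed vector for the whole source sentence.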
![Page 82: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/82.jpg)
• Examine various attention mechanisms:
  – Global: all source states.
  – Local: subset of source states.
Thang Luong, Hieu Pham, and Chris Manning. Effective Approaches to Attention-based Neural Machine Translation. EMNLP 2015.
SOTA for English-German translation.
![Page 83: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/83.jpg)
Translate Long Sentences
[Figure: BLEU (y-axis, 10 to 25) against sentence length (x-axis, 10 to 70). Legend: ours, no attn (BLEU 13.9); ours, local-p attn (BLEU 20.9); ours, best system (BLEU 23.0); WMT'14 best (BLEU 20.7); Jean et al., 2015 (BLEU 21.6). Annotations: no attention, attention.]
![Page 84: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/84.jpg)
Sample English-German translations
• Translate names correctly.
source: Orlando Bloom and Miranda Kerr still love each other
human: Orlando Bloom und Miranda Kerr lieben sich noch immer
best: Orlando Bloom und Miranda Kerr lieben einander noch immer.
base: Orlando Bloom und Lucas Miranda lieben einander noch immer.
![Page 85: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/85.jpg)
Sample English-German translations
• Translate a doubly-negated phrase correctly.
• Fail to translate “passenger experience”.
source: We're pleased the FAA recognizes that an enjoyable passenger experience is not incompatible with safety and security, said Roger Dow, CEO of the U.S. Travel Association.
human: Wir freuen uns, dass die FAA erkennt, dass ein angenehmes Passagiererlebnis nicht im Widerspruch zur Sicherheit steht, sagte Roger Dow, CEO der U.S. Travel Association.
best: Wir freuen uns, dass die FAA anerkennt, dass ein angenehmes ist nicht mit Sicherheit und Sicherheit unvereinbar ist, sagte Roger Dow, CEO der US-die.
base: Wir freuen uns uber die <unk>, dass ein <unk> <unk> mit Sicherheit nicht vereinbar ist mit Sicherheit und Sicherheit, sagte Roger Cameron, CEO der US-<unk>.
![Page 87: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/87.jpg)
TED talk, English-German
[Figure: two bar charts over the systems Stanford, Edinburgh, Karlsruhe, Heidelberg, PJAIT. BLEU chart (axis 0 to 35), bar values: 30.85, 26.18, 26.02, 24.96, 22.51, 20.08. HUMAN TER chart (axis 0 to 30), bar values: 16.16, 21.84, 22.67, 23.42, 28.18. Callout: 26%.]
Thang Luong and Chris Manning. Stanford Neural Machine Translation Systems for Spoken Language Domain. IWSLT 2015.
Winning
![Page 88: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/88.jpg)
Advancing NMT
• #1: the vocabulary size problem
  – Sol: “copy” mechanism.
• #2: the sentence length problem
  – Sol: attention mechanism.
• #3: the language complexity problem
  – Sol: character-level translation.
[Figure labels: “The <unk> portico in <unk>” and “Le <unk> <unk> de <unk>”, with the words ecotax, Pont-de-Buis and portique, écotaxe, Pont-de-Buis; “I am a student _” and “Je suis étudiant _”.]
![Page 89: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/89.jpg)
#3 The rare word problem
• “Copying” mechanisms are not sufficient.
  – Different alphabets: Christopher ↦ Kryštof
  – Multi-word alignment: Solar system ↦ Sonnensystem
• Need to handle large, open vocabulary
  – Rich morphology: nejneobhospodařovávatelnějšímu (“to the worst farmable one”)
  – Informal spelling: goooooood morning !!!!!
Be able to generate at the character level.
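The rare word problem can be illustrated with a minimal Python sketch. The toy corpus, the function names, and the vocabulary size below are assumptions made up for illustration, not part of the lecture. A fixed word vocabulary collapses unseen spellings into <unk>, while a character-level view keeps every symbol.

```python
from collections import Counter

UNK = "<unk>"

def build_vocab(corpus_tokens, max_size):
    """Keep the max_size most frequent words; all others fall outside the vocabulary."""
    counts = Counter(corpus_tokens)
    return {word for word, _ in counts.most_common(max_size)}

def to_word_level(tokens, vocab):
    """Word-level view: every out-of-vocabulary word becomes the single symbol <unk>."""
    return [tok if tok in vocab else UNK for tok in tokens]

def to_char_level(tokens):
    """Character-level view: each word is spelled out, so no information is lost."""
    return [list(tok) for tok in tokens]

# Toy data (hypothetical): "good" and "morning" are frequent, "evening" is not.
corpus = "good morning good morning good evening".split()
vocab = build_vocab(corpus, max_size=2)

sentence = "goooooood morning".split()
word_view = to_word_level(sentence, vocab)   # the informal spelling is lost
char_view = to_char_level(sentence)          # the informal spelling survives
```

Here the informal spelling maps to <unk> at the word level, which is exactly the case a copy mechanism cannot repair when the target side needs a different string.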
![Page 90: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/90.jpg)
Recent character-level NMT
• Unsatisfactory performance
  – (Wang Ling, Isabel Trancoso, Chris Dyer, Alan Black, arXiv 2015)
• Incomplete solution
  – Decoder only (Junyoung Chung, Kyunghyun Cho, Yoshua Bengio. arXiv 2016).
![Page 91: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/91.jpg)
• A best-of-both-worlds architecture:
  – Translate mostly at the word level
  – Only go to the character level when needed.
• Additional +2.1 ↦ +11.4 BLEU improvement.
Thang Luong and Chris Manning. Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models. In submission, ACL 2016.
SOTA for English-Czech translation.
![Page 92: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/92.jpg)
Hybrid NMT
Word-level (4 layers)
End-to-end training 8-stacking LSTM layers.
![Page 93: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/93.jpg)
Source Representations
• On-the-fly embeddings.
  – Zero-init, batch computation.
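A rough sketch of what on-the-fly source embeddings could look like is given below. It is an assumption-laden illustration, not the lecture's implementation: a single-layer tanh recurrence stands in for the character-level LSTM, all dimensions and parameter names are invented, and the per-sentence cache only approximates the batch computation named on the slide. Frequent words use a lookup table; any other word gets a vector built from its characters, starting from a zero state.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8                                   # hypothetical embedding size
CHARS = "abcdefghijklmnopqrstuvwxyz"      # assumption: lowercase ASCII input only

# Randomly initialised stand-ins for trained parameters.
char_emb = {c: rng.normal(scale=0.1, size=DIM) for c in CHARS}
W_x = rng.normal(scale=0.1, size=(DIM, DIM))
W_h = rng.normal(scale=0.1, size=(DIM, DIM))
word_emb = {"the": rng.normal(size=DIM), "cat": rng.normal(size=DIM)}

def char_embedding(word):
    """Run a simple recurrence over the characters, starting from a zero state."""
    h = np.zeros(DIM)
    for c in word:
        h = np.tanh(W_x @ char_emb[c] + W_h @ h)
    return h

def embed_sentence(tokens):
    """Look up frequent words; build rare-word vectors from characters on the fly."""
    cache = {}                            # each rare word is computed only once
    rows = []
    for tok in tokens:
        if tok in word_emb:
            rows.append(word_emb[tok])
        else:
            if tok not in cache:
                cache[tok] = char_embedding(tok)
            rows.append(cache[tok])
    return np.stack(rows)

vectors = embed_sentence(["the", "cute", "cat", "cute"])
```

The word "cute" is outside the toy lookup table, so both of its occurrences share one character-derived vector.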
![Page 94: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/94.jpg)
Target Generation
Init with word hidden states.
Purposely, use <unk> to make decoding easier.
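The two-stage target generation can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `fill_unks` is hypothetical, and a lookup table keyed by position stands in for the character-level decoder, which in the real model would be initialised from the word-level hidden state at that position. The word-level output is taken from the English-Czech sample shown later in the lecture.

```python
UNK = "<unk>"

def fill_unks(word_output, generate_word):
    """Replace each <unk> with a word produced by a character-level generator.

    generate_word(position) is a stand-in for the character decoder.
    """
    return [
        generate_word(i) if tok == UNK else tok
        for i, tok in enumerate(word_output)
    ]

# Word-level decoder output, with <unk> kept as a placeholder.
word_output = ["Autor", "Stephen", "Jay", UNK, "zemřel", "20", "let", "po", UNK, "."]

# Toy stand-in for the character decoder: position -> generated word.
toy_char_decoder = {3: "Gould", 8: "diagnóze"}

final = fill_unks(word_output, toy_char_decoder.__getitem__)
```

Keeping <unk> in the word-level output means the word-level search runs as usual, and character-level generation is only invoked at the marked positions.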
![Page 95: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/95.jpg)
English-Czech WMT’15 Results
| Systems | BLEU |
|---|---|
| Winning entry (Bojar & Tamchyna, 2015) | 18.8 |
| Existing word-level NMT (Jean et al., 2015) | |
| – Single model | 15.7 |
| – Ensemble 4 models | 18.3 |
Callouts: Large vocab + unk replace; 30x data, 3 systems.
![Page 96: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/96.jpg)
English-Czech WMT’15 Results
| Systems | BLEU |
|---|---|
| Winning entry (Bojar & Tamchyna, 2015) | 18.8 |
| Existing word-level NMT (Jean et al., 2015) | |
| – Single model | 15.7 |
| – Ensemble 4 models | 18.3 |
| Our character-based NMT | |
| – Single model (600-step backprop) | 15.9 |
Callouts: Large vocab + unk replace; 30x data, 3 systems.
• Purely character-based: slow but promising!
![Page 97: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/97.jpg)
English-Czech WMT’15 Results
| Systems | BLEU |
|---|---|
| Winning entry (Bojar & Tamchyna, 2015) | 18.8 |
| Existing word-level NMT (Jean et al., 2015) | |
| – Single model | 15.7 |
| – Ensemble 4 models | 18.3 |
| Our character-based NMT | |
| – Single model (600-step backprop) | 15.9 |
| Our hybrid NMT | |
| – Single model | 19.6 |
Callouts: Large vocab + unk replace; 30x data, 3 systems.
New SOTA!
![Page 98: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/98.jpg)
English-Czech WMT’15 Results
| Systems | BLEU |
|---|---|
| Winning entry (Bojar & Tamchyna, 2015) | 18.8 |
| Existing word-level NMT (Jean et al., 2015) | |
| – Single model | 15.7 |
| – Ensemble 4 models | 18.3 |
| Our character-based NMT | |
| – Single model (600-step backprop) | 15.9 |
| Our hybrid NMT | |
| – Single model | 19.6 |
| – Ensemble 4 models | 20.7 |
Callouts: Large vocab + unk replace; 30x data, 3 systems.
Better SOTA!
![Page 99: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/99.jpg)
Effects of Vocabulary Sizes
[Figure: bar chart of BLEU (y-axis, 0 to 20) against Vocabulary Size (x-axis: 1K, 10K, 20K, 50K) for three systems: Word, Word + unk replace, Hybrid.]
![Page 100: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/100.jpg)
Effects of Vocabulary Sizes
[Figure: bar chart of BLEU (y-axis, 0 to 20) against Vocabulary Size (x-axis: 1K, 10K, 20K, 50K) for three systems: Word, Word + unk replace, Hybrid. Gain callouts: +11.4, +4.5, +3.5, +2.1.]
Additional gains of +2.1 ↦ +11.4 BLEU
![Page 101: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/101.jpg)
Effects of Vocabulary Sizes
[Figure: bar chart of BLEU (y-axis, 0 to 20) against Vocabulary Size (x-axis: 1K, 10K, 20K, 50K) for three systems: Word, Word + unk replace, Hybrid.]
Small-vocab hybrid = Large-vocab word
![Page 102: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/102.jpg)
Rare Word Embeddings
[Figure: 2-D scatter plot of word embeddings, both axes running from 0 to 1. The labelled words appear in two alphabetical groups. First group (examples): acceptable, acknowledgement, admission, admit, choose, chooses, develop, developments, impossible, possible, satisfactory, unacceptable, unsatisfactory, unsuitable. Second group (examples): antagonize, cofounders, companionships, disrespectful, heartlessly, heartlessness, narrow-minded, narrow-mindedness, unattainableness, unfeathered, unfledged, untrustworthy, wholeheartedness.]
• Word & character-based embeddings.
![Page 103: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/103.jpg)
![Page 104: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/104.jpg)
![Page 105: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/105.jpg)
Sample English-Czech translations
source: The author Stephen Jay Gould died 20 years after diagnosis.
human: Autor Stephen Jay Gould zemřel 20 let po diagnóze.
char: Autor Stepher Stepher zemřel 20 let po diagnóze.
word: Autor Stephen Jay <unk> zemřel 20 let po <unk>.
      Autor Stephen Jay Gould zemřel 20 let po po.
hybrid: Autor Stephen Jay <unk> zemřel 20 let po <unk>.
        Autor Stephen Jay Gould zemřel 20 let po diagnóze.
Perfect translation!
![Page 106: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/106.jpg)
Sample English-Czech translations
• Char-based: wrong name translation.
![Page 107: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/107.jpg)
Sample English-Czech translations
• Word-based: incorrect alignment.
• Char-based: wrong name translation.
![Page 108: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/108.jpg)
Sample English-Czech translations
• Char-based & hybrid: correct translation of diagnóze.
• Char-based: wrong name translation.
![Page 109: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/109.jpg)
Sample English-Czech translations
source: As the Reverend Martin Luther King Jr. said fifty years ago:
human: Jak před padesáti lety řekl reverend Martin Luther King Jr.:
char: Jako reverend Martin Luther král říkal před padesáti lety:
word: Jak řekl reverend Martin <unk> King <unk> před padesáti lety:
      Jak řekl reverend Martin Luther King řekl před padesáti lety:
hybrid: Jak řekl reverend Martin <unk> King <unk> před padesáti lety:
        Jak před padesáti lety řekl reverend Martin Luther King Jr.:
Correct reordering!
![Page 110: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/110.jpg)
Sample English-Czech translations
• Char-based: “král” means “king”.
• Char-based: wrong name translation.
![Page 111: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/111.jpg)
Sample English-Czech translations
source: Her 11-year-old daughter, Shani Bart, said it felt a little bit weird
human: Její jedenáctiletá dcera Shani Bartová prozradila, že je to trochu zvláštní
char: Její jedenáctiletá dcera, Shani Bartová, říkala, že cítí trochu divně
word: Její <unk> dcera <unk> <unk> řekla, že je to trochu divné
      Její 11-year-old dcera Shani, řekla, že je to trochu divné
hybrid: Její <unk> dcera, <unk> <unk>, řekla, že je to <unk> <unk>
        Její jedenáctiletá dcera, Graham Bart, řekla, že cítí trochu divný
Generate complex words!
![Page 112: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/112.jpg)
Sample English-Czech translations
• Word-based: identity copy fails.
• Char-based: wrong name translation.
![Page 113: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/113.jpg)
Sample English-Czech translations
• Hybrid: translate names incorrectly.
• Char-based: wrong name translation.
![Page 114: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/114.jpg)
We have advanced NMT
• #1: the vocabulary size problem
  – Sol: “copy” mechanism.
  – SOTA English-French
• #2: the sentence length problem
  – Sol: attention mechanism.
  – SOTA English-German
• #3: the language complexity problem
  – Sol: character-level translation.
  – SOTA English-Czech
![Page 115: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/115.jpg)
NMT & beyond
• Unsupervised learning for NMT
  – Utilize monolingual data.
• Long-context NMT
  – Translating an article / a book.
  – Smarter attention, longer sequences.
• Multi-modal Language Understanding System
  – Multi-lingual translation + speech recognition + more
  – Multi-task learning
Thank you!
![Page 116: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/116.jpg)
I’m done. But if you are curious, read on!
![Page 117: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/117.jpg)
#4 For the future of NMT
• Can we utilize all sequence-to-sequence data?
• Can we compress NMT for mobile devices?
[Figure labels: Machine translation, Constituent parsing]
![Page 118: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/118.jpg)
Our work
• Multi-task learning:
  – Machine translation
  – Constituent parsing
  – Image caption generation
  – Unsupervised learning
• Translation improvement: up to +1.5 BLEU.
• State-of-the-art in constituent parsing.
Thang Luong, Quoc Le, Ilya Sutskever, Oriol Vinyals, and Lukasz Kaiser. Multi-task sequence to sequence learning. ICLR 2016.
![Page 119: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/119.jpg)
Many-to-one: shared decoder
[Diagram: three inputs, English (unsupervised), Image (captioning), German (translation), with one shared English output.]
[Figure: German-English Translation (BLEU), y-axis 16 to 19. Setting: Big (translation) + Medium (caption) (Luong et al., 2015). Bars: Single, + caption (0.1x), + caption (0.05x), + caption (0.01x). Callout: +0.7 BLEU.]
![Page 120: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/120.jpg)
One-to-many: shared encoder
[Diagram: one shared English input with three outputs, English (unsupervised), German (translation), Tags (parsing).]
[Figure: English-German Translation (BLEU), y-axis 13 to 16. Setting: Big (translation) + Small (PTB parsing) (Luong et al., 2015). Bars: Single, + parsing (1x), + parsing (0.1x), + parsing (0.01x). Axis label: Mixing ratio. Callout: +1.5 BLEU.]
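One way to read the mixing ratio is as the relative frequency with which each task receives a parameter update. The slide does not define it, so this reading, the function name `task_schedule`, and the ratio values below are assumptions for illustration only. A minimal Python sketch of such a schedule:

```python
import random

def task_schedule(mixing_ratios, num_updates, seed=0):
    """Sample which task gets each update, with probability proportional to its ratio."""
    rng = random.Random(seed)
    tasks = list(mixing_ratios)
    weights = [mixing_ratios[task] for task in tasks]
    return rng.choices(tasks, weights=weights, k=num_updates)

# Hypothetical setting: translation is the main task, parsing is mixed in at 0.1x.
ratios = {"translation": 1.0, "parsing": 0.1}
schedule = task_schedule(ratios, num_updates=1000)

# Fraction of updates spent on the auxiliary task (expected value 0.1 / 1.1).
parsing_share = schedule.count("parsing") / len(schedule)
```

In a training loop, each entry of `schedule` would select the data set and the encoder or decoder pair to use for that step, while the shared component is updated every time.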
![Page 121: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/121.jpg)
Our work
Abigail See*, Thang Luong*, and Chris Manning. Compression of Neural Machine Translation Models via Pruning. In submission.
• Compress NMT via pruning & retraining:
[Figure 1: Performance of pruned models, immediately after pruning and after retraining. x-axis: percentage pruned (0 to 90); y-axis: BLEU score (0 to 25); curves: pruned, pruned and retrained. Slide labels: Original model, Prune smallest weights, Prune + retrain.]
![Page 122: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/122.jpg)
Our work
• Compress NMT via pruning & retraining:
[Figure 1: Performance of pruned models, immediately after pruning and after retraining. Slide labels: Original model, Prune smallest weights, Prune + retrain.]
Prune 80% without loss of performance.
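The "prune smallest weights" step can be sketched with NumPy as below. This is a generic magnitude-pruning illustration applied to a single random matrix, not the authors' code. The function name `prune_smallest` and the matrix size are assumptions, and the slide does not say whether the threshold is chosen per matrix or globally.

```python
import numpy as np

def prune_smallest(weights, fraction):
    """Zero out the given fraction of entries with the smallest absolute value.

    Returns the pruned weights and the boolean mask of surviving entries.
    """
    magnitudes = np.abs(weights).ravel()
    k = int(fraction * magnitudes.size)          # number of weights to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    threshold = np.partition(magnitudes, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask, mask

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 10))                    # stand-in for one weight matrix
W_pruned, mask = prune_smallest(W, fraction=0.8)
```

Retraining would then continue training with the mask held fixed, so that pruned entries stay at zero. That loop is omitted here.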
![Page 123: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/123.jpg)
NMT Redundancy – Embeddings
• Frequent words have larger weights
  – white: large.
  – black: small.
[Figure: weight maps for the source and target embedding weights, with an axis running between most common word and least common word, shown with panels for source and target layers 1 to 4 (input gate, forget gate, output gate, input; feed-forward and recurrent weights).]
![Page 124: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/124.jpg)
NMT Redundancy – LSTM
[Figure: weight maps for Encoder and Decoder, Layer 1 to Layer 4. Rows: Input gate, Forget gate, Output gate, Input signal. Columns: Feed-forward, Recurrent.]
![Page 125: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/125.jpg)
NMT Redundancy – LSTM
[Figure: weight maps for Encoder and Decoder, Layer 1 to Layer 4. Rows: Input gate, Forget gate, Output gate, Input signal. Columns: Feed-forward, Recurrent.]
Input signal is important!
![Page 126: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/126.jpg)
NMT Redundancy – LSTM
[Figure: weight maps for Encoder and Decoder, Layer 1 to Layer 4. Rows: Input gate, Forget gate, Output gate, Input signal. Columns: Feed-forward, Recurrent.]
Forget gate: small – layer 1, large – layer 4
![Page 127: Neural Machine Transla#on](https://reader031.vdocuments.site/reader031/viewer/2022021815/586a10fd1a28abd97c8bb185/html5/thumbnails/127.jpg)
Future Challenges
She saw an elephant in her dress.
The elephant must have a good sense of fashion!
Needs to understand common sense & larger context.
Thank you!