Multilingual Grammar Induction with Continuous Language Identification

Wenjuan Han, Ge Wang, Yong Jiang, Kewei Tu

ShanghaiTech University, Shanghai, China
Alibaba Group

{hanwj,wangge,tukw}@shanghaitech.edu.cn
{yongjiang.jy}@alibaba-inc.com

November 9, 2019

Han et al., 2019 M-NDMV November 9, 2019 1 / 24


Outline

1 Motivation

2 Model

3 Learning

4 Experiments

5 Conclusion

Motivation

Grammar induction is the task of learning grammars from an unannotated corpus.

Multilingual grammar induction couples the grammar parameters of different languages together and learns them simultaneously.

→ The key is to exploit the similarities between languages.


Existing approaches to this problem:

Treating all languages equally (Iwata et al., 2010). → Language similarity is ignored.

Using a hand-crafted phylogenetic tree to encode language similarity (Berg-Kirkpatrick and Klein, 2010). → This requires linguistic knowledge and can sometimes be misleading. Example: English is dominantly SVO while German is not, although both are Germanic languages.


Model

We represent language identities with continuous vectors, i.e., language embeddings, and use them to encode language similarity.


Model Architecture

[Figure: model architecture. Labels shown: Multilingual Grammar Model (G) with h, val, W_dir, W_c, and output P_attach(c|·); Auxiliary Language Identification Model (I) with input words x1, x2, x3, ..., xn and output P(l|x); the language embedding matrix providing the language embedding l.]


Neural DMV grammar rule probability: $P_{\text{ATTACH}}(\text{child} \mid \text{head}, \text{direction}, \text{valence})$

Now we have: $P_{\text{ATTACH}}(\text{child} \mid \text{head}, \text{direction}, \text{valence}, \text{language})$
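As a rough illustration of what conditioning the attachment rule on the language can look like, the sketch below concatenates toy vectors for the head, valence, direction, and language, applies one hidden layer, and takes a softmax over child tags. The tag set, dimensions, random weights, and the use of concatenation with a single ReLU layer are all assumptions made for illustration; the slides do not specify the network layers.

```python
import math
import random

random.seed(0)

def rand_vec(n):
    return [random.uniform(-0.1, 0.1) for _ in range(n)]

def rand_mat(rows, cols):
    return [rand_vec(cols) for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy inventory and sizes (not taken from the slides).
TAGS = ["NOUN", "VERB", "ADJ"]
LANGS = ["en", "de"]
DIM = 4

tag_emb = {t: rand_vec(DIM) for t in TAGS}
val_emb = {v: rand_vec(DIM) for v in (0, 1)}
dir_emb = {d: rand_vec(DIM) for d in ("left", "right")}
lang_emb = {l: rand_vec(DIM) for l in LANGS}

W_hidden = rand_mat(DIM, 4 * DIM)    # assumed hidden layer
W_child = rand_mat(len(TAGS), DIM)   # assumed output layer over child tags

def p_attach(head, direction, valence, language):
    """Return a distribution over child tags given head, direction, valence, language."""
    features = tag_emb[head] + val_emb[valence] + dir_emb[direction] + lang_emb[language]
    hidden = [max(0.0, z) for z in matvec(W_hidden, features)]
    probs = softmax(matvec(W_child, hidden))
    return dict(zip(TAGS, probs))

dist_en = p_attach("VERB", "right", 0, "en")
dist_de = p_attach("VERB", "right", 0, "de")
```

Because the language vector is just one more input, all languages share the same network weights, and only the embedding differs between `dist_en` and `dist_de`.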


The auxiliary model predicts the language identity of a sentence from the language embeddings and the sentence representation.
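One simple way to read this step is as a softmax over languages, where each language is scored by the dot product between its embedding and a sentence vector. The sketch below does exactly that; the averaging encoder, the dot-product scoring, and the toy vectors are assumptions for illustration and are not taken from the slides.

```python
import math

def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sentence_repr(word_vecs):
    """Average the word vectors (a stand-in for the real sentence encoder)."""
    n = len(word_vecs)
    return [sum(column) / n for column in zip(*word_vecs)]

def p_language(word_vecs, lang_emb):
    """Return P(l | x) as a softmax over embedding-sentence dot products."""
    s = sentence_repr(word_vecs)
    langs = sorted(lang_emb)
    scores = [sum(a * b for a, b in zip(lang_emb[l], s)) for l in langs]
    return dict(zip(langs, softmax(scores)))

# Hypothetical 2-dimensional language embeddings and word vectors.
lang_emb = {"en": [1.0, 0.0], "de": [0.8, 0.6], "ja": [-1.0, 0.2]}
sentence = [[0.9, 0.1], [1.1, -0.1]]
probs = p_language(sentence, lang_emb)
```

In this toy setup the sentence vector lies closest to the "en" embedding, so "en" receives the highest probability and "de", whose embedding points in a similar direction, comes second.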


For each training sentence $x^{(i)}$ from language $l^{(i)}$, the model defines two quantities:

$P(x^{(i)} \mid G_{l^{(i)}})$, the probability of the training sentence $x^{(i)}$ being generated from grammar $G_{l^{(i)}}$.

$P(l^{(i)} \mid x^{(i)})$, the probability of correctly identifying the language of $x^{(i)}$.


Learning

Objective


The training objective is:

$$L(\Theta) = \sum_{(x,l) \in D} \Big( \log P_\Theta(x \mid G_l) + \lambda \log P_\Theta(l \mid x) \Big)$$
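To make the role of λ concrete, here is a minimal sketch that evaluates the objective for a handful of made-up probabilities. The function name `objective` and the numbers are hypothetical; in the actual model both probabilities would come from the grammar and the language identification network.

```python
import math

def objective(examples, lam):
    """Sum of log P(x | G_l) + lam * log P(l | x) over (P(x | G_l), P(l | x)) pairs."""
    return sum(math.log(p_x) + lam * math.log(p_l) for p_x, p_l in examples)

# Made-up probabilities for two training sentences: (P(x | G_l), P(l | x)).
examples = [(1e-4, 0.9), (1e-6, 0.5)]

value_no_aux = objective(examples, lam=0.0)    # grammar likelihood only
value_with_aux = objective(examples, lam=1.0)  # adds the identification term
```

Setting `lam=0.0` recovers the plain grammar log-likelihood, and larger values give more weight to correct language identification.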


Learning

$P(x^{(i)} \mid G_{l^{(i)}})$ → this term is optimized with EM (Adam is used in the M-step).

$P(l^{(i)} \mid x^{(i)})$ → this term is optimized directly with Adam.
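The idea of a gradient-based M-step can be sketched on a single toy distribution: given expected rule counts from the E-step, maximize the expected log-likelihood with Adam instead of normalizing the counts in closed form. Everything below is an assumption for illustration: the counts are made up, the rule probabilities are a softmax over free logits rather than the output of a neural network, and the hyperparameters are arbitrary.

```python
import math

def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def adam_m_step(counts, steps=1000, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Maximize sum_r counts[r] * log softmax(theta)[r] by Adam gradient ascent."""
    n = len(counts)
    theta = [0.0] * n  # free logits (a stand-in for network parameters)
    m = [0.0] * n      # first-moment estimates
    v = [0.0] * n      # second-moment estimates
    total = sum(counts)
    for t in range(1, steps + 1):
        probs = softmax(theta)
        # Gradient of the expected log-likelihood with respect to each logit.
        grad = [c - total * p for c, p in zip(counts, probs)]
        for i in range(n):
            m[i] = beta1 * m[i] + (1 - beta1) * grad[i]
            v[i] = beta2 * v[i] + (1 - beta2) * grad[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)  # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            theta[i] += lr * m_hat / (math.sqrt(v_hat) + eps)
    return softmax(theta)

# Hypothetical expected counts for three rules sharing one conditioning context.
expected_counts = [6.0, 3.0, 1.0]
rule_probs = adam_m_step(expected_counts)
```

For free logits the optimum is the normalized counts, so `rule_probs` should approach that solution. The gradient route matters when the probabilities are produced by a shared network, where no closed-form update is available.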


Experiments

Dataset

We selected 15 languages across 8 language families and subfamilies from the UD dataset to ensure diversity.

| Language (code) | UD Treebank | Language Family | Corpus Size |
|---|---|---|---|
| ET | Estonian | Finnic | 11404 |
| FI | Finnish | Finnic | 9648 |
| NL | Dutch | Germanic | 8783 |
| EN | English | Germanic | 7674 |
| DE | German | Germanic | 7447 |
| NO | Norwegian | Germanic | 10017 |
| GRC | Ancient Greek | Hellenic | 9387 |
| HI | Hindi | Indo-Iranian | 4997 |
| JA | Japanese | Japonic | 7441 |
| FR | French | Romance | 4976 |
| IT | Italian | Romance | 6492 |
| LA | Latin-ITTB | Romance | 10136 |
| BG | Bulgarian | Slavonic | 6507 |
| SL | Slovenian | Slavonic | 3800 |
| EU | Basque | Vasconic | 4271 |


Comparison of monolingual and multilingual approaches.

G: our multilingual grammar model.

G+I: our multilingual grammar model with the auxiliary language identification task.

| Code | Monolingual DMV | Monolingual NDMV | Multilingual DMV | Multilingual NDMV | Multilingual G | Multilingual G+I |
|---|---|---|---|---|---|---|
| ET | 51.8 | 52.9 | 43.1 | 45.3 | 56.0 | 56.4 |
| FI | 31.8 | 27.6 | 39.1 | 40.0 | 50.7 | 49.3 |
| NL | 42.4 | 35.6 | 46.5 | 47.8 | 50.4 | 50.6 |
| EN | 51.8 | 53.7 | 47.7 | 50.8 | 51.7 | 52.7 |
| DE | 52.8 | 50.4 | 55.5 | 57.2 | 59.6 | 61.4 |
| NO | 58.9 | 59.2 | 55.7 | 58.8 | 61.0 | 61.3 |
| GRC | 40.4 | 37.7 | 41.1 | 40.8 | 46.8 | 46.2 |
| HI | 52.6 | 53.9 | 29.2 | 31.1 | 47.4 | 46.8 |
| JA | 39.8 | 37.1 | 27.8 | 29.6 | 43.4 | 44.2 |
| FR | 58.8 | 38.1 | 59.6 | 59.4 | 58.4 | 60.1 |
| IT | 60.8 | 63.6 | 66.7 | 66.4 | 64.4 | 65.9 |
| LA | 32.6 | 36.3 | 39.8 | 42.0 | 45.1 | 45.0 |
| BG | 58.9 | 61.8 | 65.9 | 69.4 | 71.3 | 71.3 |
| SL | 70.7 | 67.5 | 62.1 | 63.3 | 68.3 | 68.6 |
| EU | 42.1 | 45.5 | 45.7 | 45.2 | 54.2 | 53.6 |
| Avg | 49.7 | 48.1 | 48.4 | 49.8 | 55.3 | 55.6 |

Each language is indicated by its ISO 639 code.


Visualization of the language embeddings

[Figure: 2-D scatter plot of the learned language embeddings. Each point is labeled with its language, and the legend groups languages by family: Finnic, Romance, Germanic, Slavonic, Hellenic, Japonic, Indo-Iranian, Vasconic. Plot coordinates are listed below.]

| Language | X | Y | Language Family |
|---|---|---|---|
| Estonian | 281.8368 | 294.4548 | Finnic |
| Latin | 99.28075 | -133.8807 | Romance |
| Norwegian | -58.05113 | -1.362061 | Germanic |
| Finnish | 291.2706 | 254.6438 | Finnic |
| Ancient_Greek | 170.2816 | -160.4296 | Hellenic |
| Dutch | -274.3047 | -71.46408 | Germanic |
| English | -90.90612 | -27.69737 | Germanic |
| German | -291.0553 | -31.99229 | Germanic |
| Japanese | 229.3594 | -182.7461 | Japonic |
| Bulgarian | 4.288813 | -12.05062 | Slavonic |
| Italian | -306.261 | 126.6668 | Romance |
| Hindi | 52.45371 | -182.9534 | Indo-Iranian |
| French | -291.0037 | 165.9962 | Romance |
| Basque | 116.8331 | 4.031695 | Vasconic |
| Slovenian | 65.97718 | -41.21712 | Slavonic |
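One way to inspect the plot numerically is to look up each language's nearest neighbor in the 2-D coordinates listed on this slide. This is only a sketch: distances in a 2-D projection are suggestive at best and need not match distances between the original embeddings, and the helper name `nearest` is hypothetical.

```python
import math

# (x, y) plot coordinates copied from the slide.
COORDS = {
    "Estonian": (281.8368, 294.4548),
    "Latin": (99.28075, -133.8807),
    "Norwegian": (-58.05113, -1.362061),
    "Finnish": (291.2706, 254.6438),
    "Ancient_Greek": (170.2816, -160.4296),
    "Dutch": (-274.3047, -71.46408),
    "English": (-90.90612, -27.69737),
    "German": (-291.0553, -31.99229),
    "Japanese": (229.3594, -182.7461),
    "Bulgarian": (4.288813, -12.05062),
    "Italian": (-306.261, 126.6668),
    "Hindi": (52.45371, -182.9534),
    "French": (-291.0037, 165.9962),
    "Basque": (116.8331, 4.031695),
    "Slovenian": (65.97718, -41.21712),
}

def nearest(language):
    """Return the language whose plotted point is closest to the given one."""
    x, y = COORDS[language]
    others = (name for name in COORDS if name != language)
    return min(
        others,
        key=lambda name: math.hypot(COORDS[name][0] - x, COORDS[name][1] - y),
    )

neighbors = {name: nearest(name) for name in COORDS}
```

In these coordinates Estonian and Finnish are each other's nearest neighbors, and German's nearest neighbor is Dutch.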


Conclusion

We represent language identities with language embeddings and use them to encode language similarity.

The language embeddings are used both for grammar parameter prediction and for the auxiliary language identification task.

The language embeddings learned by our model can capture language similarity that cannot be inferred from phylogenetic knowledge.
