Unsupervised Syntactic Category Induction using Multi-level Linguistic Features
Christos Christodoulopoulos (pre-viva talk)
What did we ever do for the linguists?
• Computational linguistic (or NLP) models
  – Mostly supervised (until recently)
  – Mostly on English (English ≈ WSJ sections 02-21)
  – Mostly on the “fat head” of Zipf’s law
Revolutionaries (without supervision)
• NLP is good at spotting patterns
  – Unsupervised learning
  – Machine learning
• Not great at looking at the whole picture
The “whole” picture
Parts of Speech – Morphology – Syntax – Alignments
Traditional NLP Pipeline
The “whole” picture
Parts of Speech – Morphology – Syntax – Alignments
Geertzen & van Zaanen (2004); Klein & Manning (2004); Virpioja et al. (2007); Snyder et al. (2009); Naseem et al. (2009); Sirts & Alumäe (2012); Clark (2003)
My thesis
• Patterns that correspond to syntactic categories or parts of speech (PoS)
  – Motivated by linguistic theories
• Holistic view of NLP
  – Instead of the pipeline approach
  – Computationally efficient
• Cross-lingual analysis
  – Might provide linguistic insights
My thesis
[Figure: example cross-lingual clusters induced from Greek and English]
Greek: τουτο, τι, εαν, τις, ιδου, ομως, οτε, παντες, συ, αυτος
Greek: Κυριος, Θεος, βασιλευς, υιος, Ιησους, λαος, ανθρωπος, Μωυσης, λογος, ιερευς
Greek: αυτο, εκει, πλεον, ουχι, εκαστος, τουτον, ταυτην, ουδεν, ετη, πυρ
Greek: εισθαι, καμει, δωσει, φερει, λαβει, ελθει, ειπει, γνωρισει, γεινει, προσφερει
Greek: ηναι, δυναται, γεινη, πρεπει, ελθη, καμη, δωση, καμω, ιδη, λαβη
English: he, it, there, she, whosoever, soon, others, Satan, whoso, Elias
English: man, day, time, city, place, thing, priest, woman, wicked, spirit
English: do, make, give, take, know, bring, eat, see, hear, keep
English: this, what, if, whoever, there, but, when, everyone, you, him
English: is, does, gives, brings, comes, takes, says, knows, becomes, offers
English: Lord, God, king, son, Jesus, people, man, Moses, word, priest
English: this, there, since, not, whosoever, him, her, none, fire
English: be, can, become, must, come, do, give, make, see, give
What do we need?
• Theory of syntactic categories
  – What are we looking for?
• Clustering method
  – Review of existing methods
  – Multiple sources of information
• Alignment method based on PoS
Theory of syntactic categories
• Not everyone agrees on what they are
• Syntactic categories / PoS / word classes?
• Most agree that they capture more than one level of language structure
• Plato, Aristotle: semantic (noun, verb); morphological (conjunction)
• Dionysius Thrax: 8 parts of speech; semantic, syntactic & morphological
• Lindley Murray: ‘School account’ of 9 parts of speech; semantic & syntactic
• Ray Jackendoff: feature-based (e.g. ±Subject); purely syntactic
• Susan Schmerling: formal semantics <e,t> (nouns, adjectives, intr. verbs); notional (pragmatic) definition
• Paul Schachter: morphological, syntactic & distributional
Influence on my work:
• Not easy to focus on any particular theory
• Multiple sources of information is key
What do we need?
• Theory of syntactic categories
  – What are we looking for?
• Clustering method
  – Review of existing methods
  – Multiple sources of information
• Alignment method based on PoS
Evaluation
• How will we know whether we have found good clusters?
• Intrinsic
  – Test on existing PoS-tagged data (gold standard)
• Extrinsic
  – Use clusters as input to another task
• Both have issues when used with unsupervised methods
Intrinsic evaluation
• Clusters might not correspond to PoS
  – Cluster IDs instead of labels
  – Different sizes
• Gold standard follows specific linguistic theories
• Gold standard might not help downstream
  – Annotations are tuned to specific tasks
The (intrinsic) elephant in the room
• Cyclical problem
  – Trying to discover clusters that don’t (necessarily) correspond to gold-standard annotations
  – Evaluating them on gold-standard annotations
• (Compromising) solution: test on multiple languages, using multiple systems
  – There is going to be some overlap
Extrinsic evaluation
• “Passing the buck” to the next task
  – If unsupervised and intrinsically evaluated
• Performances might not be correlated
  – Intrinsic gains on task #1 do not correlate with gains on task #2 (Headden et al., 2008)
• Depends on the degree of integration
  – How much of task #2’s input is the output of task #1
Evaluation
• Intrinsic evaluation metrics
  – Mapping-based
    • Many-to-one (m-1), one-to-one, cross-validation
    • Widely used
    • Sensitive to the size of the induced tagset
  – Information-theoretic
    • Variation of information (vi), V-measure (vm)
    • Less sensitive
    • Less intuitive (especially vi)
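The contrast between the two metric families can be seen on a toy example. A minimal sketch (the helper names `many_to_one` and `v_measure` are hypothetical; V-measure is computed as the harmonic mean of homogeneity and completeness, following its standard definition):

```python
import math
from collections import Counter

def many_to_one(gold, pred):
    """Map each induced cluster to its most frequent gold tag, then score accuracy."""
    best = {c: Counter(g for g, p in zip(gold, pred) if p == c).most_common(1)[0][0]
            for c in set(pred)}
    return sum(best[p] == g for g, p in zip(gold, pred)) / len(gold)

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())

def cond_entropy(xs, ys):
    """H(X|Y): entropy of xs within each ys group, weighted by group size."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return sum(len(g) / len(xs) * entropy(g) for g in groups.values())

def v_measure(gold, pred):
    h = 1 - cond_entropy(gold, pred) / entropy(gold)   # homogeneity
    c = 1 - cond_entropy(pred, gold) / entropy(pred)   # completeness
    return 2 * h * c / (h + c)

gold = ["DT", "NN", "VB", "DT", "NN", "VB"]
pred = [0, 1, 2, 0, 3, 2]   # clusters 1 and 3 both map to NN

print(many_to_one(gold, pred))            # 1.0: m-1 rewards over-fragmenting the tagset
print(round(v_measure(gold, pred), 3))    # 0.905: completeness penalises the NN split
```

The example shows exactly the size sensitivity mentioned above: splitting NN into two pure clusters costs nothing under m-1 but lowers completeness, and hence vm.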
What do we need?
• Theory of syntactic categories
  – What are we looking for?
• Clustering method
  – Review of existing methods
  – Multiple sources of information
• Alignment method based on PoS
Multiple systems examined:
• Average performance on 8 languages (2010 review)
• Average performance on 22 languages (2011 review)
tl;dr: Christodoulopoulos et al., 2010; 2011
Most successful properties:
• Use of morphology features
• One cluster per word type
Bayesian Multinomial Mixture Model (BMMM)
• Three key properties:
  – One tag per word
    • Helpful for unsupervised systems
  – Mixture model (instead of HMM)
    • Easier to handle non-local features
  – Easy to add multiple features
    • e.g. morphology, alignments
SPOILER ALERT!!! I’ll be adding dependencies
BMMM: basic model structure
[Plate diagram: α → θ → z; β → φ; z generates features f_nj; plates over M word types and Z classes]
• For each word type i, choose a class z_i (conditioned on θ)
• For each word token j, choose a feature f_ij (conditioned on φ_{z_i})
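The two-step generative story can be sketched directly. This is a toy illustration only, not the thesis implementation; the class count `Z`, feature vocabulary size `F`, the hyperparameter values and the example words are all made up:

```python
import random

def dirichlet(alpha, k):
    """Sample a symmetric Dirichlet via normalised Gamma draws."""
    ws = [random.gammavariate(alpha, 1.0) for _ in range(k)]
    s = sum(ws)
    return [w / s for w in ws]

def sample_cat(probs):
    """Draw an index from a categorical distribution."""
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

random.seed(0)
Z, F, ALPHA, BETA = 3, 5, 1.0, 0.1       # classes, feature vocab, hyperparameters
theta = dirichlet(ALPHA, Z)               # class distribution theta ~ Dir(alpha)
phi = [dirichlet(BETA, F) for _ in range(Z)]   # per-class feature dists phi_z ~ Dir(beta)

# One class per word TYPE; features are drawn per TOKEN of that type.
tokens_per_type = {"the": 4, "dog": 2, "runs": 2}
corpus = {}
for w in tokens_per_type:
    z = sample_cat(theta)                                  # z_i ~ Cat(theta)
    feats = [sample_cat(phi[z]) for _ in range(tokens_per_type[w])]  # f_ij ~ Cat(phi_z)
    corpus[w] = (z, feats)
```

The type-level class assignment is what gives the model its "one tag per word" property: every token of a type shares the same z, while its features vary token by token.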
BMMM: extended model
[Plate diagram: the class z now generates a type-level feature set f(T) with parameters φ(T), β(T), plus m token-level feature sets f(1)…f(m) with parameters φ(1)…φ(m), β(1)…β(m)]
Development results (averaged over 8 languages)
[Chart: m-1 scores (left axis) and vm scores (right axis) for the base system, +morph, average alignments, best alignments, and best alignments +morph]
Final results (using the +morph system)
[Chart: vm scores for hcd, clark and bmmm, averaged over the MULTEXT-East, miscellaneous and CoNLL corpora, and on WSJ]
What do we need?
• Theory of syntactic categories
  – What are we looking for?
• Clustering method
  – Review of existing methods
  – Multiple sources of information
• Alignment method based on PoS
  – Interdependence of linguistic structure
Putting the syntax in syntactic categories
• Induced dependencies as BMMM features
• DMV (Klein & Manning, 2004)
  – Basis for most dependency parsers
  – Uses parts of speech as terminal nodes
• Proxy for a joint model
  – Induce PoS that help DMV induce dependencies…
  – …that help induce better PoS…
  – …repeat…
For example: Cohen & Smith (2009); Headden et al. (2009); Gillenwater et al. (2010); Blunsom & Cohn (2011); Spitkovsky et al. (2010a,b, 2011a,b,c)
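The induce–repeat loop above can be sketched as alternating inference. Here `induce_pos` and `induce_deps` are hypothetical toy stand-ins for BMMM and DMV (a hash-based clusterer and a previous-word "head" rule), illustrating only the data flow of the proxy, not the real models:

```python
def induce_pos(sentences, dep_features=None):
    """Stand-in for BMMM: assign each word type a class, optionally
    conditioning on a dependency feature for that type."""
    tags = {}
    for sent in sentences:
        for w in sent:
            key = (w, dep_features.get(w) if dep_features else None)
            tags[w] = hash(key) % 10          # toy: hash into 10 classes
    return tags

def induce_deps(sentences, tags):
    """Stand-in for DMV: record, per word type, the tag of its (toy) head,
    here simply the previous word; -1 marks the root position."""
    feats = {}
    for sent in sentences:
        for i, w in enumerate(sent):
            feats[w] = tags[sent[i - 1]] if i > 0 else -1
    return feats

sentences = [["this", "is", "a", "sentence"], ["this", "is", "another", "one"]]
tags = induce_pos(sentences)                  # iteration 0: no dependency features
for _ in range(3):                            # iterations 1..3
    deps = induce_deps(sentences, tags)       # deps induced from current PoS...
    tags = induce_pos(sentences, deps)        # ...fed back as BMMM features
```

Each pass re-estimates one level while holding the other fixed, which is exactly why the scheme works as a cheap proxy for joint inference.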
The Iterated Learning (IL) Model
[Diagram: the BMMM takes MORPHOLOGY features (suffixes such as -ly, -ing, -ed, -ingly, -ity) and L+R CONTEXT features (counts of neighbouring words such as is, word, context, the, a) for word types (counting, is, simply, word, this, can); its output tags the corpus, e.g. This/32 is/12 a/32 tagged/1 sentence/28 ./0, which the DMV parses; the induced DEPENDENCIES are then fed back to the BMMM as an extra feature set]
Experiments on WSJ and CoNLL (≤10 words)
IL – Dependencies: WSJ10 results
[Chart: BMMM M-1 and DMV undirected accuracy over iterations 0–10]
IL – Dependencies: average over 9 CoNLL languages (≤10 words)
[Chart: BMMM M-1, DMV Undir and Gold M-1 over iterations 0–5]
IL – Dependencies: shortcomings
• DMV is not state of the art
  – Best systems surpass it by more than 15% accuracy (for WSJ)
  → Replace the DMV component with a state-of-the-art system (TSG-DMV)
• Trained/tested only on ≤10-word sentences
  – Hard to compare the PoS inducer
  – Not realistic
  → Use full-length-sentence corpora for training and testing
Results not shown here. tl;dr: slightly better results on PoS, much worse on deps. Interesting for further discussion.
Using full-length sentences: average over 9 CoNLL languages
[Chart: BMMM M-1 and DMV Undir over iterations 0–5]
Recap
• Both BMMM and DMV improve
  – Mostly in the first few iterations
• Using full-length sentences:
  – Increase in BMMM above the system with gold deps
  – DMV close to its performance with gold PoS (but lower than in the ≤10-word case)
What do we need?
• Theory of syntactic categories
  – What are we looking for?
• Clustering method
  – Review of existing methods
  – Multiple sources of information
  – Interdependence of linguistic structure
• Alignment method based on PoS
Further IL experiments
• Giza++ (Och & Ney, 2000)
  – Extension of the IBM models 1–4 (Brown et al., 1993)
  – Uses ‘word classes’ to condition alignment probabilities
    • These can be replaced with BMMM classes
• Hansards English–French corpus
  – Manually annotated alignments for 500 sentences
• MULTEXT-East corpus
  – The novel 1984 in 8 languages (incl. English)
![Page 38: Unsupervised Syntactic Category Induction using Multi-level Linguistic Features Christos Christodoulopoulos Pre-viva talk](https://reader030.vdocuments.site/reader030/viewer/2022032801/56649ddf5503460f94ad9322/html5/thumbnails/38.jpg)
38
IL – Alignments: Hansards corpus
[Chart: M-1 (left axis) and AER (right axis) over iterations 0–20, for 0.5k and 1k training sentences]
![Page 39: Unsupervised Syntactic Category Induction using Multi-level Linguistic Features Christos Christodoulopoulos Pre-viva talk](https://reader030.vdocuments.site/reader030/viewer/2022032801/56649ddf5503460f94ad9322/html5/thumbnails/39.jpg)
39
IL – Alignments: MULTEXT-East corpus
[Chart: m-1 and vm scores over iterations 0–5]
![Page 40: Unsupervised Syntactic Category Induction using Multi-level Linguistic Features Christos Christodoulopoulos Pre-viva talk](https://reader030.vdocuments.site/reader030/viewer/2022032801/56649ddf5503460f94ad9322/html5/thumbnails/40.jpg)
40
Recap
• Iterated learning between PoS and X
  – X = {Dependencies, Alignments, Morphology}
  – Effective proxy for joint inference
• PoS induction helped by all other levels
  – A test for theories of PoS
  – A joint model of NLP
![Page 41: Unsupervised Syntactic Category Induction using Multi-level Linguistic Features Christos Christodoulopoulos Pre-viva talk](https://reader030.vdocuments.site/reader030/viewer/2022032801/56649ddf5503460f94ad9322/html5/thumbnails/41.jpg)
41
A joint model of NLP
[Diagram: Parts of Speech, Morphology, Syntax and Alignments, all interconnected]
![Page 42: Unsupervised Syntactic Category Induction using Multi-level Linguistic Features Christos Christodoulopoulos Pre-viva talk](https://reader030.vdocuments.site/reader030/viewer/2022032801/56649ddf5503460f94ad9322/html5/thumbnails/42.jpg)
42
Induction chains
[Diagram: starting from the BMMM, each step adds one level of structure and then re-runs the BMMM, branching into chains such as:]
• BMMM → Deps → BMMM → {Deps | Morph | Aligns} → BMMM → …
• BMMM → Morph → BMMM → {Deps | Morph | Aligns} → BMMM → …
• BMMM → Aligns → BMMM → {Deps | Morph | Aligns} → BMMM → …
• Single-level chains: BMMM → Morph → BMMM → Morph → …; likewise for Deps and Aligns
![Page 43: Unsupervised Syntactic Category Induction using Multi-level Linguistic Features Christos Christodoulopoulos Pre-viva talk](https://reader030.vdocuments.site/reader030/viewer/2022032801/56649ddf5503460f94ad9322/html5/thumbnails/43.jpg)
43
Induction chains
[Chart: m-1 and vm scores for English (en) and Bulgarian (bg) for the baseline and the six orderings aligns-deps-morph, aligns-morph-deps, deps-aligns-morph, deps-morph-aligns, morph-aligns-deps, morph-deps-aligns]
![Page 44: Unsupervised Syntactic Category Induction using Multi-level Linguistic Features Christos Christodoulopoulos Pre-viva talk](https://reader030.vdocuments.site/reader030/viewer/2022032801/56649ddf5503460f94ad9322/html5/thumbnails/44.jpg)
44
What do we need?
• Theory of syntactic categories
  – What are we looking for?
• Clustering method
  – Review of existing methods
  – Multiple sources of information
  – Interdependence of linguistic structure
• Alignment method based on PoS
And one more thing:
• Massively parallel corpus: Bible translations
  – Collected from online versions of the Bible
  – Cleaned up and verse-aligned (CES level 1 XML)
  – 100 languages
![Page 45: Unsupervised Syntactic Category Induction using Multi-level Linguistic Features Christos Christodoulopoulos Pre-viva talk](https://reader030.vdocuments.site/reader030/viewer/2022032801/56649ddf5503460f94ad9322/html5/thumbnails/45.jpg)
45
Cross-lingual clusters
[Figure: the Greek and English clusters shown earlier, now annotated with their grammatical function]
• Subjunctive mood (Greek cluster ηναι, δυναται, γεινη, πρεπει, ελθη, καμη, δωση, καμω, ιδη, λαβη; English [SBJV] cluster be, can, become, must, come, do, give, make, see): ο δε Κυριος ας καμη το αρεστον, lit. ‘the and Lord let (he) do the pleasing’, i.e. ‘and the Lord do [that which seemeth him] good’
• 3rd person present tense (Greek cluster εισθαι, καμει, δωσει, φερει, λαβει, ελθει, ειπει, γνωρισει, γεινει, προσφερει): θελει να καμει, lit. ‘(she) wants to (she) make’, i.e. ‘she wants to make’
tl;dr: My thesis
• Unsupervised syntactic category induction
  – Theory of syntactic categories
  – Review of systems/evaluation metrics
• Iterated learning & induction chains
  – Holistic view of NLP (no more pipelines!)
• Cross-lingual clusters
  – Tool for linguistic enquiry
  – Reveal similarities/differences across languages
tl;dr: My thesis
Where can we go from here?
• Fully joint models
  – Preliminary attempts for PoS & dependencies
• Evaluation methods
  – Non-gold-standard based (Smith, 2012)
• “Syntactically aware” categories
  – CCG type induction (Bisk & Hockenmaier, 2012)
• Linguistic analysis
  – Invite the Romantics back!
THE END
A fully joint model
• Maximise jointly the distributions over PoS and dependency trees
  – Run a full training step of DMV every time the BMMM samples a new PoS sequence
  – Intractable
• Solution:
  – Train DMV on partial trees (up to a depth d)
• Comparable results with the best IL models (also, still quite slow)
TSG-DMV (Blunsom & Cohn, 2010)
• Tree Substitution Grammar
  – CFG subset of LTAG
  – Lexicalised
• Eisner’s (2000) split-head constructions
  – Allow for modelling longer-range dependencies
• Pitman-Yor process (Teh, 2006) over TSG trees
IL – TSG-DMV: average over 9 CoNLL languages (≤10 words)
[Chart: BMMM M-1, TSG-DMV Undir, Gold M-1 and Gold Undir over iterations 0–5]
Using full-length sentences – TSG-DMV: average over 9 CoNLL languages
[Chart: BMMM M-1, TSG-DMV Undir, Gold M-1 and Gold Undir over iterations 0–5]
What are the gold-standard PoS capturing?
What purpose does PoS annotation serve?
What ARE parts of speech?
Why do we need PoS?
Why am I doing this?
Why are we here?
What is the meaning of life, the universe and everything?
What is the meaning of 42?