![Page 1: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/1.jpg)
Parsing German with Latent Variable Grammars
Slav Petrov and Dan Klein
UC Berkeley
![Page 2: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/2.jpg)
The Game of Designing a Grammar
Annotation refines base treebank symbols to improve statistical fit of the grammar:
- Parent annotation [Johnson ’98]
- Head lexicalization [Collins ’99, Charniak ’00]
- Automatic clustering?
![Page 3: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/3.jpg)
Previous Work: Manual Annotation
[Klein & Manning ’03]

Manually split categories:
- NP: subject vs object
- DT: determiners vs demonstratives
- IN: sentential vs prepositional

Advantages:
- Fairly compact grammar
- Linguistic motivations

Disadvantages:
- Performance leveled out
- Manually annotated

| Model | F1 |
|---|---|
| Naïve Treebank Grammar | 72.6 |
| Klein & Manning ’03 | 86.3 |
![Page 4: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/4.jpg)
Previous Work: Automatic Annotation Induction
[Matsuzaki et al. ’05, Prescher ’05]

Advantages:
- Automatically learned:
  - Label all nodes with latent variables.
  - Same number k of subcategories for all categories.

Disadvantages:
- Grammar gets too large
- Most categories are oversplit while others are undersplit.

| Model | F1 |
|---|---|
| Klein & Manning ’03 | 86.3 |
| Matsuzaki et al. ’05 | 86.7 |
![Page 5: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/5.jpg)
Overview
[Petrov, Barrett, Thibaux & Klein in ACL’06]
[Petrov & Klein in NAACL’07]

Learning:
- Hierarchical Training
- Adaptive Splitting
- Parameter Smoothing

Inference:
- Coarse-To-Fine Decoding
- Variational Approximation

German Analysis
![Page 6: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/6.jpg)
Learning Latent Annotations
EM algorithm:
- Brackets are known
- Base categories are known
- Only induce subcategories

Just like Forward-Backward for HMMs.

[Figure: parse tree over the sentence "He was right ." with latent node labels X1 to X7; arrows labeled Forward and Backward.]
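The analogy to Forward-Backward can be made concrete with a minimal sketch of the bottom-up (inside) half of the E-step on a tree whose brackets and base categories are fixed. Everything named here is an assumption for illustration: the `Node` class, the `inside` function, the toy two-word tree, and the probability tables are hypothetical and not taken from the Berkeley Parser.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    label: str                      # observed base category (known from the treebank)
    word: Optional[str] = None      # set only on preterminal nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def inside(node: Node, k: int,
           binary: Dict[Tuple, float],
           lexical: Dict[Tuple, float]) -> List[float]:
    """Return P(words below node | node has subcategory x) for x in 0..k-1."""
    if node.word is not None:
        # Preterminal: emission probability of the word under each subcategory.
        return [lexical.get((node.label, x, node.word), 0.0) for x in range(k)]
    left_scores = inside(node.left, k, binary, lexical)
    right_scores = inside(node.right, k, binary, lexical)
    scores = []
    for x in range(k):
        total = 0.0
        # The tree shape is fixed, so only the latent subcategories are summed out.
        for y in range(k):
            for z in range(k):
                rule = (node.label, x, node.left.label, y, node.right.label, z)
                total += binary.get(rule, 0.0) * left_scores[y] * right_scores[z]
        scores.append(total)
    return scores


# Toy example (hypothetical numbers): S -> PRP VBD over "He slept", k = 2.
tree = Node("S", left=Node("PRP", word="He"), right=Node("VBD", word="slept"))
lexical = {("PRP", 0, "He"): 0.6, ("PRP", 1, "He"): 0.2,
           ("VBD", 0, "slept"): 0.5, ("VBD", 1, "slept"): 0.1}
binary = {("S", x, "PRP", y, "VBD", z): 0.25
          for x in range(2) for y in range(2) for z in range(2)}
root_scores = inside(tree, 2, binary, lexical)
```

A full E-step would pair this with a top-down (outside) pass and combine the two into expected rule counts. Because the brackets are known, the cost is linear in the number of tree nodes rather than cubic in sentence length.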
![Page 7: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/7.jpg)
[Figure: parsing accuracy (F1, axis 60 to 90) versus total number of grammar symbols (axis 50 to 1650), with points for k=1, k=2, k=4, k=8 and k=16 subcategories per category. Annotations: Starting Point; Limit of computational resources.]
![Page 8: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/8.jpg)
Refinement of the DT tag
DT-1 DT-2 DT-3 DT-4
DT
![Page 9: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/9.jpg)
Refinement of the DT tag
DT
![Page 10: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/10.jpg)
Hierarchical Refinement of the DT tag
DT
![Page 11: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/11.jpg)
Hierarchical Estimation Results
[Figure: parsing accuracy (F1, axis 74 to 90) versus total number of grammar symbols (axis 100 to 1700).]

| Model | F1 |
|---|---|
| Baseline | 87.3 |
| Hierarchical Training | 88.4 |
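One way to picture a single step of hierarchical training is to split a subcategory into two nearly identical children and let EM pull them apart afterwards. The sketch below is an assumption-laden illustration, not the authors' code: `split_in_two`, the `noise` and `seed` parameters, and the toy DT word distribution are all hypothetical.

```python
import random
from typing import Dict, List


def split_in_two(probs: Dict[str, float], noise: float = 0.01,
                 seed: int = 0) -> List[Dict[str, float]]:
    """Split one subcategory's word distribution into two perturbed copies."""
    rng = random.Random(seed)
    children = []
    for _ in range(2):
        # A small random perturbation breaks the symmetry between the two copies;
        # without it EM would keep them identical.
        perturbed = {w: p * (1.0 + rng.uniform(-noise, noise))
                     for w, p in probs.items()}
        total = sum(perturbed.values())
        # Renormalize so each child is again a probability distribution.
        children.append({w: p / total for w, p in perturbed.items()})
    return children


# Toy DT distribution (hypothetical numbers).
dt = {"the": 0.6, "a": 0.3, "this": 0.1}
dt_children = split_in_two(dt)
```

Repeating split-then-retrain doubles the number of subcategories each round (1, 2, 4, 8, ...), which matches the tree-shaped refinement of the DT tag shown on the earlier slides.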
![Page 12: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/12.jpg)
Refinement of the , tag
Splitting all categories the same amount is wasteful:
![Page 13: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/13.jpg)
The DT tag revisited
Oversplit?
![Page 14: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/14.jpg)
Adaptive Splitting
- Want to split complex categories more
- Idea: split everything, roll back splits which were least useful
![Page 15: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/15.jpg)
![Page 16: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/16.jpg)
Adaptive Splitting
Evaluate loss in likelihood from removing each split:

$$\text{loss} = \frac{\text{Data likelihood with split reversed}}{\text{Data likelihood with split}}$$

No loss in accuracy when 50% of the splits are reversed.
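The roll-back rule can be sketched as a ranking over that likelihood ratio. This is only an illustration under stated assumptions: `splits_to_reverse` is a hypothetical helper, the likelihood values are made up, and the sketch assumes the likelihood with each split reversed is already available (how to obtain it efficiently is not shown on the slide).

```python
from typing import Dict, List


def splits_to_reverse(likelihood_with_split: float,
                      likelihood_reversed: Dict[str, float],
                      fraction: float = 0.5) -> List[str]:
    """Return the splits whose removal costs the least likelihood."""
    # Ratio near 1 means reversing the split barely changes the data likelihood.
    ratios = {name: value / likelihood_with_split
              for name, value in likelihood_reversed.items()}
    ranked = sorted(ratios, key=ratios.get, reverse=True)
    return ranked[:int(len(ranked) * fraction)]


# Toy values (hypothetical): one split per category.
reversed_likelihoods = {",": 9.9e-11, "DT": 2e-11, "NP": 1e-12, ".": 9.5e-11}
to_merge = splits_to_reverse(1e-10, reversed_likelihoods)
```

In this toy run the punctuation splits are rolled back while the DT and NP splits are kept, in the spirit of the earlier observation that splitting all categories the same amount is wasteful.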
![Page 17: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/17.jpg)
Adaptive Splitting Results
[Figure: parsing accuracy (F1, axis 74 to 90) versus total number of grammar symbols (axis 100 to 1700) for Flat Training, Hierarchical Training, and 50% Merging.]

| Model | F1 |
|---|---|
| Previous | 88.4 |
| With 50% Merging | 89.5 |
![Page 18: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/18.jpg)
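Number of Phrasal Subcategories
[Figure: bar chart (axis 0 to 35) of the number of subcategories per phrasal category, in chart order: VP, NP, PP, AP, S, CNP, AVP, PN, CAP, CS, CVP, VZ, CCP, NM, CPP, MTA, CVZ, AA, ISU, VROOT, CAVP, CAC, CH, CO, DL, ROOT.]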
![Page 19: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/19.jpg)
Number of Lexical Subcategories
[Figure: bar chart (axis 0 to 35) of the number of subcategories per part-of-speech tag, in chart order: NE, VVFIN, ADJA, NN, ADV, ADJD, VVPP, APPR, VVINF, CARD, ART, PIS, PIAT, PPER, KON, $[, PROAV, VAFIN, PDS, APPRAR, PPOSAT, $., PDAT, PRELS, PTKVZ, VVIZU, VAINF, KOUS, VMFIN, FM, VAPP, KOKOM, PWAV, PWS, KOUI, TRUNC, XY, PTKZU, PWAT, VVIMP, NNE, PRELAT, PTKNEG, APZR.]
![Page 20: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/20.jpg)
Smoothing
- Heavy splitting can lead to overfitting
- Idea: Smoothing allows us to pool statistics
![Page 21: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/21.jpg)
Linear Smoothing
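The formula on this slide did not survive extraction. As a hedged illustration of what linear smoothing across subcategories can look like, the sketch below interpolates each subcategory's probability for a given rule with the average over all subcategories of the same base category. The function name `smooth`, the parameter `alpha`, and the numbers are assumptions, not values confirmed by the slides.

```python
from typing import List


def smooth(probs: List[float], alpha: float = 0.01) -> List[float]:
    """Shrink each subcategory's probability toward the mean over its siblings."""
    mean = sum(probs) / len(probs)
    # Linear interpolation: (1 - alpha) * own estimate + alpha * pooled estimate.
    return [(1.0 - alpha) * p + alpha * mean for p in probs]


# Toy example (hypothetical): one rule's probability under two subcategories,
# one of which never saw the rule in training.
smoothed = smooth([0.8, 0.0], alpha=0.1)
```

The pooled term gives the unseen subcategory a small non-zero probability while leaving the average over subcategories unchanged.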
![Page 22: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/22.jpg)
Result Overview
[Figure: parsing accuracy (F1, axis 74 to 90) versus total number of grammar symbols (axis 100 to 1100) for Flat Training, Hierarchical Training, 50% Merging, and 50% Merging and Smoothing.]

| Model | F1 |
|---|---|
| Previous | 89.5 |
| With Smoothing | 90.7 |
![Page 23: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/23.jpg)
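Coarse-to-Fine Parsing [Goodman ‘97, Charniak&Johnson ‘05]
[Figure: diagram with the labels Treebank, Coarse grammar (NP … VP), Parse, Prune, and Refined grammar. Two refined grammars are shown: one with lexicalized symbols (NP-dog, NP-cat, NP-apple, VP-run, NP-eat, …) and one with numbered subcategories (NP-17, NP-12, NP-1, VP-6, VP-31, …).]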
![Page 24: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/24.jpg)
Hierarchical Pruning
Consider the span 5 to 12:
- coarse: … QP NP VP …
- split in two: … QP1 QP2 NP1 NP2 VP1 VP2 …
- split in four: … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
- split in eight: … … … … … … … … … … … … … … … … …
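The pruning idea for a single span can be sketched as follows: symbols with a negligible posterior under the coarser grammar are dropped, and only the refinements of the survivors are considered at the next level. The helper `allowed_refinements`, the `threshold` value, and the posterior numbers are hypothetical; the slides do not state the actual threshold.

```python
from typing import Dict, List


def allowed_refinements(coarse_posteriors: Dict[str, float],
                        children: Dict[str, List[str]],
                        threshold: float = 1e-4) -> List[str]:
    """Return the refined symbols still allowed over one span."""
    allowed = []
    for symbol, posterior in coarse_posteriors.items():
        # A symbol pruned at the coarse level takes all its refinements with it.
        if posterior >= threshold:
            allowed.extend(children[symbol])
    return allowed


# Toy posteriors for the span 5 to 12 (hypothetical numbers).
span_posteriors = {"QP": 1e-7, "NP": 0.6, "VP": 0.3}
refinements = {"QP": ["QP1", "QP2"], "NP": ["NP1", "NP2"], "VP": ["VP1", "VP2"]}
next_level = allowed_refinements(span_posteriors, refinements)
```

Applying the same filter at each level (two, four, eight subcategories) keeps the chart small even as the grammar is refined.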
![Page 25: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/25.jpg)
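Intermediate Grammars
[Figure: learning produces a sequence of grammars from X-Bar = G0 through G1, G2, G3, G4, G5 to G6 = G. For the DT tag the refinement hierarchy is DT; DT1 DT2; DT1 DT2 DT3 DT4; DT1 DT2 DT3 DT4 DT5 DT6 DT7 DT8.]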
![Page 26: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/26.jpg)
State Drift (DT tag)
[Figure: words assigned to DT subcategories (some, this, That, these, the, that) shown before and after EM; the word groupings differ between the two stages.]
![Page 27: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/27.jpg)
Projected Grammars
[Figure: learning produces X-Bar = G0, G1, G2, G3, G4, G5, G6 = G. Projection πi maps the final grammar G to the coarser grammars π0(G), π1(G), π2(G), π3(G), π4(G), π5(G).]
![Page 28: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/28.jpg)
Bracket Posteriors (after G0)
![Page 29: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/29.jpg)
Bracket Posteriors (after G1)
![Page 30: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/30.jpg)
Bracket Posteriors (Movie) (Final Chart)
![Page 31: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/31.jpg)
Bracket Posteriors (Best Tree)
![Page 32: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/32.jpg)
Parse Selection
Computing most likely unsplit tree is NP-hard:
- Settle for best derivation.
- Rerank n-best list.
- Use alternative objective function / Variational Approximation.

[Figure: Parses versus Derivations, with tree nodes labeled by subcategory indices (-1, -2).]
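The gap between the best derivation and the best parse can be shown on a small candidate list, in the spirit of the "rerank n-best list" option. The sketch is illustrative only: `parse_probabilities`, the parse names "A" and "B", and the probabilities are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def parse_probabilities(derivations: List[Tuple[str, float]]) -> Dict[str, float]:
    """Sum derivation probabilities that map to the same unsplit parse."""
    totals: Dict[str, float] = defaultdict(float)
    for parse, prob in derivations:
        totals[parse] += prob
    return dict(totals)


# Each entry is (unsplit parse, probability of one derivation of that parse).
candidates = [("A", 0.30), ("B", 0.20), ("B", 0.15), ("B", 0.10)]

best_derivation_parse = max(candidates, key=lambda d: d[1])[0]
totals = parse_probabilities(candidates)
best_parse = max(totals, key=totals.get)
```

Here the single most probable derivation belongs to parse A, yet parse B has more total probability once its derivations are summed. Enumeration like this only works over a short candidate list, which is why the exact problem calls for the approximations listed above.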
![Page 33: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/33.jpg)
Efficiency Results
| Parser | Time | Implemented in |
|---|---|---|
| Berkeley Parser | 15 min | Java |
| Charniak & Johnson ‘05 Parser | 19 min | C |
![Page 34: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/34.jpg)
Accuracy Results
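| Language | Parser | F1 (≤ 40 words) | F1 (all) |
|---|---|---|---|
| ENG | Charniak&Johnson ‘05 (generative) | 90.1 | 89.6 |
| ENG | This Work | 90.6 | 90.1 |
| GER | Dubey ‘05 | 76.3 | - |
| GER | This Work | 80.8 | 80.1 |
| CHN | Chiang et al. ‘02 | 80.0 | 76.6 |
| CHN | This Work | 86.3 | 83.4 |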
![Page 35: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/35.jpg)
Parsing German Shared Task
Two Pass Parsing:
- Determine constituency structure (F1: 85/94)
- Assign grammatical functions

One Pass Approach:
- Treat categories+grammatical functions as labels
![Page 36: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/36.jpg)
Parsing German Shared Task
Two Pass Parsing:
- Determine constituency structure
- Assign grammatical functions

One Pass Approach:
- Treat categories+grammatical functions as labels
![Page 37: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/37.jpg)
Development Set Results
![Page 38: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/38.jpg)
Shared Task Results
![Page 39: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/39.jpg)
Part-of-speech splits
![Page 40: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/40.jpg)
Linguistic Candy
![Page 41: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/41.jpg)
Conclusions
Split & Merge Learning:
- Hierarchical Training
- Adaptive Splitting
- Parameter Smoothing

Hierarchical Coarse-to-Fine Inference:
- Projections
- Marginalization

Multi-lingual Unlexicalized Parsing
![Page 42: Parsing German with Latent Variable Grammars Slav Petrov and Dan Klein UC Berkeley](https://reader033.vdocuments.site/reader033/viewer/2022052820/55162adf550346a2308b5db3/html5/thumbnails/42.jpg)
Thank You!
Parser is available at http://nlp.cs.berkeley.edu