MIXED SYNTAX TRANSLATION
Hieu Hoang
MT Marathon 2011 Trento
Contents
• What is a syntactic model?
• What’s wrong with Syntax?
• Which syntax model to use?
• Why use syntactic models?
• Mixed-Syntax Model
  – Extraction
  – Decoding
  – Results
• Future Work
What is a syntactic model?
• Hierarchical Phrase-Based Model
  – String-to-string
  – Non-terminals are unlabelled
  – Example: X → habe X1 gegessen # have eaten X1
• Tree-to-string Model
  – Source non-terminals are labelled
    • match input parse tree
  – Example: S → habe NP1 gegessen # have eaten NP1
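The difference between the two rule types can be sketched in a few lines of Python. This is only an illustration: the `Rule` class, the `parse_rule` helper and the `->` spelling of the rule arrow are assumptions made for this sketch, not the format of any particular toolkit.

```python
import re
from dataclasses import dataclass

# A non-terminal such as "NP1" or "X1": an upper-case label plus a co-indexing digit.
NT_PATTERN = re.compile(r"^([A-Z]+)(\d)$")


@dataclass
class Rule:
    lhs: str       # left-hand-side label, e.g. "X" or "S"
    source: list   # source-side symbols (words and non-terminals)
    target: list   # target-side symbols


def parse_rule(text):
    """Parse 'LHS -> source symbols # target symbols' (illustrative notation)."""
    lhs, rest = text.split("->")
    source, target = rest.split("#")
    return Rule(lhs.strip(), source.split(), target.split())


def source_nonterminals(rule):
    """Return (label, index) pairs for the non-terminals on the source side."""
    return [(m.group(1), int(m.group(2)))
            for m in map(NT_PATTERN.match, rule.source) if m]


def is_unlabelled(rule):
    """True if every source non-terminal carries the generic label X."""
    return all(label == "X" for label, _ in source_nonterminals(rule))


hiero = parse_rule("X -> habe X1 gegessen # have eaten X1")
t2s = parse_rule("S -> habe NP1 gegessen # have eaten NP1")
print(source_nonterminals(hiero), is_unlabelled(hiero))  # [('X', 1)] True
print(source_nonterminals(t2s), is_unlabelled(t2s))      # [('NP', 1)] False
```

In the hierarchical rule the gap can be filled by any span; in the tree-to-string rule the gap must line up with an NP in the input parse.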
What’s Wrong with Syntax?
Evaluation of French-English MT System (Ambati and Lavie, 2009)

System                 BLEU    METEOR
Tree-to-string         27.02   57.68
Tree-to-tree           22.23   54.05
Moses (phrase-based)   30.18   58.13
Hierarchical Model
according to János Veres , this would be in the first quarter of 2008 possible .
Tree-to-String Model

Input (German), with token positions and POS tags:
Position   1      2       3       4       5      6         7        8         9      10        11
Word       laut   János   Veres   wäre    dies   im        ersten   Quartal   2008   möglich   .
POS        APPR   NE      NE      VAFIN   PDS    APPRART   ADJA     NN        CARD   ADJA      PUNC

Constituent labels in the parse tree: PN, PP, PP, S, TOP
[Figure: parse tree over the input sentence]

Output: according to János Veres would be this in the first quarter of 2008 possible .

[Figure: derivation of the output; the rule right-hand side [X,1] [X,2] appears nine times]
Other Syntactic Models
• Syntax-Augmented MT (SAMT)
  – Not constrained only to parse tree
  – (Zollmann and Venugopal, 2006)
• Binarization
  – Restructure and relabel parse tree
  – (Wang et al., 2010)
• Forest-based translation
  – Recover from parse errors
  – (Mi et al., 2008)
• Soft constraint
  – Reward/penalize derivations which follow parse structure
  – (Chiang, 2010)
Why Use Syntactic Models?
• Decrease decoding time
  – Derivation constrained by source parse tree
• Long-range reordering during decoding
  – rules covering more words than max-span limit
• Other rule-forms
  – 3+ non-terminals
  – consecutive non-terminals
  – non-lexicalized rules
  – Examples:
    X → S1 O2 V3 # S1 V3 O2
    X → PRO1 PRO2 aime bien # PRO1 like PRO2
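To make the two example rules concrete, here is a small sketch of how the target side of a rule reorders already-translated sub-phrases. The function name `apply_target_side` and the English fillers are made up for illustration; only the rule shapes come from the slide.

```python
import re


def apply_target_side(target, fillers):
    """Replace each indexed non-terminal on the target side with its filler.

    target:  list of symbols, e.g. ["S1", "V3", "O2"]
    fillers: dict mapping co-index -> already-translated string
    """
    out = []
    for symbol in target:
        m = re.fullmatch(r"[A-Z]+(\d)", symbol)
        out.append(fillers[int(m.group(1))] if m else symbol)
    return " ".join(out)


# Non-lexicalized rule: X -> S1 O2 V3 # S1 V3 O2 (subject-object-verb to subject-verb-object).
print(apply_target_side(["S1", "V3", "O2"],
                        {1: "the dog", 2: "the ball", 3: "chased"}))
# the dog chased the ball

# Rule with consecutive non-terminals: X -> PRO1 PRO2 aime bien # PRO1 like PRO2.
print(apply_target_side(["PRO1", "like", "PRO2"], {1: "I", 2: "you"}))
# I like you
```

Because the first rule contains no words at all, it can span an arbitrarily long clause, which is what allows reordering beyond a max-span limit.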
Other Syntactic Models
• Syntax-Augmented MT (SAMT)
• Binarization
• Forest-based translation
• Soft constraint
• Ignore Syntax (occasionally)
Mixed-Syntax Model
• Tree-to-string model
  – input is a parse tree
• Roles of non-terminals
  – Constrain derivation to parse constituents
    • Mixed-syntax: can sometimes have no constraints
  – State information
    • Consistent node label on target derivation
    • hypotheses with different head NT cannot be recombined
    • Mixed-syntax: always X
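The effect of the head label on recombination can be sketched as follows. This is a simplified illustration, not a decoder implementation: the hypothesis fields (`span`, `head`, `lm_state`, `score`) and the example values are all hypothetical.

```python
def recombine(hypotheses, use_head_label=True):
    """Keep the best-scoring hypothesis per recombination key.

    With use_head_label=True, hypotheses with different head non-terminals
    are kept apart. With use_head_label=False the head is treated as always
    X, so it no longer separates hypotheses.
    """
    best = {}
    for hyp in hypotheses:
        key = (hyp["span"], hyp["lm_state"])
        if use_head_label:
            key += (hyp["head"],)
        if key not in best or hyp["score"] > best[key]["score"]:
            best[key] = hyp
    return list(best.values())


hyps = [
    {"span": (0, 3), "head": "NP", "lm_state": "the house", "score": -2.1},
    {"span": (0, 3), "head": "PP", "lm_state": "the house", "score": -1.7},
    {"span": (0, 3), "head": "NP", "lm_state": "the house", "score": -2.5},
]
print(len(recombine(hyps, use_head_label=True)))   # 2: one NP and one PP survive
print(len(recombine(hyps, use_head_label=False)))  # 1: only the best hypothesis survives
```

With every target-side head set to X, more hypotheses collapse into one, which keeps the search space closer to that of the hierarchical model.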
Mixed-Syntax Model
Example Translation Rules
• Naïve syntax model
  VP → VVFIN1 zu VVINF2 # to VVINF2 VVFIN1
• Mixed-Syntax Model
  VP → X1 zu VVINF2 # X → to X2 X1
Extraction
• Allow rules
  – Max 3 non-terminals
  – Adjacent non-terminals
    • At least 1 NT must be syntactic
  – Non-lexicalized rules
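One possible reading of these constraints, written as a filter over the source side of a candidate rule. The function names are hypothetical, and the interpretation that two adjacent non-terminals may not both be the unlabelled X is an assumption drawn from the bullet above.

```python
import re

NT = re.compile(r"([A-Z]+)(\d)")


def label_of(symbol):
    """Return the label of a non-terminal such as 'NP1', or None for a word."""
    m = NT.fullmatch(symbol)
    return m.group(1) if m else None


def allowed(source):
    """Check a rule's source side against the extraction constraints."""
    labels = [label_of(s) for s in source]
    # At most three non-terminals per rule.
    if sum(label is not None for label in labels) > 3:
        return False
    # Adjacent non-terminals: at least one of the pair must be syntactic (not X).
    for left, right in zip(labels, labels[1:]):
        if left == "X" and right == "X":
            return False
    # Non-lexicalized rules are allowed, so no word is required.
    return True


print(allowed(["X1", "zu", "VVINF2"]))          # True
print(allowed(["NP1", "X2"]))                   # True: adjacent, but NP is syntactic
print(allowed(["X1", "X2"]))                    # False: adjacent and both unlabelled
print(allowed(["NP1", "X2", "VP3", "PP4"]))     # False: four non-terminals
```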
Example Rules Extracted
Rule                               Fractional Count   p(t | s)
Syntactic Rules
VP → NP1 VVINF2 # X → X2 X1        167.3              68%
Mixed Rules
VP → X1 VZ2 # X → X2 X1            63.3               64%
VP → X1 zu VVINF2 # X → to X2 X1   39.9               56%
TOP → NP1 X2 # X → X1 X2           43.1               92%
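Assuming p(t | s) is a relative-frequency estimate over fractional counts (the slide does not say how it is computed), the numbers relate as in the sketch below. The count 167.3 is from the table; the competing count 78.7 is made up so that the total reproduces the 68% shown.

```python
from collections import defaultdict


def relative_frequency(counts):
    """Estimate p(t | s) = count(s, t) / sum over t' of count(s, t')."""
    totals = defaultdict(float)
    for (source, _target), c in counts.items():
        totals[source] += c
    return {pair: c / totals[pair[0]] for pair, c in counts.items()}


counts = {
    ("VP -> NP1 VVINF2", "X -> X2 X1"): 167.3,  # fractional count from the table
    ("VP -> NP1 VVINF2", "X -> X1 X2"): 78.7,   # hypothetical competing target side
}
probs = relative_frequency(counts)
print(round(probs[("VP -> NP1 VVINF2", "X -> X2 X1")], 2))  # 0.68
```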
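Synchronous CFG
Input: [figure: input sentence, not preserved in the transcript]
Rules:
S → NP1 VP2 # NP1 VP2
NP → je # I
PRO → lui # him
VB → vu # see
VP → ne PRO1 VB2 pas # did not VB2 PRO1
Derivation: [figure: step-by-step derivation, not preserved in the transcript]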
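The five rules are enough to run a derivation by hand. The sketch below expands them top-down from S and builds the source and target strings in parallel; since each label has exactly one rule here, no search is needed, unlike in a real chart decoder. The `RULES` table and `expand` function are illustrative names, and the source string printed is simply what these rules generate.

```python
import re

# Rules from the slide: LHS -> (source side, target side).
RULES = {
    "S":   ("NP1 VP2", "NP1 VP2"),
    "NP":  ("je", "I"),
    "PRO": ("lui", "him"),
    "VB":  ("vu", "see"),
    "VP":  ("ne PRO1 VB2 pas", "did not VB2 PRO1"),
}

NT = re.compile(r"([A-Z]+)(\d)")


def expand(label):
    """Expand `label` top-down and return (source string, target string)."""
    source_side, target_side = RULES[label]
    # Expand each co-indexed non-terminal once and reuse it on both sides.
    children = {}
    for symbol in source_side.split():
        m = NT.fullmatch(symbol)
        if m:
            children[symbol] = expand(m.group(1))
    src = [children[s][0] if s in children else s for s in source_side.split()]
    tgt = [children[s][1] if s in children else s for s in target_side.split()]
    return " ".join(src), " ".join(tgt)


print(expand("VP"))  # ('ne lui vu pas', 'did not see him')
print(expand("S"))   # ('je ne lui vu pas', 'I did not see him')
```

The reordering of PRO and VB happens entirely inside the VP rule, because the co-indices appear in a different order on the target side.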
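Mixed-Syntax Model
Input: [figure: parsed input sentence, not preserved in the transcript]
Rules:
S → NP1 VP2 # X → NP1 VP2
PRO → je # X → I
PRO → lui # X → him
VP → ne X1 pas # did not X1
VB → vu # X → see
X → PRO1 VB2 # X → X2 X1
Derivation: [figure: step-by-step derivation, not preserved in the transcript]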
Experiment
German-English

Corpus              German      English
Train   Sentences        82,306
        Words       2,034,373   1,965,325
Tune    Sentences         2,000
Test    Sentences         1,026

Trained: News Commentary 2009
Tuned: held-out set
Tested: nc test2007
Results
Using constituent parse

German-English
Model            # rules   %BLEU
Hierarchical     61.2m     15.9
Tree-to-String   4.7m      14.9
Mixed Syntax     128.7m    16.7

English-German
Model            # rules   %BLEU
Hierarchical     84.6m     10.2
Mixed Syntax     175.0m    10.6
Example
Hierarchical Model
according to János Veres , this would be in the first quarter of 2008 possible .
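Example
Mixed-Syntax Model

Output: according to János Veres this would be possible in the first quarter of 2008.

[Figure: mixed-syntax derivation over the parsed input]
Rule right-hand sides shown in the derivation:
[NE,1] [NE,2]
[PDS,2] [VAFIN,1]
[ADJA,1] [NN,2]
[APPRART,1] [X,2] [CARD,3]
[X,2] [PP,1] [PUNC,3]
[X,1] [X,2] (three times)
Target phrases shown in the derivation: according to / János / Veres / would be / this / in the / first / quarter of / 2008 / possible / .

Input (German), with token positions and POS tags:
Position   1      2       3       4       5      6         7        8         9      10        11
Word       laut   János   Veres   wäre    dies   im        ersten   Quartal   2008   möglich   .
POS        APPR   NE      NE      VAFIN   PDS    APPRART   ADJA     NN        CARD   ADJA      PUNC

Constituent labels in the parse tree: PN, PP, PP, S, TOP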
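The derivation mixes labelled non-terminals (NE, PDS, VAFIN, PP, ...) with the unlabelled X. A minimal sketch of the matching test this implies is given below. It assumes that X matches any span while a syntactic label must be present on the span it covers; the function names are hypothetical, and only the single-token POS labels from the slide are filled in, since the constituent spans are in the figure.

```python
# POS tags from the example slide, keyed by 1-based token position.
POS = {1: "APPR", 2: "NE", 3: "NE", 4: "VAFIN", 5: "PDS", 6: "APPRART",
       7: "ADJA", 8: "NN", 9: "CARD", 10: "ADJA", 11: "PUNC"}

# Labels available on each (start, end) span. Constituent spans (PN, PP, S, TOP)
# would be added here from the parse tree.
SPAN_LABELS = {(i, i): {tag} for i, tag in POS.items()}


def matches(label, span):
    """X matches any span; a syntactic label must be present on the span."""
    return label == "X" or label in SPAN_LABELS.get(span, set())


def rule_applies(nonterminals, spans):
    """Check each source non-terminal against the span it is meant to cover."""
    return all(matches(label, span) for label, span in zip(nonterminals, spans))


print(rule_applies(["VAFIN", "PDS"], [(4, 4), (5, 5)]))   # True: wäre dies
print(rule_applies(["X", "PUNC"], [(1, 10), (11, 11)]))   # True: X covers any span
print(rule_applies(["NN", "PDS"], [(4, 4), (5, 5)]))      # False: token 4 is VAFIN
```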
Chunk Tags
• Advantages of Shallow Tags
  – Don’t need Treebank
  – More reliable
• Disadvantages
  – Not a tree structure
    • We don’t rely on tree structure
Results
Shallow Tags

German-English
Model          # rules   %BLEU
Hierarchical   64.3m     16.3
Mixed Syntax   254.5m    16.8
Larger Training Corpus
German-English

                                   German       English      Corpus
Train                  Sentences        1,446,224            Europarl v5
                       Words       37,420,876   39,464,626
Tune                   Sentences            1,910            dev2006
Test (in-domain)       Sentences            1,920
Test (out-of-domain)   Sentences            1,042
Test corpora: nc test2007 v2, devtest2006
Larger Training Corpus
German-English

Model                           # rules   In-domain (BLEU)   Out-of-domain (BLEU)
Hierarchical                    500m      22.1               16.5
Mixed Syntax (original)         2664m     21.6               16.3
Mixed Syntax (new extraction)   1435m     22.7               17.8
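Create your own labels
Dumb labels

ich bitte Sie , sich zu einer Schweigeminute zu erheben .
[Figure: label spans drawn over the sentence; only the fragments "Sz" and "e" survive in the transcript]

Model          In-domain (BLEU)   Out-of-domain (BLEU)
Hierarchical   22.1               16.5
Dumb Labels    22.0               16.3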
Create your own labels
Labels motivated by reordering

Labelling patterns:
1. VMFIN ... VVINF EOS
2. VVINF und ... VVINF
3. VAFIN ... (VVPP or VVINF) EOS
4. , PRELS ... VVINF EOS
5. EOS ... zu VVINF

Examples:
ich bitte Sie , sich zu einer Schweigeminute zu erheben . (label 5)
… werde ich dem Vorschlag von Herrn Evans folgen . (label 3)
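A pattern such as number 3 can be read as a test over a POS-tag sequence. The sketch below is one possible interpretation, not the procedure used for the experiments: it assumes that EOS means the verb is immediately followed by sentence-final punctuation, and the tags given to the example sentence are hypothetical STTS-style tags, since the slide shows only the words.

```python
def find_label3_spans(tags, eos_tags=("PUNC",)):
    """Find spans matching pattern 3: VAFIN ... (VVPP or VVINF) EOS.

    Returns (start, end) token indices, 0-based and inclusive, where the span
    starts at a VAFIN and ends at a VVPP or VVINF that is immediately followed
    by a sentence-final tag.
    """
    spans = []
    for i, first in enumerate(tags):
        if first != "VAFIN":
            continue
        for j in range(i + 1, len(tags) - 1):
            if tags[j] in ("VVPP", "VVINF") and tags[j + 1] in eos_tags:
                spans.append((i, j))
    return spans


words = "werde ich dem Vorschlag von Herrn Evans folgen .".split()
# Hypothetical tags for the example sentence (not given on the slide).
tags = ["VAFIN", "PPER", "ART", "NN", "APPR", "NN", "NE", "VVINF", "PUNC"]

for start, end in find_label3_spans(tags):
    print(" ".join(words[start:end + 1]))
# werde ich dem Vorschlag von Herrn Evans folgen
```

The labelled span covers the auxiliary and the clause-final verb together, which is the region where German-English reordering is needed.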
Create your own labels
Labels motivated by reordering

Model               In-domain (BLEU)   Out-of-domain (BLEU)
Hierarchical        22.1               16.5
Dumb Labels         22.0               16.3
Reordering Labels   22.1               16.9
Conclusion
• Mixed-Syntax Model
  – SCFG-based decoding
  – Hierarchical phrase-based v. tree-to-string
  – Generality v. specificity
• Syntax Models
  – Many variations
  – Won’t automatically make MT better
  – Question
    • which syntactic information?
    • how do we use it?
    • why use it?
END