Post on 30-Dec-2015
Advanced Signal Processing 05/06, Reinisch Bernhard
Statistical Machine Translation
Phrase Based Model
2/37ASP 06/07Reinisch Bernhard
Translation Model – Phrase-based
Overview
● The quality of MT systems has improved with the use of phrase translation
– Phrases from word-based alignments
– Syntactic phrases
– Phrases from phrase alignments
– IBM word-based statistical MT systems enhanced with phrase translation
● What is the best way to extract phrase translation pairs?
– Evaluation framework / outcome
Word based approaches
● Try to model word-to-word correspondences
● Models are often restricted
– source word -> exactly one target word
– similar to Hidden Markov models in speech recognition
● Enhanced to a “one-to-many” alignment model
– Solves lexical problems like
● “Zahnarzttermin” -> “dentist’s appointment”
● Order of words will be changed
Statistical machine translation (1)
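$$f_1^J = f_1 \dots f_j \dots f_J; \quad e_1^I = e_1 \dots e_i \dots e_I$$
$$\hat{e}_1^I = \arg\max_{e_1^I} \Pr(e_1^I \mid f_1^J) = \arg\max_{e_1^I} \Pr(e_1^I) \cdot \Pr(f_1^J \mid e_1^I)$$
● argmax … search/decoding problem (generation of the output sentence)
● Pr(e_1^I) … language model
● Pr(f_1^J | e_1^I) … translation model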
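As a minimal sketch of the decision rule, the search can be illustrated over a tiny, fixed candidate list. The two probability tables and all numbers in them are invented for illustration; a real decoder searches a far larger space of output sentences.

```python
import math

# Toy log-probabilities; every number here is invented for illustration.
LM_LOGPROB = {  # language model: log Pr(e)
    "there is a house": math.log(0.020),
    "it gives a house": math.log(0.001),
}
TM_LOGPROB = {  # translation model: log Pr(f | e) for f = "es gibt ein haus"
    "there is a house": math.log(0.30),
    "it gives a house": math.log(0.40),
}

def decode(candidates):
    """Return the candidate e that maximizes log Pr(e) + log Pr(f | e)."""
    return max(candidates, key=lambda e: LM_LOGPROB[e] + TM_LOGPROB[e])

best = decode(list(LM_LOGPROB))
print(best)
```

In this toy setting the language model outweighs the slightly higher translation-model score of the literal candidate.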
Statistical machine translation (2)
Taken from [2]
Learning translation lexica
● The following describes methods for learning single-word and phrase-based translation lexica
– Statistical alignment models
● Used for learning word alignments
● Symmetrization
– Bilingual phrases
– Alignment templates
Statistical alignment models (1)
● In the alignment model
– A “hidden” variable a is introduced
– a describes the mapping from source position j to target position a_j
● “a” is represented as a matrix with binary values
– 1 entry … words are aligned
– 0 entry … words are not aligned
– source word -> no target word (empty word e_0)
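The matrix view of the alignment can be sketched as follows. The function name, the convention that column 0 stands for the empty word, and the example alignment are assumptions made for illustration.

```python
def alignment_matrix(a, num_target):
    """Build a binary matrix from an alignment vector a.

    a[j] is the target position aligned to source word j; target positions
    are 1-based and 0 stands for the empty word e_0. The result has one row
    per source word and num_target + 1 columns (column 0 is the empty word).
    """
    return [[1 if a_j == i else 0 for i in range(num_target + 1)] for a_j in a]

# Source: "ich habe einen Zahnarzttermin"
# Target: "I have a dentist's appointment"
a = [1, 2, 3, 4]  # hypothetical alignment: each source word gets one target word
matrix = alignment_matrix(a, num_target=5)
for row in matrix:
    print(row)
```

Because each source word points to exactly one target position, every row contains a single 1, so "Zahnarzttermin" cannot cover both "dentist's" and "appointment" in this representation.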
Statistical alignment models (2)
● In general the model depends on a set of unknown parameters θ
● Several specific statistical alignment models exist
– First compute word alignments, e.g. with Model 4
– Train these unknown parameters θ
● The alignment with the highest probability
– is called the Viterbi alignment
Symmetrization (1)
● The baseline alignment model (e.g. Model 4) does not allow a source word to be aligned with multiple target words
– “Zahnarzttermin” -> “dentist’s appointment”
● The outcome should be an alignment matrix like this one
Taken from [2]
Symmetrization (2)
● To solve this problem
– Training in both directions
– For a sentence pair -> two Viterbi alignments
– Now both alignment tables A1 and A2 have to be combined (symmetrized)
● Simple union of both tables (some refined methods also exist)
– The result is then used to train single-word based translation lexica
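Combining the two directional alignments can be sketched with plain set operations. Representing each alignment as a set of (source position, target position) links is an assumption for illustration; the intersection option is a common alternative that is not named on the slide, and the refined methods are not shown.

```python
def symmetrize(a1, a2, method="union"):
    """Combine two alignments given as sets of (source_pos, target_pos) links."""
    if method == "union":
        return a1 | a2
    if method == "intersection":
        return a1 & a2
    raise ValueError(f"unknown method: {method}")

# Toy example with 0-based positions; the links are invented.
# Source: "Zahnarzttermin" (0); target: "dentist's" (0), "appointment" (1)
a1 = {(0, 0)}          # source-to-target Viterbi alignment
a2 = {(0, 0), (0, 1)}  # target-to-source alignment, stored as (source, target)
combined = symmetrize(a1, a2)
print(sorted(combined))
```

The union recovers the one-to-many link that a single direction cannot express, at the cost of also keeping any wrong links from either direction.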
Symmetrization (3)
– The lexicon probabilities are estimated as relative frequencies using:
● N(e|f) … how many times e and f are aligned
● N(f) … how many times the word f occurs
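A relative-frequency estimate from these two counts might look like the sketch below. The exact formula is not preserved in the transcript, so the ratio p(e|f) = N(e|f) / N(f) used here is an assumption, as are the function name and the toy data.

```python
from collections import Counter

def estimate_lexicon(aligned_pairs, source_words):
    """Estimate p(e|f) as N(e|f) / N(f) (assumed form of the estimate)."""
    n_ef = Counter(aligned_pairs)  # how many times e and f are aligned
    n_f = Counter(source_words)    # how many times the word f occurs
    return {(e, f): count / n_f[f] for (e, f), count in n_ef.items()}

# Invented toy counts: "haus" occurs four times and is aligned three times.
aligned_pairs = [("house", "haus"), ("house", "haus"), ("home", "haus")]
source_words = ["haus", "haus", "haus", "haus"]
lexicon = estimate_lexicon(aligned_pairs, source_words)
print(lexicon)
```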
Bilingual phrases
● Now we need an algorithm that learns relationships between whole phrases of m source words and n target words
– the “phrase extract” algorithm, which takes the alignment matrix A as input
Taken from [2]
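One common way to extract phrase pairs from an alignment is to keep every pair of spans whose links stay inside the pair. The sketch below uses that consistency criterion as an assumption; it is not necessarily identical to the algorithm in [2], and the function name, the max_len parameter and the example are invented for illustration.

```python
def extract_phrases(alignment, src_len, tgt_len, max_len=3):
    """Return all (src_span, tgt_span) pairs consistent with the alignment.

    alignment is a set of (source_pos, target_pos) links, 0-based.
    Spans are inclusive (start, end) tuples of at most max_len words.
    """
    pairs = []
    for j1 in range(src_len):
        for j2 in range(j1, min(j1 + max_len, src_len)):
            for i1 in range(tgt_len):
                for i2 in range(i1, min(i1 + max_len, tgt_len)):
                    # Links fully inside the candidate phrase pair.
                    inside = [(j, i) for (j, i) in alignment
                              if j1 <= j <= j2 and i1 <= i <= i2]
                    # Links with exactly one end inside the pair.
                    crossing = [(j, i) for (j, i) in alignment
                                if (j1 <= j <= j2) != (i1 <= i <= i2)]
                    if inside and not crossing:
                        pairs.append(((j1, j2), (i1, i2)))
    return pairs

# Source: "einen Zahnarzttermin"; target: "a dentist's appointment"
alignment = {(0, 0), (1, 1), (1, 2)}
phrases = extract_phrases(alignment, src_len=2, tgt_len=3)
print(phrases)
```

The brute-force loops are quadratic in both sentence lengths, which is acceptable for a sketch but not for large corpora.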
Alignment templates (1)
● A more systematic approach
– Considers whole phrases
● A whole group of adjacent words in the source
● maps to a whole group of words in the target
– The context of words has a greater influence
– Changes of word order can be learned
● The idea is to model two different alignment levels
– Word level alignments
– Phrase level alignments
Alignment templates (2)
• Alignment template z
$$z = (\tilde{F}, \tilde{E}, \tilde{A})$$
– “F” … source class sequence
– “E” … target class sequence
– “A” … describes the alignment between source and target
• “F” and “E” are classes
– The advantage is a better generalization
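The generalization gained from classes can be illustrated with a small data structure. The class names, the word-to-class table and the matching helper are all hypothetical and only serve to show that one template covers several word sequences.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AlignmentTemplate:
    source_classes: tuple  # F: source class sequence
    target_classes: tuple  # E: target class sequence
    alignment: frozenset   # A: set of (source_pos, target_pos) links

def matches(template, source_phrase, word_class):
    """True if the classes of source_phrase equal the template's F sequence."""
    return tuple(word_class[w] for w in source_phrase) == template.source_classes

# Hypothetical word classes, written by hand for this example.
word_class = {"Montag": "DAY", "Dienstag": "DAY", "am": "PREP"}
z = AlignmentTemplate(
    source_classes=("PREP", "DAY"),
    target_classes=("PREP_E", "DAY_E"),
    alignment=frozenset({(0, 0), (1, 1)}),
)
print(matches(z, ["am", "Montag"], word_class))
print(matches(z, ["am", "Dienstag"], word_class))
```

Both phrases match the same template even if only one of them was seen in training, which is the generalization effect named on the slide.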
Alignment templates (3)
Taken from [2]
Alignment templates (4)
● For training we need the probability of applying an alignment template
● The “phrase extraction” has to be modified
● Can be estimated by relative frequencies
● This finishes the “Learning translation lexica” task
Translation model (1)
• For notation we decompose the sentences
– f_1^J … source sentence
– e_1^I … target sentence
– sequence of phrases (k=1,…,K)
$$f_1^J = \tilde{f}_1^K, \quad \tilde{f}_k = f_{j_{k-1}+1}, \dots, f_{j_k}$$
$$e_1^I = \tilde{e}_1^K, \quad \tilde{e}_k = e_{i_{k-1}+1}, \dots, e_{i_k}$$
• Further considerations (only one segmentation)
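The decomposition of a sentence into K consecutive phrases can be sketched with a list of phrase end positions. The function name and the example boundaries are assumptions for illustration; the boundaries correspond to j_1 < ... < j_K = J with j_0 = 0.

```python
def segment(words, boundaries):
    """Split words into K phrases given the 1-based end positions j_1..j_K."""
    phrases, start = [], 0
    for end in boundaries:
        # Phrase k covers words j_{k-1}+1 .. j_k, i.e. words[start:end] in 0-based slicing.
        phrases.append(words[start:end])
        start = end
    return phrases

f = "ich habe einen Zahnarzttermin".split()
f_phrases = segment(f, boundaries=[2, 4])
print(f_phrases)
```

The same function applies to the target sentence with its own boundary positions.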
Translation model (2)
● The model has to allow reordering of the phrases
● With the following hidden variables
– \tilde{f}_1^K … source phrases
– \tilde{e}_1^K … target phrases
– \pi_1^K … permutation of the phrases
– z_1^K … alignment templates for translation
Translation model (3)
Taken from [2]
Translation model (4)
Taken from [2]
Alignment template approach results
● Evaluation of the approach on a translation task (“Verbmobil task”)
● Additional preprocessing
– word joining
– word splitting
Taken from [2]
Taken from [2]
Alignment template approach conclusions
● Overall we see better performance
● So it is important to model word groups in source and target language
● By using two abstraction levels
– Phrase level alignments
– Word level alignments
– -> the context has a greater influence and can be learned explicitly
Syntactic phrases (1)
● A collection of all phrase pairs will also include non-intuitive phrases
– “Okay, the”, “house the”, etc.
– Intuitively such phrases do not help
– Restricting to syntactically motivated phrases
● The idea: syntactic trees, with phrases as subtrees
Syntactic phrases (2)
● The input sentence is preprocessed by a syntactic parser
● Different operations are performed on each node
– reordering child nodes
– inserting extra words at each node
– translating leaf words
Syntactic phrases (3)
Taken from [4]
Syntactic phrases (4)
Taken from [6]
Syntactic phrases (5)
● Reordering
– Every given child sequence has a probability of reordering (N nodes -> N! possible reorderings)
– The probability of reordering is given by the model (table etc.)
● Inserting
– An extra word can be inserted (left/right)
– Another table for the insertion probability
● Translating
– Operation is applied to every leaf
– Assumption that this operation only depends on the word itself
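The three operations can be sketched on a small parse tree. This is a deterministic simplification: the tables below hold one fixed choice per entry instead of probabilities, and the tree encoding, table layout and toy English-to-Japanese entries are assumptions for illustration.

```python
def transform(node, reorder, insert, translate):
    """Apply reorder, insert and translate to a parse tree; return output words.

    A node is (label, payload); payload is a word (leaf) or a list of child nodes.
    reorder maps a tuple of child labels to a permutation of child indices,
    insert maps a node label to (side, extra_word) with side "left" or "right",
    translate maps a leaf word to its translation.
    """
    label, payload = node
    if isinstance(payload, str):
        # Translating: depends only on the leaf word itself.
        words = [translate.get(payload, payload)]
    else:
        # Reordering: pick a child order for this sequence of child labels.
        labels = tuple(child[0] for child in payload)
        order = reorder.get(labels, range(len(payload)))
        words = [w for i in order
                 for w in transform(payload[i], reorder, insert, translate)]
    if label in insert:
        # Inserting: add an extra word to the left or right of this node.
        side, extra = insert[label]
        words = [extra] + words if side == "left" else words + [extra]
    return words

tree = ("S", [("PRP", "he"), ("VP", [("VB", "reads"), ("NN", "books")])])
reorder = {("VB", "NN"): (1, 0)}
insert = {"PRP": ("right", "ha"), "NN": ("right", "wo")}
translate = {"he": "kare", "reads": "yomu", "books": "hon"}
output = transform(tree, reorder, insert, translate)
print(" ".join(output))
```

A probabilistic version would score each choice with the corresponding table and sum or maximize over all alternatives.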
Experiments
● Now we have three models
● [1] built a system to compare them and measure performance under different aspects
– Weighting syntactic phrases
– Maximum phrase length
● Setup
– Freely available Europarl corpus
– German to English
– Performance measured using BLEU score
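For orientation, a simplified BLEU computation is sketched below. It scores a single sentence against a single reference with uniform n-gram weights and no smoothing, whereas BLEU is normally computed over a whole test corpus; the function names are chosen for this example only.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Single-reference BLEU with uniform weights and brevity penalty (no smoothing)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        # Modified precision: clip each candidate n-gram count by the reference count.
        overlap = sum(min(count, ref[g]) for g, count in cand.items())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty punishes candidates shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(log_precisions) / max_n)

score = bleu("there is a house".split(), "there is a house".split())
print(score)
```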
Comparison of core methods
● AP … alignment templates
● M4 … IBM Model 4 for word-based translation
● Syn … syntactic phrases
● Training corpus size [sentences]
Taken from [1] Taken from [1]
Weighting syntactic phrases (1)
● The restriction to syntactic phrases is harmful, because too many phrases are eliminated
● Intuitively that should not be the case
– Improvements in data collection, during translation, penalizing
● Results suggest
– Collection of only syntactic phrases
– Performance not better
– But smaller table sizes
Weighting syntactic phrases (2)
● Example:
– “es gibt” literally translates to “it gives” but really means “there is”
– No syntactic relationship
– Also “with regard to”, “note that”: syntactically complex but easy to translate
Maximum phrase length
● How long do phrases have to be to achieve high performance?
● All experiments with “Phrases from word-based alignments” approach
Taken from [1] Taken from [1]
Simpler Underlying word-based models (1)
● The core of this framework is IBM Model 4 for collecting phrase pairs
● Model 4 is computationally expensive and has parameter problems (approximations)
● What about IBM Models 1-3?
– Faster and easier to implement
– Models 1 and 2 compute word alignments efficiently
Simpler Underlying word-based models (2)
● How much is performance affected if we base the word alignment on these simpler methods?
● M1 gives the worst performance
● But M2 & M3 provide performance similar to the M4 model
Taken from [1]
Conclusions
● Intuitively, phrase-based approaches give better performance than word-based approaches
● Experiments also show that
– “straightforward” syntax-based models have disadvantages
● The “best” outcome is achieved with short phrases
● Phrase extraction and the alignment heuristic have a great influence
References
● [1] Philipp Koehn, Franz Josef Och, Daniel Marcu; Statistical Phrase-Based Translation
● [2] Franz Josef Och, Hermann Ney; The Alignment Template Approach to Statistical Machine Translation
● [3] Franz Josef Och, Christoph Tillmann, Hermann Ney; Improved Alignment Models for Statistical Machine Translation
● [4] Kenji Yamada, Kevin Knight; A Syntax-based Translation Model
● [5] Daniel Marcu, William Wong; A Phrase-Based, Joint Probability Model for Statistical Machine Translation
● [6] Amitabha Mukerjee, Ankit Soni and Achla M. Raina; Detecting Complex Predicates in Hindi using POS Projection across Parallel Corpora
● [7] www.sbox.tugraz.at/home/b/brein/061120_TranslationModelPhraseBased.zip