machine translation and mt tools: giza++ and moses -nirdesh chauhan
![Page 1: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/1.jpg)
MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES
-Nirdesh Chauhan
![Page 2: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/2.jpg)
Outline
Problem statement in SMT
Translation models
Using Giza++ and Moses
![Page 3: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/3.jpg)
Introduction to SMT
Given a sentence F in a foreign language, find the most appropriate English translation E:
$$\hat{E} = \arg\max_{E} P(E \mid F) = \arg\max_{E} P(F \mid E)\, P(E)$$
P(F|E) – Translation model
P(E) – Language model
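The two models combine through Bayes' rule: the preferred translation is the E that maximizes P(F|E) * P(E). A minimal sketch of that decision rule follows. The candidate list and all probability values are made up for illustration, and `best_translation` is a hypothetical helper, not part of Giza++ or Moses.

```python
def best_translation(candidates, translation_model, language_model):
    """Return the candidate E that maximizes P(F|E) * P(E)."""
    return max(candidates, key=lambda e: translation_model[e] * language_model[e])

# Hypothetical scores for one fixed foreign sentence F = "casa verde".
translation_model = {"green house": 0.30, "house green": 0.30}  # P(F|E)
language_model = {"green house": 0.020, "house green": 0.001}   # P(E)

best = best_translation(list(translation_model), translation_model, language_model)
print(best)  # green house
```

Here the translation model cannot tell the two word orders apart, so the language model decides.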
![Page 4: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/4.jpg)
The Generation Process
Partition: Consider all possible partitions of the source sentence into phrases
Lexicalization: For a given partition, translate each phrase into the foreign language
Reordering: Permute the set of all foreign words, with words possibly moving across phrase boundaries
We need the notion of alignment to better explain the mathematics behind the generation process
![Page 5: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/5.jpg)
Alignment
![Page 6: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/6.jpg)
Word-based alignment
For each word in the source language, align the words from the target language that this word possibly produces
Based on IBM Models 1-5
Model 1 is the simplest
As we go from Model 1 to Model 5, the models get more complex but more realistic
Word alignment is all that Giza++ does
![Page 7: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/7.jpg)
Alignment
A function from target position to source position:
The alignment sequence is: 2,3,4,5,6,6,6
Alignment function A: A(1) = 2, A(2) = 3, ...
A different alignment function will give the sequence 1,2,1,2,3,4,3,4 for A(1), A(2), ...
To allow spurious insertion, allow alignment with word 0 (NULL)
No. of possible alignments: (I+1)^J
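To make the count concrete, the sketch below enumerates every alignment function for a small sentence pair, assuming I source words plus the NULL word at position 0 and J target words. The function name `all_alignments` is illustrative only.

```python
from itertools import product

def all_alignments(I, J):
    """Every alignment as a tuple (a_1, ..., a_J), with a_j in 0..I (0 = NULL)."""
    return list(product(range(I + 1), repeat=J))

alignments = all_alignments(I=2, J=3)
print(len(alignments))   # 27, i.e. (2+1)^3
print(alignments[0])     # (0, 0, 0): every target word aligned to NULL
```

The count grows exponentially in J, which is why the models are designed so that the sum over alignments never has to be enumerated explicitly.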
![Page 8: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/8.jpg)
IBM Model 1: Generative Process
![Page 9: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/9.jpg)
IBM Model 1: Details
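Summing over all alignments A:
$$P(F \mid E) = \sum_{A} P(J \mid E) \cdot P(A \mid J, E) \cdot P(F \mid A, J, E)$$
$$P(A \mid J, E) \cdot P(F \mid A, J, E) = P(a_1^J \mid J, e_1^I) \cdot P(f_1^J \mid J, a_1^J, e_1^I) = \prod_{j=1}^{J} P(a_j \mid a_1^{j-1}, J, e_1^I) \cdot P(f_j \mid f_1^{j-1}, J, a_1^J, e_1^I)$$
$$P(F \mid E) = P(J \mid E) \sum_{A} \prod_{j=1}^{J} P(a_j \mid a_1^{j-1}, J, e_1^I) \cdot P(f_j \mid f_1^{j-1}, J, a_1^J, e_1^I)$$
No assumptions. Above formula is exact.
Choosing length: P(J|E) = P(J|E,I) = P(J|I) = ε (a constant)
Choosing alignment: all alignments equiprobable
$$P(a_j \mid a_1^{j-1}, J, e_1^I) = \frac{1}{I+1}$$
Translation probability:
$$P(f_j \mid f_1^{j-1}, J, a_1^J, e_1^I) = t(f_j \mid e_{a_j})$$
Putting these together:
$$P(F \mid E) = \frac{\epsilon}{(I+1)^J} \sum_{A} \prod_{j=1}^{J} t(f_j \mid e_{a_j})$$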
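As a check on the Model 1 formula, the sketch below computes P(F|E) twice: once by summing over every alignment, and once with the standard rearrangement in which the sum over alignments becomes a product over target positions of a sum over source words. The table `t`, the `epsilon` default and both function names are assumptions for illustration.

```python
from itertools import product
from math import prod

def model1_brute_force(f_words, e_words, t, epsilon=1.0):
    """epsilon / (I+1)^J * sum over all alignments of prod_j t(f_j | e_{a_j})."""
    e_null = ["NULL"] + e_words          # position 0 is the NULL word
    I, J = len(e_words), len(f_words)
    total = 0.0
    for alignment in product(range(I + 1), repeat=J):
        total += prod(t.get((f, e_null[a]), 0.0) for f, a in zip(f_words, alignment))
    return epsilon / (I + 1) ** J * total

def model1_factorized(f_words, e_words, t, epsilon=1.0):
    """Same quantity, computed as prod_j sum_i t(f_j | e_i)."""
    e_null = ["NULL"] + e_words
    I, J = len(e_words), len(f_words)
    return epsilon / (I + 1) ** J * prod(
        sum(t.get((f, e), 0.0) for e in e_null) for f in f_words)

# Hypothetical translation table t[(f, e)] = t(f | e).
t = {("casa", "house"): 0.8, ("casa", "green"): 0.1, ("casa", "NULL"): 0.1,
     ("verde", "green"): 0.7, ("verde", "house"): 0.2, ("verde", "NULL"): 0.1}

p_slow = model1_brute_force(["casa", "verde"], ["green", "house"], t)
p_fast = model1_factorized(["casa", "verde"], ["green", "house"], t)
print(p_slow, p_fast)
```

The two values agree up to floating-point rounding; the factorized form needs only (I+1)*J table lookups instead of (I+1)^J products.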
![Page 10: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/10.jpg)
Training Alignment Models
Given a parallel corpus, for each (F,E) learn the best alignment A and the component probabilities:
t(f|e) for Model 1
lexicon probability P(f|e) and alignment probability P(a_i | a_{i-1}, I)
How do we compute these probabilities if all we have is a parallel corpus?
![Page 11: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/11.jpg)
Intuition : Interdependence of Probabilities
If you knew which words are probable translations of each other, then you could guess which alignment is probable and which one is improbable
If you were given alignments with probabilities, then you could compute translation probabilities
Looks like a chicken and egg problem
EM algorithm comes to the rescue
![Page 12: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/12.jpg)
Expectation Maximization (EM) Algorithm
Used when we want the maximum likelihood estimate of the parameters of a model when the model depends on hidden variables
In the present case, the parameters are translation probabilities, and the hidden variables are alignment probabilities
• Init: Start with an arbitrary estimate of the parameters
• E-step: Compute the expected value of the hidden variables
• M-step: Recompute the parameters that maximize the likelihood of the data given the expected value of the hidden variables from the E-step
![Page 13: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/13.jpg)
Example of EM Algorithm
Green house
Casa verde
The house
La casa
Init: Assume that any word can generate any word with equal probability:
P(la|house) = 1/3
![Page 14: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/14.jpg)
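E-Step
$$P(A, F \mid J, E) = P(A \mid J, E) \cdot P(F \mid A, J, E) = \frac{1}{(I+1)^J} \prod_{j=1}^{J} t(f_j \mid e_{a_j})$$
E-Step:
$$P(A \mid F, E) = \frac{P(A, F \mid E)}{\sum_{A'} P(A', F \mid E)}$$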
![Page 15: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/15.jpg)
M-Step
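$$\mathrm{tcount}(f \mid e) = \sum_{(F,E)} \sum_{A} P(A \mid F, E) \cdot C(f, e \mid A, F, E)$$
$$t(f \mid e) = \frac{\mathrm{tcount}(f \mid e)}{\sum_{f'} \mathrm{tcount}(f' \mid e)}$$
Here P(A|F,E) is the alignment posterior from the E-step, and C(f,e|A,F,E) is the number of times f is aligned to e in alignment A of the pair (F,E).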
![Page 16: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/16.jpg)
E-Step again
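$$P(A, F \mid J, E) = P(A \mid J, E) \cdot P(F \mid A, J, E) = \frac{1}{(I+1)^J} \prod_{j=1}^{J} t(f_j \mid e_{a_j})$$
$$P(A \mid F, E) = \frac{P(A, F \mid E)}{\sum_{A'} P(A', F \mid E)}$$
[Figure: the candidate alignments of the two sentence pairs, labelled 1/3, 2/3, 2/3, 1/3]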
Repeat till convergence
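The Init / E-step / M-step loop can be sketched in a few lines for the two-sentence corpus of the example. This is a simplified illustration rather than what Giza++ runs: it omits the NULL word, and it uses the fact that under Model 1 the alignment posterior factorizes per target word, so alignments are never enumerated. The function name and data layout are assumptions.

```python
from collections import defaultdict

def train_model1(corpus, iterations):
    """corpus: list of (english_words, foreign_words); returns t[(f, e)] = t(f|e)."""
    f_vocab = {f for _, fs in corpus for f in fs}
    # Init: any word can generate any word with equal probability.
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of f aligned to e
        total = defaultdict(float)   # expected count of e being aligned at all
        # E-step: posterior probability that f is aligned to each e.
        for es, fs in corpus:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    p = t[(f, e)] / norm
                    count[(f, e)] += p
                    total[e] += p
        # M-step: renormalize expected counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

corpus = [(["green", "house"], ["casa", "verde"]),
          (["the", "house"], ["la", "casa"])]

t1 = train_model1(corpus, iterations=1)
t50 = train_model1(corpus, iterations=50)
print(t1[("casa", "house")])    # 0.5 after the first iteration
print(t50[("casa", "house")])   # approaches 1 as EM converges
```

Because "casa" co-occurs with "house" in both sentence pairs, its expected count grows at the expense of "verde" and "la", which is the chicken-and-egg resolution described earlier.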
![Page 17: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/17.jpg)
Limitation: Only 1->Many Alignments allowed
![Page 18: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/18.jpg)
Phrase-based alignment
More natural
Many-to-one mappings allowed
![Page 19: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/19.jpg)
Generating Bi-directional Alignments
Existing models only generate uni-directional alignments
Combine two uni-directional alignments to get many-to-many bi-directional alignments
![Page 20: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/20.jpg)
Hindi-Eng Alignment
छुट्टियों के लिए गोवा एक प्रमुख समुद्र-तटीय गंतव्य है
Goa is a premier beach vacation destination
[Figure: Hindi-to-English word alignment matrix for the sentence pair above]
![Page 21: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/21.jpg)
Eng-Hindi Alignment
छुट्टियों के लिए गोवा एक प्रमुख समुद्र-तटीय गंतव्य है
Goa is a premier beach vacation destination
[Figure: English-to-Hindi word alignment matrix for the sentence pair above]
![Page 22: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/22.jpg)
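Combining Alignments
छुट्टियों के लिए गोवा एक प्रमुख समुद्र-तटीय गंतव्य है
Goa is a premier beach vacation destination
[Figure: combined alignment matrix for the sentence pair above, with alignment points marked "+" or "|"]
Precision (P) and recall (R) values shown on the slide:
P = 2/3 = 0.67, R = 2/7 = 0.29
P = 4/5 = 0.80, R = 4/7 = 0.57
P = 5/6 = 0.83, R = 5/7 = 0.71
P = 6/9 = 0.67, R = 6/7 = 0.86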
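The simplest ways to combine two directional alignments are their intersection (fewer points, usually higher precision) and their union (more points, usually higher recall). The sketch below computes both, with precision and recall against a reference alignment. All alignment points are invented for illustration and are not the points from the Hindi-English example.

```python
def precision_recall(predicted, gold):
    """Precision and recall of a set of (e, f) alignment points against a reference."""
    correct = len(predicted & gold)
    return correct / len(predicted), correct / len(gold)

# Hypothetical directional alignments as sets of (english_index, foreign_index).
e2f = {(0, 0), (1, 1), (2, 1)}
f2e = {(0, 0), (1, 1), (1, 2), (2, 3)}
gold = {(0, 0), (1, 1), (2, 1)}

intersection = e2f & f2e
union = e2f | f2e

p_int, r_int = precision_recall(intersection, gold)
p_uni, r_uni = precision_recall(union, gold)
print(p_int, r_int)
print(p_uni, r_uni)
```

In this toy case the intersection is fully precise but misses one reference point, while the union recovers every reference point at the cost of two spurious ones. Heuristics such as grow-diag-final aim for a point between the two.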
![Page 23: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/23.jpg)
A Different Heuristic from Moses-Site
GROW-DIAG-FINAL(e2f,f2e):
  neighboring = ((-1,0),(0,-1),(1,0),(0,1),(-1,-1),(-1,1),(1,-1),(1,1))
  alignment = intersect(e2f,f2e);
  GROW-DIAG(); FINAL(e2f); FINAL(f2e);

GROW-DIAG():
  iterate until no new points added
    for english word e = 0 ... en
      for foreign word f = 0 ... fn
        if ( e aligned with f )
          for each neighboring point ( e-new, f-new ):
            if (( e-new, f-new ) in union( e2f, f2e ) and
                ( e-new not aligned and f-new not aligned ))
              add alignment point ( e-new, f-new )

FINAL(a):
  for english word e-new = 0 ... en
    for foreign word f-new = 0 ... fn
      if ( ( ( e-new, f-new ) in alignment a) and
           ( e-new not aligned or f-new not aligned ) )
        add alignment point ( e-new, f-new )
Proposed changes:
After growing diagonal, align the shorter sentence first
And use alignments only from the corresponding directional alignment
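The GROW-DIAG-FINAL pseudocode above can be sketched in Python as follows. This follows the pseudocode as transcribed on the slide, including the `and` condition in GROW-DIAG, and does not implement the proposed changes. Representing alignments as sets of 0-based (e, f) index pairs, and passing the sentence lengths as `en` and `fn`, are assumptions of this sketch.

```python
NEIGHBORS = [(-1, 0), (0, -1), (1, 0), (0, 1),
             (-1, -1), (-1, 1), (1, -1), (1, 1)]

def grow_diag_final(e2f, f2e, en, fn):
    """e2f, f2e: sets of (e, f) points; en, fn: English and foreign sentence lengths."""
    union = e2f | f2e
    alignment = set(e2f & f2e)          # start from the intersection

    def e_aligned(e):
        return any(p[0] == e for p in alignment)

    def f_aligned(f):
        return any(p[1] == f for p in alignment)

    # GROW-DIAG: add neighboring union points whose two words are both unaligned.
    added = True
    while added:
        added = False
        for e in range(en):
            for f in range(fn):
                if (e, f) not in alignment:
                    continue
                for de, df in NEIGHBORS:
                    e_new, f_new = e + de, f + df
                    if ((e_new, f_new) in union
                            and not e_aligned(e_new)
                            and not f_aligned(f_new)):
                        alignment.add((e_new, f_new))
                        added = True

    # FINAL(e2f), FINAL(f2e): add directional points where either word is unaligned.
    for a in (e2f, f2e):
        for e_new in range(en):
            for f_new in range(fn):
                if ((e_new, f_new) in a
                        and (not e_aligned(e_new) or not f_aligned(f_new))):
                    alignment.add((e_new, f_new))
    return alignment

# Hypothetical directional alignments for a 3-word by 3-word sentence pair.
result = grow_diag_final({(0, 0), (1, 1), (2, 1)}, {(0, 0), (1, 1), (1, 2)}, 3, 3)
print(sorted(result))  # [(0, 0), (1, 1), (1, 2), (2, 1)]
```

In this example GROW-DIAG adds nothing, because each candidate neighbor already has one aligned word, and the two FINAL passes then pick up the remaining directional points.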
![Page 24: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/24.jpg)
Generating Phrase Alignments
छुट्टियों के लिए गोवा एक प्रमुख समुद्र-तटीय गंतव्य है
Goa is a premier beach vacation destination
[Figure: combined word alignment matrix for the sentence pair above, with alignment points marked "+"]
Example phrase pairs:
a premier beach vacation destination ↔ एक प्रमुख समुद्र-तटीय गंतव्य है
premier beach vacation ↔ प्रमुख समुद्र-तटीय
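Phrase pairs are usually read off a word alignment by keeping every pair of spans that is consistent with it: no word inside either span may be aligned to a word outside the other. A minimal sketch of that criterion is below. It ignores the common extension over unaligned boundary words, and the function name, the `max_len` default and the toy alignment are assumptions.

```python
def extract_phrases(alignment, en, max_len=7):
    """Return consistent phrase pairs as ((e_start, e_end), (f_start, f_end)), inclusive."""
    pairs = set()
    for e1 in range(en):
        for e2 in range(e1, min(en, e1 + max_len)):
            # Foreign positions linked to the English span [e1, e2].
            fs = [f for (e, f) in alignment if e1 <= e <= e2]
            if not fs:
                continue
            f1, f2 = min(fs), max(fs)
            if f2 - f1 >= max_len:
                continue
            # Consistency: nothing inside [f1, f2] is aligned outside [e1, e2].
            if all(e1 <= e <= e2 for (e, f) in alignment if f1 <= f <= f2):
                pairs.add(((e1, e2), (f1, f2)))
    return pairs

# Hypothetical alignment with one reordering: e1 <-> f2 and e2 <-> f1.
phrases = extract_phrases({(0, 0), (1, 2), (2, 1)}, en=3)
print(sorted(phrases))
```

The span pair ((0, 1), (0, 2)) is rejected here because foreign word 1 inside it is aligned to English word 2, which lies outside the English span.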
![Page 25: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/25.jpg)
Using Moses and Giza++
Refer to http://www.statmt.org/moses_steps.html
![Page 26: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/26.jpg)
Steps
Install all packages in Moses
Input - sentence aligned parallel corpus
Training
Tuning
Generate output on test corpus (decoding)
![Page 27: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/27.jpg)
Example
train.en
h e l l o
h e l l o
w o r l d
c o m p o u n d w o r d
h y p h e n a t e d
o n e
b o o m
k w e e z l e b o t t e r
train.pr
hh eh l ow
hh ah l ow
w er l d
k aa m p aw n d w er d
hh ay f ah n ey t ih d
ow eh n iy
b uw m
k w iy z l ah b aa t ah r
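In the training files above, each line is one entry and each letter (train.en) or phoneme (train.pr) is a separate space-delimited token, so the toolkit treats letters as "words". A small sketch of preparing the letter side from ordinary spellings follows. The helper name is hypothetical, and dropping spaces and non-letters is an assumption inferred from the "compound word" line.

```python
def to_letter_tokens(text):
    """Lowercase the text, drop non-letters, and separate letters with spaces."""
    return " ".join(ch for ch in text.lower() if ch.isalpha())

lines = [to_letter_tokens(w) for w in ["hello", "world", "compound word"]]
print(lines)  # ['h e l l o', 'w o r l d', 'c o m p o u n d w o r d']
```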
![Page 28: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/28.jpg)
Sample from Phrase-table
b o ||| b aa ||| (0) (1) ||| (0) (1) ||| 1 0.666667 1 0.181818 2.718
b ||| b ||| (0) ||| (0) ||| 1 1 1 1 2.718
c o m p o ||| aa m p ||| (2) (0,1) (1) (0) (1) ||| (1,3) (1,2,4) (0) ||| 1 0.0486111 1 0.154959 2.718
c ||| p ||| (0) ||| (0) ||| 1 1 1 1 2.718
d w ||| d w ||| (0) (1) ||| (0) (1) ||| 1 0.75 1 1 2.718
d ||| d ||| (0) ||| (0) ||| 1 1 1 1 2.718
e b ||| ah b ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
e l l ||| ah l ||| (0) (1) (1) ||| (0) (1,2) ||| 1 1 0.5 0.5 2.718
e l l ||| eh l ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.111111 0.5 0.111111 2.718
e l ||| eh ||| (0) (0) ||| (0,1) ||| 1 0.111111 1 0.133333 2.718
e ||| ah ||| (0) ||| (0) ||| 1 1 0.666667 0.6 2.718
h e ||| hh ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
h ||| hh ||| (0) ||| (0) ||| 1 1 1 1 2.718
l e b ||| l ah b ||| (0) (1) (2) ||| (0) (1) (2) ||| 1 1 1 0.5 2.718
l e ||| l ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.5 2.718
l l o ||| l ow ||| (0) (0) (1) ||| (0,1) (2) ||| 0.5 1 1 0.227273 2.718
l l ||| l ||| (0) (0) ||| (0,1) ||| 0.25 1 1 0.833333 2.718
l o ||| l ow ||| (0) (1) ||| (0) (1) ||| 0.5 1 1 0.227273 2.718
l ||| l ||| (0) ||| (0) ||| 0.75 1 1 0.833333 2.718
m ||| m ||| (0) ||| (0) ||| 1 0.5 1 1 2.718
n d ||| n d ||| (0) (1) ||| (0) (1) ||| 1 1 1 1 2.718
n e ||| eh n iy ||| (1) (2) ||| () (0) (1) ||| 1 1 0.5 0.3 2.718
n e ||| n iy ||| (0) (1) ||| (0) (1) ||| 1 1 0.5 0.3 2.718
n ||| eh n ||| (1) ||| () (0) ||| 1 1 0.25 1 2.718
o o m ||| uw m ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.5 1 0.181818 2.718
o o ||| uw ||| (0) (0) ||| (0,1) ||| 1 1 1 0.181818 2.718
o ||| aa ||| (0) ||| (0) ||| 1 0.666667 0.2 0.181818 2.718
o ||| ow eh ||| (0) ||| (0) () ||| 1 1 0.2 0.272727 2.718
o ||| ow ||| (0) ||| (0) ||| 1 1 0.6 0.272727 2.718
w o r ||| w er ||| (0) (1) (1) ||| (0) (1,2) ||| 1 0.1875 1 0.424242 2.718
w ||| w ||| (0) ||| (0) ||| 1 0.75 1 1 2.718
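Each phrase-table entry above is a set of fields separated by `|||`: source phrase, target phrase, two alignment fields, and a list of scores. A minimal parser sketch for lines in this layout is below. The function name and the returned dictionary keys are illustrative, and the scores are returned as plain numbers without interpreting what each one means.

```python
def parse_phrase_table_line(line):
    """Split one '|||'-delimited phrase-table line into its fields."""
    fields = [field.strip() for field in line.split("|||")]
    return {
        "source": fields[0].split(),
        "target": fields[1].split(),
        "alignments": fields[2:-1],                       # kept as raw strings
        "scores": [float(x) for x in fields[-1].split()],
    }

entry = parse_phrase_table_line(
    "h e ||| hh ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718")
print(entry["source"], entry["target"], entry["scores"])
```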
![Page 29: MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649da25503460f94a8f29f/html5/thumbnails/29.jpg)
Testing output
h o t → hh aa t
p h o n e → p|UNK hh ow eh n iy
b o o k → b uw k