Transfer-based MT

Post on 20-Dec-2015


Page 1: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

Transfer-based MT


Syntactic Transfer-based Machine Translation

• Direct and Example-based approaches
  – Two ends of a spectrum
  – Recombination of fragments for better coverage

• What if the matching/transfer is done at the syntactic parse level?

• Three Steps
  – Parse: Syntactic parse of the source language sentence
    • Hierarchical representation of a sentence
  – Transfer: Rules to transform the source parse tree into the target parse tree
    • Subject-Verb-Object → Subject-Object-Verb
  – Generation: Regenerating the target language sentence from the parse tree
    • Morphology of the target language

• Tree structure provides better matching and longer-distance transformations than is possible in string-based EBMT.
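The three steps can be sketched in a few lines of Python. This is a toy illustration only: the hand-written tree stands in for a real parser, and the node labels, the single SVO-to-SOV rule, and the romanized Japanese lexicon (with particles folded into the entries in place of real morphology) are all assumptions made for the example.

```python
# Toy parse -> transfer -> generate pipeline (all data here is illustrative).
# A node is (label, children); a leaf is (label, word) with word a string.

def transfer(node):
    """Apply one transfer rule bottom-up: S(SUBJ VERB OBJ) -> S(SUBJ OBJ VERB)."""
    label, kids = node
    if isinstance(kids, str):              # leaf: nothing to reorder
        return node
    kids = [transfer(k) for k in kids]
    if label == "S" and [k[0] for k in kids] == ["SUBJ", "VERB", "OBJ"]:
        subj, verb, obj = kids
        kids = [subj, obj, verb]
    return (label, kids)

def generate(node, lexicon):
    """Read the target words off the transferred tree, left to right."""
    label, kids = node
    if isinstance(kids, str):
        return [lexicon.get(kids, kids)]
    words = []
    for k in kids:
        words.extend(generate(k, lexicon))
    return words

# Step 1 (parse) is assumed to have produced this tree.
source_tree = ("S", [("SUBJ", "I"), ("VERB", "eat"), ("OBJ", "rice")])
# Hypothetical lexicon; a real system would run target-side morphology here.
lexicon = {"I": "watashi-wa", "eat": "tabemasu", "rice": "gohan-o"}

target = " ".join(generate(transfer(source_tree), lexicon))
print(target)
```

Because the rule matches on node labels rather than on words, the same reordering applies to any sentence whose parse has that shape, which is the coverage advantage over string-based matching.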


Examples of SynTran-MT

[Figure: parallel parse trees. Spanish: quiero, ajá, usar, mi tarjeta de crédito. English: I, wanna, yeah, use, my credit card.]

• Mostly parallel parse structures

• Might have to insert words: pronouns, morphological particles


Examples of SynTran-MT (2)

• Pros:
  – Allows for structure transfer
  – Re-orderings are typically restricted to the parent-child nodes.

• Cons:
  – Transfer rules are for each language pair (N² sets of rules)
  – Hard to reuse rules when one of the languages is changed

[Figure: parallel parse trees. English: need, I, make, to, call, a, collect. Japanese: 必要があります (need), 私は (I), かける (make), コールを (call), コレクト (collect).]


Lexical-semantic Divergences

Linguistic Divergences

• Structural differences between languages
  – Categorial Divergence
    • Translation of words in one language into words that have different parts of speech in another language
      – To be jealous
      – Tener celos (To have jealousy)


Issues

Linguistic Divergences
  – Conflational Divergence
    • Translation of two or more words in one language into one word in another language
      – To kick
      – Dar una patada (Give a kick)


Issues

Linguistic Divergences
  – Structural Divergence
    • Realization of verb arguments in different syntactic configurations in different languages
      – To enter the house
      – Entrar en la casa (Enter in the house)


Issues

Linguistic Divergences
  – Head-Swapping Divergence
    • Inversion of a structural-dominance relation between two semantically equivalent words
      – To run in
      – Entrar corriendo (Enter running)


Issues

Linguistic Divergences
  – Thematic Divergence
    • Realization of verb arguments that reflect different thematic to syntactic mapping orders
      – I like grapes
      – Me gustan uvas (To-me please grapes)


Divergence counts from Bonnie Dorr

32% of sentences in UN Spanish/English Corpus (5K)

Divergence      Spanish               English           Share
Categorial      X tener hambre        Y have hunger     98%
Conflational    X dar puñaladas a Z   X stab Z          83%
Structural      X entrar en Y         X enter Y         35%
Head Swapping   X cruzar Y nadando    X swim across Y   8%
Thematic        X gustar a Y          Y likes X         6%


Transfer rules


Syntax-driven statistical machine translation

Slides from Deyi Xiong, CAS, Beijing


Why syntax-based SMT

Weakness of phrase-based SMT

• Long-distance reordering: phrase-level reordering

• Discontinuous phrases

• Generalization

• …

Other methods using syntactic knowledge

• Word alignment integrating syntactic constraints

• Pre-order source sentences

• Rerank n-best output of translation models


SSMT based on formal structures

Compared with phrase-based SMT

• Translated hierarchically

• The target structures finally generated are not necessarily real linguistic structures, but they
  – Make long-distance reordering more feasible
  – Introduce non-terminals/variables
    • Discontinuous phrases: put x on, 在 x 时
    • Generalization


SCFG

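Formulated:

• Two CFGs and their correspondences: $\mathrm{SCFG} = (G_1, G_2, \sim)$

Or

• $\mathrm{SCFG} = (N, T, P, S)$

• P: $\langle A, B \rangle \rightarrow \langle \alpha, \beta, \sim \rangle$, or with a single non-terminal, $X \rightarrow \langle \gamma, \alpha, \sim \rangle$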


SCFG: an example


SCFG: derivation


ITG as reordering constraint

Two kinds of reordering
• Inverted
• Straight

Coverage
• Wu (1997): “been unable to find real examples” of cases where alignments would fail under this constraint, at least in “lightly inflected languages, such as English and Chinese.”
• Wellington (2006): “we found examples”, “at least 5% of the Chinese/English sentence pairs”.

Weakness
• No strong mechanism determining which order is better, inverted or straight.
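To make the coverage question concrete, the sketch below enumerates every reordering of four source positions that can be built from binary straight and inverted combinations. It is a brute-force illustration for tiny inputs, not a parser, and the function name is an assumption of this example.

```python
from itertools import permutations

def itg_permutations(n):
    """Orderings of range(n) reachable with binary straight/inverted rules."""
    def build(lo, hi):
        # All target orders of the source span lo..hi-1.
        if hi - lo == 1:
            return {(lo,)}
        results = set()
        for mid in range(lo + 1, hi):
            for left in build(lo, mid):
                for right in build(mid, hi):
                    results.add(left + right)   # straight: keep block order
                    results.add(right + left)   # inverted: swap the two blocks
        return results
    return build(0, n)

reachable = itg_permutations(4)
unreachable = sorted(set(permutations(range(4))) - reachable)
print(len(reachable), unreachable)
```

For four items, 22 of the 24 orderings are reachable; the two excluded ones are the "inside-out" patterns (1, 3, 0, 2) and (2, 0, 3, 1), which is the kind of alignment the coverage debate above is about.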


Chiang’05: Hierarchical Phrase-based Model (HPM)

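Rules: $X \rightarrow \langle \gamma, \alpha, \sim \rangle$, for example $X \rightarrow \langle \text{yu } X_1 \text{ you } X_2,\ \text{have } X_2 \text{ with } X_1 \rangle$

Glue rules: $S \rightarrow \langle S_1 X_2,\ S_1 X_2 \rangle$ and $S \rightarrow \langle X_1,\ X_1 \rangle$

Model: log-linear

Decoder: CKY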
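A hierarchical rule is applied by substituting already-translated pairs for its linked non-terminals on both sides at once. The sketch below shows this for the "yu X1 you X2" rule; the list encoding (integers for linked non-terminals), the function name, and the two filler pairs are assumptions made for illustration.

```python
def apply_rule(rule, fillers):
    """Fill the linked non-terminals of a synchronous rule.

    rule    : (source_side, target_side); an int i stands for X_i.
    fillers : {i: (source_string, target_string)} for each X_i.
    """
    src_side, tgt_side = rule
    src = [fillers[s][0] if isinstance(s, int) else s for s in src_side]
    tgt = [fillers[s][1] if isinstance(s, int) else s for s in tgt_side]
    return " ".join(src), " ".join(tgt)

# X -> <yu X1 you X2, have X2 with X1>
rule = (["yu", 1, "you", 2], ["have", 2, "with", 1])
# Hypothetical sub-translations for X1 and X2.
fillers = {
    1: ("Beihan", "North Korea"),
    2: ("bangjiao", "diplomatic relations"),
}

pair = apply_rule(rule, fillers)
print(pair)
```

The two non-terminals swap positions on the target side, so a single rule captures a reordering that a flat phrase pair would have to memorize word by word.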


Chiang’05: rule extraction


Chiang’05: rule extraction restrictions

Initial (base) rules: at most 15 words on the French side

Final rules: at most 5 symbols (terminals plus non-terminals) on the French side

At most two non-terminals on each side, nonadjacent

At least one aligned terminal pair
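These restrictions amount to a filter over candidate rules. A minimal sketch follows, assuming the same list encoding as a rule with integers for non-terminals; the function names and parameters are hypothetical, and the adjacency check is applied to the source side only, which is one reading of "nonadjacent".

```python
def is_nonterminal(sym):
    return isinstance(sym, int)

def has_adjacent_nonterminals(side):
    return any(is_nonterminal(a) and is_nonterminal(b)
               for a, b in zip(side, side[1:]))

def keep_rule(src, tgt, aligned_terminal_pairs, initial_src_len,
              max_initial_len=15, max_rule_len=5, max_nonterminals=2):
    """Return True if a candidate rule passes the extraction restrictions."""
    if initial_src_len > max_initial_len:       # initial phrase too long
        return False
    if len(src) > max_rule_len:                 # final rule too long
        return False
    if sum(map(is_nonterminal, src)) > max_nonterminals:
        return False
    if sum(map(is_nonterminal, tgt)) > max_nonterminals:
        return False
    if has_adjacent_nonterminals(src):          # assumption: source side only
        return False
    return len(aligned_terminal_pairs) >= 1     # at least one aligned word pair

ok = keep_rule(["yu", 1, "you", 2], ["have", 2, "with", 1],
               aligned_terminal_pairs={(0, 2)}, initial_src_len=4)
adjacent = keep_rule([1, 2, "de"], ["the", 2, 1],
                     aligned_terminal_pairs={(2, 0)}, initial_src_len=3)
print(ok, adjacent)
```

The limits keep the extracted grammar small and help avoid spurious ambiguity; the defaults simply mirror the numbers listed above.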


Chiang’05: Model

• Log-linear form

• and
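A log-linear model scores each candidate as a weighted sum of log feature values and picks the highest score. The sketch below illustrates only that general form; the feature names, weights, and values are hypothetical and are not taken from Chiang's model.

```python
import math

def loglinear_score(features, weights):
    """Weighted sum of log feature values (the log of a weighted product)."""
    return sum(weights[name] * math.log(value)
               for name, value in features.items())

# Hypothetical feature weights (lambdas) and two competing candidates.
weights = {"p_tgt_given_src": 1.0, "p_src_given_tgt": 0.5, "lm": 1.0}
candidates = {
    "a": {"p_tgt_given_src": 0.5, "p_src_given_tgt": 0.4, "lm": 0.01},
    "b": {"p_tgt_given_src": 0.3, "p_src_given_tgt": 0.4, "lm": 0.05},
}

scores = {name: loglinear_score(f, weights) for name, f in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Working in log space turns the product of weighted features into a sum, which is numerically safer and makes the weights easy to tune.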


Chiang’05: decoder


SSMT based on phrase structures

Using grammars with linguistic knowledge

• The grammars are based on SCFG

Two categories:

• Tree-string
  – Tree-to-string
  – String-to-tree

• Tree-tree


Yamada & Knight 2001, 2003


Yamada’s work vs. SCFG

Insertion operation:

• A → (w A1, A1)

Reordering operation:

• A → (A1 A2 A3, A1 A3 A2)

Translating operation:

• A → (x, y)


Yamada: weakness

Single-level mapping

• Multi-level reordering
  – Yamada: flatten

Word-based

• Yamada: phrasal leaf


Galley et al. 2004, 2006

translation model incorporates syntactic structure on the target language side

• trained by learning “translation rules” from bilingual data

the decoder uses a parser-like method to create syntactic trees as output hypotheses


Translation rules


• Target: multi-level subtrees

• Source: continuous or discontinuous phrases

Types of translation rules

• Translating source phrases into target chunks
  – NPB(PRP/I) ↔ 我
  – NP-C(NPB(DT/this NN/address)) ↔ 这个 地址


Types of translation rules

Have variables
• NP-C(NPB(PRP$/my x0:NN)) ↔ 我 的 x0
• PP(TO/to NP-C(NPB(x0:NNS NNP/park))) ↔ 去 x0 公园

Combine previously translated results together
• VP(x0:VBZ x1:NP-C) ↔ x1 x0
  – takes a noun phrase followed by a verb, switches their order, then combines them into a new verb phrase
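The Chinese side of such a rule can be read as a template: constants are copied and each variable is replaced by the words already built for it. A minimal sketch, assuming a flat token-list encoding; the function name and the verb placeholder are illustrative and not part of the rule format above.

```python
def instantiate(chinese_side, bindings):
    """Replace variables (x0, x1, ...) with the word lists bound to them."""
    out = []
    for tok in chinese_side:
        out.extend(bindings.get(tok, [tok]))   # constants pass through
    return out

# Chinese side of NP-C(NPB(PRP$/my x0:NN)) <-> 我 的 x0
possessive = ["我", "的", "x0"]
# Chinese side of VP(x0:VBZ x1:NP-C) <-> x1 x0
reorder = ["x1", "x0"]

# Bind x0 to the noun 地址 ("address", from the earlier example rule).
np_words = instantiate(possessive, {"x0": ["地址"]})
# "<verb>" is a placeholder for whatever the verb was paired with.
vp_words = instantiate(reorder, {"x0": ["<verb>"], "x1": np_words})
print(np_words, vp_words)
```

Rules with variables compose: the output of one instantiation becomes the binding for a variable in the next, which is how larger constituents are assembled from smaller ones.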


Rule extraction

Word-align a parallel corpus

Parse the target side

Extract translation rules

• Minimal rules: cannot be decomposed

• Composed rules: composed of minimal rules

Estimate probabilities


Rule extraction

Minimal rule


Composed rules


Format is Expressive

• Multilevel Re-Ordering: S(x0:NP VP(x1:VB x2:NP2)) → x1, x0, x2
• Non-constituent Phrases: S(PRO(there) VP(VB(are) x0:NP)) → hay, x0
• Lexicalized Re-Ordering: NP(x0:NP PP(P(of) x1:NP)) → x1, , x0
• Phrasal Translation: VP(VBZ(is) VBG(singing)) → está, cantando
• Non-contiguous Phrases: VP(VB(put) x0:NP PRT(on)) → poner, x0
• Context-Sensitive Word Insertion: NPB(DT(the) x0:NNS) → x0

[Knight & Graehl, 2005]


decoder

probabilistic CYK-style parsing algorithm with beams

results in an English syntax tree corresponding to the Chinese sentence

guarantees the output to have some kind of globally coherent syntactic structure


Decoding example


Marcu et al. 2006

SPMT

• Integrating non-syntactifiable phrases

• Multiple features for each rule

• Decoding with multiple models


SSMT based on phrase structures

Two categories:

• Tree-string
  – String-to-tree
  – Tree-to-string

• Tree-tree


Tree-to-string

Liu et al. 2006

• Tree-to-string alignment template model


TAT

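[Figure: tree-to-string alignment templates, each pairing a source subtree with a target string:
NP(NR 布什, NN 总统) ↔ President Bush
LCP(NP(NR 美国, CC 和, NR), LC 间) ↔ between United States and
NP(DNP(NP, DEG), NP) (target side not shown)]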


TAT: extraction

Constraints

• Source trees have to be subtrees

• Have to be consistent with word alignment

Restrictions on extraction

• Both the first and last symbols in the target string must be aligned to some source symbols

• The height of T(z) is limited to no greater than h

• The number of direct descendants of a node of T(z) is limited to no greater than c
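The last two restrictions are simple structural checks on the source subtree T(z). A minimal sketch, assuming trees are encoded as (label, children) tuples and that height counts the nodes on the longest root-to-leaf path; both conventions, and the function names, are assumptions of this example.

```python
def height(tree):
    """Number of nodes on the longest root-to-leaf path."""
    label, children = tree
    if not children:
        return 1
    return 1 + max(height(c) for c in children)

def max_branching(tree):
    """Largest number of direct descendants of any node in the tree."""
    label, children = tree
    if not children:
        return 0
    return max(len(children), max(max_branching(c) for c in children))

def within_limits(tree, h, c):
    return height(tree) <= h and max_branching(tree) <= c

# NP(NR(布什) NN(总统)), with words as leaf nodes.
tree = ("NP", [("NR", [("布什", [])]), ("NN", [("总统", [])])])
print(height(tree), max_branching(tree), within_limits(tree, h=3, c=2))
```

Tightening h and c shrinks the template table at the cost of losing larger, more context-specific templates.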


TAT: Model


Decoding


Tree-to-string vs. string-to-tree

Tree-to-string

• Integrating source structures into translation and reordering

• The output is not guaranteed to be grammatical

String-to-tree

• Guarantees the output to have some kind of globally coherent syntactic structure

• Cannot use any knowledge from source structures


SSMT based on phrase structures

Two categories:

• Tree-string
  – String-to-tree
  – Tree-to-string

• Tree-tree


Tree-Tree

Synchronous tree-adjoining grammar (STAG)

Synchronous tree substitution grammar (STSG)


STAG


STAG: derivation


STSG

[Figure: aligned French and English trees built from paired elementary trees: enfants (“kids”) ↔ kids; beaucoup (“lots”) d’ (“of”) NP ↔ NP; donnent (“give”) un (“a”) baiser (“kiss”) à (“to”) ↔ kiss; Sam ↔ Sam; null ↔ quite; null ↔ often]


STSG: elementary trees

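[Figure: the elementary tree pairs shown separately: Sam ↔ Sam (NP); enfants (“kids”) ↔ kids (NP); beaucoup (“lots”) d’ (“of”) NP ↔ NP; donnent (“give”) un (“a”) baiser (“kiss”) à (“to”) ↔ kiss (Start, with NP, NP and null/Adv slots); null ↔ quite (Adv)]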


Dependency structures

外商 投资 企业 成为 中国 外贸 重要 增长点

NN NN NN VV NR NN JJ NN

[Figure: (a) phrase-structure tree (NP, ADJP, VP, IP nodes) and (b) dependency tree rooted at 成为 for the sentence above]


For MT: dependency structures vs. phrase structures

Advantages of dependency structures over phrase structures for machine translation

• Inherent lexicalization

• Meaning-relative

• Better representation of divergences across languages


SSMT based on dependency structures

Lin 2004

• A Path-based Transfer Model for Machine Translation

Quirk et al. 2005

• Dependency Treelet Translation: Syntactically Informed Phrasal SMT

Ding et al. 2005

• Machine Translation Using Probabilistic Synchronous Dependency Insertion Grammars


Lin 2004

Translation model trained by learning transfer rules from bilingual corpus where the source language sentences are parsed.

decoding: finding the minimum path covering of the source language dependency tree


Lin 2004: path


Lin 2004: transfer rule


Quirk et al. 2005

Translation model trained by learning treelet pairs from bilingual corpus where the source language sentences are parsed.

Decoding: CKY-style


Treelet pairs


Quirk 2005: decoding


Ding 2005


summary

[Figure: translation pyramid between Source Language and Target Language, with levels from bottom to top: Word; String (phrase or chunk); Tree (formal, phrase, or dependency structure); Semantic; Interlingua]


Introduction

Slides from G. Satta

State-of-the-art machine translation systems are based on statistical models rooted in the theory of formal grammars/automata

Translation models based on finite state devices cannot easily model translations between languages with strong differences in word ordering

Recently, several models based on context-free grammars have been investigated, borrowing from the theory of compilers the idea of synchronous rewriting


Introduction

Translation models based on synchronous rewriting:

• Inversion Transduction Grammars (Wu, 1997)
• Head Transducer Grammars (Alshawi et al., 2000)
• Tree-to-string models (Yamada & Knight, 2001; Galley et al., 2004)
• “Loosely tree-based” model (Gildea, 2003)
• Multi-Text Grammars (Melamed, 2003)
• Hierarchical phrase-based model (Chiang, 2005)

We use synchronous CFGs to study formal properties of all these


Synchronous CFG

A synchronous context-free grammar (SCFG) is based on three components:

Context free grammar (CFG) for source language

CFG for target language

Pairing relation on the productions of the two grammars and on the nonterminals in their right-hand sides


Synchronous CFG

Example (Yamada & Knight, 2001):

[ VB → PRP(1) VB1(2) VB2(3) , VB → PRP(1) VB2(3) VB1(2) ]
[ VB2 → VB(1) TO(2) , VB2 → TO(2) VB(1) ga ]
[ TO → TO(1) NN(2) , TO → NN(2) TO(1) ]
[ PRP → he , PRP → kare ha ]
[ VB1 → adores , VB1 → daisuki desu ]
[ VB → listening , VB → kiku no ]
[ TO → to , TO → wo ]
[ NN → music , NN → ongaku ]


Synchronous CFG

Example (cont’d):

[Figure: the paired derivation trees for the grammar above, yielding “he adores listening to music” and “kare ha ongaku wo kiku no ga daisuki desu”]


Synchronous CFG

A pair of CFG productions in a SCFG is called a synchronous production

A SCFG generates pairs of trees/strings, where each component is a translation of the other

A SCFG can be extended with probabilities:

Each pair of productions is assigned a probability

Probability of a pair of trees is the product of probabilities of synchronous productions involved
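The Yamada & Knight example grammar can be run as a synchronous derivation that emits both strings and multiplies the rule probabilities. In the sketch below the rule names, the dictionary encoding, and all probability values are assumptions made for illustration, and non-terminal labels are not checked when sub-derivations are plugged in.

```python
# rule name -> (probability, source side, target side); an int is a link index.
RULES = {
    "top":       (0.5, [1, 2, 3], [1, 3, 2]),      # VB  -> PRP(1) VB1(2) VB2(3)
    "vb2":       (1.0, [1, 2], [2, 1, "ga"]),      # VB2 -> VB(1) TO(2)
    "to_nn":     (0.5, [1, 2], [2, 1]),            # TO  -> TO(1) NN(2)
    "he":        (1.0, ["he"], ["kare", "ha"]),
    "adores":    (1.0, ["adores"], ["daisuki", "desu"]),
    "listening": (0.5, ["listening"], ["kiku", "no"]),
    "to":        (0.5, ["to"], ["wo"]),
    "music":     (1.0, ["music"], ["ongaku"]),
}

def expand(derivation):
    """Return (source words, target words, probability) of a derivation."""
    name, children = derivation
    prob, src_side, tgt_side = RULES[name]
    parts = {link: expand(child) for link, child in children.items()}
    for _, _, p in parts.values():
        prob *= p                       # product over synchronous productions
    src = [w for s in src_side
           for w in (parts[s][0] if isinstance(s, int) else [s])]
    tgt = [w for s in tgt_side
           for w in (parts[s][1] if isinstance(s, int) else [s])]
    return src, tgt, prob

derivation = ("top", {
    1: ("he", {}),
    2: ("adores", {}),
    3: ("vb2", {1: ("listening", {}),
                2: ("to_nn", {1: ("to", {}), 2: ("music", {})})}),
})

src, tgt, prob = expand(derivation)
print(" ".join(src), "|", " ".join(tgt), "|", prob)
```

Each link index appears once on each side of a rule, so a single sub-derivation fills both positions; that shared substitution is what keeps the two trees synchronized.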


Membership

The membership problem (Wu, 1997) for SCFGs is defined as follows:

Input: SCFG and pair of strings [w1, w2 ]

Output: Yes/No depending on whether w1 translates into w2 under the SCFG

Applications in segmentation, word alignment and bracketing of parallel corpora

Assumption that SCFG is part of the input is made here to investigate the dependency of problem complexity on grammar size


Membership

Result: Membership problem for SCFGs is NP-complete

Proof uses SCFG derivations to explore space of consistent truth assignments that satisfy source 3SAT instance

Remarks:

Result transfers to (Yamada & Knight, 2001), (Gildea, 2003), (Melamed, 2003), which are at least as powerful as SCFG


Membership

Remarks (cont’d):

Problem can be solved in polynomial time if:

• input grammar is fixed or production length is bounded (Melamed, 2004)

• Inversion Transduction Grammars (Wu, 1997)

• Head Transducer Grammars (Alshawi et al., 2000)

For NLP applications, it is more realistic to assume a fixed grammar and varying input string


Chart parsing

Providing an exponential time lower bound for the membership problem would amount to showing P ≠ NP

But we can show such a lower bound if we make some assumptions on the class of algorithms and data structures that we use to solve the problem

Result: If chart parsing techniques are used to solve the membership problem for SCFG, a number of partial analyses is obtained that grows exponentially with the production length of the input grammar


Chart parsing

Chart parsing for CFGs works by combining completed constituents with partial analyses:

A → B1 B2 B3 … Bn

Three indices are used to process each combination, for a total of O(n^3) possible combinations that must be checked, where n is the length of the input string

[Figure: a partial analysis covering B1 B2 B3 is combined with a completed constituent B4]
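The three indices are easiest to see in a CKY recognizer, used here as a stand-in for general chart parsing: it assumes a grammar already in binary form, and the toy grammar and function name are assumptions of this sketch. The loop variables i, k, j are the three indices, and the counter records how many (i, k, j) combinations are examined.

```python
def cky_recognize(words, lexical, binary, start="S"):
    """CKY recognition for a binary grammar; also counts (i, k, j) triples."""
    n = len(words)
    chart = {}
    checked = 0
    for i, w in enumerate(words):
        chart[(i, i + 1)] = {a for a, word in lexical if word == w}
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):          # split point between i and j
                checked += 1
                for a, b, c in binary:
                    if b in chart[(i, k)] and c in chart[(k, j)]:
                        cell.add(a)
            chart[(i, j)] = cell
    return start in chart[(0, n)], checked

# Toy grammar (illustrative): S -> NP VP, VP -> V NP.
lexical = [("NP", "he"), ("V", "adores"), ("NP", "music")]
binary = [("S", "NP", "VP"), ("VP", "V", "NP")]

accepted, checked = cky_recognize("he adores music".split(), lexical, binary)
print(accepted, checked)
```

For an input of length n the counter equals (n^3 - n) / 6, which is where the cubic bound comes from.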


Chart parsing

Consider the synchronous production:

[ A → B(1) B(2) B(3) B(4) , A → B(3) B(1) B(4) B(2) ]

representing the permutation:

B(1) B(2) B(3) B(4)
B(3) B(1) B(4) B(2)


Chart parsing

When applying chart parsing, there is no way to keep partial analyses “contiguous”:

[Figure: partial analyses over B(1) B(2) B(3) B(4) that are contiguous on one side but not on the other]
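The loss of contiguity can be checked mechanically: for each pair of constituents that are neighbours on the source side, test whether they are also neighbours on the target side. A small sketch with a hypothetical helper name:

```python
def contiguous_adjacent_pairs(target_order):
    """Source-adjacent pairs (b, b+1) that are also adjacent in the target.

    target_order lists the source indices in target order,
    e.g. [3, 1, 4, 2] for B(3) B(1) B(4) B(2).
    """
    pos = {b: p for p, b in enumerate(target_order)}
    pairs = []
    for b in range(1, len(target_order)):
        if abs(pos[b] - pos[b + 1]) == 1:
            pairs.append((b, b + 1))
    return pairs

print(contiguous_adjacent_pairs([3, 1, 4, 2]))   # the permutation above
print(contiguous_adjacent_pairs([2, 1, 4, 3]))   # a friendlier permutation
```

For the permutation 3 1 4 2 the result is empty, so any first combination of two source neighbours already leaves a gap on the target side and the partial analysis needs extra indices to describe it.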


Chart parsing

The proof of our result generalizes the previous observations

We show that, for some worst case permutations of length q, any combination strategy we choose leads to a number of indices growing with order at least sqrt(q)

Then for SCFGs of size q, sqrt(q) is an asymptotic lower bound for the membership problem when chart parsing algorithms are used


Translation

A probabilistic SCFG provides the probability that tree t1 translates into tree t2:

$\Pr([t_1, t_2])$

Accordingly, we can define the probability that string w1 translates into string w2:

$\Pr([w_1, w_2]) = \sum_{t_1 \Rightarrow w_1,\; t_2 \Rightarrow w_2} \Pr([t_1, t_2])$

and the probability that string w translates into tree t:

$\Pr([w, t]) = \sum_{t_1 \Rightarrow w} \Pr([t_1, t])$


Translation

The string-to-tree translation problem for probabilistic SCFGs is defined as follows:

Input: Probabilistic SCFG and string w

Output: tree t such that Pr([w, t ]) is maximized

Application in machine translation

Again, assumption that SCFG is part of the input is made to investigate the dependency of problem complexity on grammar size


Translation

Result: string-to-tree translation problem for probabilistic SCFGs (summing over possible source trees) is NP-hard

Proof reduces from consensus problem:

• Strings generated by probabilistic finite automaton or hidden Markov model have probabilities defined as sum of probabilities of several paths

• Maximizing such summation is NP-hard (Casacuberta & Higuera, 2000) (Lyngso & Pedersen, 2002)


Translation

Remarks:

• Source of complexity of the problem comes from the fact that several source trees can be translated into the same target tree

• Result persists if there is a constant bound on length of synchronous productions

• Open: can the problem be solved in polynomial time if probabilistic SCFG is fixed?


Learning Non-Isomorphic Tree Mappings for Machine Translation

Slides from J. Eisner

[Figure: dependency trees A and B for “wrongly report events to-John” and “him misinform of the events”, annotated: 2 words become 1; reorder dependents; 0 words become 1 (twice)]


Syntax-Based Machine Translation

Previous work assumes essentially isomorphic trees

• Wu 1995, Alshawi et al. 2000, Yamada & Knight 2000

But trees are not isomorphic!

• Discrepancies between the languages

• Free translation in the training data

[Figure: trees A and B for “wrongly report events to-John” and “him misinform of the events”]


Synchronous Tree Substitution Grammar

[Figure: Two training trees, showing a free translation from French to English: “beaucoup d’enfants donnent un baiser à Sam” (word glosses: lots, of, kids, give, a, kiss, to, Sam) and “kids kiss Sam quite often”.]


Synchronous Tree Substitution Grammar

[Figure: Two training trees, showing a free translation from French to English (“beaucoup d’enfants donnent un baiser à Sam”, “kids kiss Sam quite often”). A possible alignment is shown in orange.]


Synchronous Tree Substitution Grammar

[Figure: the same two training trees (“beaucoup d’enfants donnent un baiser à Sam”, “kids kiss Sam quite often”) with a much worse alignment.]

Page 90: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

Synchronous Tree Substitution Grammar

Two training trees, showing a free translation from French to English. A possible alignment is shown in orange. The alignment shows how the trees are generated synchronously from “little trees” ...

[Figure: the aligned trees pulled apart into little tree pairs: a Start pair donnent (“give”) un (“a”) baiser (“kiss”) à (“to”) / kiss; NP pairs enfants (“kids”) / kids, Sam / Sam, and beaucoup (“lots”) d’ (“of”); Adv pairs null / quite and null / often.]

French: “beaucoup d’enfants donnent un baiser à Sam”

English: “kids kiss Sam quite often”

Page 91: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

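Grammar = Set of Elementary Trees

[Figure: elementary tree pairs, French / English:

• Start: donnent (“give”) un (“a”) baiser (“kiss”) à (“to”) / kiss, with frontier nodes labeled NP, NP, and null / Adv (idiomatic translation)

• NP: enfants (“kids”) / kids

• NP: Sam / Sam]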

Page 92: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

Grammar = Set of Elementary Trees

[Figure: the same elementary tree pairs as on the previous slide: the Start pair donnent (“give”) un (“a”) baiser (“kiss”) à (“to”) / kiss (idiomatic translation), and the NP pairs enfants (“kids”) / kids and Sam / Sam.]

Page 93: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

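Grammar = Set of Elementary Trees

[Figure: elementary tree pairs, French / English:

• Start: donnent (“give”) un (“a”) baiser (“kiss”) à (“to”) / kiss, with frontier nodes labeled NP, NP, and null / Adv

• NP: Sam / Sam

• NP: enfants (“kids”) / kids

• NP: beaucoup (“lots”) d’ (“of”) above an NP frontier node / a bare NP node]

“beaucoup d’” deletes inside the tree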

Page 94: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

Grammar = Set of Elementary Trees

[Figure: the same elementary tree pairs as on the previous slide, including the NP pair for beaucoup (“lots”) d’ (“of”).]

“beaucoup d’” deletes inside the tree

Page 95: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

Grammar = Set of Elementary Trees

[Figure: the elementary tree pairs again: the Start pair donnent (“give”) un (“a”) baiser (“kiss”) à (“to”) / kiss, the NP pairs Sam / Sam and enfants (“kids”) / kids, and the NP pair for beaucoup (“lots”) d’ (“of”).]

“beaucoup d’” matches nothing in English

Page 96: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

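Grammar = Set of Elementary Trees

[Figure: the full set of elementary tree pairs, French / English:

• Start: donnent (“give”) un (“a”) baiser (“kiss”) à (“to”) / kiss, with frontier nodes labeled NP, NP, and null / Adv

• NP: Sam / Sam

• NP: enfants (“kids”) / kids

• NP: beaucoup (“lots”) d’ (“of”) above an NP frontier node / a bare NP node

• Adv: null / quite

• Adv: null / often]

adverbial subtree matches nothing in French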
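The way these elementary tree pairs combine can be sketched in a few lines of Python. This is only an illustration under strong assumptions: each side of a little tree pair is flattened to a sequence of words and numbered substitution sites (the slides use real trees), and the names `LittleTreePair`, `Derivation`, and `yield_side` are invented for this sketch. A site that occurs on only one side models material that matches nothing in the other language.

```python
from dataclasses import dataclass, field
from typing import List, Union

Item = Union[str, int]  # a word, or the index of a substitution site


@dataclass
class LittleTreePair:
    """One elementary tree pair, flattened to word/site sequences (a simplification)."""
    state: str                                       # label of the aligned root pair
    src: List[Item]                                  # French side
    tgt: List[Item]                                  # English side
    slots: List[str] = field(default_factory=list)   # labels of the substitution sites


@dataclass
class Derivation:
    """A little tree pair plus one sub-derivation per substitution site."""
    pair: LittleTreePair
    children: List["Derivation"] = field(default_factory=list)

    def yield_side(self, side: str) -> List[str]:
        """Read off the words of one side ("src" or "tgt") of the derived tree pair."""
        items = self.pair.src if side == "src" else self.pair.tgt
        words: List[str] = []
        for item in items:
            if isinstance(item, int):
                child = self.children[item]
                # A site may only be filled by a pair whose root has the same label.
                assert child.pair.state == self.pair.slots[item]
                words.extend(child.yield_side(side))
            else:
                words.append(item)
        return words


# Elementary tree pairs from the slides (word order inside each pair is assumed).
START = LittleTreePair("Start", [0, "donnent", "un", "baiser", "à", 1],
                       [0, "kiss", 1, 2], ["NP", "NP", "Adv"])  # site 2 is null in French
LOTS = LittleTreePair("NP", ["beaucoup", "d’", 0], [0], ["NP"])  # "beaucoup d’" deletes
KIDS = LittleTreePair("NP", ["enfants"], ["kids"])
SAM = LittleTreePair("NP", ["Sam"], ["Sam"])
OFTEN = LittleTreePair("Adv", [], [0, "often"], ["Adv"])         # nothing on the French side
QUITE = LittleTreePair("Adv", [], ["quite"])

derivation = Derivation(START, [
    Derivation(LOTS, [Derivation(KIDS)]),
    Derivation(SAM),
    Derivation(OFTEN, [Derivation(QUITE)]),
])

print(derivation.yield_side("src"))
print(derivation.yield_side("tgt"))
```

Reading off both sides of the same derivation is what "generated synchronously" means here: one sequence of substitutions builds the French and the English tree together.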

Page 97: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

Probability model similar to PCFG

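Probability of generating training trees T1, T2 with alignment A:

$$P(T_1, T_2, A) = \prod p(t_1, t_2, a \mid n)$$

that is, the product of the probabilities of the “little” trees that are used.

Each p(t1, t2, a | n) is given by a maximum entropy model.

[Figure: an example factor p(little tree pair | VP), where the little tree pair is rooted at an aligned VP node pair and contains wrongly, report, and misinform, with two NP frontier nodes.]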
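As with a PCFG, the probability of one aligned tree pair is the product of the probabilities of the little tree pairs it uses. A minimal sketch, assuming the little tree probabilities are already known: the numbers below are invented for illustration and are not from the slides, and `derivation_log_prob` is a hypothetical helper name.

```python
import math
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (root label n, description of the little tree pair)

# Made-up values of p(t1, t2, a | n) for the little tree pairs of the running example.
PROBS: Dict[Key, float] = {
    ("Start", "donnent un baiser à / kiss"): 0.01,
    ("NP", "beaucoup d’ NP / NP"): 0.05,
    ("NP", "enfants / kids"): 0.2,
    ("NP", "Sam / Sam"): 0.3,
    ("Adv", "null / often"): 0.1,
    ("Adv", "null / quite"): 0.1,
}


def derivation_log_prob(used: List[Key], probs: Dict[Key, float]) -> float:
    """log P(T1, T2, A): sum of log-probabilities of the little tree pairs used."""
    return sum(math.log(probs[key]) for key in used)


used = list(PROBS)  # this derivation uses each pair exactly once
log_p = derivation_log_prob(used, PROBS)
print(log_p, math.exp(log_p))
```

Working in log space avoids underflow when a derivation uses many little trees.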

Page 98: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

Maxent model of little tree pairs

[Figure: the factor p(little tree pair | VP), where the little tree pair is rooted at an aligned VP node pair and contains wrongly, report, and misinform, with two NP frontier nodes.]

FEATURES

• report+wrongly ↔ misinform? (use dictionary)

• report ↔ misinform? (at root)

• wrongly ↔ misinform?

• verb incorporates adverb child?

• verb incorporates child 1 of 3?

• children 2, 3 switch positions?

• common tree sizes & shapes?

• ... etc. ...
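A maximum entropy (log-linear) model turns features like these into a probability by exponentiating a weighted feature sum and normalizing. The sketch below assumes a small, explicitly enumerated candidate set for one root label; the feature names, weights, and candidates are all hypothetical, and a real model would normalize over every little tree pair it allows.

```python
import math
from typing import Dict, List


def maxent_probs(candidates: Dict[str, List[str]],
                 weights: Dict[str, float]) -> Dict[str, float]:
    """p(candidate) proportional to exp(sum of the weights of its active features)."""
    scores = {name: sum(weights.get(f, 0.0) for f in feats)
              for name, feats in candidates.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(s - top) for name, s in scores.items()}
    z = sum(exps.values())
    return {name: e / z for name, e in exps.items()}


# Hypothetical binary features in the spirit of the slide's feature list.
weights = {
    "dict:report+wrongly~misinform": 2.0,
    "root:report~misinform": 0.5,
    "verb_incorporates_adverb_child": 1.0,
    "children_2_3_switch": 0.3,
}

# Hypothetical candidate little tree pairs rooted at an aligned VP pair.
candidates = {
    "wrongly report / misinform": ["dict:report+wrongly~misinform",
                                   "root:report~misinform",
                                   "verb_incorporates_adverb_child",
                                   "children_2_3_switch"],
    "report / misinform": ["root:report~misinform"],
    "report / null": [],
}

probs = maxent_probs(candidates, weights)
print(probs)
```

Because every little tree pair is scored through shared features, a pair never seen in training can still receive a sensible probability.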

Page 99: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

Inside Probabilities

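[Figure: the non-isomorphic tree pair from the earlier slide (wrongly, report, events, to-John on one side; misinform, him, of, the, events on the other; schematic nodes a, b, A, B), with the aligned VP node pair (report, misinform) highlighted.]

$$\beta(\text{report}, \text{misinform} : \text{VP}) = \ldots + p(\text{little tree pair} \mid \text{VP}) \cdot \beta(\cdot) \cdot \beta(\cdot) + \ldots$$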

Page 100: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

Inside Probabilities

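[Figure: the same tree pair, now with the aligned NP node pairs below the VP pair marked.]

$$\beta(\text{report}, \text{misinform} : \text{VP}) = \ldots + p(\text{little tree pair} \mid \text{VP}) \cdot \beta(\text{events}, \text{of} : \text{NP}) \cdot \beta(\text{to-John}, \text{him} : \text{NP}) + \ldots$$

Here the little tree pair contains wrongly, report, and misinform, with two NP frontier nodes.

only $O(n^2)$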
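The recurrence can be sketched as a memoized sum over the little tree pairs that could be rooted at a given node pair. Everything concrete here is an assumption made for illustration: the `OPTIONS` table, its probabilities, and the second VP option are invented, and a real implementation would enumerate the candidate little tree pairs from the two trees rather than read them from a table.

```python
import math
from functools import lru_cache
from typing import Dict, List, Tuple

NodePair = Tuple[str, str]               # (node in T1, node in T2)
Option = Tuple[float, List[NodePair]]    # (p(little tree pair | n), frontier node pairs)

# Hypothetical table: for each aligned node pair, the little tree pairs that could
# be rooted there, with their probability and the node pairs left at the frontier.
OPTIONS: Dict[NodePair, List[Option]] = {
    ("report", "misinform"): [
        (0.2, [("events", "of"), ("to-John", "him")]),  # "wrongly report" / "misinform"
        (0.05, []),                                     # one big pair covering everything
    ],
    ("events", "of"): [(0.5, [])],
    ("to-John", "him"): [(0.4, [])],
}


@lru_cache(maxsize=None)
def inside(node_pair: NodePair) -> float:
    """beta(n): total probability of all ways to generate the subtrees below n."""
    return sum(p * math.prod(inside(child) for child in frontier)
               for p, frontier in OPTIONS.get(node_pair, []))


beta_vp = inside(("report", "misinform"))
print(beta_vp)
```

Memoizing on the node pair means each pair is evaluated once, which is why the number of table entries is bounded by the number of node pairs.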

Page 101: Transfer-based MT. Syntactic Transfer-based Machine Translation Direct and Example-based approaches – Two ends of a spectrum – Recombination of fragments

An MT Architecture

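[Figure: block diagram with four components.]

• Probability Model p(t1,t2,a) of Little Trees: scores a little tree pair; proposes translations t2 of a little tree t1.

• Dynamic programming engine: shared by the trainer and the decoder; passes little tree pairs to the probability model for scoring.

• Trainer: scores all alignments of two big trees T1, T2. It sends each possible (t1,t2,a) to the model, and the inside-outside estimated counts are used to update the parameters.

• Decoder: scores all alignments between a big tree T1 and a forest of big trees T2. For each possible t1, the model proposes various (t1,t2,a), and each proposed (t1,t2,a) is scored. The Viterbi alignment yields the output T2.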
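The trainer and the decoder can share one dynamic programming engine because they differ mainly in how alternatives are combined: the trainer sums over them, while Viterbi decoding keeps the best one. A minimal sketch of the Viterbi side, assuming a hypothetical table of options per node pair; it aligns two fixed trees, a simplification of the forest of candidate T2 trees on the slide, and all names and numbers are invented.

```python
import math
from typing import Dict, List, Optional, Tuple

NodePair = Tuple[str, str]
Option = Tuple[str, float, List[NodePair]]  # (pair name, probability, frontier node pairs)
Best = Tuple[float, List[str]]              # (best score, little tree pairs used)


def viterbi(node_pair: NodePair,
            options: Dict[NodePair, List[Option]],
            memo: Optional[Dict[NodePair, Best]] = None) -> Best:
    """Best-scoring way to generate the subtrees below node_pair (max instead of sum)."""
    if memo is None:
        memo = {}
    if node_pair in memo:
        return memo[node_pair]
    best: Best = (0.0, [])
    for name, prob, frontier in options.get(node_pair, []):
        score, used = prob, [name]
        for child in frontier:
            child_score, child_used = viterbi(child, options, memo)
            score *= child_score
            used = used + child_used
        if score > best[0]:
            best = (score, used)
    memo[node_pair] = best
    return best


# Hypothetical options; the probabilities are made up for illustration.
options: Dict[NodePair, List[Option]] = {
    ("report", "misinform"): [
        ("wrongly report / misinform", 0.2, [("events", "of"), ("to-John", "him")]),
        ("one big pair", 0.03, []),
    ],
    ("events", "of"): [("events / of the events", 0.5, [])],
    ("to-John", "him"): [("to-John / him", 0.4, [])],
}

best_score, best_pairs = viterbi(("report", "misinform"), options)
print(best_score, best_pairs)
```

The list of little tree pairs returned alongside the score is the alignment; reading off the T2 sides of those pairs gives the output tree.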