Page 1: Presentation

Translation Memory Retrieval Methods [Bloodgood and Strauss, 2014], in Proc. of the 14th EACL

Koichi Akabe and Philip Arthur

NAIST MT Study

2014-07-03

2014-07-03 Koichi Akabe and Philip Arthur (MT Study) 1 / 27

Page 2: Presentation

Introduction

Page 7: Presentation

Translation Memory (TM)

▶ The most widely used computer-assisted translation (CAT) tool

▶ Suggests a translation for a new sentence by reusing existing translations

Stored translations:

En The dog opened the door.

Ja 犬がドアを開けた。

En I saw a girl with a telescope.

Ja 僕は望遠鏡で少女を見た。

New sentence:

En John opened the door.

Ja 犬がドアを開けた。 (fuzzy match)

1. Find the nearest source sentence

2. Suggest its translation

3. Post-edit the suggestion

Page 9: Presentation

Translation Memory (TM)

▶ The most widely used computer-assisted translation (CAT) tool

▶ Suggests a translation for a new sentence by reusing existing translations

Stored translations:

En The dog opened the door.

Ja 犬がドアを開けた。

En I saw a girl with a telescope.

Ja 僕は望遠鏡で少女を見た。

New sentence:

En John opened the door.

Ja ジョンがドアを開けた。 (after post-editing)

1. Find the nearest source sentence

2. Suggest its translation

3. Post-edit the suggestion

Page 10: Presentation

How to find the nearest source sentence?

A TM finds the nearest source sentence using a similarity metric

▶ Edit distance (Levenshtein distance) −→ Widely used metric

▶ MT evaluation metrics [Simard and Fujita, 2012] −→ WER, BLEU, NIST, VMeteor, and Meteor used as TM metrics

▶ This paper −→ Compares several TM similarity metrics

Page 16: Presentation

Threshold of helpfulness

The matching algorithm always returns the nearest sentence

However, suggestions with low scores should not be shown

In practice, TM software sets the threshold at 70% −→ Why?
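
The retrieve-then-filter behaviour described on this slide can be sketched as follows. This is only an illustration under stated assumptions: the names `retrieve` and `unigram_coverage` are hypothetical, tokens are split on whitespace, and the default similarity is a simple placeholder rather than any particular tool's metric. The 0.70 default mirrors the 70% threshold mentioned above.

```python
from typing import Callable, List, Optional, Tuple

Tokens = List[str]


def unigram_coverage(m: Tokens, c: Tokens) -> float:
    """Placeholder similarity: share of the distinct words of m that also occur in c."""
    m_words = set(m)
    return len(m_words & set(c)) / len(m_words) if m_words else 0.0


def retrieve(workload: str,
             memory: List[Tuple[str, str]],
             metric: Callable[[Tokens, Tokens], float] = unigram_coverage,
             threshold: float = 0.70) -> Optional[Tuple[str, str, float]]:
    """Return (source, target, score) for the nearest entry, or None below the threshold."""
    m = workload.split()
    best: Optional[Tuple[str, str, float]] = None
    for source, target in memory:
        score = metric(m, source.split())
        if best is None or score > best[2]:
            best = (source, target, score)
    # The nearest sentence always exists, but it is only shown if it is similar enough.
    if best is None or best[2] < threshold:
        return None
    return best


# Hypothetical two-entry memory, reusing the example sentences from the slides.
memory = [
    ("The dog opened the door .", "犬がドアを開けた。"),
    ("I saw a girl with a telescope .", "僕は望遠鏡で少女を見た。"),
]
hit = retrieve("John opened the door .", memory)
```

Passing a different `metric` function swaps the similarity measure without changing the retrieval loop.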

Page 17: Presentation

Translation Memory Similarity Metrics


Page 18: Presentation

Definitions

TM similarity metrics compare M and C.

M: workload sentence

C: source-language side of a candidate pre-existing translation

En The dog opened the door .

Ja 犬がドアを開けた。

En I saw a girl with a telescope .

Ja 僕は望遠鏡で少女を見た。

En John opened the door .

Ja 犬がドアを開けた。 (fuzzy)

M = John opened the door .

C1 = The dog opened the door .

C2 = I saw a girl with a telescope .

...

Page 20: Presentation

Translation Memory Similarity Metrics

Compare the following metrics:

▶ Percent Match

▶ Weighted Percent Match

▶ Edit Distance

▶ N-gram Precision

▶ Weighted N-gram Precision

▶ Modified Weighted N-gram Precision


Page 21: Presentation

Percent Match (PM)

The simplest metric

$$\mathrm{PM}(M, C) = \frac{|M_{\text{unigrams}} \cap C_{\text{unigrams}}|}{|M_{\text{unigrams}}|}$$

e.g.

M = John opened the door .

C = The dog opened the door .

$$\mathrm{PM}(M, C) = \frac{4}{5} = 0.80$$
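
A minimal sketch of Percent Match, assuming whitespace tokenization and treating the unigrams of each sentence as a set (which is consistent with the 4/5 example on this slide). The function name `percent_match` is hypothetical.

```python
from typing import List


def percent_match(m: List[str], c: List[str]) -> float:
    """Share of the distinct unigrams of M that also appear in C."""
    m_unigrams, c_unigrams = set(m), set(c)
    if not m_unigrams:
        return 0.0  # assumption: an empty workload sentence scores 0
    return len(m_unigrams & c_unigrams) / len(m_unigrams)


M = "John opened the door .".split()
C = "The dog opened the door .".split()
pm = percent_match(M, C)  # "opened", "the", "door", "." are shared; "John" is not
```

Note that matching is case-sensitive here, so "The" and "the" count as different unigrams.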

Page 25: Presentation

Weighted Percent Match (WPM)

Rare words are the ones whose translations we most want to find

PM with IDF weighting

$$\mathrm{WPM}(M, C) = \frac{\sum_{u \in \{M_{\text{unigrams}} \cap C_{\text{unigrams}}\}} \mathrm{idf}(u, D)}{\sum_{u \in M_{\text{unigrams}}} \mathrm{idf}(u, D)}$$

where D is the set of all source sentences in the parallel corpus
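
A hedged sketch of WPM. The slide does not define idf, so this example assumes a smoothed form, log((|D| + 1) / (df(u) + 1)), chosen only so that words unseen in D (such as "John" here) do not cause a division by zero. The names `make_idf` and `weighted_percent_match` are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List


def make_idf(corpus: List[List[str]]) -> Callable[[str], float]:
    """Build idf(u, D) from tokenized source sentences D (smoothing is an assumption)."""
    n = len(corpus)
    df = Counter(word for sentence in corpus for word in set(sentence))

    def idf(word: str) -> float:
        return math.log((n + 1) / (df[word] + 1))

    return idf


def weighted_percent_match(m: List[str], c: List[str],
                           idf: Callable[[str], float]) -> float:
    """IDF mass of the unigrams of M covered by C, over the total IDF mass of M."""
    m_unigrams, c_unigrams = set(m), set(c)
    total = sum(idf(u) for u in m_unigrams)
    if total == 0.0:
        return 0.0  # assumption: no informative words in M scores 0
    return sum(idf(u) for u in m_unigrams & c_unigrams) / total


D = ["The dog opened the door .".split(),
     "I saw a girl with a telescope .".split()]
idf = make_idf(D)
M = "John opened the door .".split()
wpm = weighted_percent_match(M, D[0], idf)
```

In this toy corpus the unmatched word "John" is the rarest, so WPM comes out lower than the unweighted PM of 0.80 for the same pair.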

Page 28: Presentation

Problem of PM and WPM

PM and WPM only consider the coverage of words

−→ They cannot see any context

The next slides show methods that take context into account

Page 31: Presentation

Edit Distance (ED)

Widely used metric

$$\mathrm{ED}(M, C) = \max\left(1 - \frac{\text{edit-dist}(M, C)}{|M_{\text{unigrams}}|},\ 0\right)$$

where edit-dist(M, C) is the number of word insertions, deletions, and substitutions required to transform M into C

e.g.

M = John opened the door .

C = The dog opened the door .

substitution: 1

insertion: 1

$$\mathrm{ED}(M, C) = 1 - \frac{2}{5} = 0.60$$
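
The ED score can be sketched with a standard word-level Levenshtein distance. This is an illustrative implementation, not taken from the paper; it assumes unit cost for every operation and uses the token count of M as the normalizer (on the slide's example this equals the number of distinct unigrams). The function names are hypothetical.

```python
from typing import List


def word_edit_distance(m: List[str], c: List[str]) -> int:
    """Word-level Levenshtein distance with unit-cost insert, delete, substitute."""
    previous = list(range(len(c) + 1))
    for i, m_word in enumerate(m, start=1):
        current = [i]
        for j, c_word in enumerate(c, start=1):
            substitution = previous[j - 1] + (m_word != c_word)
            current.append(min(previous[j] + 1,      # delete m_word
                               current[j - 1] + 1,   # insert c_word
                               substitution))        # substitute or match
        previous = current
    return previous[-1]


def ed_score(m: List[str], c: List[str]) -> float:
    """Similarity in [0, 1]: 1 minus the length-normalized edit distance, floored at 0."""
    if not m:
        return 0.0  # assumption: an empty workload sentence scores 0
    return max(1.0 - word_edit_distance(m, c) / len(m), 0.0)


M = "John opened the door .".split()
C = "The dog opened the door .".split()
distance = word_edit_distance(M, C)
score = ed_score(M, C)
```

The floor at 0 matters when C is much longer than M, where the raw distance can exceed the length of M.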

Page 37: Presentation

N-gram Precision (NGP)

Mean of N-gram precisions (like the BLEU metric)

However, BLEU → 0 when the precision of longer N-grams is 0

This work uses the arithmetic mean instead of the geometric mean

$$\mathrm{NGP} = \frac{1}{N} \sum_{n=1}^{N} p_n$$

$$p_n = \frac{|M_{n\text{-grams}} \cap C_{n\text{-grams}}|}{Z \cdot |M_{n\text{-grams}}| + (1 - Z) \cdot |C_{n\text{-grams}}|}$$

where Z is a parameter that controls normalization, and N is the maximum N-gram length

N = 4 and Z = 0.75 in the main experiments (discussed later)
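
A minimal sketch of NGP under two assumptions that the slide does not state: n-grams are collected as sets, and an order n for which both sentences are too short contributes 0. The names `ngrams` and `ngram_precision` are hypothetical; the defaults follow the N = 4 and Z = 0.75 settings given above.

```python
from typing import List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: List[str], n: int) -> Set[NGram]:
    """Set of all contiguous n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_precision(m: List[str], c: List[str],
                    max_n: int = 4, z: float = 0.75) -> float:
    """Arithmetic mean of p_n for n = 1..max_n, with Z-interpolated normalization."""
    total = 0.0
    for n in range(1, max_n + 1):
        m_grams, c_grams = ngrams(m, n), ngrams(c, n)
        denominator = z * len(m_grams) + (1 - z) * len(c_grams)
        if denominator > 0:
            total += len(m_grams & c_grams) / denominator
    return total / max_n


M = "John opened the door .".split()
C = "The dog opened the door .".split()
ngp = ngram_precision(M, C)
```

For this pair the shared n-gram counts are 4, 3, 2, and 1 for n = 1 to 4, so every order contributes a non-zero term to the mean.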

Page 41: Presentation

Weighted N-gram Precision (WNGP)

NGP with IDF weighting

$$\mathrm{WNGP} = \sum_{n=1}^{N} \frac{1}{N}\, wp_n$$

$$wp_n = \frac{\sum_{i \in \{M_{n\text{-grams}} \cap C_{n\text{-grams}}\}} w(i)}{Z \cdot \sum_{i \in M_{n\text{-grams}}} w(i) + (1 - Z) \cdot \sum_{i \in C_{n\text{-grams}}} w(i)}$$

$$w(i) = \sum_{\text{1-gram} \in i} \mathrm{idf}(\text{1-gram}, D)$$
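
A sketch of WNGP that takes the idf function as an argument, so that no particular idf definition is assumed. As before, n-grams are treated as sets and an order with zero weight mass contributes 0; both choices, and all function names, are assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: List[str], n: int) -> Set[NGram]:
    """Set of all contiguous n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def mass(grams: Iterable[NGram], idf: Callable[[str], float]) -> float:
    """Sum of w(i) over n-grams, where w(i) adds up the idf of each word in i."""
    return sum(sum(idf(word) for word in gram) for gram in grams)


def weighted_ngram_precision(m: List[str], c: List[str],
                             idf: Callable[[str], float],
                             max_n: int = 4, z: float = 0.75) -> float:
    """Arithmetic mean of the IDF-weighted precisions wp_n for n = 1..max_n."""
    total = 0.0
    for n in range(1, max_n + 1):
        m_grams, c_grams = ngrams(m, n), ngrams(c, n)
        denominator = z * mass(m_grams, idf) + (1 - z) * mass(c_grams, idf)
        if denominator > 0:
            total += mass(m_grams & c_grams, idf) / denominator
    return total / max_n


M = "John opened the door .".split()
C = "The dog opened the door .".split()
uniform = weighted_ngram_precision(M, C, lambda word: 1.0)
# Hypothetical idf values in which the unmatched word "John" is rare.
rare_john = weighted_ngram_precision(M, C, lambda word: 3.0 if word == "John" else 1.0)
```

With a constant idf every n-gram of the same order has the same weight, so WNGP reduces to NGP; giving the unmatched word a higher idf lowers the score.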

Page 43: Presentation

Modified Weighted N-gram Precision (MWNGP)

Shorter N-grams may help translators more than longer N-grams

$$\mathrm{WNGP} = \sum_{n=1}^{N} \frac{1}{N}\, wp_n$$

$$\mathrm{MWNGP} = \frac{2^N}{2^N - 1} \sum_{n=1}^{N} \frac{1}{2^n}\, wp_n$$
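
The only change from WNGP is the weight given to each order n. A small sketch, with hypothetical function names, shows that the MWNGP weights halve with each longer n-gram order and still sum to 1, because the factor 2^N / (2^N − 1) normalizes the geometric series. The `wp` values used here are made up for illustration.

```python
from typing import List


def mwngp_weights(max_n: int) -> List[float]:
    """Weights (2^N / (2^N - 1)) * (1 / 2^n) for n = 1..N."""
    scale = 2 ** max_n / (2 ** max_n - 1)
    return [scale / 2 ** n for n in range(1, max_n + 1)]


def mwngp(wp: List[float]) -> float:
    """Combine per-order weighted precisions wp_1..wp_N with the MWNGP weights."""
    return sum(w * p for w, p in zip(mwngp_weights(len(wp)), wp))


def wngp(wp: List[float]) -> float:
    """Combine the same precisions with the uniform 1/N weights of WNGP."""
    return sum(wp) / len(wp)


weights = mwngp_weights(4)            # 8/15, 4/15, 2/15, 1/15
wp = [0.8, 0.6, 0.4, 0.2]             # hypothetical wp_n values
scores = (wngp(wp), mwngp(wp))
```

When unigram precision is high and longer n-grams match less often, as in these hypothetical values, MWNGP gives a higher score than WNGP.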

Page 45: Presentation

Experiment


Page 46: Presentation

Experiment

Two different technical domains with two different language pairs (Fr-En, Zn-En).

▶ Zn-En: OpenOffice 3 (OO3)

▶ Fr-En: EMEA

Preprocessing is applied to the source side of both corpora to produce valid segments.

Sentences are randomly sampled from each corpus as M and C.

▶ Zn-En: 400 M and 10,000 C

▶ Fr-En: 300 M and 10,000 C

Page 51: Presentation

Evaluation

Human evaluation is performed using Amazon Mechanical Turk.

Scores range from 1 (Not Helpful) to 5 (Extremely Helpful).

Each segment M is rated by 5 Turkers, and we keep track of which metric performs best (ties are allowed).

The scores for each M are averaged into a Mean Opinion Score (MOS).
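
The scoring procedure can be sketched as follows. The ratings are invented for illustration, the function names are hypothetical, and a tie is assumed to mean exactly equal MOS values.

```python
from typing import Dict, List, Set


def mean_opinion_score(ratings: List[int]) -> float:
    """Average of the 1-to-5 helpfulness ratings given by the Turkers."""
    return sum(ratings) / len(ratings)


def best_metrics(mos_by_metric: Dict[str, float]) -> Set[str]:
    """All metrics whose suggestion reached the top MOS (ties are allowed)."""
    top = max(mos_by_metric.values())
    return {name for name, score in mos_by_metric.items() if score == top}


# Hypothetical ratings from 5 Turkers for the suggestion each metric retrieved for one M.
ratings = {
    "PM": [3, 3, 4, 2, 3],
    "ED": [4, 4, 5, 3, 4],
    "MWNGP": [4, 5, 4, 4, 3],
}
mos = {metric: mean_opinion_score(r) for metric, r in ratings.items()}
winners = best_metrics(mos)  # both ED and MWNGP would be counted as "found best"
```

Repeating this for every M and counting how often each metric appears among the winners gives a "found best" tally per metric.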

Page 56: Presentation

Result and Analysis


Page 57: Presentation

Result: Which metric performs best?

Table: OO3 Zn-En

Metric   Found Best   Total M
PM       178          400
WPM      200          400
ED       193          400
NGP      251          400
WNGP     271          400
MWNGP    282          400

Table: EMEA Fr-En

Metric   Found Best   Total M
PM       166          300
WPM      184          300
ED       148          300
NGP      188          300
WNGP     198          300
MWNGP    201          300

Modified Weighted N-gram Precision (MWNGP) achieved the best result of all the metrics.

There is only a slight difference between WNGP and MWNGP.
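
To make the two tables easier to compare, the counts can be turned into percentages of the workload sentences. This is simple arithmetic on the numbers above; the variable and function names are hypothetical.

```python
from typing import Dict, Tuple

# "Found Best" counts and the number of workload sentences, copied from the tables.
results: Dict[str, Tuple[Dict[str, int], int]] = {
    "OO3 Zn-En": ({"PM": 178, "WPM": 200, "ED": 193,
                   "NGP": 251, "WNGP": 271, "MWNGP": 282}, 400),
    "EMEA Fr-En": ({"PM": 166, "WPM": 184, "ED": 148,
                    "NGP": 188, "WNGP": 198, "MWNGP": 201}, 300),
}


def share(count: int, total: int) -> float:
    """Percentage of workload sentences for which a metric was found best."""
    return 100.0 * count / total


percentages = {
    corpus: {metric: round(share(count, total), 1) for metric, count in counts.items()}
    for corpus, (counts, total) in results.items()
}
top_metric = {corpus: max(counts, key=counts.get) for corpus, (counts, _) in results.items()}
```

Because ties are allowed, the percentages within one table can add up to more than 100.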

Page 61: Presentation

Scatterplot: OO3 Percent Match

[Figure: scatterplot with MOS (1.0 to 5.0) on the x-axis and Metric Value (0.0 to 1.0) on the y-axis]

Page 62: Presentation

Scatterplot: OO3 Edit Distance

[Figure: scatterplot with MOS (1.0 to 5.0) on the x-axis and Metric Value (0.0 to 1.0) on the y-axis]

Page 63: Presentation

Scatterplot: OO3 Modified N-Gram Precision

[Figure: scatterplot with MOS (1.0 to 5.0) on the x-axis and Metric Value (0.0 to 1.0) on the y-axis]

Page 64: Presentation

The effect of Z: Adjusting for length preferences

Many of the metrics take Z as a parameter.

The Z parameter can be used to control length preferences.

Table: EMEA Fr-En

Z Value   Avg Length
0.00      9.9298
0.25      13.204
0.50      16.0134
0.75      19.6355
1.00      27.8829

Table: OO3 Zn-En

Z Value   Avg Length
0.00      7.2475
0.25      9.5600
0.50      11.1250
0.75      14.1825
1.00      25.0875

A smaller Z prefers shorter matches, which are more precise (higher precision).

A larger Z prefers longer matches, which contain more correct translations (higher recall).
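
The length preference can be seen on a toy example using only the unigram term p_1 of NGP. The sentences and the function name are made up for illustration, and unigrams are treated as sets.

```python
from typing import List


def unigram_precision(m: List[str], c: List[str], z: float) -> float:
    """p_1 with the denominator interpolated between |M| (Z = 1) and |C| (Z = 0)."""
    m_set, c_set = set(m), set(c)
    denominator = z * len(m_set) + (1 - z) * len(c_set)
    return len(m_set & c_set) / denominator if denominator > 0 else 0.0


M = "open the door".split()
short_c = "the door".split()                          # precise but incomplete
long_c = "please open the door now slowly".split()    # complete but padded

# Z = 0 normalizes by the candidate: the short candidate scores 2/2, the long one 3/6.
at_z0 = (unigram_precision(M, short_c, 0.0), unigram_precision(M, long_c, 0.0))
# Z = 1 normalizes by the workload sentence: short scores 2/3, long scores 3/3.
at_z1 = (unigram_precision(M, short_c, 1.0), unigram_precision(M, long_c, 1.0))
```

The ranking of the two candidates flips between Z = 0 and Z = 1, which is the same direction as the growth in average length shown in the tables.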

Page 69: Presentation

Conclusion


Page 70: Presentation

Conclusion

▶ This paper compares TM similarity metrics.

▶ The best metric is Modified Weighted N-gram Precision (MWNGP).

▶ All of the discussed metrics consider only the source side in the calculation.

▶ The Z parameter adjusts the length preference of the retrieved TM matches.

Page 75: Presentation

Thank you for your attention!
