simplification and explicitation universals

Translation Studies: Simplification and Explicitation Universals. Claudiu Mihăilă, Faculty of Computer Science, “Alexandru Ioan Cuza” University of Iași, 21 April 2010

Page 1: Simplification and Explicitation Universals

Translation Studies
Simplification and Explicitation Universals

Claudiu Mihăilă

Faculty of Computer Science
“Alexandru Ioan Cuza” University of Iași

21 April 2010

Page 2: Simplification and Explicitation Universals

Outline

Introduction
  Motivation
  Translation studies

Simplification
  Definition
  Simplification pros
  Simplification cons

Explicitation
  Definition
  Explicitation pros
  Explicitation cons

Conclusions

2 of 13

Page 3: Simplification and Explicitation Universals

Motivation

• The questions
  ◦ Is there a difference between original and translated language?
  ◦ If so, is it automatically detectable?
  ◦ And if so, does it improve NLP quality?

• The answers
  ◦ Yes!
  ◦ Yes: with up to 97.62% accuracy for simplification
  ◦ Yes:
    • Human translator (self-)assessment
    • Statistical machine translation
    • Multilingual plagiarism detection

3 of 13

Page 5: Simplification and Explicitation Universals

Translation studies

• Translated texts have specific lexico-grammatical and syntactic characteristics

• Translationese - Gellerstam (1986)
  ◦ “Fingerprints” left behind by the translation process

• Translation laws - Toury (1983)
  ◦ Standardisation, Interference

• Translation universals - Baker (1993)
  ◦ Simplification, Explicitation, Convergence, Normalisation

4 of 13

Page 9: Simplification and Explicitation Universals

Simplification

• Tendency to produce simpler and easier-to-follow texts

• Laviosa (2002)
  ◦ Study on small corpus
  ◦ Features for simplification
  ◦ Insufficient evidence

5 of 13

Page 11: Simplification and Explicitation Universals

Simplification pros

• Baroni (2006)
  ◦ Detect originals and translations in an Italian corpus
  ◦ Uni-, bi-, tri-grams, word forms, lemmas, and POS tags
  ◦ Supervised learning system
  ◦ Accuracy up to 87%

• Corpas (2008a)
  ◦ Translated (English-into-Spanish) and original Spanish medical and technical texts
  ◦ Validated for lexical richness
  ◦ Contradicted for complex sentences, sentence length, ambiguity, information load, depth of syntactic trees

• Corpas (2008b)
  ◦ Validated for lexical richness and density, number of discourse markers, complex sentences, sentence length
  ◦ More visible for technical domain
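The uni-, bi-, and tri-gram features listed for Baroni (2006) can be illustrated with a minimal sketch. It is not the system from that study: the whitespace tokenisation, the lowercasing, and the function names `ngrams` and `ngram_profile` are assumptions made for illustration only. A supervised classifier would take such counts (over word forms, lemmas, or POS tags) as input.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_profile(text, max_n=3):
    """Count uni-, bi-, and tri-grams of lowercased whitespace tokens.

    Assumption: plain whitespace tokenisation; a real study would use
    a proper tokeniser, lemmatiser, and POS tagger.
    """
    tokens = text.lower().split()
    profile = Counter()
    for n in range(1, max_n + 1):
        profile.update(ngrams(tokens, n))
    return profile

if __name__ == "__main__":
    profile = ngram_profile("the cat sat on the mat")
    # Show the three most frequent n-grams of any order.
    print(profile.most_common(3))
```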

6 of 13

Page 14: Simplification and Explicitation Universals

Simplification pros

• Ilisei (2010)
  ◦ 21 language-independent features
  ◦ Supervised machine learning - 8 classifiers
  ◦ Accuracy of 97.62%
  ◦ Most salient features - InfoGain, ChiSquare
    • Lexical richness
    • Sentence length
    • Proportions of pronouns, conjunctions, grammatical and lexical words
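Three of the salient features named above can be approximated with a short sketch. It makes several assumptions that are not from Ilisei (2010): lexical richness is taken as the type/token ratio, sentences are split on `.`, `!`, and `?`, and `GRAMMATICAL_WORDS` is a tiny hypothetical English word list standing in for a POS tagger or a full function-word lexicon.

```python
import re

# Hypothetical, deliberately tiny closed-class word list (assumption);
# a real study would derive this from POS tags or a full lexicon.
GRAMMATICAL_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "in", "on", "to",
    "it", "he", "she", "they", "is", "was",
}

def simplification_features(text):
    """Return a dict of three simple, surface-level text features."""
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Tokens are maximal runs of letters, lowercased.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not sentences or not tokens:
        raise ValueError("text must contain at least one word")
    grammatical = sum(1 for t in tokens if t in GRAMMATICAL_WORDS)
    return {
        "lexical_richness": len(set(tokens)) / len(tokens),      # types / tokens
        "mean_sentence_length": len(tokens) / len(sentences),    # tokens per sentence
        "grammatical_word_ratio": grammatical / len(tokens),
    }

if __name__ == "__main__":
    print(simplification_features("The cat sat. The dog ran and the cat slept."))
```

Under the simplification hypothesis, translated texts would be expected to show lower lexical richness and shorter sentences than comparable originals; the feature vector would then be passed to a classifier.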

7 of 13

Page 15: Simplification and Explicitation Universals

Simplification cons

• Jantunen (2001)
  ◦ Boosters in Finnish translations - hyvin, kovin, oikein
  ◦ Typical lexical combinations in most cases

• Jantunen (2004)
  ◦ Boosters in Finnish translations - hyvin, kovin, oikein
  ◦ Untypical lexical combinations in translations
  ◦ Similar colligations in originals and translations

8 of 13

Page 17: Simplification and Explicitation Universals

Explicitation

• Introducing overt information into the translation that is implicit in the source language

• Classification - Pym (2005)
  ◦ Obligatory explicitation
    • Forced by language specificity or grammar
  ◦ Voluntary explicitation
    • Optional information to avoid misinterpretations

9 of 13

Page 19: Simplification and Explicitation Universals

Explicitation pros

• Burnett (1999)
  ◦ BNC (British National Corpus) vs. TEC (Translational English Corpus)
  ◦ suggest, admit, claim, think, believe, hope, know

• Olohan (2000)
  ◦ BNC vs. TEC
  ◦ say / tell + that / zero connective

• Olohan (2001)
  ◦ BNC vs. TEC
  ◦ promise + that / zero connective
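The "verb + that / zero connective" comparison can be sketched as a simple count. This is a crude heuristic, not the method of the cited studies: it only checks whether a form of the verb is immediately followed by "that", so it cannot tell a zero-connective complement clause from any other use of the verb, and it misses patterns with an intervening object (as with "tell"). The name `that_rate` and the list `SAY_FORMS` are assumptions for illustration.

```python
import re

# Hypothetical list of surface forms (assumption); a real study
# would lemmatise the corpus instead of listing forms by hand.
SAY_FORMS = ("say", "says", "said", "saying")

def that_rate(text, verb_forms=SAY_FORMS):
    """Return (with_that, total) for the given verb forms.

    with_that: occurrences directly followed by the word 'that'.
    total:     all occurrences of any of the verb forms.
    """
    verb = "|".join(re.escape(form) for form in verb_forms)
    total = len(re.findall(rf"\b(?:{verb})\b", text, flags=re.IGNORECASE))
    with_that = len(
        re.findall(rf"\b(?:{verb})\s+that\b", text, flags=re.IGNORECASE)
    )
    return with_that, total

if __name__ == "__main__":
    sample = "She said that it was late. He said it was fine. They say that he left."
    print(that_rate(sample))
```

Running the same count over a corpus of originals and a corpus of translations gives two proportions to compare; a higher share of explicit "that" in translations would be consistent with explicitation.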

10 of 13

Page 22: Simplification and Explicitation Universals

Explicitation cons

• Cheong (2006)
  ◦ Explicitation vs. implicitation
  ◦ English-into-Korean translations
  ◦ The phenomena appear equally
  ◦ The direction of translation influences their behaviour

11 of 13

Page 23: Simplification and Explicitation Universals

Conclusions

• Simplification
  ◦ Many studies supporting it
  ◦ Many studies contradicting it
  ◦ Not yet clearly confirmed

• Explicitation
  ◦ Occurring often to avoid misinterpretations
  ◦ Implicitation needs to be considered as well

• Usefulness
  ◦ SMT
  ◦ Multilingual plagiarism detection
  ◦ (Self-)assessment of translators' work

12 of 13

Page 26: Simplification and Explicitation Universals

Thank you!

• Questions?

13 of 13