Modern MT Systems and the Myth of Human Translation:
Real World Status Quo
● Intro
● MT & HT Definitions
● Comparison MT vs. HT
● Evaluation Methods
● FAE Framework
● Conclusion
● Discussion
Is This for Me?
This talk may be of benefit for:
● (Freelance) translators and agencies
● Developers and vendors of MT systems
● People concerned with MT evaluation
● People concerned with HT evaluation
Not for interpreters and speech/non-text-based issues
Introduction
● What is Machine Translation (MT)?
• „MT is the automatic translation of human language by computers.“
● What is [Human] Translation (HT)?
• „The process of transforming text from one language into another language.“
• „A written communication in a second language having the same meaning as the written communication in a first language.“
Introduction II
● Is there such a thing as HT?
• „Pure Human Translation“
• „Machine Aided Human Translation“
• „Human Aided Machine Translation“
● Is HT equal to HT?
• „Native Speaker“
• „Speaks Language X“
• „[Trained] Professional“
• „Trained Prof. specialized in X“
HT/MT Examples & Quizshow
Original: Einzigartiger Freizeitpark für Groß und Klein
T1: Singular recreational park for large and small
T2: Unique leisure time park for largely and small
T3: Ein Fantastische DinoPark ferrcoitung
T4: Unique Freizeitpark at big and little
T5: Unique amusement park for great and Klein
T6: Unique leisure park for big and little

T1: Babelfish/SYSTRAN
T2: SDL FreeTranslation.com
T3: Human
T4: InterTran
T5: Linguatex eTranslation
T6: PetaMem LangSuite MT
Summary HT Quality
● Not all HTs are equal
● Significant amount done by untrained people
● Better performance of good(!) MT systems on these examples suggests rising MT competitiveness
Issues with MT & HT Evaluation
● Evaluation vs. Similarity
• Ngram does work? Why?
● Reference Translations:
• Cost & Availability
• Multiples – which
• „Axiomatic Truth“
● Judging
• Expensive
• Questionable results
● Using MT-eval methods: limitations just mentioned
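The "Evaluation vs. Similarity" point can be illustrated with a minimal sketch of clipped n-gram precision, the kind of measure reference-based MT metrics build on. It is not the method presented in this talk. The function names are hypothetical, and T6 from the quiz is treated as the reference purely for illustration. The score only says how close a candidate is to that one reference, which is why the choice of reference acts as an "axiomatic truth".

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, reference, n=1):
    """Clipped n-gram precision of a candidate against a single reference.

    Each candidate n-gram counts as matched at most as often as it
    occurs in the reference.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Assumption: T6 is used as the reference only for the sake of the example.
reference = "Unique leisure park for big and little"
t1 = "Singular recreational park for large and small"
t2 = "Unique leisure time park for largely and small"

for name, candidate in (("T1", t1), ("T2", t2)):
    print(name, round(ngram_precision(candidate, reference, n=1), 3))
```

With a different reference the ranking of T1 and T2 could change, even though neither translation itself has changed.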
Mission Impossible?
● Fully automatic evaluation method for both MT & HT, with no human intervention?
● Purpose: Automatic QA of translations – at least safe rejection of bad results
● Part of an iterative process (with faith in the translator)
We need it – should we give up?
Let's Try Anyway!
Extract Information
● Text Metrics
• Length
• Word/Sentence/Paragraph count
● Statistics
• Character/Word occurrence
• Ngram
• Collocations
● Translator Parameters
Reference Data
● Monolingual Corpora for SL & TL
• Statistical reference
● Dictionaries & Thesauri
• Adequacy check
• Translation distance
• Sentence Alignment
● Parallel Corpora
• Translation Length Ratio
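Two of the ingredients listed above, text metrics and the translation length ratio, can be combined into a reference-free rejection check. The following is a rough sketch under stated assumptions, not the framework's actual implementation. The function names, the expected ratio of 1.0, and the tolerance of 0.5 are hypothetical; in practice the expected ratio would presumably be estimated from a parallel corpus for the language pair.

```python
import re


def text_metrics(text):
    """Collect simple surface metrics: character, word and sentence counts."""
    words = re.findall(r"\w+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {"chars": len(text), "words": len(words), "sentences": len(sentences)}


def rejection_reasons(source, target, expected_ratio=1.0, tolerance=0.5):
    """Return a list of reasons to reject a translation (empty list: no objection).

    expected_ratio and tolerance are illustrative values, not taken from the talk.
    """
    src, tgt = text_metrics(source), text_metrics(target)
    if src["words"] == 0:
        return ["source is empty"]
    if tgt["words"] == 0:
        return ["target is empty"]

    reasons = []
    ratio = tgt["chars"] / src["chars"]
    low = expected_ratio * (1 - tolerance)
    high = expected_ratio * (1 + tolerance)
    if not low <= ratio <= high:
        reasons.append(f"length ratio {ratio:.2f} outside [{low:.2f}, {high:.2f}]")
    if src["sentences"] != tgt["sentences"]:
        reasons.append("sentence count differs")
    return reasons


source = "Einzigartiger Freizeitpark für Groß und Klein"
print(rejection_reasons(source, "Unique leisure park for big and little"))
print(rejection_reasons(source, "Unique"))
```

A check like this can only reject obviously broken output safely; passing it says nothing about adequacy, which is where dictionaries, thesauri and corpus statistics would come in.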
Workflow
Conclusion
● Translation results of the best contemporary MT systems can be considered on par with the average HT
● The presented evaluation framework is just the beginning of an automatic evaluation method for both MT & HT
● It is a robust and reliable validation method with safe rejection of invalid/bad translations
● In production Q1/2005
Thanks!
Q & A