Post on 30-Jul-2015
Terminology as a Service – a model for collaborative terminology management
EAFT Terminology Summit, Barcelona – 27-28 November 2014
Klaus-Dirk Schmitz, Cologne University of Applied Sciences, klaus.schmitz@fh-koeln.de
Tatiana Gornostay, Tilde, Riga, tatiana.gornostay@tilde.lv
K.-D. Schmitz, IIM, FH Köln
Collaborative terminology management
Collaborative: several individuals are involved in the creation of terminological entries
Different terminological competences require well-elaborated user profiles with specific rights and views (read/write, only certain languages/datCats, …)
Well-defined workflow and quality assurance procedures are needed (supported by e.g. QuickTerm)
Metadata (datCats) for normative and workflow status needed (preferred/admitted/deprecated, draft/under discussion/final, …)
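The points above can be sketched as a small data model. This is only an illustration: the class names, field names, and the rule that only some profiles may finalize an entry are assumptions made for the example, not part of any tool named on the slides. The status values are the ones listed above.

```python
from dataclasses import dataclass, field

# Status values taken from the examples on the slide.
NORMATIVE_STATUS = {"preferred", "admitted", "deprecated"}
WORKFLOW_STATUS = {"draft", "under discussion", "final"}


@dataclass
class Term:
    text: str
    language: str
    normative_status: str = "admitted"

    def __post_init__(self):
        if self.normative_status not in NORMATIVE_STATUS:
            raise ValueError(f"unknown normative status: {self.normative_status}")


@dataclass
class Entry:
    """One concept-oriented entry: all terms for one concept, in any language."""
    concept_id: str
    workflow_status: str = "draft"
    terms: list = field(default_factory=list)


@dataclass
class UserProfile:
    name: str
    writable_languages: set      # languages this user may edit (hypothetical rule)
    may_finalize: bool = False   # whether this user may set the status "final"


def add_term(entry, term, user):
    """Add a term only if the profile grants write access for its language."""
    if term.language not in user.writable_languages:
        raise PermissionError(f"{user.name} may not write {term.language} terms")
    entry.terms.append(term)


def set_workflow_status(entry, status, user):
    """Change the workflow status, reserving "final" for authorized profiles."""
    if status not in WORKFLOW_STATUS:
        raise ValueError(f"unknown workflow status: {status}")
    if status == "final" and not user.may_finalize:
        raise PermissionError(f"{user.name} may not finalize entries")
    entry.workflow_status = status
```

A translator profile with `writable_languages={"de"}` could then add German terms to an entry but could neither add English terms nor mark the entry as final.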
Cloud-based terminology management
Since terminology work is “expensive”, why not involve the Crowd to create and validate terminology?
You need a tool for managing terminology in the Cloud!
Examples: Wikipedia (www.wikipedia.org), TermWiki (www.termwiki.com)
A different approach from: web interfaces for TMS (e.g. MultiTerm-Web) and web-based TMS (e.g. TermWeb)
Cloud-/Crowd-based terminology work
The main questions:
How can you motivate the Crowd?
Hidden business model? Free services? Sharing data?
Do you want to have your data in the Cloud?
Can you apply established terminological principles (meta model, datCats, concept orientation)?
How can you ensure correctness? How can you ensure completeness? How can you ensure consistency? How can you ensure reliability?
The TaaS Project
A new approach as an example:
TaaS - Terminology as a Service: a cloud-based platform for acquiring, cleaning up, sharing, and reusing multilingual terminological data
The project has received funding from the European Union Seventh Framework Programme (FP7/2007-2013), grant agreement no. 296312.
The TaaS Project
Partners: Tilde, Latvia (coordinator); TAUS, The Netherlands; Kilgray, Hungary; Fachhochschule Köln, Germany; University of Sheffield, UK
Time: 1 June 2012 – 31 May 2014
Languages: all European + Russian
www.taas-project.eu
Basic Services of TaaS
Automatic extraction of monolingual term candidates from user-uploaded documents using state-of-the-art terminology extraction techniques
Automatic retrieval of translation equivalents for the extracted terms, in user-defined target language(s), from different public and industry terminology databases
Translation candidate acquisition for terms not found in term banks, from parallel web data, using state-of-the-art terminology extraction and bilingual terminology alignment methods
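The first two services can be pictured with a toy pipeline. The slides do not say which extraction techniques TaaS uses, so this sketch substitutes a deliberately simple frequency-based n-gram filter. The stopword list, the function names, and the dictionary standing in for a term bank are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use language resources.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "and", "to", "for", "on", "by"}


def extract_candidates(text, max_len=3, min_freq=2):
    """Return stopword-free word n-grams that occur at least min_freq times."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if not any(tok in STOPWORDS for tok in gram):
                counts[" ".join(gram)] += 1
    return {term: freq for term, freq in counts.items() if freq >= min_freq}


def lookup_equivalents(candidates, term_bank):
    """Split candidates into those found in the term bank and those missing."""
    found = {t: term_bank[t] for t in candidates if t in term_bank}
    missing = sorted(t for t in candidates if t not in term_bank)
    return found, missing


text = "The wind turbine feeds the power grid. A wind turbine needs a power grid."
candidates = extract_candidates(text)
# Hypothetical EN-DE term bank with a single entry.
found, missing = lookup_equivalents(candidates, {"wind turbine": "Windkraftanlage"})
```

The candidates left in `missing` are the ones that the third service would try to translate from parallel web data. That step is not sketched here.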
Basic Services of TaaS
Facilities for users to clean up automatically acquired terminological data
Data sharing and integration facilities through APIs and export tools, for sharing the resulting terminological data with major term banks and for use in different applications
Example: Term extraction via TaaS
Go to https://term.tilde.com
Direct search for terms and equivalents
Or log in / sign up for further services
Create a project for term extraction
Upload the text(s) for extraction
Define the extraction settings
Start the extraction
Check and complete the extraction results
Visualization
Some evaluation results
Evaluation in April (and June) 2014
4 test documents
Type: online article, white paper, dissertation
Domain: energy, economics, IT, astronomy
Languages: DE-EN, DE-FR, EN-FR
Gold standard: human term extraction, 7-10 candidates / document
Problem: subjectivity
Calculation of Recall and Precision
Recall:
all found relevant TC / all relevant TC
all relevant TC found?
Precision:
all found relevant TC / all found TC
all found TC relevant?
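The two ratios above can be computed directly from sets of term candidates (TC). The following is a minimal sketch. The function name and the example terms are invented for illustration and are not data from the evaluation.

```python
def recall_precision(found, gold):
    """Compute recall and precision of extracted term candidates.

    found: term candidates returned by the extractor
    gold:  relevant term candidates from the gold standard
    """
    found, gold = set(found), set(gold)
    relevant_found = found & gold
    # Recall: share of the relevant candidates that were found.
    recall = len(relevant_found) / len(gold) if gold else 0.0
    # Precision: share of the found candidates that are relevant.
    precision = len(relevant_found) / len(found) if found else 0.0
    return recall, precision


# Invented example: 4 relevant terms, 3 extracted, 2 of them relevant.
gold = {"wind turbine", "power grid", "rotor blade", "feed-in tariff"}
found = {"wind turbine", "power grid", "last year"}
recall, precision = recall_precision(found, gold)
```

Here recall is 2/4 and precision is 2/3: the extractor missed half of the relevant terms, and one of its three candidates was noise.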
Results of the TaaS evaluation
Test with Kilgray (statistical):
Test with TWSC and Term Normalizer (linguistic):
Improvement of TaaS
Second (short) evaluation after the end of the project in June 2014:
Comparison TaaS – human – MT-Extract
T1: Terminologist with the best Recall and Precision values
T4: Terminologist with the worst Recall values
Ü1: Translator with the worst Precision values
MT: MultiTerm Extract (statistical) with different Silence/Noise values
TaaS: CAT Tool Integration
⇒ Auto-lookup
⇒ Manual lookup
⇒ Adding and editing terms
⇒ Transferring term extraction lists
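Auto-lookup in a CAT tool means that terms from the term collection are recognized in the segment being translated and shown with their equivalents. The slides do not describe how the integration works internally, so this is a rough sketch under that caveat. The function name, the dictionary-based termbase, and the longest-match rule are assumptions.

```python
import re


def auto_lookup(segment, termbase):
    """Find termbase terms in a segment, preferring longer terms on overlap.

    Returns a list of (start offset, source term, target term), sorted by offset.
    """
    hits = []
    covered = []  # character spans already claimed by a longer term
    for source in sorted(termbase, key=len, reverse=True):
        pattern = r"\b" + re.escape(source) + r"\b"
        for m in re.finditer(pattern, segment, flags=re.IGNORECASE):
            if any(m.start() < end and start < m.end() for start, end in covered):
                continue  # skip matches inside an already matched longer term
            covered.append((m.start(), m.end()))
            hits.append((m.start(), source, termbase[source]))
    return sorted(hits)


# Hypothetical EN-DE termbase and source segment.
termbase = {"grid": "Netz", "power grid": "Stromnetz", "wind turbine": "Windkraftanlage"}
hits = auto_lookup("The power grid connects every wind turbine.", termbase)
```

In this example "grid" is not reported separately, because it lies inside the longer match "power grid".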
TaaS: (statistical) Machine Translation
Data acquisition from SMT systems
Export of multilingual terminology for reuse in MT systems
[Diagram. Components: Online Terminology Services. Training: parallel corpus, monolingual corpus, bilingual term collections, monolingual term extraction, bilingual term extraction, SMT system training and adaptation, trained SMT model. Translation: input text for translation, online translation service, translated text.]
Conclusion
TaaS offers free-of-charge services for terminology extraction, retrieval, management, and sharing
The term extraction results are excellent when the linguistic algorithms are available for the language in question
Companies are very cautious about TaaS
But the free services offered by TaaS may attract language workers to use TaaS for terminology management, to share (validated) terminology, and to collaborate with others.