ESR9 Chris Hokamp - EXPERT Summer School - Malaga 2015
TRANSCRIPT
Introduction | Component-Centric Design | HandyCAT | Prototypes and Experiments | Software
A Component-Oriented Design Framework for Translation Interfaces
Chris Hokamp [email protected]
Dublin City University (DCU)
June, 2015
Chris Hokamp [email protected] EXPERT Workshop | Malaga, Spain 1 / 27
Summary
A Component-Centric Framework For Computer-Aided Translation
• Component-Centric Design (CCD)
  - Component Types
  - Data Model - UI bindings
• HandyCAT
  - Functional Areas
  - XLIFF-inspired data model
• Prototypes & Experiments
  - Autocompletion and Typeahead Components
  - ProphetMT
  - Dynamic Linked Terminologies
An Abstract Framework for CAT Tool design
Problem Statement
Existing work
1. User interfaces are complex
   - difficult or impossible to optimize
2. Components are difficult to reuse
   - the component interface is tightly coupled to the application data model
   - complex dependencies between components
⇒ Solution: a type system for CAT components
Definition
A component is a means of transforming and/or rendering some data to the user.
Definition
A component is a means of transforming and/or rendering some data to the user, optionally with interactive capabilities which allow users to modify the underlying data model.
Component Types
• Data Services
  - transform input data into another representation
• Interaction Elements
  - allow users to view and possibly modify the data model
Data Services
Translation Resources are functions which map sequences in one language to sequences in another language.
Example: an SMT system which outputs one or more target hypotheses given a source segment.
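The "resource as a function" view can be sketched in a few lines. This is only an illustration, not HandyCAT code: the `TranslationResource` alias, the `make_glossary_resource` helper, and the glossary entries are all hypothetical, and a toy glossary stands in for an SMT system.

```python
from typing import Callable, Dict, List

# A translation resource maps one source sequence to zero or more target sequences.
# (Hypothetical type alias, introduced only for this sketch.)
TranslationResource = Callable[[str], List[str]]


def make_glossary_resource(entries: Dict[str, List[str]]) -> TranslationResource:
    """Build a toy resource backed by an in-memory glossary (made-up data)."""

    def lookup(source: str) -> List[str]:
        # Case-insensitive lookup; unknown input yields an empty hypothesis list.
        return list(entries.get(source.strip().lower(), []))

    return lookup


glossary = make_glossary_resource(
    {"first sentence": ["primera frase", "primera oración"]}
)
hypotheses = glossary("First sentence")
```

Under this view, any resource with the same signature (translation memory, terminology lookup, MT system) could be swapped in without changing the code that calls it.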
The Translation Resource Continuum
Interaction Elements
Interaction Elements allow users to view and possibly modify the data model.
Example: a text editing area designed to aid the user in post-editing an SMT hypothesis.
Component Hierarchy
Formalization
An Interface is a set of (E+, D*) tuples, where:
  E = Interaction Element
  D = Data Service
  + = one or more
  * = zero or more
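One possible encoding of this formalization is sketched below. The class names (`InteractionElement`, `DataService`, `Binding`) and the example component names are assumptions made for illustration; the slides do not prescribe an implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class InteractionElement:
    name: str


@dataclass(frozen=True)
class DataService:
    name: str


@dataclass(frozen=True)
class Binding:
    """One (E+, D*) tuple: at least one element, any number of services."""

    elements: Tuple[InteractionElement, ...]
    services: Tuple[DataService, ...] = ()

    def __post_init__(self) -> None:
        # Enforce the "+" in E+: an empty element tuple is rejected.
        if len(self.elements) == 0:
            raise ValueError("a binding needs at least one Interaction Element")


# An Interface is modelled as a set of bindings.
Interface = FrozenSet[Binding]

editor = InteractionElement("target-editor")  # hypothetical component names
smt = DataService("smt-system")
interface: Interface = frozenset({
    Binding(elements=(editor,), services=(smt,)),            # E with one D
    Binding(elements=(InteractionElement("source-view"),)),  # E with zero D
})
```

Frozen dataclasses are used so that bindings are hashable and can be placed in a set, which mirrors the "set of tuples" wording.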
HandyCAT
• A CAT tool built from Components
• Functional Areas define the interface to the translation data model
Functional Areas
XLIFF-inspired data model
<unit id="1">
  <segment>
    <source>This is the first sentence.</source>
    <target>Esta es la primera frase.</target>
  </segment>
  <segment>
    <source>Second sentence.</source>
  </segment>
</unit>
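As a rough illustration of how such a unit could be read into an in-memory model, the sketch below parses the fragment above with Python's standard `xml.etree.ElementTree`. The `Segment` class and `parse_unit` function are hypothetical names, not part of HandyCAT, and the fragment is assumed to carry no XML namespace.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

UNIT_XML = """
<unit id="1">
  <segment>
    <source>This is the first sentence.</source>
    <target>Esta es la primera frase.</target>
  </segment>
  <segment>
    <source>Second sentence.</source>
  </segment>
</unit>
"""


@dataclass
class Segment:
    source: str
    target: Optional[str] = None  # None while the segment is untranslated


def parse_unit(xml_text: str) -> List[Segment]:
    """Turn one <unit> element into a list of Segment objects."""
    unit = ET.fromstring(xml_text.strip())
    return [
        Segment(
            source=seg.findtext("source", default=""),
            target=seg.findtext("target"),  # None when <target> is absent
        )
        for seg in unit.findall("segment")
    ]


segments = parse_unit(UNIT_XML)
untranslated = [s.source for s in segments if s.target is None]
```

Here a missing `<target>` is treated as "not yet translated", which is one plausible reading of the second segment in the example.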
Default Component interfaces
Component Interfaces
• Document
• Segment
• Source
• Target
• Global Tooling
• Segment Tooling
Prototypes
• Enhanced Autocompletion
• ProphetMT — Preauthoring with Syntax-based SMT
• Using Linked Data for Dynamic Terminologies
Autocompletion Using SMT Components
• Target LM-backed Autocompletion
• Phrase Table-backed Autocompletion
• Phrase Table + LM Autocompletion
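The third variant (phrase table + LM) might look roughly like the sketch below: candidates come from phrase-table entries for the source tokens, are filtered by the prefix typed so far, and are ranked by a language-model score. Everything here is an assumption for illustration: the tables are made-up toy data, the LM is reduced to a per-word score, and the function names are hypothetical rather than taken from the prototype.

```python
from typing import Dict, List

# Hypothetical toy resources; a real system would query trained SMT models.
PHRASE_TABLE: Dict[str, List[str]] = {
    "first": ["primera", "primero"],
    "sentence": ["frase", "oración"],
    "the": ["la", "el"],
}
# Stand-in for a target LM: one log-score per word (higher is better).
LM_SCORES: Dict[str, float] = {
    "primera": -1.0, "primero": -2.5, "frase": -1.2,
    "oración": -1.8, "la": -0.5, "el": -0.6,
}


def pt_candidates(source: str) -> List[str]:
    """Collect target options for every source token found in the phrase table."""
    seen: List[str] = []
    for token in source.lower().split():
        for option in PHRASE_TABLE.get(token.strip(".,"), []):
            if option not in seen:
                seen.append(option)
    return seen


def autocomplete(source: str, prefix: str, limit: int = 5) -> List[str]:
    """Filter phrase-table candidates by the typed prefix, then rank by LM score."""
    matches = [c for c in pt_candidates(source) if c.startswith(prefix.lower())]
    ranked = sorted(matches, key=lambda c: LM_SCORES.get(c, float("-inf")), reverse=True)
    return ranked[:limit]


suggestions = autocomplete("This is the first sentence.", "pr")
```

Dropping the `sorted` step gives a phrase-table-only variant, and replacing `pt_candidates` with the full target vocabulary gives an LM-only one.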
Enhanced Autocompletion for CAT
Dataset    Avg. Time
A          79.53
B          76.47
Table 1: Average sentence completion time for each Wikipedia dataset
Autocomplete Type    Avg. Time
Default              82.75
PT-backed            73.25
Table 2: Average sentence completion time for each autocomplete type
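For a sense of scale, the relative differences in Tables 1 and 2 can be computed directly from the reported averages. The helper name `relative_reduction` is introduced here for illustration; the slides give no time unit, sample size, or significance test, so this is arithmetic on the reported numbers only.

```python
def relative_reduction(baseline: float, other: float) -> float:
    """Fraction by which `other` is lower than `baseline`."""
    return (baseline - other) / baseline


# Table 2: Default vs. PT-backed autocompletion
autocomplete_gain = relative_reduction(82.75, 73.25)
# Table 1: dataset A vs. dataset B
dataset_gap = relative_reduction(79.53, 76.47)
```

With these inputs, the PT-backed average is roughly 11.5% lower than the default, while the two datasets differ by just under 4%.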
Preauthoring Using Syntax-based MT
Dynamic Linked Terminologies
Software
• HandyCAT: https://github.com/chrishokamp/handycat
• Marmot: https://github.com/qe-team/marmot
• Graph Translation Memory: https://github.com/chrishokamp/graph-translation-memory
• Node Xliff: https://github.com/chrishokamp/node-xliff
• MultilingualW2V: https://github.com/chrishokamp/multilingualw2v
Acknowledgements
Thanks to Anna Zaretskaya and Xiaofeng Wu for their collaboration.
Thanks to Qun Liu, Josef Van Genabith, and Kashif Shah for their useful comments and suggestions.
Chris Hokamp is supported by the People Programme (Marie Curie Actions) of the European Union’s Framework Programme (FP7/2007-2013) under REA grant agreement no. 317471.
Thanks!