
Page 1: Nordterm 2007, Bergen, Norway

Specialised translation and terminology

Koen Kerremans

Centrum voor Vaktaal en Communicatie, Erasmushogeschool Brussel

http://cvc.ehb.be

Page 2: Part 1: “Terminography for translators: methodology”

Page 3: Purpose

1. To show some steps terminographers go through in order to develop specialised dictionaries

2. To raise awareness concerning the specific problems that may arise during the compilation of such dictionaries

3. To present a method in terminology description, Termontography, which supports the development of ontologically-underpinned terminological dictionaries

Page 4: Terminology and the specialised dictionary

• Within the present scope:

Terminology / Terminography
  • is the study and the field of activity concerned with the collection, the description and the presentation of terms (Sager 1990:2). Terms are related to subject-field communication (e.g. technical writing, technical documentation).
  • “the practical task of producing dictionaries of lexical items that are specific to specialised domains of knowledge” (Meyer 2001:279).

Specialised dictionary
  • results from the process of creating, storing, processing, recording, reusing, etc. specialised information and knowledge

Page 5: Preliminary remarks (1/3)

• Within the present scope:

User of the specialised dictionary?
  • Translator

Requirements of this specific user
  • Content of the dictionary?
  • Format of the dictionary?

Page 6: Preliminary remarks (2/3)

• Data gathering for lexical analysis may be based on:
  • introspection
  • elicitation of data
  • observation of non-elicited language use

= text-oriented approach

Page 7: Preliminary remarks (3/3)

• Lots of texts are currently available in electronic formats

• It becomes possible to ‘process’ these texts using specific software tools

‘Terminotics’

Page 8: Terminology and the specialised dictionary

1. Corpus compilation
2. Term identification
3. Information extraction
4. Analysis and synthesis
5. Encoding
6. Organisation
7. Management

Specialised dictionary

Page 9: 1. Corpus compilation

• = searching and categorising texts considered relevant for terminological analysis

• Problem: representativeness
  • Scientific specialised discourse
  • Scientific official discourse
  • Scientific pedagogical or didactic discourse
  • Scientific semi-popularised discourse
  • Scientific popularised discourse

(e.g. Laurian 1983; Meyer and Mackintosh 1996; Pearson 1998)

At least 2 languages!

Page 10: 1. Corpus compilation

• Tools (examples):

Search engine
  • “an information retrieval system designed to help find information stored on a computer system, such as on the World Wide Web” (http://en.wikipedia.org/wiki/Search_engines).

Web crawler
  • “a program or automated script which browses the World Wide Web in a methodical, automated manner” (http://en.wikipedia.org/wiki/Web_crawler).

Text aligner
  • a tool that organises “different language versions of a text in order to be able to identify equivalent terms, phrases, or expressions” (http://portal.bibliotekivest.no/terminology.htm).

Page 11: 2. Term identification

• = extracting terms from texts that have been gathered during the corpus compilation phase

• What is a term?
  • “A semantically charged linear structure, which names an abstract or concrete reality studied [in] a special-subject field” (Collet 2004:109).
  • A lexical unit that has a special meaning depending on the thematic context.

Page 12: 2. Term identification

TOPIC: early retirement

When the eligibility criteria for early retirement were tightened, early retirees began being granted the status of older unemployed. Standard unemployment benefits are higher for unemployed persons over the age of 50 who have been unemployed for a year but have spent 20 years in work. Until very recently, those in the “older unemployed” category were exempt from the ‘actively seeking work’ rule, which suggested that it was virtually impossible to find work again after the age of 50.

Since summer 2002, however, this exemption for the older unemployed is gradually being phased out. It is also the case that early retirement arrangements have become opaque and inequitable. The range of measures is now so wide that there has clearly been some duplication. They include early retirement on a half-time basis and career break measures, now replaced by the time-credit scheme.

Page 13: 2. Term identification

• Automatic term extraction ≠ automatic keyword extraction!

Knowledge of the language

Knowledge of the world (the domain)

Knowledge of the (dictionary) user profile

Page 14: 2. Term identification

• Our solution in application-oriented terminology projects:

1. Set up a categorisation framework
2. Map terminology to the framework

Page 15: 2. Term identification

• A categorisation framework:
  = an ontologically-underpinned framework of (meta)categories and (meta)relations which is used to extract and organise multilingual terminology

• Advantages:
  • Helps us to establish extraction criteria as to what terms in text are or should be (cf. ‘15th day of the month following that in which the chargeable event took place’)
  • Facilitates the process of aligning multilingual terminology

Page 16: 2. Term identification

[Diagram: VAT categories linked by ‘hyperonym of’ / ‘hyponym of’ relations, with the terms used for them in each language]

Categories:
  • transactions for which no VAT is required
  • transactions not allowing the supplier to deduct VAT
  • transactions allowing the supplier to deduct VAT
  • transactions occurring outside the territory of the VAT legislation at stake
  • transactions occurring outside the scope of VAT

Terms:
  • Dutch (Belgium): vrijstelling; niet onderworpen aan BTW
  • French (Belgium): exemption
  • English (UK): exemption; zero-rated; outside the scope of VAT
  • English (Ireland): exemption; zero-rated
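A categorisation framework of this kind can be sketched as a small data structure. The sketch below is only an illustration: the category identifiers, the locale codes and the function names are hypothetical, and because the arrows of the diagram are not available here, both the hierarchy and the term-to-category links are assumptions chosen for the example, not a statement about VAT law or about the original slide.

```python
from collections import defaultdict

# Category labels are quoted from the slide; the ids are hypothetical.
CATEGORIES = {
    "NO_VAT": "transactions for which no VAT is required",
    "NO_DEDUCTION": "transactions not allowing the supplier to deduct VAT",
    "DEDUCTION": "transactions allowing the supplier to deduct VAT",
    "OUTSIDE_TERRITORY": "transactions occurring outside the territory "
                         "of the VAT legislation at stake",
    "OUTSIDE_SCOPE": "transactions occurring outside the scope of VAT",
}

# Assumed hierarchy: every other category is a hyponym of NO_VAT.
HYPERONYM = {cid: "NO_VAT" for cid in CATEGORIES if cid != "NO_VAT"}

# Assumed links between terms and categories: (locale, term, category id).
TERM_LINKS = [
    ("nl-BE", "vrijstelling", "NO_DEDUCTION"),
    ("nl-BE", "niet onderworpen aan BTW", "OUTSIDE_SCOPE"),
    ("fr-BE", "exemption", "NO_DEDUCTION"),
    ("en-GB", "exemption", "NO_DEDUCTION"),
    ("en-GB", "zero-rated", "DEDUCTION"),
    ("en-GB", "outside the scope of VAT", "OUTSIDE_SCOPE"),
    ("en-IE", "exemption", "NO_DEDUCTION"),
    ("en-IE", "zero-rated", "DEDUCTION"),
]


def terms_by_category(links):
    """Group terms per category id, then per locale."""
    index = defaultdict(lambda: defaultdict(list))
    for locale, term, category in links:
        index[category][locale].append(term)
    return index


def equivalents(term, locale, links):
    """Return (locale, term) pairs from other locales linked to the same category."""
    categories = {c for loc, t, c in links if loc == locale and t == term}
    return sorted(
        (loc, t) for loc, t, c in links if c in categories and loc != locale
    )


if __name__ == "__main__":
    for loc, t in equivalents("vrijstelling", "nl-BE", TERM_LINKS):
        print(loc, t)
```

Because terms are linked to categories rather than directly to each other, a locale that lacks a term for a category simply has no entry, which makes gaps and regional variation visible instead of hiding them behind one-to-one equivalents.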

Page 17: 2. Term identification

• Idea of mapping terminology to a categorisation framework is adopted in the Termontography approach

Page 18: 2. Term identification

[Diagram: the Termontography workflow]

Phases:
  1. Knowledge analysis phase
  2. Information gathering phase
  3. Search phase
  4. Refinement phase
  5. Verification phase
  6. Validation phase

Other labels in the diagram: TSR + categorisation framework; (mono- or multilingual) domain-specific corpus; domain experts; first version of termontological database; (mono- or multilingual) termontological database; ??; dictionary

Page 19: 2. Term identification

• Termontography is a terminological approach in which one structures terminological information, retrieved from a corpus of texts, according to a framework of domain-specific knowledge.

Page 20: 3. Information extraction

• = adding ‘supplementary information’ to each term

• Dictionaries should be designed for special user groups in response to specific needs (cf. ‘Knowledge analysis phase’ in Termontography)

• What supplementary information do translators require?
  Synonyms? Translation equivalents? Part-of-speech tags? Examples? Contexts? Collocations? Domain specifications? Definitions? (What type of definition?)

Page 21: 3. Information extraction

• Techniques to find out user requirements include:
  • Surveys
  • Experimental research & model building

Page 22: 3. Information extraction

• Surveys:
  • To ask people what they use dictionaries for and how
  • Not very reliable

Page 23: 3. Information extraction

• Experimental research:
  • Look-up behaviour of subjects
  • Error analysis

Page 24: 3. Information extraction

Model building (based on translation process)

(Agirre et al. 2001)

Page 25: 3. Information extraction

• Translators need insight into at least three different types of context:
  • linguistic context of a translation unit
  • cultural (situational) context
  • cognitive (ontological) context

• A translator with access to terminological knowledge resources that provide information on these different types of context is likely to produce high-quality translations

Page 26: 3. Information extraction

• On the whole, translation dictionaries and traditional multilingual terminological resources do not provide sufficient information for the translator

• Multilingual terminology management must widen its scope towards knowledge management and representation (Meyer 1992, Dancette 1997, Temmerman 2000, 2003, 2005):
  • providing a cognitive structure in order to improve the understanding of the specialised domain
  • providing extralinguistic / encyclopaedic information in order to improve the understanding of terms and categories in the specialised domain (of source and target language)

Page 27: 3. Information extraction

Dancette, J. & C. Réthoré (2000). Dictionnaire Analytique de la Distribution. Analytical Dictionary of Retailing. Les presses de l’université de Montréal

Users: translators who are to translate from English into French on ‘retailing’

Page 28: Lay-out

Page 29: 3. Information extraction

• Aims:
  • to maximally stimulate the creativity of the translator by offering ontologically enriched information on the subject, in the French language (target language for the translator)
  • to optimise understanding by stimulating the semantic network in the brain of the translator

Page 30: 3. Information extraction

• The dictionary user is introduced to the meaning of the term in several textual modules formulated in French:
  • définition
  • précisions sémantiques
  • relations internotionnelles
  • compléments d’information
  • informations linguistiques
  • contextes
  • exemples

• Cross-references to entries that are covered in another article are printed in bold for French and in small capitals for English. Related terms are printed in bold for French and in italics for English.

Page 31: 3. Information extraction

Example: ‘label’

Définition:
Document d’identification du produit qui lui est apposé ou y est attaché et qui en décrit les caractéristiques (nature, prix, provenance, marque, etc.).

Page 32: 3. Information extraction

Example: ‘label’

Précisions sémantiques:
Depuis les années 1970, l’étiquette comprend généralement un code-barre (BAR CODE). Le code-barre contient des informations telles que la description et le prix du produit, qui seront lues à l’aide d’un lecteur optique (OPTICAL READER).

Page 33: 3. Information extraction

Example: ‘label’

Relations internotionnelles:
Le terme anglais TAG désigne une étiquette que l’on peut facilement enlever, ce qui n’est pas le cas de label.
Ne pas confondre l’anglais LABEL avec son homonyme label, qui a le sens de marque (BRAND), comme dans le terme PRIVATE LABEL (marque de distributeur).

Page 34: 3. Information extraction

Example: ‘label’

Compléments d’information:
Les producteurs ont l’obligation, en vertu de la Loi sur la protection du consommateur (Consumer Protection Act), de répertorier sur l’étiquette tous les ingrédients contenus dans le produit alimentaire.

Page 35: 3. Information extraction

Example: ‘label’

Informations linguistiques:
  • étiqueter: to ticket
  • étiqueteuse: labeler, label machine

Page 36: 3. Information extraction

Example: ‘label’

Contextes:
  • But it wasn’t until 1900 that [he] put the first Polar label on a bottle of cool, naturally purified water taken directly from one of these springs on his property. http://www.water.com/polar/index.html (30-3-99)
  • Dans ce but, la réglementation mise au point par les organismes de la CEE et par l’administration française prévoit sur chaque étiquette la présence d’un certain nombre de mentions obligatoires, en fonction de la catégorie du vin. http://www.vin.champagne.com/etiq.htm (30-3-99)

Page 37: 3. Information extraction

• Challenge: how to arrive at specialised dictionaries offering ontologically-enriched information?

analysis of Knowledge Rich Contexts (Meyer 2001)

Page 38: 3. Information extraction

• ‘Knowledge Rich Contexts’ (Meyer 2001:281):
  • “a context indicating at least one item of domain knowledge that could be useful for conceptual analysis. In other words, the context should indicate at least one conceptual characteristic, whether it be an attribute or relation.”
  • can be used to derive synonyms and translation equivalents

Page 39: 3. Information extraction

KWIC concordancer
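To make the idea of a keyword-in-context (KWIC) concordancer concrete, here is a minimal sketch. It is not the tool shown in the presentation; the function name `kwic`, the `width` parameter and the sample text are assumptions made for illustration. Each occurrence of the keyword is printed with a fixed window of characters to its left and right, so that contexts line up and can be scanned for knowledge-rich patterns.

```python
import re


def kwic(text, keyword, width=30):
    """Return one aligned line per occurrence: left context, [keyword], right context."""
    # Whole-word, case-insensitive match; re.escape keeps the keyword literal.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so every keyword starts in the same column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


if __name__ == "__main__":
    sample = (
        "Compost contains nutrients, nitrogen, potassium and phosphorus. "
        "Compost is perhaps best defined as organic material assembled "
        "for fast decomposition."
    )
    for line in kwic(sample, "compost", width=25):
        print(line)
```

A character window is the simplest choice; a word-based window or sentence-level contexts would be equally valid designs, depending on how the contexts are analysed afterwards.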

Page 40: 3. Information extraction

• Certain contextual markers in KRCs may indicate specific conceptual relations.
  • Compost: a ready-to-use soil enricher that looks and feels like dark, crumbly soil.
  • Compost contains nutrients, nitrogen, potassium and phosphorus.
  • Compost is perhaps best defined as organic material assembled for fast decomposition.
  • Compost, a dark, nutrient-rich soil conditioner, consists of a small amount of soil along with decomposed or partially decomposed plant residues.

Relations indicated (labels in the slide): meronymy, purpose, hyperonymy, attribute
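One way to exploit such contextual markers is a table of patterns, each paired with the relation it tends to signal. The sketch below is an assumption-laden illustration: the marker list, the pairing of markers with relation labels and the function name `tag_relations` are all hypothetical, and a real project would curate markers per domain and per language and check every hit manually.

```python
import re

# Hypothetical marker patterns paired with the relation they are assumed to signal.
MARKERS = [
    (re.compile(r"\b(contains|consists of)\b", re.IGNORECASE), "meronymy"),
    (re.compile(r"\b(is|are)( perhaps)?( best)? defined as\b", re.IGNORECASE),
     "hyperonymy"),
    (re.compile(r"\b(used|assembled|designed) for\b", re.IGNORECASE), "purpose"),
    (re.compile(r"\b(looks|feels) like\b", re.IGNORECASE), "attribute"),
]


def tag_relations(sentence):
    """Return the relation labels whose marker occurs in the sentence, in table order."""
    return [label for pattern, label in MARKERS if pattern.search(sentence)]


if __name__ == "__main__":
    contexts = [
        "Compost contains nutrients, nitrogen, potassium and phosphorus.",
        "Compost is perhaps best defined as organic material assembled "
        "for fast decomposition.",
    ]
    for context in contexts:
        print(tag_relations(context), "<-", context)
```

Pattern matching of this kind only proposes candidate relations; it cannot tell whether a marker is used in its conceptual sense, which is why the results feed into analysis rather than replace it.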

Page 41: 3. Information extraction

• Synonyms and translation equivalents are identified based on a comparison between KRCs:
  • co-occurrence or substitution tests
  • feature analysis

Page 42: 3. Information extraction

FEATURE ANALYSIS

• Superstore
  Assortment:
  • Food: very wide assortment
  • Non-food: very wide assortment (household products, clothing, kitchen utensils, gardening tools, etc.)
  Area:
  • 2300 to 4600 m²

• Hypermarché
  Assortment:
  • Food: very wide assortment
  • Non-food: very wide assortment (household products, clothing, kitchen utensils, gardening tools + electronic appliances, furniture, etc.)
  Area:
  • Up to 24,000 m²

• Supermarché
  Assortment:
  • Food: very wide assortment
  • Non-food: fairly wide assortment (household products, clothing, etc.)
  Area:
  • 400 to 2500 m²
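The comparison behind such a feature analysis can be sketched in a few lines. The feature values below are transcribed from the slide, but the dictionary keys, the function names and the rule that two area ranges "agree" when they overlap are assumptions made for illustration; the lower bound of 0 for the hypermarché is also assumed, since the slide only says "up to". The product lists in brackets are left out of this sketch.

```python
# Feature values come from the slide; key names are illustrative.
FEATURES = {
    "superstore": {"food": "very wide", "non_food": "very wide",
                   "area_m2": (2300, 4600)},
    # Lower bound 0 is an assumption: the slide only gives an upper limit.
    "hypermarché": {"food": "very wide", "non_food": "very wide",
                    "area_m2": (0, 24000)},
    "supermarché": {"food": "very wide", "non_food": "fairly wide",
                    "area_m2": (400, 2500)},
}


def ranges_overlap(a, b):
    """True if the two (low, high) ranges share at least one value."""
    return max(a[0], b[0]) <= min(a[1], b[1])


def compare(term_a, term_b, features=FEATURES):
    """Return a per-feature dict that is True where the two terms agree."""
    fa, fb = features[term_a], features[term_b]
    return {
        "food": fa["food"] == fb["food"],
        "non_food": fa["non_food"] == fb["non_food"],
        "area_m2": ranges_overlap(fa["area_m2"], fb["area_m2"]),
    }


if __name__ == "__main__":
    for candidate in ("hypermarché", "supermarché"):
        print("superstore vs", candidate, compare("superstore", candidate))
```

A per-feature result is deliberately kept instead of a single similarity score, so that the terminographer can see on which characteristic two candidate equivalents diverge.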

Page 43: 3. Information extraction

Category: “an event on which VAT has to be paid”
Domain: VAT law

English-UK: chargeable event
VAT will be due on the date the invoice is issued

English-Ireland: chargeable event
VAT is due no later than the 15th day of the month following the month in which the supply takes place

French: fait générateur
VAT is due at the moment the goods are supplied

Page 44: Summary

• Steps terminographers have to go through in order to develop specialised dictionaries for translators:
  • Requirements of translators (knowledge about the linguistic, situational and cognitive contexts)

• Problems discussed:
  • Representativeness of the corpus
  • Term identification (categorisation frameworks?)
  • Terminology structuring (variation)

Page 45: Other problems?

4. Analysis and synthesis (definitions)

5. Encoding (précisions sémantiques vs. relations internotionnelles vs. compléments d’information)

6. Organisation (tree structure, hyperlinks, ‘traditional’ term records)

7. Management (dictionary up-to-date?)

Page 46: Part 2: “Towards ‘intelligent’ dictionaries for translators”

Page 47: Purpose

• Which information sources do translators use during the translation of a given text sample?

• How do we arrive at ‘intelligent’ dictionaries?
  • Possibilities?
  • Technology?

Page 48: Translation sample

• Egypt was in the best position to develop a great civilization. Each year the "Gift of the Nile" would be a flood brought on by the monsoon. These floods brought only a thin layer of silt, dropped on the banks, from both a jungle area and also a mountainous area. The White Nile brought highly mineralized silt which would be eroded from Abyssinian Alps 1500 miles inland in Central Africa. The silt from the Blue Nile was heavy with humus from the jungle and swampy sources. Not only did the flood bring silt, the soil would be soft and easy to plow. They would plant and harvest in early spring and then allow the fields to lay until July when the floods would come again. Based on: http://historylink101.com/lessons/farm-city/egypt1.htm

Resources?

Page 49: Some resources

Resources

Explanatory dictionaries

Translation dictionaries

Specialised dictionaries

Translation engines

Translation forums

Encyclopedia

Picture dictionaries

Combinatory dictionaries

Synonym dictionaries


Page 51: Translation dictionaries

Page 52: Translation dictionaries

http://www.wordreference.com/

http://www.ectaco.com/English-Multilanguage-Dictionary/

http://www.allwords.com/

http://www1.cs.columbia.edu/~radev/dictionary/

http://www.foreignword.com/Tools/dictsrch.htm

http://www.langtolang.com/

http://users.otenet.gr/~vamvakos/multilingual.htm

http://www.tritrans.net/…

Page 53: Translation dictionaries

http://www.freedict.com/onldict/onldict.php

http://www.tritrans.info/


Page 55: Explanatory dictionaries

Ref.: http://dictionary.cambridge.org/define.asp?key=51745&dict=CALD

Page 56: Contexts

"While the computer linked these reduced northern hemisphere temperatures to Laki, it also connected the dots to a weak monsoon – the seasonal winds that bring the annual rains to southern Asia and northern Africa. The unusual cold in the North lessened the temperature contrast between the land and the oceans, upon which the monsoon winds rely for their development and strength. "

http://www.physorg.com/news83338494.html

Page 57: Contexts

"It may be that only with the additional water provided as a result of the intensifying monsoon that the upstream Nile was able to erode its way through the Nubian Swell and continue north to the Mediterranean Sea. "

http://www.utdallas.edu/geosciences/remsens/Nile/geology.html


Page 59: Encyclopedia

http://en.wikipedia.org/wiki/Monsoon

http://no.wikipedia.org/wiki/Monsun

Example:

Pages 60 to 62: Encyclopedia

Page 63: Some resources

Page 64: Translation forums

http://www.oasisllc.com/transtrad/forum.htm

http://www.translatorsbase.com/Forum/Forums/

http://www.english-spanish-translator.org/translation-issues/

http://disc.server.com/Indices/6657.html

http://members3.boardhost.com/translate2/

http://www.all-translations.com/forum/index.php

http://www.foreignword.com/Forum/default.asp

http://tech.groups.yahoo.com/group/sptranslators/

Page 65: Some resources

Page 66: Translation engines

• Sample: Egypt was in the best position to develop a great civilization. Each year the "Gift of the Nile" would be a flood brought on by the monsoon. These floods brought only a thin layer of silt, dropped on the banks, from both a jungle area and also a mountainous area.

http://www.freetranslation.com/

Page 67: Some resources

Page 68: Specialised dictionaries

• Sample: Egypt was in the best position to develop a great civilization. Each year the "Gift of the Nile" would be a flood brought on by the monsoon. These floods brought only a thin layer of silt, dropped on the banks, from both a jungle area and also a mountainous area.

Page 69: Specialised dictionaries

• Sample: Egypt was in the best position to develop a great civilization. Each year the "Gift of the Nile" would be a flood brought on by the monsoon. These floods brought only a thin layer of silt, dropped on the banks, from both a jungle area and also a mountainous area.

GEOGRAPHY

HYDROLOGY

Page 70: Specialised dictionaries

• Monsoon: A regional scale wind system that predictably change direction with the passing of the seasons. Monsoon winds blow from land to sea in the winter, and from sea to land in the summer. Summer monsoons are often accompanied with precipitation.

• Flood: Inundation of a land surface that is not normally submerged by water from quick change in the level of a water body like a lake, stream, or ocean.

• Silt: Mineral particle with a size between 0.004 and 0.06 millimeters in diameter. Also see clay and sand.

http://www.physicalgeography.net/physgeoglos/

Page 71: Conclusion first part

• Translation is a complex process. Translators need to consider both intratextual and extratextual factors.

• Many freely available resources on the Internet can be used by translators for their own translation projects.

• Disadvantage: these resources need to be consulted one by one.

• Consequence: there is a need for a more ‘intelligent dictionary’.

Page 72: The intelligent dictionary

… requirements:

• It should be possible to combine results from existing electronic resources and to present the relevant information to translators.

• The dictionary should be context-sensitive. Translation segments should be automatically linked to information in the available knowledge resources.

• (The dictionary may be able to suggest a translation, based on ‘intelligent reasoning’.)
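The first two requirements can be illustrated with a minimal Python sketch. It assumes each resource can be reduced to a simple lookup function; the resource names, the `make_lookup` helper and the shortened entries are hypothetical stand-ins, not an existing tool.

```python
from typing import Callable, Dict, Optional

Lookup = Callable[[str], Optional[str]]

def make_lookup(entries: Dict[str, str]) -> Lookup:
    """Wrap a plain dict as a case-insensitive lookup function."""
    def lookup(term: str) -> Optional[str]:
        return entries.get(term.lower())
    return lookup

# Hypothetical resources; a real system would query online dictionaries instead.
RESOURCES: Dict[str, Lookup] = {
    "specialised dictionary": make_lookup({
        "monsoon": "regional scale wind system that changes direction with the seasons",
        "silt": "mineral particle between 0.004 and 0.06 mm in diameter",
    }),
    "encyclopedia": make_lookup({
        "monsoon": "http://en.wikipedia.org/wiki/Monsoon",
    }),
}

def consult_all(term: str) -> Dict[str, str]:
    """Query every resource once and keep only the hits."""
    results = {}
    for name, lookup in RESOURCES.items():
        hit = lookup(term)
        if hit is not None:
            results[name] = hit
    return results

def annotate(segment: str) -> Dict[str, Dict[str, str]]:
    """Link each word of a translation segment to the resources that know it."""
    annotated = {}
    for raw in segment.split():
        word = raw.strip('.,"').lower()
        hits = consult_all(word)
        if hits:
            annotated[word] = hits
    return annotated
```

Calling `annotate` on a translation segment returns, for every word found in at least one resource, the combined information from all resources, so the translator no longer has to consult them one by one.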

Page 73: The intelligent dictionary

• Each year the "Gift of the Nile" would be a flood brought on by the monsoon.

Page 74: 3. Information extraction

Example: IATE (Inter-Active Terminology for Europe)

Page 75: 3. Information extraction

Page 76: The Semantic Web

• “[…] an extension of the current web in which information is given well-defined meaning”

(Tim Berners-Lee, James Hendler, Ora Lassila (2001). “The Semantic Web”. Scientific American)

• “[…] provides a common framework that allows [smart] data to be shared and reused […]”

(W3C, http://www.w3.org/2001/sw)

Page 77: Opportunities of a “semantic web”

• For instance:

Improvement of information retrieval: e.g. the query ‘general interest in MT’ will no longer lead to websites of MT companies that present their products

Software agents will be able to detect the pieces of information they need to make hotel bookings via the Semantic Web

Question-answering machines will be able to formulate better answers on the basis of the user’s question (cf. http://www.answerbus.com)

Page 78: An example…

Page 79: The Semantic Web

Page 80: The content of webpages?

XML:

-…a beautiful hotel in <European country>France</European country>…

-…each room has a <minibar>AMS</minibar>, a <television>Philips</television>,…

-…for one night, you pay <price> € 50</price>.

XML:

-…a beautiful hotel in <country>France</country>…

-…each room has a <minibar>AMS</minibar>, a <TV>Philips</TV>,…

-…for one night, you pay <cost> € 50</cost>.

Resource Description Framework (RDF):

A standard framework for describing resources and the relations between them, which can give tags such as those above a shared meaning. With RDF it becomes clear that in the two examples above the following tags carry the same meaning:

“European country” and “country”

“minibar” and “minibar”

“television” and “TV”

“price” and “cost”
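The idea of tags that "carry the same meaning" can be sketched in Python. This is not RDF itself: a plain dictionary (`TAG_TO_CONCEPT`, a hypothetical name) stands in for the shared vocabulary, and the tag `European_country` is written with an underscore because real XML names cannot contain spaces.

```python
import xml.etree.ElementTree as ET

# Two hypothetical hotel pages that mark up the same facts with different tags.
SITE_A = (
    "<hotel><European_country>France</European_country>"
    "<minibar>AMS</minibar><television>Philips</television>"
    "<price> € 50</price></hotel>"
)
SITE_B = (
    "<hotel><country>France</country>"
    "<minibar>AMS</minibar><TV>Philips</TV>"
    "<cost> € 50</cost></hotel>"
)

# Assumed shared vocabulary: every site-specific tag points to one concept.
TAG_TO_CONCEPT = {
    "European_country": "country", "country": "country",
    "minibar": "minibar",
    "television": "television", "TV": "television",
    "price": "price", "cost": "price",
}

def normalise(xml_text: str) -> dict:
    """Read one page and index its values by shared concept instead of by tag."""
    root = ET.fromstring(xml_text)
    return {TAG_TO_CONCEPT[child.tag]: (child.text or "").strip() for child in root}
```

Once both pages are normalised, a software agent can compare them directly: `normalise(SITE_A)` and `normalise(SITE_B)` yield the same dictionary although the original tags differ.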

Page 81: The Semantic Web

• Dream or reality?

2000: the development took off in the US with the DARPA Agent Markup Language (DAML)

2001: In Europe, researchers set up the ‘Ontoweb thematic network’ in order to federate the research activities

Research activities for building the Semantic Web were central to the ‘knowledge technologies’ area of the EU 6th framework programme

In fact, research activities can be found world-wide

Page 82: The intelligent dictionary

Page 83: An example…

• http://www.inreallife.be/Articles/BELbruxEuropeenne.php

Page 84: Context-sensitivity

Egypt was in the best position to develop a great civilization. Each year the "Gift of the Nile" would be a flood brought on by the monsoon. These floods brought only a thin layer of silt, dropped on the banks, from both a jungle area and also a mountainous area.

Financiële instelling (‘financial institution’)

Reserve (‘reserve’)

Oever (‘river edge’)

Rij (‘row’)

Context
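How the context can select one of the four Dutch candidates for ‘bank’ can be sketched as a simple word-overlap count. The cue words per candidate are hypothetical and hand-written here; a real system would derive them from the available knowledge resources.

```python
# Hypothetical cue words for each Dutch candidate translation of 'bank'.
CANDIDATES = {
    "financiële instelling": {"money", "loan", "account", "interest"},
    "reserve": {"blood", "data", "stock", "supply"},
    "oever": {"river", "nile", "flood", "silt", "water"},
    "rij": {"row", "switches", "seats", "line"},
}

def tokens(text: str) -> set:
    """Lower-case the words of a text and strip surrounding punctuation."""
    return {word.strip('.,"\'').lower() for word in text.split()}

def choose_translation(context: str) -> str:
    """Pick the candidate whose cue words overlap most with the context."""
    words = tokens(context)
    scores = {cand: len(cues & words) for cand, cues in CANDIDATES.items()}
    return max(scores, key=scores.get)

SAMPLE = (
    'Egypt was in the best position to develop a great civilization. '
    'Each year the "Gift of the Nile" would be a flood brought on by the monsoon. '
    'These floods brought only a thin layer of silt, dropped on the banks, '
    'from both a jungle area and also a mountainous area.'
)
```

For the sample text, the words ‘Nile’, ‘flood’ and ‘silt’ match the cues of ‘oever’, so `choose_translation(SAMPLE)` selects the river-edge sense. Ties are resolved in favour of the first candidate, which is a simplification.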

Page 85: Schematically visualised

[Diagram: ‘word_A’ in the text activates concept_A, which links to ‘word_A’ in Resource A, ‘word_B’ in Resource B and ‘word_C’ in Resource C]

Problem: Is the right concept activated?

Page 86: Technology?

• Ontology: A formal knowledge repository of concepts and relations with the possibility to derive new facts from given knowledge

Example:

Nile is instance of river (river has instance Nile)

river has property bank (bank is property of river)
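The phrase "derive new facts from given knowledge" can be made concrete with a small Python sketch over the Nile example. The two rules (inverse relations, and instances inheriting the properties of their concept) are assumptions chosen for illustration; real ontology languages define such reasoning far more precisely.

```python
# Given knowledge, stored as (subject, relation, object) triples.
FACTS = {
    ("Nile", "is instance of", "river"),
    ("river", "has property", "bank"),
}

# Each relation and the inverse relation it implies.
INVERSE = {
    "is instance of": "has instance",
    "has property": "is property of",
}

def derive(facts: set) -> set:
    """Return the given facts plus those derived in one pass of both rules."""
    derived = set(facts)
    # Rule 1: every relation implies its inverse.
    for subj, rel, obj in facts:
        if rel in INVERSE:
            derived.add((obj, INVERSE[rel], subj))
    # Rule 2 (assumed): an instance inherits the properties of its concept.
    for subj, rel, obj in facts:
        if rel == "is instance of":
            for subj2, rel2, obj2 in facts:
                if subj2 == obj and rel2 == "has property":
                    derived.add((subj, "has property", obj2))
    return derived
```

From the two given facts, `derive(FACTS)` adds three new ones: river has instance Nile, bank is property of river, and Nile has property bank. The last fact is what allows ‘bank’ in a text about the Nile to be read as a river bank.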

Page 87: Technology?

Nile is instance of river (river has instance Nile)

river has property bank (bank is property of river)

Terms linked to the concepts:

Nile: Nijl (Nl), Nile (En)

river: rivier (Nl), river (En), rivière (Fr)

bank: oever (Nl), bank (En)

Egypt was in the best position to develop a great civilization. Each year the "Gift of the Nile" would be a flood brought on by the monsoon. These floods brought only a thin layer of silt, dropped on the banks, from both a jungle area and also a mountainous area.
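Linking terms in several languages to one concept can be sketched as follows. The `LABELS` table copies the terms from the slide; the function name and the language codes are illustrative assumptions.

```python
from typing import Optional

# Concept -> language code -> term, as shown on the slide.
LABELS = {
    "Nile": {"nl": "Nijl", "en": "Nile"},
    "river": {"nl": "rivier", "en": "river", "fr": "rivière"},
    "bank": {"nl": "oever", "en": "bank"},
}

def translate(term: str, source: str, target: str) -> Optional[str]:
    """Go from a source-language term to its concept, then to the target term."""
    for labels in LABELS.values():
        if labels.get(source, "").lower() == term.lower():
            return labels.get(target)
    return None
```

Because the concept ‘bank’ in this small ontology is the river bank, `translate("bank", "en", "nl")` gives ‘oever’. A missing term, such as a French term for ‘bank’, yields `None`.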

Page 88: Conclusion

• Thanks to the new technology, dictionaries will become much more intelligent:

Context-sensitive search possibilities

Flexibility (customisation according to user profiles)

Dynamic (management in time)

Interactive (self-learning)

• Dictionaries could become intelligent translation engines (that consider the context in which a translation segment occurs)

Page 89: To finish…

An ontology-based application: http://www.20q.net/index.html