TMF - a tutorial Part 3: Designing (schemas and) filters TMF - Terminological Markup Framework Laurent Romary - Laboratoire Loria


Post on 26-Mar-2015


TMF - a tutorial

Part 3: Designing (schemas and) filters

TMF - Terminological Markup Framework

Laurent Romary - Laboratoire Loria


General principles

Terminological information interchange
– Three components:
  • Source TDB1
  • Target TDB2
  • Terminological interchange format
    – A specific TML (DXLT, Geneter)

[Figure: TDB1 and TDB2, linked through the TML]


Important notice

– GMT is not a TML
  • Too abstract a format
    – Uncontrolled recursion (‘struct’ element)
    – Uncontrolled content (‘feat’ and ‘annot’)
  • A schema is needed to check interchanged data
    – Precise list of data categories
    – Precise definition of the format
– GMT is there to provide conceptual simplicity


Designing filters

TML to GMT


General principles

Just for your information
– The creation of the filters can be automated

Basic processes
– Reduction of expansion trees
– Mapping of elements and attributes to the corresponding data categories


Reducing expansion trees

Example
• DXLT (Martif) sub-tree

<ntig>
  <!-- some general information associated with the term -->
  <termGrp>
    <!-- term related information -->
  </termGrp>
</ntig>

• GMT

<struct type="TS">
  <!-- some features -->
</struct>
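The reduction above could be written as follows. This is a minimal sketch assuming XSLT 1.0 and an input in no namespace; the template layout is an illustration, not the tutorial's own filter. The `ntig` element produces the single `TS` struct, and `termGrp` contributes no level of its own.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The two-level DXLT sub-tree collapses into one TS struct -->
  <xsl:template match="ntig">
    <struct type="TS">
      <xsl:apply-templates/>
    </struct>
  </xsl:template>

  <!-- termGrp is flattened: its children are processed in place,
       so they end up directly inside the TS struct -->
  <xsl:template match="termGrp">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

Elements without a template of their own fall through to XSLT's built-in rules, which only copy their text, so a real filter would add one mapping rule per data category.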


Element mapping

Example
• DXLT (Martif)

<definition>Bla, bla, bla etc.</definition>

• GMT

<feat type="definition">Bla, bla, bla etc.</feat>


Structural elements

Generating a GMT ‘struct’ element

<xsl:template match="termEntry">
  <xsl:element name="struct">
    <xsl:attribute name="type">TE</xsl:attribute>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:element>
</xsl:template>


Features

Generating a GMT ‘feat’ element (style=Attribute)

<xsl:template match="@id">
  <xsl:element name="feat">
    <xsl:attribute name="type">iso12620-identifier</xsl:attribute>
    <xsl:value-of select="."/>
  </xsl:element>
</xsl:template>


Features

Generating a GMT ‘feat’ element (style=Element)

<xsl:template match="term">
  <xsl:element name="feat">
    <xsl:attribute name="type">iso12620-term</xsl:attribute>
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>


Features

Generating a GMT ‘feat’ element (style=TypedElement)

<xsl:template match="descrip[@type='subjectField']">
  <xsl:element name="feat">
    <xsl:attribute name="type">SubjectField</xsl:attribute>
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>
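The templates from the four preceding slides can be assembled into one stylesheet. The sketch below assumes XSLT 1.0 and a DXLT-like input in no namespace. The `xsl:strip-space` and `xsl:output` settings are additions for readable output, literal result elements replace `xsl:element` (the result is the same), and the last template assumes that the TypedElement style produces a `feat` element like the other two styles.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: drop whitespace-only text nodes so that source
       indentation does not leak into the GMT output -->
  <xsl:strip-space elements="*"/>
  <xsl:output method="xml" indent="yes"/>

  <!-- Structural element: termEntry becomes a TE struct -->
  <xsl:template match="termEntry">
    <struct type="TE">
      <xsl:apply-templates select="@*|node()"/>
    </struct>
  </xsl:template>

  <!-- style=Attribute: the attribute value becomes the feature value -->
  <xsl:template match="@id">
    <feat type="iso12620-identifier">
      <xsl:value-of select="."/>
    </feat>
  </xsl:template>

  <!-- style=Element: the element content becomes the feature value -->
  <xsl:template match="term">
    <feat type="iso12620-term">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>

  <!-- style=TypedElement: the type attribute selects the data category -->
  <xsl:template match="descrip[@type='subjectField']">
    <feat type="SubjectField">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a hypothetical input such as `<termEntry id="e1"><descrip type="subjectField">agriculture</descrip><term>key money</term></termEntry>`, the result would be one `TE` struct with three `feat` children. Attributes other than `id` on `termEntry` have no rule here, so the built-in rule would copy their values as plain text until they are mapped.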


XML Schemas for TMLs

…work ahead…


Analysing existing TDBs

Towards a generic methodology


General Architecture

[Figure: TDB → (simple DB dumper) → Flat XML → (format-specific XSL stylesheet) → GMT → (automatic GMT2TML stylesheet) → TML]


A two-phase process

List the various data categories used in the TDB
– Relate them to existing registries (e.g. ISO 12620), cf. http://salt.loria.fr/public/salt/DCQuery.html

Identify the underlying organization of the TDB
– Relate it to the meta-model
– Anchor the data categories where they actually occur
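For the first phase, one possible starting point is a small stylesheet that inventories the element names occurring in the flat XML dump. This is an assumption for illustration, not part of the tutorial's toolchain: it uses XSLT 1.0 with key-based (Muenchian) grouping, and the key name `by-name` is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Index every element of the dump by its name -->
  <xsl:key name="by-name" match="*" use="name()"/>

  <xsl:template match="/">
    <!-- Keep only the first element of each name group -->
    <xsl:for-each
        select="//*[generate-id() = generate-id(key('by-name', name())[1])]">
      <xsl:sort select="name()"/>
      <!-- One report line per name: "name: number of occurrences" -->
      <xsl:value-of select="name()"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="count(key('by-name', name()))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run on the Eurodicautom sample entry used later in this tutorial, the report would include lines such as `CM: 2` and `RF: 3`. Each reported name is a candidate data category to look up in the registry. Attributes are not inventoried by this sketch and would need a similar pass over `//@*`.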


Analysis of an existing TDB

Going through an example


Eurodicautom sample

<entry>
  <BE>BTB</BE>
  <TY>DAG77</TY>
  <NI>398</NI>
  <CF>3</CF>
  <CM>AG1</CM>
  <CM>JUA</CM>
  <EN>
    <VE>key money</VE>
    <RF>CILF,Dict.Agriculture,ACCT,1977</RF>
  </EN>
  <FR>
    <VE>pas-de-porte</VE>
    <DF>prix payé au précédent occupant pour le droit d'entrer dans une exploitation agricole</DF>
    <RF target="DF">TNC(1997)</RF>
    <RF>CILF,Dict.Agriculture,ACCT,1977</RF>
    <NT type="NTE">droit rural;pratique prohibée par la loi</NT>
  </FR>
</entry>

Callouts on the sample:
• definition-12620A.5.1 (TS)
• term-12620A.1 (TS)
• language-12620A.10.7 (LS)
• note-12620A.8 (TS)
• classificationCode-12620A.4.2 (TE)


Result in GMT (1/2)

<tmf>
  <struct type="TE">
    <feat type="entryIdentifier-12620A.10.15">BTB-TY-398</feat>
    <feat type="originatingInstitution-12620A.10.22.2">BTB</feat>
    <feat type="projectSubset">DAG77</feat>
    <feat type="NI">398</feat>
    <feat type="reliabilityCode">3</feat>
    <feat type="classificationCode-12620A.4.2">AG1</feat>
    <feat type="classificationCode-12620A.4.2">JUA</feat>
    <struct type="LS">
      <feat type="language-12620A.10.7">EN</feat>
      <struct type="TS">
        <feat type="term-12620A.1">key money</feat>
      </struct>
      <feat type="sourceIdentifier-12620A.10.20">CILF,Dict.Agriculture,ACCT,1977</feat>
    </struct>


Result in GMT (2/2)

    <struct type="LS">
      <feat type="language-12620A.10.7">fr</feat>
      <struct type="TS">
        <feat type="term-12620A.1">pas-de-porte</feat>
      </struct>
      <brack>
        <feat type="definition-12620A.5.1">prix payé au précédent occupant pour le droit d'entrer dans une exploitation agricole</feat>
        <feat type="sourceIdentifier-12620A.10.20">TNC(1997)</feat>
      </brack>
      <feat type="sourceIdentifier-12620A.10.20">CILF,Dict.Agriculture,ACCT,1977</feat>
      <feat type="note-12620A.8">droit rural;pratique prohibée par la loi</feat>
    </struct>
  </struct>
</tmf>


Simple rules

Using XSL locality

<xsl:template match="CM">
  <feat type="classificationCode-12620A.4.2">
    <xsl:apply-templates/>
  </feat>
</xsl:template>
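The same one-element, one-feature pattern extends to other leaf elements of the sample. The sketch below assumes XSLT 1.0. The pairing of `NT`, `CF` and `TY` with the categories `note-12620A.8`, `reliabilityCode` and `projectSubset` is read off the values in the sample and in the GMT result; the tutorial does not state these as rules.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Rule from the slide -->
  <xsl:template match="CM">
    <feat type="classificationCode-12620A.4.2">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>

  <!-- Assumed pairing: NT carries the note; its type attribute is not
       selected by apply-templates and is therefore dropped -->
  <xsl:template match="NT">
    <feat type="note-12620A.8">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>

  <!-- Assumed pairing: CF carries the reliability code -->
  <xsl:template match="CF">
    <feat type="reliabilityCode">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>

  <!-- Assumed pairing: TY carries the project subset -->
  <xsl:template match="TY">
    <feat type="projectSubset">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>

</xsl:stylesheet>
```

Each rule depends only on the element it matches, which is what makes such filters easy to write one data category at a time.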


Introducing specific levels

Necessity to combine structure and content

<xsl:template match="VE">
  <struct type="TS">
    <feat type="term-12620A.1">
      <xsl:apply-templates/>
    </feat>
  </struct>
</xsl:template>
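The language level can be introduced in the same way. This sketch assumes XSLT 1.0 and assumes that the name of the language container (`EN`, `FR`) doubles as the language value, which is how the English section appears in the GMT result.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: each language container opens an LS level, and its
       element name is emitted as the language feature -->
  <xsl:template match="EN | FR">
    <struct type="LS">
      <feat type="language-12620A.10.7">
        <xsl:value-of select="name()"/>
      </feat>
      <!-- The children (VE, RF, DF, NT) are handled by their own rules -->
      <xsl:apply-templates/>
    </struct>
  </xsl:template>

  <!-- Rule from the slide: VE introduces the TS level and the term -->
  <xsl:template match="VE">
    <struct type="TS">
      <feat type="term-12620A.1">
        <xsl:apply-templates/>
      </feat>
    </struct>
  </xsl:template>

</xsl:stylesheet>
```

Children without a rule of their own, such as `RF` here, would only have their text copied by the built-in rules, so a complete filter needs a rule for each of them.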


Default rule

Useful for keeping track of unmapped data categories

<xsl:template match="*">
  <feat>
    <xsl:attribute name="type">
      <xsl:value-of select="name()"/>
    </xsl:attribute>
    <xsl:apply-templates/>
  </feat>
</xsl:template>
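The default rule coexists with the specific rules because XSLT 1.0 gives a bare `*` pattern a lower default priority (-0.5) than a named element pattern such as `CM` (0). The sketch below combines the two. The `entry` rule is an assumption based on the `TE` struct in the GMT result, and the attribute value template `{name()}` is a shorthand for the `xsl:attribute` form.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: drop whitespace-only text nodes from the source -->
  <xsl:strip-space elements="*"/>

  <!-- Assumption: the entry element maps to the TE level -->
  <xsl:template match="entry">
    <struct type="TE">
      <xsl:apply-templates/>
    </struct>
  </xsl:template>

  <!-- Specific rule, default priority 0: wins over the default rule -->
  <xsl:template match="CM">
    <feat type="classificationCode-12620A.4.2">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>

  <!-- Default rule, default priority -0.5: fires only for elements with
       no specific rule and records the source element name as the type -->
  <xsl:template match="*">
    <feat type="{name()}">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>

</xsl:stylesheet>
```

With these rules, `<NI>398</NI>` from the sample becomes `<feat type="NI">398</feat>`, which flags `NI` as a data category still to be mapped. Container elements such as `EN` also hit the default rule until they receive a rule of their own.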


Useful pointers

TMF page:
– http://www.loria.fr/projets/TMF

HLT/Salt project page:
– http://www.loria.fr/projets/SALT

Data category query tool:
– http://salt.loria.fr/public/salt/DCQuery.html