ISO/TC37/SC4/TDG6 Language Resource Ontologies
2008-09-27, Pisa
HASIDA Koiti
[email protected] (aist.go.jp)
CfSR, AIST, Japan
Ontologization
- reformulation in terms of ontology
- provide standard way to convert annotations to labeled directed graphs
- DCR, LAF, LMF, FS, MAF, SemAF, SynAF, MLIF, etc.
- Cf. LMF and MAF have UML-based schemas.
- not XML but RDF as base description and modeling tool
- standard semantic interpretation for RDF
- highlight semantics rather than syntax
2
Purposes of Ontologization
- interoperability
  - among ISO/TC37 standards
  - with ontologies from elsewhere
  - with any data containing linguistic content
  - RDF data are easier to integrate than XML data.
  - e.g. external annotation of texts in SMIL data without including linguistic description in the SMIL specification
- fuller formalization of IS specifications
- semantic extension of DCR
3
Semantic Extension of DCR
- sorts of DCs
  - unary predicate → class
  - binary relation → property
  - symmetric binary relation, etc.
- types of the domain (1st arg.) and the range (2nd arg.) of binary relations (properties)
4
XML Mess
- Semantic interpretation of XML is not standardized but defined ad hoc.
- Many inconsistent 'standards' on overlapping issues.
- Huge standards containing many different manners of semantic interpretation.
  - e.g., MPEG-7 > 2000 pages
5
RDF (Resource Description Framework)
- labeled directed graph
- W3C recommendation
  - http://www.w3.org/RDF/
- Schemas are provided by RDFS, OWL, etc.
- textual representation
  - XML, N3, etc.
6
RDF Graph
[Figure: an RDF graph. Node labels: http://www.example.org/people#fred, http://meetings.example.com/cal#m1, http://meetings.example.com/m1/hp, the literal "Fred", and mailto:[email protected]. Edge labels: m:attending, m:homePage, m:givenName, m:hasEmail.]
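The graph on this slide can be written down as a list of triples. The following is a minimal sketch, not a reconstruction of the slide: the edge directions are assumed (the transcript keeps only the labels), the expansion of the `m:` prefix is a hypothetical namespace, and the `m:hasEmail` edge is left out because its address is obscured in the transcript.

```python
# Hypothetical expansion of the "m:" prefix; the real namespace is not given on the slide.
M = "http://example.org/meeting-vocab#"

FRED = "http://www.example.org/people#fred"
MEETING = "http://meetings.example.com/cal#m1"
HOMEPAGE = "http://meetings.example.com/m1/hp"

# One (subject, predicate, object) triple per edge; the edge directions are assumed.
triples = [
    (FRED, M + "attending", ("iri", MEETING)),
    (FRED, M + "givenName", ("literal", "Fred")),
    (MEETING, M + "homePage", ("iri", HOMEPAGE)),
]


def term(kind, value):
    """Render an object term: an IRI in angle brackets or a quoted plain literal."""
    if kind == "iri":
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize the triples one per line in N-Triples style."""
    return "\n".join(f"<{s}> <{p}> {term(*o)} ." for s, p, o in triples)


print(to_ntriples(triples))
```

Each printed line is one labeled edge of the graph, which is what makes RDF data from different sources easy to concatenate.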
7
Conversion of XML to RDF
- AnyURI- and IDREF(S)-type attribute → object property (link)
- other attribute → datatype property
- embedded element → object/datatype property
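The three mapping rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a standard conversion: without a schema the anyURI/IDREF(S)-typed attributes are simply listed by name in the hypothetical `LINK_ATTRS` set, the sample document and its element names are invented, and a child element counts as a datatype property only when it has no attributes and no children.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical: attribute names treated as anyURI/IDREF(S)-typed (no schema is consulted).
LINK_ATTRS = {"ref", "target", "href"}


def node_ref(token):
    """A bare IDREF such as "t1" points at the element with id="t1"; URIs pass through."""
    return token if (":" in token or token.startswith("#")) else "#" + token


def xml_to_triples(elem, triples, counter):
    """Append triples for elem and its descendants; return the node name of elem."""
    node = "#" + elem.get("id") if elem.get("id") else f"_:b{next(counter)}"
    for name, value in elem.attrib.items():
        if name == "id":
            continue
        if name in LINK_ATTRS:
            # Rule 1: link-typed attribute -> object property (one edge per IDREFS token).
            for token in value.split():
                triples.append((node, name, ("node", node_ref(token))))
        else:
            # Rule 2: any other attribute -> datatype property.
            triples.append((node, name, ("literal", value)))
    for child in elem:
        if len(child) == 0 and not child.attrib:
            # Rule 3a: text-only child -> datatype property.
            triples.append((node, child.tag, ("literal", (child.text or "").strip())))
        else:
            # Rule 3b: structured child -> object property to a new node.
            child_node = xml_to_triples(child, triples, counter)
            triples.append((node, child.tag, ("node", child_node)))
    return node


# Invented sample annotation, for illustration only.
doc = ET.fromstring(
    '<annotation id="a1" target="t1 t2" pos="noun">'
    "<lemma>pomme</lemma>"
    '<agr number="singular"/>'
    "</annotation>"
)
triples = []
root = xml_to_triples(doc, triples, itertools.count(1))
for t in triples:
    print(t)
```

Elements without an `id` become blank nodes (`_:b1`, ...), so the result is a labeled directed graph even when the XML gives no names to its inner elements.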
8
24610: Feature Structure
- typed feature structure as in HPSG, etc.
- ISO 24610-1: Feature Structure Representation
- ISO 24610-2: Feature System Declaration
- labeled directed graph
- AVM (attribute-value matrix)
- textual encoding by XML
9
FS Graph = RDF Graph
[Figure: a feature structure for "la pomme" drawn as a labeled directed graph. Edge labels: SPECIFIER, HEAD, POS, ORTH, AGR, NUMBER. Value labels: determiner, noun, la, pomme, singular.]
10
FS in AVM
[ SPECIFIER [ POS  determiner
              ORTH 'la'
              AGR  [1] [ NUMBER singular ] ]
  HEAD      [ POS  noun
              ORTH 'pomme'
              AGR  [1] ] ]
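The AVM above and the graph view describe the same object. As a minimal sketch, the feature structure can be held as nested Python dicts and flattened into (node, feature, value) triples; the function name and the `_:n` node names are illustrative choices, and Python object identity is used here to stand in for the tag [1].

```python
import itertools


def fs_to_triples(fs):
    """Flatten a nested-dict feature structure into (node, feature, value) triples.

    A dict reached twice (structure sharing, like the tag [1]) gets one node name.
    """
    triples, names, counter = [], {}, itertools.count(0)

    def visit(node):
        if id(node) in names:  # already visited: reuse the same graph node
            return names[id(node)]
        name = names[id(node)] = f"_:n{next(counter)}"
        for feature, value in node.items():
            target = visit(value) if isinstance(value, dict) else value
            triples.append((name, feature, target))
        return name

    return visit(fs), triples


agr = {"NUMBER": "singular"}  # the value tagged [1], shared by SPECIFIER and HEAD
la_pomme = {
    "SPECIFIER": {"POS": "determiner", "ORTH": "la", "AGR": agr},
    "HEAD": {"POS": "noun", "ORTH": "pomme", "AGR": agr},
}

root, triples = fs_to_triples(la_pomme)
for t in triples:
    print(t)
```

Both AGR edges end at the same node, which is exactly what the reentrancy tag expresses in the matrix notation.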
11
Ontologies Subsume Feature Systems
- Features are partial functions, whereas RDF properties are relations in general (possibly partial functions).
- Usual feature systems have no taxonomy of features, whereas usual ontologies have taxonomies of properties (e.g., due to rdfs:subPropertyOf).
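Both points can be made concrete over a set of triples. The sketch below is only illustrative: the property names (`orth`, `form`, `relatedTo`) and the data are hypothetical, the sub-property map is assumed to be acyclic, and only the rdfs:subPropertyOf-style entailment is modeled.

```python
from collections import defaultdict


def is_functional(triples, prop):
    """True if prop maps each subject to at most one value (a partial function)."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            values[s].add(o)
    return all(len(v) <= 1 for v in values.values())


def with_superproperties(triples, sub_property_of):
    """Add the triples entailed by a sub-property hierarchy (assumed acyclic)."""
    inferred = set(triples)
    for s, p, o in triples:
        while p in sub_property_of:
            p = sub_property_of[p]
            inferred.add((s, p, o))
    return inferred


# Hypothetical data: "orth" behaves like a feature, "relatedTo" like a general relation.
triples = [
    ("w1", "orth", "pomme"),
    ("w1", "relatedTo", "w2"),
    ("w1", "relatedTo", "w3"),
]
sub_property_of = {"orth": "form"}  # hypothetical taxonomy: orth is a kind of form

print(is_functional(triples, "orth"))
print(is_functional(triples, "relatedTo"))
print(sorted(with_superproperties(triples, sub_property_of)))
```

A feature system only needs the first function; the second has no counterpart there, which is the sense in which ontologies subsume feature systems.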
12
Feature-System Declaration

<fsDecl type="word" baseTypes="sign">
  <fsDescr>The fundamental type for individual words</fsDescr>
  <fDecl name="orth">
    <fDescr>The orthographic representation for this word</fDescr>
    <vRange><string/></vRange>
  </fDecl>
</fsDecl>

[Figure: the declaration as an RDF graph. Node labels: word, sign, orth, string, owl:FunctionalProperty, "The fundamental type for individual words", "The orthographic representation for this word". Edge labels: rdfs:subClassOf, rdfs:domain, rdfs:range, rdfs:comment (twice), rdf:type.]
13
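The step from the declaration to the RDFS graph can be sketched as follows. The wiring of the edges is an assumption read off from the declaration and the edge labels (type → class, baseTypes → rdfs:subClassOf, fDecl → functional property with domain and range); the function name is hypothetical and only the elements used in this example are handled.

```python
import xml.etree.ElementTree as ET

FSD = """<fsDecl type="word" baseTypes="sign">
  <fsDescr>The fundamental type for individual words</fsDescr>
  <fDecl name="orth">
    <fDescr>The orthographic representation for this word</fDescr>
    <vRange><string/></vRange>
  </fDecl>
</fsDecl>"""


def fsd_to_rdfs(xml_text):
    """Map one fsDecl element to RDFS/OWL-style triples (assumed wiring)."""
    decl = ET.fromstring(xml_text)
    cls = decl.get("type")
    triples = []
    for base in decl.get("baseTypes", "").split():
        triples.append((cls, "rdfs:subClassOf", base))
    descr = decl.find("fsDescr")
    if descr is not None:
        triples.append((cls, "rdfs:comment", (descr.text or "").strip()))
    for fdecl in decl.findall("fDecl"):
        feature = fdecl.get("name")
        # Features are partial functions, hence owl:FunctionalProperty.
        triples.append((feature, "rdf:type", "owl:FunctionalProperty"))
        triples.append((feature, "rdfs:domain", cls))
        value_range = fdecl.find("vRange")
        if value_range is not None and len(value_range):
            # The first child of vRange names the value type, here <string/>.
            triples.append((feature, "rdfs:range", value_range[0].tag))
        fdescr = fdecl.find("fDescr")
        if fdescr is not None:
            triples.append((feature, "rdfs:comment", (fdescr.text or "").strip()))
    return triples


for t in fsd_to_rdfs(FSD):
    print(t)
```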
Constraint (Conditional)

<cond>
  <fs>
    <f name="inv"><binary value="true"/></f>
  </fs>
  <then/>
  <fs>
    <f name="aux"><binary value="true"/></f>
    <f name="vform"><symbol value="fin"/></f>
  </fs>
</cond>

[Figure: the conditional drawn as a graph. Labels: X (twice), true (twice), fin, inv, aux, vform, cond.]

SWRL representation:
inv(?X,true) -> aux(?X,true) & vform(?X,fin)
14
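The effect of the rule can be illustrated with one forward-chaining step over feature triples. This is a minimal sketch in plain Python, not SWRL: the rule encoding (`RULE`), the function name, and the sample facts are hypothetical, the rule has the single variable ?X, and the sketch only derives new facts without checking them against existing values.

```python
# inv(?X, true) -> aux(?X, true) & vform(?X, fin), with ?X ranging over subjects.
RULE = {
    "if": [("inv", True)],
    "then": [("aux", True), ("vform", "fin")],
}


def apply_rule(facts, rule):
    """Return facts plus every consequence of the rule (one forward-chaining step)."""
    facts = set(facts)
    subjects = {s for s, _, _ in facts}
    for x in subjects:
        # The rule fires for x only if all of its conditions hold for x.
        if all((x, feature, value) in facts for feature, value in rule["if"]):
            facts |= {(x, feature, value) for feature, value in rule["then"]}
    return facts


# Hypothetical facts: v1 is inverted, v2 is not.
facts = {("v1", "inv", True), ("v2", "inv", False)}
for fact in sorted(apply_rule(facts, RULE), key=str):
    print(fact)
```

Only `v1` gains the `aux` and `vform` values; `v2` is left untouched because the condition does not hold for it.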
FS Ontologization (Summary)
- RDF ⊃ FS
- Use ontologies for feature-system declarations.
- SWRL to encode constraints
- Defaults are outside of ontology.
15
24612: Linguistic Annotation Framework
16
GrAF in RDF
[Figure: an RDF graph annotating "The clock". Node labels: NP, TOKEN, The, clock, THE, CLOCK, DET, NN, SING. Edge labels: rdf:type, POS, BASE, NUMBER.]
possibly stand-off annotation
17
SemAF-DActs
[Figure: UML class diagram. Class labels: Dialogue, Turn, Utterance, DialogueAct, Agent. Role and association labels: sender, addressee, overhearer, func.dep. Multiplicities shown: 1..*, 0..*, 1..1.]
18
TODOs (projects in TDG6?)
- include ontologies in documents
- FSD
- just check UML (as far as no property hierarchy is necessary)
- LMF, MAF
- finish ontologization (possibly in UML)
- SynAF
- ontologize from scratch, forgetting XML
- DCR, SemAF-Time, SemAF-DActs, MLIF, etc.
19
Issues
- Who should ontologize individual WIs?
  - ontologize future WIs from the beginning
  - TDG6 should exemplify how.
  - whether and how to make ontologization mandatory?
- Where to include ontologies of ongoing WIs?
  - depending on their stages (WD, CD, ...)
- How to keep ontologizing DCs?
  - replace DC metamodel by ontology?
  - modify ISOCat?
20