An Introduction to Ontologies
Tim Finin
University of Maryland Baltimore County
2/02
UMBC, an Honors University in Maryland

What is an ontology
The subject of ontology is the study of the categories of things that exist or may exist in some domain.
The word ontology is from the Greek ontos for being and logos for word.
Aristotle offered an ontology which included 10 categories, shown as the leaves in this tree (from Sowa, after Brentano).

Tree of Porphyry
The oldest known tree diagram is the 3rd century AD work by the Greek philosopher Porphyry in his commentary on Aristotle.
Substance was identified as the supreme genus, or the most general supertype.

Top down vs. bottom up
Philosophers build from the top down and are interested in capturing the most general concepts.
Programmers tend to work from the bottom up, supporting a set of applications, with a little generality to help reuse and future development.
Ex: the CHAT-80 system (Pereira and Warren, 1982), which answered NL questions about a geographic database.
It is an example of a microworld ontology that supported NLP, query answering, and generation.

Blocks world
The blocks world is another microworld used often for NLP, vision, and planning.
It consists of a table, a set of blocks of different shapes, sizes and colors, and a robot hand.
Some typical domain constraints:
• Only one block can be on another block.
• Any number of blocks can be on the table.
• The hand can only hold one block.
Typical representation:
ontable(a) ontable(c)
on(b,a) handempty
clear(b) clear(c)
[Figure: block B stacked on block A, with block C beside them on the TABLE]
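
The representation above can be sketched in code. This is a minimal illustration, not something from the original slides: the dictionary layout and the `violations` helper are assumptions chosen for the example. The one-block-in-hand constraint is enforced by storing a single value under `holding`.

```python
# The state from the slide: ontable(a), ontable(c), on(b,a), handempty,
# clear(b), clear(c). The dictionary layout is a hypothetical encoding.
state = {
    "on": {("b", "a")},       # (upper, lower) pairs
    "ontable": {"a", "c"},
    "clear": {"b", "c"},
    "holding": None,          # None stands for handempty
}


def violations(s):
    """Return a list of domain constraints that the state breaks."""
    problems = []
    supports = [lower for (_, lower) in s["on"]]
    # Only one block can be on another block.
    for block in sorted(set(supports)):
        if supports.count(block) > 1:
            problems.append(f"more than one block on {block}")
    # A clear block must not support anything.
    for block in sorted(s["clear"]):
        if block in supports:
            problems.append(f"{block} is marked clear but supports a block")
    return problems
```

Calling `violations(state)` on the state from the slide returns an empty list, since every constraint holds.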

Trees, Lattices, and Other Hierarchies
• Most systems for expressing ontologies make heavy use of familiar representation schemes, including trees, lattices, acyclic graphs and general graphs
• A lattice has a TOP (everything) and BOTTOM (nothing)
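
A small type lattice can make the role of TOP and BOTTOM concrete. The sketch below is illustrative only: the types `animal`, `pet`, `dog` and `wolf` are hypothetical and not from the slides. `join` finds the most specific common supertype and `meet` the most general common subtype. In a lattice both always exist, which is what TOP and BOTTOM guarantee.

```python
# Direct supertypes of each type. BOTTOM sits below every other type and
# TOP above every other type. The middle types are hypothetical examples.
PARENTS = {
    "TOP": [],
    "animal": ["TOP"],
    "pet": ["TOP"],
    "dog": ["animal", "pet"],
    "wolf": ["animal"],
    "BOTTOM": ["dog", "wolf"],
}


def ancestors(t):
    """Return t together with all of its supertypes."""
    seen = {t}
    stack = [t]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def join(a, b):
    """Least upper bound: the most specific common supertype."""
    common = ancestors(a) & ancestors(b)
    return next(u for u in common if common <= ancestors(u))


def meet(a, b):
    """Greatest lower bound: the most general common subtype."""
    lower = [t for t in PARENTS if {a, b} <= ancestors(t)]
    return next(g for g in lower if all(g in ancestors(x) for x in lower))
```

Here `join("pet", "wolf")` is `TOP`, because the two types share nothing more specific, and `meet("pet", "wolf")` is `BOTTOM`, because nothing is both.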

Ontologies in Computer Science
Ontology: A common vocabulary and agreed upon meanings to describe a subject domain.
On*tol"o*gy, n. [Gr. ὄντα the things which exist (pl. neut. of ὤν, ὄντος, being, p. pr. of εἶναι to be) + -logy: cf. F. ontologie.]
That department of the science of metaphysics which investigates and explains the nature and essential properties and relations of all beings, as such, or the principles and causes of being. Webster's Revised Unabridged Dictionary (G & C. Merriam Co., 1913, edited by Noah Porter)
This is not a profoundly new idea …
– Vocabulary specification
– Domain theory
– Conceptual schema (for a data base)
– Class-subclass taxonomy
– Object schema

Importance of ontologies in communication
An example of the importance of ontologies in communication is the fate of NASA’s Mars Climate Orbiter.
• It crashed into Mars on September 23, 1999.
• JPL used metric units in their program controlling the thrusters and Lockheed-Martin used imperial units.
• Instead of establishing an orbit at an altitude of 140km, it did so at 60km, causing it to burn up in the Martian atmosphere.
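
The unit mismatch above is the kind of error an explicit, shared description of quantities can catch. The following sketch is only an illustration: the slide says metric versus imperial units, and treating the exchanged quantity as thruster impulse in newton-seconds versus pound-force seconds is an assumption made for the example, as are the `Impulse` class and `total_impulse` function.

```python
from dataclasses import dataclass

# Conversion factors to newton-seconds. One pound-force is defined as
# 4.4482216152605 newtons.
TO_NEWTON_SECONDS = {
    "N*s": 1.0,
    "lbf*s": 4.4482216152605,
}


@dataclass(frozen=True)
class Impulse:
    """A magnitude that always travels with its unit (hypothetical type)."""
    magnitude: float
    unit: str

    def in_newton_seconds(self) -> float:
        if self.unit not in TO_NEWTON_SECONDS:
            raise ValueError(f"unknown unit: {self.unit}")
        return self.magnitude * TO_NEWTON_SECONDS[self.unit]


def total_impulse(readings):
    """Sum readings from different producers in one agreed unit."""
    return sum(r.in_newton_seconds() for r in readings)
```

Because every value carries its unit, a consumer cannot silently read pound-force seconds as newton-seconds. A bare number, by contrast, leaves the unit as an implicit agreement between the two programs.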

Conceptual Schemas
A conceptual schema specifies the intended meaning of concepts used in a data base.
Data Base:
139 74.50
140 77.60
… …
Data Base Schema:
Table: price
*stockNo: integer; cost: float
Conceptual Schema:
price(x, y) => ∃(x’, y’) [auto_part(x’) & part_no(x’) = x & retail_price(x’, y’, Value-Inc) & magnitude(y’, US_dollars) = y]
Ontologies referenced by the conceptual schema: Auto Product Ontology, Product Ontology, Units & Measures Ontology
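
To illustrate what the conceptual schema adds, the sketch below turns bare rows of the price table into statements whose meaning is explicit. It is a rough rendering under assumptions: the class names, the `interpret_price_row` function, and the reading of Value-Inc as the seller are hypothetical and not part of the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    magnitude: float
    unit: str


@dataclass(frozen=True)
class RetailPrice:
    part_no: int    # identifies an auto part
    price: Money    # the retail price, with its unit made explicit
    seller: str     # assumption: Value-Inc is read as the seller


def interpret_price_row(stock_no: int, cost: float) -> RetailPrice:
    """Apply the conceptual schema to one (stockNo, cost) row."""
    return RetailPrice(
        part_no=stock_no,
        price=Money(cost, "US_dollars"),
        seller="Value-Inc",
    )


# The two rows shown on the slide.
rows = [(139, 74.50), (140, 77.60)]
facts = [interpret_price_row(stock_no, cost) for stock_no, cost in rows]
```

The raw row `(139, 74.50)` says nothing about currency or what kind of thing 139 names. The interpreted fact records both.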

Implicit vs. Explicit Ontologies
Systems which communicate and work together must share an ontology.
The shared ontology can be implicit or explicit.
Implicit ontologies are typically represented only by procedures.
Explicit ontologies are (ideally) given a declarative representation in a well defined knowledge representation language.

Conceptualizations, Vocabularies and Axiomatization
Three important aspects to explicit ontologies:
• Conceptualization involves the underlying model of the domain in terms of objects, attributes and relations.
• Vocabulary involves assigning symbols or terms to refer to those objects, attributes and relations.
• Axiomatization involves encoding rules and constraints which capture significant aspects of the domain model.
Two ontologies may
• be based on different conceptualizations
• be based on the same conceptualization but use different vocabularies
• differ in how much they attempt to axiomatize the ontologies

Simple examples
[Figure: four small taxonomies of fruit]
fruit: pomme, citron, orange
fruit: apple, lemon, orange
fruit: apple, citrus (lime, lemon, orange), pear
fruit: tropical, temperate
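
The fruit taxonomies can be compared mechanically. In this sketch the first two share a conceptualization and differ only in vocabulary, so a term mapping makes them identical, while the third cannot be reached by renaming alone. The dictionary encoding, the `FR_TO_EN` mapping and the `translate` helper are assumptions made for illustration, as is placing lime, lemon and orange under citrus.

```python
# Each taxonomy maps a class to its direct subclasses.
french = {"fruit": ["pomme", "citron", "orange"]}
english = {"fruit": ["apple", "lemon", "orange"]}
refined = {
    "fruit": ["apple", "citrus", "pear"],
    "citrus": ["lime", "lemon", "orange"],
}

# Vocabulary mapping between the first two taxonomies (an assumption).
FR_TO_EN = {
    "fruit": "fruit",
    "pomme": "apple",
    "citron": "lemon",
    "orange": "orange",
}


def translate(taxonomy, mapping):
    """Rename every term while keeping the structure unchanged."""
    return {
        mapping[parent]: [mapping[child] for child in children]
        for parent, children in taxonomy.items()
    }
```

Translating `french` with `FR_TO_EN` gives exactly `english`, which shows a difference of vocabulary only. No renaming turns it into `refined`, which introduces an intermediate class and therefore reflects a different conceptualization.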

Ontologies vs. KBs
Ontologies are distinguished from KBs not by their form, but by the role they play in representing knowledge
• Consensus models for a domain
• Emphasis on properties that hold in all situations
• Emphasis on classes rather than instances
• Intended to support multiple tasks and methods
• Don’t change during problem solving and are suited for “compiling” into tools
• Need to satisfy a community of use
• Emphasis on collaborative development
• Emphasis on translation to multiple logical formalisms
• Useful for education

Ontology Library and Editing Tools
[Figure: a shared library of ontologies with editing tools. Tool operations: Browse, Compare, Compose, Extend, Check. Library layer labels: Basic Representation Concepts (Sets, Sequences, Arrays, Quantities, Probabilities); Lexicons & Skeleton Ontologies; Common Ontologies & Theories; Domain-Specific Ontologies & Theories. Other labels in the figure: WordNet, Penman Ontology, CYC Upper Ontology, Models of Space, Models of Time, Physical Objects, Actions & Causality, Geography & Terrain, Situations & Contexts, Operations, Logistics, Sensor Management, Battlefield Situations, Command and Control]
Ontolingua is a language for building, publishing, and sharing ontologies.
• A web-based interface to a browser/editor server is available at http://ontolingua.stanford.edu/ and mirror sites.
• Ontologies can be translated into a number of content languages, including KIF, LOOM, Prolog, CLIPS, etc.
• Chimaera is a tool for merging existing ontologies

Big Ontologies
There are several large, general ontologies that are freely available.
Some examples are:
• Cyc - Original general purpose ontology
• WordNet - a large, on-line lexical reference system
• World Fact Book - 5Meg of KIF sentences!
• UMLS - NLM’s Unified Medical Language System
See http://www.cs.utexas.edu/users/mfkb/related.html for more

WordNet
WordNet® is an on-line lexical reference system whose design is inspired by psycholinguistic theories of human lexical memory.
• English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept.
• Synsets: {board, plank} {board, committee}
• Different relations link the synonym sets (e.g. antonyms, generalizations, etc.)
• ~140K words
Developed by the Cognitive Science Laboratory at Princeton and available online.
Although linguistically motivated, many groups have used it as a general ontology of concepts.
http://www.cogsci.princeton.edu/~wn/
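
The synset idea can be sketched with a few lines of plain Python. This is a hand-built toy that uses only the two synsets named on the slide. It does not use or imitate the real WordNet database or any WordNet library, and the `senses` and `synonyms` helpers are hypothetical names.

```python
from collections import defaultdict

# The two synsets from the slide. Each set stands for one lexical concept.
synsets = [
    frozenset({"board", "plank"}),
    frozenset({"board", "committee"}),
]

# Index from each word to every synset that contains it.
index = defaultdict(list)
for synset in synsets:
    for word in synset:
        index[word].append(synset)


def senses(word):
    """Return the synsets (concepts) a word can express."""
    return index.get(word, [])


def synonyms(word):
    """Return all words sharing at least one synset with the word."""
    return set().union(*senses(word)) - {word}
```

The word "board" appears in two synsets, so it has two senses. This is how a lexical resource of this shape separates a word from the concepts it can name.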

EDR Electronic Dictionary
http://www.iijnet.or.jp/edr/
A dictionary with over 400,000 concepts, with their mappings to both English and Japanese words.

Cyc
CYC is a large KB which has been under continual development since about 1985.
• The CYC KB is a formalized representation of a vast quantity of fundamental human knowledge: facts, rules of thumb, and heuristics for reasoning about the objects and events of everyday life.
• CYC is encoded in the KR language CYCL
• The Upper CYC Ontology contains approximately 3,000 terms “capturing the most general concepts of human consensus reality”.
http://www.cyc.com/cyc-2-1/cover.html

Cyc’s top level concepts

openCyc
http://www.opencyc.org/
OpenCyc 1.0 (summer 2002?) will include the following.
• 6,000 concepts: an upper ontology for all of human consensus reality.
• 60,000 assertions about the 6,000 concepts, interrelating them, constraining them, in effect (partially) defining them.
• A compiled version of the Cyc Inference Engine and the Cyc Knowledge Base Browser.
• A specification of CycL, the language in which Cyc (and hence OpenCyc) is written. There are CycL-to-Lisp, CycL-to-C, etc. translators.
• A specification of the Cyc API
• A few sample programs that demonstrate use of the Cyc API for application development.

IEEE Standard Upper Ontology
An IEEE standards working group
• “This standard will specify an upper ontology that will enable computers to utilize it for applications such as data interoperability, information search and retrieval, automated inferencing, and natural language processing.”
• http://suo.ieee.org/
• See site for documents and archives of mailing list discussions
• Two “starter documents” for SUOs: SUMO, IFF

World Fact Book
Stanford’s WFB aims to semi-automatically construct a substantial KB of basic geographic, economic, political, and demographic knowledge about the world's nations.
• Source: CIA World Fact Book
• 5.2 MB ~ 5K classes & 64K facts and rules encoded in KIF
• Available from http://www-ksl-svc.stanford.edu:5915/doc/wfb/ in several forms
Example: resources, industries, commodities
• Interrelated: crude-oil reserves, production, exports
• Coal mining, computer industry, auto parts industry, …
Specify basic definitions
• A natural resource is a deposit of stuff; an industry is a collection of businesses; a commodity is an item whose sales can be measured as a continuous quantity
Examine related classes & identify key factors
• E.g., material, process, product, customer, location, task
Define each industry as a conjunction of factors
• 6 generative factors discriminate 500 industries
Organize values of factors (mining<extraction)
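
Defining an industry as a conjunction of factors, with factor values organized into a hierarchy, can be sketched as follows. Only the factor names and the ordering mining<extraction come from the slide. The specific factor values assigned to each industry, and the function names, are hypothetical and chosen for illustration.

```python
# Hierarchy over factor values: each value points to its generalization.
# Only mining < extraction is taken from the slide.
VALUE_PARENT = {"mining": "extraction"}

# Each industry is a conjunction of factor/value pairs (hypothetical values).
INDUSTRIES = {
    "coal mining": {"material": "coal", "process": "mining"},
    "auto parts industry": {"product": "auto parts", "process": "manufacturing"},
}


def generalizes(general, specific):
    """True if general equals specific or lies above it in the hierarchy."""
    while specific is not None:
        if specific == general:
            return True
        specific = VALUE_PARENT.get(specific)
    return False


def matches(industry, query):
    """True if the industry satisfies every factor in the query."""
    factors = INDUSTRIES[industry]
    return all(
        factor in factors and generalizes(value, factors[factor])
        for factor, value in query.items()
    )


def find(query):
    """Return the names of all industries matching the query, sorted."""
    return sorted(name for name in INDUSTRIES if matches(name, query))
```

A query for `{"process": "extraction"}` finds coal mining even though that industry is defined with the more specific value `mining`. That retrieval is what organizing the factor values into a hierarchy makes possible.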

Unified Medical Language System
Under development since 1986 by the National Library of Medicine
Supports standardized medical terminology via a central dictionary + thesaurus + semantic network + search engine
• Purpose is to “aid the development of systems that help health professionals and researchers retrieve and integrate electronic biomedical information from a variety of sources and to make it easy for users to link disparate information systems, including computer-based patient records, bibliographic databases, factual databases, and expert systems”.
There are four UMLS knowledge sources:
• UMLS Metathesaurus
• SPECIALIST Lexicon
• UMLS Semantic Network
• UMLS Information Sources Map

Ontology Conclusions
Shared ontologies are essential for agent communication and knowledge sharing
Ontology tools and standards are important
• Ontolingua and OKBC are good examples
• XML and RDF may be a next step
Some large general ontologies are available
• Cyc, WFB, WordNet, …
For more information…
• http://www.kr.org/top describes projects addressing major ontology construction issues
• Ontology mailing list: send mail to [email protected]@cs.umbc.edu with “info ontology” in message body for information.
• ANSI Ad Hoc Group on Ontology Standards: http://WWW-KSL.Stanford.EDU/onto-std/