2/02 An Introduction to Ontologies, Tim Finin, University of Maryland Baltimore County


Upload: wesley-maxwell

Post on 27-Dec-2015


Page 1:

An Introduction to Ontologies

Tim Finin
University of Maryland Baltimore County

2/02

Page 2:

What is an ontology

The subject of ontology is the study of the categories of things that exist or may exist in some domain.

The word ontology is from the Greek ontos for being and logos for word.

Aristotle offered an ontology which included 10 categories, shown as the leaves in this tree (from Sowa, after Brentano).

Page 3:

Tree of Porphyry

The oldest known tree diagram is the 3rd century AD work by the Greek philosopher Porphyry in his commentary on Aristotle.

Substance was identified as the supreme genus or the most general supertype.

Page 4:

Top down vs. bottom up

Philosophers build from the top down and are interested in capturing the most general concepts.

Programmers tend to work from the bottom up, supporting a set of applications, with a little generality to help reuse and future development.

Ex: CHAT-80 system (Pereira and Warren, 1982) which answered NL questions about a geographic database.

An example of a microworld ontology that supported NLP, query answering, and generation.

Page 5:

Blocks world

Page 6:

Blocks world

The blocks world is another microworld used often for NLP, vision, planning.

It consists of a table, a set of blocks of different shapes, sizes and colors, and a robot hand.

Some typical domain constraints:
• Only one block can be on another block.
• Any number of blocks can be on the table.
• The hand can only hold one block.

Typical representation:
ontable(a) ontable(c)
on(b,a) handempty
clear(b) clear(c)

[Figure: blocks A, B and C on a TABLE]
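The representation above can be sketched in Python as a set of facts. This is only an illustration: the tuple encoding, the function name `violations`, the `holding` predicate and the check that a supporting block is not marked clear are all assumptions, not part of the slide.

```python
# Facts from the slide, stored as tuples: (predicate, arg1, arg2, ...).
state = {
    ("ontable", "a"), ("ontable", "c"),
    ("on", "b", "a"),
    ("handempty",),
    ("clear", "b"), ("clear", "c"),
}

def violations(facts):
    """Return a list of domain-constraint violations found in a set of facts."""
    problems = []
    # Constraint: only one block can be on another block.
    supported = {}
    for fact in facts:
        if fact[0] == "on":
            top, bottom = fact[1], fact[2]
            supported.setdefault(bottom, set()).add(top)
    for bottom, tops in sorted(supported.items()):
        if len(tops) > 1:
            problems.append(f"more than one block on {bottom}")
        # Assumed extra check: a block that supports another cannot be clear.
        if ("clear", bottom) in facts:
            problems.append(f"{bottom} is marked clear but supports a block")
    # Constraint: the hand can only hold one block.
    # "holding" is a hypothetical predicate; the slide only shows handempty.
    held = [f[1] for f in facts if f[0] == "holding"]
    if len(held) > 1:
        problems.append("hand holds more than one block")
    if held and ("handempty",) in facts:
        problems.append("hand is both empty and holding a block")
    return problems
```

Calling `violations(state)` on the slide's facts returns an empty list; adding `("on", "c", "a")` would report a second block on `a`.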

Page 7:

Trees, Lattices, and Other Hierarchies

• Most systems for expressing ontologies make heavy use of familiar representation schemes, including trees, lattices, acyclic graphs and general graphs

• A lattice has a TOP (everything) and BOTTOM (nothing)
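One simple instance of a lattice with a TOP and a BOTTOM, sketched in Python: each concept is modeled by the set of individuals it covers, ordered by set inclusion. The individuals and concept names are hypothetical, and this is not the representation used by any particular ontology system.

```python
# Each concept is modeled by its extension: the set of individuals it covers.
UNIVERSE = frozenset({"rex", "tweety", "nemo"})   # hypothetical individuals

TOP = UNIVERSE          # everything
BOTTOM = frozenset()    # nothing

dog = frozenset({"rex"})
bird = frozenset({"tweety"})
pet = frozenset({"rex", "tweety"})

def subsumes(general, specific):
    """True if every individual of `specific` is also in `general`."""
    return specific <= general

def join(a, b):
    """Least upper bound: the smallest concept covering both a and b."""
    return a | b

def meet(a, b):
    """Greatest lower bound: the largest concept covered by both a and b."""
    return a & b
```

Here `join(dog, bird)` equals `pet`, and `meet(dog, bird)` is `BOTTOM` because no individual is both.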

Page 8:

Ontologies in Computer Science

Ontology: A common vocabulary and agreed upon meanings to describe a subject domain.

On*tol"o*gy (?), n. [Gr. the things which exist (pl.neut. of , , being, p.pr. of to be) + -logy: cf.F. ontologie.]

That department of the science of metaphysics which investigates and explains the nature and essential properties and relations of all beings, as such, or the principles and causes of being. Webster's Revised Unabridged Dictionary (G & C. Merriam Co., 1913, edited by Noah Porter)

This is not a profoundly new idea …
– Vocabulary specification
– Domain theory
– Conceptual schema (for a data base)
– Class-subclass taxonomy
– Object schema

Page 9:

Importance of ontologies in communication

An example of the importance of ontologies in communication is the fate of NASA’s Mars Climate Orbiter.

• It crashed into Mars on September 23, 1999.
• JPL used metric units in their program controlling the thrusters and Lockheed-Martin used imperial units.
• Instead of establishing an orbit at an altitude of 140km, it did so at 60km, causing it to burn up in the Martian atmosphere.
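A minimal Python sketch of the point: if a value carries its unit explicitly, the receiving program cannot silently misread it. The `Impulse` class and its unit labels are hypothetical, and the sketch assumes the quantity exchanged is a thruster impulse; it is not the actual flight software.

```python
from dataclasses import dataclass

# Exact conversion factor: 1 pound-force = 4.4482216152605 newtons.
NEWTONS_PER_POUND_FORCE = 4.4482216152605

@dataclass(frozen=True)
class Impulse:
    """An impulse value tagged with its unit, so the unit is never implicit."""
    magnitude: float
    unit: str  # "N*s" (metric) or "lbf*s" (imperial)

    def to_newton_seconds(self) -> float:
        """Convert to newton-seconds, rejecting units we do not know."""
        if self.unit == "N*s":
            return self.magnitude
        if self.unit == "lbf*s":
            return self.magnitude * NEWTONS_PER_POUND_FORCE
        raise ValueError(f"unknown unit: {self.unit}")

reported = Impulse(100.0, "lbf*s")
misread = reported.magnitude             # consumer silently assumes N*s
correct = reported.to_newton_seconds()   # consumer honors the declared unit
```

The misread value is too small by a factor of about 4.45, which is the kind of error a shared units ontology is meant to rule out.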

Page 10:

Conceptual Schemas

Data Base:
139 74.50
140 77.60
… …

Data Base Schema:
Table: price
*stockNo: integer; cost: float

Conceptual Schema:
price(x, y) => ∃(x’, y’) [auto_part(x’) & part_no(x’) = x & retail_price(x’, y’, Value-Inc) & magnitude(y’, US_dollars) = y]

[Figure labels: Auto Product Ontology, Product Ontology, Units & Measures Ontology]

A conceptual schema specifies the intended meaning of concepts used in a data base.
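The rule above says that each row of the price table stands for an auto part and a retail price measured in US dollars. A small Python sketch of that reading follows; the function name and the generated names `part_139` and `price_139` are hypothetical stand-ins for the objects x’ and y’ that the rule asserts to exist.

```python
def explain_price_row(stock_no, cost):
    """Expand one row of the price table into the assertions the
    conceptual schema says it stands for."""
    part = f"part_{stock_no}"      # hypothetical name for the part x'
    amount = f"price_{stock_no}"   # hypothetical name for the quantity y'
    return [
        ("auto_part", part),
        ("part_no", part, stock_no),
        ("retail_price", part, amount, "Value-Inc"),
        ("magnitude", amount, "US_dollars", cost),
    ]

# The two rows shown in the data base on the slide.
rows = [(139, 74.50), (140, 77.60)]
assertions = [fact for row in rows for fact in explain_price_row(*row)]
```

Each bare pair such as `(139, 74.50)` becomes four explicit assertions, which is the information the data base schema alone leaves unstated.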

Page 11:

Implicit vs. Explicit Ontologies

Systems which communicate and work together must share an ontology.

The shared ontology can be implicit or explicit.

Implicit ontologies are typically represented only by procedures.

Explicit ontologies are (ideally) given a declarative representation in a well defined knowledge representation language.

Page 12:

Conceptualizations, Vocabularies and Axiomatization

Three important aspects to explicit ontologies:
• Conceptualization involves the underlying model of the domain in terms of objects, attributes and relations.
• Vocabulary involves assigning symbols or terms to refer to those objects, attributes and relations.
• Axiomatization involves encoding rules and constraints which capture significant aspects of the domain model.

Two ontologies may
• be based on different conceptualizations
• be based on the same conceptualization but use different vocabularies
• differ in how much they attempt to axiomatize the ontologies

Page 13:

Simple examples

[Figure: four example taxonomies, each rooted at fruit]

fruit: pomme, citron, orange

fruit: apple, lemon, orange

fruit: apple, citrus (lime, lemon, orange), pear

fruit: tropical, temperate
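A Python sketch of how these taxonomies differ. The first two share a conceptualization and differ only in vocabulary, so a term mapping turns one into the other; the third groups the same fruit differently. The dictionary encoding, the function names, and the placement of lime, lemon and orange under citrus are assumptions made for illustration.

```python
# Two taxonomies from the slide with the same shape but different terms.
french = {"fruit": ["pomme", "citron", "orange"]}
english = {"fruit": ["apple", "lemon", "orange"]}

# A third that rests on a different conceptualization: citrus is a category.
with_citrus = {
    "fruit": ["apple", "citrus", "pear"],
    "citrus": ["lime", "lemon", "orange"],
}

# Vocabulary mapping between the first two taxonomies.
fr_to_en = {"fruit": "fruit", "pomme": "apple",
            "citron": "lemon", "orange": "orange"}

def translate(taxonomy, mapping):
    """Rename every term in a parent -> children taxonomy."""
    return {mapping[parent]: [mapping[child] for child in children]
            for parent, children in taxonomy.items()}

def parent_of(taxonomy, term):
    """Return the direct parent of a term, or None if it has none."""
    for parent, children in taxonomy.items():
        if term in children:
            return parent
    return None
```

Translating `french` with `fr_to_en` yields exactly `english`, while no renaming alone makes `english` and `with_citrus` agree on the parent of lemon.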

Page 14:

Ontologies vs. KBs

Ontologies are distinguished from KBs not by their form, but by the role they play in representing knowledge
• Consensus models for a domain
• Emphasis on properties that hold in all situations
• Emphasis on classes rather than instances
• Intended to support multiple tasks and methods
• Don’t change during problem solving and are suited for “compiling” into tools
• Need to satisfy a community of use
• Emphasis on collaborative development
• Emphasis on translation to multiple logical formalisms
• Useful for education

Page 15:

Ontology Library and Editing Tools

[Figure: a shared ontology library with editing tools. Labels:
Editing Tools: Browse, Compare, Compose, Extend, Check
Shared Library
Lexicons & Skeleton Ontologies: WordNet, Penman Ontology, CYC Upper Ontology
Common Ontologies & Theories: Models of Space, Models of Time, Physical Objects, Actions & Causality, Geography & Terrain, Situations & Contexts
Domain-Specific Ontologies & Theories: Operations, Logistics, Sensor Management, Battlefield Situations, Command and Control
Basic Representation Concepts: Sets, Sequences, Arrays, Quantities, Probabilities]

Ontolingua is a language for building, publishing, and sharing ontologies.

• A web-based interface to a browser/editor server at http://ontolingua.stanford.edu/ and mirror sites.

• Ontologies can be translated into a number of content languages, including KIF, LOOM, Prolog, CLIPS, etc.

• Chimaera is a tool for merging existing ontologies

Page 16:

Big Ontologies

There are several large, general ontologies that are freely available.

Some examples are:
• Cyc - Original general purpose ontology
• WordNet - a large, on-line lexical reference system
• World Fact Book -- 5Meg of KIF sentences!
• UMLS - NLM’s Unified Medical Language System

See http://www.cs.utexas.edu/users/mfkb/related.html for more

Page 17:

WordNet

WordNet® is an on-line lexical reference system whose design is inspired by psycholinguistic theories of human lexical memory.
• English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept.
• Synsets: {board,plank} {board,committee}
• Different relations link the synonym sets (e.g. antonyms, generalizations, etc)
• ~140K words

Developed by the Cognitive Science Laboratory at Princeton and available online

Although linguistically motivated, many groups have used it as a general ontology of concepts.

http://www.cogsci.princeton.edu/~wn/
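The synset idea can be sketched in a few lines of Python using the two example synsets from the slide. The synset identifiers and function names are hypothetical, and this is not the WordNet software or its API.

```python
# Two synsets from the slide; the identifiers are hypothetical labels.
synsets = {
    "board.lumber": frozenset({"board", "plank"}),
    "board.group": frozenset({"board", "committee"}),
}

def senses(word):
    """Return the ids of all synsets that contain the word, sorted."""
    return sorted(sid for sid, words in synsets.items() if word in words)

def synonyms(word):
    """Return every other word that shares at least one synset with the word."""
    found = set()
    for words in synsets.values():
        if word in words:
            found |= words
    found.discard(word)
    return found
```

The word "board" belongs to two synsets and so has two senses, while "plank" has only one; this is how a single word form maps to several lexical concepts.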

Page 18:

EDR Electronic Dictionary

http://www.iijnet.or.jp/edr/

A dictionary with over 400,000 concepts, with their mappings to both English and Japanese words.

Page 19:

Cyc

CYC is a large KB which has been under continual development since about 1985.
• The CYC KB is a formalized representation of a vast quantity of fundamental human knowledge: facts, rules of thumb, and heuristics for reasoning about the objects and events of everyday life.
• CYC is encoded in the KR language CYCL
• The Upper CYC Ontology contains approximately 3,000 terms “capturing the most general concepts of human consensus reality”.

http://www.cyc.com/cyc-2-1/cover.html

Page 20:

Cyc’s top level concepts

Page 21:

openCyc

http://www.opencyc.org/

OpenCyc 1.0 (summer 2002?) will include the following.
• 6,000 concepts: an upper ontology for all of human consensus reality.
• 60,000 assertions about the 6,000 concepts, interrelating them, constraining them, in effect (partially) defining them.
• A compiled version of the Cyc Inference Engine and the Cyc Knowledge Base Browser.
• A specification of CycL, the language in which Cyc (and hence OpenCyc) is written. There are CycL-to-Lisp, CycL-to-C, etc. translators.
• A specification of the Cyc API
• A few sample programs that demonstrate use of the Cyc API for application development.

Page 22:

IEEE Standard Upper Ontology

An IEEE standards working group

“This standard will specify an upper ontology that will enable computers to utilize it for applications such as data interoperability, information search and retrieval, automated inferencing, and natural language processing.”

http://suo.ieee.org/
• See site for documents and archives of mailing list discussions
• Two “starter documents” for SUOs: SUMO, IFF

Page 23:

World Fact Book

Stanford’s WFB aims to semi-automatically construct a substantial KB of basic geographic, economic, political, and demographic knowledge about the world's nations.

• Source: CIA World Fact Book
• 5.2 MB ~ 5K classes & 64K facts and rules encoded in KIF
• Available from http://www-ksl-svc.stanford.edu:5915/doc/wfb/ in several forms

Example: resources, industries, commodities
• Interrelated: crude-oil reserves, production, exports
• Coal mining, computer industry, auto parts industry, …

Specify basic definitions
• A natural resource is a deposit of stuff; an industry is a collection of businesses; a commodity is an item whose sales can be measured as a continuous quantity

Examine related classes & identify key factors
• E.g., material, process, product, customer, location, task

Define each industry as a conjunction of factors
• 6 generative factors discriminate 500 industries

Organize values of factors (mining<extraction)
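The idea of defining an industry as a conjunction of factors, with factor values organized in a hierarchy, might look like this in Python. Only `mining<extraction` and the factor names come from the slide; the factor values assigned to each industry and the function names are hypothetical, and this is not how the WFB itself is encoded.

```python
# Factor values form a small hierarchy: child -> parent ("mining<extraction").
factor_parent = {"mining": "extraction"}

def is_a(value, ancestor):
    """True if value equals ancestor or lies below it in the factor hierarchy."""
    while value is not None:
        if value == ancestor:
            return True
        value = factor_parent.get(value)
    return False

# Each industry is a conjunction of factor values (values are hypothetical).
industries = {
    "coal mining": {"material": "coal", "process": "mining"},
    "auto parts industry": {"product": "auto parts", "process": "manufacturing"},
}

def matches(industry, **required):
    """True if the industry satisfies every required factor value."""
    factors = industries[industry]
    return all(name in factors and is_a(factors[name], wanted)
               for name, wanted in required.items())
```

Because mining sits below extraction, `matches("coal mining", process="extraction")` holds even though the industry was defined with the more specific value.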

Page 24:

Unified Medical Language System

Under development since 1986 by the National Library of Medicine

Supports standardized medical terminology via a central dictionary + thesaurus + semantic network + search engine
• Purpose is to “aid the development of systems that help health professionals and researchers retrieve and integrate electronic biomedical information from a variety of sources and to make it easy for users to link disparate information systems, including computer-based patient records, bibliographic databases, factual databases, and expert systems”.

There are four UMLS knowledge sources:
• UMLS Metathesaurus
• SPECIALIST Lexicon
• UMLS Semantic Network
• UMLS Information Sources Map

Page 25:

Ontology Conclusions

Shared ontologies are essential for agent communication and knowledge sharing

Ontology tools and standards are important
• Ontolingua and OKBC are good examples
• XML and RDF may be a next step

Some large general ontologies are available
• Cyc, WFB, WordNet, …

For more information…
• http://www.kr.org/top describes projects addressing major ontology construction issues
• Ontology mailing list: send mail to [email protected]@cs.umbc.edu with “info ontology” in message body for information.
• ANSI Ad Hoc Group on Ontology Standards: http://WWW-KSL.Stanford.EDU/onto-std/