Knowledge Representation Issues for the Semantic Web
Jeff Heflin
Lehigh University
Outline
Introduction
– History
– OWL Overview
Selected Research Issues
– Semantics of Distributed Ontologies
– Reasoning and Scalability
Overview of Other Key Research Topics
The Semantic Web
Definition
– “The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” (Berners-Lee et al., Scientific American, May 2001)
Applications
– managing corporate web sites (intranets)
– more automatic generation of web portals
– better indexing of multimedia resources
– web agents and web services
– ubiquitous computing
Semantic Web Challenges
The Web is distributed
– many sources, varying authority
– inconsistency
The Web is dynamic
– representational needs may change
The Web is enormous
– systems must scale well
The Web is an open world
Evolution of Web Standards
HTML: presentation-oriented markup
<tr><td><b>Charlotte’s Web</b> -E.B. White, Garth Williams.<font color="Red">$6.99</font></td></tr>
XML: content-oriented markup
<book>
  <title>Charlotte’s Web</title>
  <author>E.B. White</author>
  <author>Garth Williams</author>
  <price units="USD">6.99</price>
  <subject>Children’s Fiction</subject>
</book>
OWL
Web Ontology Language
– W3C Recommendation
– released Feb. 2004
Semantic markup (linked to its semantics by importing the bookont ontology):
<rdf:Description rdf:about="">
  <imports resource="www.books.com/bookont"/>
</rdf:Description>
<Book rdf:ID="book26489">
  <author>E.B. White</author>
  <title>Charlotte’s Web</title>
  <price>6.99</price>
  <subject rdf:resource="&bookont;FictionChild"/>
</Book>
bookont ontology:
<Class ID="Book"/>
<Property ID="subject">
  <domain resource="#Book"/>
  <range resource="#Topic"/>
</Property>
<Class ID="FictionChild">
  <subclassOf resource="#Fiction"/>
  <subclassOf resource="#Childrens"/>
</Class>
…
Ontology
Definition
– a logical theory that accounts for the intended meaning of a formal vocabulary (Guarino 98)
– has a formal syntax and unambiguous semantics
– inference algorithms can compute what logically follows
Relevance to Web:
– identify context
– provide shared definitions
– eases the integration of distinct resources
Semantic Web Timeline
Mar. 1996 – SHOE 0.90 (simple frames in HTML)
Jan. 1998 – SHOE 1.0 (frames + Horn logic)
Feb. 1998 – XML (semi-structured data for Web)
Sep. 1998 – Berners-Lee’s Semantic Web Roadmap
Feb. 1999 – RDF (semantic nets in XML)
Mar. 2001 – DAML+OIL (expressive DL in RDF)
May 2001 – Berners-Lee et al. Scientific American article
June 2002 – 1st Int’l Semantic Web Conference
Feb. 2004 – OWL (W3C Rec.)
RDF and RDF Schema
RDF data:
<rdf:RDF xmlns:g="http://schema.org/gen#" xmlns:u="http://schema.org/univ#">
  <u:Chair rdf:ID="john">
    <g:name>John Smith</g:name>
  </u:Chair>
</rdf:RDF>
RDF Schema:
<rdf:Property rdf:ID="name">
  <rdfs:domain rdf:resource="#Person"/>
</rdf:Property>
<rdfs:Class rdf:ID="Chair">
  <rdfs:subClassOf rdf:resource="http://schema.org/gen#Person"/>
</rdfs:Class>
[Figure: RDF graph. The node john has rdf:type u:Chair and g:name “John Smith”; u:Chair is rdfs:subClassOf g:Person; g:name has rdfs:domain g:Person; u:Chair and g:Person have rdf:type rdfs:Class, and g:name has rdf:type rdf:Property.]
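
The schema statements on this slide license a new fact about John Smith. A minimal sketch of that step, assuming triples are stored as plain tuples of prefixed names (the `u:john` identifier and the `rdfs_closure` helper are illustrative, not part of any RDF library), and covering only the subClassOf and domain rules:

```python
# Triples mirror the slide's example; prefixed names are kept as plain strings.
TRIPLES = {
    ("u:john", "rdf:type", "u:Chair"),
    ("u:john", "g:name", "John Smith"),
    ("u:Chair", "rdfs:subClassOf", "g:Person"),
    ("g:name", "rdfs:domain", "g:Person"),
}


def rdfs_closure(triples):
    """Apply two RDFS rules (subClassOf and domain) until nothing new appears."""
    facts = set(triples)
    while True:
        sub_class = {(s, o) for s, p, o in facts if p == "rdfs:subClassOf"}
        domain = {(s, o) for s, p, o in facts if p == "rdfs:domain"}
        new = set()
        for s, p, o in facts:
            if p == "rdf:type":
                # x type C, C subClassOf D  =>  x type D
                new |= {(s, "rdf:type", d) for c, d in sub_class if c == o}
            # x p y, p domain C  =>  x type C
            new |= {(s, "rdf:type", c) for prop, c in domain if prop == p}
        if new <= facts:
            return facts
        facts |= new


# Only the facts that were not stated explicitly.
inferred = rdfs_closure(TRIPLES) - TRIPLES
```

Here both rules reach the same conclusion: john is a g:Person, once because every Chair is a Person and once because anything with a g:name is a Person.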
URIs and Namespaces
URI
– Uniform Resource Identifier
– includes URLs
– but also anything that you can design an identification scheme for
– helps to prevent collision of names
– all the “symbols” in RDF are either URIs or Literals
Namespace
– a mechanism for abbreviating URIs
– by assigning a prefix for a URI fragment
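
The prefix mechanism can be sketched in a few lines. This assumes the namespace URIs end in `#` so that prefix and local name concatenate into the intended URI; `expand` and `abbreviate` are hypothetical helper names.

```python
# Prefix table; the '#'-terminated URIs are an assumption of this sketch.
NAMESPACES = {
    "g": "http://schema.org/gen#",
    "u": "http://schema.org/univ#",
}


def expand(qname, namespaces=NAMESPACES):
    """Turn a prefixed name such as 'g:name' into a full URI."""
    prefix, local = qname.split(":", 1)
    return namespaces[prefix] + local


def abbreviate(uri, namespaces=NAMESPACES):
    """Turn a full URI back into a prefixed name when a prefix matches."""
    for prefix, base in namespaces.items():
        if uri.startswith(base):
            return f"{prefix}:{uri[len(base):]}"
    return uri  # no known prefix: leave the URI untouched
```

Two authors who both define a term called `name` therefore do not collide, because the expanded URIs differ.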
OWL
RDF is a data language
– OWL adds ontologies to RDF
– used to define RDF classes and properties
OWL ontologies are written in RDF syntax
Semantically, OWL is based on description logics
– tradeoff between expressivity and computability
OWL Class Constructors
borrowed from Ian Horrocks
OWL RDF Syntax
<owl:Class rdf:ID="Band">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasMember" />
      <owl:allValuesFrom rdf:resource="#Musician" />
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>
Band is a subset of the set of objects that have only Musicians as members
OWL Axioms
borrowed from Ian Horrocks
OWL Inference
<owl:ObjectProperty rdf:ID="head">
  <rdfs:subPropertyOf rdf:resource="#member" />
</owl:ObjectProperty>
<owl:Class rdf:ID="Terrorist">
  <owl:equivalentClass>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#member" />
      <owl:someValuesFrom rdf:resource="#TerroristOrg" />
    </owl:Restriction>
  </owl:equivalentClass>
</owl:Class>
[Figure: Bin Laden is the head of Al Qaeda; Al Qaeda has type TerroristOrg; the inferred edge is that Bin Laden has type Terrorist.]
The head of an organization is also a member of it
A member of a terror organization is a terrorist
Therefore, the head of a terror organization is a terrorist
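
The three-line argument above can be traced mechanically. The following is a rough sketch, not a description logic reasoner: it forward-chains over tuples using only a subproperty rule and the "sufficient condition" half of the someValuesFrom definition. The function name `infer` and the identifiers `BinLaden` and `AlQaeda` are illustrative stand-ins for the slide's example.

```python
def infer(triples, sub_property_of, some_values_from):
    """Forward-chain two rules until no new triple is produced.

    sub_property_of:  {property: super_property}
    some_values_from: {class: (property, filler_class)}, read as
                      "anything with a `property` link to a `filler_class`
                      instance belongs to `class`".
    """
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in sub_property_of:
                new.add((s, sub_property_of[p], o))
            for cls, (prop, filler) in some_values_from.items():
                if p == prop and (o, "type", filler) in facts:
                    new.add((s, "type", cls))
        if new <= facts:
            return facts
        facts |= new


facts = infer(
    triples={
        ("BinLaden", "head", "AlQaeda"),
        ("AlQaeda", "type", "TerroristOrg"),
    },
    sub_property_of={"head": "member"},
    some_values_from={"Terrorist": ("member", "TerroristOrg")},
)
```

The first pass derives the `member` link from `head`; the second pass uses that link to classify the individual, which is the conclusion stated on the slide.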
Benefit of Description Logic
optimized computation of subsumption
– calculate implicit subClassOf relations
ontology integration
– if two ontologies use class expressions to define their vocabularies in terms of a third ontology, then subsumption can be used to compute an integrated ontology
Species of OWL
OWL Full
– very expressive (e.g., classes as instances)
– theoretical properties not well understood
OWL DL
– has a standard model theoretic semantics
OWL Lite
– subset of OWL DL
– easier to reason with
Formal Semantics
OWL Lite and OWL DL
– fairly standard DL-style model theoretic semantics
– defined using interpretations
– classes are sets of objects
– class constructors and axioms place conditions on interpretations
OWL Full
– non-standard RDF-style semantics
– but still model-theoretic in nature
Selected Research Issues
Work by the SWAT lab at Lehigh
– students
» Yuanbo Guo
» Zhengxiang Pan
Semantics for distributed ontologies
Reasoning and scalability
A Web of Ontologies
[Figure: ontologies A1, A2, B1, B2, B3, C1, D1, E1, and F1 linked by “revises” and “extends” relations; data sources S1 through S5 each linked to ontologies by “commits to”.]
Semantics of Ontology “Links”
Brachman (1983) regarding links between concepts in early semantic networks
– “. . . the meaning of the link was often relegated to ‘what the code does with it’ - neither an appropriate notion of semantics nor a useful guide for figuring out what the link, in fact means.”
DLs were one solution to this problem
In Semantic Web, links between ontologies now suffer from a similar lack of clear semantics
owl:imports
ontology extension / commitment
semantics
– in order to satisfy an ontology, an interpretation must also satisfy all ontologies that it imports
only provides semantics for each document in isolation!
Ontology Versioning
Each new version has new URL
– other users may have committed to your ontology
» “point at” it using its URL
– if you change the file at that location, then you change their commitment without their consent
Issue: Should veh76 be a v2:Vehicle?
[Figure: http://ex.org/ont-v1 defines Vehicle, and veh76 has type Vehicle; http://ex.org/ont-v2 defines Vehicle and Car, with Car subClassOf Vehicle and an instance car54.]
Versioning Complications
Should Flipper be a v2:Mammal?
– depends
» is change to correct a modeling error?
» or to reflect a change in interpretation of “Dolphin”?
[Figure: both versions define Dolphin, Fish, and Mammal. In http://ex.org/schema-v1, Dolphin is a subClassOf Fish; in http://ex.org/schema-v2, Dolphin is a subClassOf Mammal. Flipper has type Dolphin.]
Versioning in OWL
priorVersion
– indicates a previous version of an ontology
backwardCompatibleWith
– indicates a version with which ontology is backward compatible
DeprecatedClass
– used to signify that a class should no longer be used
DeprecatedProperty
– used to signify that a property should no longer be used
versionInfo
– used for CVS-like strings
incompatibleWith
– opposite of backwardCompatibleWith
OWL Versioning Syntax
<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:Ontology rdf:about="">
    <owl:priorVersion rdf:resource="http://ex.org/schema-v1"/>
    <owl:backwardCompatibleWith rdf:resource="http://ex.org/schema-v1"/>
  </owl:Ontology>
  <owl:DeprecatedClass rdf:ID="Megalodon"/>
  <owl:Class rdf:ID="Dolphin">
    <rdfs:subClassOf rdf:resource="#Mammal"/>
  </owl:Class>
</rdf:RDF>
…
Formal Ontology Definition
Ontology O = <V, A, E, P, B>
– V = vocabulary (a set of symbols)
– A = axioms (a set of wffs)
– E = set of extended ontologies
– P = set of prior versions of ontology
– B = set of ontologies O is backward-compatible with (subset of P)
Resource Definitions
R is the set of resources
Knowledge function
– maps resources to sets of wffs
– K : R → 2^W
Commitment function
– maps resources to ontologies
– C : R → O
Ontology Perspectives
Users may wish to view data through viewpoint of different ontologies
– versioning is a special case of this
An ontology specifies a set of axioms
Ontology perspectives specify a logical theory based on an ontology and a set of data sources
– combine axioms and ground atoms
– queries are with respect to a perspective
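Ontology Perspective Theory
Given O = {O1, O2, …, On} where Oi = <Vi, Ai, Ei, Pi, Bi>, the perspective theory based on Oi is:

$$
A_i \;\cup \bigcup_{\{j \,\mid\, O_j \in anc(O_i)\}} A_j \;\cup \bigcup_{\{r \in R \,\mid\, C(r) = O_i \,\lor\, C(r) \in anc(O_i)\}} K(r) \;\cup \bigcup_{\{r \in R \,\mid\, C(r) \in B_i\}} K(r) \;\cup \bigcup_{\{r \in R \,\mid\, \exists j,\; O_j \in anc(O_i) \,\land\, C(r) \in B_j\}} K(r)
$$

The five terms, in order:
– axioms of basis ontology
– axioms of extended ontologies
– data from sources that commit to basis ontology or its ancestors
– data from sources that commit to ontologies that are compatible with the basis ontology
– data from sources that commit to ontologies that are compatible with ancestors of the basis ontology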
Perspectives Example
Ontologies:
O1: A1 = {Dolphin(x) → Fish(x)}, B1 = {}
O2: A2 = {Dolphin(x) → Mammal(x)}, B2 = {O1}
Data:
C(r1) = O1
K(r1) = {Dolphin(flipper), Fish(charlie), Mammal(bigfoot)}
C(r2) = O2
K(r2) = {Dolphin(splasher)}
Query answers by perspective:
Query | Perspective 1 | Perspective 2
Dolphin(x) | flipper | flipper, splasher
Fish(x) | charlie, flipper | charlie
Mammal(x) | bigfoot | bigfoot, flipper, splasher
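
The example can be replayed in code. This sketch follows the slide's annotations: a perspective gathers the axioms of the basis ontology and its ancestors, plus data from sources that commit to those ontologies or to ontologies they are backward-compatible with. Several things are assumptions made for illustration: the dictionary encoding, the function names, the empty extension sets E (the example does not state them), and the restriction of axioms to simple "class implies class" rules.

```python
def ancestors(name, ontologies):
    """Ontologies reachable from `name` by following extension (E) links."""
    seen, stack = set(), list(ontologies[name]["E"])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ontologies[current]["E"])
    return seen


def perspective(basis, ontologies, commits, knowledge):
    """Collect the axioms and ground atoms visible from the basis ontology."""
    lineage = {basis} | ancestors(basis, ontologies)
    axioms, compatible = set(), set()
    for name in lineage:
        axioms |= ontologies[name]["A"]
        compatible |= ontologies[name]["B"]
    facts = set()
    for resource, committed in commits.items():
        if committed in lineage or committed in compatible:
            facts |= knowledge[resource]
    return axioms, facts


def answers(cls, basis, ontologies, commits, knowledge):
    """Individuals entailed to be in `cls` under the given perspective."""
    axioms, facts = perspective(basis, ontologies, commits, knowledge)
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in axioms:
            for fact_cls, individual in list(derived):
                if fact_cls == premise and (conclusion, individual) not in derived:
                    derived.add((conclusion, individual))
                    changed = True
    return sorted(ind for fact_cls, ind in derived if fact_cls == cls)


# Axioms are (premise, conclusion) pairs; E is assumed empty for both ontologies.
ONTOLOGIES = {
    "O1": {"A": {("Dolphin", "Fish")}, "E": set(), "B": set()},
    "O2": {"A": {("Dolphin", "Mammal")}, "E": set(), "B": {"O1"}},
}
COMMITS = {"r1": "O1", "r2": "O2"}
KNOWLEDGE = {
    "r1": {("Dolphin", "flipper"), ("Fish", "charlie"), ("Mammal", "bigfoot")},
    "r2": {("Dolphin", "splasher")},
}

fish_in_o1 = answers("Fish", "O1", ONTOLOGIES, COMMITS, KNOWLEDGE)
mammals_in_o2 = answers("Mammal", "O2", ONTOLOGIES, COMMITS, KNOWLEDGE)
```

Because O2 is backward-compatible with O1, the O2 perspective reuses r1's data but reinterprets it with O2's axioms, so flipper is a Mammal there and a Fish only in the O1 perspective.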
Scalable Systems
Motivation
– the Web is large
» it won’t fit in main memory!
– current systems don’t scale
DLDB
– DB: Relational Database (Microsoft® Access)
» scalable technology for querying data
– DL: Description Logics (FaCT reasoner)
» rich inference capability
» close correspondence to semantics of OWL
Design – RDF(S) Entailment
Use views to store class hierarchy
<owl:Class rdf:ID="Student"/>
<owl:Class rdf:ID="UndergraduateStudent">
  <rdfs:subClassOf rdf:resource="#Student" />
</owl:Class>
CREATE VIEW Student_view AS
  SELECT * FROM Student
  UNION
  SELECT * FROM UndergraduateStudent_view
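
The view-per-class idea can be tried with Python's built-in `sqlite3` module. This is only an illustration of the technique, not the DLDB implementation (which the slides say used Microsoft Access); the single `id` column and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table per class holds the explicitly asserted instances.
    CREATE TABLE Student (id TEXT);
    CREATE TABLE UndergraduateStudent (id TEXT);

    -- One view per class adds the instances of its subclasses.
    CREATE VIEW UndergraduateStudent_view AS
        SELECT * FROM UndergraduateStudent;
    CREATE VIEW Student_view AS
        SELECT * FROM Student
        UNION
        SELECT * FROM UndergraduateStudent_view;

    INSERT INTO Student VALUES ('alice');
    INSERT INTO UndergraduateStudent VALUES ('bob');
""")

# bob was only asserted to be an UndergraduateStudent,
# but the view makes him visible as a Student as well.
students = sorted(row[0] for row in conn.execute("SELECT id FROM Student_view"))
```

Querying the view rather than the table is what makes the subclass entailment available without materializing extra rows.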
Design – OWL Entailment
[Figure: pipeline from the Ontology through the DL Reasoner and the Inferred Hierarchy to table & view creation and database operation.]
Ontology:
…
Student: Person who takes a Course
GraduateStudent: Person who takes a GraduateCourse
GraduateCourse is a Course
…
Inferred hierarchy:
…
GraduateStudent is a Student
…
CREATE VIEW Student_1_view AS
  SELECT * FROM Student_1
  UNION
  SELECT * FROM UndergraduateStudent_1_view
  UNION
  SELECT * FROM GraduateStudent_1_view;
Implementation – Query
[Figure: an application submits a KIF-like conjunctive query through the Query API of the Query Interface; the Query Translation Algorithm turns it into SQL sentences for the RDBMS.]
KIF-like conjunctive query:
(Type GraduateStudent ?X)
(TakeCourse ?X http://www.foo.edu/department0/course0)
SQL:
SELECT GraduateStudent_2_view.ID
FROM GraduateStudent_2_view, takeCourse_2_view
WHERE GraduateStudent_2_view.ID = takeCourse_2_view.subject
  AND takeCourse_2_view.object = 'http://www.foo.edu/department0/course0'
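
A rough sketch of how such a translation could work is shown here. It is not the DLDB algorithm: the `translate` function, the tuple encoding of query atoms, the table aliases, and the choice to store URIs directly as text (rather than as IDs looked up in an index table) are all assumptions made to keep the example small. Class and property names are spliced into the SQL unchecked, so this is only suitable for trusted input.

```python
import sqlite3


def translate(atoms, answer_var, suffix="_2_view"):
    """Translate a conjunctive query into one SQL SELECT with bound parameters.

    atoms: ('Type', class_name, term) or (property_name, subject, object);
    terms starting with '?' are variables, anything else is a constant.
    """
    tables, conditions, params = [], [], []
    bindings = {}  # variable -> first column in which it appeared

    def bind(term, column):
        if term.startswith("?"):
            if term in bindings:
                conditions.append(f"{bindings[term]} = {column}")  # join condition
            else:
                bindings[term] = column
        else:
            conditions.append(f"{column} = ?")  # constant filter
            params.append(term)

    for i, atom in enumerate(atoms):
        alias = f"t{i}"
        if atom[0] == "Type":
            _, cls, term = atom
            tables.append(f"{cls}{suffix} AS {alias}")
            bind(term, f"{alias}.id")
        else:
            prop, subj, obj = atom
            tables.append(f"{prop}{suffix} AS {alias}")
            bind(subj, f"{alias}.subject")
            bind(obj, f"{alias}.object")

    sql = f"SELECT DISTINCT {bindings[answer_var]} FROM {', '.join(tables)}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


# Tiny stand-in database; plain tables play the role of the views.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GraduateStudent_2_view (id TEXT);
    CREATE TABLE TakeCourse_2_view (subject TEXT, object TEXT);
    INSERT INTO GraduateStudent_2_view VALUES ('gs1'), ('gs2');
    INSERT INTO TakeCourse_2_view VALUES
        ('gs1', 'http://www.foo.edu/department0/course0'),
        ('gs2', 'http://www.foo.edu/department0/course1');
""")

query = [
    ("Type", "GraduateStudent", "?X"),
    ("TakeCourse", "?X", "http://www.foo.edu/department0/course0"),
]
sql, params = translate(query, "?X")
rows = conn.execute(sql, params).fetchall()
```

Each atom becomes one table in the FROM clause, a repeated variable becomes a join condition, and a constant becomes a filter, which mirrors the shape of the SQL shown on the slide.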
Lehigh University Benchmark
Can be used to evaluate semantic web reasoning systems
Features
– OWL ontology for university domain (moderate complexity)
– customizable data generation
» can select number of universities and random number generator seed
» arbitrary size
» repeatable
– plausible
» “real world” constraints are applied
Metrics
– load time
– repository size
– query response time
– degree of completeness
– degree of soundness
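
The last two metrics can be made concrete. The slide does not define them, so this sketch assumes the usual reading: completeness is the share of entailed answers that a system returns, and soundness is the share of returned answers that are entailed. The function names and the convention of scoring an empty denominator as 1.0 are choices of this example.

```python
def completeness(returned, entailed):
    """Fraction of the entailed answers that the system actually returned."""
    entailed = set(entailed)
    if not entailed:
        return 1.0  # nothing to find, so nothing was missed
    return len(set(returned) & entailed) / len(entailed)


def soundness(returned, entailed):
    """Fraction of the returned answers that are actually entailed."""
    returned = set(returned)
    if not returned:
        return 1.0  # nothing returned, so nothing wrong was returned
    return len(returned & set(entailed)) / len(returned)


# Hypothetical query: four answers are entailed, the system returns three,
# one of which ("x") is not entailed.
entailed_answers = {"a", "b", "c", "d"}
returned_answers = {"a", "b", "x"}
query_completeness = completeness(returned_answers, entailed_answers)
query_soundness = soundness(returned_answers, entailed_answers)
```

Under this reading a system that answers nothing is perfectly sound but incomplete, which is why the two figures are reported together.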
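Benchmark System
[Figure: the Data Generator produces Benchmark Data based on the Univ-Bench Ontology; the Tester loads the data into Repository 1 … Repository N through their APIs, runs the 14 Test Queries*, and collects the Test Results.]
*each query is executed 10 times to account for caching.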
Initial Experiment
Four systems tested
– Sesame Memory, Sesame DB, OWLJessKB, DLDB
Five data sizes
– ranging from 15 files (8 MB) to 999 files (583 MB)
Summary of results
– Sesame-Memory best for small to medium size if only RDFS inference is needed
– OWLJessKB can answer queries none of the other systems can
» but doesn’t scale and makes some unsound inferences
– DLDB has best balance between query response time and completeness
Some Other Research Topics
Knowledge acquisition
Language design
Semantic Web services
Knowledge Acquisition
data
– create or find relevant ontology
– then either
» convert existing forms to RDF, e.g., XML, relational DBs, CGs, etc.
» information extraction
» natural language processing
» controlled English? (Sowa, yesterday)
ontologies
– import existing ontologies
– manual creation (e.g., Protégé)
– machine learning
– formal concept analysis? (Rudolph, yesterday)
Language Design
DL is insufficient for some applications
Significant demand for “rules”
– Combining logic programming with DL (Grosof et al. 2003)
SWRL (Semantic Web Rule Language)
– proposal to add Horn logic to OWL
However, must consider expressivity / scalability tradeoff
Semantic Web Services
Web service
– a web-accessible program that provides information or performs an action
OWL-S
– ontology for describing web services
» consists of profile, process model, and grounding
Current research includes:
– matchmaking (e.g., see work of Sycara)
– automated composition (e.g., see work of McIlraith)
– much more …
Conclusion
The Semantic Web is concerned with interoperability of distributed information
OWL is a standard that allows for sharing of ontologies
– if you want your ontologies to be used by the world, then export (what you can) to OWL
There is much research to do before the Semantic Web problem is solved
– we need all the help we can get!
For more information...
Useful websites
– http://www.semwebcentral.org/
– http://www.w3.org/2001/sw/
– http://www.daml.org/
– http://www.semanticweb.org/
My information
– [email protected]
– http://www.cse.lehigh.edu/~heflin/
The End
Ontology Divergence
The Web is distributed and dynamic
Therefore, ontological differences will arise
– terminology
– scope
– encoding
– context
[Figure: isa hierarchies spread across three ontologies (general-ontology, trans-ont, vehicle-ont) containing the terms Thing, Object, Car, Automobile, Civic, Delorean, Porsche, and Escort.]
Resolving Ontology Divergence
Mapping Ontology
– OM contains rules that map concepts between the ontologies
Mapping Revisions
– O1 contains rules that map O2 objects to O1 terminology. O2 does the reverse
Intersection Ontology
– ON contains intersection of concepts. O1 and O2 rename terms where necessary
[Figure: one diagram per approach showing O1, O2, OM, and ON connected by links. Key: revised by, extended by.]
Implementation - Database Schema
Ontologies_Index
SeqNum | URL
1 | http://www.lehigh.edu/~zhp2/univ-bench.owl
Source_Index
SeqNum | URL
1 | http://www.lehigh.edu/~zhp2/univ-bench.owl
2 | file:/D:/demo/UBArtiData/University0_0.owl
URI_Index
ID | URI
1 | http://www.Department0.University0.edu/UndergraduateStudent121
2 | http://www.Department0.University0.edu/GraduateCourse9
3 | http://www.Department0.University0.edu/GraduateStudent123
Student_1_view
ID | Source
1 | 1
3 | 1
TakeCourse_1
Subject | Object | Source
3 | 2 | 1
… | … | 1