Knowledge Representation Issues for the Semantic Web Jeff Heflin Lehigh University


Post on 25-Feb-2016

Page 1: Knowledge Representation Issues for the Semantic Web

Knowledge Representation Issues for the Semantic Web

Jeff Heflin
Lehigh University

Page 2: Knowledge Representation Issues for the Semantic Web

Outline

Introduction
– History
– OWL Overview
Selected Research Issues
– Semantics of Distributed Ontologies
– Reasoning and Scalability
Overview of Other Key Research Topics

Page 3: Knowledge Representation Issues for the Semantic Web

The Semantic Web

Definition
– The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. (Berners-Lee et al., Scientific American, May 2001)
Applications
– managing corporate web sites (intranets)
– more automatic generation of web portals
– better indexing of multimedia resources
– web agents and web services
– ubiquitous computing

Page 4: Knowledge Representation Issues for the Semantic Web

Semantic Web Challenges

The Web is distributed
– many sources, varying authority
– inconsistency
The Web is dynamic
– representational needs may change
The Web is enormous
– systems must scale well
The Web is an open world

Page 5: Knowledge Representation Issues for the Semantic Web

Evolution of Web Standards

HTML: presentation-oriented markup

<tr><td><b>Charlotte’s Web</b> - E.B. White, Garth Williams. <font color="Red">$6.99</font></td></tr>

XML: content-oriented markup

<book>
  <title>Charlotte’s Web</title>
  <author>E.B. White</author>
  <author>Garth Williams</author>
  <price units="USD">6.99</price>
  <subject>Children’s Fiction</subject>
</book>

Page 6: Knowledge Representation Issues for the Semantic Web

OWL

Web Ontology Language
– W3C Recommendation
– released Feb. 2004

Semantic markup (markup linked to semantics):

<rdf:Description rdf:about="">
  <imports resource="www.books.com/bookont" />
</rdf:Description>
<Book rdf:ID="book26489">
  <author>E.B. White</author>
  <title>Charlotte’s Web</title>
  <price>6.99</price>
  <subject rdf:resource="&bookont;FictionChild" />
</Book>

bookont ontology (imported by the markup above):

<Class ID="Book" />
<Property ID="subject">
  <domain resource="#Book" />
  <range resource="#Topic" />
</Property>
<Class ID="FictionChild">
  <subclassOf resource="#Fiction" />
  <subclassOf resource="#Childrens" />
</Class>
…

Page 7: Knowledge Representation Issues for the Semantic Web

Ontology

Definition
– a logical theory that accounts for the intended meaning of a formal vocabulary (Guarino 98)
– has a formal syntax and unambiguous semantics
– inference algorithms can compute what logically follows
Relevance to Web:
– identify context
– provide shared definitions
– ease the integration of distinct resources

Page 8: Knowledge Representation Issues for the Semantic Web

Semantic Web Timeline (1996 to 2004)

Mar. 1996 – SHOE 0.90 (simple frames in HTML)
Jan. 1998 – SHOE 1.0 (frames + Horn logic)
Feb. 1998 – XML (semi-structured data for Web)
Sep. 1998 – Berners-Lee’s Semantic Web Roadmap
Feb. 1999 – RDF (semantic nets in XML)
Mar. 2001 – DAML+OIL (expressive DL in RDF)
May 2001 – Berners-Lee et al. Scientific American article
June 2002 – 1st Int’l Semantic Web Conference
Feb. 2004 – OWL (W3C Rec.)

Page 9: Knowledge Representation Issues for the Semantic Web

RDF and RDF Schema

[Figure: RDF graph of the data and schema below. The resource john has rdf:type u:Chair and g:name "John Smith"; u:Chair and g:Person have rdf:type rdfs:Class, and u:Chair is an rdfs:subClassOf g:Person; g:name has rdf:type rdf:Property and rdfs:domain g:Person.]

Data:

<rdf:RDF xmlns:g="http://schema.org/gen#" xmlns:u="http://schema.org/univ#">
  <u:Chair rdf:ID="john">
    <g:name>John Smith</g:name>
  </u:Chair>
</rdf:RDF>

Schema:

<rdf:Property rdf:ID="name">
  <rdfs:domain rdf:resource="#Person" />
</rdf:Property>

<rdfs:Class rdf:ID="Chair">
  <rdfs:subClassOf rdf:resource="http://schema.org/gen#Person" />
</rdfs:Class>

Page 10: Knowledge Representation Issues for the Semantic Web

URIs and Namespaces

URI
– Uniform Resource Identifier
– includes URLs
– but also anything that you can design an identification scheme for
– helps to prevent collision of names
– all the "symbols" in RDF are either URIs or Literals
Namespace
– a mechanism for abbreviating URIs
– by assigning a prefix for a URI fragment
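The namespace mechanism can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the `NAMESPACES` table and the `expand` helper are hypothetical names, and the namespace URIs are assumed to end in `#` so that a prefixed name expands by plain concatenation.

```python
# Hypothetical prefix table; the URIs are assumed to end in "#".
NAMESPACES = {
    "g": "http://schema.org/gen#",
    "u": "http://schema.org/univ#",
}


def expand(qname, namespaces=NAMESPACES):
    """Expand a prefixed name such as 'g:Person' into a full URI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in namespaces:
        raise KeyError(f"unknown prefix in {qname!r}")
    # Expansion is plain string concatenation of namespace and local name.
    return namespaces[prefix] + local


print(expand("g:Person"))
print(expand("u:Chair"))
```

Because two vocabularies can both define a local name such as `Chair` under different namespaces, the expanded URIs stay distinct, which is how name collisions are avoided.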

Page 11: Knowledge Representation Issues for the Semantic Web

OWL

RDF is a data language
– OWL adds ontologies to RDF
– used to define RDF classes and properties
OWL ontologies are written in RDF syntax
semantically, OWL is based on description logics
– tradeoff between expressivity and computability

Page 12: Knowledge Representation Issues for the Semantic Web

OWL Class Constructors

borrowed from Ian Horrocks

Page 13: Knowledge Representation Issues for the Semantic Web

OWL RDF Syntax

<owl:Class rdf:ID="Band">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasMember" />
      <owl:allValuesFrom rdf:resource="#Musician" />
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>

Band is a subset of the set of objects that have only Musicians as members
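One consequence of this axiom is that every known member of a Band can be inferred to be a Musician. The sketch below illustrates that single inference over plain Python tuples; the function name, the triple encoding, and the individuals (`beatles`, `ringo`, `fanclub`, `alice`) are illustrative assumptions, and a real DL reasoner does considerably more than this.

```python
def infer_musicians(triples):
    """Apply 'Band subClassOf (hasMember allValuesFrom Musician)' once.

    triples is a set of (subject, predicate, object) tuples.
    """
    bands = {s for s, p, o in triples if p == "type" and o == "Band"}
    inferred = set(triples)
    for s, p, o in triples:
        # Only members of something known to be a Band become Musicians.
        if p == "hasMember" and s in bands:
            inferred.add((o, "type", "Musician"))
    return inferred


data = {
    ("beatles", "type", "Band"),
    ("beatles", "hasMember", "ringo"),
    ("fanclub", "hasMember", "alice"),  # fanclub is not known to be a Band
}
result = infer_musicians(data)
print(("ringo", "type", "Musician") in result)
print(("alice", "type", "Musician") in result)
```

Note that the restriction is not used as a constraint check here: under the open-world reading, a member without a stated type is not an error, it simply gains the type Musician.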

Page 14: Knowledge Representation Issues for the Semantic Web

OWL Axioms

borrowed from Ian Horrocks

Page 15: Knowledge Representation Issues for the Semantic Web

OWL Inference

<owl:ObjectProperty rdf:ID="head">
  <rdfs:subPropertyOf rdf:resource="#member" />
</owl:ObjectProperty>

<owl:Class rdf:ID="Terrorist">
  <owl:equivalentClass>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#member" />
      <owl:someValuesFrom rdf:resource="#TerroristOrg" />
    </owl:Restriction>
  </owl:equivalentClass>
</owl:Class>

[Figure: Bin Laden is linked to Al Qaeda by head; Al Qaeda has type TerroristOrg; Bin Laden has the inferred type Terrorist.]

The head of an organization is also a member of it

A member of a terror organization is a terrorist

Therefore, the head of a terror organization is a terrorist
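The three-step argument above can be reproduced with a small forward-chaining loop. This is only a sketch under stated assumptions: triples are plain Python tuples, the two rules are hard-coded rather than read from an ontology, and only the "member of a TerroristOrg implies Terrorist" direction of the class equivalence is applied. The names `SUBPROPERTY` and `infer` are hypothetical.

```python
# head is a subproperty of member (hard-coded for this sketch).
SUBPROPERTY = {"head": "member"}


def infer(triples):
    """Forward-chain the two rules until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            # Rule 1: a subproperty assertion implies the superproperty.
            if p in SUBPROPERTY:
                new.add((s, SUBPROPERTY[p], o))
            # Rule 2: a member of some TerroristOrg is a Terrorist.
            if p == "member" and (o, "type", "TerroristOrg") in inferred:
                new.add((s, "type", "Terrorist"))
        if new <= inferred:
            return inferred
        inferred |= new


facts = {
    ("BinLaden", "head", "AlQaeda"),
    ("AlQaeda", "type", "TerroristOrg"),
}
closure = infer(facts)
print(("BinLaden", "type", "Terrorist") in closure)
```

The first pass derives the member triple from the head triple; the second pass uses it to derive the Terrorist type, mirroring the order of the argument.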

Page 16: Knowledge Representation Issues for the Semantic Web

Benefit of Description Logic

optimized computation of subsumption
– calculate implicit subClassOf relations
ontology integration
– if two ontologies use class expressions to define their vocabularies in terms of a third ontology, then subsumption can be used to compute an integrated ontology

Page 17: Knowledge Representation Issues for the Semantic Web

Species of OWL

OWL Full
– very expressive (e.g., classes as instances)
– theoretical properties not well understood
OWL DL
– has a standard model theoretic semantics
OWL Lite
– subset of OWL DL
– easier to reason with

Page 18: Knowledge Representation Issues for the Semantic Web

Formal Semantics

OWL Lite and OWL DL
– fairly standard DL-style model theoretic semantics
– defined using interpretations
– classes are sets of objects
– class constructors and axioms place conditions on interpretations
OWL Full
– non-standard RDF-style semantics
– but still model-theoretic in nature

Page 19: Knowledge Representation Issues for the Semantic Web

Selected Research Issues

Work by the SWAT lab at Lehigh
– students
» Yuanbo Guo
» Zhengxiang Pan
Semantics for distributed ontologies
Reasoning and scalability

Page 20: Knowledge Representation Issues for the Semantic Web

A Web of Ontologies

[Figure: ontologies A1, A2, B1, B2, B3, C1, D1, E1 and F1 connected by "revises" and "extends" links; data sources S1 to S5 connected to ontologies by "commits to" links.]

Page 21: Knowledge Representation Issues for the Semantic Web

Semantics of Ontology "Links"

Brachman (1983) regarding links between concepts in early semantic networks
– . . . the meaning of the link was often relegated to "what the code does with it" - neither an appropriate notion of semantics nor a useful guide for figuring out what the link, in fact, means.
DLs were one solution to this problem
In Semantic Web, links between ontologies now suffer from a similar lack of clear semantics

Page 22: Knowledge Representation Issues for the Semantic Web

owl:imports

ontology extension / commitment
semantics
– in order to satisfy an ontology, an interpretation must also satisfy all ontologies that it imports
only provides semantics for each document in isolation!
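Since an interpretation must satisfy every imported ontology, and imports can chain, the set of ontologies to satisfy is the transitive closure of the imports relation. The sketch below computes that closure; the dictionary encoding, the function name `imports_closure`, and the ontology names are assumptions made for illustration.

```python
def imports_closure(ontology, imports):
    """Return every ontology reachable from `ontology` via imports.

    imports maps an ontology name to the names it imports directly.
    """
    closure, stack = set(), [ontology]
    while stack:
        current = stack.pop()
        for imported in imports.get(current, ()):
            if imported not in closure:  # also guards against import cycles
                closure.add(imported)
                stack.append(imported)
    return closure


# Hypothetical import graph.
IMPORTS = {
    "book-data": ["bookont"],
    "bookont": ["general"],
    "general": [],
}
print(sorted(imports_closure("book-data", IMPORTS)))
```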

Page 23: Knowledge Representation Issues for the Semantic Web

Ontology Versioning

Each new version has a new URL
– other users may have committed to your ontology
» "point at" it using its URL
– if you change the file at that location, then you change their commitment without their consent
Issue: Should veh76 be a v2:Vehicle?

[Figure: http://ex.org/ont-v1 contains Vehicle, with veh76 of type Vehicle; http://ex.org/ont-v2 contains Vehicle and Car joined by a subClassOf link, with car54 attached by a type link.]

Page 24: Knowledge Representation Issues for the Semantic Web

Versioning Complications

Should Flipper be a v2:Mammal?
– depends
» is the change to correct a modeling error?
» or to reflect a change in interpretation of "Dolphin"?

[Figure: in http://ex.org/schema-v1, Dolphin is a subClassOf Fish and Flipper has type Dolphin; in http://ex.org/schema-v2, Dolphin is a subClassOf Mammal. Mammal appears in both versions.]

Page 25: Knowledge Representation Issues for the Semantic Web

Versioning in OWL

priorVersion
– indicates a previous version of an ontology
backwardCompatibleWith
– indicates a version with which the ontology is backward compatible
DeprecatedClass
– used to signify that a class should no longer be used
DeprecatedProperty
– used to signify that a property should no longer be used
versionInfo
– used for CVS-like strings
incompatibleWith
– opposite of backwardCompatibleWith

Page 26: Knowledge Representation Issues for the Semantic Web

OWL Versioning Syntax

<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:Ontology rdf:about="">
    <owl:priorVersion rdf:resource="http://ex.org/schema-v1" />
    <owl:backwardCompatibleWith rdf:resource="http://ex.org/schema-v1" />
  </owl:Ontology>
  <owl:DeprecatedClass rdf:ID="Megalodon" />
  <owl:Class rdf:ID="Dolphin">
    <rdfs:subClassOf rdf:resource="#Mammal" />
  </owl:Class>
</rdf:RDF>

Page 27: Knowledge Representation Issues for the Semantic Web

Formal Ontology Definition

Ontology O = <V,A,E,P,B>
– V = vocabulary (a set of symbols)
– A = axioms (a set of wffs)
– E = set of extended ontologies
– P = set of prior versions of ontology
– B = set of ontologies O is backward-compatible with (subset of P)

Page 28: Knowledge Representation Issues for the Semantic Web

Resource Definitions

R is the set of resources
Knowledge function
– maps resources to sets of wffs
– K : R → 2^W
Commitment function
– maps resources to ontologies
– C : R → O

Page 29: Knowledge Representation Issues for the Semantic Web

Ontology Perspectives

Users may wish to view data through the viewpoint of different ontologies
– versioning is a special case of this
An ontology specifies a set of axioms
Ontology perspectives specify a logical theory based on an ontology and a set of data sources
– combine axioms and ground atoms
– queries are with respect to a perspective

Page 30: Knowledge Representation Issues for the Semantic Web

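Ontology Perspective Theory

Given O = {O1, O2, …, On} where Oi = <Vi, Ai, Ei, Pi, Bi>, the ontology perspective theory based on Oi (written here as OPT_i, with anc(Oi) denoting the ancestors of Oi) is

\[
\mathrm{OPT}_i \;=\; A_i
\;\cup \bigcup_{\{j \,\mid\, O_j \in \mathrm{anc}(O_i)\}} A_j
\;\cup \bigcup_{\{r \in R \,\mid\, C(r) = O_i \,\lor\, C(r) \in \mathrm{anc}(O_i)\}} K(r)
\;\cup \bigcup_{\{r \in R \,\mid\, C(r) \in B_i\}} K(r)
\;\cup \bigcup_{\{r \in R \,\mid\, \exists j,\; O_j \in \mathrm{anc}(O_i) \,\land\, C(r) \in B_j\}} K(r)
\]

The five terms are, in order:
– axioms of basis ontology
– axioms of extended ontologies
– data from sources that commit to basis ontology or its ancestors
– data from sources that commit to ontologies that are compatible with the basis ontology
– data from sources that commit to ontologies that are compatible with ancestors of the basis ontology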

Page 31: Knowledge Representation Issues for the Semantic Web

Perspectives Example

Ontologies:
O1: A1 = {Dolphin(x) → Fish(x)}, B1 = {}
O2: A2 = {Dolphin(x) → Mammal(x)}, B2 = {O1}

Data:
C(r1) = O1
K(r1) = {Dolphin(flipper), Fish(charlie), Mammal(bigfoot)}
C(r2) = O2
K(r2) = {Dolphin(splasher)}

Query        Perspective 1       Perspective 2
Dolphin(x)   flipper             flipper, splasher
Fish(x)      charlie, flipper    charlie
Mammal(x)    bigfoot             bigfoot, flipper, splasher
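The table can be reproduced with a short Python sketch. It is an illustration under stated assumptions rather than the authors' implementation: axioms are restricted to unary implications encoded as (body, head) pairs, both ontologies are assumed to extend nothing (E is empty), and the dictionary layout and function names are hypothetical.

```python
# Ontologies from the example; "extends" is assumed empty for both.
ONTOLOGIES = {
    "O1": {"axioms": [("Dolphin", "Fish")], "extends": [], "compatible": []},
    "O2": {"axioms": [("Dolphin", "Mammal")], "extends": [], "compatible": ["O1"]},
}
# Data sources: the ontology each commits to, and its ground atoms.
SOURCES = {
    "r1": {"commits": "O1",
           "facts": {("Dolphin", "flipper"), ("Fish", "charlie"), ("Mammal", "bigfoot")}},
    "r2": {"commits": "O2", "facts": {("Dolphin", "splasher")}},
}


def ancestors(name):
    """Ontologies reachable from `name` through extends links."""
    seen, stack = set(), list(ONTOLOGIES[name]["extends"])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ONTOLOGIES[current]["extends"])
    return seen


def perspective(name):
    """Ground atoms entailed by the perspective based on ontology `name`."""
    lineage = {name} | ancestors(name)
    axioms = [ax for o in lineage for ax in ONTOLOGIES[o]["axioms"]]
    # Accept data committed to the lineage or to anything it is compatible with.
    accepted = set(lineage)
    for o in lineage:
        accepted |= set(ONTOLOGIES[o]["compatible"])
    facts = set()
    for source in SOURCES.values():
        if source["commits"] in accepted:
            facts |= source["facts"]
    changed = True
    while changed:  # forward-chain the implications to a fixed point
        changed = False
        for body, head in axioms:
            for cls, individual in list(facts):
                if cls == body and (head, individual) not in facts:
                    facts.add((head, individual))
                    changed = True
    return facts


def query(name, cls):
    return sorted(ind for c, ind in perspective(name) if c == cls)


for cls in ("Dolphin", "Fish", "Mammal"):
    print(cls, query("O1", cls), query("O2", cls))
```

In perspective 2 the data of r1 is included because O2 declares itself backward-compatible with O1, but the axiom of O1 is not, so flipper becomes a Mammal and is no longer a Fish.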

Page 32: Knowledge Representation Issues for the Semantic Web

Scalable Systems

Motivation
– the Web is large
» it won’t fit in main memory!
– current systems don’t scale
DLDB
– DB: Relational Database (Microsoft® Access)
» scalable technology for querying data
– DL: Description Logics (FaCT reasoner)
» rich inference capability
» close correspondence to semantics of OWL

Page 33: Knowledge Representation Issues for the Semantic Web

Design – RDF(S) Entailment

Use views to store class hierarchy

<owl:Class rdf:ID="Student" />
<owl:Class rdf:ID="UndergraduateStudent">
  <rdfs:subClassOf rdf:resource="#Student" />
</owl:Class>

CREATE VIEW Student_v AS
SELECT * FROM Student
UNION SELECT * FROM UndergraduateStudent_view
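The view technique can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not the DLDB implementation (which the slides say used Microsoft Access): the single `id` column, the view names, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (id TEXT PRIMARY KEY);
CREATE TABLE UndergraduateStudent (id TEXT PRIMARY KEY);

-- A leaf class view exposes only its own table.
CREATE VIEW UndergraduateStudent_view AS
    SELECT id FROM UndergraduateStudent;

-- A superclass view unions its own table with each subclass view.
CREATE VIEW Student_view AS
    SELECT id FROM Student
    UNION
    SELECT id FROM UndergraduateStudent_view;

INSERT INTO Student VALUES ('alice');
INSERT INTO UndergraduateStudent VALUES ('bob');
""")

students = sorted(row[0] for row in conn.execute("SELECT id FROM Student_view"))
print(students)
```

Querying `Student_view` returns bob even though he was only asserted to be an UndergraduateStudent, so the subclass entailment is answered by the database without materializing extra rows.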

Page 34: Knowledge Representation Issues for the Semantic Web

Design – OWL Entailment

[Figure: components Ontology, DL Reasoner, Inferred Hierarchy, and table & view creation (a database operation).]

Ontology (excerpt):
…
Student is a Person who takes a Course
GraduateStudent is a Person who takes a GraduateCourse
GraduateCourse is a Course
…

Inferred hierarchy (excerpt):
…
GraduateStudent is a Student
…

CREATE VIEW Student_1_view AS
SELECT * FROM Student_1
UNION SELECT * FROM UndergraduateStudent_1_view
UNION SELECT * FROM GraduateStudent_1_view;
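Given an inferred hierarchy, statements like the one above can be generated mechanically. The following sketch assumes the hierarchy arrives as a dictionary from each class to its direct subclasses and that table and view names follow the `_1` and `_1_view` pattern seen on the slide; the function name and the input format are hypothetical.

```python
def view_statements(subclasses, suffix="_1"):
    """Return CREATE VIEW statements, subclasses before superclasses.

    subclasses maps a class name to the list of its direct subclasses.
    """
    statements, done = [], set()

    def visit(cls):
        if cls in done:
            return
        done.add(cls)
        for sub in subclasses.get(cls, []):
            visit(sub)  # create subclass views first
        parts = [f"SELECT * FROM {cls}{suffix}"]
        parts += [f"SELECT * FROM {sub}{suffix}_view"
                  for sub in subclasses.get(cls, [])]
        statements.append(
            f"CREATE VIEW {cls}{suffix}_view AS " + " UNION ".join(parts) + ";")

    for cls in sorted(subclasses):
        visit(cls)
    return statements


HIERARCHY = {
    "Student": ["UndergraduateStudent", "GraduateStudent"],
    "UndergraduateStudent": [],
    "GraduateStudent": [],
}
for statement in view_statements(HIERARCHY):
    print(statement)
```

Emitting subclass views first keeps the script usable on databases that require a referenced view to exist when the referencing view is created.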

Page 35: Knowledge Representation Issues for the Semantic Web

Implementation – Query

[Figure: query pipeline with components Query Interface application, Query API, KIF-like conjunctive query, Query Translation Algorithm, SQL sentences, and RDBMS.]

KIF-like conjunctive query:

(Type GraduateStudent ?X)
(TakeCourse ?X http://www.foo.edu/department0/course0)

SQL sentence:

SELECT GraduateStudent_2_view.ID
FROM GraduateStudent_2_view, takeCourse_2_view
WHERE GraduateStudent_2_view.id = takeCourse_2_view.subject
  AND takeCourse_2_view.object = 'http://www.foo.edu/department0/course0'
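A translation of this kind can be sketched as follows. It is not the DLDB algorithm, only an illustration under assumptions: atoms are Python tuples, class views expose an `id` column, property views expose `subject` and `object`, constants are passed as SQL parameters instead of being inlined, and each view appears at most once in a query (repeated views would need aliases).

```python
def translate(atoms, suffix="_2_view"):
    """Translate conjunctive-query atoms into (sql, params).

    ("Type", cls, term) reads the class view; any other atom
    (prop, subject, object) reads the property view.
    """
    tables, conditions, params = [], [], []
    bindings = {}  # variable -> first column it was bound to

    def bind(term, column):
        if term.startswith("?"):
            if term in bindings:
                # Repeated variable: join on the earlier column.
                conditions.append(f"{bindings[term]} = {column}")
            else:
                bindings[term] = column
        else:
            conditions.append(f"{column} = ?")
            params.append(term)

    for atom in atoms:
        if atom[0] == "Type":
            _, cls, term = atom
            view = cls + suffix
            tables.append(view)
            bind(term, f"{view}.id")
        else:
            prop, subj, obj = atom
            view = prop + suffix
            tables.append(view)
            bind(subj, f"{view}.subject")
            bind(obj, f"{view}.object")

    sql = f"SELECT {', '.join(bindings.values())} FROM {', '.join(tables)}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


sql, params = translate([
    ("Type", "GraduateStudent", "?X"),
    ("takeCourse", "?X", "http://www.foo.edu/department0/course0"),
])
print(sql)
print(params)
```

Passing the course URI as a parameter avoids quoting problems when the statement is executed through a database API.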

Page 36: Knowledge Representation Issues for the Semantic Web

Lehigh University Benchmark

Can be used to evaluate semantic web reasoning systems
Features
– OWL ontology for university domain (moderate complexity)
– customizable data generation
» can select number of universities and random number generator seed
» arbitrary size
» repeatable
– plausible
» "real world" constraints are applied
Metrics
– load time
– repository size
– query response time
– degree of completeness
– degree of soundness
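The slide does not define the last two metrics. One common reading, assumed here, treats completeness as the fraction of entailed answers that a system returns and soundness as the fraction of returned answers that are entailed. The sketch below computes both for a single query; the function names and sample answers are hypothetical.

```python
def completeness(returned, entailed):
    """Fraction of entailed answers that were returned (assumed definition)."""
    entailed = set(entailed)
    if not entailed:
        return 1.0
    return len(set(returned) & entailed) / len(entailed)


def soundness(returned, entailed):
    """Fraction of returned answers that are entailed (assumed definition)."""
    returned = set(returned)
    if not returned:
        return 1.0
    return len(returned & set(entailed)) / len(returned)


# Hypothetical answers for one query.
entailed = {"s1", "s2", "s3", "s4"}
returned = {"s1", "s2", "x9"}
print(completeness(returned, entailed))
print(soundness(returned, entailed))
```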

Page 37: Knowledge Representation Issues for the Semantic Web

Benchmark System

[Figure: benchmark architecture with components Data Generator, Benchmark Data, Univ-Bench Ontology, 14 Test Queries*, Tester, Test Results, and Repository 1 through Repository N, each accessed through an API.]

* Each query is executed 10 times to account for caching.

Page 38: Knowledge Representation Issues for the Semantic Web

Initial Experiment

Four systems tested
– Sesame Memory, Sesame DB, OWLJessKB, DLDB
Five data sizes
– ranging from 15 files (8 MB) to 999 files (583 MB)
Summary of results
– Sesame-Memory best for small to medium size if only RDFS inference is needed
– OWLJessKB can answer queries none of the other systems can
» but doesn’t scale and makes some unsound inferences
– DLDB has best balance between query response time and completeness

Page 39: Knowledge Representation Issues for the Semantic Web

Some Other Research Topics

Knowledge acquisition
Language design
Semantic Web services

Page 40: Knowledge Representation Issues for the Semantic Web

Knowledge Acquisition

data
– create or find relevant ontology
– then either
» convert existing forms to RDF, e.g., XML, relational DBs, CGs, etc.
» information extraction
» natural language processing
» controlled English? (Sowa, yesterday)
ontologies
– import existing ontologies
– manual creation (e.g., Protégé)
– machine learning
– formal concept analysis? (Rudolph, yesterday)

Page 41: Knowledge Representation Issues for the Semantic Web

Language Design

DL is insufficient for some applications
Significant demand for "rules"
– Combining logic programming with DL (Grosof et al. 2003)
SWRL (Semantic Web Rule Language)
– proposal to add Horn logic to OWL
However, must consider expressivity / scalability tradeoff

Page 42: Knowledge Representation Issues for the Semantic Web

Semantic Web Services

Web service
– a web-accessible program that provides information or performs an action
OWL-S
– ontology for describing web services
» consists of profile, process model, and grounding
Current research includes:
– matchmaking (e.g., see work of Sycara)
– automated composition (e.g., see work of McIlraith)
– much more …

Page 43: Knowledge Representation Issues for the Semantic Web

Conclusion

The Semantic Web is concerned with interoperability of distributed information
OWL is a standard that allows for sharing of ontologies
– if you want your ontologies to be used by the world, then export (what you can) to OWL
There is much research to do before the Semantic Web problem is solved
– we need all the help we can get!

Page 44: Knowledge Representation Issues for the Semantic Web

For more information...

Useful websites
– http://www.semwebcentral.org/
– http://www.w3.org/2001/sw/
– http://www.daml.org/
– http://www.semanticweb.org/
My information
– [email protected]
– http://www.cse.lehigh.edu/~heflin/

Page 45: Knowledge Representation Issues for the Semantic Web

The End

Page 46: Knowledge Representation Issues for the Semantic Web

Ontology Divergence

The Web is distributed and dynamic
Therefore, ontological differences will arise
– terminology
– scope
– encoding
– context

[Figure: three ontologies (general-ontology, trans-ont, vehicle-ont) with terms Thing, Object, Car, Automobile, Civic, Delorean, Porsche and Escort connected by isa links.]

Page 47: Knowledge Representation Issues for the Semantic Web

Resolving Ontology Divergence

Mapping Ontology
– OM contains rules that map concepts between the ontologies
Mapping Revisions
– O1 contains rules that map O2 objects to O1 terminology. O2 does the reverse
Intersection Ontology
– ON contains intersection of concepts. O1 and O2 rename terms where necessary

[Figure: one diagram per approach showing O1, O2 and, where applicable, OM or ON. Key: "revised by" and "extended by" links.]

Page 48: Knowledge Representation Issues for the Semantic Web

Implementation - Database Schema

Ontologies_Index
SeqNum  URL
1       http://www.lehigh.edu/~zhp2/univ-bench.owl

Source_Index
SeqNum  URL
1       http://www.lehigh.edu/~zhp2/univ-bench.owl
2       file:/D:/demo/UBArtiData/University0_0.owl

URI_Index
ID  URI
1   http://www.Department0.University0.edu/UndergraduateStudent121
2   http://www.Department0.University0.edu/GraduateCourse9
3   http://www.Department0.University0.edu/GraduateStudent123

[Tables Student_1_view (header cells: ID, Source) and TakeCourse_1 (header cells: Subject, Object, Source): the cell values 13, 11, 2, 1… and 13 survive in the extracted text, but their row and column assignment does not.]