The Semantic Web
Stefan Decker, Information Sciences Institute, University of Southern California
3
Semantic Web
• coined by Tim Berners-Lee (1997)
“The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
– T. Berners-Lee, J. Hendler, O. Lassila, “The Semantic Web”, Scientific American, May 2001
4
Doctor’s appointment
(scenario from “The Semantic Web”, Scientific American, May 2001)
[Figure: agents cooperating to schedule an appointment. Labels: Mom; Physician’s Agent; Lucy’s Agent; Pete’s Agent; Insurance Co.; Provider sites; Rating; required treatment; Schedule appointment; in-plan? close-by? Specialist?; Driving schedule]
5
Means to Achieve the Vision
• Explicit Ontologies
– Needed to understand each other’s data (e.g., a joint notion of what a schedule is)
• Web Services
– Required to actively interconnect systems (e.g., automatically make an appointment)
6
Technical challenges
• Interoperability
– Inaccurate, incomplete, heterogeneous data
– Unreliable, ill-defined, evolving services
• Natural language processing, data mining
– Make information explicit
• Human-computer interaction
– Querying interfaces, visualization
• Scalability
– Subsecond performance
7
Social challenges
• Standardization is hard
– DublinCore
• Bogus or inaccurate metadata
– Physician rating, profile
• Competition and commoditization
• Economic incentive
– Chicken and egg
• Complexity: developers and users
8
Jump Starters
• Machine Readable Data:
– .org (human-edited directory)
– .org (Music encyclopedia)
– RSS (RDF Site Summary)
– (embedded metadata)
– CC/PP (Composite Capability/Preference Profiles)
– P3P (Platform for Privacy Preferences)
9
Jump Starters
• B2B Vocabulary Projects
– PapiNet.org: Vocabulary for Paper Industry
– BPMI.org: Vocabulary for exchanging Business Process Models
– XML-HR: Vocabularies for human resources (HR)
– DMTF (Distributed Management Task Force): Vocabularies for managing enterprises
– …
• Research Vocabulary Projects
– Gene Ontology Working Group
– Earth Sciences
– MathNet
– …
10
How do we get there?
Research communities: DL, AI, DB, …
Industry: IBM, Nokia, HP, Microsoft(?), ...
Standards bodies: W3C, OMG, …
Non-profit: US, EC, Japan
Business.semanticweb.org
11
Non-profit
• DARPA
– “DARPA Agent Markup Language”
– since Aug 2000
• NSF
– Co-sponsored events (e.g., SWWS)
– Further support in the loop
• European Commission
– “Semantic Web Technologies”, Framework 6
• Japan
– Interoperability Technology Association for Information Processing, Japan (INTAP)
www.daml.org
www.ontoweb.org
www.semanticweb.org/SWWS
www.net.intap.or.jp/INTAP/
12
AI: “Add logic to the Web”
• Assertions, rules
• Agents
• Interoperability
– First-order logics
– Ontologies, description logics
– Logic programming, datalog
– Problem-solving methods
– …
Distributed knowledge base
13
DB: “Everything is syntax”
• Semistructured data
• Web services
• Interoperability
– Data integration
– Mediation, query rewriting
– Model management
– Conceptual modeling
Conglomerate of distributed heterogeneous (semistructured) databases
16
Step 1
• Define uniform, underlying syntax
– Lowest common denominator: labeled graphs (semi-structured data) -> RDF

Relational Database
ID   F-name   L-name
1    Stefan   Decker
2    Birgit   Decker
[Figure: the same table as a labeled graph; arcs labeled row, ID, F-name and L-name connect Person nodes to the values 1, Stefan, Decker and 2, Birgit, Decker]

Structured Text (e.g., vCard)
begin: vcard
fn: Stefan
n: Decker;Stefan
end: vcard
[Figure: the same vCard as a labeled graph; node vcard1 has an fn arc to Stefan and an n arc to Decker;Stefan]
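The idea of reducing both sources to one labeled graph can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the row node names (`person1`, `person2`) and the exact arc layout are hypothetical choices, not taken from the slides.

```python
# Minimal sketch: two different source formats reduced to (node, arc, value) triples.
# Function names, row node names and the arc layout are illustrative assumptions.

def rows_to_triples(table_name, columns, rows):
    """Flatten relational rows into labeled-graph triples."""
    triples = []
    for index, row in enumerate(rows, start=1):
        node = f"{table_name.lower()}{index}"      # one graph node per table row
        triples.append((table_name, "row", node))  # link the table to its row node
        for column, value in zip(columns, row):
            triples.append((node, column, value))  # one arc per column value
    return triples


def vcard_to_triples(node, text):
    """Turn 'key: value' lines of a vCard-like text into triples about one node."""
    triples = []
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key.lower() in ("begin", "end"):        # structural markers carry no data
            continue
        triples.append((node, key, value))
    return triples


person_triples = rows_to_triples(
    "Person",
    ["ID", "F-name", "L-name"],
    [(1, "Stefan", "Decker"), (2, "Birgit", "Decker")],
)
vcard_triples = vcard_to_triples(
    "vcard1", "begin: vcard\nfn: Stefan\nn: Decker;Stefan\nend: vcard"
)
```

Once both sources are lists of triples, they can be stored and queried together regardless of their original format.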
17
XML
• Containment, hierarchy
• Adjacency (A followed by B)
• Attributes (atomic values)
• Opaque reference (IDREF)
Good for serialization, poor for modeling relational semantics
18
Encoding of Information
“The Creator of the Resource http://www.w3.org/Home/Lassila is Ora Lassila”
[Figure: http://www.w3.org/Home/Lassila -- Creator --> Ora Lassila]
Endless encoding possibilities in XML:
<Creator> <uri>http://www.w3.org/Home/Lassila</uri> <name>Ora Lassila</name> </Creator>
<Document uri="http://www.w3.org/Home/Lassila"> <Creator>Ora Lassila</Creator> </Document>
<Document uri="http://www.w3.org/Home/Lassila" Creator="Ora Lassila"/>
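To make the point concrete, the following Python sketch parses the three XML encodings and shows that each needs its own extraction logic, yet all collapse to the same statement. The function name `creator_triple` and the triple layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

URI = "http://www.w3.org/Home/Lassila"

# The three encodings from the slide, written as well-formed XML.
ENCODING_A = f"<Creator><uri>{URI}</uri><name>Ora Lassila</name></Creator>"
ENCODING_B = f'<Document uri="{URI}"><Creator>Ora Lassila</Creator></Document>'
ENCODING_C = f'<Document uri="{URI}" Creator="Ora Lassila"/>'


def creator_triple(xml_text):
    """Extract (resource, 'Creator', name); every encoding needs its own branch."""
    root = ET.fromstring(xml_text)
    if root.tag == "Creator":
        # Encoding A: resource and name are child elements of <Creator>.
        return (root.findtext("uri"), "Creator", root.findtext("name"))
    if root.tag == "Document":
        # Encoding C stores the name as an attribute, encoding B as a child element.
        name = root.get("Creator") or root.findtext("Creator")
        return (root.get("uri"), "Creator", name)
    raise ValueError(f"unknown encoding: {root.tag}")


triples = {creator_triple(text) for text in (ENCODING_A, ENCODING_B, ENCODING_C)}
```

The set `triples` ends up with a single element, which is the motivation for agreeing on one graph-level model instead of on XML element layouts.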
19
Introduction to RDF
• RDF (Resource Description Framework)
– Beyond machine readable to machine understandable
• RDF unites a wide variety of stakeholders:
– Digital librarians, content-raters, privacy advocates, B2B industries, AI...
– Significant (but less than XML) industrial momentum, led by W3C
• RDF consists of two parts
– RDF Model (a set of triples)
– RDF Syntax (different XML serialization syntaxes)
• RDF Schema for definition of Vocabularies (simple Ontologies) for RDF (and in RDF)
20
A Simple Example
• Describing Resources
– URIs: global OIDs, literals
– Binary relationships between objects
– Arcs (relationships) are first-class objects
– Blank (anonymous) nodes
• “Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila”
• Structure
– Resource (subject): http://www.w3.org/Home/Lassila
– Property (predicate): http://www.schema.org/#Creator
– Value (object): “Ora Lassila”
[Figure: http://www.w3.org/Home/Lassila -- s:Creator --> Ora Lassila]
21
RDF
• Graph-based universal syntax
Scheduling Service
Insurance Ratings Calendar
RDF-Layer (Single dataformat, Query and storage System)
(Agent-) Applications
Semantics in a global, open environment?
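A rough Python sketch of what a single data format buys the applications: graphs from separate services can be merged with a set union and queried with one generic pattern matcher. All resource names and property names below (`dr_hartman`, `inPlan`, `rating`, `freeOn`) are invented for illustration and do not come from the slides.

```python
# Minimal sketch of an "RDF layer": every source is a set of triples,
# so merging is set union and querying is one pattern-matching function.
# All data below is hypothetical.

def match(graph, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


calendar = {("lucy", "freeOn", "tuesday"), ("lucy", "freeOn", "thursday")}
ratings = {("dr_hartman", "rating", "excellent"), ("dr_miller", "rating", "good")}
insurance = {("dr_hartman", "inPlan", "yes"), ("dr_miller", "inPlan", "no")}

# One merged graph instead of three source-specific APIs.
rdf_layer = calendar | ratings | insurance

in_plan = {s for s, _, _ in match(rdf_layer, predicate="inPlan", obj="yes")}
excellent = {s for s, _, _ in match(rdf_layer, predicate="rating", obj="excellent")}
candidates = in_plan & excellent
```

The merge itself is trivial; what the sketch cannot settle is whether `rating` or `inPlan` mean the same thing across sources, which is the open question the slide ends on.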
22
Step2: Ontologies
• What is an Ontology?
“An ontology is a specification of a conceptualization.” (Tom Gruber, 1993)
• Ontologies are social contracts
– Agreed, explicit semantics
– Understandable to outsiders
– (Often) derived in a community process
• Ontologies require Knowledge Representation
– Is_a hierarchy, part of, attributes, axioms
23
RDF and Ontologies
Idea: Define an Ontology Language by defining
predefined nodes and arcs
The Ontology Language itself is just an Ontology
Ontologies are used to tag data from sources
24
Step 2: Layers on Top of RDF
Tim Berners-Lee: “Axioms, Architecture and Aspirations”, W3C all-working group plenary meeting, 28 February 2001
[Figure: the Person graph from the first step (row, ID, F-name and L-name arcs; values 1, Stefan, Decker and 2, Birgit, Decker) extended with an arc Person -- subClassOf --> LivingThing, labeled “From an Ontology”]
25
W3C Semantic Web Activity
• Annotation (Annotea)
• Access control
• Calendaring
• Collaboration
• Logic
• Rules
• Workflows
Working Groups
Web Ontology
Advanced development
RDF Core
26
RDF Core Working Group
• Resource Description Framework (RDF)
• Goals
– Improve RDF abstract model and XML syntax according to implementers’ feedback
– Define precise semantics for RDF and RDF Schema
– Clarify ties with XML family
27
Web Ontology Working Group
• Standard definition language for ontologies (conceptual models)
• Derived from Description Logics
– But partial mapping to Database and Datalog possible (see Horrocks, Volz, Decker, Grosof: WWW2003)
• Extension of RDF Schema and DAML+OIL
– Class Expressions (Intersection, Union, Complement)
– XML Schema Datatypes
– Enumerations
– Property Restrictions
• Cardinality Constraints
• Value Restrictions
28
The Layer Cake
Tim Berners-Lee: “Axioms, Architecture and Aspirations”, W3C all-working group plenary meeting, 28 February 2001
Recommendation Phase
Standardization Phase
Research Phase
30
Tasks within SCEC - CME
• Towards an Earth Sciences Ontology:
– Cataloging and Unification of Existing Databases
• E.g., Fissures and Fault Activity Database
• Building a Mediation Environment
• Organizing a Community Process
• Enriching of Web Services and Grid Infrastructure with Semantics
– Service Discovery and Match Making
31
Fault Activity Database
• Hand-Maintained within SCEC (Sue Perry)
• Re-engineering of the Database Schemata

<rdfs:Class rdf:about="&FAD_v1;AVG_RECURRENCE_INTERVAL"
    rdfs:label="AVG_RECURRENCE_INTERVAL">
  <a:_slot_constraints rdf:resource="&FAD_v1;SCFADsep_02_00106"/>
  <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdfs:Class rdf:about="&FAD_v1;AVG_SLIP_PER_EVENT"
    rdfs:label="AVG_SLIP_PER_EVENT">
  <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdfs:Class rdf:about="&FAD_v1;AVG_SLIP_PER_EVENT_METHOD"
    rdfs:label="AVG_SLIP_PER_EVENT_METHOD">
  <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdf:Property rdf:about="&FAD_v1;CFM-A_coord_file_URL"
    a:maxCardinality="1" rdfs:label="CFM-A_coord_file_URL">
  <rdfs:domain rdf:resource="&FAD_v1;FAULT"/>
  <rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>
32
Planned: Mediation Environment with RDF-based Rule Language
Fault Activity Database
Fissures Grid Services
Mediation with RDF-based Rule Language
Applications
33
Motivation: Why Rule Languages for the Web
• Plethora of data available
– Data needs to be adapted and combined
– “Time to Market”: Faster to write rules than code
– Data Transformation and Integration
• Logic specification, not programming
– Tabled evaluation/bottom-up evaluation
– Semi-structured data
– Multiple semantics (Relational Data, UML, ER, TopicMaps, DAML+OIL, XML-Schema, special purpose data models)
– Distributed, heterogeneous sources
34
What’s Wrong With Existing Approaches?
• Built-in semantics (e.g. SiLRI, RQL, DQL)
– but: many RDF-based languages with different semantics (DAML+OIL, RDF Schema, UML/RDF, TopicMaps/RDF, DMTF, …)
– For each language a specialized query language?
35
TRIPLE: Language Overview
• Native support for
– Resources & namespaces
– Abbreviations
– Models (sets of RDF statements)
– Reification
• Rules with expressive bodies (full FOL syntax)
• Inspired by F-Logic:
– subject[predicate -> object] (“molecule”)
36
Language Description I
• Namespace and resource abbreviations:
– rdf := “http://www.w3.org/1999/02/22-rdf-syntax-ns#”.
– isa := rdf:subClassOf.
• Statements, triples, molecules:
– subject[predicate -> object]
– subject[p1 -> o1; p2 -> o2; ...]
– s1[p1 -> s2[p2 -> o]]
• Models, model expressions, parameterized models:
– s[p -> o]@m: “triple <s,p,o> in model m”
– s[p -> o]@(m1 m2): model intersection, union, diff.
– s[p -> o]@sf(m1, X, Y): Skolem function
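One way to read these constructs, sketched in Python rather than TRIPLE: a model is a set of triples, a molecule is shorthand for several triples about one subject, and model expressions are ordinary set operations. The helper names `molecule` and `holds` and the sample data are assumptions for illustration only.

```python
# Minimal sketch: TRIPLE models read as Python sets of (s, p, o) triples.
# Helper names and sample data are hypothetical.

def molecule(subject, *pairs):
    """subject[p1 -> o1; p2 -> o2; ...] expands to one triple per pair."""
    return {(subject, predicate, obj) for predicate, obj in pairs}


def holds(s, p, o, model):
    """s[p -> o]@model: the triple <s,p,o> is in the given model."""
    return (s, p, o) in model


m1 = molecule("doc1", ("title", "TRIPLE"), ("creator", "Stefan Decker"))
m2 = molecule("doc1", ("creator", "Stefan Decker"), ("subject", "RDF"))

# Model expressions as set operations on the underlying triple sets.
intersection = m1 & m2
union = m1 | m2
difference = m1 - m2
```

A statement such as `holds("doc1", "title", "TRIPLE", difference)` then plays the role of querying a model expression instead of a single named model.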
37
Language Description II
• Reification:
– stefan[believes -> <Ora[isAuthorOf -> homepage]>]
• Logical formulae:
– usual logical connectives and quantifiers
– all variables introduced via FORALL (or EXISTS)
• Clauses:
– facts: s[p1 -> o1; p2 -> o2; ...].
– rules: FORALL X s1[p1 -> X] <- s2[p2 -> X] AND ... .
• Model blocks:
– @model { clauses }
– FORALL Mdl @model(Mdl) { clauses }
38
Example: Dublin Core

namespace abbreviations:
dc := “http://purl.org/dc/elements/1.0/”.
db := “http://www-db.stanford.edu/”.

model block containing a fact:
@db:documents {
  db:d_01_01[
    dc:title -> TRIPLE;
    dc:creator -> “Stefan Decker”;
    dc:subject -> RDF;
    dc:subject -> triples;
    ... ].
}

[Figure: graph of db:d_01_01 with arcs dc:title -> TRIPLE, dc:creator -> Stefan Decker, dc:subject -> RDF, dc:subject -> triples, ...; a derived node with rdf:type -> Person and name -> Stefan Decker]

rule:
FORALL N p(N)[rdf:type -> xyz:Person; xyz:name -> N] <- EXISTS D D[dc:creator -> N].

query (“find all names”):
FORALL N <- EXISTS P P[rdf:type -> xyz:Person; xyz:name -> N]@db:documents.

result:
N = “Stefan Decker”
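The rule and query on this slide can be mimicked in plain Python to show what is being derived. This is a sketch under assumptions, not how TRIPLE evaluates rules: the functions `apply_person_rule` and `find_all_names` are hypothetical, and the Skolem term p(N) is modeled as the tuple `("p", N)`.

```python
# Minimal sketch of the Dublin Core rule: every dc:creator value N yields
# a person node p(N) with rdf:type xyz:Person and xyz:name N.
# Function names and the tuple encoding of p(N) are illustrative assumptions.

documents = {
    ("db:d_01_01", "dc:title", "TRIPLE"),
    ("db:d_01_01", "dc:creator", "Stefan Decker"),
    ("db:d_01_01", "dc:subject", "RDF"),
    ("db:d_01_01", "dc:subject", "triples"),
}


def apply_person_rule(model):
    """Return the model extended with the triples derived by the rule."""
    derived = set(model)
    for _, predicate, name in model:
        if predicate == "dc:creator":
            person = ("p", name)  # stands in for the Skolem term p(N)
            derived.add((person, "rdf:type", "xyz:Person"))
            derived.add((person, "xyz:name", name))
    return derived


def find_all_names(model):
    """Names N such that some P has rdf:type xyz:Person and xyz:name N."""
    persons = {s for s, p, o in model if p == "rdf:type" and o == "xyz:Person"}
    return sorted(o for s, p, o in model if p == "xyz:name" and s in persons)


names = find_all_names(apply_person_rule(documents))
```

Because the person node is a function of the name, two documents by the same creator would map to the same derived person, which is the usual reason for using a Skolem term here.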
39
Example: Specification of RDF Schema Semantics
namespace abbreviations:
rdf := 'http://www.w3.org/...rdf-syntax-ns#'.
rdfs := 'http://www.w3.org/.../PR-rdf-schema-...#'.

resource abbreviations:
type := rdf:type.
subPropertyOf := rdfs:subPropertyOf.
subClassOf := rdfs:subClassOf.

model block:
FORALL Mdl @rdfschema(Mdl) {
  FORALL O,P,V O[P->V] <- O[P->V]@Mdl.
  FORALL O,V O[subClassOf->V] <- EXISTS W (O[subClassOf->W] AND W[subClassOf->V]).
  …
}

The first rule “copies” the triples from Mdl; the second states the transitivity of subClassOf.
40
Example: Cars Ontology with RDF Schema Semantics
@cars {
xyz:MotorVehicle[rdfs:subClassOf -> rdfs:Resource].
xyz:PassengerVehicle[rdfs:subClassOf -> xyz:MotorVehicle].
xyz:Truck[rdfs:subClassOf -> xyz:MotorVehicle].
xyz:Van[rdfs:subClassOf -> xyz:MotorVehicle].
xyz:MiniVan[
rdfs:subClassOf -> xyz:Van;
rdfs:subClassOf -> xyz:PassengerVehicle].
}
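[Figure: class hierarchy. xyz:Van, xyz:Truck and xyz:PassengerVehicle are subclasses of xyz:MotorVehicle; xyz:MiniVan is a subclass of both xyz:Van and xyz:PassengerVehicle]

Query against the plain model:
FORALL X <- X[rdfs:subClassOf -> xyz:MotorVehicle]@cars.
X = xyz:Van
X = xyz:Truck
X = xyz:PassengerVehicle

Query against the model with RDF Schema semantics:
FORALL X <- X[rdfs:subClassOf -> xyz:MotorVehicle]@rdfschema(cars).
X = xyz:Van
X = xyz:Truck
X = xyz:PassengerVehicle
X = xyz:MiniVan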
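The difference between the two query results can be reproduced with a small Python sketch that treats `rdfschema` as a function from a model to its closure under subClassOf transitivity. The fixpoint loop is a simple bottom-up evaluation chosen for clarity; it is an assumption about how one might implement the rule, not a description of the TRIPLE engine.

```python
# Minimal sketch: the cars model as triples, and rdfschema(model) as the
# closure under "subClassOf is transitive". Naive fixpoint, for illustration.

SUBCLASS = "rdfs:subClassOf"

cars = {
    ("xyz:MotorVehicle", SUBCLASS, "rdfs:Resource"),
    ("xyz:PassengerVehicle", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:Truck", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:Van", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:MiniVan", SUBCLASS, "xyz:Van"),
    ("xyz:MiniVan", SUBCLASS, "xyz:PassengerVehicle"),
}


def rdfschema(model):
    """Copy all triples, then add subClassOf links until nothing new appears."""
    closed = set(model)
    changed = True
    while changed:
        changed = False
        links = [(s, o) for s, p, o in closed if p == SUBCLASS]
        for sub, middle in links:
            for middle2, sup in links:
                if middle == middle2 and (sub, SUBCLASS, sup) not in closed:
                    closed.add((sub, SUBCLASS, sup))
                    changed = True
    return closed


def subclasses_of(model, cls):
    """All X with X[rdfs:subClassOf -> cls] in the given model."""
    return {s for s, p, o in model if p == SUBCLASS and o == cls}


direct = subclasses_of(cars, "xyz:MotorVehicle")
inferred = subclasses_of(rdfschema(cars), "xyz:MotorVehicle")
```

Here `direct` holds the three classes stated explicitly, while `inferred` additionally contains xyz:MiniVan, matching the two result lists on the slide.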
41
Grid Computing and Web Services (ongoing)
• Matchmaking between Jobs and Resources
• Hard-Coded in Globus Toolkit
– Reengineering using an Ontology and Rule-based solution
– RDF and DMTF Vocabulary (www.dmtf.org)

<rdfs:Class rdf:ID="CIM_ComputerSystem">
  <rdfs:subClassOf rdf:resource="#CIM_System"/>
  <version><![CDATA["2.6.0"]]></version>
  <rdfs:comment parseType="Literal"><![CDATA["A class derived from System that is a special collection of ManagedSystemElements. This collection provides compute capabilities and serves as aggregation point to associate one or more of the following elements: FileSystem, OperatingSystem, Processor and Memory (Volatile and/or NonVolatile Storage)."]]></rdfs:comment>
  <rdfs:subClassOf>
    <daml:Restriction>
      <daml:toClass rdf:resource="#string"/>
      <daml:onProperty>
        <daml:DatatypeProperty rdf:ID="NameFormat">
          <daml:toClass rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
        </daml:DatatypeProperty>
      </daml:onProperty>
    </daml:Restriction>
  </rdfs:subClassOf>
</rdfs:Class>