© 2009 Franz J. Kurfess Semantic Web 1
CPE/CSC 481: Knowledge-Based Systems
Dr. Franz J. Kurfess
Computer Science Department
Cal Poly

Usage of the Slides
these slides are intended for the students of my CPE/CSC 481 “Knowledge-Based Systems” class at Cal Poly SLO
  if you want to use them outside of my class, please let me know ([email protected])
I usually put together a subset for each quarter as a “Custom Show”
  to view these, go to “Slide Show => Custom Shows”, select the respective quarter, and click on “Show”
to print them, I suggest using the “Handout” option
  4, 6, or 9 per page works fine
  Black & White should be fine; there are few diagrams where color is important

Course Overview
Introduction
Knowledge Representation
  Semantic Nets, Frames, Logic
Reasoning and Inference
  Predicate Logic, Inference Methods, Resolution
Reasoning with Uncertainty
  Probability, Bayesian Decision Making
Expert System Design
  ES Life Cycle
CLIPS Overview
  Concepts, Notation, Usage
Pattern Matching
  Variables, Functions, Expressions, Constraints
Expert System Implementation
  Salience, Rete Algorithm
Expert System Examples
Semantic Web & Knowledge
Conclusions and Outlook

Overview
Introduction
Knowledge Processing
  Knowledge Acquisition, Representation and Manipulation
Knowledge Organization
  Classification, Categorization
  Ontologies, Taxonomies, Thesauri
Knowledge Retrieval
  Information Retrieval
  Knowledge Navigation
Knowledge Presentation
  Knowledge Visualization
Knowledge Exchange
  Knowledge Capture, Transfer, and Distribution
Usage of Knowledge
  Access Patterns, User Feedback
Knowledge Management Techniques
  Topic Maps, Agents
Knowledge Management Tools
Knowledge Management in Organizations

Overview Semantic Web
Motivation
Objectives
Semantic Web Introduction
  World Wide Web
  “Deep Web”
  Knowledge and the Web
Syntactic vs. Semantic Web
  human view of documents
  computer view of documents
Knowledge Representation on the Web
  XML + Meta-tags
  RDF
Knowledge Organization with Ontologies
  Conceptual building blocks
  Web Ontology Language (OWL)
Reasoning with Ontologies
  Description Logics
  Reasoning with OWL
Important Concepts and Terms
Chapter Summary

Logistics
Introductions
Course Materials
  textbook
  handouts
  Web page
  CourseInfo/Blackboard System and Alternatives
Term Project
Lab and Homework Assignments
Exams
Grading

History of the Semantic Web
The Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN
TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:
  “... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.”
TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web
  E.g., article in May 2001 issue of Scientific American…
Horrocks & Rector, 2004

Semantic Web – Scientific American
Realising the complete “vision” is too hard for now (probably)
But we can make a start by adding semantic annotation to web resources
Scientific American, May 2001

Where we are Today: the Syntactic Web
[Hendler & Miller 02]
Horrocks & Rector, 2004

The Syntactic Web is…
A place where computers do the presentation (easy) and people do the linking and interpreting (hard).
A hypermedia, a digital library
  A library of documents (called web pages) interconnected by a hypermedia of links
A database, an application platform
  A common portal to applications accessible through web pages, and presenting their results as web pages
A platform for multimedia
  BBC Radio 4 anywhere in the world!
  Terminator 3 trailers!
A naming scheme
  Unique identity for those documents (URLs)
Why not get computers to do more of the hard work? [Goble 03]
Horrocks & Rector, 2004

Hard Work using the Syntactic Web…
Find images of Steve Furber
Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois
Carole Goble
… Alan Rector…
Horrocks & Rector, 2004

Impossible (?) using the Syntactic Web…
Complex queries involving background knowledge
  Find information about “animals that use sonar but are not either bats or dolphins”, e.g., Barn Owl
Locating information in data repositories
  Travel enquiries
  Prices of goods and services
  Results of human genome experiments
Finding and using “web services”
  Visualise surface interactions between two proteins
Delegating complex tasks to web “agents”
  Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English
Horrocks & Rector, 2004

Syntactic vs. Semantic Web
human view of documents
computer view of documents

What is the Problem?
Web pages contain
  content (text, images, music)
  markup (HTML, XHTML)
  hyperlinks
  code (JavaScript)
Content is most critical for humans
  but meaningless to computers
  requires interpretation (understanding)

What information can we see…
WWW2002
The eleventh international world wide web conference
Sheraton waikiki hotel
Honolulu, hawaii, USA
7-11 may 2002
1 location 5 days learn interact
Registered participants coming from
  australia, canada, chile, denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire
Register now
On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event …
Speakers confirmed
Tim berners-lee
  Tim is the well known inventor of the Web, …
Ian Foster
  Ian is the pioneer of the Grid, the next generation internet …
Horrocks & Rector, 2004

What information can a machine see…
…
Horrocks & Rector, 2004

Solution: XML markup with “meaningful” tags?
  <name> </name>
  <location> </location>
  <date> </date>
  <slogan> </slogan>
  <participants> </participants>
  <introduction> … </introduction>
  <speaker> </speaker>
  <bio> </bio>
  …
Horrocks & Rector, 2004

Still the Machine only sees…
  < > </ >
  < > </ >
  < > </ >
  < > </ >
  < > </ >
  < > … </ >
  < > </ >
  < > </ >
  < > </ >
  < > </ >
Horrocks & Rector, 2004

Need to Add “Semantics”
External agreement on meaning of annotations
  E.g., Dublin Core for annotation of library/bibliographic information
  Agree on the meaning of a set of annotation tags
  Problems with this approach
    Inflexible
    Limited number of things can be expressed
Use Ontologies to specify meaning of annotations
  Ontologies provide a vocabulary of terms
  New terms can be formed by combining existing ones
    “Conceptual Lego”
  Meaning (semantics) of such terms is formally specified
  Can also specify relationships between terms in multiple ontologies
Horrocks & Rector, 2004

Meanwhile related developments
Object oriented programming
  Simula, Smalltalk, … Java
Object oriented design
  Entity relationship diagrams … UML
SGML, HTML, XML and the web
  Including RDF and Topic Maps
Horrocks & Rector, 2004

Knowledge Representation on the Web
knowledge is primarily enclosed in documents
  Web pages
use of HTML offers no clear separation of content and presentation (formatting)
HTML is very limited in its expressiveness
  fixed vocabulary

HTML + Meta-tags
intention was to use meta-tags to describe the contents of Web pages
  was quickly abused to increase the relevance rankings of pages
meta-tags are just labels on the documents
  no structure to the labels
  free (unlimited, uncontrolled) vocabulary

XML + Meta-tags
XML offers much better expressiveness
  XML itself is not a KR language
  offers facilities to define KR languages
much better separation of content and presentation
XML allows the definition of schemata
  customized naming and structure of tags
flexible transformations for variable presentations
  e.g. XSLT to create different versions of documents

Microformats
Background
Purpose
  limitations of current approaches
  human-readable and machine-readable
Usage
  basics
  customization
Limits of microformats
Outlook
Cook & Troughton, 2007

MicroFormats
Michael Cook
Caleb Troughton
5/3/2007

What are Microformats?
A standard set of HTML/XML semantics
A set of tag classes and patterns used for storing information in web pages
  Microformat tags are typically kept brief and descriptive
Open format
  Microformats are not controlled by any one company or individual
  Anyone can suggest new Microformats or request revisions
Cook & Troughton, 2007

More about Microformats
typically written in XHTML
  visible representation of the data
  semantic data hidden in the code
implicit interpretation
  similar to how “int” implies that the data element will be an integer
  example: “hCard/adr” implies that the data element will contain an address.
Cook & Troughton, 2007

What current systems exist?
no standard set of web markup techniques
  XML is standardized, but it’s a markup specification language
commonly used markup techniques do exist
  many Microformats are derived directly from methods already used
“Microformats are semantics with momentum, a codification of what everyone did anyway.”
  Derrick Pallas, Alexa Internet, Inc.
Cook & Troughton, 2007

Current Conventions
lower readability by humans
  information scattered around a page or several pages
  data may be formatted using sloppy markup or inconsistent patterns
lower readability by machines
  require scraping an entire page in search of patterns
  information less trustworthy
    unexpected information, such as contact information for a person you aren’t looking for
    may be categorized incorrectly
      805 Santa Barbara: is this a street address or an area code and city name?
Cook & Troughton, 2007

Human or Machine Use?
both: humans first, machines second
commonly used tags to embed information
  designed to be brief, descriptive
  easy for a human to interpret embedded data from code
adding semantics becomes second nature
  such as “blockquote”, Microformat class names
Cook & Troughton, 2007

Use of Microformats
conventions
  written in current standard markup languages
  an experienced programmer will easily interpret the Microformat syntax and utilize implications of the language
implementation
  building on current markup languages allows Microformats to be quickly and easily implemented on machines
Cook & Troughton, 2007

Who should use Microformats?
programmer
  web code with embedded information
  a user or search engine might extract it
typically information that will be represented or extracted often
  can contain anything from name and address, to business affiliation, to contact information
also used to aid search engines
  extracting information by supplementing Metadata
  deter crawlers from following links with a rel="nofollow" attribute.
Cook & Troughton, 2007

Benefits of Microformats 1
programmer can easily read raw markup language code
  naming conventions allow others to easily extract data from the raw language
  readily recognizable names of data members.
easily edited by other programmers
  the information is easily readable
  potentially abstract data member naming follows a standardized convention
  in the future someone may need to edit this information
Cook & Troughton, 2007

Benefits of Microformats 2
universal modules in code.
  creation of scripts
  Microformats naming conventions are standardized
easy to integrate this code into web pages
  modular development of pages
  development of library utilities
Cook & Troughton, 2007

Benefits of Microformats 3
unit conversion and currency conversion is trivial
  Web browser contains a user’s default preferences
  makes conversions on the fly
information extraction
  Web crawler searching for an address or email.
  accuracy is dramatically improved
    the programmer specified “this is the email” or “this is the address”
  crawler speed is much faster
    it no longer has to scan the entire page or rely on a regular expression to extract information
Cook & Troughton, 2007

Microformat Usage
easy, intuitive specifications
format corresponds to the data type you wish to represent in your code
  specifications can be found on the Microformats Wiki
  sometimes little difference between Microformats and code you might see on a web page today.
  benefits the readability by humans and accessibility of code by machines
Cook & Troughton, 2007

Code Differences
Commonplace vCard:
  BEGIN:VCARD
  VERSION:3.0
  N:Çelik;Tantek
  FN:Tantek Çelik
  URL:http://tantek.com
  ORG:Technorati
  END:VCARD
Microformat hCard:
  <div class="vcard">
    <a class="url fn" href="http://tantek.com/">Tantek Çelik</a>
    <div class="org">Technorati</div>
  </div>
Cook & Troughton, 2007
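
To make the extraction benefit concrete, here is a minimal sketch in Python that reads the hCard shown on this slide using only the standard library's html.parser module. The class name HCardParser and the decision to handle just the fn, org, and url properties are assumptions made for this illustration; it is not an official Microformats parser and ignores most of the hCard specification.

```python
from html.parser import HTMLParser

# The hCard markup from the slide.
HCARD = """<div class="vcard">
  <a class="url fn" href="http://tantek.com/">Tantek Çelik</a>
  <div class="org">Technorati</div>
</div>"""


class HCardParser(HTMLParser):
    """Hypothetical helper: collects a few hCard properties by class name."""

    TEXT_PROPERTIES = {"fn", "org"}  # properties whose value is the element text

    def __init__(self):
        super().__init__()
        self.card = {}
        # One entry per open element: the text properties that element carries.
        # Simplification: assumes every start tag has a matching end tag.
        self._open = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        # For the "url" property the value lives in the href attribute.
        if "url" in classes and attrs.get("href"):
            self.card["url"] = attrs["href"]
        self._open.append(classes & self.TEXT_PROPERTIES)

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Attribute the text to every enclosing element that names a property.
        for properties in self._open:
            for prop in properties:
                self.card[prop] = (self.card.get(prop, "") + " " + text).strip()


parser = HCardParser()
parser.feed(HCARD)
parser.close()
card = parser.card
print(card)
```

The point of the sketch is that the parser looks only at class names, so it needs no page-specific pattern or regular expression to find the name, organization, and URL.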

Customization
Identify a situation in which a Microformat would provide a solution
  no existing Microformats or XML markup addresses it
Propose the Microformat to the Microformats Wiki
Work with other contributors to develop a draft version of the format.
  without community involvement, the new format will not be adopted
Submit the final draft to the Wiki
  the community will accept the format as a standard if it becomes more and more common in practice
Cook & Troughton, 2007

Web Page Update
2 methods of converting existing web data to Microformatted data
By hand
  large amount of manual labor
  complicated when extracting fields from arbitrary structures
By machine
  applications that attempt to convert commonplace semantics such as vCard to the appropriate Microformat, hCard
  ineffective unless the original content follows commonplace trends
  can become error prone in interpreting existing field names
Cook & Troughton, 2007

Microformat Limitations
acceptance and use in practice
  only successful if widely used
existing vs. new content
  since conversion is tedious at best, advocacy of Microformats in new content is arguably more important than content conversion
confusion
  short class names can lead to possibly ambiguous titles
Cook & Troughton, 2007

Example
  <div class="vcard">
    <span class="fn">John Smith</span>,
    <div class="adr">
      <div class="street-address">1 Seaview Lane</div>,
      <span class="locality">Mousehole</span>,
      <span class="region">Cornwall</span>,
      <span class="country-name">UK</span>
    </div>
  </div>
query to Google Maps
  “1 Seaview Lane, Mousehole, Cornwall, UK”
  combination of the above
interpretation
  “locality” is “Mousehole”
  a county name? A province name?
  ambiguous information
Cook & Troughton, 2007
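
The "combination of the above" step can be sketched in a few lines of Python. The names AdrParser and maps_query are hypothetical, and the sketch assumes that each adr property sits in a leaf element containing only text, as in the example on this slide. It builds the query string only and does not call any mapping service.

```python
from html.parser import HTMLParser

# The adr sub-properties used on the slide, in the order they are combined.
ADR_FIELDS = ("street-address", "locality", "region", "country-name")

EXAMPLE = """<div class="vcard">
  <span class="fn">John Smith</span>,
  <div class="adr">
    <div class="street-address">1 Seaview Lane</div>,
    <span class="locality">Mousehole</span>,
    <span class="region">Cornwall</span>,
    <span class="country-name">UK</span>
  </div>
</div>"""


class AdrParser(HTMLParser):
    """Hypothetical helper: records the text of each adr sub-property."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # adr property of the element being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self._current = next((c for c in classes if c in ADR_FIELDS), None)

    def handle_endtag(self, tag):
        self._current = None

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()


def maps_query(html):
    """Join the adr fields that are present into one comma-separated query."""
    parser = AdrParser()
    parser.feed(html)
    parser.close()
    return ", ".join(parser.fields[f] for f in ADR_FIELDS if f in parser.fields)


query = maps_query(EXAMPLE)
print(query)
```

Note that the code can join the fields without knowing whether "Mousehole" is a town, a county, or a province, which is exactly the ambiguity the slide points out.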

Tools
Web browsers
  FireFox 3 has Microformat copy/paste support.
Web browser extensions
  “Microformats Extensions”, Operator
  recognize Microformatted code
  allow users to perform copy operations from within a web browser
Dreamweaver extensions
  easy implementation of Microformats in new web pages
calendar, address book, and email utilities
  ability to copy paste Microformatted data, preserving fields
Cook & Troughton, 2007

Future
Microformats will hopefully become a standard in Web development
Advanced search engines will use Microformats to directly extract information
Search engines will use Microformats to establish relations between data types and values
Cook & Troughton, 2007

How Can You Help?
being a member of the Microformats community helps the development of Microformats.
suggest a new Microformat for your favorite items
  most likely somebody else was faster
help influence proposed Microformats
  structure, usage, implementation
help translate Microformats into other languages
advocate the use of Microformats
Cook & Troughton, 2007

Sources
Digital Web Magazine. “Microformats Primer.” http://www.digital-web.com/articles/microformats_primer/.
  Describes and demonstrates why and how microformats are used. Argues that microformats will aid programmers intending to generate CSS code, insert Metadata, or implement plug and play Javascript.
Official Microformats Home Page. http://microformats.org/. Updated 4/17/2007.
  Provides up to date information on the implementation of microformats, via a Web Blog. An overview of the set of current supported microformats, and proposed new formats.
Official Microformats Wiki. http://microformats.org/wiki/Main_Page. Updated 4/17/2007.
  Allows anyone to contribute ideas to the official Microformat team. Provides detailed specification of existing elements, allows the public to submit new elements for consideration, and demonstrates examples of microformat use.
Wikipedia. “Microformats.” http://en.wikipedia.org/wiki/Microformats. Created 3/1/2007.
  Good overview.
XML.com. “Microformats.” http://www.xml.com/pub/a/2005/03/23/deviant.html. Updated 3/23/2007.
  General overview of intended uses for microformats. Demonstrates simple examples.
Cook & Troughton, 2007

Resource Description Framework (RDF)
grammar for encoding relationships
  RDF triples as basic building blocks
An RDF triple has three components:
  subject
  predicate (or verb)
  object
  each can be expressed as a resource on the Web (URI)
far less ambiguous than encoding data in random XML documents

What is the Purpose of RDF?
The purpose of RDF (Resource Description Framework) is to give a standard way of specifying data "about" something.
Here's an example of an XML document that specifies data about China's Yangtze river:
  <?xml version="1.0"?>
  <River id="Yangtze" xmlns="http://www.geodesy.org/river">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
  </River>
"Here is data about the Yangtze River. It has a length of 6300 kilometers. Its startingLocation is western China's Qinghai-Tibet Plateau. Its endingLocation is the East China Sea."

XML --> RDF
Modify the following XML document so that it is also a valid RDF document:
XML (Yangtze.xml):
  <?xml version="1.0"?>
  <River id="Yangtze" xmlns="http://www.geodesy.org/river">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
  </River>
"convert to"
RDF (Yangtze.rdf):
  <?xml version="1.0"?>
  <River rdf:ID="Yangtze"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.geodesy.org/river#">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
  </River>

The RDF Format
  <?xml version="1.0"?>
  <River rdf:ID="Yangtze"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.geodesy.org/river#">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
  </River>
1. RDF provides an ID attribute for identifying the resource being described.
2. The ID attribute is in the RDF namespace.
3. Add the "fragment identifier symbol" to the namespace.

The RDF Format (cont.)
  <?xml version="1.0"?>
  <River rdf:ID="Yangtze"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.geodesy.org/river#">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
  </River>
1. Identifies the type (class) of the resource being described.
2. Identifies the resource being described. This resource is an instance of River.
3. These are properties, or attributes, of the type (class).
4. Values of the properties

Namespace Convention
Best Practice
Question: Why was "#" placed onto the end of the namespace? E.g.,
  xmlns="http://www.geodesy.org/river#"
Answer: RDF is very concerned about uniquely identifying things - uniquely identifying the type (class) and uniquely identifying the properties.
If we concatenate the namespace with the type then we get a unique identifier for the type, e.g.,
  http://www.geodesy.org/river#River
If we concatenate the namespace with a property then we get a unique identifier for the property, e.g.,
  http://www.geodesy.org/river#length
  http://www.geodesy.org/river#startingLocation
  http://www.geodesy.org/river#endingLocation
Thus, the "#" symbol is simply a mechanism for separating the namespace from the type name and the property name.
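
The concatenation described on this slide is plain string joining, which the following small Python sketch illustrates. The function names qualified_name and split_identifier are made up for this example and are not part of any RDF library.

```python
# Namespace from the slide; note the trailing "#".
RIVER_NS = "http://www.geodesy.org/river#"


def qualified_name(namespace, local_name):
    """Concatenate a namespace and a type or property name into one identifier."""
    return namespace + local_name


def split_identifier(identifier):
    """Split an identifier at the first "#" back into (namespace, local name)."""
    namespace, separator, local_name = identifier.partition("#")
    return namespace + separator, local_name


river_type = qualified_name(RIVER_NS, "River")
length_property = qualified_name(RIVER_NS, "length")
print(river_type)
print(length_property)
print(split_identifier(length_property))
```

Without the trailing "#", the same concatenation would give http://www.geodesy.org/riverRiver, where the boundary between namespace and name can no longer be recovered.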

The RDF Format
  <?xml version="1.0"?>
  <Class rdf:ID="Resource"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="uri">
    <property>value</property>
    <property>value</property>
    ...
  </Class>

Advantage of using the RDF Format
You may ask: "Why should I bother designing my XML to be in the RDF format?"
Answer: there are numerous benefits:
The RDF format, if widely used, will help to make XML more interoperable:
  Tools can instantly characterize the structure, "this element is a type (class), and here are its properties".
  RDF promotes the use of standardized vocabularies ... standardized types (classes) and standardized properties.
The RDF format gives you a structured approach to designing your XML documents. The RDF format is a regular, recurring pattern.
It enables you to quickly identify weaknesses and inconsistencies of non-RDF-compliant XML designs. It helps you to better understand your data!
You reap the benefits of both worlds:
  You can use standard XML editors and validators to create, edit, and validate your XML.
  You can use the RDF tools to apply inferencing to the data.
It positions your data for the Semantic Web!
Margin labels: Interoperability, Network effect

Disadvantage of using the RDF Format
Constrained: the RDF format constrains you on how you design your XML (i.e., you can't design your XML in any arbitrary fashion).
RDF uses namespaces to uniquely identify types (classes), properties, and resources. Thus, you must have a solid understanding of namespaces.
Another XML vocabulary to learn: to use the RDF format you must learn the RDF vocabulary.

Uniquely Identify the Resource
Earlier we said that RDF is very concerned about uniquely identifying the type (class) and the properties. RDF is also very concerned about uniquely identifying the resource, e.g.,
  <?xml version="1.0"?>
  <River rdf:ID="Yangtze"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.geodesy.org/river#">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
  </River>
The element with rdf:ID="Yangtze" is the resource being described. We want to uniquely identify this resource.

Triple -> resource/property/value
http://www.china.org/geography/rivers#Yangtze (resource) has a http://www.geodesy.org/river#length (property) of 6300 kilometers (value)
http://www.china.org/geography/rivers#Yangtze (resource) has a http://www.geodesy.org/river#startingLocation (property) of western China's ... (value)
http://www.china.org/geography/rivers#Yangtze (resource) has a http://www.geodesy.org/river#endingLocation (property) of East China Sea (value)
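
The triples listed on this slide can be derived mechanically from the Yangtze RDF document. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, not a full RDF/XML parser. It assumes the document's base URI is http://www.china.org/geography/rivers (which is how the rdf:ID value becomes the resource identifier shown above), handles only literal property values, and omits the type triple. The function names expand and flat_triples are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

YANGTZE_RDF = """<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>"""


def expand(tag):
    """Turn ElementTree's '{namespace}local' tag form into namespace + local."""
    namespace, _, local_name = tag[1:].rpartition("}")
    return namespace + local_name


def flat_triples(rdf_xml, base_uri):
    """Return (resource, property, value) triples for one flat description."""
    root = ET.fromstring(rdf_xml)
    # Assumption: rdf:ID is resolved against the supplied base URI.
    resource = base_uri + "#" + root.attrib["{%s}ID" % RDF_NS]
    return [(resource, expand(child.tag), (child.text or "").strip())
            for child in root]


triples = flat_triples(YANGTZE_RDF, "http://www.china.org/geography/rivers")
for triple in triples:
    print(triple)
```

Each child element of River yields exactly one triple, which is the regularity that makes the RDF format easy for tools to process.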

The RDF Format = triples!
XML data are structured as resource/property/value triples
value of a property can be a literal
  length has a value of 6300 kilometers
value of a property can be a resource
  property-A has a value of Resource-B
  property-B has a value of Resource-C
the RDF design pattern is an alternating sequence of resource-property pairs
  known as "striping"

“Striped” RDF Triples
  <?xml version="1.0"?>
  <Resource-A>
    <property-A>
      <Resource-B>
        <property-B>
          <Resource-C>
            <property-C>Value-C</property-C>
          </Resource-C>
        </property-B>
      </Resource-B>
    </property-A>
  </Resource-A>
Resource-B is the value of property-A; Resource-C is the value of property-B; Value-C is the value of property-C
Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.
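
Because the levels strictly alternate between resources and properties, a short recursive walk is enough to read the triples out of a striped document. The sketch below uses Python's standard xml.etree.ElementTree module. As a simplification for this illustration, it uses the element names themselves (Resource-A, property-A, ...) as identifiers, whereas real RDF would identify resources through rdf:ID or rdf:about. The function name striped_triples is hypothetical.

```python
import xml.etree.ElementTree as ET

STRIPED = """<?xml version="1.0"?>
<Resource-A>
  <property-A>
    <Resource-B>
      <property-B>
        <Resource-C>
          <property-C>Value-C</property-C>
        </Resource-C>
      </property-B>
    </Resource-B>
  </property-A>
</Resource-A>"""


def striped_triples(resource):
    """Walk alternating resource/property levels and collect triples."""
    triples = []
    for prop in resource:          # children of a resource are properties
        nested = list(prop)        # children of a property are resources
        if nested:
            for value_resource in nested:
                triples.append((resource.tag, prop.tag, value_resource.tag))
                triples.extend(striped_triples(value_resource))
        else:
            # No nested element: the property value is a literal.
            triples.append((resource.tag, prop.tag, (prop.text or "").strip()))
    return triples


triples = striped_triples(ET.fromstring(STRIPED))
for triple in triples:
    print(triple)
```

The walk never needs to guess whether an element is a resource or a property; its depth in the stripe decides it.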

KR on the Web Requirements
high expressiveness
  difficult to predict the knowledge that will be represented
open, distributed repository
  new representation mechanisms may be introduced
  knowledge may be distributed across multiple sites
syntactic interoperability
  easy access to knowledge in repositories
  facilitated via APIs and libraries
semantic interoperability
  interpretation of knowledge is compatible across repositories

XML as KR on the Web
expressiveness
  anything for which a grammar can be defined can be encoded in XML
syntactic interoperability
  an XML parser can parse any XML data
  it is usually a reusable component
semantic interoperability
  major limitation: it just describes grammars
  no way to recognize a semantic unit from a particular domain
    because XML aims at document structure
  no common interpretation of the document content

RDF as KR on the Web
expressiveness
  nested object-attribute-value structure satisfies the universal expressive power requirement
syntactic interoperability
  application-independent RDF parsers are available
semantic interoperability
  object-attribute structure provides natural semantic units
    all objects are independent entities
  a domain model (defining objects and relationships) can be represented naturally in RDF
  translation steps are not necessary as they are with XML

Ontology: Origins and History
Ontology in Philosophy
  a philosophical discipline: a branch of philosophy that deals with the nature and the organisation of reality
Science of Being (Aristotle, Metaphysics, IV, 1)
Tries to answer the questions:
  What characterizes being?
  Eventually, what is being?
  How should things be classified?
Horrocks & Rector, 2004

Ontology in Linguistics
[Figure: semiotic triangle [Ogden, Richards, 1923]. The form “Tank” activates a concept, the concept relates to a referent, and the form stands for the referent.]
Horrocks & Rector, 2004

Classification: An Old Problem
“On those remote pages it is written that animals are divided into:
  a. those that belong to the Emperor
  b. embalmed ones
  c. those that are trained
  d. suckling pigs
  e. mermaids
  f. fabulous ones
  g. stray dogs
  h. those that are included in this classification
  i. those that tremble as if they were mad
  j. innumerable ones
  k. those drawn with a very fine camel's hair brush
  l. others
  m. those that have just broken a flower vase
  n. those that resemble flies from a distance”
From The Celestial Emporium of Benevolent Knowledge, Borges
Horrocks & Rector, 2004

Ontology in Computer Science
An ontology is an engineering artifact:
  It is constituted by a specific vocabulary used to describe a certain reality, plus
  a set of explicit assumptions regarding the intended meaning of the vocabulary.
    Almost always including how concepts should be classified
An ontology describes a formal specification of a certain domain
  Shared understanding of a domain of interest
  Formal and machine manipulable model of a domain of interest
  explicit specification of a conceptualisation [Gruber93]
Horrocks & Rector, 2004

Ontology Classified Logically
Horrocks & Rector, 2004

Where else are ontologies used?
Bioinformatics
  The Gene Ontology
  The Protein Ontology (MGED)
Medicine
  “The terminology wars”
Linguistics
Database integration
User interface design
Fractal Indexing
Horrocks & Rector, 2004

Ontologies as Conceptual Lego
“Manchester Postgraduate Student taking CS626”
“Hand which is anatomically normal”
Horrocks & Rector, 2004

User Interfaces using conceptual Lego
[Figure: structured data entry form for FRACTURE SURGERY. Selecting Femur, Left, Neck, Fixation, and Open composes the description “Fixation of open fracture of neck of left femur”.]
Horrocks & Rector, 2004
Semantic Web Challenge
[AKT 2003]
Horrocks & Rector, 2004
So why is it hard?
Ontology languages are tricky
  “All tractable languages are useless; all useful languages are intractable”
Ontologies are tricky
  People do it too easily; people are not logicians
  Intuitions hard to formalise
The evidence
  The problem has been about for 3000 years
But now it matters!
  The semantic web means knowledge representation matters
The goal
  Make it easier
Horrocks & Rector, 2004
Structure of an Ontology
Ontologies typically have two distinct components:
  Names for important concepts in the domain
  Background knowledge/constraints on the domain
Horrocks & Rector, 2004
Concept Names
Elephant: a concept whose members are a kind of animal
Herbivore: a concept whose members are exactly those animals who eat only plants or parts of plants
Adult_Elephant: a concept whose members are exactly those elephants whose age is greater than 20 years
Horrocks & Rector, 2004
Domain Knowledge
Adult_Elephants weigh at least 2,000 kg
All Elephants are either African_Elephants or Indian_Elephants
No individual can be both a Herbivore and a Carnivore
Horrocks & Rector, 2004
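The split between concept names and domain knowledge can be made concrete with a small sketch. The record layout (`classes`, `age`, `weight_kg`) and the two sample individuals are assumptions made for illustration; they are not part of the original slides. Note also that this is a closed-world data check: an OWL reasoner works under the open-world assumption and would treat a missing African/Indian label as unknown rather than as an error.

```python
# Minimal sketch: concept definitions and domain constraints as plain Python checks.
# The dictionary layout and the sample individuals are hypothetical.

def is_adult_elephant(ind):
    # Definition from the slides: exactly those elephants older than 20 years.
    return "Elephant" in ind["classes"] and ind["age"] > 20


def violated_constraints(ind):
    """Return descriptions of the domain constraints that `ind` violates."""
    violated = []
    classes = ind["classes"]
    if is_adult_elephant(ind) and ind["weight_kg"] < 2000:
        violated.append("Adult_Elephants weigh at least 2,000 kg")
    if "Elephant" in classes and not classes & {"African_Elephant", "Indian_Elephant"}:
        violated.append("All Elephants are African_Elephants or Indian_Elephants")
    if {"Herbivore", "Carnivore"} <= classes:
        violated.append("No individual is both a Herbivore and a Carnivore")
    return violated


clyde = {"classes": {"Elephant", "African_Elephant", "Herbivore"},
         "age": 25, "weight_kg": 3000}
odd_one = {"classes": {"Elephant", "Herbivore", "Carnivore"},
           "age": 30, "weight_kg": 900}

print(violated_constraints(clyde))
print(len(violated_constraints(odd_one)))
```

The first individual satisfies all three constraints; the second violates each of them.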
Tools and Services
We need to provide tools and services to:
  Design and maintain high quality ontologies, e.g.:
    Meaningful: all named classes can have instances
    Correct: captured intuitions of domain experts
    Minimally redundant: no unintended synonyms
    Richly axiomatised: (sufficiently) detailed descriptions
  Store (large numbers) of instances of ontology classes
    Annotations from web pages
  Answer queries over ontology classes and instances, e.g.:
    Find more general/specific classes
    Retrieve annotations/pages matching a given description
  Integrate and align multiple ontologies
Horrocks & Rector, 2004
OWL as (Description) Logic
XMLS datatypes as well as classes in $\forall P.C$ and $\exists P.C$
  E.g., $\exists \mathit{hasAge}.\mathit{nonNegativeInteger}$
Arbitrarily complex nesting of constructors
  E.g., $\mathit{Person} \sqcap \forall \mathit{hasChild}.(\mathit{Doctor} \sqcup \exists \mathit{hasChild}.\mathit{Doctor})$
Horrocks & Rector, 2004
Ontologies as DL Knowledge Bases
OWL ontology maps to a DL knowledge base $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$
$\mathcal{T}$ (Tbox) is a set of axioms of the form:
  $C \sqsubseteq D$, $C \equiv D$ (concept inclusion/equivalence)
  $R \sqsubseteq S$, $R \equiv S$ (role inclusion/equivalence)
  $R^{+} \sqsubseteq R$ (role transitivity)
$\mathcal{A}$ (Abox) is a set of axioms of the form:
  $x \in D$ (concept instantiation)
  $\langle x, y \rangle \in R$ (role instantiation)
Two sorts of Tbox axioms are often distinguished:
  “Definitions”: $C \sqsubseteq D$ or $C \equiv D$ where $C$ is a concept name
  General Concept Inclusion axioms (GCIs): $C \sqsubseteq D$ where $C$ is an arbitrary concept
Horrocks & Rector, 2004
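Knowledge Base Semantics
An interpretation $\mathcal{I}$ satisfies (models) an axiom $A$ ($\mathcal{I} \models A$):
  $\mathcal{I} \models C \sqsubseteq D$ iff $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$
  $\mathcal{I} \models C \equiv D$ iff $C^{\mathcal{I}} = D^{\mathcal{I}}$
  $\mathcal{I} \models R \sqsubseteq S$ iff $R^{\mathcal{I}} \subseteq S^{\mathcal{I}}$
  $\mathcal{I} \models R \equiv S$ iff $R^{\mathcal{I}} = S^{\mathcal{I}}$
  $\mathcal{I} \models R^{+} \sqsubseteq R$ iff $(R^{\mathcal{I}})^{+} \subseteq R^{\mathcal{I}}$
  $\mathcal{I} \models x \in D$ iff $x^{\mathcal{I}} \in D^{\mathcal{I}}$
  $\mathcal{I} \models \langle x, y \rangle \in R$ iff $(x^{\mathcal{I}}, y^{\mathcal{I}}) \in R^{\mathcal{I}}$
$\mathcal{I}$ satisfies a Tbox $\mathcal{T}$ ($\mathcal{I} \models \mathcal{T}$) iff $\mathcal{I}$ satisfies every axiom $A$ in $\mathcal{T}$
$\mathcal{I}$ satisfies an Abox $\mathcal{A}$ ($\mathcal{I} \models \mathcal{A}$) iff $\mathcal{I}$ satisfies every axiom $A$ in $\mathcal{A}$
$\mathcal{I}$ satisfies a KB $\mathcal{K}$ ($\mathcal{I} \models \mathcal{K}$) iff $\mathcal{I}$ satisfies both $\mathcal{T}$ and $\mathcal{A}$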
Horrocks & Rector, 2004
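The satisfaction conditions above can be checked mechanically for a finite interpretation. The following is a minimal sketch, restricted to named concepts and roles; the tuple encoding of axioms (`"sub"`, `"equiv"`, `"trans"`, `"inst"`, `"rel"`) and the sample interpretation are assumptions made for illustration and do not correspond to any OWL tool's API.

```python
# Minimal sketch: does a finite interpretation satisfy Tbox and Abox axioms?
# Only named concepts and roles are handled; the axiom encoding is hypothetical.

def transitive_closure(pairs):
    """Smallest transitive relation containing `pairs`."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def satisfies(interp, axiom):
    kind, left, right = axiom
    ext = interp["ext"]  # name -> extension (set of elements, or set of pairs for roles)
    ind = interp["ind"]  # individual name -> domain element
    if kind == "sub":    # C subsumed by D, or R subsumed by S: extension inclusion
        return ext[left] <= ext[right]
    if kind == "equiv":  # equal extensions
        return ext[left] == ext[right]
    if kind == "trans":  # role transitivity: closure of R stays inside R
        return transitive_closure(ext[left]) <= ext[left]
    if kind == "inst":   # concept instantiation: x in D
        return ind[left] in ext[right]
    if kind == "rel":    # role instantiation: left is the pair (x, y)
        x, y = left
        return (ind[x], ind[y]) in ext[right]
    raise ValueError(f"unknown axiom kind: {kind}")


def satisfies_kb(interp, tbox, abox):
    # An interpretation models a KB iff it satisfies every Tbox and Abox axiom.
    return all(satisfies(interp, a) for a in tbox + abox)


interp = {
    "ext": {"Elephant": {"e1"}, "Animal": {"e1", "a1"},
            "partOf": {("t", "h"), ("h", "e1")}},
    "ind": {"clyde": "e1"},
}
tbox = [("sub", "Elephant", "Animal")]
abox = [("inst", "clyde", "Elephant")]
print(satisfies_kb(interp, tbox, abox))
print(satisfies(interp, ("trans", "partOf", None)))
```

The sample interpretation models the small knowledge base but does not make `partOf` transitive, because the pair `("t", "e1")` is missing from its extension.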
Services as Reasoning (1)
Knowledge is meaningful (classes can have instances)
  $C$ is satisfiable w.r.t. $\mathcal{K}$ iff there exists some model $\mathcal{I}$ of $\mathcal{K}$ s.t. $C^{\mathcal{I}} \neq \emptyset$
Knowledge is correct (captures intuitions)
  $C$ is subsumed by $D$ w.r.t. $\mathcal{K}$ iff for every model $\mathcal{I}$ of $\mathcal{K}$, $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$
Knowledge is minimally redundant (no unintended synonyms)
  $C$ is equivalent to $D$ w.r.t. $\mathcal{K}$ iff for every model $\mathcal{I}$ of $\mathcal{K}$, $C^{\mathcal{I}} = D^{\mathcal{I}}$
Horrocks & Rector, 2004
Services as Reasoning (2)
Querying knowledge
  $x$ is an instance of $C$ w.r.t. $\mathcal{K}$ iff for every model $\mathcal{I}$ of $\mathcal{K}$, $x^{\mathcal{I}} \in C^{\mathcal{I}}$
  $\langle x, y \rangle$ is an instance of $R$ w.r.t. $\mathcal{K}$ iff for every model $\mathcal{I}$ of $\mathcal{K}$, $(x^{\mathcal{I}}, y^{\mathcal{I}}) \in R^{\mathcal{I}}$
All above problems reducible to Knowledge Base consistency
  A KB $\mathcal{K}$ is consistent iff there exists some model $\mathcal{I}$ of $\mathcal{K}$
KB consistency reducible to concept consistency
Horrocks & Rector, 2004
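The reductions to KB consistency are commonly stated as follows. This is a sketch of the standard textbook formulation, not taken from the slides; it assumes the logic provides full negation, and $a$ is a fresh individual name that does not occur in $\mathcal{K}$. The role-instance case needs a slightly more involved encoding and is omitted here.

```latex
% a is a fresh individual name not occurring in K (assumption of this sketch)
\begin{align*}
C \text{ is satisfiable w.r.t. } \mathcal{K}
  &\iff \mathcal{K} \cup \{ a \in C \} \text{ is consistent} \\
C \sqsubseteq D \text{ w.r.t. } \mathcal{K}
  &\iff \mathcal{K} \cup \{ a \in C \sqcap \neg D \} \text{ is inconsistent} \\
C \equiv D \text{ w.r.t. } \mathcal{K}
  &\iff C \sqsubseteq D \text{ and } D \sqsubseteq C \text{ w.r.t. } \mathcal{K} \\
x \text{ is an instance of } C \text{ w.r.t. } \mathcal{K}
  &\iff \mathcal{K} \cup \{ x \in \neg C \} \text{ is inconsistent}
\end{align*}
```

Each entailment question is answered by adding an assertion that would contradict it and checking whether any model survives.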
Results for Margherita Pizza
What it means: all Margherita_pizzas (amongst other things)
  are Pizzas
  have_topping some Tomato_topping
  have_topping some Mozzarella_topping
  and, because they are Pizzas, have_base some Pizza_base
someValuesFrom restrictions
Properties subpane showing alternative ‘frame’ view
Horrocks & Rector, 2004
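In description logic notation, the statements on this slide could be written roughly as below. This is a sketch that assumes the class and property names shown on the slide; an OWL someValuesFrom restriction corresponds to an existential restriction $\exists P.C$.

```latex
\begin{align*}
\mathit{Margherita\_pizza} &\sqsubseteq \mathit{Pizza} \\
\mathit{Margherita\_pizza} &\sqsubseteq \exists \mathit{have\_topping}.\mathit{Tomato\_topping} \\
\mathit{Margherita\_pizza} &\sqsubseteq \exists \mathit{have\_topping}.\mathit{Mozzarella\_topping} \\
\mathit{Pizza} &\sqsubseteq \exists \mathit{have\_base}.\mathit{Pizza\_base} \\
\text{hence}\quad \mathit{Margherita\_pizza} &\sqsubseteq \exists \mathit{have\_base}.\mathit{Pizza\_base}
\end{align*}
```

The last line is inferred rather than asserted: it follows from the first and fourth axioms because subsumption is transitive.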
What it Means
[Figure: diagram of the classes Pizzas, Margherita_pizzas, Pizza_base, Pizza_toppings, Mozzarella_Toppings, and Tomato_toppings, each drawn with its individuals (aMP1, aMP2, … aMPi; aPB1, aPB2, … aPBj; aMZ1, aMZ2, aMZ3, aMZ4, …; aT1, aT2, aT3, aT4, … aTk).]
Horrocks & Rector, 2004
DL Reasoning (1)
Tableau algorithms used to test satisfiability (consistency)
Try to build a tree-like model $\mathcal{I}$ of the input concept $C$
Decompose $C$ syntactically
  Apply tableau expansion rules
  Infer constraints on elements of model
Tableau rules correspond to constructors in logic ($\sqcap$, $\sqcup$ etc)
  Some rules are nondeterministic (e.g., $\sqcup$, $\leq$)
  In practice, this means search
Horrocks & Rector, 2004
DL Reasoning (2)
Stop when no more rules applicable or clash occurs
  Clash is an obvious contradiction, e.g., $A(x)$, $\neg A(x)$
Cycle check (blocking) may be needed for termination
$C$ satisfiable iff rules can be applied such that a fully expanded clash free tree is constructed
Horrocks & Rector, 2004
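The tableau procedure can be illustrated with a small satisfiability test. This is a minimal sketch for the basic logic ALC with an empty Tbox, assuming the input is already in negation normal form; the tuple encoding of concepts is an assumption made for illustration. With an empty Tbox no blocking is needed, because each successor node contains strictly shallower concepts. Number restrictions and the optimisations of real reasoners are not covered.

```python
# Minimal sketch of a tableau satisfiability test for ALC concepts (empty Tbox).
# Concepts are in negation normal form and encoded as (hypothetical encoding):
#   "A"               atomic concept
#   ("not", "A")      negated atomic concept
#   ("and", C, D)     conjunction      ("or", C, D)    disjunction
#   ("some", r, C)    exists r.C       ("all", r, C)   forall r.C

def satisfiable(label):
    """True iff the set of concepts `label` can hold at a single node."""
    label = frozenset(label)
    # Clash: an atomic concept together with its negation.
    for c in label:
        if isinstance(c, str) and ("not", c) in label:
            return False
    # and-rule (deterministic): add both conjuncts to the node.
    for c in label:
        if isinstance(c, tuple) and c[0] == "and" and not {c[1], c[2]} <= label:
            return satisfiable(label | {c[1], c[2]})
    # or-rule (nondeterministic): try each disjunct; this is where search happens.
    for c in label:
        if (isinstance(c, tuple) and c[0] == "or"
                and c[1] not in label and c[2] not in label):
            return satisfiable(label | {c[1]}) or satisfiable(label | {c[2]})
    # exists-rule plus all-rule: each "some" needs a successor node that also
    # satisfies every universal restriction on the same role.
    for c in label:
        if isinstance(c, tuple) and c[0] == "some":
            successor = {c[2]} | {d[2] for d in label
                                  if isinstance(d, tuple) and d[0] == "all" and d[1] == c[1]}
            if not satisfiable(successor):
                return False
    return True  # fully expanded and clash free


nested = ("and", "Person",
          ("all", "hasChild", ("or", "Doctor", ("some", "hasChild", "Doctor"))))
contradiction = ("and", ("some", "r", "A"), ("all", "r", ("not", "A")))
print(satisfiable({nested}))
print(satisfiable({contradiction}))
```

Subsumption can be tested with the same function: a concept C is subsumed by D exactly when the conjunction of C and the negation of D is unsatisfiable.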
Highly Optimised Implementation
Naive implementation leads to effective non-termination
Modern systems include MANY optimisations
Examples:
  classification
  subsumption
Horrocks & Rector, 2004
Optimised classification
compute partial ordering
use enhanced traversal
exploit information from previous tests
use structural information to select classification order
Horrocks & Rector, 2004
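The idea of pruning subsumption tests during classification can be sketched as follows. This is a simplified illustration and not the algorithm of any particular reasoner: concepts are modelled as hypothetical sets of features (a conjunction of atomic names), so the subsumption test is a plain subset check, and concepts are inserted from general to specific so that only a top-down search is needed.

```python
# Minimal sketch of top-down classification with pruning.
# Each concept is a conjunction of atomic features (hypothetical toy model),
# so C subsumes D iff features[C] is a subset of features[D].

features = {
    "Animal": {"animal"},
    "Elephant": {"animal", "elephant"},
    "Herbivore": {"animal", "plant_eater"},
    "Carnivore": {"animal", "meat_eater"},
    "Adult_Elephant": {"animal", "elephant", "adult"},
    "African_Elephant": {"animal", "elephant", "african"},
    "Lion": {"animal", "meat_eater", "lion"},
}
tested = set()  # record of (subsumer candidate, new concept) tests performed


def subsumes(general, specific):
    tested.add((general, specific))
    return features[general] <= features[specific]


def classify(names):
    """Return parents[c]: the most specific already-classified subsumers of c."""
    children = {"TOP": set()}
    parents = {}
    # Structural ordering: fewer features means more general, so classify those first.
    for c in sorted(names, key=lambda n: len(features[n])):
        frontier, most_specific, seen = ["TOP"], set(), set()
        while frontier:
            node = frontier.pop()
            if node in seen:
                continue
            seen.add(node)
            # Descend only into children that subsume c; other subtrees are skipped.
            below = [k for k in children[node] if subsumes(k, c)]
            if below:
                frontier.extend(below)
            else:
                most_specific.add(node)
        parents[c] = most_specific
        children[c] = set()
        for p in most_specific:
            children[p].add(c)
    return parents


parents = classify(features)
print(parents["Adult_Elephant"], parents["Lion"])
print(("Adult_Elephant", "Lion") in tested)
```

Because Elephant does not subsume Lion, nothing below Elephant is ever compared with Lion; the earlier test result is enough to rule out the whole subtree.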
Optimised subsumption testing
search for models
normalisation and simplification of concepts
absorption (rewriting) of general axioms
Davis-Putnam style semantic branching search
dependency directed backtracking
caching of satisfiability results and (partial) models
heuristic ordering of propositional and modal expansion
…
Reference [Dieng et al. 1999]
Reference [Sommerville 01]
References
[Gil 2000] Yolanda Gil, Knowledge Mobility. Dagstuhl Workshop “Semantics for the Web”, March 2000.
[NEEDS] National Engineering Digital Library, www.needs.org
[Russell & Norvig 1995] Stuart Russell and Peter Norvig, Artificial Intelligence - A Modern Approach. Prentice Hall, 1995.
Important Concepts and Terms
agent, automated reasoning, belief network, cognitive science, computer science, hidden Markov model, intelligence, knowledge representation, linguistics, Lisp, logic, machine learning, microworlds, natural language processing, neural network, predicate logic, propositional logic, rational agent, rationality, Turing test
Our goal, by the end of the course…
You should be able to:
  Understand the similarities and differences amongst the related methodologies
  Understand the logical foundations
  Have the vocabulary and basic skills to know when and how to use modern ontology tools… and when not to!
Resources
Presentation Ian Horrocks and Alan Rector: The Semantic Web: Ontologies and OWL CS64, University of Manchester, Manchester, UK
Presentation James Hendler: