Post on 21-Dec-2015


Internet Technologies 1

RDF

These slides were built using modified examples from “XML How To Program” by Deitel, Deitel, Nieto, Lin and Sadhu.

The slides also make use of material from “XML Bible” by Elliotte Rusty Harold.

In what follows, a “resource” might be a web page, an element in a web page, a device, a person and more.

Internet Technologies 2

RDF

• The Resource Description Framework (RDF) is a W3C recommendation for an XML encoding of metadata.

• A standard for encoding metadata is important for finding and describing resources.

• Card catalogs (with wooden drawers and index cards) have been used for years to record metadata about the collection of materials in the library.

An RDF document or element makes statements about resources.

A statement can be thought of as an ordered triple composed of three items:

(resource, property-type, property-value)

Internet Technologies 3

RDF

(resource, property-type, property-value)

It is required that each resource have a URI.

http://www.andrew.cmu.edu
http://www.andrew.cmu.edu/~mm6/my.xml#root().child(1)
mailto:[email protected]
urn:isbn:0764532367

A property is a specific characteristic, attribute or relationship of a resource. Each property has a specific meaning that can be identified by the property’s name and the associated schema. The schema must actually be pointed to by the property’s namespace.

The schema describes the values or value ranges that are permitted for the property.

Internet Technologies 4

<?xml version = "1.0"?>

<!-- Fig. 22.3 : simple.rdf -->
<!-- Simple usage of RDF -->

<rdf:RDF xmlns:rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc = "http://purl.org/dc/elements/1.1/">

   <rdf:Description about="http://www.deitel.com">
      <dc:Title>Deitel and Associates, Inc.</dc:Title>
      <dc:Description>
         This is the home page of Deitel and Associates, Inc.
      </dc:Description>
      <dc:Date>2000-5-24</dc:Date>
      <dc:Format>text/html</dc:Format>
      <dc:Language>en</dc:Language>
      <dc:Creator>Deitel and Associates</dc:Creator>
   </rdf:Description>

</rdf:RDF>

The root element of an RDF document is RDF.

Each property of the resource being described is a child element of the Description element.

The content of the child is the value of the property.

Namespaces are used to distinguish between RDF elements and elements in property types and values.

Describing a web site

Internet Technologies 5

The Dublin Core

A collection of elements designed to provide a structure similar to that provided by a card catalog. For example, the following are elements defined in the Dublin Core namespace:

TITLE The name given to the resource
CREATOR The person or organization that created …
SUBJECT The topic of the resource…
DESCRIPTION…
:
:

In the near future, this Dublin Core “schema” is likely to be encoded with a formal syntax. Perhaps this syntax will be RDF.

Internet Technologies 6


A single RDF element can contain any number of Description elements.

A Description element can state more than one property about a resource.

Some properties may be resource valued. For example, suppose Deitel and Associates has an email address…

Internet Technologies 7

<?xml version = "1.0"?>

<!-- Fig. 22.3 : simple.rdf -->
<!-- Simple usage of RDF -->

<rdf:RDF xmlns:rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc = "http://purl.org/dc/elements/1.1/">

   <rdf:Description about="http://www.deitel.com">
      <dc:Title>Deitel and Associates, Inc.</dc:Title>
      <dc:Description>
         This is the home page of Deitel and Associates, Inc.
      </dc:Description>
      <dc:Date>2000-5-24</dc:Date>
      <dc:Format>text/html</dc:Format>
      <dc:Language>en</dc:Language>
      <dc:Creator>
         <rdf:Description about = "mailto:[email protected]">
            <dc:Title>Deitel and Associates</dc:Title>
         </rdf:Description>
      </dc:Creator>
   </rdf:Description>

</rdf:RDF>

The Creator becomes a resource rather than a literal. This is a resource-valued property.

Another way to say the same thing is with a resource attribute…

Internet Technologies 8

<?xml version = "1.0"?>

<!-- Fig. 22.3 : simple.rdf -->
<!-- Simple usage of RDF -->

<rdf:RDF xmlns:rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc = "http://purl.org/dc/elements/1.1/">

   <rdf:Description about="http://www.deitel.com">
      <dc:Title>Deitel and Associates, Inc.</dc:Title>
      <dc:Description>
         This is the home page of Deitel and Associates, Inc.
      </dc:Description>
      <dc:Date>2000-5-24</dc:Date>
      <dc:Format>text/html</dc:Format>
      <dc:Language>en</dc:Language>
      <dc:Creator rdf:resource = "mailto:[email protected]" />
   </rdf:Description>

   <rdf:Description about = "mailto:[email protected]">
      <dc:Title>Deitel and Associates</dc:Title>
   </rdf:Description>

</rdf:RDF>

Internet Technologies 9

RDF Containers

An RDF element may describe a resource with multiple properties of the same type. Perhaps a book has several authors or a web page may be found at several sites.

RDF defines three types of container objects:

Bag – a group of unordered properties – use li.
Seq – a sequence (ordered list) of properties
Alt – a list of alternative properties from which to choose a single one
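The Deitel example in these slides uses Bag and Seq but not Alt, so here is a minimal sketch of an Alt container for the "web page found at several sites" case. The example.org and example.net URIs and the ex:DistributionSite property are hypothetical, and the RDF attributes are written namespace-qualified (rdf:about, rdf:resource), as later revisions of the RDF/XML syntax expect.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the URIs and ex:DistributionSite are hypothetical -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">

   <rdf:Description rdf:about="http://example.org/packages/toolkit">
      <ex:DistributionSite>
         <!-- Alt: a consumer is expected to pick one of the members -->
         <rdf:Alt>
            <rdf:li rdf:resource="http://mirror1.example.org/toolkit"/>
            <rdf:li rdf:resource="http://mirror2.example.net/toolkit"/>
         </rdf:Alt>
      </ex:DistributionSite>
   </rdf:Description>

</rdf:RDF>
```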

Let’s look at a more involved example from Deitel and Deitel…

Internet Technologies 10

<?xml version = "1.0"?>

<!-- Fig. 22.5 : links.rdf -->
<!-- Describing entire Web site -->

<rdf:RDF xmlns:rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc = "http://purl.org/dc/elements/1.1/">

   <rdf:Description about = "http://www.deitel.com">
      <dc:Title>Home page of Deitel products</dc:Title>
      <dc:Creator>Deitel and Associates, Inc.</dc:Creator>
      <dc:Subject>
         <rdf:Bag ID = "links_1">
            <rdf:li resource = "http://www.deitel.com/books/index.htm"/>
            <rdf:li resource = "http://www.deitel.com/services/training/index.htm"/>
         </rdf:Bag>

Statements can be made about a container as a whole and so we give the container an ID.

Internet Technologies 11

         <rdf:Bag ID = "links_2">
            <rdf:li resource = "http://www.deitel.com/announcements/contractors.htm"/>
            <rdf:li resource = "http://www.deitel.com/announcements/internships.htm"/>
         </rdf:Bag>

         <rdf:Seq ID = "links_3">
            <rdf:li resource = "http://www.deitel.com/intro.htm"/>
            <rdf:li resource = "http://www.deitel.com/directions.htm"/>
         </rdf:Seq>
      </dc:Subject>
   </rdf:Description>

   <!-- description of the common feature of the Bag links_1 -->
   <rdf:Description aboutEach = "#links_1">
      <dc:Description>About our Products</dc:Description>
   </rdf:Description>

The aboutEach attribute applies to each element in the container.

Internet Technologies 12

   <rdf:Description aboutEach = "#links_2">
      <dc:Description>
         Announcements, Opportunities and internships at Deitel Associates
      </dc:Description>
   </rdf:Description>

   <rdf:Description aboutEach = "#links_3">
      <dc:Description>All about us</dc:Description>
   </rdf:Description>

   <!-- further description of each link -->
   <rdf:Description about = "http://www.deitel.com/books/index.htm">
      <!-- description of page title -->
      <dc:Title>
         Books, Multimedia Cyber Classrooms and Complete Training Courses
      </dc:Title>
   </rdf:Description>

   <rdf:Description about = "http://www.deitel.com/services/training/index.htm">
      <dc:Title>Corporate Training Courses</dc:Title>
   </rdf:Description>

Internet Technologies 13

   <rdf:Description about = "http://www.deitel.com/announcements/contractors.htm">
      <dc:Title>Looking for Training Contractors</dc:Title>
   </rdf:Description>

   <rdf:Description about = "http://www.deitel.com/announcements/internships.htm">
      <dc:Title>
         Internships at Deitel and Associates, Inc.
      </dc:Title>
   </rdf:Description>

   <rdf:Description about = "http://www.deitel.com/intro.htm">
      <dc:Title>
         Introduction to Deitel and Associates, Inc.
      </dc:Title>
   </rdf:Description>

   <rdf:Description about = "http://www.deitel.com/directions.htm">
      <dc:Title>Our location and how to get there</dc:Title>
   </rdf:Description>

</rdf:RDF>

Internet Technologies 14

The Semantic Web

By augmenting web pages with data directed at computers and by adding documents solely for computers, we will transform the web into the Semantic Web.

Intuitive software will be developed that will allow anyone to create Semantic Web Pages.

For the Semantic Web to function, computers must have access to structured collections of information and sets of inference rules that can be used to conduct automated reasoning.

XML has no built-in mechanism to convey the meaning of the user’s new tags to other users.

These notes are from an article entitled “The Semantic Web” by Tim Berners-Lee, James Hendler and Ora Lassila appearing in Scientific American, May 2001.

Internet Technologies 15

The Semantic Web

The challenge of the Semantic Web is to provide a language that expresses both data and rules for reasoning about the data and that allows rules from an existing knowledge-representation system to be exported onto the Web.

Ontologies: Collections of statements written in a language such as RDF that define the relations between concepts and specify logical rules for reasoning about them.

Computers will “understand” the meaning of semantic data on a web page by following links to specified ontologies.

Consider the statement “a hex-head bolt is a type of machine bolt”. We could encode this in RDF.
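One possible encoding of the hex-head bolt statement is sketched below. It assumes that "is a type of" is modeled as a subclass relation using the RDF Schema vocabulary (rdfs:subClassOf); the slides do not say which vocabulary the authors had in mind, and the example.org URIs for the two bolt classes are hypothetical.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the class URIs are hypothetical, and using rdfs:subClassOf
     for "is a type of" is an assumption -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">

   <!-- (HexHeadBolt, subClassOf, MachineBolt) -->
   <rdfs:Class rdf:about="http://example.org/parts#HexHeadBolt">
      <rdfs:subClassOf rdf:resource="http://example.org/parts#MachineBolt"/>
   </rdfs:Class>

</rdf:RDF>
```

A program that follows the link to the MachineBolt definition could then treat anything stated about machine bolts as applying to hex-head bolts as well.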

When writing code against traditional XML data, the programmer must know what the document author uses each tag for.

Meaning is expressed by RDF, which encodes it in a set of triples.

Internet Technologies 16

The Semantic Web

An RDF document makes assertions that particular things (people, web pages, or whatever) have properties (such as “is sister of”, “is the author of”) with certain values (another person, another Web page).

We can remove ambiguity by associating each of the three parts with a URI. For example:

“(field 5 in database A) (is a field of type) (zip code)” could be replaced with three URIs.
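As a rough illustration, the zip code triple might look like this once each part is a URI. All three URIs and the ex: namespace are made up for the sketch; a real deployment would point at URIs published by the database owner and by an agreed vocabulary.

```xml
<?xml version="1.0"?>
<!-- Sketch only: every URI below is hypothetical -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">

   <!-- resource: field 5 in database A -->
   <rdf:Description rdf:about="http://example.org/databaseA#field5">
      <!-- property: "is a field of type"; value: the zip code concept -->
      <ex:isFieldOfType rdf:resource="http://example.org/types#ZipCode"/>
   </rdf:Description>

</rdf:RDF>
```

Here the property URI is formed from the ex: namespace plus the element name, giving http://example.org/terms#isFieldOfType.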

An ontology is a document or file that formally defines the relations among terms.

An ontology may express a rule “If a city code is associated with a state code, and an address uses that city code, then that address has the associated state code.”

A program can then draw conclusions.

The meaning of terms or XML codes can be defined by pointers from the page to an ontology.

Internet Technologies 17

The Semantic Web

Many automated web-based services already exist without semantics, but other programs such as agents have no way to locate one that will perform a specific function.

Service Discovery will happen only when there is a common language to describe a service in a way that lets other agents “understand” both the function offered and how to take advantage of it.

Services can advertise their functions in directories analogous to the Yellow Pages.

Devices can advertise their abilities with RDF in the form of CC/PP…

Internet Technologies 18

What is CC/PP?

A composite capability/preference profile is a collection of information which describes the capabilities, hardware, system software and applications used by someone accessing the web. Information might include:

• Preferred language (Spanish, French, etc.)
• Sound (on/off)
• Images (on/off)
• Class of device (phone, PC, printer, etc.)
• Screen size
• Available bandwidth
• Version of HTML supported, and so on.
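To make the list concrete, here is a minimal sketch of how such a profile could be written as an RDF description. The prf: namespace, its property names, the device URI and the values are all hypothetical; this is not the vocabulary defined by the CC/PP working group, only an illustration of the idea of a profile as a set of property/value pairs about a device.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the prf: vocabulary, the device URI and the values are
     hypothetical and are not taken from the CC/PP specification -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:prf="http://example.org/profile#">

   <rdf:Description rdf:about="http://example.org/devices/phone-1">
      <prf:PreferredLanguage>es</prf:PreferredLanguage>
      <prf:Sound>off</prf:Sound>
      <prf:Images>on</prf:Images>
      <prf:DeviceClass>phone</prf:DeviceClass>
      <prf:ScreenSize>320x240</prf:ScreenSize>
   </rdf:Description>

</rdf:RDF>
```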

Internet Technologies 19

Composite Capability/Preference Profiles (CC/PP)

[Figure: layer diagram with the layers DEVICE PROFILE, CC/PP, RDF and XML]

CC/PP provides the equivalent of database fields and an associated model for formalizing the device profiles.

RDF is a language which provides a standard way of using XML to represent metadata in the form of properties and relationships of items on the Web.

The device profile and user preferences might be stored in a CC/PP repository. CC/PP is in turn an RDF application.

Internet Technologies 20

• The location of the device profile is sent with a request for a Web page.
• The CC/PP data is accessed and, on the basis of the profile, a Web server can choose the right content. This might be a certain XHTML file, or perhaps a suitable document would be generated on the fly.
• A document on the server may refer to its own document profile, describing the required capabilities of its client.
• The server might match and send or generate and send.

Composite Capability/Preference Profiles (CC/PP)

An example:

Internet Technologies 21

Each variant of the document has a document profile describing the browser support it needs to display it.

DEVICE PROFILES, DOCUMENT PROFILES

NEGOTIATE CORRECT CONTENT FOR DEVICES

If none of the document variants is suitable, an existing document may be transformed by a style sheet or a tool for the purpose, or a new document may be generated.

DEVICES RECEIVE RIGHT MARK-UP

Internet Technologies 22

The CC/PP working group was formed in August 1999. Its mission is to develop an RDF-based framework for the management of device profile information.

Composite Capability/Preference Profiles (CC/PP)

Internet Technologies 23

The Resource Description Framework is a proposal for representing metadata in XML. Its intended applications include:

• Providing better search engine capabilities in resource discovery

• Cataloguing for describing the content and content relationships available at a particular Web site

• Allowing software agents to share and exchange data

More on RDF

The RDF data model is that of a directed, edge-labeled graph. Nodes are called resources and edge labels are called properties. RDF’s syntax is a convention for representing this model in XML.

Internet Technologies 24

[Figure: graph in which a person node has the properties name (Alan), age (42) and email ([email protected])]

This element describes a resource. We are describing two resources in this document with unique IDs.

<rdf:Description ID="001">
   <person>
      <rdf:Description ID="002">
         <name>Alan</name>
         <age>42</age>
         <email>[email protected]</email>
      </rdf:Description>
   </person>
</rdf:Description>

An edge in RDF is called a statement.

Four statements have been made.
1) resource 001 has property person whose value is resource 002.
2) resource 002 has property name with value Alan.
3) resource 002 has property age with value 42.
4) resource 002 has property email with value [email protected].

Internet Technologies 25

What is a Resource?

A resource can be anything that can have a Uniform Resource Identifier (URI). URIs are a superset of the more common Uniform Resource Locators (URLs), but they can also identify books, elements on a page, television shows, individual people, and more.

Thus a resource might be

An entire web site (http://www.cmu.edu)
A single web page (http://www.andrew.cmu.edu/~mm6/index.html)
A specific HTML or XML element on a web page (identified with XPointer)
A book (urn:isbn:0764532367)
A person (mailto:[email protected])

Internet Technologies 26

• RDF provides a model for describing resources.
• Resources have properties (attributes or characteristics).
• RDF defines a resource as any object that is uniquely identifiable by a Uniform Resource Identifier (URI).
• The properties associated with resources are identified by property-types.
• Property-types have corresponding values.
• Property-types express the relationships of values associated with resources.
• Values may be atomic in nature (text strings, numbers, etc.) or other resources.
• A collection of properties that refer to the same resource is called a description.

The RDF Data Model

Internet Technologies 27

[Figure: RDF Description, drawn as a graph with the nodes Resource1, Resource2 and Resource3, edges labeled PropertyType1 through PropertyType4, and two leaf nodes each holding an “Atomic Value”]

Internet Technologies 28

Consider the following statements:

1. “The author of Document 1 is John Smith”
2. “John Smith is the author of Document 1”

To humans, these statements convey the same meaning (that is, John Smith is the author of a particular document). To a machine, however, these are completely different strings. Whereas humans are extremely adept at extracting meaning from differing syntactic constructs, machines remain grossly inept. Using a triadic model of resources, property-types and corresponding values, RDF attempts to provide an unambiguous method of expressing semantics in a machine-readable encoding.

Internet Technologies 29

[Figure: Document 1 (resource) is linked by Author (property) to “John Smith” (value)]

Internet Technologies 30

[Figure: Document 1 (resource) has the property Author, whose value is the resource Author_001. Author_001 has the properties Affiliation (“Home, Inc.”), Name (“John Smith”) and Email ([email protected]).]

Each resource must have a unique identifier.

Perhaps we want to keep information about the author.
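The author graph on this slide could be written in RDF/XML roughly as follows. The example.org URIs, the ex: property names and the e-mail address are placeholders chosen for the sketch; the slide gives only the labels Document 1, Author_001, Affiliation, Name and Email.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the URIs, the ex: vocabulary and the e-mail address
     are hypothetical -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">

   <!-- Document 1 has an Author property whose value is another resource -->
   <rdf:Description rdf:about="http://example.org/docs/Document1">
      <ex:Author rdf:resource="http://example.org/authors/Author_001"/>
   </rdf:Description>

   <!-- The author resource carries its own literal-valued properties -->
   <rdf:Description rdf:about="http://example.org/authors/Author_001">
      <ex:Affiliation>Home, Inc.</ex:Affiliation>
      <ex:Name>John Smith</ex:Name>
      <ex:Email>john.smith@example.org</ex:Email>
   </rdf:Description>

</rdf:RDF>
```

Giving the author its own URI is what lets other documents add further statements about the same person.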

Internet Technologies 31

An example RDF document (DC is the Dublin Core namespace)

<?xml:namespace ns = "http://www.w3.org/RDF/RDF/" prefix = "RDF" ?>
<?xml:namespace ns = "http://purl.oclc.org/DC/" prefix = "DC" ?>

<RDF:RDF>
   <RDF:Description RDF:HREF = "http://uri-od-Document-1">
      <DC:Creator>John Smith</DC:Creator>
   </RDF:Description>
</RDF:RDF>

Description: the keyword that introduces a resource

John Smith: a property value

DC:Creator: a property type
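The namespace processing instructions in this example appear to follow an early draft convention. For comparison, here is a sketch of the same statement in the xmlns style used on the earlier Deitel slides, with the namespace URIs taken from those slides and rdf:about standing in for RDF:HREF. The capitalized dc:Creator follows the slides' spelling.

```xml
<?xml version="1.0"?>
<!-- Sketch only: same statement as above, rewritten with xmlns declarations -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">

   <!-- (Document 1, Creator, "John Smith") -->
   <rdf:Description rdf:about="http://uri-od-Document-1">
      <dc:Creator>John Smith</dc:Creator>
   </rdf:Description>

</rdf:RDF>
```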

Internet Technologies 32

RDF’s data model extends the graph model in several ways:

1) It has containers which can be bags (sets with duplicates) or sequences (lists).

2) It extends the model with higher order statements. The RDF syntax provides the author with the ability to say things like:

John says that the email of resource 002 is [email protected]

The resource is John, the property is “says”, and its value is another statement

or

The Library of Congress says resource X is authoritative.
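A higher-order statement such as the one about John can be written with RDF's reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). The sketch below is one possible encoding; the example.org URIs, the ex:says property and the address alan@example.org are placeholders rather than values from the slides.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the URIs, ex:says and the e-mail address are hypothetical -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">

   <!-- Outer statement: (John, says, <inner statement>) -->
   <rdf:Description rdf:about="http://example.org/people#John">
      <ex:says>
         <!-- Inner statement, described rather than asserted:
              (resource 002, email, "alan@example.org") -->
         <rdf:Statement>
            <rdf:subject rdf:resource="http://example.org/doc#002"/>
            <rdf:predicate rdf:resource="http://example.org/terms#email"/>
            <rdf:object>alan@example.org</rdf:object>
         </rdf:Statement>
      </ex:says>
   </rdf:Description>

</rdf:RDF>
```

Because the inner triple is only described, a consumer can record that John made the claim without itself accepting the e-mail address as fact.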