RDF, XML and interoperability

Managing networks: understanding new technologies, Birmingham, 13 September 2001

Pete Johnston

UKOLN, University of Bath

Bath, BA2 7AY

UKOLN is supported by:

Email: [email protected]

URL: http://www.ukoln.ac.uk/

RDF, XML & interoperability

• Metadata: a reprise
• Communities, communication & XML
• An introduction to RDF
• RDF, XML and interoperability

What is metadata?

• “Data associated with objects which relieves their potential users of having to have full advance knowledge of their existence or characteristics. A user might be a program or a person.”

– Dempsey and Heery, 1998

• “Machine understandable information about web resources or other things.”

– Berners-Lee, 1997

• Structured data about resources that can be used to help support a wide range of operations

What resources, objects, things?

• HTML documents
• digital images
• databases
• books
• museum objects
• archival records
• metadata records
• collections
• services
• physical places
• people
• abstract “works”
• concepts
• events

What operations?

• User wants to
– find, identify, select, obtain / use
• Owner / manager / provider wants to
– describe
– enable and control access/use
– administer
• Different “flavours” of metadata serve different purposes
– Simple, generic vs. rich, specific

Communities & communication

• Effective transmission of information requires agreement on
– semantics: what terms mean, e.g. “cat”, “to sit”, “mat”
– structure: significance of arrangement of terms, e.g. sentence: subject -> verb -> object (in English…)
– syntax: rules of expression, e.g. “The cat sat on the mat.”

• A resource description community is defined by consensus on conventions

Communication using XML (1)

• An example
– I prepare a music catalogue using the (imaginary!) AlbumCat XML schema
– I publish my XML document on the Web
– someone else prepares a catalogue using the same XML schema and publishes their XML document
• I can read their XML document and locate tracks created by Don Van Vliet in their catalogue
• But more importantly…

Communication using XML (2)

… my software can search their document because I have programmed it to map:

User request: Find identifiers of all tracks with creator “Don Van Vliet”

Program action: Find values of dc:identifier attributes of track elements which have a dc:creator child element with content “Don Van Vliet”

Communication using XML (3)

Program action: Find values of dc:identifier attributes of track elements which have a dc:creator child element with content “Don Van Vliet”

<catalogue xmlns:dc="http://purl.org/dc/elements/1.1/">
  <album dc:identifier="http://pj.org/album/245">
    <dc:title>The Spotlight Kid</dc:title>
    <dc:creator>Van Vliet, Don</dc:creator>
    <track dc:identifier="http://pj.org/track/723">
      <dc:title>Grow fins</dc:title>
      <dc:creator>Van Vliet, Don</dc:creator>
    </track>
  </album>
</catalogue>
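
The program action above can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree. This is only an illustration: it assumes the dc: prefix is bound to the Dublin Core element set namespace http://purl.org/dc/elements/1.1/, and the function name find_track_ids is made up for the example. The sample stores the creator as “Van Vliet, Don”, so that is the literal string matched here.

```python
import xml.etree.ElementTree as ET

# Assumed namespace binding for the dc: prefix.
DC = "http://purl.org/dc/elements/1.1/"

ALBUMCAT = """<catalogue xmlns:dc="http://purl.org/dc/elements/1.1/">
  <album dc:identifier="http://pj.org/album/245">
    <dc:title>The Spotlight Kid</dc:title>
    <dc:creator>Van Vliet, Don</dc:creator>
    <track dc:identifier="http://pj.org/track/723">
      <dc:title>Grow fins</dc:title>
      <dc:creator>Van Vliet, Don</dc:creator>
    </track>
  </album>
</catalogue>"""


def find_track_ids(xml_text, creator):
    """Return dc:identifier attribute values of track elements whose
    dc:creator child element has exactly the given content."""
    root = ET.fromstring(xml_text)
    ids = []
    for track in root.iter("track"):
        # ElementTree spells qualified names as "{namespace}local".
        creators = [c.text for c in track.findall(f"{{{DC}}}creator")]
        if creator in creators:
            ids.append(track.get(f"{{{DC}}}identifier"))
    return ids


track_ids = find_track_ids(ALBUMCAT, "Van Vliet, Don")
print(track_ids)
```

Note that everything the function knows about the document (tracks are elements, identifiers are attributes, creators are child elements) is specific to the AlbumCat schema.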

Metadata use

• Resource users wish to
– search across the boundaries of communities
– combine resources from different communities
• Resource providers wish to
– exchange descriptions with members of other communities
• Third parties wish to
– describe resources owned/described by others
• Metadata is
– used beyond its creator community
– combined with metadata from other communities

Communication using XML (4)

• Continuing the example
– a museum describes their holdings using the (imaginary...) ArtCat XML schema and publishes their XML document
• I can read their XML document and locate pictures created by Don Van Vliet listed in their catalogue
– requires my guesswork and/or reference to semantics of ArtCat schema
• But…

Communication using XML (5)

User request: Find identifiers of all “works” with creator “Don Van Vliet”

… to search across both catalogues, my software now has to be programmed with two mappings:

Program action (AlbumCat): Find values of dc:identifier attributes of track elements which have a dc:creator child element with content “Don Van Vliet”

Program action (ArtCat): Find content of dc:identifier elements which have a picture parent element with a details child element which has a dc:creator child element with content “Don Van Vliet”
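
A rough Python sketch of what “two mappings” means in practice, again using xml.etree.ElementTree. Several things here are assumptions made only for illustration: the ArtCat document is reconstructed from the program action described above (the root element name artcat and the picture identifier are hypothetical), the dc: prefix is assumed to be bound to http://purl.org/dc/elements/1.1/, and the creator is matched as the literal string “Van Vliet, Don” used in the AlbumCat sample.

```python
import xml.etree.ElementTree as ET

NS = {"dc": "http://purl.org/dc/elements/1.1/"}  # assumed dc: binding
DC_IDENTIFIER_ATTR = "{http://purl.org/dc/elements/1.1/}identifier"

ALBUMCAT = """<catalogue xmlns:dc="http://purl.org/dc/elements/1.1/">
  <album dc:identifier="http://pj.org/album/245">
    <track dc:identifier="http://pj.org/track/723">
      <dc:title>Grow fins</dc:title>
      <dc:creator>Van Vliet, Don</dc:creator>
    </track>
  </album>
</catalogue>"""

# Hypothetical ArtCat document, shaped to fit the ArtCat program action.
ARTCAT = """<artcat xmlns:dc="http://purl.org/dc/elements/1.1/">
  <picture>
    <dc:identifier>http://museum.example/picture/17</dc:identifier>
    <details>
      <dc:creator>Van Vliet, Don</dc:creator>
    </details>
  </picture>
</artcat>"""


def albumcat_ids(root, creator):
    # AlbumCat: identifier is an attribute, creator is a direct child.
    return [
        track.get(DC_IDENTIFIER_ATTR)
        for track in root.iter("track")
        if any(c.text == creator for c in track.findall("dc:creator", NS))
    ]


def artcat_ids(root, creator):
    # ArtCat: identifier is a child element, creator sits inside details.
    ids = []
    for picture in root.iter("picture"):
        creators = picture.findall("details/dc:creator", NS)
        if any(c.text == creator for c in creators):
            ids.extend(i.text for i in picture.findall("dc:identifier", NS))
    return ids


# One hand-written mapping per schema, keyed on the root element name.
MAPPINGS = {"catalogue": albumcat_ids, "artcat": artcat_ids}


def find_work_ids(xml_text, creator):
    root = ET.fromstring(xml_text)
    try:
        mapping = MAPPINGS[root.tag]
    except KeyError:
        raise ValueError(f"no mapping programmed for schema {root.tag!r}") from None
    return mapping(root, creator)


results = [
    work_id
    for document in (ALBUMCAT, ARTCAT)
    for work_id in find_work_ids(document, "Van Vliet, Don")
]
print(results)
```

Every new schema needs another entry in MAPPINGS, and a document in an unknown schema cannot be searched at all.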

The problem

• Statement
– this resource (track, picture... etc!) has dc:creator “Don Van Vliet”
• Multiple expressions in XML
– different XML schemas make different choices
– all “good” (and valid)
– human reader of document can interpret (maybe)
– program needs prior “knowledge” of structural conventions in each XML schema
• Not scalable in an “open” environment
– how to manage ever increasing set of conventions
– always encountering unknown schemas

The problem (2)

“XML allows users to add arbitrary structure to their documents but says nothing about what the structures mean.”

– Berners-Lee, 2001

• Consensus on syntax
– use of XML
• Consensus on semantics of terms
– meaning of (uniquely named through XML namespace) elements/attributes
• No consensus on meaning of structure
– e.g. parent-child element relations

Introducing RDF

• Resource Description Framework Model & Syntax
• Recommendation of W3C, 1999
• Generic “architecture” for metadata
– set of conventions for applications exchanging metadata
– allow semantics to be defined by different resource description communities
– accommodate mixing of metadata from diverse sources

Introducing RDF (2)

• Defines
– model for making statements about resources
– conventions for encoding statements using XML syntax
• Object types
– Resource: any object identified by URI (not necessarily accessible via Web)
– Property: “attribute” to describe resource (properties also uniquely identified by URI)
– Statement: “triple” of specific resource, named property, and value
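
One way to picture a statement is as a three-field record. The sketch below is only an illustration in Python, not part of RDF itself: the field names follow the wording above, and the property URI is formed from the namespace http://www.ukoln.ac.uk/core/ that this talk's RDF/XML examples use, joined to the local name author.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    resource: str  # URI of the resource being described
    prop: str      # URI of the property
    value: str     # a literal string, or the URI of another resource


UC = "http://www.ukoln.ac.uk/core/"  # namespace used in this talk's examples

stmt = Statement("http://pj.org/doc/1", UC + "author", "Pete")

resource, prop, value = stmt
print(f"<{resource}> has property <{prop}> with value {value!r}")
```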

The RDF model

A resource has some property whose value is either (i) a simple string value (literal)…

[Graph]
http://pj.org/doc/1 --author--> “Pete”

– The resource identified by the URI http://pj.org/doc/1 has a property “author” whose value is “Pete”

– Or, “Pete” is the “author” of the resource identified by http://pj.org/doc/1

The RDF model (2)

… or (ii) another resource...

[Graph]
http://pj.org/doc/1 --author--> (unnamed resource)
(unnamed resource) --name--> “Pete”
(unnamed resource) --email--> “[email protected]”

– The value of property “author” is another resource which has a property “name” with value “Pete” and a property “email” with value “[email protected]”

The RDF model (3)

… which may itself have a URI

[Graph]
http://pj.org/doc/1 --author--> http://pj.org/person/pete
http://pj.org/person/pete --name--> “Pete”
http://pj.org/person/pete --email--> “[email protected]”
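
The difference between a literal value and a resource value can be made concrete with a small Python sketch. The URI and Literal wrapper classes and the helper names are assumptions made for this illustration, and the property URIs reuse the http://www.ukoln.ac.uk/core/ namespace from this talk's RDF/XML examples. The email statement is left out of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


UC = "http://www.ukoln.ac.uk/core/"
DOC = URI("http://pj.org/doc/1")
PETE = URI("http://pj.org/person/pete")

# The graph is simply a set of (resource, property, value) statements.
graph = {
    (DOC, URI(UC + "author"), PETE),
    (PETE, URI(UC + "name"), Literal("Pete")),
}


def values(graph, resource, prop):
    """All values of one property of one resource."""
    return [v for (r, p, v) in graph if r == resource and p == prop]


def author_names(graph, doc):
    names = []
    for author in values(graph, doc, URI(UC + "author")):
        if isinstance(author, URI):
            # The value is a resource: follow it to its own "name" property.
            names.extend(n.value for n in values(graph, author, URI(UC + "name")))
        else:
            # The value is a plain literal, as in the first form of the model.
            names.append(author.value)
    return names


print(author_names(graph, DOC))  # ['Pete']
```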

The power of RDF

• Extensible model
– supports any vocabularies
• Supports arbitrary complexity of description
• URIs as unique fixed points to identify
– resources
– properties
• Descriptions created independently can be “merged” using URIs as “anchors”

First source

[Graph]
http://pj.org/doc/1 --author--> http://pj.org/person/pete
http://pj.org/person/pete --name--> “Pete”
http://pj.org/person/pete --email--> “[email protected]”

Second source

[Graph]
http://pj.org/doc/1 --subject--> “XML”

Third source

[Graph]
http://pj.org/person/pete --organisation--> “UKOLN”

Three descriptions merged

[Graph]
http://pj.org/doc/1 --author--> http://pj.org/person/pete
http://pj.org/doc/1 --subject--> “XML”
http://pj.org/person/pete --name--> “Pete”
http://pj.org/person/pete --email--> “[email protected]”
http://pj.org/person/pete --organisation--> “UKOLN”
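
Merging on URIs needs no schema-specific logic: if each description is held as a set of statements, the merge is a set union. A minimal Python sketch, under the assumptions that statements are plain (resource, property, value) tuples of strings, that property URIs use the http://www.ukoln.ac.uk/core/ namespace from this talk's RDF/XML examples, and that the email statement is omitted; the helper name describe is illustrative.

```python
UC = "http://www.ukoln.ac.uk/core/"
DOC = "http://pj.org/doc/1"
PETE = "http://pj.org/person/pete"

# Three descriptions created independently.
first = {(DOC, UC + "author", PETE), (PETE, UC + "name", "Pete")}
second = {(DOC, UC + "subject", "XML")}
third = {(PETE, UC + "organisation", "UKOLN")}

# Statements about the same URI end up attached to the same node.
merged = first | second | third


def describe(graph, resource):
    """Collect property -> list of values for one resource."""
    description = {}
    for r, p, v in graph:
        if r == resource:
            description.setdefault(p, []).append(v)
    return description


for resource in (DOC, PETE):
    print(resource)
    for prop, vals in sorted(describe(merged, resource).items()):
        print("  ", prop, "->", vals)
```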

The RDF XML syntax

• XML representation of model
– to store/exchange descriptions
• Property names made unique through use of XML namespaces
• Variant XML syntaxes for RDF

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:uc="http://www.ukoln.ac.uk/core/">
  <rdf:Description about="http://pj.org/doc/1">
    <uc:author>Pete</uc:author>
  </rdf:Description>
</rdf:RDF>

The RDF XML syntax (2)

• Using RDF/XML syntax means accepting conventions for the meaning of structures in the XML document

• So, an RDF/XML processor can “know in advance” the meaning of structures

– even if the description uses unanticipated vocabularies

– “partial understanding”

• Can read multiple descriptions into store and “merge” on URIs

• Will be generated/consumed by software!
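
Because RDF/XML fixes what the nesting means, one generic routine can pull statements out of any description without knowing its vocabulary. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree and only the basic nested rdf:Description form used in this talk; anonymous resources and the abbreviated syntaxes are not handled. The ex: namespace and its shelfmark property are hypothetical, standing in for an unanticipated vocabulary.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DESCRIPTION = f"{{{RDF}}}Description"


def subject_of(description):
    # Accept both rdf:about and the unqualified about attribute.
    return description.get(f"{{{RDF}}}about") or description.get("about")


def property_uri(tag):
    # ElementTree spells qualified names as "{namespace}local";
    # property elements are assumed to be namespace-qualified.
    namespace, local = tag[1:].rsplit("}", 1)
    return namespace + local


def triples(description):
    """Yield (resource, property, value) for one rdf:Description,
    descending into nested descriptions."""
    subject = subject_of(description)
    for prop in description:
        nested = prop.find(DESCRIPTION)
        if nested is None:
            yield (subject, property_uri(prop.tag), (prop.text or "").strip())
        else:
            yield (subject, property_uri(prop.tag), subject_of(nested))
            yield from triples(nested)


def parse(rdf_xml):
    root = ET.fromstring(rdf_xml)
    found = set()
    for description in root.findall(DESCRIPTION):
        found.update(triples(description))
    return found


SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:uc="http://www.ukoln.ac.uk/core/"
         xmlns:ex="http://example.org/unknown/">
  <rdf:Description about="http://pj.org/doc/1">
    <uc:author>
      <rdf:Description about="http://pj.org/person/pete">
        <uc:name>Pete</uc:name>
      </rdf:Description>
    </uc:author>
    <ex:shelfmark>QA76</ex:shelfmark>
  </rdf:Description>
</rdf:RDF>"""

statements = parse(SAMPLE)
for statement in sorted(statements):
    print(statement)
```

The routine has never heard of ex:shelfmark, yet it still recovers which resource the property describes and what its value is. That is the “partial understanding” referred to above.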

First source

[Graph]
http://pj.org/doc/1 --author--> http://pj.org/person/pete
http://pj.org/person/pete --name--> “Pete”
http://pj.org/person/pete --email--> “[email protected]”

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:uc="http://www.ukoln.ac.uk/core/">
  <rdf:Description about="http://pj.org/doc/1">
    <uc:author>
      <rdf:Description about="http://pj.org/person/pete">
        <uc:name>Pete</uc:name>
        <uc:email>[email protected]</uc:email>
      </rdf:Description>
    </uc:author>
  </rdf:Description>
</rdf:RDF>

Second source

[Graph]
http://pj.org/doc/1 --subject--> “XML”

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:uc="http://www.ukoln.ac.uk/core/">
  <rdf:Description about="http://pj.org/doc/1">
    <uc:subject>XML</uc:subject>
  </rdf:Description>
</rdf:RDF>

Third source

[Graph]
http://pj.org/person/pete --organisation--> “UKOLN”

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:uc="http://www.ukoln.ac.uk/core/">
  <rdf:Description about="http://pj.org/person/pete">
    <uc:organisation>UKOLN</uc:organisation>
  </rdf:Description>
</rdf:RDF>

Three descriptions merged

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:uc="http://www.ukoln.ac.uk/core/">
  <rdf:Description about="http://pj.org/doc/1">
    <uc:author>
      <rdf:Description about="http://pj.org/person/pete">
        <uc:name>Pete</uc:name>
        <uc:email>[email protected]</uc:email>
        <uc:organisation>UKOLN</uc:organisation>
      </rdf:Description>
    </uc:author>
    <uc:subject>XML</uc:subject>
  </rdf:Description>
</rdf:RDF>

A Dublin Core description

<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description about="http://www.ukoln.ac.uk/">
    <dc:title>UKOLN home page</dc:title>
    <dc:creator>Web-support Team, UKOLN</dc:creator>
    <dc:subject>digital information management; metadata</dc:subject>
    <dc:description>The home page of the UKOLN web site. UKOLN is a
      national focus of expertise in digital information management. It
      provides policy, research and awareness services to the UK library,
      information and cultural heritage communities. UKOLN is based at the
      University of Bath.</dc:description>
    <dc:publisher>UKOLN</dc:publisher>
    <dc:date>2001-09-06</dc:date>
    <dc:type>Text</dc:type>
    <dc:format>text/html</dc:format>
    <dc:format>12809 bytes</dc:format>
  </rdf:Description>
</rdf:RDF>
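
Descriptions like this are normally produced by software rather than typed by hand. As a rough sketch of that, the Python below builds an abbreviated version of the description above with xml.etree.ElementTree. The function name dc_description and the list-of-pairs input are assumptions for the example, and the output uses the namespace-qualified rdf:about attribute rather than the unqualified about shown above.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to serialise these namespaces with the usual prefixes.
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def dc_description(about, properties):
    """Build an RDF/XML description of one resource from a list of
    (Dublin Core element name, value) pairs; names may repeat."""
    root = ET.Element(f"{{{RDF}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": about}
    )
    for name, value in properties:
        ET.SubElement(description, f"{{{DC}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = dc_description(
    "http://www.ukoln.ac.uk/",
    [
        ("title", "UKOLN home page"),
        ("publisher", "UKOLN"),
        ("format", "text/html"),
        ("format", "12809 bytes"),
    ],
)
print(xml_text)
```

A list of pairs is used instead of a dictionary because a property such as dc:format may appear more than once.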

RDF, XML & interoperability

• Why isn’t XML enough?
– simple statement could be expressed in XML in many different ways
– human reader makes interpretation/guess
– application program requires prior knowledge of schema/DTD design
• RDF/XML
– imposes extra syntactic constraints on how statement expressed
– both human and program can interpret description consistently
• Less flexibility, greater interoperability

RDF, XML & interoperability

• Tentatively…
• Use XML for exchange when
– partners (humans, applications) both “know” semantics conveyed by structure of (meta)data
• Use RDF/XML for exchange when
– (meta)data potentially used by applications without prior “knowledge” of specific schema
– (meta)data incorporates overlapping structures from different domains
• N.B. raises issues of trust
– who made statements?

A note of caution

• RDF not (yet?) a widely adopted technology
• Addresses cross-organisation/domain problems
• Some scepticism?
– perceived as theoretical, “academic”?
– also considerable enthusiasm!
• Some revisions to Model & Syntax in progress at W3C
– XML 1.0 is stable
– RDF less so
• Limited tools available (at present!)
• But also growing number of applications

Exercise (optional)

• DC-dot
– http://www.ukoln.ac.uk/metadata/dcdot/
– Web-based tool
– generates DC metadata for Web pages, based on existing <meta> tags, heading content etc
• Experiment with DC-dot to generate DC metadata for pages of your choice
• View the RDF/XML representations

Acknowledgements

UKOLN is funded by Resource: the Council for Museums, Archives and Libraries, the Joint Information Systems Committee (JISC) of the UK higher and further education funding councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath where it is based.

http://www.ukoln.ac.uk/