DM110 - Week 10 - Semantic Web / Web 3.0

DM110 Emerging Web Media / Huston Film School, National University of Ireland, Galway / 13th March 2007

Page 1: DM110 - Week 10 - Semantic Web / Web 3.0

Copyright 2005 Digital Enterprise Research Institute. All rights reserved.

www.deri.org

DM110 Emerging Web Media

Dr. John Breslin

[email protected]

http://sw.deri.org/~jbreslin/

Week 10: Semantic Web / Web 3.0

Page 2: DM110 - Week 10 - Semantic Web / Web 3.0

What is the Semantic Web?

• Sir Tim Berners-Lee et al., Scientific American, 2001:
  – “An extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
• “Entrepreneurs see a Web guided by common sense”, John Markoff, New York Times, 2006:
  – “Referred to as Web 3.0, the effort is in its infancy, and the very idea has given rise to skeptics who have called it an unobtainable vision. But the underlying technologies are rapidly gaining adherents, at big companies like IBM and Google as well as small ones.”
• Requires web pages to have metadata with underlying ontologies

Page 3: DM110 - Week 10 - Semantic Web / Web 3.0

Where are we in the Semantic Web layer cake?

[Figure: the Semantic Web layer cake, with a “You are here!” marker]

Page 4: DM110 - Week 10 - Semantic Web / Web 3.0

What is metadata?

• Metadata has been with us since the first librarian made a list of the items on a shelf of handwritten scrolls
• The term “meta” comes from a Greek word that denotes “alongside, with, after, next”
• More recent Latin and English usage would employ “meta” to denote something transcendental, or beyond nature
• Metadata can be thought of as “data about data”
• It is the Internet-age term for information that librarians traditionally have put into catalogues, and it most commonly refers to descriptive information about Web resources

Page 5: DM110 - Week 10 - Semantic Web / Web 3.0

Why do we need metadata?

• To provide a structured description of characteristics such as the meaning (semantics), content, structure and purpose of a resource
• To facilitate information sharing
• To enable more sophisticated search engines on the Internet
• To support intelligent agents and the pushing of data (e.g. from blog feeds)
• To minimise data loss or repetition
• To improve resource discovery by enabling field-based searches

Page 6: DM110 - Week 10 - Semantic Web / Web 3.0

Why does the Web need metadata?

• No metadata:
  – Google
• Library analogy:
  – Index every word in every page in every book
➔ Bad search results
➔ Lagging the growth and change in the Web

• Metadata (basic):
  – Yahoo! Directory
• Library analogy:
  – Categories
  – Titles
  – Descriptions
  – Ratings
➔ Better results
➔ More work in classifying things and assigning properties!

Page 7: DM110 - Week 10 - Semantic Web / Web 3.0

What kind of resources, objects, things?

• HTML documents
• digital images
• databases
• books
• museum objects
• archival records
• metadata records
• collections
• services
• physical places
• people (using FOAF)
• abstract “works”
• concepts
• events

Page 8: DM110 - Week 10 - Semantic Web / Web 3.0

Who or what makes use of metadata?

• People:
  – an owner managing resources
  – a researcher seeking resources
  – third-party services
• Software agents:
  – aggregators (e.g. blog collections)
  – “portals” presenting “landscape” of data to users
  – “brokers” performing query tasks on behalf of users

Page 9: DM110 - Week 10 - Semantic Web / Web 3.0

What can they do with metadata?

• End user wants to:
  – find
  – identify
  – select
  – obtain/use
  – interpret
• Third-party service may want to:
  – disclose/promote
  – enable and control access/use
  – annotate
  – re-contextualise

Page 10: DM110 - Week 10 - Semantic Web / Web 3.0

Metadata and ontologies

• Metadata elements are used to provide structure to the description of a resource:
  – e.g. title, description, keywords, author, educational level, version, location, language, date created, etc.
• Further structure is provided by a metadata schema or ontology:
  – For example, if there is metadata about a soccer team, an underlying ontology will say that a soccer team always has a goalkeeper and always has a manager, so each metadata entry for a soccer team should have that information
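The soccer team example can be sketched in Python as a check of a metadata record against the properties that a schema requires. The `SCHEMA` dictionary, the `missing_properties` helper and the sample records are illustrative assumptions, not part of any metadata standard.

```python
# Hypothetical schema: each class name maps to the properties it requires.
SCHEMA = {
    "SoccerTeam": {"name", "goalkeeper", "manager"},
}


def missing_properties(record: dict, schema: dict = SCHEMA) -> set:
    """Return the required properties that the record does not supply."""
    required = schema.get(record.get("type"), set())
    present = {key for key, value in record.items() if value}
    return required - present


# Made-up records: the second one has no goalkeeper entry.
complete = {"type": "SoccerTeam", "name": "Example FC",
            "goalkeeper": "A. Keeper", "manager": "A. Manager"}
incomplete = {"type": "SoccerTeam", "name": "Example FC",
              "manager": "A. Manager"}

print(missing_properties(complete))    # set()
print(missing_properties(incomplete))  # {'goalkeeper'}
```

A real ontology language expresses such requirements declaratively; the dictionary here only stands in for that idea.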

Page 11: DM110 - Week 10 - Semantic Web / Web 3.0

How is metadata created?

• By software tools:
  – indexing robots, web crawlers
  – from resource content, from server info
• By people:
  – descriptions added by resource creator/owner
  – descriptions provided by third party services, specialist cataloguers or resource users
• Creating (and maintaining) good quality metadata is not always cheap:
  – may be rights issues for metadata as well as for resources

Page 12: DM110 - Week 10 - Semantic Web / Web 3.0

Where can you find metadata?

• Embedded within the coding for a resource itself:
  – depends on format of resource
  – can metadata be extracted from resource?
• Linked to resource
• In a database of descriptions/repository of resources:
  – may be remote database
• …
• Adopt approach which offers most flexibility:
  – may need to “present” different subsets of full metadata in different contexts

Page 13: DM110 - Week 10 - Semantic Web / Web 3.0

What about metadata standards?

• Metadata standards are agreed-on criteria for describing data to support interoperability
• Simple example:
  – January 31, 2006
  – 31 janvier 2006
  – 2006-01-31
  – 01-31-2006
  – 31012006
• Need some consistent forms for exchanging metadata
• Many standards for different domains (Dublin Core, Warwick Framework, SCORM, IMS, ARIADNE, IEEE LOM, AICC, ADL SCORM, Merlot, RDF), so may also need mappings between these standards
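To make the date example concrete, here is a minimal Python sketch that maps the five spellings of the same date onto one agreed form (ISO 8601, YYYY-MM-DD). The list of candidate layouts, their order and the `to_iso` helper are assumptions made for illustration; they are not taken from any of the standards named above.

```python
from datetime import datetime

# French month names mapped to English so that the %B directive, which
# expects English names under Python's default locale, can parse them.
FRENCH_MONTHS = {
    "janvier": "January", "février": "February", "mars": "March",
    "avril": "April", "mai": "May", "juin": "June",
    "juillet": "July", "août": "August", "septembre": "September",
    "octobre": "October", "novembre": "November", "décembre": "December",
}

# Candidate layouts, tried in order. The order is an assumption: a string
# such as 01-02-2006 is ambiguous, which is why an agreed form is needed.
FORMATS = ["%Y-%m-%d", "%B %d, %Y", "%d %B %Y", "%m-%d-%Y", "%d%m%Y"]


def to_iso(text: str) -> str:
    """Parse a date in any of the known layouts and return YYYY-MM-DD."""
    words = [FRENCH_MONTHS.get(word.lower(), word) for word in text.split()]
    cleaned = " ".join(words)
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date format: {text!r}")


for raw in ["January 31, 2006", "31 janvier 2006", "2006-01-31",
            "01-31-2006", "31012006"]:
    print(raw, "->", to_iso(raw))
```

Each of the five inputs should come out as 2006-01-31, but only because the code guesses the layout; exchanging the ISO form in the first place removes the guesswork.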

Page 14: DM110 - Week 10 - Semantic Web / Web 3.0

What is RDF?

• On the Semantic Web, we use a standard called RDF to express metadata about resources, and RDF Schema to create metadata schemas or ontologies
• RDF stands for Resource Description Framework
• RDF is a framework for describing and interchanging Semantic Web metadata
• “RDF is an infrastructure that enables the encoding, exchange, and reuse of structured metadata” - Bearman et al., 1999

Page 15: DM110 - Week 10 - Semantic Web / Web 3.0

A typical full text search without RDF

• Web pages at the moment are mainly text, e.g. “Stefan Decker works at DERI, funded by SFI.”
• NLP not evolved enough to solve human problems, e.g. how can one find out Stefan’s funding agency?

1. Google: “stefan decker +deri”
   – Did I choose the right keywords?
2. Look through results
   – How do I know Google’s rankings are correct?
3. Click on most likely link
   – But is it really the best choice?
4. Search through text for the answer
   – The answer in the text is ambiguous…

Page 16: DM110 - Week 10 - Semantic Web / Web 3.0

Same example but with RDF metadata

• If we use RDF metadata in a web page, e.g.

<Person>
  <name>Stefan Decker</name>
  <workplaceHomepage>http://www.deri.ie/</workplaceHomepage>
  <fundedBy>http://www.sfi.ie/</fundedBy>
</Person>

• Now a computer can return an answer to a question such as “who funds Stefan Decker?” rather than requiring a combination of person plus computer to figure it out!
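As a rough sketch of how a program could answer “who funds Stefan Decker?” from the markup above, the following Python reads the fragment with the standard library XML parser. The `funder_of` helper is a hypothetical name, and the code treats the fragment as plain XML rather than as full RDF.

```python
import xml.etree.ElementTree as ET

# The metadata fragment from the slide, used as-is.
PAGE_METADATA = """
<Person>
  <name>Stefan Decker</name>
  <workplaceHomepage>http://www.deri.ie/</workplaceHomepage>
  <fundedBy>http://www.sfi.ie/</fundedBy>
</Person>
"""


def funder_of(person_name: str, xml_text: str):
    """Return the fundedBy value for the named person, or None."""
    root = ET.fromstring(xml_text)
    # iter() also visits the root, so this works whether Person is the
    # root element or nested further down.
    for person in root.iter("Person"):
        if person.findtext("name") == person_name:
            return person.findtext("fundedBy")
    return None


print(funder_of("Stefan Decker", PAGE_METADATA))  # http://www.sfi.ie/
```

No keyword guessing or reading of free text is involved: the answer is looked up by field name.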

Page 17: DM110 - Week 10 - Semantic Web / Web 3.0

What does RDF consist of?

• Resources
  – A resource is a thing you talk about (can reference)
  – Resources have URIs (e.g. they may be web pages, a part of an XML document, etc.)
• Properties
  – Slots that define relationships to other resources or atomic values
• Statements
  – “Resource has Property with Value” (expressed as a Subject / Predicate / Object statement)
  – Values can be resources or atomic XML data (e.g. “literal” string)
• Frames
  – A straightforward way to express these abstract properties in XML

Page 18: DM110 - Week 10 - Semantic Web / Web 3.0

A simple RDF example

• Statement:
  – “Ora Lassila is the creator of the resource (web page) http://www.w3.org/Home/Lassila”
• Structure:
  Resource (subject)    http://www.w3.org/Home/Lassila
  Property (predicate)  http://www.schema.org/#Creator
  Value (object)        "Ora Lassila"
• Directed graph:
  http://www.w3.org/Home/Lassila --s:Creator--> "Ora Lassila"
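The subject / predicate / object structure of this statement can be sketched in Python as a small named tuple. The `Triple` class and the `describe` helper are illustrative names only, not part of RDF or of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the property
    obj: str        # the value: a literal here, but it could be another URI


# The single statement from the slide.
statement = Triple(
    subject="http://www.w3.org/Home/Lassila",
    predicate="http://www.schema.org/#Creator",
    obj="Ora Lassila",
)


def describe(triple: Triple) -> str:
    """Render a triple in the 'Resource has Property with Value' form."""
    return f"{triple.subject} has {triple.predicate} with value {triple.obj!r}"


print(describe(statement))
```

A set of such triples is all that the directed graph contains: each triple is one arrow from a subject to an object.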

Page 19: DM110 - Week 10 - Semantic Web / Web 3.0

Simple RDF example shown in RDF/XML

• In the directed graphs, the arrows point from the subject to the object, and the text on the arrow is the predicate
• The ellipses are resources and the rectangles are literals or text strings
• We can also represent this graph model in RDF/XML:

<rdf:Description rdf:about="http://www.w3.org/Home/Lassila">
  <s:Creator>Ora Lassila</s:Creator>
</rdf:Description>
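As a rough illustration of how software might read this RDF/XML back into a triple, the sketch below uses Python's standard library XML parser. Two things are assumptions: the fragment is wrapped in an `rdf:RDF` root with namespace declarations so that it is well-formed, and the `s:` namespace URI is inferred from the property URI http://www.schema.org/#Creator shown earlier. ElementTree is not an RDF parser, so `literal_triples` (a hypothetical helper) handles only this simple shape.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The slide's fragment inside an assumed rdf:RDF wrapper.
DOCUMENT = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:s="http://www.schema.org/#">
  <rdf:Description rdf:about="http://www.w3.org/Home/Lassila">
    <s:Creator>Ora Lassila</s:Creator>
  </rdf:Description>
</rdf:RDF>
"""


def literal_triples(xml_text: str):
    """Yield (subject, predicate, literal) for each simple property element."""
    root = ET.fromstring(xml_text)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes tags as {namespace}local; dropping the
            # braces joins the two parts back into the full property URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            yield subject, predicate, (prop.text or "").strip()


print(list(literal_triples(DOCUMENT)))
```

In practice a dedicated RDF library would be used instead, since RDF/XML allows many equivalent spellings of the same graph.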

Page 20: DM110 - Week 10 - Semantic Web / Web 3.0

Expanding on the previous example

[Figure: directed graph]
http://www.w3.org/Home/Lassila --s:Creator--> Person://fi/654645635
Person://fi/654645635 --Name--> "Ora Lassila"
Person://fi/654645635 --Email--> "[email protected]"

• To add properties to the “Creator”, point through an intermediate resource (the ellipses are resources and the rectangles are literals or text strings)

Page 21: DM110 - Week 10 - Semantic Web / Web 3.0

Expanded RDF example shown in RDF/XML

<rdf:Description rdf:about="http://www.w3.org/Home/Lassila">
  <s:Creator rdf:resource="Person://fi/654645635"/>
</rdf:Description>

<rdf:Description rdf:about="Person://fi/654645635">
  <Name>Ora Lassila</Name>
  <Email>[email protected]</Email>
</rdf:Description>
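Pointing through an intermediate resource means that a query has to take two hops: from the page to the person, then from the person to the name. A minimal Python sketch follows, with the graph held as a plain list of tuples; the `objects` and `creator_names` helpers are hypothetical names, and the Email triple is left out because the address is not legible in the source.

```python
# Triples from the expanded example, written as (subject, predicate, object).
TRIPLES = [
    ("http://www.w3.org/Home/Lassila", "s:Creator", "Person://fi/654645635"),
    ("Person://fi/654645635", "Name", "Ora Lassila"),
]


def objects(subject: str, predicate: str, triples=TRIPLES) -> list:
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def creator_names(page: str) -> list:
    """Follow s:Creator to the intermediate resource, then read its Name."""
    names = []
    for creator in objects(page, "s:Creator"):
        names.extend(objects(creator, "Name"))
    return names


print(creator_names("http://www.w3.org/Home/Lassila"))  # ['Ora Lassila']
```

Because the person is now a resource in its own right, further properties can be attached to it without touching the description of the page.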

Page 22: DM110 - Week 10 - Semantic Web / Web 3.0

What is an ontology?

• In a nutshell, ontologies are formal and consensual specifications of conceptualisations that provide a shared and common understanding of a domain

• Ontologies define the terms used to describe and represent an area of knowledge

• Ontologies are a key enabling technology for the Semantic Web

• They interweave human understanding of symbols with their machine processability

Page 23: DM110 - Week 10 - Semantic Web / Web 3.0

Ontologies on the Semantic Web

• Semantic Web ontologies have computer-usable definitions:

➔ Concepts (AKA classes) are general things in the domain:
  – Person, Document, Book, Web_Page
➔ Relationships exist among things:
  – Book, Web_Page are subclasses of Document
➔ Properties (attributes) that things may have:
  – Person has an age, Web_Page has a creation_date

Page 24: DM110 - Week 10 - Semantic Web / Web 3.0

Ontology structures

From: http://aot.ce.unipr.it/team/poggi/teaching/ia/docs/Ontology.pdf

Page 25: DM110 - Week 10 - Semantic Web / Web 3.0

Why use ontologies?

• Labeling:
  – If I say “car” and you say “automobile”, how do we know we mean the same thing?
• Semantics:
  – If I say “vehicle”, how do you know if this includes buses, powered motorcycles?
• Knowledge sharing and reuse:
  – Need to be able to create definitions of terms in a machine-understandable format
  – Systematic categorisation and computation requires systematic representation:
    • Systematic representation corresponds to an ontology

Page 26: DM110 - Week 10 - Semantic Web / Web 3.0

What is a concept?

• Concepts or “classes”:
  – Are in general language independent (the words ‘university’ and ‘ollscoil’ denote the same concept)
  – Are mental or logical representations of reality
  – Are related to other concepts
  – Do not need symbols but hold them for means of communication
• A concept has:
  – Intension, i.e. meaning
  – Extension, i.e. a set of objects that the concept refers to
• On the difference between intension and extension, consider the phrases "Evening Star" and "Morning Star", which have different meanings (intension) yet both refer to the planet Venus (extension)
• Ontology is mainly concerned with intension

Page 27: DM110 - Week 10 - Semantic Web / Web 3.0

Components of an ontology

• Concepts
  – Cat
  – Dog
• Properties
  – Length
  – Age
• Constraints
  – Cardinality is at least 1
  – Maximum value is 300
• Axioms
  – Cows are larger than dogs
  – Cats cannot eat only vegetation
• Relationships
  – Is a
  – Part of
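The constraint component can be sketched as a small check in Python. The slide does not say which properties the two constraints apply to, so attaching the cardinality rule to `length` and the maximum to `age` is purely an assumption for illustration, as are the `CONSTRAINTS` table and the `violations` helper.

```python
# Hypothetical constraints: at least one length value must be recorded,
# and no age value may exceed 300.
CONSTRAINTS = {
    "length": {"min_cardinality": 1},
    "age": {"max_value": 300},
}


def violations(instance: dict, constraints: dict = CONSTRAINTS) -> list:
    """List every constraint that the instance breaks.

    The instance maps each property name to a list of its values.
    """
    problems = []
    for prop, rules in constraints.items():
        values = instance.get(prop, [])
        minimum = rules.get("min_cardinality", 0)
        if len(values) < minimum:
            problems.append(f"{prop}: expected at least {minimum} value(s)")
        limit = rules.get("max_value")
        if limit is not None:
            problems.extend(
                f"{prop}: {value} exceeds {limit}"
                for value in values if value > limit
            )
    return problems


print(violations({"length": [45], "age": [7]}))  # []
print(violations({"age": [350]}))
```

The second call reports two problems: no length is recorded, and the age is above the maximum.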

Page 28: DM110 - Week 10 - Semantic Web / Web 3.0

An ontology example in RDF

<rdf:Description rdf:ID="Document">
  <rdf:type rdf:resource="http://www.w3.org/...#Class"/>
  <rdfs:subClassOf rdf:resource="http://www.w3.org/...#Resource"/>
</rdf:Description>

<rdf:Description rdf:ID="Book">
  <rdf:type rdf:resource="http://www.w3.org/...#Class"/>
  <rdfs:subClassOf rdf:resource="#Document"/>
</rdf:Description>
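One thing that software can do with these subclass statements is infer facts that were never written down, for example that a Book is also a Resource. The Python sketch below assumes each class has a single direct superclass, which holds for this example but not for ontologies in general; the `SUBCLASS_OF` table and both helpers are hypothetical names.

```python
# Direct subclass links from the example, written as child -> parent.
SUBCLASS_OF = {
    "Document": "Resource",
    "Book": "Document",
}


def ancestors(cls: str, links: dict = SUBCLASS_OF) -> list:
    """Walk the subClassOf chain upwards and list every superclass."""
    chain = []
    seen = {cls}
    while cls in links:
        cls = links[cls]
        if cls in seen:  # guard against accidental cycles in the data
            break
        seen.add(cls)
        chain.append(cls)
    return chain


def is_subclass(child: str, parent: str) -> bool:
    """True if parent is reachable from child via subClassOf links."""
    return parent in ancestors(child)


print(ancestors("Book"))                # ['Document', 'Resource']
print(is_subclass("Book", "Resource"))  # True
```

Only two links were stated, yet the third relationship (Book is a Resource) follows from them.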

Page 29: DM110 - Week 10 - Semantic Web / Web 3.0

Implementing or creating ontologies

• Implementation consists of defining all the ontology components through an ontology definition language
• Generally in two stages:
  – Informal stage:
    • Ontology is sketched out using either natural language descriptions or some diagram technique
  – Formal stage:
    • Ontology is encoded in a formal knowledge representation language that is machine computable
• Different tools (e.g., Protégé) may help in the implementation

Page 30: DM110 - Week 10 - Semantic Web / Web 3.0

Can already describe lots of things semantically

• Geographic coordinates:
  – GEO
• Library books:
  – Dublin Core (DC)
• Online discussions:
  – SIOC
• People, social networks:
  – Friend-of-a-Friend (FOAF)
• Maybe even hormones!
  – GeneOnt

Page 31: DM110 - Week 10 - Semantic Web / Web 3.0

The power of the Semantic Web

• Interoperability and increased connectivity are possible through a commonality of expression
• Vocabularies can be combined and used together:
  – e.g. a description of a book using Dublin Core metadata can be augmented with specifics about the book author using the Friend-of-a-Friend vocabulary
• Vocabularies can be easily extended (modules, etc.)
• Intelligent search with more granularity and relevance:
  – e.g. a search can be personalised to an individual by making use of their identity and relationship information
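A small Python sketch of combining vocabularies: a book is described with Dublin Core properties, its author and a reader with FOAF properties, and a search is then personalised using the reader's relationships. The two namespace URIs are the published ones for Dublin Core elements and FOAF; every example.org resource, the titles and names, and the `titles_by_friends` helper are made up for illustration.

```python
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical resources used only for this illustration.
BOOK = "http://example.org/books/1"
AUTHOR = "http://example.org/people/alice"
READER = "http://example.org/people/bob"

TRIPLES = [
    (BOOK, DC + "title", "An Example Book"),   # Dublin Core describes the book
    (BOOK, DC + "creator", AUTHOR),
    (AUTHOR, FOAF + "name", "Alice"),          # FOAF describes the people
    (READER, FOAF + "name", "Bob"),
    (READER, FOAF + "knows", AUTHOR),
]


def values(subject: str, predicate: str) -> list:
    """Objects of all triples with the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def titles_by_friends(reader: str) -> list:
    """Titles of books whose creator is someone the reader knows."""
    titles = []
    for friend in values(reader, FOAF + "knows"):
        for s, p, o in TRIPLES:
            if p == DC + "creator" and o == friend:
                titles.extend(values(s, DC + "title"))
    return titles


print(titles_by_friends(READER))  # ['An Example Book']
```

The query crosses from one vocabulary to the other without any special mapping, because both describe resources in the same triple model.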

Page 32: DM110 - Week 10 - Semantic Web / Web 3.0

The challenge for the Semantic Web

• The Semantic Web can’t work all by itself:
  – If it did it would be called the “Magic Web”
  – It will need some help to become a reality
• For example, it is not very likely that you will be able to sell your car just by putting your RDF file on the Web
• Need society-scale applications:
  – Consumers and processors of Semantic Web data
  – Semantic Web agents or services
  – More advanced collaborative applications that make real use of shared data and annotations

Page 33: DM110 - Week 10 - Semantic Web / Web 3.0

The path to Web 3.0

• The Semantic Web effort is mainly towards producing standards and recommendations that will interlink applications
• The Web 2.0 meme (already discussed) is about providing user applications
• Not mutually exclusive:
  – http://www.oreillynet.com/xml/blog/2005/10/is_web_20_killing_the_semantic.html
  – With a little effort, many Web 2.0 applications can and do use Semantic Web technologies to great benefit

Page 34: DM110 - Week 10 - Semantic Web / Web 3.0

Semantic Web + Web 2.0 = Web 3.0

• Web 2.0 applications such as blogging and wikis have become very popular and at the same time have created an interconnected information space (through the “blogosphere” and inter-wiki links)
• At the same time, these applications are experiencing boundaries in terms of information dissemination and automation, as they require increased levels of automation (i.e. more automated ways for information distribution)
• The Semantic Web is increasingly aiming at these application areas:
  – Semantic Wikis, Semantic Desktops, etc.