semantic web - เว็บไซต์คณะเทคโนโลยี ... · 2020-02-06 ·...

Semantic web. เสถียร หันตา, School of Information and Communication Technology, University of Phayao


Post on 06-Aug-2020


Page 1: Semantic web - เว็บไซต์คณะเทคโนโลยี ... · 2020-02-06 · Results are single Web pages ... The Key Problem of Today’s Web The meaning of

Semantic web

เสถียร หันตา

School of Information and Communication Technology

University of Phayao

Today's Web

Most of today's Web content is suitable for human consumption

◦ Even Web content that is generated automatically from databases is usually presented without the original structural information found in databases

Typical uses of the Web today:

◦ seeking and making use of information

◦ searching for and getting in touch with other people

◦ reviewing catalogs of online stores and ordering products by filling out forms

➢ Components of the Web: even content generated automatically from a database is presented without showing the database's structure

➢ Using the Web: searching for and making use of information by filling out the forms that Web pages provide

Searching for information

Information has to be sought across many websites, in many formats, for many purposes, and in many languages

The information must be gathered and combined to achieve the goal

This is a long and tedious process

Websites work in isolation

Each website operates separately and requires its own data entry, even when the data is the same

Information is hard to access

Keyword-Based Search Engines

Current Web activities are not particularly well supported by software tools

◦ Except for keyword-based search engines (e.g. Google, AltaVista, Yahoo)


Problems of Keyword-Based Search Engines

High recall, low precision.

Low or no recall

Results are highly sensitive to vocabulary

Results are single Web pages

Human involvement is necessary to interpret and combine results

Results of Web searches are not readily accessible by other software tools

The Key Problem of Today's Web

The meaning of Web content is not machine-accessible: lack of semantics

The Semantic Web Approach

Represent Web content in a form that is more easily machine-processable.

Use intelligent techniques to take advantage of these representations.

The Semantic Web will gradually evolve out of the existing Web; it is not a competitor to the current WWW

The Semantic Web Impact

Knowledge Management

B2C

B2B

Wikis


The Semantic Web Impact – Knowledge Management

Knowledge management concerns itself with acquiring, accessing, and maintaining knowledge within an organization

Key activity of large businesses: internal knowledge as an intellectual asset

It is particularly important for international, geographically dispersed organizations

Most information is currently available in a weakly structured form (e.g. text, audio, video)

Limitations of Current Knowledge Management Technologies

Searching information

◦ Keyword-based search engines

Extracting information

◦ Human involvement necessary for browsing, retrieving, interpreting, combining

Maintaining information

◦ Inconsistencies in terminology, outdated information

Viewing information

◦ Impossible to define views on Web knowledge

Semantic Web Enabled Knowledge Management

Knowledge will be organized in conceptual spaces according to its meaning.

Automated tools for maintenance and knowledge discovery

Semantic query answering

Query answering over several documents

Defining who may view certain parts of information (even parts of documents) will be possible.


The Semantic Web Impact – B2C Electronic Commerce

A typical scenario: a user visits one or several online shops, browses their offers, selects and orders products.

Ideally, a person would visit all, or all major, online stores, but this is too time-consuming

Shopbots are a useful tool

Limitations of Shopbots

They rely on wrappers: extensive programming required

Wrappers need to be reprogrammed when an online store changes its layout

Wrappers extract information based on textual analysis

◦ Error-prone

◦ Limited information extracted

Semantic Web Enabled B2C Electronic Commerce

Software agents that can interpret the product information and the terms of service.

◦ Pricing and product information, delivery and privacy policies will be interpreted and compared to the user requirements.

Information about the reputation of shops

Sophisticated shopping agents will be able to conduct automated negotiations


The Semantic Web Impact – B2B Electronic Commerce

Greatest economic promise

Currently relies mostly on EDI

◦ Isolated technology, understood only by experts

◦ Difficult to program and maintain, error-prone

◦ Each B2B communication requires separate programming

The Web appears to be the perfect infrastructure

◦ But B2B is not well supported by Web standards

Electronic Data Interchange (EDI)


Semantic Web Enabled B2B Electronic Commerce

Businesses enter partnerships without much overhead

Differences in terminology will be resolved using standard abstract domain models

Data will be interchanged using translation services.

Auctioning, negotiations, and drafting contracts will be carried out automatically (or semi-automatically) by software agents


Wikis

Collections of web pages that allow users to add content via a browser interface

Wiki systems allow for collaborative knowledge creation

Users are free to add and change information without ownership of content, access restrictions, or rigid workflows

Some Uses of Wikis

Development of bodies of knowledge in a community effort, with contributions from a wide range of users (e.g. Wikipedia)

Knowledge management of an activity or a project (e.g. brainstorming and exchanging ideas, coordinating activities, exchanging records of meetings)

Semantic Web Enabled Wikis

The inherent structure of a wiki, given by the linking between pages, becomes accessible to machines beyond mere navigation

Structured text and untyped hyperlinks are enriched by semantic annotations referring to an underlying model of the knowledge captured by the wiki

◦ e.g. a hyperlink from Knossos to Heraklion could be annotated with the information "is located in"

This information could then be used for context-specific presentations of pages, advanced querying, and consistency verification

References

Grigoris Antoniou and Frank van Harmelen. A Semantic Web Primer. The MIT Press. 2003

https://www.edibasics.com/

https://www.w3.org/2009/Talks/1030-Philadelphia-IH/Tutorial.ppt

Semantic Web Technologies

Explicit Metadata

Ontologies

Logic and Inference

Agents


On HTML

Web content is currently formatted for human readers rather than programs

HTML is the predominant language in which Web pages are written (directly or using tools)

Its vocabulary describes presentation

Today's Web content is formatted for people to read, not for computers, and most of it is written in HTML

An HTML Example

<h1>Agilitas Physiotherapy Centre</h1>

Welcome to the home page of the Agilitas Physiotherapy Centre. Do

you feel pain? Have you had an injury? Let our staff Lisa Davenport,

Kelly Townsend (our lovely secretary) and Steve Matthews take care

of your body and soul.

<h2>Consultation hours</h2>

Mon 11am - 7pm<br>

Tue 11am - 7pm<br>

Wed 3pm - 7pm<br>

Thu 11am - 7pm<br>

Fri 11am - 3pm<p>

But note that we do not offer consultation during the weeks of the

<a href=". . .">State Of Origin</a> games.

Problems with HTML

Humans have no problem with this

Machines (software agents) do:

◦ How to distinguish therapists from the secretary

◦ How to determine exact consultation hours

◦ They would have to follow the link to the State Of Origin games to find when they take place

A person reading the HTML example can understand it, but a computer reading it cannot interpret it

A Better Representation

<company>

<treatmentOffered>Physiotherapy</treatmentOffered>

<companyName>Agilitas Physiotherapy Centre</companyName>

<staff>

<therapist>Lisa Davenport</therapist>

<therapist>Steve Matthews</therapist>

<secretary>Kelly Townsend</secretary>

</staff>

</company>
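To illustrate why this representation is easier for software, here is a minimal sketch (not part of the original slides) that uses Python's standard `xml.etree.ElementTree` to answer the question the HTML version could not: who are the therapists and who is the secretary. The variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# The XML from the slide, embedded as a string for a self-contained example.
COMPANY_XML = """
<company>
  <treatmentOffered>Physiotherapy</treatmentOffered>
  <companyName>Agilitas Physiotherapy Centre</companyName>
  <staff>
    <therapist>Lisa Davenport</therapist>
    <therapist>Steve Matthews</therapist>
    <secretary>Kelly Townsend</secretary>
  </staff>
</company>
""".strip()

root = ET.fromstring(COMPANY_XML)

# The tag names carry the roles, so no text analysis is needed.
therapists = [t.text for t in root.findall("./staff/therapist")]
secretaries = [s.text for s in root.findall("./staff/secretary")]

print("Therapists:", therapists)
print("Secretaries:", secretaries)
```

The roles come from the element names alone, which is the point of the slide: the program never has to guess from phrases such as "our lovely secretary".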

Explicit Metadata

This representation is far more easily processable by machines

Metadata: data about data

◦ Metadata capture part of the meaning of data

The Semantic Web does not rely on text-based manipulation, but rather on machine-processable metadata

The Semantic Web does not depend on manipulating text; it processes metadata instead

Ontologies

The term ontology originates from philosophy

The study of the nature of existence

It has a different meaning in computer science

An ontology is an explicit and formal specification of a conceptualization

An ontology is a specific, clear, and formal specification of a conceptualization

Typical Components of Ontologies

Terms denote important concepts (classes of objects) of the domain

◦ e.g. professors, staff, students, courses, departments

Relationships between these terms: typically class hierarchies

◦ a class C is a subclass of another class A if every object in C is also included in A

◦ e.g. all professors are staff members

Further Components of Ontologies

Properties

◦ e.g. X teaches Y

Value restrictions

◦ e.g. only faculty members can teach courses

Disjointness statements

◦ e.g. faculty and general staff are disjoint

Logical relationships between objects

◦ e.g. every department must include at least 10 faculty
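The components listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the class names, the `SUBCLASS_OF` and `DISJOINT` tables, and the helper functions are all hypothetical, and a real ontology would be written in a language such as RDF Schema or OWL rather than in code.

```python
# Hypothetical mini-ontology: each class maps to its direct superclass.
SUBCLASS_OF = {
    "professor": "faculty",
    "faculty": "staff",
    "general_staff": "staff",
}

# Disjointness statement: faculty and general staff share no members.
DISJOINT = {frozenset({"faculty", "general_staff"})}


def ancestors(cls):
    """Return cls followed by every superclass reachable from it."""
    chain = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_a(cls, other):
    """Class hierarchy check: is every member of cls also a member of other?"""
    return other in ancestors(cls)


def can_teach(cls):
    """Value restriction: only faculty members can teach courses."""
    return is_a(cls, "faculty")


def violates_disjointness(classes):
    """True if one individual would belong to two disjoint classes."""
    expanded = set()
    for c in classes:
        expanded.update(ancestors(c))
    return any(pair <= expanded for pair in DISJOINT)


print(is_a("professor", "staff"))
print(can_teach("general_staff"))
print(violates_disjointness({"professor", "general_staff"}))
```

The cardinality example ("at least 10 faculty") is left out to keep the sketch short; it would be a count over the individuals assigned to a department.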


Example of a Class Hierarchy

The Role of Ontologies on the Web

Ontologies provide a shared understanding of a domain: semantic interoperability

◦ overcome differences in terminology

◦ mappings between ontologies

Ontologies are useful for the organization and navigation of Web sites

Ontologies spread a shared understanding of a domain

The Role of Ontologies in Web Search

Ontologies are useful for improving the accuracy of Web searches

◦ search engines can look for pages that refer to a precise concept in an ontology

Web searches can exploit generalization/specialization information

◦ If a query fails to find any relevant documents, the search engine may suggest a more general query to the user.

◦ If too many answers are retrieved, the search engine may suggest some specializations to the user.

Ontologies improve the accuracy of Web search by using generalization/specialization
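The generalization/specialization idea can be sketched as follows. Everything here is an assumption made for illustration: the `BROADER` concept hierarchy, the `suggest` function, and the `too_many` threshold are hypothetical and do not describe how any real search engine works.

```python
# Hypothetical concept hierarchy: each concept maps to its more general concept.
BROADER = {
    "physiotherapist": "therapist",
    "masseur": "therapist",
    "therapist": "health professional",
}


def suggest(concept, hit_count, too_many=100):
    """Suggest alternative query concepts based on the number of hits."""
    if hit_count == 0:
        # No results: propose the more general concept, if one is known.
        parent = BROADER.get(concept)
        return [parent] if parent else []
    if hit_count > too_many:
        # Too many results: propose the known specializations.
        return sorted(c for c, p in BROADER.items() if p == concept)
    return []


print(suggest("physiotherapist", 0))
print(suggest("therapist", 500))
```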


Web Ontology Languages

RDF Schema

RDF is a data model for objects and relations between them

RDF Schema is a vocabulary description language

Describes properties and classes of RDF resources

Provides semantics for generalization hierarchies of properties and classes

RDF is a data model of objects and the relationships between them

Web Ontology Languages (2)

OWL

A richer ontology language

relations between classes

◦ e.g. disjointness

cardinality

◦ e.g. "exactly one"

richer typing of properties

characteristics of properties (e.g. symmetry)

Logic and Inference

Logic is the discipline that studies the principles of reasoning

Formal languages for expressing knowledge

Well-understood formal semantics

◦ Declarative knowledge: we describe what holds without caring about how it can be deduced

Automated reasoners can deduce (infer) conclusions from the given knowledge

Logic is the discipline that studies the principles of cause and effect

An Inference Example

prof(X) → faculty(X)

faculty(X) → staff(X)

prof(michael)

We can deduce the following conclusions:

faculty(michael)

staff(michael)

prof(X) → staff(X)
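The deduction above can be reproduced with a small forward-chaining loop. This is a minimal sketch that handles only single-condition rules over one-argument predicates, which is all the example needs; the names `RULES`, `FACTS`, and `forward_chain` are illustrative and not taken from any reasoning library.

```python
# Each rule (body, head) reads: body(X) -> head(X).
RULES = [("prof", "faculty"), ("faculty", "staff")]

# Each fact (predicate, individual) reads: predicate(individual).
FACTS = {("prof", "michael")}


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            for pred, arg in list(derived):
                if pred == body and (head, arg) not in derived:
                    derived.add((head, arg))
                    changed = True
    return derived


for pred, arg in sorted(forward_chain(FACTS, RULES)):
    print(f"{pred}({arg})")
```

The derived rule prof(X) → staff(X) corresponds to chaining the two rules: any individual that satisfies `prof` ends up with `staff` as well.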

Logic versus Ontologies

The previous example involves knowledge typically found in ontologies

◦ Logic can be used to uncover ontological knowledge that is implicitly given

◦ It can also help uncover unexpected relationships and inconsistencies

Logic is more general than ontologies

◦ It can also be used by intelligent agents for making decisions and selecting courses of action

Logic can be used to uncover ontological knowledge, and it also reveals relationships that were not stated explicitly


Tradeoff between Expressive Power and Computational Complexity

The more expressive a logic is, the more computationally expensive it becomes to draw conclusions

◦ Drawing certain conclusions may become impossible if non-computability barriers are encountered.

Our previous examples involved rules “If conditions, then conclusion,” and only finitely many objects

◦ This subset of logic is tractable and is supported by efficient reasoning tools

The more expressive a logic is, the costlier it becomes to draw conclusions, and drawing conclusions becomes impossible once a non-computable problem is encountered


Inference and Explanations

Explanations: the series of inference steps can be retraced

They increase users’ confidence in Semantic Web agents: “Oh yeah?” button

Activities between agents: create or validate proofs

You press it when you lose that feeling of trust. It says to the Web, "So how do I know I can trust this information?" The software then goes directly or indirectly back to metainformation about the document, which suggests a number of reasons.

Typical Explanation Procedure

Facts will typically be traced to some Web addresses

◦ The trust of the Web address will be verifiable by agents

Rules may be a part of a shared commerce ontology or the policy of the online shop

Software Agents

Software agents work autonomously and proactively

◦ They evolved out of object-oriented and component-based programming

A personal agent on the Semantic Web will:

◦ receive some tasks and preferences from the person

◦ seek information from Web sources, communicate with other agents

◦ compare information about user requirements and preferences, make certain choices

◦ give answers to the user


Intelligent Personal Agents

Semantic Web Agent Technologies

Metadata

◦ Identify and extract information from Web sources

Ontologies

◦ Web searches, interpret retrieved information

◦ Communicate with other agents

Logic

◦ Process retrieved information, draw conclusions

Semantic Web Agent Technologies (2)

Further technologies (orthogonal to the Semantic Web technologies)

◦ Agent communication languages

◦ Formal representation of beliefs, desires, and intentions of agents

◦ Creation and maintenance of user models.

A Layered Approach

The development of the Semantic Web proceeds in steps

◦ Each step building a layer on top of another

Principles:

Downward compatibility

Upward partial understanding


The Semantic Web Layer Tower

An Alternative Layer Stack

Takes recent developments into account

The main differences are:

−The ontology layer is instantiated with two alternatives: the current standard Web ontology language, OWL, and a rule-based language

−DLP is the intersection of OWL and Horn logic, and serves as a common foundation

The Semantic Web Architecture is currently being debated and may be subject to refinements and modifications in the future.

Chapter 1 A SEMANTIC WEB PRIMER 49


Alternative Semantic Web Stack

Semantic Web Layers

XML layer

◦ Syntactic basis

RDF layer

◦ RDF: basic data model for facts

◦ RDF Schema: simple ontology language

Ontology layer

◦ More expressive languages than RDF Schema

◦ Current Web standard: OWL

Semantic Web Layers (2)

Logic layer

◦ Enhance ontology languages further

◦ Application-specific declarative knowledge

Proof layer

◦ Proof generation, exchange, validation

Trust layer

◦ Digital signatures

◦ Recommendations, rating agencies, ...

Data integration

1. Map the various data onto an abstract data representation

2. Merge the resulting representations

3. Query the data as a whole

A simplified bookstore data (dataset "A")

Books

ID | Author | Title | Publisher | Year
ISBN0-00-651409-X | id_xyz | The Glass Palace | id_qpr | 2000

Authors

ID | Name | Home Page
id_xyz | Ghosh, Amitav | http://www.amitavghosh.com

Publishers

ID | Publ. Name | City
id_qpr | Harper Collins | London


1st: export data as a set of relations

Another bookstore data (dataset "F")

Row | A | B | D | E
1 | ID | Titre | Traducteur | Original
2 | ISBN0 2020386682 | Le Palais des miroirs | A13 | ISBN-0-00-651409-X
6 | ID | Auteur | |
7 | ISBN-0-00-651409-X | A12 | |
11 | Nom | | |
12 | Ghosh, Amitav | | |
13 | Besse, Christianne | | |

The same kind of data, but in a different language


2nd: export second set of data


3rd: start merging data


3rd: start merging your data (cont.)


3rd: merge identical resources

Start making queries…

• User of data "F" can now ask queries like:

  • "give me the title of the original"

  • well, … « donnes-moi le titre de l'original »

• This information is not in the dataset "F"…

• …but can be retrieved by merging with dataset "A"!

However, more can be achieved…

• We "feel" that a:author and f:auteur should be the same

• But an automatic merge does not know that!

• Let us add some extra information to the merged data:

  • a:author is the same as f:auteur

  • both identify a "Person"

  • "Person" is a term that a community may have already defined:

    • a "Person" is uniquely identified by his/her name and, say, homepage

    • it can be used as a "category" for certain types of resources


3rd revisited: use the extra knowledge

Start making richer queries!

• User of dataset "F" can now query:

  • "donnes-moi la page d'accueil de l'auteur de l'originale"

  • well… "give me the home page of the original's 'auteur'"

• Neither dataset "F" nor dataset "A" can answer this on its own…

• …but the answer was made available by:

  • merging dataset "A" and dataset "F"

  • adding three simple extra statements as an extra "glue"
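The merge-and-query steps can be sketched with plain Python sets of (subject, predicate, object) triples. This is a simplified illustration, not the method used in the original tutorial: the identifiers (`book:…`, `a:id_xyz`, `f:A12`) are hypothetical stand-ins for the URIs in the figures, and the "glue" is reduced to a single assumed mapping saying that f:auteur is the same as a:author.

```python
# Dataset "A" as triples (hypothetical identifiers).
A = {
    ("book:0-00-651409-X", "a:title", "The Glass Palace"),
    ("book:0-00-651409-X", "a:year", "2000"),
    ("book:0-00-651409-X", "a:author", "a:id_xyz"),
    ("a:id_xyz", "a:name", "Ghosh, Amitav"),
    ("a:id_xyz", "a:homepage", "http://www.amitavghosh.com"),
}

# Dataset "F" as triples; it refers to the same original book.
F = {
    ("book:2020386682", "f:titre", "Le Palais des miroirs"),
    ("book:2020386682", "f:original", "book:0-00-651409-X"),
    ("book:0-00-651409-X", "f:auteur", "f:A12"),
    ("f:A12", "f:nom", "Ghosh, Amitav"),
}

# The extra "glue": f:auteur is the same property as a:author.
SAME_AS = {"f:auteur": "a:author"}


def objects(graph, subject, predicate, same_as=None):
    """Return all objects of (subject, predicate), honoring property equivalences."""
    same_as = same_as or {}
    preds = {predicate, same_as.get(predicate, predicate)}
    return {o for s, p, o in graph if s == subject and p in preds}


# Merging is a set union because both datasets use the same book identifier.
merged = A | F

(original,) = objects(merged, "book:2020386682", "f:original")
title = objects(merged, original, "a:title")

# Without the glue, the home page is unreachable from f:auteur.
without_glue = {
    h
    for person in objects(merged, original, "f:auteur")
    for h in objects(merged, person, "a:homepage")
}
with_glue = {
    h
    for person in objects(merged, original, "f:auteur", SAME_AS)
    for h in objects(merged, person, "a:homepage")
}

print(title, without_glue, with_glue)
```

A real system would use RDF tooling and a query language instead of set comprehensions; the sketch only shows why the merge works (shared identifiers) and what the glue adds (a path from f:auteur to the home page).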

Combine with different datasets

• Using, e.g., the "Person", the dataset can be combined with other sources

• For example, data in Wikipedia can be extracted using dedicated tools

  • e.g., the "dbpedia" project can extract the "infobox" information from Wikipedia already…


Merge with Wikipedia data


References

Grigoris Antoniou and Frank van Harmelen. A Semantic Web Primer. The MIT Press. 2003

Toby Segaran, Colin Evans, and Jamie Taylor. Programming the Semantic Web. O'Reilly Media Inc. 2009

John Hebeler, Matthew Fisher, Ryan Blace, Andrew Perez-Lopez. Semantic Web Programming. Wiley Publishing Inc. 2009

https://www.w3.org/2009/Talks/1030-Philadelphia-IH/Tutorial.ppt


XML

An HTML Example

<h2>Nonmonotonic Reasoning: Context-Dependent Reasoning</h2>

<i>by <b>V. Marek</b> and

<b>M. Truszczynski</b></i><br>

Springer 1993<br>

ISBN 0387976892

The Same Example in XML

<book>

<title>Nonmonotonic Reasoning: Context-Dependent Reasoning</title>

<author>V. Marek</author>

<author>M. Truszczynski</author>

<publisher>Springer</publisher>

<year>1993</year>

<ISBN>0387976892</ISBN>

</book>

HTML versus XML: Similarities

Both use tags (e.g. <h2> and </year>)

Tags may be nested (tags within tags)

Human users can read and interpret both HTML and XML representations quite easily

… But how about machines?

Problems with Automated Interpretation of HTML Documents

An intelligent agent trying to retrieve the names of the authors of the book faces these problems:

◦ Authors' names could appear immediately after the title, or immediately after the word "by"

◦ Are there two authors?

◦ Or just one, called "V. Marek and M. Truszczynski"?

HTML vs XML: Structural Information

HTML documents do not contain structural information: pieces of the document and their relationships.

XML is more easily accessible to machines because

◦ Every piece of information is described.

◦ Relations are also defined through the nesting structure.

◦ E.g., the <author> tags appear within the <book> tags, so they describe properties of the particular book.

HTML vs XML: Structural Information (2)

A machine processing the XML document would be able to deduce that the author element refers to the enclosing book element

◦ from the nesting structure, rather than from proximity considerations

XML allows the definition of constraints on values

◦ E.g. a year must be a number of four digits
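Both points can be sketched with Python's standard library. The four-digit check below is written by hand purely for illustration; in practice such constraints would normally be declared in a schema language and enforced by a validating parser. The function name `year_is_valid` is an assumption of this sketch.

```python
import re
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>Nonmonotonic Reasoning: Context-Dependent Reasoning</title>
  <author>V. Marek</author>
  <author>M. Truszczynski</author>
  <publisher>Springer</publisher>
  <year>1993</year>
  <ISBN>0387976892</ISBN>
</book>"""


def year_is_valid(text):
    """Constraint from the slide: a year must be a number of four digits."""
    return re.fullmatch(r"[0-9]{4}", text or "") is not None


book = ET.fromstring(BOOK_XML)

# Nesting, not proximity: every <author> child belongs to this <book>.
authors = [child.text for child in book if child.tag == "author"]
year_ok = year_is_valid(book.findtext("year"))

print(authors)
print(year_ok)
```

Because each author has its own element, the question "two authors or one?" from the HTML version no longer arises.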

HTML vs XML: Formatting

The HTML representation provides more than the XML representation:

◦ The formatting of the document is also described

The main use of an HTML document is to display information: it must define formatting

XML: separation of content from display

◦ The same information can be displayed in different ways

The XML Language

An XML document consists of

a prolog

a number of elements

an optional epilog (not discussed)

<?xml version="1.0"?>

<!-- Identifier Card -->

<!DOCTYPE IdCards SYSTEM "idcards.dtd">

<IdCards>

<IdCard>

<IdNumber>1234567890</IdNumber>

<Name title = "Mr." sex = "Male">

<FirstName>John</FirstName>

<LastName>Red</LastName>

</Name>

<DateOfBirth>

<Date>1</Date>

<Month>January</Month>

<Year>1900</Year>

</DateOfBirth>

</IdCard>…</IdCards>

Labels in the figure: XML declaration, comment, document elements, root element, element, start tag, end tag, attribute, textual content, prolog

Prolog of an XML Document

The prolog consists of

an XML declaration and

an optional reference to external structuring documents

<?xml version="1.0" encoding="UTF-16"?>

<!DOCTYPE book SYSTEM "book.dtd">

XML Elements

The “things” the XML document talks about

◦ E.g. books, authors, publishers

An element consists of:

◦ an opening tag

◦ the content

◦ a closing tag

<lecturer>David Billington</lecturer>

XML Elements (2)

Tag names can be chosen almost freely.

The first character must be a letter, an underscore, or a colon

No name may begin with the string “xml” in any combination of cases

◦ E.g. “Xml”, “xML”

Content of XML Elements

Content may be text, or other elements, or nothing

<lecturer>

<name>David Billington</name>

<phone> +61 − 7 − 3875 507 </phone>

</lecturer>

If there is no content, then the element is called empty; it is abbreviated as follows:

<lecturer/> for <lecturer></lecturer>
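A small sketch, assuming Python's standard xml.etree.ElementTree parser, suggests that the abbreviated and the spelled-out form of an empty element are indistinguishable once parsed. The helper name describe is an illustrative assumption.

```python
import xml.etree.ElementTree as ET

abbreviated = ET.fromstring("<lecturer/>")
spelled_out = ET.fromstring("<lecturer></lecturer>")

def describe(element):
    """Return (tag, text, number of child elements) for a parsed element."""
    return (element.tag, element.text, len(element))

print(describe(abbreviated))
print(describe(spelled_out))
print(describe(abbreviated) == describe(spelled_out))
```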

XML Attributes

An empty element is not necessarily meaningless

◦ It may have some properties in terms of attributes

An attribute is a name-value pair inside the opening tag of an element

<lecturer name="David Billington" phone="+61 − 7 − 3875 507"/>

XML Attributes: An Example

<order orderNo="23456" customer="John Smith"

date="October 15, 2002">

<item itemNo="a528" quantity="1"/>

<item itemNo="c817" quantity="3"/>

</order>

The Same Example without Attributes

<order>

<orderNo>23456</orderNo>

<customer>John Smith</customer>

<date>October 15, 2002</date>

<item>

<itemNo>a528</itemNo>

<quantity>1</quantity>

</item>

<item>

<itemNo>c817</itemNo>

<quantity>3</quantity>

</item>

</order>

XML Elements vs Attributes

Attributes can be replaced by elements

When to use elements and when attributes is a matter of taste

But attributes cannot be nested
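The replacement of attributes by elements can be sketched mechanically. The helper below, attributes_to_elements, is a hypothetical function written for this illustration with Python's standard xml.etree.ElementTree module; it rebuilds the order example so that every attribute becomes a child element.

```python
import xml.etree.ElementTree as ET

ORDER_WITH_ATTRIBUTES = """<order orderNo="23456" customer="John Smith"
       date="October 15, 2002">
  <item itemNo="a528" quantity="1"/>
  <item itemNo="c817" quantity="3"/>
</order>"""

def attributes_to_elements(element):
    """Copy an element, turning each attribute into a child element.

    Simplification of this sketch: text content of the source element
    is not copied, only attributes and child elements.
    """
    converted = ET.Element(element.tag)
    for name, value in element.attrib.items():
        ET.SubElement(converted, name).text = value
    for child in element:
        converted.append(attributes_to_elements(child))
    return converted

order = attributes_to_elements(ET.fromstring(ORDER_WITH_ATTRIBUTES))
print(ET.tostring(order, encoding="unicode"))
```

The reverse direction is not always possible: an element with nested children has no attribute equivalent, which is the point of the last bullet above.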

Further Components of XML Docs

Comments

◦ A piece of text that is to be ignored by the parser

◦ <!-- This is a comment -->

Processing Instructions (PIs)

◦ Define procedural attachments

◦ <?stylesheet type="text/css" href="mystyle.css"?>

Well-Formed XML Documents

Syntactically correct documents

Some syntactic rules:

◦ Only one outermost element (called root element)

◦ Each element contains an opening and a corresponding closing tag

◦ Tags may not overlap

◦ E.g. this is not well-formed: <author><name>Lee Hong</author></name>

◦ Attributes within an element have unique names

◦ Element and tag names must be permissible

Well-Formed XML

1. Each XML document has exactly one root element; the root element contains the child elements.

2. Every element always has an opening tag and a closing tag, and a child element must be closed before its parent element (proper nesting).

3. Attribute values must always be enclosed in quotation marks ("").

4. Lowercase and uppercase letters are not the same character (names are case sensitive).

5. An element name may start with any permitted letter, an underscore, or a colon. However, underscores and colons are not recommended because they may confuse readers of the document.

6. The subsequent characters of an element name may be letters, digits, underscores, hyphens, colons, or periods, as well as other characters outside the ASCII range.
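The rules above can be checked with any XML parser, because a conforming parser must reject documents that are not well-formed. The sketch below assumes Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

checks = {
    "properly nested": "<author><name>Lee Hong</name></author>",
    "overlapping tags": "<author><name>Lee Hong</author></name>",
    "two root elements": "<a/><b/>",
    "duplicate attribute": '<a x="1" x="2"/>',
    "unquoted attribute value": "<a x=1/>",
    "case mismatch in tags": "<Book></book>",
}
for label, document in checks.items():
    print(f"{label}: {is_well_formed(document)}")
```

Only the first document passes; each of the others violates one of the listed rules.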

The Tree Model of XML Documents: An Example

<email>

<head>

<from name="Michael Maher"

address="[email protected]"/>

<to name="Grigoris Antoniou"

address="[email protected]"/>

<subject>Where is your draft?</subject>

</head>

<body>

Grigoris, where is the draft of the paper you promised me

last week?

</body>

</email>

The Tree Model of XML Documents: An Example (2)

The Tree Model of XML Docs

The tree representation of an XML document is an ordered labeled tree:

◦ There is exactly one root

◦ There are no cycles

◦ Each non-root node has exactly one parent

◦ Each node has a label.

◦ The order of elements is important

◦ … but the order of attributes is not important
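The tree properties listed above can be observed by walking a parsed document. This sketch uses Python's standard xml.etree.ElementTree module on the email example; the two addresses are placeholder assumptions, since the originals are not legible in the slides, and walk is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumption: the addresses below are placeholders, not the slide's values.
EMAIL_XML = """<email>
  <head>
    <from name="Michael Maher" address="sender@example.org"/>
    <to name="Grigoris Antoniou" address="recipient@example.org"/>
    <subject>Where is your draft?</subject>
  </head>
  <body>Grigoris, where is the draft of the paper you promised me last week?</body>
</email>"""

def walk(element, depth=0):
    """Yield (depth, label) pairs depth-first, children in document order."""
    yield depth, element.tag
    for child in element:
        yield from walk(child, depth + 1)

root = ET.fromstring(EMAIL_XML)
outline = list(walk(root))
for depth, label in outline:
    print("  " * depth + label)

# Attribute order carries no meaning: both spellings give the same mapping.
same_attributes = (
    ET.fromstring('<to name="A" address="B"/>').attrib
    == ET.fromstring('<to address="B" name="A"/>').attrib
)
print(same_attributes)
```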

XML Structuring

Structuring

a) DTDs

b) XML Schema

Structuring XML Documents

Define all the element and attribute names that may be used

Define the structure

◦ what values an attribute may take

◦ which elements may or must occur within other elements, etc.

If such structuring information exists, the document can be validated

Structuring XML Documents (2)

An XML document is valid if

◦ it is well-formed

◦ it respects the structuring information it uses

There are two ways of defining the structure of XML documents:

◦ DTDs (the older and more restricted way)

◦ XML Schema (offers extended possibilities)

Internal DTD Declaration

<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>

External DTD Declaration

<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>

note.dtd:

<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

DTD: Element Type Definition

<lecturer>

<name>David Billington</name>

<phone> +61 − 7 − 3875 507 </phone>

</lecturer>

DTD for above element (and all lecturer elements):

<!ELEMENT lecturer (name,phone)>

<!ELEMENT name (#PCDATA)>

<!ELEMENT phone (#PCDATA)>

The Meaning of the DTD

The element types lecturer, name, and phone may be used in the document

A lecturer element contains a name element and a phone element, in that order (sequence)

A name element and a phone element may contain any character data (text), but no child elements

In DTDs, #PCDATA is the only atomic type for elements

DTD: Disjunction in Element Type Definitions

We express that a lecturer element contains either a name element or a phone element as follows:

<!ELEMENT lecturer (name|phone)>

A lecturer element contains a name element and a phone element in any order.

<!ELEMENT lecturer ((name,phone)|(phone,name))>

Example of an XML Element

<order orderNo="23456"

customer="John Smith"

date="October 15, 2002">

<item itemNo="a528" quantity="1"/>

<item itemNo="c817" quantity="3"/>

</order>

The Corresponding DTD

<!ELEMENT order (item+)>

<!ATTLIST order orderNo ID #REQUIRED

customer CDATA #REQUIRED

date CDATA #REQUIRED>

<!ELEMENT item EMPTY>

<!ATTLIST item itemNo ID #REQUIRED

quantity CDATA #REQUIRED

comments CDATA #IMPLIED>

Comments on the DTD

The item element type is defined to be empty

+ (after item) is a cardinality operator:

◦ ?: appears zero times or once

◦ *: appears zero or more times

◦ +: appears one or more times

◦ No cardinality operator means exactly once
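DTD content models behave like regular expressions over the sequence of child element names. The sketch below makes that concrete with Python's re module. The translation table is written by hand for the content models shown on the earlier slides (an assumption of this sketch, not a general DTD compiler), and children_match is a hypothetical helper.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation: each child name is written with a trailing comma so that
# adjacent names cannot run together in the matched string.
CONTENT_MODELS = {
    "(item+)": r"(item,)+",
    "(name,phone)": r"name,phone,",
    "(name|phone)": r"(name,|phone,)",
    "((name,phone)|(phone,name))": r"(name,phone,|phone,name,)",
}

def children_match(element, dtd_model):
    """Check the element's child names against a translated content model."""
    child_names = "".join(child.tag + "," for child in element)
    return re.fullmatch(CONTENT_MODELS[dtd_model], child_names) is not None

order = ET.fromstring("<order><item/><item/></order>")
empty_order = ET.fromstring("<order/>")
swapped = ET.fromstring("<lecturer><phone/><name/></lecturer>")

print(children_match(order, "(item+)"))
print(children_match(empty_order, "(item+)"))
print(children_match(swapped, "(name,phone)"))
print(children_match(swapped, "((name,phone)|(phone,name))"))
```

The operators ?, * and + carry the same meaning in the regular expression as in the DTD, which is why the translation is nearly literal.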

Comments on the DTD (2)

In addition to defining elements, we define attributes

This is done in an attribute list containing:

◦ Name of the element type to which the list applies

◦ A list of triplets of attribute name, attribute type, and value type

Attribute name: A name that may be used in an XML document using a DTD

DTD: Attribute Types

Similar to predefined data types, but limited selection

The most important types are

◦ CDATA, a string (sequence of characters)

◦ ID, a name that is unique across the entire XML document

◦ IDREF, a reference to another element with an ID attribute carrying the same value as the IDREF attribute

◦ IDREFS, a series of IDREFs

◦ (v1| . . . |vn), an enumeration of all possible values

Limitations: no dates, number ranges etc.

DTD: Attribute Value Types

#REQUIRED

◦ Attribute must appear in every occurrence of the element type in the XML document

#IMPLIED

◦ The appearance of the attribute is optional

#FIXED "value"

◦ The attribute always has this value; if it is written in the document, it must carry exactly this value

"value"

◦ This specifies the default value for the attribute

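The four value types can be read as rules that a validating parser applies to each element's attributes. The following is a rough sketch in plain Python, not a real DTD validator. The rules for itemNo, quantity, and comments follow the item attribute list shown earlier, while the unit and currency entries are hypothetical additions that illustrate a default and a fixed value.

```python
ITEM_RULES = {
    "itemNo": ("REQUIRED", None),
    "quantity": ("REQUIRED", None),
    "comments": ("IMPLIED", None),
    "unit": ("DEFAULT", "piece"),   # hypothetical attribute with a default
    "currency": ("FIXED", "USD"),   # hypothetical attribute with a fixed value
}

def apply_attribute_rules(attributes, rules):
    """Return (attributes with defaults filled in, list of problems)."""
    completed = dict(attributes)
    problems = []
    for name, (kind, value) in rules.items():
        if kind == "REQUIRED" and name not in completed:
            problems.append(f"missing required attribute {name!r}")
        elif kind == "FIXED" and completed.setdefault(name, value) != value:
            problems.append(f"attribute {name!r} must be {value!r}")
        elif kind == "DEFAULT":
            completed.setdefault(name, value)
        # IMPLIED attributes are optional and receive no default.
    return completed, problems

print(apply_attribute_rules({"itemNo": "a528", "quantity": "1"}, ITEM_RULES))
print(apply_attribute_rules({"itemNo": "a528", "currency": "EUR"}, ITEM_RULES))
```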
Referencing with IDREF and IDREFS

<!ELEMENT family (person*)>

<!ELEMENT person (name)>

<!ELEMENT name (#PCDATA)>

<!ATTLIST person id ID #REQUIRED

mother IDREF #IMPLIED

father IDREF #IMPLIED

children IDREFS #IMPLIED>

An XML Document Respecting the DTD

<family>

<person id="bob" mother="mary" father="peter">

<name>Bob Marley</name>

</person>

<person id="bridget" mother="mary">

<name>Bridget Jones</name>

</person>

<person id="mary" children="bob bridget">

<name>Mary Poppins</name>

</person>

<person id="peter" children="bob">

<name>Peter Marley</name>

</person>

</family>
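What a validating parser checks for ID, IDREF, and IDREFS can be sketched by hand: every id must be unique, and every reference must point at an existing id. The snippet assumes Python's standard xml.etree.ElementTree module, which does not perform this validation itself; dangling_references is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

FAMILY_XML = """<family>
  <person id="bob" mother="mary" father="peter"><name>Bob Marley</name></person>
  <person id="bridget" mother="mary"><name>Bridget Jones</name></person>
  <person id="mary" children="bob bridget"><name>Mary Poppins</name></person>
  <person id="peter" children="bob"><name>Peter Marley</name></person>
</family>"""

def dangling_references(family):
    """Return (person id, missing reference) pairs; raise on duplicate IDs."""
    ids = [person.get("id") for person in family.iter("person")]
    if len(ids) != len(set(ids)):
        raise ValueError("ID values must be unique within the document")
    known = set(ids)
    dangling = []
    for person in family.iter("person"):
        # mother and father are IDREF; children is IDREFS (space separated).
        references = [person.get("mother"), person.get("father")]
        references += (person.get("children") or "").split()
        dangling += [
            (person.get("id"), ref)
            for ref in references
            if ref is not None and ref not in known
        ]
    return dangling

print(dangling_references(ET.fromstring(FAMILY_XML)))
```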

XML Entities

An XML entity can play the role of

◦ a placeholder for repeatable characters

◦ a section of external data

◦ a part of a declaration for elements

⚫ We can use the entity reference &thisyear; instead of the value "2007"

<!ENTITY thisyear "2007">
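As a brief sketch of entity expansion, the document below declares the entity in an internal DTD subset and references it in the content. It assumes Python's standard xml.etree.ElementTree parser, which expands internal entities declared this way; the report element is a hypothetical name chosen for the illustration.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE report [
  <!ENTITY thisyear "2007">
]>
<report>Copyright &thisyear;</report>"""

report = ET.fromstring(DOCUMENT)
print(report.text)
```

Note the trailing semicolon: an entity reference always has the form &name;.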

A DTD for an Email Element

<!ELEMENT email (head,body)>

<!ELEMENT head (from,to+,cc*,subject)>

<!ELEMENT from EMPTY>

<!ATTLIST from name CDATA #IMPLIED

address CDATA #REQUIRED>

<!ELEMENT to EMPTY>

<!ATTLIST to name CDATA #IMPLIED

address CDATA #REQUIRED>

A DTD for an Email Element (2)

<!ELEMENT cc EMPTY>

<!ATTLIST cc name CDATA #IMPLIED

address CDATA #REQUIRED>

<!ELEMENT subject (#PCDATA)>

<!ELEMENT body (text,attachment*)>

<!ELEMENT text (#PCDATA)>

<!ELEMENT attachment EMPTY>

<!ATTLIST attachment

encoding (mime|binhex) "mime"

file CDATA #REQUIRED>

Interesting Parts of the DTD

A head element contains (in that order):

◦ a from element

◦ at least one to element

◦ zero or more cc elements

◦ a subject element

In from, to, and cc elements

◦ the name attribute is not required

◦ the address attribute is always required

Interesting Parts of the DTD (2)

A body element contains

◦ a text element

◦ possibly followed by a number of attachment elements

The encoding attribute of an attachment element must have either the value “mime” or “binhex”

◦ “mime” is the default value

Remarks on DTDs

A DTD can be interpreted as an Extended Backus-Naur Form (EBNF)

◦ <!ELEMENT email (head,body)>

◦ is equivalent to email ::= head body

Recursive definitions possible in DTDs

◦ <!ELEMENT bintree ((bintree,root,bintree)|emptytree)>

Structuring

a) DTDs

b) XML Schema

XML Schema

Significantly richer language for defining the structure of XML documents

Its syntax is based on XML itself

◦ not necessary to write separate tools

Reuse and refinement of schemas

◦ Extend or restrict already existing schemas

Sophisticated set of data types, compared to DTDs (which only support strings)

XML Schema (2)

An XML schema is an element with an opening tag like

<schema xmlns="http://www.w3.org/2000/10/XMLSchema"

version="1.0">

Structure of schema elements

◦ Element and attribute types using data types

Element Types

<element name="email"/>

<element name="head" minOccurs="1" maxOccurs="1"/>

<element name="to" minOccurs="1"/>

Cardinality constraints:

minOccurs="x" (default value 1)

maxOccurs="x" (default value 1)

Generalizations of *,?,+ offered by DTDs
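To illustrate how minOccurs and maxOccurs generalize the DTD operators, the sketch below expresses each operator as a (minimum, maximum) pair and checks an occurrence count against it. This is plain Python written for this illustration; None stands in for maxOccurs="unbounded", and the names are assumptions.

```python
# DTD cardinality operators expressed as (minOccurs, maxOccurs) pairs.
# None represents maxOccurs="unbounded".
DTD_OPERATORS = {
    "": (1, 1),      # no operator: exactly once
    "?": (0, 1),     # zero times or once
    "*": (0, None),  # zero or more times
    "+": (1, None),  # one or more times
}

def occurs_ok(count, min_occurs=1, max_occurs=1):
    """Check a child count against minOccurs/maxOccurs (both default to 1)."""
    return count >= min_occurs and (max_occurs is None or count <= max_occurs)

print(occurs_ok(0, *DTD_OPERATORS["?"]))
print(occurs_ok(0, *DTD_OPERATORS["+"]))
print(occurs_ok(3, min_occurs=2, max_occurs=4))
```

A range such as "between 2 and 4 times" has no single DTD operator, which is where the schema attributes go beyond *, ? and +.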

Attribute Types

<attribute name="id" type="ID" use="required"/>

<attribute name="speaks" type="Language"

use="default" value="en"/>

Existence: use="x", where x may be optional or required

Default value: use="x" value="...", where x may be default or fixed

Data Types

There is a variety of built-in data types

◦ Numerical data types: integer, short etc.

◦ String types: string, ID, IDREF, CDATA etc.

◦ Date and time data types: time, Month etc.

There are also user-defined data types

◦ simple data types, which cannot use elements or attributes

◦ complex data types, which can use these

Data Types (2)

Complex data types are defined from already existing data types by defining some attributes (if any) and using:

◦ sequence, a sequence of existing data type elements (order is important)

◦ all, a collection of elements that must appear (order is not important)

◦ choice, a collection of elements, of which one will be chosen
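The difference between sequence, all, and choice can be sketched as three checks on a list of child element names. This is a simplified illustration in plain Python that assumes every listed element occurs exactly once; the function names are hypothetical, and the element names are borrowed from the lecturer examples.

```python
def matches_sequence(children, expected):
    """sequence: the same elements, in the same order."""
    return list(children) == list(expected)

def matches_all(children, expected):
    """all: the same elements, in any order."""
    return sorted(children) == sorted(expected)

def matches_choice(children, alternatives):
    """choice: exactly one element, taken from the alternatives."""
    return len(children) == 1 and children[0] in alternatives

declared = ["firstname", "lastname"]
print(matches_sequence(["lastname", "firstname"], declared))
print(matches_all(["lastname", "firstname"], declared))
print(matches_choice(["phone"], ["name", "phone"]))
```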

A Data Type Example

<complexType name="lecturerType">

<sequence>

<element name="firstname" type="string"

minOccurs="0" maxOccurs="unbounded"/>

<element name="lastname" type="string"/>

</sequence>

<attribute name="title" type="string" use="optional"/>

</complexType>

Data Type Extension

Already existing data types can be extended by new elements or attributes. Example:

<complexType name="extendedLecturerType">

<extension base="lecturerType">

<sequence>

<element name="email" type="string"

minOccurs="0" maxOccurs="1"/>

</sequence>

<attribute name="rank" type="string" use="required"/>

</extension>

</complexType>

Resulting Data Type

<complexType name="extendedLecturerType">

<sequence>

<element name="firstname" type="string"

minOccurs="0" maxOccurs="unbounded"/>

<element name="lastname" type="string"/>

<element name="email" type="string"

minOccurs="0" maxOccurs="1"/>

</sequence>

<attribute name="title" type="string" use="optional"/>

<attribute name="rank" type="string" use="required"/>

</complexType>

Data Type Extension (2)

A hierarchical relationship exists between the original and the extended type

◦ Instances of the extended type are also instances of the original type

◦ They may contain additional information, but neither less information, nor information of the wrong type

Data Type Restriction

An existing data type may be restricted by adding constraints on certain values

Restriction is not the opposite of extension

◦ Restriction is not achieved by deleting elements or attributes

The following hierarchical relationship still holds:

◦ Instances of the restricted type are also instances of the original type

◦ They satisfy at least the constraints of the original type

Example of Data Type Restriction

<complexType name="restrictedLecturerType">

<restriction base="lecturerType">

<sequence>

<element name="firstname" type="string"

minOccurs="1" maxOccurs="2"/>

</sequence>

<attribute name="title" type="string"

use="required"/>

</restriction>

</complexType>

Restriction of Simple Data Types

<simpleType name="dayOfMonth">

<restriction base="integer">

<minInclusive value="1"/>

<maxInclusive value="31"/>

</restriction>

</simpleType>

Data Type Restriction: Enumeration

<simpleType name="dayOfWeek">

<restriction base="string">

<enumeration value="Mon"/>

<enumeration value="Tue"/>

<enumeration value="Wed"/>

<enumeration value="Thu"/>

<enumeration value="Fri"/>

<enumeration value="Sat"/>

<enumeration value="Sun"/>

</restriction>

</simpleType>
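The two simple-type restrictions (a value range and an enumeration) amount to membership tests on the value. Here is a rough equivalent in plain Python, assuming that Python's int() is a close enough stand-in for the schema's integer lexical form; the function names are illustrative.

```python
DAYS_OF_WEEK = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

def is_day_of_month(text):
    """dayOfMonth: an integer with minInclusive 1 and maxInclusive 31."""
    try:
        value = int(text)
    except ValueError:
        return False
    return 1 <= value <= 31

def is_day_of_week(text):
    """dayOfWeek: a string restricted to the enumerated values."""
    return text in DAYS_OF_WEEK

print(is_day_of_month("31"), is_day_of_month("32"))
print(is_day_of_week("Mon"), is_day_of_week("Monday"))
```

Every accepted value is still an integer or a string, which matches the earlier remark that instances of a restricted type remain instances of the original type.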

XML Schema: The Email Example

<element name="email" type="emailType"/>

<complexType name="emailType">

<sequence>

<element name="head" type="headType"/>

<element name="body" type="bodyType"/>

</sequence>

</complexType>

XML Schema: The Email Example (2)

<complexType name="headType">

<sequence>

<element name="from" type="nameAddress"/>

<element name="to" type="nameAddress"

minOccurs="1" maxOccurs="unbounded"/>

<element name="cc" type="nameAddress"

minOccurs="0" maxOccurs="unbounded"/>

<element name="subject" type="string"/>

</sequence>

</complexType>

XML Schema: The Email Example (3)

<complexType name="nameAddress">

<attribute name="name" type="string" use="optional"/>

<attribute name="address" type="string" use="required"/>

</complexType>

Similar for bodyType

RDF

XML --> RDF

Modify the following XML document so that it is also a valid RDF document:

Yangtze.xml (XML):

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

"convert to" Yangtze.rdf (RDF):

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

The RDF Format

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

1. RDF provides an ID attribute for identifying the resource being described.

2. The ID attribute is in the RDF namespace.

3. Add the "fragment identifier symbol" (#) to the namespace.

The RDF Format (cont.)

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

1. River: identifies the type (class) of the resource being described.

2. rdf:ID="Yangtze": identifies the resource being described. This resource is an instance of River.

3. length, startingLocation, endingLocation: these are properties, or attributes, of the type (class).

4. The element contents: values of the properties.

Namespace Convention

Question: Why was "#" placed onto the end of the namespace? E.g.,

xmlns="http://www.geodesy.org/river#"

Answer: RDF is very concerned about uniquely identifying things: uniquely identifying the type (class) and uniquely identifying the properties. If we concatenate the namespace with the type then we get a unique identifier for the type, e.g.,

http://www.geodesy.org/river#River

If we concatenate the namespace with a property then we get a unique identifier for the property, e.g.,

http://www.geodesy.org/river#length

http://www.geodesy.org/river#startingLocation

http://www.geodesy.org/river#endingLocation

Thus, the "#" symbol is simply a mechanism for separating the namespace from the type name and the property name.
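The concatenation described here can be reproduced with a namespace-aware parser. The sketch below assumes Python's standard xml.etree.ElementTree module, which reports each element name in '{namespace}local' notation; full_identifier is a hypothetical helper that joins the two parts.

```python
import xml.etree.ElementTree as ET

RDF_XML = """<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>"""

def full_identifier(qualified_tag):
    """Turn ElementTree's '{namespace}local' notation into namespace + local."""
    namespace, local_name = qualified_tag[1:].split("}", 1)
    return namespace + local_name

river = ET.fromstring(RDF_XML)
type_uri = full_identifier(river.tag)
property_uris = [full_identifier(child.tag) for child in river]

print(type_uri)
for uri in property_uris:
    print(uri)
```

Without the trailing "#", the same concatenation would yield http://www.geodesy.org/riverRiver, where the boundary between namespace and name is lost.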

The RDF Format

<?xml version="1.0"?>
<Class rdf:ID="Resource"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="uri">
  <property>value</property>
  <property>value</property>
  ...
</Class>

Uniquely Identify the Resource

Earlier we said that RDF is very concerned about uniquely identifying the type (class) and the properties. RDF is also very concerned about uniquely identifying the resource, e.g.,

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

This (rdf:ID="Yangtze") is the resource being described. We want to uniquely identify this resource.

rdf:ID

The value of rdf:ID is a "relative URI".

The "complete URI" is obtained by concatenating the URL of the XML document with "#" and then the value of rdf:ID, e.g.,

Yangtze.rdf:

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Suppose that this RDF/XML document is located at this URL: http://www.china.org/geography/rivers. Thus, the complete URI for this resource is:

http://www.china.org/geography/rivers#Yangtze

xml:base

On the previous slide we showed how the URL of the document provided the base URI.

Depending on the location of the document is brittle: it will break if the document is moved, or is copied to another location.

A more robust solution is to specify the base URI in the document, e.g.,

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Resource URI = concatenation(xml:base, '#', rdf:ID)
             = concatenation(http://www.china.org/geography/rivers, '#', "Yangtze")
             = http://www.china.org/geography/rivers#Yangtze
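The concatenation rule can be sketched in a few lines. The snippet assumes Python's standard xml.etree.ElementTree module and a hypothetical helper resource_uri, which falls back to the document's own URL when no xml:base is present; the mirror URL is an invented example of a copied document.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace of the xml: prefix

WITH_BASE = """<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers"/>"""

WITHOUT_BASE = """<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"/>"""

def resource_uri(element, document_url):
    """Concatenate the base URI, '#', and the rdf:ID value."""
    base = element.get(f"{{{XML_NS}}}base", document_url)
    return base + "#" + element.get(f"{{{RDF_NS}}}ID")

# Hypothetical location of a copy of the document.
mirror_url = "http://mirror.example.org/copy"

print(resource_uri(ET.fromstring(WITH_BASE), mirror_url))
print(resource_uri(ET.fromstring(WITHOUT_BASE), mirror_url))
```

With xml:base the identifier survives the move; without it the identifier changes with the document's location, which is the brittleness described above.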

rdf:about

Instead of identifying a resource with a relative URI (which then requires a base URI to be prepended), we can give the complete identity of a resource. However, we use rdf:about, rather than rdf:ID, e.g.,

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Triple -> resource/property/value

Each statement reads: resource has a property of value.

resource: http://www.china.org/geography/rivers#Yangtze
property: http://www.geodesy.org/river#length
value: 6300 kilometers

resource: http://www.china.org/geography/rivers#Yangtze
property: http://www.geodesy.org/river#startingLocation
value: western China's ...

resource: http://www.china.org/geography/rivers#Yangtze
property: http://www.geodesy.org/river#endingLocation
value: East China Sea
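The three statements can be derived mechanically from the RDF/XML. This is a minimal sketch with Python's standard xml.etree.ElementTree module rather than a real RDF parser; it handles only the simple pattern used here (one resource identified by rdf:about, with literal-valued properties), and triples is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>"""

def triples(element):
    """Return (resource, property, value) tuples for literal-valued properties."""
    resource = element.get(f"{{{RDF_NS}}}about")
    result = []
    for child in element:
        namespace, local_name = child.tag[1:].split("}", 1)
        result.append((resource, namespace + local_name, child.text))
    return result

statements = triples(ET.fromstring(RDF_XML))
for resource, prop, value in statements:
    print(f"{resource} has a {prop} of {value}")
```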

The RDF Format = triples!

The fundamental design pattern of RDF is to structure your XML data as resource/property/value triples!

The value of a property can be a literal (e.g., length has a value of 6300 kilometers). Also, the value of a property can be a resource, as shown below (e.g., property-A has a value of Resource-B, property-B has a value of Resource-C). We will see examples of properties having a resource value in a little bit.

<?xml version="1.0"?>
<Resource-A>
  <property-A>
    <Resource-B>
      <property-B>
        <Resource-C>
          <property-C>
            Value-C
          </property-C>
        </Resource-C>
      </property-B>
    </Resource-B>
  </property-A>
</Resource-A>

Resource-B is the value of property-A, and Resource-C is the value of property-B.

Notice that the RDF design pattern is an alternating sequence of resource-property.

This pattern is known as "striping".

Naming Convention

The convention is to use a capital letter to start a type (class) name, and use a lowercase letter to start a property name.

◦ This helps the eye quickly discern the striping pattern.

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Uppercase: the type (class) name River.

Lowercase: the property names length, startingLocation, and endingLocation.

RDF Model (graph)

Legend: an ellipse indicates a "Resource"; a rectangle indicates a "literal string value".

rdf:Description + rdf:type

There is still another way of representing the XML. This way makes it very clear that you are describing something, and it makes it very clear what the type (class) is of the thing you are describing:

<?xml version="1.0"?>
<rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze"
                 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns="http://www.geodesy.org/river#">
    <rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
</rdf:Description>

This is read as: "This is a Description about the resource http://www.china.org/geography/rivers#Yangtze. This resource is an instance of the River type (class). The http://www.china.org/geography/rivers#Yangtze resource has a length of 6300 kilometers, a startingLocation of western China's Qinghai-Tibet Plateau, and an endingLocation of the East China Sea."

Note: this form of describing a resource is called the "long form". The form we have seen previously is an abbreviation of this long form. An RDF Parser interprets the abbreviated form as if it were this long form.

Alternative

Alternatively we can use rdf:ID rather than rdf:about, as shown here:

<?xml version="1.0"?>
<rdf:Description rdf:ID="Yangtze"
                 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns="http://www.geodesy.org/river#"
                 xml:base="http://www.china.org/geography/rivers">
    <rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
</rdf:Description>

Equivalent Representations!

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
</River>

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
</River>

<?xml version="1.0"?>
<rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze"
                 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns="http://www.geodesy.org/river#">
    <rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
</rdf:Description>

Note: In the RDF literature the examples are typically shown in this form.
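The equivalence between rdf:ID and rdf:about rests on how an rdf:ID value becomes a full identifier: a '#' followed by the ID is resolved against the base URI. A minimal sketch in Python, assuming the standard urllib.parse module; resolve_rdf_id is a hypothetical helper name, not part of any RDF library.

```python
from urllib.parse import urljoin


def resolve_rdf_id(base, rdf_id):
    """Resolve an rdf:ID value against xml:base by joining '#' + ID."""
    return urljoin(base, "#" + rdf_id)


base = "http://www.china.org/geography/rivers"   # the xml:base value
uri = resolve_rdf_id(base, "Yangtze")            # the rdf:ID value
print(uri)
```

The result is the same URI that the rdf:about form spells out in full, which is why the representations describe the same resource.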

RDF Namespace

The namespace http://www.w3.org/1999/02/22-rdf-syntax-ns# contains the names used so far: ID, about, type, resource, and Description.

Terminology

As you read the RDF literature you may see the following terminology:

◦ Subject: this term refers to the item that is playing the role of the resource.

◦ predicate: this term refers to the item that is playing the role of the property.

◦ Object: this term refers to the item that is playing the role of the value.

Subject / predicate / Object is equivalent to Resource / property / Value.

Modify the following XML document so that it is RDF-compliant:

Yangtze2.xml

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
    <Dam id="ThreeGorges"
         xmlns="http://www.geodesy.org/dam">
        <name>The Three Gorges Dam</name>
        <width>1.5 miles</width>
        <height>610 feet</height>
        <cost>$30 billion</cost>
    </Dam>
</River>

Note the two types (classes)

River
    Instance: Yangtze
    Properties: length, startingLocation, endingLocation

Dam
    Instance: ThreeGorges
    Properties: name, width, height, cost

Dam - out of place

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
    <Dam id="ThreeGorges"
         xmlns="http://www.geodesy.org/dam">
        <name>The Three Gorges Dam</name>
        <width>1.5 miles</width>
        <height>610 feet</height>
        <cost>$30 billion</cost>
    </Dam>
</River>

Types (classes) contain properties. Here we see the River type containing the properties length, startingLocation, and endingLocation. It also shows River containing a type - Dam. Thus, there is a Resource that contains another Resource. This is inconsistent with the RDF design pattern. (We are seeing one of the benefits of using the RDF format - to identify inconsistencies in an XML design.)

Property value must be a Literal or a Resource

<length>6300 kilometers</length>

Here length is a property whose value is a Literal.

<obstacle>
    <Dam id="ThreeGorges"
         xmlns="http://www.geodesy.org/dam">
        <name>The Three Gorges Dam</name>
        <width>1.5 miles</width>
        <height>610 feet</height>
        <cost>$30 billion</cost>
    </Dam>
</obstacle>

Here obstacle is a property whose value is a Resource.

Modified XML (to make it consistent)

Yangtze2,v2.xml

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
    <obstacle>
        <Dam id="ThreeGorges"
             xmlns="http://www.geodesy.org/dam">
            <name>The Three Gorges Dam</name>
            <width>1.5 miles</width>
            <height>610 feet</height>
            <cost>$30 billion</cost>
        </Dam>
    </obstacle>
</River>

"The Yangtze River has an obstacle that is the ThreeGorges Dam. The Dam has a name - The Three Gorges Dam. It has a width of 1.5 miles, a height of 610 feet, and a cost of $30 billion."

RDF Format

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
    <obstacle>
        <Dam rdf:ID="ThreeGorges"
             xmlns="http://www.geodesy.org/dam#">
            <name>The Three Gorges Dam</name>
            <width>1.5 miles</width>
            <height>610 feet</height>
            <cost>$30 billion</cost>
        </Dam>
    </obstacle>
</River>

Changed id to rdf:ID. Added the '#' symbol to the end of the namespace URIs.

As always, the other representations using rdf:about and rdf:Description are available.

RDF Model (graph)

Alternatively, suppose that someone has already created a document containing information about the Three Gorges Dam:

Three-Gorges-Dam.rdf

<?xml version="1.0"?>
<Dam rdf:ID="ThreeGorges"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns="http://www.geodesy.org/dam#"
     xml:base="http://www.china.org/geography/rivers">
    <name>The Three Gorges Dam</name>
    <width>1.5 miles</width>
    <height>610 feet</height>
    <cost>$30 billion</cost>
</Dam>

Then we can simply reference the Three Gorges Dam resource using rdf:resource, as shown here:

Yangtze.rdf

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
    <length>6300 kilometers</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
    <obstacle rdf:resource="http://www.china.org/geography/rivers#ThreeGorges"/>
</River>

Anyone, Anywhere, Anytime Can Talk About a Resource

In all of our examples we have provided a unique identifier to resources, e.g.,

http://www.china.org/geography/rivers#Yangtze

• Consequently, if another RDF document identifies the same resource then the data that it specifies gives additional data about that resource.

• An aggregator tool will be able to collect all data about a resource and present a consolidated set of data for the resource. That's powerful!
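What such an aggregator does can be sketched in a few lines. This is only an illustration under stated assumptions: the triples are written out by hand as Python tuples, the split of the Yangtze data across two documents is hypothetical, and aggregate is a made-up function name.

```python
from collections import defaultdict

YANGTZE = "http://www.china.org/geography/rivers#Yangtze"
RIVER_NS = "http://www.geodesy.org/river#"

# Hypothetical split: two independent documents each state something
# about the same resource URI.
doc_one = [
    (YANGTZE, RIVER_NS + "length", "6300 kilometers"),
]
doc_two = [
    (YANGTZE, RIVER_NS + "startingLocation", "western China's Qinghai-Tibet Plateau"),
    (YANGTZE, RIVER_NS + "endingLocation", "East China Sea"),
]


def aggregate(*documents):
    """Group property values by subject URI across all documents."""
    merged = defaultdict(lambda: defaultdict(list))
    for document in documents:
        for subject, predicate, value in document:
            merged[subject][predicate].append(value)
    return merged


consolidated = aggregate(doc_one, doc_two)
for predicate, values in consolidated[YANGTZE].items():
    print(predicate, values)
```

Because both documents use the same identifier, merging them needs no matching logic beyond comparing URIs.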

rdf:ID versus rdf:about

When should rdf:ID be used? When should rdf:about be used?

◦ When you want to introduce a resource, and provide an initial set of information about a resource, use rdf:ID.

◦ When you want to extend the information about a resource, use rdf:about.

◦ The RDF philosophy is akin to the Web philosophy. That is, anyone, anywhere, anytime can provide information about a resource.

XML (Yangtze4.xml)

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river"
       xmlns:uom="http://www.measurements.org/units-of-measure">
    <length uom:units="kilometers">6300</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
</River>

RDF (Yangtze4.rdf)

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
    <length>
        <rdf:Description>
            <rdf:value>6300</rdf:value>
            <uom:units>kilometers</uom:units>
        </rdf:Description>
    </length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
</River>

Yangtze4.xml

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
    <length uom:units="kilometers">6300</length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
</River>

RDF does not allow attributes on the properties (except for special RDF attributes such as rdf:resource). So we need to make the uom:units attribute a child element. Your first instinct might be to modify length to have two child elements:

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
    <length>
        <value>6300</value>
        <uom:units>kilometers</uom:units>
    </length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
</River>

However, now the length property has two values as its value. RDF supports only binary relations, i.e., a single value for a property.

rdf:value

length has two values - 6300 and kilometers. RDF provides a special property, rdf:value, to be used for specifying the "primary" value. In this example, 6300 is the primary value, and kilometers is a value which provides additional information about the primary value.

RDF Format

Yangtze4.rdf

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
    <length>
        <rdf:Description>
            <rdf:value>6300</rdf:value>
            <uom:units>kilometers</uom:units>
        </rdf:Description>
    </length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
</River>

The rdf:Description element inside length is an anonymous resource.

Read this as: "The Yangtze River has a length whose value is a resource which has a value of 6300 and whose units is kilometers."

Advantage of anonymous resources

<rdf:Description>
    <rdf:value>6300</rdf:value>
    <uom:units>kilometers</uom:units>
</rdf:Description>

This is an anonymous resource. Its purpose is solely to provide a context for the two properties. Other RDF documents will have no need to amplify this resource. So, in this case, there is no reason for giving the resource an identifier, and it makes good sense to use an anonymous resource.

RDF Model (graph)

The graph contains an anonymous resource (also called a "blank node"), that is, a resource with no identifier. (Note: RDF Parsers will typically generate a unique identifier for anonymous resources, to distinguish one anonymous resource from another.)
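How a parser might label an anonymous resource can be sketched as follows. This is a simplified illustration, not a conforming RDF parser: it assumes Python's standard xml.etree.ElementTree, handles only nested elements and rdf:ID, ignores rdf:type, and the label scheme `_:b1`, `_:b2`, ... as well as the helper names are assumptions made for this example.

```python
import itertools
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

DOC = """<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
  <length>
    <rdf:Description>
      <rdf:value>6300</rdf:value>
      <uom:units>kilometers</uom:units>
    </rdf:Description>
  </length>
</River>"""

blank_ids = itertools.count(1)


def uri(tag):
    """Turn ElementTree's '{namespace}local' tag into namespace + local."""
    return tag[1:].replace("}", "", 1)


def node_name(element):
    """Use '#ID' for identified resources, a generated label otherwise."""
    rdf_id = element.get("{%s}ID" % RDF)
    return "#" + rdf_id if rdf_id is not None else "_:b%d" % next(blank_ids)


def walk(resource, out):
    """Append the triples of `resource` to `out`; return the resource's name."""
    subject = node_name(resource)
    for prop in resource:
        children = list(prop)
        if children:                       # value is a nested resource
            out.append((subject, uri(prop.tag), walk(children[0], out)))
        else:                              # value is a literal
            out.append((subject, uri(prop.tag), (prop.text or "").strip()))
    return subject


triples = []
walk(ET.fromstring(DOC), triples)
for triple in triples:
    print(triple)
```

The generated label only has meaning inside this one run; other documents cannot refer to it, which is exactly the property of an anonymous resource.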

rdf:parseType="Resource"

If the value of a property is comprised of several values then one option is to create an anonymous resource, as we saw. RDF provides a shorthand, so that you don't need to create an rdf:Description element, by using rdf:parseType="Resource", as shown here:

Yangtze4,v2.rdf

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
    <length rdf:parseType="Resource">
        <rdf:value>6300</rdf:value>
        <uom:units>kilometers</uom:units>
    </length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
</River>

The meaning of this is identical to that of the rdf:Description form shown earlier.
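The claim that the two spellings mean the same thing can be checked mechanically. The sketch below, again assuming only Python's standard xml.etree.ElementTree, extracts the property/value pairs of the anonymous resource from both spellings of the length property and compares them; property_pairs and uri are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = ('xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
      'xmlns="http://www.geodesy.org/river#" '
      'xmlns:uom="http://www.measurements.org/units-of-measure#"')

# Long spelling: explicit rdf:Description element.
LONG_FORM = f"""<length {NS}>
  <rdf:Description>
    <rdf:value>6300</rdf:value>
    <uom:units>kilometers</uom:units>
  </rdf:Description>
</length>"""

# Short spelling: rdf:parseType="Resource" on the property element.
SHORT_FORM = f"""<length {NS} rdf:parseType="Resource">
  <rdf:value>6300</rdf:value>
  <uom:units>kilometers</uom:units>
</length>"""


def uri(tag):
    """Turn ElementTree's '{namespace}local' tag into namespace + local."""
    return tag[1:].replace("}", "", 1)


def property_pairs(prop):
    """Return the (property, value) pairs of the anonymous resource under `prop`."""
    if prop.get(f"{{{RDF}}}parseType") == "Resource":
        holder = prop          # the properties are direct children
    else:
        holder = prop[0]       # the properties sit inside rdf:Description
    return {(uri(p.tag), p.text) for p in holder}


long_pairs = property_pairs(ET.fromstring(LONG_FORM))
short_pairs = property_pairs(ET.fromstring(SHORT_FORM))
print(long_pairs == short_pairs)
```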

Equivalent!

<length>
    <rdf:Description>
        <rdf:value>6300</rdf:value>
        <uom:units>kilometers</uom:units>
    </rdf:Description>
</length>

<length rdf:parseType="Resource">
    <rdf:value>6300</rdf:value>
    <uom:units>kilometers</uom:units>
</length>

RDF Format!

Yangtze.rdf

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xmlns:uom="http://www.measurements.org/units-of-measure#"
       xml:base="http://www.china.org/geography/rivers">
    <length rdf:parseType="Resource">
        <rdf:value>6300</rdf:value>
        <uom:units>kilometers</uom:units>
    </length>
    <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
    <endingLocation>East China Sea</endingLocation>
    <obstacle>
        <Dam rdf:ID="ThreeGorges"
             xmlns="http://www.geodesy.org/dam#">
            <name>The Three Gorges Dam</name>
            <width>1.5 miles</width>
            <height>610 feet</height>
            <cost>$30 billion</cost>
        </Dam>
    </obstacle>
</River>

With relatively few changes the XML document is now usable by both XML tools and RDF tools!

Modify the following XML document so that it is also a valid RDF document:

Yangtze5.xml

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
    <length uom:units="kilometers">6300</length>
    <maxWidth uom:units="meters">175</maxWidth>
    <maxDepth uom:units="meters">55</maxDepth>
</River>

Yangtze5.rdf

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
    <length rdf:parseType="Resource">
        <rdf:value>6300</rdf:value>
        <uom:units>kilometers</uom:units>
    </length>
    <maxWidth rdf:parseType="Resource">
        <rdf:value>175</rdf:value>
        <uom:units>meters</uom:units>
    </maxWidth>
    <maxDepth rdf:parseType="Resource">
        <rdf:value>55</rdf:value>
        <uom:units>meters</uom:units>
    </maxDepth>
</River>

This is one way of doing it. Now we will see a better way - using "typed literals". (See next slide)

Alternate RDF Format

Yangtze5.rdf

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
    <length rdf:datatype="http://www.uom.org/distance#kilometer">6300</length>
    <maxWidth rdf:datatype="http://www.uom.org/distance#meter">175</maxWidth>
    <maxDepth rdf:datatype="http://www.uom.org/distance#meter">55</maxDepth>
</River>

With rdf:datatype you can give a property's value a datatype label. The rdf:datatype value acts as a semantic label for the datatype of the value. This is called a typed literal.

For this example there must be a namespace, http://www.uom.org/distance#, which defines two datatypes - kilometer and meter.

The next slide shows how to do this using XML Schemas.

Defining the kilometer and meter datatypes using XML Schemas

uom.xsd

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.uom.org/distance#">

    <simpleType name="kilometer">
        <restriction base="integer"></restriction>
    </simpleType>

    <simpleType name="meter">
        <restriction base="integer"></restriction>
    </simpleType>

</schema>

Another example using rdf:datatype

<?xml version="1.0"?>
<Person rdf:ID="JohnSmith"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns="http://www.person.org#">
    <age rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">30</age>
</Person>

In this example we are specifying that the value (30) of age is a nonNegativeInteger (which is defined in the XML Schema namespace).
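A program that consumes typed literals can use the datatype URI to decide how to interpret the text of the value. The following is a minimal sketch in Python; the CONVERTERS table and the function names are assumptions for this illustration, and only two XML Schema datatypes are covered.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def to_non_negative_integer(lexical):
    """Convert the lexical form, rejecting values below zero."""
    number = int(lexical)
    if number < 0:
        raise ValueError("not a nonNegativeInteger: %r" % lexical)
    return number


# Hypothetical lookup table from datatype URI to a conversion function.
CONVERTERS = {
    XSD + "integer": int,
    XSD + "nonNegativeInteger": to_non_negative_integer,
}


def typed_literal_value(lexical, datatype):
    """Interpret a typed literal; keep the text when the datatype is unknown."""
    converter = CONVERTERS.get(datatype)
    return converter(lexical) if converter is not None else lexical


age = typed_literal_value("30", XSD + "nonNegativeInteger")
length = typed_literal_value("6300", "http://www.uom.org/distance#kilometer")
print(age, length)
```

Here the age becomes a number, while the length stays a string because this sketch knows nothing about the kilometer datatype; a real tool would need the datatype's definition to do better.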

The rdf:Bag type (class)

The rdf:Bag type is used to represent an unordered collection.

Modify the following XML document so that it is also a valid RDF document:

DesignMeeting.xml

<?xml version="1.0"?>
<Meeting id="XML-Design-Patterns"
         xmlns="http://www.business.org">
    <attendees>
        <name>John Smith</name>
        <name>Sally Jones</name>
    </attendees>
</Meeting>

DesignMeeting.rdf

<?xml version="1.0"?>
<Meeting rdf:ID="XML-Design-Patterns"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.business.org#">
    <attendees>
        <rdf:Bag>
            <name>John Smith</name>
            <name>Sally Jones</name>
        </rdf:Bag>
    </attendees>
</Meeting>

rdf:Bag makes it clear that this is an unordered collection of names.

The rdf:Alt type (class)

• The rdf:Alt type is used to represent a set of alternate properties.

Modify the following XML document so that it is also a valid RDF document:

BarnesAndNoble.xml

<?xml version="1.0"?>
<Retailer id="BarnesAndNoble"
          xmlns="http://www.retailers.org">
    <webLocation>
        <url>http://www.bn.com</url>
        <url>http://www.barnesandnoble.com</url>
    </webLocation>
</Retailer>

BarnesAndNoble.rdf

<?xml version="1.0"?>
<Retailer rdf:ID="BarnesAndNoble"
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns="http://www.retailers.org#">
    <webLocation>
        <rdf:Alt>
            <url>http://www.bn.com</url>
            <url>http://www.barnesandnoble.com</url>
        </rdf:Alt>
    </webLocation>
</Retailer>

rdf:Alt makes it clear that the urls listed are alternates, i.e., choose one of them.

The rdf:Seq type (class)

• The rdf:Seq type is used to represent a sequence of properties.

Modify the following XML document so that it is also a valid RDF document:

MyDaysActivities.xml

<?xml version="1.0"?>
<ToDoList id="MondayMeetings"
          xmlns="http://www.reminders.org">
    <activities>
        <activity1>Meet with CEO at 10am</activity1>
        <activity2>Luncheon at The Eatery</activity2>
        <activity3>Flight at 3pm</activity3>
    </activities>
</ToDoList>

MyDaysActivities.rdf

<?xml version="1.0"?>
<ToDoList rdf:ID="MondayMeetings"
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns="http://www.reminders.org#">
    <activities>
        <rdf:Seq>
            <activity1>Meet with CEO at 10am</activity1>
            <activity2>Luncheon at The Eatery</activity2>
            <activity3>Flight at 3pm</activity3>
        </rdf:Seq>
    </activities>
</ToDoList>

rdf:Seq makes it clear that the activities listed are to be done in the sequence listed.

rdf:li Property

The property rdf:li ("list item") is provided by RDF for use with rdf:Bag, rdf:Alt, or rdf:Seq.

The rdf:li property is provided for you to specify an item in a Bag/Alt/Seq.

An RDF Parser will replace each rdf:li with rdf:_1, rdf:_2, rdf:_3, etc.

The following slide recasts the previous examples using the rdf:li property.

MyDaysActivities.rdf

<?xml version="1.0"?>
<ToDoList rdf:ID="MondayMeetings"
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns="http://www.reminders.org#">
    <activities>
        <rdf:Seq>
            <rdf:li>Meet with CEO at 10am</rdf:li>
            <rdf:li>Luncheon at The Eatery</rdf:li>
            <rdf:li>Flight at 3pm</rdf:li>
        </rdf:Seq>
    </activities>
</ToDoList>

BarnesAndNoble.rdf

<?xml version="1.0"?>
<Retailer rdf:ID="BarnesAndNoble"
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns="http://www.retailers.org#">
    <webLocation>
        <rdf:Alt>
            <rdf:li>http://www.bn.com</rdf:li>
            <rdf:li>http://www.barnesandnoble.com</rdf:li>
        </rdf:Alt>
    </webLocation>
</Retailer>

DesignMeeting.rdf

<?xml version="1.0"?>
<Meeting rdf:ID="XML-Design-Patterns"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.business.org#">
    <attendees>
        <rdf:Bag>
            <rdf:li>John Smith</rdf:li>
            <rdf:li>Sally Jones</rdf:li>
        </rdf:Bag>
    </attendees>
</Meeting>
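The replacement of rdf:li by numbered properties, as an RDF Parser is described as doing, can be imitated in a few lines. This is a sketch assuming Python's standard xml.etree.ElementTree; number_list_items is a made-up name, and the output uses the QName spelling rdf:_1, rdf:_2, ... rather than full URIs.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

SEQ = """<rdf:Seq xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:li>Meet with CEO at 10am</rdf:li>
  <rdf:li>Luncheon at The Eatery</rdf:li>
  <rdf:li>Flight at 3pm</rdf:li>
</rdf:Seq>"""


def number_list_items(container):
    """Pair each rdf:li child with rdf:_1, rdf:_2, ... in document order."""
    numbered = []
    index = 0
    for child in container:
        if child.tag == f"{{{RDF}}}li":
            index += 1
            numbered.append((f"rdf:_{index}", child.text))
    return numbered


items = number_list_items(ET.fromstring(SEQ))
for prop, value in items:
    print(prop, value)
```

For an rdf:Seq the numbering carries the order of the activities; for rdf:Bag and rdf:Alt the same numbering is produced but the order is not meant to be significant.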

Dublin Core (dc:)

• The Dublin Core is a standard set of properties:

Content:               Title, Subject, Description, Language, Relation, Coverage, Source
Intellectual Property: Creator, Publisher, Contributor, Rights
Instance:              Date, Type, Format, Identifier

Note: many people use these properties in their HTML today. For example:

<META NAME="DC.Creator" CONTENT="John Smith">

Triple Notation – QNames as Shorthands

The full triple notation results in very long lines.

Shorthand: We can use an XML qualified name (or QName) without angle brackets as an abbreviation for a full URI reference.

A QName consists of a prefix that has been assigned to a namespace URI, followed by a colon, and then a local name. The full URIref is formed from the QName by appending the local name to the namespace URI assigned to the prefix.

The concepts of names and namespaces used in RDF originate in XML.
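The expansion rule for QNames (append the local name to the namespace URI assigned to the prefix) is simple enough to sketch directly. The prefix table below uses namespace URIs that appear elsewhere in these slides; the table itself and the function name expand_qname are assumptions for this illustration.

```python
# Prefix-to-namespace table; a real document declares these itself.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dbpedia": "http://dbpedia.org/resource/",
}


def expand_qname(qname, prefixes=PREFIXES):
    """Expand 'prefix:local' by appending local to the prefix's namespace URI."""
    prefix, local = qname.split(":", 1)
    return prefixes[prefix] + local


print(expand_qname("dc:creator"))
print(expand_qname("rdf:type"))
```

An unknown prefix raises a KeyError in this sketch, which mirrors the fact that a QName has no meaning unless its prefix has been assigned to a namespace URI.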

XML Namespaces

A namespace is a way of identifying a subset of a set of names (e.g., the set of possible names of resources in the Web) which acts as a qualifier for the names in this subset.

XML namespaces are used for providing uniquely named elements and attributes in an XML document.

XML namespaces help us eliminate ambiguity in an XML document. For example, an XML document can use id to refer to identifiers of both customers and products if id is qualified by an appropriate namespace (e.g., http://customers.org and http://products.com).

A namespace is created by creating a URI for it. By qualifying names with the URIs of their namespaces, anyone can create their own names and properly distinguish them from names with identical spellings created by others.

See the W3C Recommendation “Namespaces in XML 1.0” available at http://www.w3.org/TR/REC-xml-names/.
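The customer/product example can be tried out with a namespace-aware parser. The document below is hypothetical (the order element, the prefixes c and p, and the two identifier values are invented); only the two namespace URIs come from the text. The sketch assumes Python's standard xml.etree.ElementTree, which reports each element name together with its namespace URI.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two elements named "id", told apart by namespace.
ORDER = """<order xmlns:c="http://customers.org" xmlns:p="http://products.com">
  <c:id>C-17</c:id>
  <p:id>P-42</p:id>
</order>"""

root = ET.fromstring(ORDER)

# ElementTree spells a qualified name as '{namespace-URI}local-name'.
customer_id = root.find("{http://customers.org}id").text
product_id = root.find("{http://products.com}id").text
print(customer_id, product_id)
```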

URIrefs as Vocabulary

Since RDF uses URIrefs instead of words to name things in statements, URIrefs define vocabularies in RDF.

The URIrefs in RDF vocabularies are typically organized so that they can be represented as a set of QNames with a common prefix:

◦ A common namespace URIref is chosen for all terms in a vocabulary, typically a URIref under the control of whoever is defining the vocabulary.

◦ URIrefs that are contained in the vocabulary are formed by appending individual local names to the end of the common URIref.

Example

DBpedia (http://wiki.dbpedia.org/About) is a large data set which has been derived from Wikipedia by extracting various kinds of structured information from Wikipedia editions in 14 languages and combining this information into a huge, cross-domain knowledge base.

In the DBpedia data set, each thing is identified by a URIref of the form http://dbpedia.org/resource/Name, where Name is taken from the URL of the source Wikipedia article, which has the form http://en.wikipedia.org/wiki/Name. Thus, each resource is tied directly to an English-language Wikipedia article.

DBpedia (cont’d)

The URIref http://dbpedia.org/resource/Greece is the DBpedia resource about Greece.

The prefix dbpedia can be used instead of http://dbpedia.org/resource/

For example: dbpedia:Greece
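The naming rule described for DBpedia amounts to swapping one URL prefix for another. A minimal sketch in Python, using only the two URL forms given in the text; dbpedia_uri is a hypothetical helper name and no network access is involved.

```python
WIKIPEDIA_PREFIX = "http://en.wikipedia.org/wiki/"
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def dbpedia_uri(wikipedia_url):
    """Map an English Wikipedia article URL to the matching DBpedia URIref."""
    if not wikipedia_url.startswith(WIKIPEDIA_PREFIX):
        raise ValueError("not an English Wikipedia article URL: %r" % wikipedia_url)
    name = wikipedia_url[len(WIKIPEDIA_PREFIX):]
    return DBPEDIA_PREFIX + name


print(dbpedia_uri("http://en.wikipedia.org/wiki/Greece"))
```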

URIrefs as Vocabulary (cont’d)

RDF uses this same approach to define its own vocabulary of terms with special meanings in RDF:

◦ The URIrefs in the RDF vocabulary all begin with http://www.w3.org/1999/02/22-rdf-syntax-ns#, conventionally associated with the QName prefix rdf:.

◦ The RDF Vocabulary Description Language defines an additional set of terms having URIrefs that begin with http://www.w3.org/2000/01/rdf-schema#, conventionally associated with the QName prefix rdfs:.

Where a specific QName prefix is commonly used in connection with a given set of terms in this way, the QName prefix itself is sometimes used as the name of the vocabulary. For example, someone might refer to "the rdfs: vocabulary".

URIrefs as Vocabulary (cont’d)

Convention: Organizations typically use a vocabulary's namespace URIref as the URL of a Web resource that provides further information about that vocabulary.

Example: the QName prefix dc: with the namespace URIref http://purl.org/dc/elements/1.1/ refers to the Dublin Core vocabulary.

◦ Accessing this namespace URIref in a Web browser will retrieve additional information about the Dublin Core vocabulary (specifically, RDFS definitions of the Dublin core vocabulary).

◦ Reminder: this is just a useful convention. RDF does not assume that a namespace URI identifies a retrievable Web resource.

URIrefs as Vocabulary (cont’d)

Using URIrefs as subjects, predicates, and objects in RDF statements supports the development and use of shared vocabularies on the Web.

People can discover and begin using vocabularies already used by others to describe things, reflecting a shared understanding of those concepts.

Example

Consider the triple:

ex:index.html dc:creator exstaff:85740 .

The predicate dc:creator, when fully expanded as a URIref, is an unambiguous reference to the "creator" attribute in the Dublin Core metadata attribute set, a widely-used set of attributes (properties) for describing a wide range of networked resources (see http://dublincore.org/documents/usageguide/).

The writer of this triple is effectively saying that the relationship between the Web page and the creator of the page is exactly the concept identified by http://purl.org/dc/elements/1.1/creator.

Another person familiar with the Dublin Core vocabulary, or who finds out what dc:creator means (say by looking up its definition on the Web) will know what is meant by this relationship. In addition, based on this understanding, people can write programs to behave in accordance with that meaning when processing triples containing the predicate dc:creator.
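The expansion of a QName into a full URIref can be sketched in a few lines of Python. This is only an illustration: the dc: namespace is the one given in these slides, while the ex: and exstaff: namespaces and the helper name expand_qname are assumptions made for the example.

```python
# Prefix table. Only the dc: entry comes from the slides;
# the ex: and exstaff: namespaces are assumed for illustration.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://www.example.org/",               # assumption
    "exstaff": "http://www.example.org/staffid/",  # assumption
}


def expand_qname(qname, prefixes=PREFIXES):
    """Expand 'prefix:local' into a full URIref by simple concatenation."""
    prefix, local = qname.split(":", 1)
    return prefixes[prefix] + local


# The example triple with every term written out as a full URIref.
triple = tuple(
    expand_qname(q) for q in ("ex:index.html", "dc:creator", "exstaff:85740")
)
```

A program that recognizes the expanded predicate can then act on its Dublin Core meaning, whatever prefix the author of the data happened to choose.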


Another Example

The Friend of a Friend (FOAF) vocabulary is documented at http://xmlns.com/foaf/spec/.

The FOAF project is creating a Web of machine-readable pages (written in RDF) describing people, the links between them and the things they create and do.


URIrefs as Vocabulary (cont’d)

RDF gives meaning to the terms defined in the relevant RDF vocabularies rdf: and rdfs:.

Others have defined the meaning of terms in other important vocabularies, e.g., dc:.


Structured Values in RDF

Consider the triple:

exstaff:85740 exterms:address "1501 Grant Avenue, Bedford, Massachusetts 01730" .

What if the address needs to be represented as a structure consisting of separate street, city, state, and postal code values? How would this be done in RDF?

Structured information is represented in RDF by considering the aggregate thing to be described (like John Smith's address) as a resource, and then making statements about that new resource.
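As a rough sketch of this idea, the address can be modelled as its own resource with one statement per component. The identifier exaddressid:85740 and the property names exterms:street, exterms:city, exterms:state, and exterms:postalCode are hypothetical names chosen for illustration, and the triples are plain Python tuples rather than a real RDF library.

```python
# The address becomes a resource of its own.
# "exaddressid:85740" is a hypothetical identifier for that new resource.
ADDRESS = "exaddressid:85740"

triples = [
    ("exstaff:85740", "exterms:address", ADDRESS),
    # Statements about the new address resource (property names are assumed).
    (ADDRESS, "exterms:street", "1501 Grant Avenue"),
    (ADDRESS, "exterms:city", "Bedford"),
    (ADDRESS, "exterms:state", "Massachusetts"),
    (ADDRESS, "exterms:postalCode", "01730"),
]


def describe(resource, graph):
    """Collect every property/value pair stated about one resource."""
    return {p: o for s, p, o in graph if s == resource}


# Follow the link from the staff member to the address, then read its parts.
address_of_staff = describe("exstaff:85740", triples)["exterms:address"]
address_parts = describe(address_of_staff, triples)
```

Each component of the address can now be queried separately, which the single string literal did not allow.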


Structured Values in RDF (cont’d)


SPARQL

Slides reference: Semantic Web Primer book

Main reference text listed in the course specification (TQF 3, มคอ.3)


Why an RDF Query Language?

Different XML Representations

XML is at a lower level of abstraction than RDF

There are various ways of syntactically representing an RDF statement in XML

Thus we would require several XPath queries, e.g.

◦ //uni:lecturer/uni:title if uni:title is an element

◦ //uni:lecturer/@uni:title if uni:title is an attribute

◦ Both XML representations are equivalent!


SPARQL Basic Queries

SPARQL is based on matching graph patterns

The simplest graph pattern is the triple pattern:

◦ like an RDF triple, but with the possibility of a variable instead of an RDF term in the subject, predicate, or object positions

Combining triple patterns gives a basic graph pattern, where an exact match to a graph is needed to fulfill a pattern
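To make the matching idea concrete, here is a minimal Python sketch of triple-pattern and basic-graph-pattern matching over a list of tuples. It is not how a SPARQL engine is implemented; the function names and the toy graph are made up for illustration, and variables are simply strings starting with "?".

```python
def is_var(term):
    """A term is a variable if it starts with '?', as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            if result.setdefault(p_term, t_term) != t_term:
                return None  # variable already bound to a different term
        elif p_term != t_term:
            return None      # constant term does not match
    return result


def match_bgp(patterns, graph):
    """Match a basic graph pattern: every triple pattern must match."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for binding in solutions:
            for triple in graph:
                new_binding = match_pattern(pattern, triple, binding)
                if new_binding is not None:
                    extended.append(new_binding)
        solutions = extended
    return solutions


# Toy graph, invented for this example.
GRAPH = [
    ("uni:c1", "rdf:type", "rdfs:Class"),
    ("uni:c2", "rdf:type", "rdfs:Class"),
    ("uni:x", "rdf:type", "uni:c1"),
]

classes = match_bgp([("?c", "rdf:type", "rdfs:Class")], GRAPH)
```

Each solution is a dictionary of variable bindings, which corresponds to one row of a SPARQL result table.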


Examples

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?c
WHERE
{
  ?c rdf:type rdfs:Class .
}

Retrieves all triples that match the pattern, i.e. where:

◦ the property is rdf:type

◦ the object is rdfs:Class

Which means that it retrieves all classes


Examples (2)

Get all instances of a particular class (e.g. course):

(declaration of rdf, rdfs prefixes omitted for brevity)

PREFIX uni: <http://www.mydomain.org/uni-ns#>
SELECT ?i
WHERE
{
  ?i rdf:type uni:course .
}


Using select-from-where

As in SQL, SPARQL queries have a SELECT-FROM-WHERE structure:

◦ SELECT specifies the projection: the number and order of retrieved data

◦ FROM is used to specify the source being queried (optional)

◦ WHERE imposes constraints on possible solutions in the form of graph pattern templates and boolean constraints

Retrieve all phone numbers of staff members:

SELECT ?x ?y
WHERE
{ ?x uni:phone ?y . }

Here ?x and ?y are variables, and ?x uni:phone ?y represents a resource-property-value triple pattern


Implicit Join

Retrieve all lecturers and their phone numbers:

SELECT ?x ?y
WHERE
{ ?x rdf:type uni:Lecturer ;
     uni:phone ?y . }

Implicit join: the second pattern is restricted to those triples whose subject is the resource bound to the variable ?x

◦ Here we use a syntax shortcut as well: a semicolon indicates that the following triple shares its subject with the previous one


Implicit join (2)

The previous query is equivalent to writing:

SELECT ?x ?y
WHERE
{
  ?x rdf:type uni:Lecturer .
  ?x uni:phone ?y .
}
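The join on the shared variable ?x can be imitated in plain Python to show what the two patterns do together. This is a hand-written sketch for this one query, not a general evaluator; the toy graph, the phone numbers, and the function name are invented for illustration.

```python
# Toy data, invented for this example.
GRAPH = [
    ("uni:949352", "rdf:type", "uni:Lecturer"),
    ("uni:949352", "uni:phone", "555-0101"),
    ("uni:94318", "rdf:type", "uni:Professor"),
    ("uni:94318", "uni:phone", "555-0102"),
]


def lecturers_with_phones(graph):
    # Pattern 1: ?x rdf:type uni:Lecturer
    lecturers = {s for s, p, o in graph
                 if p == "rdf:type" and o == "uni:Lecturer"}
    # Pattern 2: ?x uni:phone ?y, kept only where ?x was bound by pattern 1.
    # Reusing the same variable is what makes the join implicit.
    return [(s, o) for s, p, o in graph
            if p == "uni:phone" and s in lecturers]


rows = lecturers_with_phones(GRAPH)
```

The professor's phone number is dropped because that resource never matched the first pattern (no subclass inference is attempted in this sketch).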


Explicit Join

Retrieve the names of all courses taught by the lecturer with ID 949352:

SELECT ?n
WHERE
{
  ?x rdf:type uni:Course ;
     uni:isTaughtBy :949352 .
  ?c uni:name ?n .
  FILTER (?c = ?x) .
}


Optional Patterns

<uni:lecturer rdf:about="949352">
  <uni:name>Grigoris Antoniou</uni:name>
</uni:lecturer>
<uni:professor rdf:about="94318">
  <uni:name>David Billington</uni:name>
  <uni:email>[email protected]</uni:email>
</uni:professor>

For one lecturer it only lists the name

For the other it also lists the email address


Optional Patterns (2)

All lecturers and their email addresses:

SELECT ?name ?email
WHERE
{ ?x rdf:type uni:Lecturer ;
     uni:name ?name ;
     uni:email ?email .
}


Optional Patterns (3)

The result of the previous query would be:

?name                ?email
David Billington     [email protected]

Grigoris Antoniou is listed as a lecturer, but he has no e-mail address, so he is missing from the result


Optional Patterns (4)

As a solution we can adapt the query to use an optional pattern:

SELECT ?name ?email
WHERE
{ ?x rdf:type uni:Lecturer ;
     uni:name ?name .
  OPTIONAL { ?x uni:email ?email }
}


Optional Patterns (5)

The meaning is roughly “give us the names of lecturers, and if known also their e-mail address”

The result looks like this:

?name                ?email
Grigoris Antoniou
David Billington     [email protected]
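The behaviour of OPTIONAL resembles a left outer join: the mandatory part decides which rows exist, and the optional part only fills in extra columns where it can. The following Python sketch imitates that for this one query. The toy graph types both people as uni:Lecturer for simplicity, and the e-mail address is a made-up placeholder, not the value from the slides.

```python
# Toy data; both people are typed uni:Lecturer here for simplicity,
# and the e-mail address is a placeholder invented for the example.
GRAPH = [
    ("uni:949352", "rdf:type", "uni:Lecturer"),
    ("uni:949352", "uni:name", "Grigoris Antoniou"),
    ("uni:94318", "rdf:type", "uni:Lecturer"),
    ("uni:94318", "uni:name", "David Billington"),
    ("uni:94318", "uni:email", "david@example.org"),
]


def values(graph, subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def names_and_emails(graph):
    rows = []
    for x, p, o in graph:
        if p == "rdf:type" and o == "uni:Lecturer":
            for name in values(graph, x, "uni:name"):
                # OPTIONAL: keep the row even when no e-mail triple matches,
                # leaving ?email unbound (None here).
                for email in values(graph, x, "uni:email") or [None]:
                    rows.append((name, email))
    return rows


rows = names_and_emails(GRAPH)
```

Without the `or [None]` fallback the function would behave like the earlier query and silently drop the lecturer who has no e-mail address.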


Summary

RDF provides a foundation for representing and processing metadata

RDF has a graph-based data model

RDF has an XML-based syntax to support syntactic interoperability

◦ XML and RDF complement each other because RDF supports semantic interoperability

RDF has a decentralized philosophy and allows incremental building of knowledge, and its sharing and reuse


Summary (2)

RDF is domain-independent

◦ RDF Schema provides a mechanism for describing specific domains

RDF Schema is a primitive ontology language

◦ It offers certain modelling primitives with fixed meaning

Key concepts of RDF Schema are class, subclass relations, property, subproperty relations, and domain and range restrictions

There exist query languages for RDF and RDFS, including SPARQL


LINKED OPEN DATA


Linked Open Data

Slides Reference:

http://info.slis.indiana.edu/~dingying/


What is now

User-generated content is growing tremendously

Isolated content badly needs to be connected

The world is connected, and so are data, information, and knowledge


Old terms

Data – sensing the world

◦ What you sense (see, hear, smell, touch…)

Information – perceiving the world

◦ Perceive the sensed data

Knowledge – contextualizing information

◦ Comprehend the perceived information

◦ Add context

Context ultimately determines what’s actually what.


What is our daily life

Access data

Manipulate data (add, delete, change)

Process data

◦ Generate information (tables, forms)

◦ Create knowledge (reports, papers..)


Data is our life

Data is our daily bread

Do we have identifiers for data?

◦ Not really important if data is small and individual

◦ Really important if data is huge and connected

Do we need identifiers for our data? Why do we need our names, or social security numbers?

Can you refer to someone without an identifier?

◦ “A person with a good heart” …


Make our busy life less messy

We only have 24 hours per day, not more

Add identifiers to our data

◦ Give one universally agreed, unique identifier to each piece of data: the perfect world of our dreams

◦ We would not have any integration problems, and most IT departments could be closed

◦ Different groups give different identifiers to the same data: we can live with that, it is closer to our daily reality, and standardization bodies and IT staff are helping us

We are happy that we can refer to data


Where are our data

In computers

On the Web

In my paper notes

In printed books

Data are being digitized and made available online

→ Web Data


Web data

Data on the Web

◦ Online journal

◦ Blog

◦ Wiki

◦ …

Data in the physical world

◦ Yourself

◦ Table

◦ Book in library

◦ Computer you are using

◦ …

The boundary is blurring

◦ A paper is both in your hand and on the Web


How to refer to data

Web data

◦ DOI (Digital Object Identifier)

◦ OpenID (people, …)

◦ URI (blog, wiki, homepage, …)

◦ …


URI (Uniform Resource Identifier)

To identify or name a resource on the Internet

The main purpose is to enable interaction with representations of the resource over a network, typically WWW, using specific protocols (from Wikipedia)

◦ URN – like a person’s name, e.g. urn:isbn:0-486-27557-4 (the book “Romeo and Juliet”)

◦ URL – like a street address, e.g. http://www.slis.indiana.edu
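The difference between the two kinds of URI shows up in the scheme, which Python's standard urllib.parse module can extract. The classify helper below is a hypothetical, deliberately narrow illustration: it treats only HTTP(S) URIs with a host as URLs, although other URL schemes exist.

```python
from urllib.parse import urlparse


def classify(uri):
    """Very rough classification of a URI by its scheme (illustrative only)."""
    parts = urlparse(uri)
    if parts.scheme == "urn":
        return "URN"    # a name: says what the thing is, not where it lives
    if parts.scheme in ("http", "https") and parts.netloc:
        return "URL"    # a location: names a host that can be contacted
    return "other"


kinds = {
    uri: classify(uri)
    for uri in ("urn:isbn:0-486-27557-4", "http://www.slis.indiana.edu")
}
```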


Linked Data

A term coined by Tim Berners-Lee

It describes HTTP-based Data Access by Reference for the Web

The current Web is changing from hypertext links (linking documents) to hyperdata links (linking data)

◦ Data are small components of the resources

◦ It drills deep to the details of the resources

Linked data provides a powerful mechanism for meshing disparate and heterogeneous data


Vision from Sir Berners-Lee

“The Semantic Web isn’t just about putting data on the web. It is about making links”.

Four Rules for linking data

◦ Use URIs as names for things

◦ Use HTTP URIs so that people can look up those names

◦ When someone looks up a URI, provide useful information (URI dereferencing)

◦ Include links to other URIs, so that they can discover more things

“Breaking them does not destroy anything, but misses an opportunity to make data interconnected. This in turn limits the ways it can later be reused in unexpected ways. It is the unexpected re-use of information which is the value added by the web”
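The second and third rules can be illustrated by preparing an HTTP lookup for a URI. The sketch below only builds the request object with Python's standard urllib and does not send it. Asking for RDF/XML through an Accept header is an assumption about how a server might be configured, and the URI and function name are hypothetical.

```python
from urllib.parse import urlparse
from urllib.request import Request


def build_lookup_request(uri, media_type="application/rdf+xml"):
    """Prepare (but do not send) an HTTP request that looks up a URI.

    Rule 2: only HTTP(S) URIs can be looked up this way.
    Rule 3: the Accept header asks the server for machine-readable data;
    whether a given server honours it is an assumption.
    """
    if urlparse(uri).scheme not in ("http", "https"):
        raise ValueError(f"not an HTTP URI, cannot be looked up: {uri}")
    return Request(uri, headers={"Accept": media_type})


# Hypothetical URI naming a person.
request = build_lookup_request("http://example.org/people/albert")
```

Passing the prepared request to urllib.request.urlopen would perform the actual dereferencing over the network.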


W3C SWEO Linking Open Data Project

Project aims to

◦ Publish existing openly licensed datasets as linked data on the web

◦ Interlink things between different data sources

◦ Develop clients and applications that consume linked data from the web


Bubbles in May 2007

Over 500M RDF triples

Around 120K RDF links between data sources


Bubbles in April 2008

>2B RDF triples

Around 3M RDF links


2011


2017


What are Linked Data?

Linked Data require RDF

◦ Why not XML?

◦ Different model theory

But not all RDF data are linked data

◦ Your RDF data must comply with the four rules stated by Berners-Lee


Linked data

Linked data are just RDF triples

How can I get RDF triples?

◦ Relational database: D2R tools can convert them for you

◦ RDFizers from SIMILE: can convert JPEG, MARC/MODS, OAI-PMH, OCW (MIT OpenCourseWare), Email, BibTeX, Java, Javadoc, etc. to RDF

<rdf:Description rdf:about="http://example.org/smith#albert">
  <fam:hasChild rdf:resource="http://example.org/smith#brian"/>
  <fam:hasChild rdf:resource="http://example.org/smith#carol"/>
</rdf:Description>
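To see that such a description really is just a set of triples, the fragment can be read with Python's standard XML parser. This is a minimal sketch that handles only this simple shape of RDF/XML and is no substitute for a real RDF parser. The wrapping rdf:RDF element and the fam: namespace URI are assumptions added so that the fragment parses.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FAM_NS = "http://example.org/family#"  # assumed namespace for the fam: prefix

# The fragment, wrapped in rdf:RDF with namespace declarations.
RDF_XML = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:fam="{FAM_NS}">
  <rdf:Description rdf:about="http://example.org/smith#albert">
    <fam:hasChild rdf:resource="http://example.org/smith#brian"/>
    <fam:hasChild rdf:resource="http://example.org/smith#carol"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Turn rdf:Description elements into (subject, predicate, object) tuples."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes tags as "{namespace}local";
            # dropping the braces yields the full predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, predicate, obj))
    return triples


triples = extract_triples(RDF_XML)
```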


Rules of thumb

Understand your data

◦ What do you want to have in your data

◦ Do not reinvent – REUSE!

◦ Potential ontologies/vocabularies: FOAF, Dublin Core, SKOS

◦ URI Aliases

◦ Different URIs for the same non-information resource (Berlin, etc.)

◦ owl:sameAs to link these URI aliases
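One way a consumer of linked data might use owl:sameAs links is to group aliases that denote the same thing. The sketch below treats owl:sameAs as symmetric and transitive and merges aliases with a simple union-find structure. The URIs are invented placeholders, not real dataset identifiers.

```python
def alias_groups(same_as_links):
    """Group URIs connected, directly or indirectly, by owl:sameAs links."""
    parent = {}

    def find(uri):
        # Follow parent pointers up to the representative of the group.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            uri = parent[uri]
        return uri

    for a, b in same_as_links:
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for uri in parent:
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())


# Hypothetical aliases published by three different sources.
LINKS = [
    ("http://example.org/a/Berlin", "http://example.net/cities/berlin"),
    ("http://example.net/cities/berlin", "http://example.com/id/Berlin"),
    ("http://example.org/a/Galway", "http://example.net/cities/galway"),
]

groups = alias_groups(LINKS)
```

Statements made about any alias in a group can then be read as statements about the same resource.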


More principles

Linked Data is simply about using the Web to create typed links between data from different sources.

The principles of Linked Data are to:

◦ Use the RDF data model to publish structured data on the web

◦ Use RDF links to interlink data from different data sources

◦ Use HTTP URIs to identify resources, avoiding other URI schemes (URNs or DOIs)


Power of Linked Data

[Figure: RDF graph linking a FOAF profile to other data sources. Node and edge labels: ying, rdf:type, foaf:Person, foaf:name, Ying Ding, foaf:knows, Stefan, foaf:based_near, db:Galway, dp:population, 72K, skos:subject, dp:Cities_in_Ireland, dp:Dublin, foaf:publication, dblp:publications]


What can LOD bring?

It will lift the current document web up to a data web

LOD browsers let you navigate between different data sources by following RDF links.

It can drill down to a finer granularity of information

◦ allowing finer-grained search on the web

◦ making question-answering search on the Web possible

◦ mashing up different data through RDF links

◦ making it easier to build applications on top


Document Web vs. Data Web

Document Web

◦ Glued by hyperlinks

◦ Data are HTML pages

◦ Query results are HTML pages, which cannot be processed further

◦ Data are just interlinked, but not integrated

◦ Data access through different APIs

Data Web

◦ Glued by RDF links

◦ Data are RDF triples

◦ Query results are RDF triples, which can easily be processed further (e.g., by web services)

◦ Data are interlinked and integrated, and links are typed

◦ Data access through a single, standardized access mechanism (perhaps to be called the LOD API in the future?)