metadata and ontologies

Introduction and Theoretical Foundations of New Media: Metadata and Ontologies

Upload: david-lamas

Post on 06-May-2015


DESCRIPTION

Slides from the Introduction and Theoretical Foundations of New Media course of the Interactive Media and Knowledge Environments master program (Tallinn University).

TRANSCRIPT

Page 1: Metadata and ontologies

Introduction and Theoretical Foundations of New Media

Metadata and Ontologies


Page 2: Metadata and ontologies

David Lamas, TLU, 2011


Contents

Metadata

Ontologies

Folksonomies

The semantic web

The internet of things

Page 3: Metadata and ontologies

Metadata

Page 4: Metadata and ontologies

Metadata

So, why is metadata relevant?

Or… why should we care about metadata?

Page 5: Metadata and ontologies

Metadata

As a concept, metadata is not new

Metadata has long been used for managing document collections such as the ones kept by libraries

But the term itself was only coined in 1968

By Philip Bagley, a pioneer of computerized document retrieval

Page 6: Metadata and ontologies

Metadata

Literally, a set of data that describes and gives information about other data

Metadata in our context is:

Machine readable

Descriptive

For the purposes of resource…

Discovery

Management

Delivery

Access control

Use

Re-use

Long term preservation

Page 7: Metadata and ontologies

Metadata

Or in other words, metadata allows for the description of the…

Definition;

Structure; and

Administration

of selected resources, with all contents in context, to ease further use of the resource

Page 8: Metadata and ontologies

MARC

Or… Machine-Readable Cataloging

Is still the main metadata standard in the library world, although it is not a full cataloguing scheme, being…

Page 9: Metadata and ontologies

UDC, AACR2 and RDA

Universal Decimal Classification

A multilingual classification scheme for all fields of knowledge

Available at… http://www.udcc.org/udcsummary/php/index.php

Anglo-American Cataloguing Rules

For use in the construction of catalogues

Available at… http://www.aacr2.org/

Resource Description and Access

Available at… http://www.rda-jsc.org/rda.html

Page 10: Metadata and ontologies

Z39.50, SRW and SRU

Z39.50

A client–server protocol for searching and retrieving information, widely used in library environments

Search & Retrieve Web Service

An intended standard web-based text-searching interface

Search/Retrieval via URL

A standard XML-focused search protocol for Internet search queries, which uses the Contextual Query Language
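As a rough illustration of what a Search/Retrieval via URL request looks like, the sketch below assembles a searchRetrieve URL that carries a Contextual Query Language query. The endpoint is hypothetical, and the function name and default values are assumptions made for this example only.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the SRU base URL of a real catalogue.
BASE_URL = "https://catalogue.example.org/sru"


def build_sru_url(cql_query, max_records=10, base_url=BASE_URL):
    """Return a searchRetrieve URL whose query parameter holds a CQL query."""
    params = {
        "operation": "searchRetrieve",   # the SRU search operation
        "version": "1.2",                # protocol version assumed here
        "query": cql_query,              # the CQL expression
        "maximumRecords": str(max_records),
    }
    # urlencode percent-encodes spaces, quotes and other reserved characters.
    return base_url + "?" + urlencode(params)


url = build_sru_url('dc.title = "metadata"')
print(url)
```

The server would answer with an XML document listing the matching records; parsing that response is left out of this sketch.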

Page 11: Metadata and ontologies

But…

This should not bother you other than to note that…

Metadata tends to get more complicated the longer you think about it

Page 12: Metadata and ontologies

As for the web…

It was recognized early on that finding what you need was going to become difficult

We’re talking about the mid-nineties, when the web’s size was referred to in terms of tens of thousands

Users, mainly information science specialists, began trying to catalogue it by hand

Do you remember Yahoo’s earlier versions?

Page 13: Metadata and ontologies

As for the web…

The first search engines appeared and authors began to realize that the metadata they embedded into web pages might be important

<html>
<head>
<title>A web page</title>
<meta name="keywords" content="some, key, words" />
<meta name="description" content="a summary" />
</head>
<body>
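To see how an early indexer might have read this embedded metadata, here is a minimal sketch using Python's standard html.parser module. The class name MetaExtractor and the completed sample page are assumptions for illustration; real search engines did far more than this.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the <title> text and the name/content pairs of <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if "name" in attributes:
                self.meta[attributes["name"]] = attributes.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# The slide's page, closed off so that it forms a complete document.
PAGE = """<html>
<head>
<title>A web page</title>
<meta name="keywords" content="some, key, words" />
<meta name="description" content="a summary" />
</head>
<body></body>
</html>"""

extractor = MetaExtractor()
extractor.feed(PAGE)
print(extractor.title)
print(extractor.meta)
```

Note that nothing here checks whether the keywords or the description are truthful, which is exactly the weakness the next slide turns to.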

Page 14: Metadata and ontologies

As for the web…

Then came Google

And metadata lost some relevance, as Google’s PageRank algorithm takes note of links between pages but places less emphasis on embedded metadata to avoid…

Metaspam

<meta name="description" content="a summary" />

Metacrap

<title>put your title here</title>

Page 15: Metadata and ontologies

Dublin Core

Despite the initial drawbacks, work continued on embedded metadata, and the Dublin Core was and still is one of the main players with its 15 elements…

Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights

…embedded into web pages or encoded using XML

The initial intention was to improve indexing by search engines

But whereas its promoters forgot about metaspam and metacrap, the search engines didn’t

And so, the main search engines still ignore embedded metadata
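As one possible illustration of the XML encoding, the sketch below builds a small record from the 15 element names using Python's standard xml.etree.ElementTree. The namespace URI is the one published for the Dublin Core element set; the wrapper element name "record", the function name and the sample values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The 15 elements listed on the slide, in lower case as used in XML.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_record(fields):
    """Build an XML record from a dict keyed by Dublin Core element name."""
    record = ET.Element("record")  # wrapper element name is an assumption
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


record = dublin_core_record({
    "title": "Metadata and Ontologies",
    "creator": "David Lamas",
    "date": "2011",
    "language": "en",  # illustrative value
})
print(ET.tostring(record, encoding="unicode"))
```

Every element is optional and repeatable in Dublin Core, so a dict is a simplification: it allows only one value per element.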

Page 16: Metadata and ontologies


Dublin Core

Page 17: Metadata and ontologies

Metadata

Remarkably, there has been fairly widespread adoption of metadata principles, especially in policy terms, namely in government

(look into http://www.esd.org.uk/standards/egms/viewer/viewer.aspx for an interesting example)

And in:

Education

Health

Cultural heritage

Environmental agencies, and…

Libraries, of course

Page 18: Metadata and ontologies

Metadata

This resulted in the…

Growth of metadata cataloguing rules

(although every community has its own rules)

Growth in use of additional elements for particular communities

(and again, every community’s additions are different)

Adoption of application profiles to document the distinct cataloguing rules and additions

Institution of the Dublin Core Metadata Initiative as an organization engaged in the development of interoperable metadata standards that support a broad range of purposes and business models

Page 19: Metadata and ontologies

Metadata

But the Dublin Core isn’t alone, far from it

Many other standards were and are being developed, such as these, just to name two:

RDF (Resource Description Framework)

LOM (Learning Object Metadata)

Page 20: Metadata and ontologies

Resource Description Framework

Developed by the W3C, the Resource Description Framework (RDF) is the envisioned standard for the semantic web

Its goal is to allow software to automatically navigate and reason about web content, thus enabling…

A web of (linked) data

Page 21: Metadata and ontologies


Resource Description Framework

Page 22: Metadata and ontologies

Learning Object Metadata

Learning Object Metadata is a data model

Usually encoded in XML, it is used to describe learning objects and similar digital resources used to support learning.

Page 23: Metadata and ontologies


Learning Object Metadata

Page 24: Metadata and ontologies

Metadata

As said in the beginning…

Metadata tends to get more complicated the longer we think about it

The lack of coherence and cohesion in current metadata efforts, both within standards and within communities, is a good example

And that is why we will next look into ontologies

So… do we care about metadata?

Why are we interested?

Page 25: Metadata and ontologies

Metadata

I guess the answer is yes, we care.

And yes, we are interested, because metadata is everywhere

Sometimes it is explicitly available,

Other times it is hidden or not so readily available, but anyway…

It would be foolish not to make use of it

Page 26: Metadata and ontologies

Metadata

Further, there is increasing pressure to expose metadata on the web for others to mash up, and this is especially true today in settings such as…

Education;

Research; and

Government

And finally, metadata becomes paramount in scenarios where

content is data; or

the required information cannot easily be derived from content

Page 27: Metadata and ontologies

Ontologies

Page 28: Metadata and ontologies

Ontologies

One way of dealing with the lack of coherence and cohesion in current metadata efforts, both within standards and within communities, is to evolve to an ontology-based metadata approach

But what does this mean?

Page 29: Metadata and ontologies

Ontologies

An ontology is a logical theory which gives an explicit partial account of a conceptualization

An intensional semantic structure which encodes the implicit rules constraining the structure of a piece of reality

In this light, the aim of an ontology is to define which primitives, provided with their associated semantics, are necessary for knowledge representation in a given context

Thomas R. Gruber (1993). Toward principles for the design of ontologies used for knowledge sharing. Originally in N. Guarino and R. Poli (Eds.), International Workshop on Formal Ontology, Padova, Italy. Revised August 1993. Published in International Journal of Human-Computer Studies, Volume 43, Issue 5-6, Nov./Dec. 1995, Pages 907-928, special issue on the role of formal ontology in information technology.

Page 30: Metadata and ontologies

Ontologies

Ontologies are usually characterized by their…

Coverage

The extent to which the primitives mobilized by the perceived usage scenarios are covered by the ontology

Specificity

The extent to which ontological primitives are precisely identified

Granularity

The extent to which primitives are precisely and formally defined

Formality

The extent to which primitives are described in a formal language

Page 31: Metadata and ontologies

Ontologies

And ontologies are not… taxonomies

But a taxonomy might be perceived as a specific case of an ontology

A taxonomy is a particular classification arranged in a hierarchical structure

Typically it is organized by supertype/subtype relationships, also called generalization/specialization relationships
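A taxonomy of this kind can be sketched as a mapping from each subtype to its direct supertype. The terms below are toy data invented for this example, and the function names are assumptions.

```python
# Each entry maps a subtype to its direct supertype (toy data, assumed).
SUPERTYPE = {
    "novel": "book",
    "textbook": "book",
    "book": "document",
    "web page": "document",
    "document": "resource",
}


def ancestors(term):
    """Return the chain of supertypes from the term up to the root."""
    chain = []
    while term in SUPERTYPE:
        term = SUPERTYPE[term]
        chain.append(term)
    return chain


def is_a(term, candidate):
    """True when candidate is the term itself or one of its supertypes."""
    return term == candidate or candidate in ancestors(term)


print(ancestors("novel"))
print(is_a("textbook", "document"))
print(is_a("web page", "book"))
```

The only relationship this structure can express is generalization/specialization, which is why a taxonomy reads as a restricted case of an ontology.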

Page 32: Metadata and ontologies


Why ontologies?

Pipe

Page 33: Metadata and ontologies


Why ontologies?

Pipe

Page 34: Metadata and ontologies


Why ontologies?

Pipe

Page 35: Metadata and ontologies

Why ontologies?

In short, we interpret, machines don’t

As such, an effort must be undertaken in order to support adequate usage of digital resources

So, what’s missing?

Among others…

The possibility to share a common understanding of the structure of information within a specific domain

The possibility to reuse domain knowledge

The possibility to make domain assumptions explicit

The possibility to analyze domain knowledge

Page 36: Metadata and ontologies

Ontologies and the web

It is estimated that by 2010…

70% of public web pages will have some level of metadata, but only

20% will use more extensive semantic web approaches such as ontology-based metadata

But why should we care?

http://www.afsg.nl/InformationManagement/images/nieuws/finding%20and%20exploiting%20value%20of%20semantic%20tech%20on%20web.pdf

Page 37: Metadata and ontologies

Ontologies and the web

An emerging ontological approach is OWL or…

Web Ontology Language

A vocabulary extension of the Resource Description Framework, which adds more vocabulary for describing characteristics of properties and classes or relations between classes

Page 38: Metadata and ontologies

Web Ontology Language

OWL enables ontology-based information sharing and manipulation together with RDF and XML

In reverse order…

XML allows users to add arbitrary structure to their documents but says nothing about what such structures mean

RDF enables expression of meaning over XML (and other) structures

Using subject, verb and object triples

OWL enables machines to comprehend semantic documents and data
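The subject, verb and object triples can be pictured with a very small in-memory store. This is only a sketch of the triple idea: real RDF identifies resources and properties with IRIs, whereas the plain strings, the property names and the match function below are assumptions for illustration.

```python
# A toy triple store: each statement is a (subject, verb, object) tuple.
TRIPLES = {
    ("slides", "creator", "David Lamas"),
    ("slides", "title", "Metadata and Ontologies"),
    ("slides", "partOf", "course"),
    ("course", "title", "Introduction and Theoretical Foundations of New Media"),
}


def match(subject=None, verb=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (subject is None or t[0] == subject)
        and (verb is None or t[1] == verb)
        and (obj is None or t[2] == obj)
    )


print(match(subject="slides"))
print(match(verb="title"))
```

Because every statement has the same three-part shape, statements from different sources can be merged into one set and queried together, which is the property a web of linked data relies on.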

Page 39: Metadata and ontologies

Web Ontology Language

http://www.w3.org/TR/owl-features/

Page 40: Metadata and ontologies

Ontologies

This said, and while addressing some of the weaknesses of current metadata efforts, present-day ontologies still largely depend on explicit human intervention to be useful

And that is why we will next look into folksonomies

Page 41: Metadata and ontologies

Folksonomies

Page 42: Metadata and ontologies

Folksonomies

Are mainly a bottom-up social classification system

A way to organize and share contents by tagging resources

Synonyms are…

Ethno-classification; and

Collaborative tagging

Page 43: Metadata and ontologies

Folksonomies

Folksonomies are created by users and have…

No structure

No fixed vocabulary

No explicit relationships between terms, and

No authority

Page 44: Metadata and ontologies

Folksonomies

Folksonomies also are…

Distributed, and

Collaboratively built and maintained

You can tag items owned by others

You can get instant feedback

All items for the same tag

All tags for the same item

You can adapt your tags to the group norm

But you are never forced
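The instant feedback described above (all items for the same tag, all tags for the same item) amounts to keeping two indexes over the same tag assignments. The class below is a minimal sketch; its name, its methods and the sample users and items are assumptions for illustration.

```python
from collections import defaultdict


class Folksonomy:
    """Store (user, item, tag) assignments and index them both ways."""

    def __init__(self):
        self.assignments = set()
        self.items_by_tag = defaultdict(set)
        self.tags_by_item = defaultdict(set)

    def tag(self, user, item, label):
        # Any user may tag any item, including items owned by others.
        self.assignments.add((user, item, label))
        self.items_by_tag[label].add(item)
        self.tags_by_item[item].add(label)

    def items_for(self, label):
        """All items carrying the given tag."""
        return sorted(self.items_by_tag.get(label, ()))

    def tags_for(self, item):
        """All tags attached to the given item, by any user."""
        return sorted(self.tags_by_item.get(item, ()))


folksonomy = Folksonomy()
folksonomy.tag("ana", "photo-1", "tallinn")
folksonomy.tag("ben", "photo-1", "oldtown")
folksonomy.tag("ben", "photo-2", "tallinn")
print(folksonomy.items_for("tallinn"))
print(folksonomy.tags_for("photo-1"))
```

There is no vocabulary check anywhere in tag(), which mirrors the point that users may follow the group norm but are never forced to.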

Page 45: Metadata and ontologies

Folksonomies

Some of their apparent benefits are…

Being cheap and easy to build and use

Being able to adapt very quickly to changes and users’ needs

Scaling well

Fostering serendipity

Semantic browsing instead of searching

Lowering the cooperation barriers

Page 46: Metadata and ontologies

Folksonomies

But they have limits such as…

Semantic ambiguity

Polysemy, synonymy, cardinality and the use of acronyms

Syntax free

Spaces and multiple words are used without rules

Language

Different languages can be used for the same tag

Being eventually shortsighted

Fail to depict the general overview

Lack of (or minimal) structure

No explicit relationships between otherwise related tags
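The "syntax free" limit is the easiest one to picture in code. The sketch below, with an assumed normalize function and invented sample tags, shows how spelling variants fragment a tag count and how a simple normalization merges them. It does nothing for synonymy, polysemy or tags written in different languages.

```python
import re
from collections import Counter


def normalize(tag):
    """Lower-case a tag and drop spaces, hyphens and underscores."""
    return re.sub(r"[\s_-]+", "", tag.strip().lower())


# Invented sample: four spellings of one idea plus an unrelated tag.
raw_tags = ["Semantic Web", "semantic-web", "semanticweb", "SemanticWeb", "web"]

print(Counter(raw_tags))
print(Counter(normalize(tag) for tag in raw_tags))
```

Before normalization each spelling counts as its own tag; afterwards the four variants collapse into one, while the unrelated tag stays separate.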

Page 47: Metadata and ontologies


Folksonomies and ontologies

Folksonomies

Domains

Large corpus

Informal categories

Unstable entities

Unclear edges

Participants

Naïve cataloguers

No authority

Uncoordinated users

Amateur users

Critical mass needed

Ontologies

Domains

Small corpus

Formal categories

Stable entities

Restricted entities

Clear edges

Participants

Expert cataloguers

Authoritative sources of judgment

Coordinated users

Expert users

Page 48: Metadata and ontologies

Folksonomies and ontologies

How do we choose?

Folksonomies are useful when all that is needed is the ability to link items to topics

Ontologies are useful when what is needed is to formally define meaning

But… do we need to choose?

Not really, at least that is what current research is exploring

Page 49: Metadata and ontologies

Folksonomies and ontologies

Research directions include

The combination of the folksonomy and ontology approaches into a hybrid system where the most consensual constructs would last while others would be forgotten or redefined

An approach that combines the ease and adaptability of folksonomy with the formality and semantic richness of an ontology

Quantitative tag analysis and qualitative use analysis in current online social networking services

To understand whether tag usage converges or not

To understand how a folksonomy is formed

To… any ideas?

Page 50: Metadata and ontologies

Semantic web

Page 51: Metadata and ontologies


Semantic Web

The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help

One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well-defined meanings (in at least some terms) for its columns, the structure of the data is not evident to a robot browsing the web

Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form.

Page 52: Metadata and ontologies

Internet of things

Page 53: Metadata and ontologies

The internet of things

The internet of things might be described as a self-configuring wireless network of sensors whose purpose would be to interconnect all things

And the concept is attributed to the former Auto-ID Center, founded in 1999 and based at the time at MIT

An alternative view focuses instead on making all things addressable by the existing naming protocols

In this view, objects themselves do not interact, but they may now be referred to by other agents, such as centralized servers acting for their human users

Page 54: Metadata and ontologies


Metadata and Ontologies recap

Metadata

Ontologies

Folksonomies

The semantic web

The internet of things