Semantic Web and Linked Data for cultural heritage materials: Approaches in Europeana. Antoine Isaac (Vrije Universiteit Amsterdam, Europeana). DANS Linked Data and RDF workshop, Den Haag, July 28th 2010

Page 1: Semantic Web and Linked Data for cultural heritage materials  Approaches in Europeana

Semantic Web and Linked Data for cultural heritage materials

Approaches in Europeana

Antoine Isaac

Vrije Universiteit Amsterdam, Europeana

DANS Linked Data and RDF workshop, Den Haag, July 28th 2010

Page 2

A web of cultural heritage data?

?

Page 3

?

Page 4

The current portal

Page 5
Page 6

Towards semantic search: facets

Page 7

• Building a search engine on top of metadata is difficult
– Intrinsic quality problems: correctness, coverage

• Especially when the data is so heterogeneous
– Hundreds of formats
– From flat 5-field records to 100-node XML trees
– Language issue!

• We currently use a simple interoperability format
– A quick win that is quickly showing its limits

Page 8

• We can make better use of institutions’ original metadata

• Accommodate their different practices
– Data structures and semantics

• Access objects via a semantic layer of vocabularies for subjects, persons, places…

• Semantic ThoughtLab: experimenting with solutions

Page 9

Towards semantics-enabled search

Building a "semantic layer" to help access content

Page 10

Towards semantics-enabled search

• Enhance access to Europeana content through semantics
– Query expansion, clustering of results

• Exploit various types of relations
– "located in", "lived in", "is more specific concept"…

• Semantics are already there, in the metadata and in the "controlled vocabularies" used in the metadata
– Thesauri, classifications…

• This requires making them properly machine-accessible

Page 11

Prototype: Europeana Thought Lab

http://europeana.eu/portal/thought-lab.html

Page 12

Semantic auto-completion

Page 13

Clustering of results

Page 14

Baseline: matching concept labels

Controlled place name from a vocabulary at the Rijksmuseum

Metadata for the object

Page 15

A "more specific Egypte"?

Page 16

A "more specific Egypte"?

Metadata for the object

Page 17

A place more specific than Egypt

Semantic information on the Giza place in the Rijksmuseum vocabulary

Page 18

Following other relations

Page 19

Following other relations - creator

Metadata for the object

Controlled person name from a vocabulary at the Rijksmuseum

Page 20

Following other relations - match

Information on Gustave Le Gray from the Rijksmuseum vocabulary

Matched to a "Gustave Le Gray" from another vocabulary

Page 21

Following other relations - death place

Information on Gustave Le Gray from the Union List of Artist Names (Getty)

Page 22

Following other relations - death place

Information on Cairo from the Thesaurus of Geographic Names (Getty)

Matched to "Cairo" from another vocabulary…

Page 23

• A hell of relations?

• Well, they were in the original data; we just had to make them explicit!

• Cultural heritage institutions often have a wealth of metadata to share and exploit

Page 24

Enabling bits & pieces

• Exploiting semantic links in CH vocabularies
– Rijksmuseum thesaurus: concept “Giza” is narrower than concept “Egypte”

• Mapping/alignment between CH vocabularies
– Louvre’s “Égypte” is equivalent to Rijksmuseum’s “Egypte”

• Enrichment of existing metadata
– The string “Egypt” in a metadata record indicates the concept of Egypt defined in the Rijksmuseum thesaurus
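
The three mechanisms on this slide can be illustrated with a minimal Python sketch. It is not the Europeana implementation: the identifiers (`rma:Egypte`, `louvre:Egypte`, `rma:Giza`), the sample records and the function names are hypothetical, chosen only to show how hierarchy, alignment and enrichment combine into query expansion.

```python
# Hypothetical mini-vocabularies; all identifiers are illustrative assumptions.
NARROWER = {"rma:Egypte": ["rma:Giza"]}            # semantic links inside one thesaurus
EXACT_MATCH = {"louvre:Egypte": "rma:Egypte"}      # alignment between two vocabularies
LABEL_TO_CONCEPT = {                               # lookup used to enrich plain strings
    "egypt": "rma:Egypte",
    "egypte": "rma:Egypte",
    "giza": "rma:Giza",
}

RECORDS = [
    {"id": "obj1", "place": "Giza"},
    {"id": "obj2", "place": "Egypt"},
    {"id": "obj3", "place": "Paris"},
]


def enrich(record):
    """Attach a concept identifier when the place string matches a known label."""
    concept = LABEL_TO_CONCEPT.get(record["place"].lower())
    return {**record, "place_concept": concept}


def expand(concept):
    """Return the concept plus every concept reachable through narrower links."""
    start = EXACT_MATCH.get(concept, concept)  # follow the alignment first
    seen, stack = set(), [start]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(NARROWER.get(current, []))
    return seen


def search(concept, records):
    """Return the ids of records whose enriched place falls under the concept."""
    wanted = expand(concept)
    return [r["id"] for r in map(enrich, records) if r["place_concept"] in wanted]


print(search("louvre:Egypte", RECORDS))
```

A query phrased with the Louvre concept retrieves both the record labelled "Egypt" and the one labelled "Giza", although neither record mentions the Louvre vocabulary.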

Page 25

SKOS, Knowledge Organization Systems and Linked Data

SKOS allows representing (simple) KOS data as RDF

animals
  NT cats

cats
  UF domestic cats
  RT wildcats
  BT animals
  SN used only for domestic cats

domestic cats
  USE cats

wildcats
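
As a rough illustration of what "representing KOS data as RDF" means, the sketch below turns the thesaurus entries above into SKOS-style triples. The `ex:` prefix and the helper names are assumptions made for the example; the mapping of thesaurus codes to SKOS properties (BT/NT/RT to `skos:broader`/`skos:narrower`/`skos:related`, UF to `skos:altLabel`, SN to `skos:scopeNote`) follows the usual correspondence.

```python
# Thesaurus entries from the slide. The "domestic cats USE cats" entry is the
# inverse of "cats UF domestic cats", so it becomes an alternative label of
# "cats" rather than a concept of its own.
THESAURUS = {
    "animals": {"NT": ["cats"]},
    "cats": {
        "UF": ["domestic cats"],
        "RT": ["wildcats"],
        "BT": ["animals"],
        "SN": ["used only for domestic cats"],
    },
    "wildcats": {},
}

CODE_TO_SKOS = {
    "NT": "skos:narrower",
    "BT": "skos:broader",
    "RT": "skos:related",
    "UF": "skos:altLabel",
    "SN": "skos:scopeNote",
}
LITERAL_PROPS = {"skos:altLabel", "skos:scopeNote"}


def concept_id(term):
    """Build an identifier; "ex:" is a made-up namespace prefix."""
    return "ex:" + term.replace(" ", "_")


def to_triples(thesaurus):
    """Convert thesaurus entries into (subject, property, object) triples."""
    triples = []
    for term, relations in thesaurus.items():
        subject = concept_id(term)
        triples.append((subject, "rdf:type", "skos:Concept"))
        triples.append((subject, "skos:prefLabel", f'"{term}"'))
        for code, values in relations.items():
            prop = CODE_TO_SKOS[code]
            for value in values:
                obj = f'"{value}"' if prop in LITERAL_PROPS else concept_id(value)
                triples.append((subject, prop, obj))
    return triples


for s, p, o in to_triples(THESAURUS):
    print(s, p, o, ".")
```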

Page 26

SKOS, KOSs and LD

SKOS allows bridging across KOSs from different contexts

http://www.w3.org/2004/02/skos/

Page 27

SKOS is used!

• Many libraries, not a surprise!
– Swedish National Library’s Libris catalogue and thesaurus http://libris.kb.se/
– Library of Congress’ vocabularies, including LCSH http://id.loc.gov/
– DNB’s Gemeinsame Normdatei (incl. SWD subject headings) http://d-nb.info/gnd/ (documentation at https://wiki.d-nb.de/display/LDS)
– BnF’s RAMEAU subject headings http://stitch.cs.vu.nl/
– OCLC’s DDC classification http://dewey.info/ and VIAF http://viaf.org/
– STW economy thesaurus http://zbw.eu/stw
– National Library of Hungary’s catalogue and thesauri http://oszkdk.oszk.hu/resource/DRJ/404 (example)

• Other fields
– Wikipedia categories through DBpedia http://dbpedia.org/
– New York Times subject headings http://data.nytimes.com/
– IVOA astronomy vocabularies http://www.ivoa.net/Documents/latest/Vocabularies.html
– GEMET environmental thesaurus http://eionet.europa.eu/gemet
– UMTHES
– Agrovoc http://aims.fao.org/
– Linked Life Data http://linkedlifedata.com/
– TaxonConcept http://www.taxonconcept.org/
– UK public sector vocabularies http://standards.esd.org.uk/ (e.g., http://id.esd.org.uk/lifeEvent/7 )

Page 28

KOS Alignments?

Quite a few of them are linked to some other resource

• LCSH, SWD and RAMEAU interlinked through MACS mappings
• GND linked to DBpedia and VIAF
• Libris linked to LCSH
• Agrovoc to CAT, NAL, SWD, GEMET
• NYT to Freebase, DBpedia, Geonames
• DBpedia links are overwhelming
– Hungary, STW, TaxonConcept, GND…

Page 29

Enabling bits & pieces (cont'd)

• Appropriate data model for objects
– Generic constructs for creation, title, subject, etc. that are useful for querying

• Flexible data model
– SW ontology linking features make it possible to stay close to the original data while having the generic notions above

Page 30

Formal semantics, metadata schemas and querying

• The query:
?x vra:subject ?y .
?y skos:broader rma:Egypt .

• The existing description:
rma:gezicht_in_cairo rma:depicts rma:Cairo .
rma:Cairo skos:broader rma:Egypt .

• Why is there a match?
– For the Europeana ontology, every rma:depicts statement implies a vra:subject statement
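
The match on this slide can be reproduced with a small sketch. It assumes the diagram shows a description in which rma:gezicht_in_cairo depicts rma:Cairo and rma:Cairo has rma:Egypt as broader concept, and a query asking for any ?x with a vra:subject ?y whose broader concept is rma:Egypt. The single-step inference below is a simplification of rdfs:subPropertyOf reasoning, not a full RDFS reasoner, and the function names are invented for the example.

```python
# Assumed ontology statement: rma:depicts rdfs:subPropertyOf vra:subject
SUBPROPERTY_OF = {"rma:depicts": "vra:subject"}

DESCRIPTION = {
    ("rma:gezicht_in_cairo", "rma:depicts", "rma:Cairo"),
    ("rma:Cairo", "skos:broader", "rma:Egypt"),
}


def infer(triples):
    """Add the statements implied by the sub-property links (one step, no chains)."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in SUBPROPERTY_OF:
            inferred.add((s, SUBPROPERTY_OF[p], o))
    return inferred


def query(triples):
    """Match the pattern: ?x vra:subject ?y . ?y skos:broader rma:Egypt"""
    return sorted(
        (x, y)
        for x, p1, y in triples
        if p1 == "vra:subject"
        for y2, p2, z in triples
        if y2 == y and p2 == "skos:broader" and z == "rma:Egypt"
    )


print(query(DESCRIPTION))         # nothing matches the raw description
print(query(infer(DESCRIPTION)))  # the implied vra:subject statement makes it match
```

The original rma:depicts statement is kept as it is; the generic vra:subject statement is only added alongside it, which is what lets a generic query reach institution-specific data.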

Page 31

Where are the challenges?

• Semantic conversion of data
– Using appropriate data models
– Enriching legacy metadata

• Semantic alignments
– Between description ontologies: vra:depicts rdfs:subPropertyOf dc:subject
– Between concepts in controlled vocabularies: iconclass:bird skos:closeMatch ddc:bird

Page 32

Alignment of semantic references

Page 33

Where are the challenges?

• Semantic alignment (cont'd)
– Find correspondences between large vocabularies
– In a multilingual context

• Scalability
– Plugging the semantic features into the Europeana production environment

Page 34

The Europeana Data Model (EDM)

with input from Carlo Meghini, Guus Schreiber, Stefan Gradmann, Makx Dekkers, Steffen Hennicke, Viktor de Boer et al. from Europeana V1

Page 35

Rationale of EDM

• Precursor: ESE (Europeana Semantic Elements)
– Represents the lowest common denominator for object metadata (datasets are converted to a Dublin Core-like standard)
– Forces interoperability
– Major drawback: original metadata is lost
– Most values are simple strings

• EDM goals
– Preserve original data while still allowing for interoperability
– Semantic Web representation

• A community-driven effort
– Core experts, validation by representatives of various CH domains

Page 36

EDM requirements & principles

1. Distinction between “provided object” (painting, book, program) and digital representation

2. Distinction between object and metadata record describing an object

3. Allow for multiple records for same object, containing potentially contradictory statements about an object

4. Support for objects that are composed of other objects

5. Standard metadata format that can be specialized

6. Standard vocabulary format that can be specialized

7. EDM should be based on existing standards

Page 37

EDM basics

• OAI ORE for organization of metadata about an object

• Dublin Core for metadata representation

• SKOS for vocabulary representation

+ Links to CIDOC-CRM and other shared ontologies

Page 38

Dublin Core

• EDM uses the latest version of DCMI Metadata Terms for a core of semantically interoperable properties
– And for backward compatibility, cf. ESE

• Specified with an RDF model
• Specialization of the 15 original DC elements
• Can be specialized itself
– see requirement -> this is a crucial distinction with ESE

• Used in the richest way possible
– Pointers to resources

Page 39

SKOS: vocabulary publication on the Web

• Already seen…

Page 40

OAI ORE

• Specification: http://www.openarchives.org/ore/1.0/toc.html
• Specified with an RDF model
• Four key notions (RDF classes)
– Object: the book/painting/program being described
– Aggregation: organizes object information from a particular provider (museum, archive, library)
– Proxy: the object as viewed in a metadata record
– Digital representation: some digital form of the object with a Web address

Page 41

The Example - 1

Page 42

The Example - 2

Page 43

Aggregation organizes data of a provider

[Diagram labels: aggregation, digital representation, object, provenance metadata]

Page 44

Proxy: metadata record for an object

[Diagram labels: proxy, object metadata]

Page 45

Multiple aggregations = multiple providers

[Diagram labels: aggregation of DMF, aggregation of Louvre]

Page 46

Multiple aggregations = multiple providers

[Diagram labels: DMF proxy, Louvre proxy, DMF title, Louvre title, the “real” painting]
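
The aggregation and proxy notions can be sketched as plain Python data structures. This is only an illustration under stated assumptions: the class and field names, the `ex:painting/1` identifier and the placeholder titles are hypothetical, and the actual OAI ORE and EDM definitions are RDF classes rather than Python classes.

```python
from dataclasses import dataclass, field


@dataclass
class Proxy:
    """The object as viewed in one provider's metadata record."""
    provider: str
    object_uri: str
    metadata: dict


@dataclass
class Aggregation:
    """Groups what a single provider supplies about an object."""
    provider: str
    object_uri: str
    proxy: Proxy
    representations: list = field(default_factory=list)  # Web addresses of digital forms


def titles_for(object_uri, aggregations):
    """Collect every title asserted about an object, keyed by provider."""
    return {
        agg.provider: agg.proxy.metadata["title"]
        for agg in aggregations
        if agg.object_uri == object_uri and "title" in agg.proxy.metadata
    }


PAINTING = "ex:painting/1"  # hypothetical identifier for the "real" painting
aggregations = [
    Aggregation("Louvre", PAINTING, Proxy("Louvre", PAINTING, {"title": "Louvre title"})),
    Aggregation("DMF", PAINTING, Proxy("DMF", PAINTING, {"title": "DMF title"})),
]

print(titles_for(PAINTING, aggregations))
```

Because each title hangs off a provider-specific proxy rather than the painting itself, two providers can make different, even contradictory, statements about the same object without overwriting each other.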

Page 47

Europeana is “just” a special provider with processed/enriched metadata

[Diagram labels: Europeana aggregation, enriched metadata, Europeana landing page]

Page 48

A flexible model: different semantic grains

• Cf. goal: “preserve original data while still allowing for interoperability”
• Keep data expressed as close as possible to the original model
• Use mappings to a more interoperable level

Page 49

A flexible model: objects, events and the rest

• Preserving and exploiting original data also means being compatible with descriptions beyond simple object level

• Also crucial for semantic enrichment

Page 50

A flexible model: object and events (2)

• Classes and Properties for event-, agent-, place-centric modeling

• Instances of (local) vocabularies using skos:Concept

• Using RDF, EDM allows any kind of network to be attached to a provided object.

Page 51

A flexible model: object and events (3)

Page 52

Advanced modeling in EDM

• Relations between provided objects
– Part-whole links for complex (hierarchical) objects
– Derivation and versioning relations
– Relations between provided objects, for instance artistic derivation between works

• ens:isRepresentationOf

• ens:isNextInSequence

Page 53

Linked data and cultural heritage?

Page 54

The case for linked data in cultural heritage

Not just a more sophisticated way to represent data!

• Ease of getting data from external sources
– Just go to the URI and fetch the RDF there

• Ease of publishing data
– Linked data as a dissemination channel for Europeana data

• Ease of linking across datasets
– Linked data as a dissemination channel for Europeana data

• Object identification as cornerstone
– Records are just a side feature!
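
"Going to the URI and fetching the RDF" usually relies on HTTP content negotiation: the client asks for an RDF serialization through the Accept header. The sketch below uses only the Python standard library. The example URI is a placeholder, the function names are invented for the example, and whether a given server answers with RDF depends on how that server is configured.

```python
from urllib.request import Request, urlopen

# Media types for two common RDF serializations, in order of preference.
RDF_ACCEPT = "application/rdf+xml, text/turtle;q=0.9"


def build_rdf_request(uri):
    """Prepare a GET request that asks the server for an RDF representation."""
    return Request(uri, headers={"Accept": RDF_ACCEPT})


def fetch_rdf(uri, timeout=10):
    """Dereference the URI and return (content type, raw bytes) of the response."""
    with urlopen(build_rdf_request(uri), timeout=timeout) as response:
        return response.headers.get("Content-Type"), response.read()


# Building the request does not touch the network; fetch_rdf() does.
request = build_rdf_request("http://example.org/resource/1")  # placeholder URI
print(request.full_url, request.get_header("Accept"))
```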

Page 55

From a movement supported by researchers to much wider awareness

Open government initiatives, libraries…

Continuing effort: show the benefits of collaborating on a cultural heritage data web

Library Linked Data W3C incubator http://www.w3.org/2005/Incubator/lld

Encouraging open linked data adoption

Page 56

Linked Library Cloud beginning 2008

[Ross Singer, Code4Lib2010]

http://code4lib.org/conference/2010/singer

Page 57

Linked Library Cloud mid-2010

Plus:
• Germany NL
• Hungary NL
• STW
• GEMET
• NYT
• Agrovoc

[Ross Singer, Code4Lib2010] http://code4lib.org/conference/2010/singer

Page 58

Is that a surprise?

Not really, let’s have a look at a real-world case…

Page 59

Johan Stapel, Koninklijke Bibliotheek

KOS & collection environment @KB

Page 60

A broad range of datasets
• That describe the same objects
• Or related objects
• Which are about similar subjects
• Which were made by the same persons
• Or related persons
• In the same places
• Etc…

Page 61

Thanks!

Europeana.eu team

Web and Media lab @ Vrije Universiteit Amsterdam
http://wiki.cs.vu.nl/web-media

EuropeanaConnect project
http://www.europeanaconnect.eu/