![Page 1: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/1.jpg)
![Page 2: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/2.jpg)
A distributed network of digital heritage information
UNESCO-NDL INDIA INTERNATIONAL WORKSHOP ON KNOWLEDGE ENGINEERING FOR DIGITAL LIBRARY DESIGN
Enno Meijers / 25 October 2017 / New Delhi - India
![Page 3: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/3.jpg)
A short introduction…
https://www.google.nl/search?q=enno+meijers+netherlands
![Page 4: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/4.jpg)
Contents
1. Introduction to Dutch Digital Heritage Network
2. Evaluating our current digital heritage infrastructure
3. Strategies for improvement
4. Building a distributed network for digital heritage information
![Page 5: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/5.jpg)
1. Introduction to Dutch Digital Heritage Network
![Page 6: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/6.jpg)
![Page 7: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/7.jpg)
![Page 8: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/8.jpg)
![Page 9: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/9.jpg)
![Page 10: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/10.jpg)
![Page 11: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/11.jpg)
The Digital Heritage Network (NDE) aims at
increasing the social value of the heritage
information maintained by libraries, archives,
museums and other cultural heritage institutions.
This strategy offers a perspective on developing
a national, cross-sector infrastructure of digital
heritage facilities.
It focuses on long term cooperation between the
government and the institutions on national,
regional and local level. It is about organizing the
network of people and information!
National Digital Heritage strategic plan (2015)
![Page 12: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/12.jpg)
Available metadata and collections online for general use (Europe)
Source: Enumerate Survey 2017
![Page 13: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/13.jpg)
![Page 14: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/14.jpg)
Thinking from the user’s perspective
Thinking from the user’s perspective also
means seeking out the digital platforms
and work environments where potential
users can already be found.
The attractiveness of information to a
certain user group is not determined only
by the nature of the information, but also
by the method and location through which
that information is offered.
![Page 15: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/15.jpg)
2. Evaluating current infrastructure
![Page 16: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/16.jpg)
General setup of digital heritage portals
Heritage information consisting of GLAM datasets and science collections
![Page 17: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/17.jpg)
Evaluating current approach (1)
Positive results so far:
• many sources available through the OAI-PMH protocol
• powerful and smart protocol for metadata synchronization
• opened up data silos
• created the need for aligning data models
• made cross-collection and cross-domain discovery possible (e.g. Europeana)
![Page 18: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/18.jpg)
Evaluating current approach (2)
But there are two main problem areas:
• poor semantic alignment
• inefficient data integration
![Page 19: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/19.jpg)
Problem #1: Poor semantic alignment
Infrastructure:
• no sustainable identifiers (URIs) for objects
• use of strings instead of URIs for terms (who, what, where, when, ...)
• no shared terminology sources available
• no provisions for linking to external references
Publishing:
• implementations lack support for multiple data models
• at best data is ‘flattened’ to a common data model (EDM, Dublin Core)
• loss of meaning due to transformation
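The cost of using strings instead of URIs for terms can be illustrated with a minimal Python sketch. The records, the dataset prefixes and the term URI below are invented for illustration and do not come from a real collection or thesaurus.

```python
# Three hypothetical object descriptions from different institutions.
# In the first variant the subject is a free-text string.
records_with_strings = [
    {"id": "museum_X:object1", "subject": "windmill"},
    {"id": "archive_Y:photo7", "subject": "Windmolen"},  # Dutch label, same concept
    {"id": "library_Z:print3", "subject": "wind mill"},  # spelling variant
]

# Hypothetical shared term identifier (an assumption, not a real thesaurus URI).
WINDMILL = "http://example.org/terms/windmill"

# In the second variant every institution refers to the same term URI.
records_with_uris = [
    {"id": "museum_X:object1", "subject": WINDMILL},
    {"id": "archive_Y:photo7", "subject": WINDMILL},
    {"id": "library_Z:print3", "subject": WINDMILL},
]


def find_by_subject(records, subject):
    """Return the ids of all records whose subject equals the given value exactly."""
    return [r["id"] for r in records if r["subject"] == subject]


string_hits = find_by_subject(records_with_strings, "windmill")
uri_hits = find_by_subject(records_with_uris, WINDMILL)
print(len(string_hits), "hit(s) with strings;", len(uri_hits), "hit(s) with URIs")
```

With strings, only the record that happens to use the exact same spelling is found. With a shared URI, all three descriptions match without any string normalization.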
![Page 20: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/20.jpg)
Problem #2: Inefficient data integration
Aggregation is based on physical data integration
• Dutch cultural heritage sector: 1500 institutions, >>1500 collections
• integration model is based on copying data (OAI-PMH)
• synchronization with the data source needs permanent attention
• ownership, licensing, provenance, control over access are difficult topics
• no feedback loop to the data source (usage, cleaning, enrichments)
• large distance between data source owner and user of the data
• centralized model leads to scalability problems
See also:
Miel Vander Sande et al., Towards sustainable publishing and querying of distributed Linked Data archives, Journal of Documentation (2017)
Herbert Van de Sompel, Reminiscing About 15 Years of Interoperability Efforts, D-Lib Magazine (December 2015)
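To make the copy-based integration model concrete, here is a rough Python sketch of an OAI-PMH harvest loop. The `ListRecords` verb, the `metadataPrefix` argument and the `resumptionToken` element are part of OAI-PMH 2.0. The endpoint URL, the record identifiers and the canned responses are assumptions, so that the sketch runs without network access.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
BASE_URL = "https://example.org/oai"  # hypothetical endpoint


def build_request(resumption_token=None):
    """Build a ListRecords request URL; a resumption token replaces the other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return BASE_URL + "?" + urlencode(params)


def page(identifiers, token):
    """Produce a canned, heavily simplified ListRecords response (stand-in for a server)."""
    records = "".join(
        f"<record><header><identifier>{i}</identifier></header></record>"
        for i in identifiers
    )
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        f"{records}<resumptionToken>{token}</resumptionToken>"
        "</ListRecords></OAI-PMH>"
    )


# Canned pages keyed by request URL (invented data).
FAKE_SERVER = {
    build_request(): page(["oai:example:1", "oai:example:2"], "page2"),
    build_request("page2"): page(["oai:example:3"], ""),
}


def harvest(fetch):
    """Copy every record identifier from the source by following resumption tokens."""
    copied, token = [], None
    while True:
        root = ET.fromstring(fetch(build_request(token)))
        copied += [e.text for e in root.iter(OAI_NS + "identifier")]
        token_el = root.find(f".//{OAI_NS}resumptionToken")
        token = token_el.text if token_el is not None else None
        if not token:  # an empty token marks the last page
            return copied


local_copy = harvest(FAKE_SERVER.__getitem__)
print(local_copy)
```

The result is a second copy of the source data held by the aggregator. Every later change at the source requires another harvest, which is the synchronization burden listed above.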
![Page 21: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/21.jpg)
Networks of aggregators...
![Page 22: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/22.jpg)
In general - discovery of the “deep web”
• Institutional repositories, collection management systems
• Millions of ‘invisible’ datasets: publications, research data, heritage collections
• Poor coverage by regular search engines
• Metadata is key, describing physical materials or (licensed) digital content
• Demand for cross-institutional, cross-domain discovery
• Many specialized portals giving access to different views
![Page 23: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/23.jpg)
3. Strategies for improvement
![Page 24: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/24.jpg)
The Digital Heritage Network
is developing a three-layered
approach for improving the
sustainability, the usability
and the visibility of digital
heritage information.
sustainable
usable
visible
![Page 25: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/25.jpg)
![Page 26: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/26.jpg)
Design principles for a discovery infrastructure
• build service portals as views based on a common data layer
• minimize the intermediate layers
• support decentralized discovery
• refer to the source instead of copying
• maximize the usability of data at the source
• develop a sustainable, ‘web-centric’ solution
• use HTTP, RDF and RESTful APIs as building blocks
=> implement the Linked Data principles
Inspired by the work of Ruben Verborgh, Herbert Van de Sompel and colleagues.
See for example: Miel Vander Sande et al., Towards sustainable publishing and querying of distributed Linked Data archives, Journal of Documentation (2017)
![Page 27: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/27.jpg)
Implementing Linked Data principles (1)
At the data source level:
• use sustainable URIs to identify the resources
• use formal definitions for persons, places, concepts, events
• use domain data models to describe the data
• add support for cross-domain discovery (Europeana Data Model, Schema.org, ...)
• publish the collection information as Linked Data
=> Work with the IT suppliers as strategic partners for the implementation!
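As an illustration of publishing a collection record as Linked Data, the following Python sketch serializes one record as N-Triples. The `rdf:type` and Dublin Core Terms URIs are real vocabulary URIs. The object, type and term URIs and the `ntriple` and `publish` helpers are hypothetical, and the literal escaping covers only the most common cases.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCT = "http://purl.org/dc/terms/"

# Hypothetical record as it might sit in a collection management system.
record = {
    "uri": "http://example.org/museum_X/object1",
    "type": "http://example.org/nde/painting",
    "title": 'Landscape with "windmill"',
    "subject": "http://example.org/terms/windmill",
}


def ntriple(subject, predicate, obj, literal=False, lang=None):
    """Serialize one statement as an N-Triples line; obj is a URI unless literal=True."""
    if literal:
        escaped = (
            obj.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        value = f'"{escaped}"' + (f"@{lang}" if lang else "")
    else:
        value = f"<{obj}>"
    return f"<{subject}> <{predicate}> {value} ."


def publish(rec):
    """Turn a record into three statements: its type, its title and its subject."""
    return "\n".join([
        ntriple(rec["uri"], RDF_TYPE, rec["type"]),
        ntriple(rec["uri"], DCT + "title", rec["title"], literal=True, lang="en"),
        ntriple(rec["uri"], DCT + "subject", rec["subject"]),
    ])


linked_data = publish(record)
print(linked_data)
```

The subject is emitted as a URI rather than a string, so other datasets that use the same term can be connected to this record.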
![Page 28: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/28.jpg)
Implementing Linked Data principles (2)
At the network level:
• create a ‘network of terms’ for shared terminology
• provide tools for alignment and linking
• create alignments and links between different terminology sources
• provide easy access to shared terminology for collection management systems (API)
=> Provide open and cross-domain solutions at the network level!
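A term lookup across several terminology sources could look roughly like the Python sketch below. The sources, the URIs and the `lookup` function are invented for illustration. A real network of terms would query each source through its own API rather than an in-memory dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    uri: str
    pref_label: str
    source: str


# Hypothetical terminology sources with invented URIs.
TERMINOLOGY_SOURCES = {
    "thesaurus_A": [
        Term("http://example.org/a/windmill", "windmill", "thesaurus_A"),
        Term("http://example.org/a/watermill", "watermill", "thesaurus_A"),
    ],
    "thesaurus_B": [
        Term("http://example.org/b/0042", "Windmill", "thesaurus_B"),
    ],
}


def lookup(label):
    """Return terms from all sources whose preferred label matches, ignoring case."""
    wanted = label.strip().lower()
    return [
        term
        for terms in TERMINOLOGY_SOURCES.values()
        for term in terms
        if term.pref_label.lower() == wanted
    ]


candidates = lookup("Windmill")
for term in candidates:
    print(term.source, term.uri)
```

A collection management system could offer these candidates to a cataloguer, who then stores the chosen URI instead of the typed string.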
![Page 29: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/29.jpg)
Building on previous work
![Page 30: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/30.jpg)
![Page 31: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/31.jpg)
Ok, but how will our Linked Data be found?
![Page 32: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/32.jpg)
The Linked Open Data cloud…
![Page 33: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/33.jpg)
The Semantic Web is still a dream… #1
So discovery of Linked Data requires
registering datasets?!
![Page 34: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/34.jpg)
The Semantic Web is still a dream… #2
A tiny example... suppose a resource is defined as:
museum_X:object1
    a nde:painting ;
    dct:subject aat:windmill .
![Page 35: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/35.jpg)
![Page 36: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/36.jpg)
The Semantic Web is still a dream… #2
A tiny example... suppose a resource is defined as:
museum_X:object1
    a nde:painting ;
    dct:subject aat:windmill .
For ‘browsable Linked Data’ you should(!) add the inverse relation [1],[2]:
aat:windmill
    a skos:Concept ;
    skos:prefLabel "Windmill"@en ;
    dct:isSubjectOf museum_X:object1 .
=> a Linked Data integration problem, the lack of “backlinks”
[1]: Tim Berners-Lee on ‘browsable linked data’ (2006)
[2]: Tom Heath and Christian Bizer on ‘Incoming Links’ (2011)
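The backlink idea can be sketched in a few lines of Python: collect the forward links from each source and invert them, so that a term knows which objects refer to it. The museum triples mirror the example on the slide. The second source and the `build_backlinks` helper are assumptions added for illustration.

```python
from collections import defaultdict

# Triples as (subject, predicate, object), using the prefixed names from the slide.
museum_triples = [
    ("museum_X:object1", "rdf:type", "nde:painting"),
    ("museum_X:object1", "dct:subject", "aat:windmill"),
]
# Hypothetical second source that refers to the same term.
archive_triples = [
    ("archive_Y:photo7", "dct:subject", "aat:windmill"),
]


def build_backlinks(*sources, predicate="dct:subject"):
    """Map each term to the resources that point at it via the given predicate."""
    backlinks = defaultdict(list)
    for triples in sources:
        for s, p, o in triples:
            if p == predicate:
                backlinks[o].append(s)
    return dict(backlinks)


backlinks = build_backlinks(museum_triples, archive_triples)
print(backlinks["aat:windmill"])
```

The term publisher cannot produce this index alone, because the forward links live in the datasets of the institutions. Someone has to gather them, which is why the missing backlinks are an integration problem.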
![Page 37: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/37.jpg)
Some theory on Linked Data integration
Possible approaches to make Linked Data work:
1. Semantic integration
2. Physical integration
3a. Virtual integration – standard approach
3b. Virtual integration – using Linked Data Fragments
![Page 38: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/38.jpg)
1. Semantic integration only
Actions:
• implement schema.org
• let search engines ‘infer’ the relations
• query the search engines
Outcome:
• is the data interesting enough for Google?
• what about special thematic or regional views?
• can we reuse the results of the integration? (NO!)
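Implementing schema.org usually means embedding a JSON-LD description in the web page of each object. A minimal Python sketch follows. `VisualArtwork`, `name` and `about` are existing schema.org terms, while the URIs, the title and the `to_schema_org` helper are hypothetical.

```python
import json


def to_schema_org(uri, name, subject_uri):
    """Describe a heritage object as schema.org JSON-LD (a deliberately minimal mapping)."""
    return {
        "@context": "https://schema.org",
        "@type": "VisualArtwork",
        "@id": uri,
        "name": name,
        "about": {"@id": subject_uri},
    }


# Hypothetical identifiers, for illustration only.
doc = to_schema_org(
    uri="http://example.org/museum_X/object1",
    name="Landscape with windmill",
    subject_uri="http://example.org/terms/windmill",
)
json_ld = json.dumps(doc, indent=2)

# The snippet a web page would carry in its <head> for search engine crawlers.
print(f'<script type="application/ld+json">\n{json_ld}\n</script>')
```

Whether a search engine picks up and connects these descriptions is outside the publisher's control, which is the concern raised in the outcome list above.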
![Page 39: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/39.jpg)
2. Physical integration of linked data
Actions:
• aggregate all the related Linked Data sources
• build large triplestore and infer the relations
• query the aggregated data
Outcome:
• approach still based on copying
• same problems as traditional aggregation!
![Page 40: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/40.jpg)
![Page 41: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/41.jpg)
3a. Virtual integration - standard approach
Actions:
• publish Linked Data through triplestore with SPARQL endpoint
• build a central query engine to integrate the results
Outcome:
• implementing a triplestore is hard for small data providers
• federated querying over multiple triplestores performs poorly
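For illustration, a central query engine in this setup might send a SPARQL 1.1 federated query that uses one `SERVICE` clause per endpoint. The Python sketch below only builds the query text. The endpoint URLs, the term URI and the `federated_query` helper are assumptions, and no endpoint is contacted.

```python
def federated_query(endpoints, term_uri):
    """Build one SPARQL 1.1 query that asks every endpoint for objects about a term."""
    branches = [
        f"  {{ SERVICE <{endpoint}> {{ ?object dct:subject <{term_uri}> }} }}"
        for endpoint in endpoints
    ]
    return (
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "SELECT ?object WHERE {\n"
        + "\n  UNION\n".join(branches)
        + "\n}"
    )


# Hypothetical endpoints and term URI.
query = federated_query(
    [
        "https://museum-x.example.org/sparql",
        "https://archive-y.example.org/sparql",
    ],
    "http://example.org/terms/windmill",
)
print(query)
```

Each additional data provider adds another remote call to every query, which gives an impression of why this approach becomes slow as the number of triplestores grows.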
![Page 42: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/42.jpg)
3b. Virtual integration - using LDF
Actions:
• publish Linked Data using Linked Data Fragments (LDF) technology
• build a central LDF based query engine to integrate the results
Outcome:
• easy implementation for small data providers
• federated querying is supported
• more difficult to process the result
• possible support for time-based versions (Memento)
See also: Miel Vander Sande et al. (2017), Towards sustainable publishing and querying of distributed Linked Data archives, Journal of Documentation
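The division of labour behind Linked Data Fragments can be imitated in memory: the server side answers only single triple patterns, paged and with a total count, and the client combines the answers. The Python sketch below is a simplified illustration under that assumption. It is not the API of any LDF server or client, and the data and function names are invented.

```python
def fragment(triples, s=None, p=None, o=None, page=0, page_size=2):
    """Answer one triple pattern (None is a wildcard) with one page of matches and a count."""
    matches = [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]
    start = page * page_size
    return {"triples": matches[start:start + page_size], "total": len(matches)}


def all_matches(triples, **pattern):
    """Client side: page through a fragment until every match has been fetched."""
    results, page = [], 0
    while True:
        frag = fragment(triples, page=page, **pattern)
        results += frag["triples"]
        page += 1
        if len(results) >= frag["total"]:
            return results


# Hypothetical data held by one small provider.
SOURCE = [
    ("museum_X:object1", "rdf:type", "nde:painting"),
    ("museum_X:object1", "dct:subject", "aat:windmill"),
    ("museum_X:object2", "rdf:type", "nde:painting"),
    ("museum_X:object2", "dct:subject", "aat:castle"),
    ("museum_X:object3", "rdf:type", "nde:painting"),
    ("museum_X:object3", "dct:subject", "aat:windmill"),
]

# "Paintings about windmills": two simple patterns, joined by the client.
about_windmill = {t[0] for t in all_matches(SOURCE, p="dct:subject", o="aat:windmill")}
paintings = {t[0] for t in all_matches(SOURCE, p="rdf:type", o="nde:painting")}
answer = sorted(about_windmill & paintings)
print(answer)
```

The provider only has to support a very simple lookup, which keeps publishing cheap. The client needs several requests and has to do the join itself, which matches the trade-off in the outcome list above.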
![Page 43: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/43.jpg)
But federation needs selection of sources…
Problem:
• querying many data sources at the same time is not realistic…
Solution:
• build a Knowledge Base with backlinks to support the discovery process
• select relevant sources for querying based on the Knowledge Base
More advanced:
• data source profiling or dataset summaries
See also: Miel Vander Sande et al. (2016), Hypermedia-Based Discovery for Source Selection Using Low-Cost Linked Data Interfaces, IJSWIS 12(3), 79–110
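Source selection based on such a knowledge base can be sketched as a set intersection: for each term in the query, look up which datasets link to it, and query only the datasets that appear for every term. The dataset names and the contents of the knowledge base in this Python sketch are invented.

```python
# Hypothetical knowledge base: for each term, the datasets known to link to it.
KNOWLEDGE_BASE = {
    "aat:windmill": {"museum_X", "archive_Y"},
    "aat:castle": {"museum_X", "library_Z"},
}
ALL_SOURCES = {"museum_X", "archive_Y", "library_Z", "museum_W"}


def select_sources(terms):
    """Return only the sources that link to every term in the query."""
    candidates = set(ALL_SOURCES)
    for term in terms:
        candidates &= KNOWLEDGE_BASE.get(term, set())
    return sorted(candidates)


selected = select_sources(["aat:windmill", "aat:castle"])
print(selected)
```

In this example the query engine contacts one source instead of four. A term that is unknown to the knowledge base yields no sources at all, so the quality of the selection depends on how complete the backlinks are.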
![Page 44: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/44.jpg)
4. Building the distributed network of Dutch Digital Heritage information
![Page 45: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/45.jpg)
![Page 46: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/46.jpg)
Strategy for our distributed network
Semantic alignment:
1. build a service for shared entities for Dutch digital heritage
2. improve the usability of the data source:
- align object descriptions with shared terminology
- publish data as Linked Data
Data integration:
3. build a discovery infrastructure:
- register organizations and datasets in a registry
- build knowledge graph to support discovery (“backlinks”)
4. implement virtual data integration technology:
- use registry and knowledge graph for selecting the resources
- support federated querying (or selective aggregation)
![Page 47: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/47.jpg)
High-level design of our discovery infrastructure
https://github.com/netwerk-digitaal-erfgoed/high-level-design
![Page 48: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/48.jpg)
Roadmap (1)
Phase 1 – functional design / developing partnerships:
• design of supporting cross-domain functionality
• develop partnerships with IT suppliers and specialists
• develop domain and cross domain strategies
Phase 2 – enrich the current (OAI-PMH based) infrastructure:
• build a registry for organizations and datasets
• build a network of terms to provide shared terminology for discovery
• upgrade object descriptions with formal definitions (URIs)
![Page 49: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/49.jpg)
Roadmap (2)
Phase 3 – implement Linked Data technology at the network level:
• make aggregators Linked Data compliant
• build a knowledge graph with backlinks for discovery
• support federated querying (or selective harvesting)
Phase 4 – realize the distributed network of heritage information:
• make collection management systems Linked Data compliant
• transform aggregators to service portals for discovery
![Page 50: A distributed network of digital heritage information · Heritage information consisting of GLAM datasets and science collections . Evaluating current approach (1) Positive results](https://reader034.vdocuments.site/reader034/viewer/2022042105/5e82dfebdd8b586fc154ee68/html5/thumbnails/50.jpg)
Thank you for your attention!
please share your thoughts with us...
email: enno.meijers at kb.nl
twitter, slideshare: ennomeijers
https://github.com/netwerk-digitaal-erfgoed