1
Data Description Registry Interoperability (DDRI) Working Group
Dimitris Gavrilis, Amir Aryani
2
Enabling cross-platform discovery between research data registries.
Problem & Context
3
▪ ANDS. Australian National Data Service
▪ Data-PASS. Data Preservation Alliance for the Social Sciences
▪ Dryad. Digital Repository
▪ Thomson Reuters DCI. Data Citation Index
▪ VIVO Cornell. Research-focused multidisciplinary discovery tool
▪ CERN. European Organization for Nuclear Research
▪ DANS. Data Archiving & Networked Services
▪ da|ra. The DOI registration service in Germany
▪ DCU. Digital Curation Unit – IMIS, Athena R.C.
Partners
4
Example from Dryad
5
Author Information
6
http://researchdata.ands.org.au/associate-professor-katherine-belov/11038
7
Research Data Switchboard
8
Modelling Connections as a Graph
www.RD-Switchboard.org
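As a rough illustration of what modelling connections as a graph could look like, the sketch below stores records from different registries as nodes and their relationships as undirected edges, then walks the graph to find everything linked to a starting record. The class name, the node labels, and the identifiers are hypothetical and are not taken from the Research Data Switchboard itself.

```python
from collections import defaultdict, deque


class ConnectionGraph:
    """Undirected graph of records; a minimal sketch, not the Switchboard's model."""

    def __init__(self):
        self.edges = defaultdict(set)

    def connect(self, a, b):
        """Record a relationship between two nodes in both directions."""
        self.edges[a].add(b)
        self.edges[b].add(a)

    def reachable(self, start):
        """Return every node linked to `start`, directly or indirectly (breadth-first)."""
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in self.edges[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        return seen - {start}


# Hypothetical nodes: (registry, kind, local identifier)
researcher = ("registry-a", "researcher", "r-1")
dataset = ("registry-b", "dataset", "d-1")
publication = ("registry-c", "publication", "p-1")
unrelated = ("registry-b", "dataset", "d-2")

graph = ConnectionGraph()
graph.connect(researcher, dataset)
graph.connect(dataset, publication)

linked = graph.reachable(researcher)
```

Starting from the researcher node, the traversal reaches the dataset and, through it, the publication held in a third registry, which is the kind of cross-registry discovery the graph view is meant to support.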
9
Duplication of Content
● Why?
  ○ Same information submitted to different repositories
  ○ Harvested from multiple sources
● How can we deal with duplication?
  ○ Unique identifiers
    ■ DOI
    ■ Handle.net
    ■ ORCID
    ■ ISNI
    ■ ...
● What happens if no unique identifier is present?
10
A De-Duplication Service for the Humanities
● Motivation
  ○ Huge amount of humanities-related content in Europe
  ○ Many aggregation projects about cultural heritage
● Simple examples
  ○ Same dataset is aggregated from different sources
  ○ Same dataset is submitted multiple times
  ○ Same author deposits data to different repositories
  ○ Same author registers twice
  ○ Two co-authors submit the same dataset twice from two different locations
11
Why De-Duplication
● Cleaner, more accurate data
● Save time in submission validation
● Save space and resources when aggregating content from multiple sources
● Improved interoperability
12
Service Design
13
Matching Algorithm
● Proposed elements to use
  ○ Keywords
  ○ Issued Date
  ○ Subject Terms
  ○ Spatial Information
  ○ Temporal Information
● Matching algorithms
  ○ Accuracy: Exact, Approximate
  ○ Element: Titles, Authors/Creators, Spatial, Temporal
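The difference between exact and approximate matching can be illustrated with a small sketch. Which element is compared with which accuracy is not recoverable from the slide, so the assignment below (exact on issued date, approximate on title, overlap on creators), the 0.9 threshold, and the record fields are all assumptions; `difflib.SequenceMatcher` from the Python standard library stands in for whatever similarity measure the service actually uses.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def exact_match(a, b):
    """True when the two values are identical after normalization."""
    return normalize(a) == normalize(b)


def approximate_match(a, b, threshold=0.9):
    """True when the similarity ratio of the two values reaches the threshold."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def is_probable_duplicate(rec_a, rec_b, threshold=0.9):
    """Combine the per-element checks (an assumed rule, not the service's)."""
    same_date = exact_match(rec_a["issued"], rec_b["issued"])
    similar_title = approximate_match(rec_a["title"], rec_b["title"], threshold)
    creators_a = {normalize(c) for c in rec_a["creators"]}
    creators_b = {normalize(c) for c in rec_b["creators"]}
    return same_date and similar_title and bool(creators_a & creators_b)


# Hypothetical records
first = {
    "title": "Survey of coastal birds 2010",
    "creators": ["A. Author"],
    "issued": "2010-05-01",
}
second = {
    "title": "Survey of Coastal Birds, 2010",
    "creators": ["A. Author", "B. Coauthor"],
    "issued": "2010-05-01",
}
other = {
    "title": "Rainfall records for Athens",
    "creators": ["A. Author"],
    "issued": "2010-05-01",
}

same = is_probable_duplicate(first, second)
different = is_probable_duplicate(first, other)
```

Exact matching alone would miss the first pair because of the comma and capitalisation in the title; the approximate check tolerates that, at the cost of needing a threshold that has to be tuned against real data.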
15
Operational Workflow
[Workflow diagram; labels: Harvest Service, OAI-PMH, File Upload, REST, Validation, De-Duplication, Report, RDF Store, Publish results, Ingest, Enrichment]