Data Description Registry Interoperability (DDRI) Working Group

Dimitris Gavrilis, Amir Aryani



Problem & Context

Enabling cross-platform discovery between research data registries.


Partners

▪ ANDS. Australian National Data Service
▪ Data-PASS. Data Preservation Alliance for the Social Sciences
▪ Dryad. Digital Repository
▪ Thomson Reuters DCI. Data Citation Index
▪ VIVO Cornell. Research-focused multidisciplinary discovery tool
▪ CERN. European Organization for Nuclear Research
▪ DANS. Data Archiving & Networked Services
▪ da|ra. The DOI registration service in Germany
▪ DCU. Digital Curation Unit – IMIS, Athena R.C.


Example from Dryad


Author Information


http://researchdata.ands.org.au/associate-professor-katherine-belov/11038


Research Data Switchboard


Modelling Connections as a Graph

www.RD-Switchboard.org
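
As a rough illustration of what modelling connections as a graph can look like, the sketch below links researchers, datasets and publications as nodes and finds datasets that are connected through shared nodes. The node identifiers, the `build_graph` and `connected` helpers, and the edge list are all hypothetical; this is not the Research Data Switchboard implementation.

```python
from collections import defaultdict, deque

# Hypothetical node identifiers; a real registry would supply its own keys.
edges = [
    ("researcher:A", "dataset:X"),
    ("researcher:A", "publication:P"),
    ("dataset:Y", "publication:P"),
    ("researcher:B", "dataset:Z"),
]

def build_graph(edge_list):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for left, right in edge_list:
        graph[left].add(right)
        graph[right].add(left)
    return graph

def connected(graph, start):
    """Return every node reachable from `start` using breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

graph = build_graph(edges)
# Datasets reachable from dataset:X through shared researchers or publications.
related = sorted(n for n in connected(graph, "dataset:X") if n.startswith("dataset:"))
print(related)
```

Here `dataset:Y` is discovered from `dataset:X` because both connect to the same publication node, while `dataset:Z` stays in a separate component.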


Duplication of Content

● Why?
  ○ Same information submitted to different repositories
  ○ Harvested from multiple sources

● How can we deal with duplication?
  ○ Unique identifiers
    ■ DOI
    ■ Handle.net
    ■ ORCID
    ■ ISNI
    ■ ...

● What happens if no unique identifier is present?
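
When a unique identifier is present, duplicates can be found by normalising the identifier and grouping records on it. The sketch below does this for DOIs only. The record layout, the `normalise_doi` and `group_by_doi` helpers, and the sample DOIs are assumptions made for illustration; records without an identifier are set aside, since they need a different approach.

```python
import re
from collections import defaultdict

def normalise_doi(value):
    """Lower-case a DOI (DOIs are case-insensitive) and strip common prefixes."""
    doi = value.strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)

def group_by_doi(records):
    """Group records by normalised DOI; return (groups, records without a DOI)."""
    groups = defaultdict(list)
    unidentified = []
    for record in records:
        doi = record.get("doi")
        if doi:
            groups[normalise_doi(doi)].append(record)
        else:
            unidentified.append(record)
    return dict(groups), unidentified

# Hypothetical records harvested from different registries.
records = [
    {"source": "registry-a", "doi": "https://doi.org/10.1234/ABC.1"},
    {"source": "registry-b", "doi": "doi:10.1234/abc.1"},
    {"source": "registry-c", "doi": "10.1234/xyz.9"},
    {"source": "registry-d", "doi": None},
]

groups, unidentified = group_by_doi(records)
duplicates = {doi: recs for doi, recs in groups.items() if len(recs) > 1}
print(sorted(duplicates))
print(len(unidentified))
```

The same pattern would apply to Handle, ORCID or ISNI values, each with its own normalisation rule.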


A De-Duplication Service for the Humanities

● Motivation
  ○ Huge amount of humanities-related content in Europe
  ○ Many aggregation projects about cultural heritage

● Simple examples
  ○ Same dataset is aggregated from different sources
  ○ Same dataset is submitted multiple times
  ○ Same author deposits data to different repositories
  ○ Same author registers twice
  ○ Two co-authors submit the same dataset twice from two different locations


Why De-Duplication

● Cleaner, more accurate data

● Save time in submission validation

● Save space and resources when aggregating content from multiple sources

● Improved interoperability


Service Design


Matching Algorithm

● Proposed elements to use
  ○ Keywords
  ○ Issued Date
  ○ Subject Terms
  ○ Spatial Information
  ○ Temporal Information

● Matching algorithms
  ○ Accuracy
    ■ Exact
    ■ Approximate
  ○ Element
    ■ Titles
    ■ Authors/Creators
    ■ Spatial
    ■ Temporal
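
The difference between exact and approximate matching on titles and creators can be sketched as follows. This is a minimal illustration, not the service's algorithm: the 0.9 threshold, the rule combining titles with creators, the helper names and the sample records are all assumptions, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever similarity measure a real service would use.

```python
from difflib import SequenceMatcher

def normalise(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())

def exact_match(a, b):
    """True when both values are identical after normalisation."""
    return normalise(a) == normalise(b)

def approximate_match(a, b, threshold=0.9):
    """True when the similarity ratio reaches the (assumed) threshold."""
    ratio = SequenceMatcher(None, normalise(a), normalise(b)).ratio()
    return ratio >= threshold

def likely_duplicates(rec_a, rec_b):
    """Assumed rule: titles match approximately and at least one creator is shared."""
    same_title = approximate_match(rec_a["title"], rec_b["title"])
    creators_a = {normalise(c) for c in rec_a["creators"]}
    creators_b = {normalise(c) for c in rec_b["creators"]}
    return same_title and bool(creators_a & creators_b)

# Hypothetical records for illustration only.
a = {"title": "Survey of Coastal Birds 2010", "creators": ["J. Smith"]}
b = {"title": "Survey of coastal birds, 2010", "creators": ["j. smith", "A. Jones"]}
c = {"title": "Soil Chemistry Measurements", "creators": ["J. Smith"]}

print(exact_match(a["title"], b["title"]))
print(likely_duplicates(a, b))
print(likely_duplicates(a, c))
```

Exact matching rejects the first pair because of a single comma, while approximate matching accepts it. Spatial and temporal elements would need their own comparison rules, which are not covered here.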


Prototype Implementation

http://more.dcu.gr/


Operational Workflow

[Workflow diagram. Labels, in extraction order: Harvest Service, OAI-PMH, File Upload, REST, Validation, De-Duplication, Report, RDF Store, Publish results, Ingest, Enrichment]