3.7.17 DSpace for Data: Issues, Solutions and Challenges webinar slides
Hot Topics: DuraSpace Community Webinar Series
Series Fifteen: DSpace for Data
Curated by Claire Knowles, Library Digital Development Manager,
The University of Edinburgh.
Webinar 2:
DSpace for Data: issues, solutions and challenges
Presented by:
Claire Knowles, The University of Edinburgh
Ryan Scherle, Dryad Digital Repository
Pauline Ward, The University of Edinburgh
Today’s Speakers
Ryan Scherle, Dryad Digital Repository, datadryad.org
Pauline Ward, Edinburgh DataShare, University of Edinburgh, datashare.is.ed.ac.uk
What is Dryad?
A data repository, working closely with scientific journals.
• Data tightly connected to articles
• Broad disciplinary scope
• Broad interpretation of “data”
• Nonprofit, with Data Publication Charges
Why does Dryad use DSpace?
For the robust metadata model?
For the extremely clean architecture?
Just one reason… workflow
Issues to consider
• File sizes
• File types
• Structured objects
• Versioning
• Timing of data release
• Additional metadata
• Sensitive data
File sizes
• Allow submission of large files.
• Provide curators with ways to inspect large files.
• Be aware of the time required for automated processes.
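As an illustration of why automated processes take time on large files, here is a minimal Python sketch of a checksum job that reads a file in chunks and reports how long it took. It is not Dryad's or DSpace's code. The function names and the 1 MiB chunk size are assumptions chosen for the example.

```python
import hashlib
import time

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary illustrative value


def checksum_file(path, algorithm="sha256", chunk_size=CHUNK_SIZE):
    """Hash a file in fixed-size chunks so memory use stays constant."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def timed_checksum(path):
    """Return (hex digest, elapsed seconds) so slow files can be logged."""
    start = time.perf_counter()
    value = checksum_file(path)
    return value, time.perf_counter() - start
```

Every byte must be read, so the elapsed time grows roughly linearly with file size. That is worth measuring before scheduling such jobs across a whole repository.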
File types
DSpace doesn’t care, but the users do.
• Steer submitters to preferred types.
• Give curators tools to read varied types.
• Develop methods to look for common issues in a variety of types.
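One way to steer submitters toward preferred types is a simple check on the file extension at submission time. The Python sketch below is illustrative only. The policy table and the `review_filename` helper are hypothetical, not a policy stated in these slides or a DSpace feature.

```python
from pathlib import PurePath

# Hypothetical policy table: proprietary extension -> suggested open alternative.
SUGGESTED_ALTERNATIVES = {
    ".xls": ".csv",
    ".xlsx": ".csv",
    ".doc": ".txt",
    ".docx": ".txt",
}


def review_filename(filename):
    """Return an advisory message for the submitter, or None if nothing to flag."""
    suffix = PurePath(filename).suffix.lower()
    if not suffix:
        return "no file extension; ask the submitter what format this is"
    alternative = SUGGESTED_ALTERNATIVES.get(suffix)
    if alternative:
        return f"consider also providing a {alternative} version"
    return None
```

For example, `review_filename("Counts.XLSX")` produces a suggestion to add a `.csv` copy, while `review_filename("data.csv")` returns `None`. The check is advisory rather than blocking, on the assumption that a repository with a broad interpretation of "data" would not reject unfamiliar types outright.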
Structured objects
Changing the data model affects all parts of DSpace:
• Submission
• Identifiers
• Curation
• Item display
• Search results
• APIs
Versioning
Articles are relatively static, but data is often reused, revised, and expanded!
Determine what constitutes a version, and how to cite it.
Image: https://flic.kr/p/a6Hpr9
Timing of data release
Are data independent of the publication or synced with it?
Develop embargo policies for both metadata and bitstreams.
Image: https://flic.kr/p/ebZd3d
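A minimal Python sketch of what separate embargo policies for metadata and bitstreams could look like follows. It is an assumption-laden illustration, not DSpace's embargo implementation. The `EmbargoPolicy` class and `visible_parts` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class EmbargoPolicy:
    """Hypothetical per-item policy; None means 'public immediately'."""
    metadata_release: Optional[date] = None
    bitstream_release: Optional[date] = None


def _is_released(release, today):
    # An unset date means no embargo; otherwise release on or after the date.
    return release is None or today >= release


def visible_parts(policy, today):
    """Report which parts of an item are publicly visible on a given day."""
    return {
        "metadata": _is_released(policy.metadata_release, today),
        "bitstreams": _is_released(policy.bitstream_release, today),
    }
```

With `EmbargoPolicy(bitstream_release=date(2017, 6, 1))`, the description is discoverable immediately while the files stay hidden until 1 June 2017. This models data that is synced with an article's publication date.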
Additional metadata
Data in a repository may require additional metadata for:
• Discovery
• Maintaining item structure
• Support of workflow
• Usage tracking
Sensitive data
• Copyrights
• Endangered species
• Human subjects
Image: https://flic.kr/p/83Rkit
Image: https://flic.kr/p/3bpAkc
Technical challenges in DSpace
The most important technical issues to address when adding data to DSpace are:
• Data model
• Submission/curation workflow
• Processes for large files
• Embargo and access control
Pauline Ward The University of Edinburgh
https://wiki.duraspace.org/display/[email protected]/The+DSpace+Curator%27s+Handbook
What is Edinburgh DataShare?
● Institutional research data repository
● DSpace 5.2, with the XMLUI Mirage interface
● First deposit was accessioned in 2008
● Now contains 1,912 data items
● Very broad disciplinary spread
Big Files
Our researchers wanted to deposit files over 1 GB, which was difficult to do via the web submission form. So our developer ported the HTML5 upload facility from JSPUI to XMLUI. Now, users can upload up to 20 GB via their browser. EDINA’s code is available: https://github.com/edina/DSpace/tree/xml-html5-upload
The Missing Curator’s Handbook
Looking for help:
● https://wiki.duraspace.org/display/[email protected]/The+DSpace+Curator%27s+Handbook
How to contribute
Claim a ticket and/or join a meeting: https://wiki.duraspace.org/display/DSPACE/DSpace+7+UI+Working+Group
Join us on Slack / ask questions: https://goo.gl/forms/s70dh26zY2cSqn2K3
DSpace 7 Outreach Group: https://wiki.duraspace.org/display/DSPACE/DSpace+7+UI+Outreach+Group