DataShare: Collaboration Yields Promising Tool Julia Kochi, UCSF Library Angela Rizk-Jackson, UCSF CTSI Perry Willett, California Digital Library CNI 2013 Meeting San Antonio, TX

Upload: rizkjackson

Post on 10-May-2015


DESCRIPTION

A presentation given at the Coalition for Networked Information meeting on efforts to support sharing of research data at UCSF

TRANSCRIPT

Page 1: Datashare cni spring2013

DataShare: Collaboration Yields Promising Tool

Julia Kochi, UCSF Library
Angela Rizk-Jackson, UCSF CTSI
Perry Willett, California Digital Library

CNI 2013 Meeting
San Antonio, TX

Page 2: Datashare cni spring2013

The Background

Julia Kochi, UCSF Library

Page 3: Datashare cni spring2013

What is DataShare?

An open data repository for the UCSF researcher

A concept initially envisioned by Michael Weiner, M.D.

A collaboration between UCSF CTSI, UCSF Library, and the California Digital Library

Page 4: Datashare cni spring2013

The Problem

Increasing requirements to share data
• NIH grants >$500k
• Publisher requirements

Unequal availability of national repositories
Campus priorities
FASTR, White House Directive

Page 5: Datashare cni spring2013

The Partners

UCSF CTSI
• Knowledge of the researcher, access to the data

UCSF Library
• Metadata expertise, programming resources

UC3
• Preservation tools, services and expertise

Page 6: Datashare cni spring2013

Technical Infrastructure

Perry Willett, California Digital Library

Page 7: Datashare cni spring2013

DataShare Components

Merritt: CDL
EZID: CDL
XTF: CDL, UCSF Library
Ingest tool: UCSF Library

Page 8: Datashare cni spring2013

Merritt Repository Service

Built on “micro-services” principles
Content and format agnostic
Has a UI and RESTful APIs to submit and retrieve content, and check statuses
Can serve as either “dark” or “bright” archive
Added public access, data use agreements, asynchronous downloads as part of Datashare project

Page 9: Datashare cni spring2013

EZID

Service for creation and management of long-term identifiers

Currently supports ARKs and DOIs; other types in planning stages

Registers DOIs with DataCite

Has a UI and APIs with good documentation
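As a rough illustration of how a client might prepare a DOI request for EZID, the sketch below serializes DataCite-style metadata into ANVL, the line-oriented "name: value" text format that EZID's API documentation describes. The element names (`_target`, `datacite.title`, and so on) follow EZID's conventions as far as they are known here and should be checked against the current documentation; the sample values and the helper names `escape` and `to_anvl` are hypothetical. No network request is made.

```python
import re


def escape(value: str, is_name: bool = False) -> str:
    """Percent-encode characters that would break an ANVL line.

    Names additionally escape ':' because it separates name from value.
    """
    pattern = r"[%:\r\n]" if is_name else r"[%\r\n]"
    return re.sub(pattern, lambda m: "%%%02X" % ord(m.group(0)), value)


def to_anvl(metadata: dict) -> str:
    """Serialize a flat metadata dict as one 'name: value' pair per line."""
    return "\n".join(
        f"{escape(name, is_name=True)}: {escape(value)}"
        for name, value in metadata.items()
    )


# Hypothetical record; every value here is made up for illustration.
record = {
    "_target": "http://datashare.ucsf.edu/example-dataset",
    "datacite.creator": "Example, Researcher",
    "datacite.title": "Example dataset: 100% synthetic",
    "datacite.publisher": "UCSF",
    "datacite.publicationyear": "2013",
}

body = to_anvl(record)
print(body)
```

A real client would send `body` as the payload of an authenticated request to the identifier service; that step is left out because the endpoint and credentials are deployment-specific.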

Page 10: Datashare cni spring2013

XTF

eXtensible Text Framework
Developed and maintained by CDL
Runs several CDL services:
• eScholarship
• Online Archive of California
• Calisphere

Faceted browsing, full-text search, other desirable features

Page 11: Datashare cni spring2013
Page 12: Datashare cni spring2013
Page 13: Datashare cni spring2013

Ingest tool

Submitting content to a digital repository is hard and costly

An attempt to simplify several aspects:
• Digital object creation
• Metadata creation
• Object submission

Page 14: Datashare cni spring2013
Page 15: Datashare cni spring2013

Interactions for submission

Ingest Tool
• Creates metadata
• Assembles dataset
• Submits to Merritt

Merritt
• Requests DOI
• Submits metadata to EZID
• Receives DOI
• Packages object

EZID
• Registers DOI and metadata with DataCite

XTF
• Requests ATOM feed for collection
• Gets ATOM feed
• Retrieves metadata
• Indexes metadata
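The submission interactions on this slide can be sketched as a small in-memory simulation. This is only an illustration of the hand-offs between components, not the actual DataShare code: the classes `Ezid`, `Merritt`, and `Xtf`, the function `ingest`, and all of their methods are hypothetical stand-ins, the DOI is a placeholder rather than a real identifier, and the "feed" is a plain list rather than a real ATOM document.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Ezid:
    """Stand-in for the identifier service; mints placeholder DOIs locally."""
    registered: dict = field(default_factory=dict)
    _serial: count = field(default_factory=lambda: count(1))

    def mint_doi(self, metadata: dict) -> str:
        doi = f"doi:10.5072/FK2{next(self._serial):04d}"  # placeholder, not a real DOI
        self.registered[doi] = metadata  # stands in for registration with DataCite
        return doi


@dataclass
class Merritt:
    """Stand-in for the repository: requests a DOI, then packages the object."""
    ezid: Ezid
    objects: list = field(default_factory=list)

    def submit(self, metadata: dict, files: list) -> dict:
        doi = self.ezid.mint_doi(metadata)  # requests DOI, submits metadata to EZID
        obj = {"doi": doi, "metadata": metadata, "files": files}  # packages object
        self.objects.append(obj)
        return obj

    def feed(self) -> list:
        # Stands in for the ATOM feed of the collection.
        return [{"id": o["doi"], "metadata": o["metadata"]} for o in self.objects]


@dataclass
class Xtf:
    """Stand-in for the discovery layer: harvests the feed and indexes metadata."""
    index: dict = field(default_factory=dict)

    def harvest(self, merritt: Merritt) -> None:
        for entry in merritt.feed():
            self.index[entry["id"]] = entry["metadata"]


def ingest(merritt: Merritt, title: str, creator: str, files: list) -> dict:
    """Ingest tool: creates metadata, assembles the dataset, submits to Merritt."""
    metadata = {"title": title, "creator": creator}
    return merritt.submit(metadata, list(files))


ezid = Ezid()
merritt = Merritt(ezid)
xtf = Xtf()

obj = ingest(merritt, "Example dataset", "Example, Researcher", ["data.csv", "README.txt"])
xtf.harvest(merritt)
print(obj["doi"], xtf.index[obj["doi"]]["title"])
```

Keeping each component behind its own small interface mirrors the micro-services idea: the discovery layer only needs the feed, never the repository's internals.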

Page 16: Datashare cni spring2013

Process for End Users

1. Search, browse
2. Request dataset download
3. Fill out Data Use Agreement
4. Receive dataset
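One way to picture the end-user steps is as a gate: a dataset is released only after the requester has accepted the Data Use Agreement. The sketch below is an assumption-laden illustration, not the DataShare implementation; `DownloadRequest`, `DuaRequired`, `accept_dua`, and `release_dataset` are hypothetical names, and the returned message stands in for whatever delivery mechanism (for example an asynchronous download link) a real system would use.

```python
from dataclasses import dataclass


@dataclass
class DownloadRequest:
    dataset_id: str
    requester_email: str
    dua_accepted: bool = False


class DuaRequired(Exception):
    """Raised when a download is attempted before the DUA is accepted."""


def accept_dua(request: DownloadRequest) -> DownloadRequest:
    # In a real system this would record who agreed to which terms, and when.
    request.dua_accepted = True
    return request


def release_dataset(request: DownloadRequest) -> str:
    if not request.dua_accepted:
        raise DuaRequired(f"Data Use Agreement not accepted for {request.dataset_id}")
    return f"Download link for {request.dataset_id} sent to {request.requester_email}"


# Hypothetical identifiers and address, for illustration only.
request = DownloadRequest("example-dataset", "user@example.org")
try:
    release_dataset(request)
except DuaRequired as err:
    print("Blocked:", err)

message = release_dataset(accept_dua(request))
print(message)
```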

Page 17: Datashare cni spring2013
Page 18: Datashare cni spring2013
Page 19: Datashare cni spring2013
Page 20: Datashare cni spring2013

Lessons learned

Partnerships
• Many hands make light work
• Real users uncover hidden assumptions

Scale
• Object size
• Number of files
• Upload and download

Page 21: Datashare cni spring2013

If you build it, will they come?

Angela Rizk-Jackson, UCSF CTSI

Page 22: Datashare cni spring2013

What will it take?

Sketch by Juliana Olivera Silva via Flickr


Page 23: Datashare cni spring2013

Providing Incentives: Requirements

Organization | Data Access Requirement | # UCSF Studies

Funding
NIH | Grants >$500K (2003 on), specific programs | 318 (active projects), 693 (inactive)
NSF | All funded projects (2005 on) | 19
Foundations (e.g. Moore, Gates, Hewlett) | All funded projects | 3, 31, 19

Publishing
Nature Publishing Group (Nature, Science, etc.) | All published studies (2009-2011) | 58
Cell Press (Cell, Neuron, etc.) | All published studies (2009-2011) | 48
PNAS | All published studies (2005-2011) | 26

Page 24: Datashare cni spring2013

Providing Incentives: Visibility

• Enhances collaborative opportunities
• 69% increase in citation rate for publications associated with shared data (Piwowar, 2007)

Page 25: Datashare cni spring2013

Providing Incentives: Credit

Page 26: Datashare cni spring2013

Providing Incentives: Preservation & Access

Page 27: Datashare cni spring2013

Providing Incentives: Institutional

UCLA Royce Hall photo courtesy of Adam Fagen via Flickr

• Support researcher needs
• Improved archiving efficiency
• Cost savings

Page 28: Datashare cni spring2013

Eliminating Barriers

1. Time / Effort
- Minimal requirements
- Specific tools (e.g. ingest)
- Integrate into existing workflow

2. Control
- Data Use Agreement
- Centralized service

3. Cultural Paradigm
- Outreach
- Demonstrate value

Page 29: Datashare cni spring2013

Other Collaborators

Page 30: Datashare cni spring2013

Lessons Learned

Don’t underestimate technical matters
• Separating data & metadata

Standards are not standard
• Metadata schema (Dublin Core, DataCite)
• Interpretation

Policy issues are ever-present
• Data Ownership & Data Use Agreements
• Privacy & Consent (Human subjects)

Keep in mind the entire lifecycle: ALL users
• Discoverability & interoperability
• README File
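The point that standards are not standard can be made concrete with a metadata crosswalk. The sketch below maps a few Dublin Core-style fields onto DataCite-style property names and reports which required properties are still missing. It is an illustration under stated assumptions, not the mapping DataShare used: the `DC_TO_DATACITE` table, the `REQUIRED` tuple, and the `crosswalk` function are hypothetical, the required set reflects a general understanding of DataCite's mandatory properties and should be checked against the schema version in use, and taking the first four characters of a date as the publication year is exactly the kind of interpretation choice the slide warns about.

```python
# Hypothetical field mapping from Dublin Core-style names to DataCite-style names.
DC_TO_DATACITE = {
    "title": "title",
    "creator": "creator",
    "publisher": "publisher",
    "date": "publicationYear",
}

# Assumed set of properties that must be present before a DOI can be registered.
REQUIRED = ("creator", "title", "publisher", "publicationYear")


def crosswalk(dc_record: dict):
    """Return (datacite_record, missing_required_properties)."""
    out = {}
    for dc_name, value in dc_record.items():
        target = DC_TO_DATACITE.get(dc_name)
        if target is None:
            continue  # no agreed mapping; the field is dropped in this sketch
        if target == "publicationYear":
            value = str(value)[:4]  # assumes an ISO-style date such as 2013-04-04
        out[target] = value
    missing = [name for name in REQUIRED if not out.get(name)]
    return out, missing


# Made-up source record for illustration.
dc_record = {
    "title": "Example dataset",
    "creator": "Example, Researcher",
    "date": "2013-04-04",
    "subject": "neuroimaging",
}

datacite_record, missing = crosswalk(dc_record)
print(datacite_record)
print("Missing:", missing)
```

Returning the list of missing properties, rather than raising an error, lets an ingest tool prompt the depositor for just the fields that could not be derived automatically.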

Page 31: Datashare cni spring2013

Next Steps

Outreach
System enhancements
• Design overhaul
• Ingest mechanism
• DUA menu

Policy navigation
Proof-of-concept

Page 32: Datashare cni spring2013

Discussion Topics

What incentives have you found useful to encourage adoption of this type of resource?

Are you using data use agreements? Uniform or individualized?

Where do you see institutional data repositories fitting in the larger ecosystem?

Page 33: Datashare cni spring2013

More info

Datashare: http://datashare.ucsf.edu
CDL: http://www.cdlib.org
• Merritt: https://merritt.cdlib.org
• EZID: http://n2t.net/ezid
• XTF: http://xtf.cdlib.org

UCSF Library: http://www.library.ucsf.edu/
UCSF CTSI: http://ctsi.ucsf.edu/

NCATS – NIH Grant # UL1 TR000004