Page 1: Sharing Between Data Repositories

Kevin S.

Thanks to the Dryad Data Repository contributors and funders:

Ryan Scherle, Todd J. Vision, Hilmar Lapp (NESCent)

Ben Bosman, Mark Diggory, Kevin Van de Velde (@mire, Inc.)

Sharing Between Data Repositories

NESCent

Page 2: Sharing Between Data Repositories

The Bio-Reposphere

(Generic Subject Repository)

(Subject Specific Repository)

(General Scholarly Repository)

Page 3: Sharing Between Data Repositories

Generic vs. Specific Repos

✔ Easy submission

✔ Simple metadata

✔ Data is a “black box”

✔ No “orphaned” data

✔ Complex submission

✔ More useful metadata

✔ Well structured data

✔ Specific type of data

Page 4: Sharing Between Data Repositories

A Dryad Data Package

Page 5: Sharing Between Data Repositories

One Possible Workflow

Page 6: Sharing Between Data Repositories

“Save the Time of the User” #1

Page 7: Sharing Between Data Repositories

“Save the Time of the User” #2

Page 8: Sharing Between Data Repositories

Three Simple Steps

Page 9: Sharing Between Data Repositories

Case 1: TreeBASE Data Import

Page 10: Sharing Between Data Repositories

Harvesting and Web Services

OAI-PMH

PhyloWS
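As an illustration of the harvesting side, the following minimal sketch builds an OAI-PMH `ListRecords` request URL. The `verb` and `metadataPrefix` parameters and the `oai_dc` prefix come from the OAI-PMH protocol itself; the base URL is a hypothetical placeholder, not an endpoint named in the slides, and the helper name is an assumption made for this example.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a harvesting pass."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical base URL, used only for illustration.
url = oai_list_records_url("http://repository.example.org/oai")
print(url)
```

A harvester would fetch this URL and follow any `resumptionToken` in the response to page through the remaining records.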

Page 11: Sharing Between Data Repositories

Case 2: Data Uploaded to Dryad

Page 12: Sharing Between Data Repositories

Partner Repository Upload

Page 13: Sharing Between Data Repositories

BagIt Disseminator (implements DSpace PackageDisseminator)

Diagram labels: DSpace Metadata; XSLT Crosswalk; Dryad Application Profile; Dryad Data Package; Dryad Publication; Dryad Data File (three); Data from DSpace; Bag

Page 14: Sharing Between Data Repositories

A BagIt Bag

data

bag-info.txt

bagit.txt

manifest-md5.txt

tagmanifest-md5.txt
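The layout above can be sketched with the Python standard library. This is a minimal illustration, not Dryad's BagIt Disseminator: the `make_bag` helper, the `BagIt-Version` value, the `bag-info.txt` contents, and the payload bytes are all assumptions chosen for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path):
    """Return the hex MD5 digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def make_bag(bag_dir, payload):
    """Write a minimal bag; payload maps file names to bytes (hypothetical helper)."""
    bag = Path(bag_dir)
    data = bag / "data"
    data.mkdir(parents=True, exist_ok=True)
    for name, content in payload.items():
        (data / name).write_bytes(content)

    # Bag declaration; the version number here is an assumption.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    # Placeholder bag metadata.
    (bag / "bag-info.txt").write_text(
        "Source-Organization: Example\n", encoding="utf-8"
    )

    # Payload manifest: one "<checksum>  <path>" line per file under data/.
    lines = [f"{md5_of(p)}  data/{p.name}" for p in sorted(data.iterdir())]
    (bag / "manifest-md5.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")

    # Tag manifest: checksums of the tag files themselves.
    tags = ["bag-info.txt", "bagit.txt", "manifest-md5.txt"]
    tag_lines = [f"{md5_of(bag / t)}  {t}" for t in tags]
    (bag / "tagmanifest-md5.txt").write_text(
        "\n".join(tag_lines) + "\n", encoding="utf-8"
    )
    return bag


with tempfile.TemporaryDirectory() as tmp:
    # The payload content is a placeholder, not real NEXUS data.
    bag = make_bag(Path(tmp) / "bag", {"ApineDNA.nexus": b"#NEXUS\n"})
    listing = sorted(p.relative_to(bag).as_posix() for p in bag.rglob("*"))

print(listing)
```

A receiving repository can recompute the checksums listed in `manifest-md5.txt` to confirm that the payload arrived intact.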

Page 15: Sharing Between Data Repositories

Dryad Data in the Bag

dryadpkg.xml

dryadpub.xml

ApineDNA.nexus

dryadfile-2.xml

ApineCYTB.nexus

dryadfile-1.xml

datafile-2

datafile-1

Page 16: Sharing Between Data Repositories

HTTP PUT Handshake

BagIt Upload

Email

TreeBASE URL
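The upload step can be sketched as a plain HTTP PUT. The slide names only the pieces of the exchange (BagIt upload, email, TreeBASE URL), so everything concrete here is an assumption: the upload URL is hypothetical, the bag is assumed to be serialized as a zip file, and the helper name is made up for this example. The sketch builds the request without sending it.

```python
import urllib.request


def build_bag_put_request(upload_url, bag_bytes):
    """Build (but do not send) an HTTP PUT request carrying a serialized bag."""
    return urllib.request.Request(
        upload_url,
        data=bag_bytes,
        method="PUT",
        # Assumes the bag was serialized as a zip archive.
        headers={"Content-Type": "application/zip"},
    )


# Hypothetical partner endpoint and placeholder payload bytes.
request = build_bag_put_request(
    "http://partner.example.org/upload/dryad-bag.zip", b"placeholder bag bytes"
)
print(request.get_method(), request.full_url)

# Sending the request would look like this:
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```

How the partner's response (for example, a URL for the new record) is relayed back to the submitter is not specified in the slide text, so it is left out of the sketch.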

Page 17: Sharing Between Data Repositories

Lessons Learned

✔ Just enough to get the job done and no more

✔ Fewer local conventions and more “standards”

✔ There will always be custom solutions

✔ Options are developing quickly in this space

Page 18: Sharing Between Data Repositories

Future Directions

Less reliance on local conventions

✔ Plan to use OAI-ORE and Pairtree(s) within BagIt

OAI-ORE: Because it's Linked Data

Pairtree Filesystem

✔ So we can dereference URIs in ORE Resource Maps

http://dx.doi.org/10.5061/dryad.8343

URI prefix: http://dx.doi.org/10.5061/dryad.

Path: 83/43

83/43/Arctostaphylos.nex
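The mapping from the example URI to its Pairtree path can be sketched as follows. This is a simplified illustration: it splits the local identifier into two-character segments and skips the character-escaping step that the Pairtree specification also defines. The function name is an assumption made for this example.

```python
def pairtree_path(identifier, segment_length=2):
    """Split an identifier into fixed-length path segments (no character escaping)."""
    return "/".join(
        identifier[i:i + segment_length]
        for i in range(0, len(identifier), segment_length)
    )


PREFIX = "http://dx.doi.org/10.5061/dryad."
uri = "http://dx.doi.org/10.5061/dryad.8343"

# Strip the URI prefix, then map the remaining identifier to a directory path.
local_id = uri[len(PREFIX):]
path = pairtree_path(local_id)
print(path)  # 83/43
```

With the prefix recorded once, each URI in an ORE Resource Map resolves to a predictable directory inside the bag.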

Page 19: Sharing Between Data Repositories

Other Interesting Developments

DataONE

✔ Distributing data files and metadata

✔ May support packages in the future

“Dropbox of Bags” or Bag replication network (BagNet?)

METS in Bags (in contrast to ORE)

Page 20: Sharing Between Data Repositories

The End

The cake was a lie

Page 21: Sharing Between Data Repositories

References

Dryad Code http://dryad.googlecode.com

Dryad Data Repository http://datadryad.org

BagIt http://en.wikipedia.org/wiki/BagIt

OAI-ORE Primer http://www.openarchives.org/ore/1.0/primer

OAI-ORE in BagIt http://groups.google.com/group/oai-ore/browse_thread/thread/3ebfa7fcb4588048

ADMIRAL Data Packages (Planning ORE in BagIt) http://imageweb.zoo.ox.ac.uk/wiki/index.php/ADMIRAL_data_packages

DSpace Packagers https://wiki.duraspace.org/display/DSPACE/PackagerPlugins

