code4lib 2013 - all the metadatas re-revisited
DESCRIPTION
Last year Declan Fleming presented ALL TEH METADATAS and reviewed our UC San Diego Library Digital Asset Management system and RDF data model. You may be shocked to hear that all that metadata wasn't quite enough to handle increasingly complex digital library and research data in an elegant way. Our ad-hoc, 8-year-old data model has also been extended in inconsistent ways, and our librarians and developers have not always been perfectly in sync in understanding how the data model has evolved over time. In this presentation we'll review our process of locking a team of librarians and developers in a room to figure out a new data model, from domain definition through building and testing an OWL ontology. We'll also cover the challenges we ran into, including the review of existing controlled vocabularies and ontologies, or lack thereof, and the decisions made to cover the gaps. Finally, we'll discuss how we engaged the digital library community for feedback and what we have to do next. We all know that Things Fall Apart; this is our attempt at Doing Better This Time.
TRANSCRIPT
![Page 1: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/1.jpg)
ALL TEH METADATAS
Re-revisited
2013 code{4}lib Meeting
February 13, 2013
Esmé Cowles
Matthew Critchlow
Bradley Westbrook
![Page 2: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/2.jpg)
Overview
• Needs assessment and proposed solution
• Data modeling
• Tool implementation
Overview
• Needs Assessment
• Data Model Process
• Implementation
![Page 3: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/3.jpg)
Needs Assessment
Brad Westbrook
![Page 4: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/4.jpg)
Need One: More consistent data
![Page 5: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/5.jpg)
Need Two: Maintain syntax of hierarchical subjects
![Page 6: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/6.jpg)
Need Three: Improve support for complex objects
![Page 7: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/7.jpg)
Need Three: Improve support for complex objects (continued)
![Page 8: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/8.jpg)
Need Four: Align more strongly with DL community
• Make sure UCSD RDF is public facing
– Use publicly available vocabularies
– Make UCSD vocabularies public
• Develop technology stack
– Utilize contributions from non-UCSD sources
– Contribute to non-UCSD endeavors
![Page 9: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/9.jpg)
Data Model Process
Matt Critchlow
![Page 10: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/10.jpg)
Project Overview
Research Data Curation Pilot Deadline: June, 2013
Timeline: July 16, 2012 – Oct 29, 2012
Deliverables
• Abstract Data Model
• OWL/RDF Ontology
• Data Model Extension Guidelines
Team
Metadata Analyst: Arwen Hutt, Bradley Westbrook
IT: Esmé Cowles, Matt Critchlow, Longshou Situ
![Page 11: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/11.jpg)
User Stories
As an administrative unit manager, I want to indicate any external versions or descriptions of an object that may be of probable importance to a user
As a user, I want to know what collection(s) an object belongs to
As a DAMS manager, I want to know what administrative unit an object belongs to
![Page 12: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/12.jpg)
Abstract Model – High Level
![Page 13: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/13.jpg)
Abstract Model
Collection
Object
Component
Relationship
Name
Role
![Page 14: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/14.jpg)
Data Dictionary
Title (title 1-m)
Administrative Unit (unit 1)
Language (language 1-m)
Copyright (copyright 1)
Relationship (relationship 0-m)
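The cardinalities in the data dictionary (1, 1-m, 0-m) can be checked mechanically. The following is a minimal sketch, not the project's actual tooling: the field names and cardinalities come from the slide, while the dictionary layout, the function names `parse_cardinality` and `check_record`, and the record format (field name mapped to a list of values) are assumptions made for illustration.

```python
import re
from typing import Dict, List, Optional, Tuple

# Field names and cardinalities as listed on the slide.
# Storing them in a plain dict is an assumption for this sketch.
DATA_DICTIONARY: Dict[str, str] = {
    "title": "1-m",
    "unit": "1",
    "language": "1-m",
    "copyright": "1",
    "relationship": "0-m",
}


def parse_cardinality(spec: str) -> Tuple[int, Optional[int]]:
    """Turn '1', '1-m' or '0-m' into (minimum, maximum); None means unbounded."""
    match = re.fullmatch(r"(\d+)(?:-(\d+|m))?", spec)
    if match is None:
        raise ValueError(f"unrecognized cardinality: {spec!r}")
    low = int(match.group(1))
    high = match.group(2)
    if high is None:
        # A single number such as '1' means exactly that many values.
        return low, low
    return low, None if high == "m" else int(high)


def check_record(record: Dict[str, List[str]]) -> List[str]:
    """Return one message per field whose value count violates its cardinality."""
    problems = []
    for field, spec in DATA_DICTIONARY.items():
        low, high = parse_cardinality(spec)
        count = len(record.get(field, []))
        if count < low:
            problems.append(f"{field}: expected at least {low}, found {count}")
        elif high is not None and count > high:
            problems.append(f"{field}: expected at most {high}, found {count}")
    return problems
```

A record with no title and two administrative units would produce two messages, one for each violated field; a record that satisfies every cardinality returns an empty list.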
![Page 15: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/15.jpg)
Ontology
![Page 16: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/16.jpg)
Thing 1, Thing 2
![Page 17: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/17.jpg)
Thing 1, Thing 2
![Page 18: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/18.jpg)
Implementation
Esmé Cowles
![Page 19: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/19.jpg)
![Page 20: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/20.jpg)
DAMS Repository
• New version of our lightweight repository
– Metadata in triplestore
– Files on disk or cloud storage
• Explicit structural metadata
• Native REST API
• Fedora REST API (partial)
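To illustrate what "metadata in triplestore" with "explicit structural metadata" can look like, here is a toy in-memory sketch. It is not the DAMS Repository code or API: the class `TinyTripleStore`, the `ex:` identifiers, and the predicate names `ex:hasComponent` and `ex:hasFile` are all hypothetical, chosen only to show structure stored as subject/predicate/object statements while the files themselves live elsewhere.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]


class TinyTripleStore:
    """Toy triple store: a list of (subject, predicate, object) statements."""

    def __init__(self) -> None:
        self._triples: List[Triple] = []

    def add(self, subject: str, predicate: str, obj: str) -> None:
        triple = (subject, predicate, obj)
        if triple not in self._triples:  # ignore exact duplicates
            self._triples.append(triple)

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for s, p, o in self._triples:
            if subject is not None and s != subject:
                continue
            if predicate is not None and p != predicate:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, p, o)


# Hypothetical identifiers and predicates, for illustration only.
store = TinyTripleStore()
store.add("ex:object1", "ex:title", "Field notebook")
store.add("ex:object1", "ex:hasComponent", "ex:object1/1")
store.add("ex:object1", "ex:hasComponent", "ex:object1/2")
store.add("ex:object1/1", "ex:hasFile", "page-001.tif")
store.add("ex:object1/2", "ex:hasFile", "page-002.tif")

# Structural metadata is explicit: components are found by querying,
# not by parsing file names or identifiers.
components = [o for _, _, o in store.match("ex:object1", "ex:hasComponent")]
```

In this sketch the triples hold only file names; the bytes would sit on disk or in cloud storage, as the slide describes.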
![Page 21: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/21.jpg)
DAMS Manager
• Separate Java webapp
• Ingest, batch operations
• Uses DAMS Repository REST API
• Functionality moved into the repository
– Characterization (JHOVE)
– Fixity checking
– Derivatives (ImageMagick)
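Fixity checking generally means recomputing a file's checksum and comparing it with the value recorded at ingest. The sketch below shows that general technique with Python's standard `hashlib`. It is not the DAMS implementation (which the slides describe as a Java system); the function names and the choice of SHA-256 as the default are assumptions.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Compute a hex digest by streaming the file, so large files fit in memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(path: Path, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True when the file's current digest matches the recorded one."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

A repository would typically store the expected digest alongside the file's metadata at ingest and run a check like this on a schedule, flagging any mismatch for review.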
![Page 22: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/22.jpg)
DAMS Public Access System
• Old frontend is unsustainable
• New frontend in Hydra
– Backed by DAMS Repo, not Fedora
• Hydra platform and community
![Page 23: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/23.jpg)
Timeline
• Started 2 months ago
• Code sprint in January with cbeer and jcoyne
• March: Beta release with research data
• Spring: Migrating existing content
• Summer: Production release
![Page 24: Code4Lib 2013 - All THE Metadatas Re-Revisited](https://reader033.vdocuments.site/reader033/viewer/2022052622/559038cb1a28ab090d8b45e3/html5/thumbnails/24.jpg)
One More Thing
• We’ve talked about DAMS for years...
• Now we have code to share
http://github.com/ucsdlib/
@escowles