Systems, processes & how we stop the wheels falling off
DESCRIPTION
Presentation from Digital Curator Dave Thompson on systems and processes for digitisation at the Wellcome Library for our second Digitisation Open Day.
TRANSCRIPT
![Page 1: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/1.jpg)
Systems, processes & how we stop the wheels falling off
Digitisation Open Day, September 2013
Dave Thompson, Digital Curator, Wellcome Library
![Page 2: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/2.jpg)
Digitisation – process overview
Plan project
Catalogue
Identify material
Identify resources
Plan process
Review as you go
Digitise/process
Deliver
Refine processes
Document/share
Funding, staff, equipment, IT, storage, data management planning
Open source player
![Page 3: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/3.jpg)
Meanwhile, at the coal face…
Administrative metadata + Descriptive metadata + Digitised images + Ingestion into repository + Creation of METS = Access
![Page 4: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/4.jpg)
Thinking conceptually … OAIS
In OAIS speak this is a SIP (Submission Information Package): an aggregation of an object & its metadata in a form that is acceptable to the repository, e.g. JPEG2000 images and MARC XML.
The Reference Model for an Open Archival Information System (OAIS) is an ISO standard (ISO 14721) that describes a conceptual model of an archive. It sets out the activities of an archive & the processes involved in submission, storage & access. It was developed through the Consultative Committee for Space Data Systems (CCSDS), of which NASA is a member, after space data was ‘lost’ through obsolescence.
![Page 5: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/5.jpg)
Thinking conceptually … OAIS
In OAIS speak this is an AIP (Archival Information Package). This is the object & its metadata as stored in a repository.
OAIS talks of 3 information packages.
1. Submission Information Package (SIP) = what is ingested
2. Archival Information Package (AIP) = what is stored
3. Dissemination Information Package (DIP) = what is made available
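The three packages can be pictured as successive views of the same object. What follows is only a minimal sketch of that idea, not part of the OAIS standard or of any Wellcome Library system. The class name, function names and metadata keys are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InformationPackage:
    """Hypothetical container: an object's files plus its metadata."""
    kind: str  # "SIP", "AIP" or "DIP"
    files: list[str]
    metadata: dict[str, str] = field(default_factory=dict)


def ingest(sip: InformationPackage, technical: dict[str, str]) -> InformationPackage:
    """Turn a SIP into an AIP by adding metadata generated at ingest."""
    return InformationPackage("AIP", list(sip.files), {**sip.metadata, **technical})


def disseminate(aip: InformationPackage, public_keys: set[str]) -> InformationPackage:
    """Build a DIP that exposes only the metadata cleared for access."""
    visible = {k: v for k, v in aip.metadata.items() if k in public_keys}
    return InformationPackage("DIP", list(aip.files), visible)


sip = InformationPackage("SIP", ["page_0001.jp2"], {"title": "Example item"})
aip = ingest(sip, {"checksum": "abc123"})
dip = disseminate(aip, {"title"})
print(dip.kind, dip.metadata)
```

The point of the sketch is that the stored package carries more than what was submitted, and the disseminated package carries less than what is stored.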
![Page 6: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/6.jpg)
Thinking conceptually … OAIS
In OAIS speak this is a DIP (Dissemination Information Package). These are the parts of the object & its metadata that we are able to make available.
As defined in the Digital Preservation Coalition (DPC) handbook, access is assumed to mean continued, ongoing usability of a digital resource, retaining all qualities of authenticity, accuracy and functionality deemed to be essential for the purposes the digital material was created and/or acquired for.
![Page 7: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/7.jpg)
Let's tackle the basics… processing
Administrative metadata (AMD): a technical description of the files. Created automatically by Safety Deposit Box (SDB) on ingest into our repository. Used by the player for display purposes.
Administrative metadata is typically created automatically. It could include:
• File size
• Image height × width
• File format
• Checksum
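As a rough illustration of how values like these can be derived automatically, here is a minimal sketch using only the Python standard library. It is not how SDB does it, which the slides do not describe. The function name and dictionary keys are hypothetical, and image height × width is left out because it would need an image library.

```python
import hashlib
import tempfile
from pathlib import Path

# Leading bytes ("magic numbers") for a few formats mentioned in this talk.
SIGNATURES = {
    b"\x00\x00\x00\x0cjP  \r\n\x87\n": "JPEG2000 (JP2)",
    b"\xff\xd8\xff": "JPEG",
    b"%PDF": "PDF",
}


def describe_file(path: Path) -> dict[str, object]:
    """Return basic technical metadata for one file (hypothetical helper)."""
    data = path.read_bytes()  # fine for a sketch; large masters would be read in chunks
    file_format = next(
        (name for magic, name in SIGNATURES.items() if data.startswith(magic)),
        "unknown",
    )
    return {
        "file_name": path.name,
        "file_size": len(data),
        "file_format": file_format,
        "checksum_sha256": hashlib.sha256(data).hexdigest(),
    }


# Demonstration with a throwaway file rather than a real master image.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.pdf"
    sample.write_bytes(b"%PDF-1.4 example")
    amd = describe_file(sample)

print(amd["file_format"], amd["file_size"])
```

Recomputing the checksum later and comparing it with the stored value is the usual way to confirm a file has not changed.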
![Page 8: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/8.jpg)
Let's tackle the basics… processing
Descriptive metadata (DMD): MARC, converted to MARC XML. This becomes MODS in the METS. Material must be catalogued before we can store it & make it available.
Descriptive metadata (DMD) is typically human generated, AKA cataloguing metadata: ISAD(G) for archival material, MARC for bibliographic material. MODS is the Metadata Object Description Schema.
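To make the MARC XML to MODS step concrete, here is a minimal sketch that copies a single field, the MARC 245 $a title, into a MODS `titleInfo/title` element. The slides do not say how the conversion is actually done. A real conversion maps many more fields and is normally handled by a maintained stylesheet or tool. The sample record and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
MODS_NS = "http://www.loc.gov/mods/v3"

# Hypothetical one-field MARC XML record, for illustration only.
MARC_XML = f"""
<record xmlns="{MARC_NS}">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
  </datafield>
</record>
"""


def marc_title_to_mods(marc_xml: str) -> ET.Element:
    """Copy the MARC 245 $a title into a MODS titleInfo/title element."""
    record = ET.fromstring(marc_xml.strip())
    subfield = record.find(
        "marc:datafield[@tag='245']/marc:subfield[@code='a']", {"marc": MARC_NS}
    )
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    title = ET.SubElement(title_info, f"{{{MODS_NS}}}title")
    title.text = subfield.text if subfield is not None else ""
    return mods


mods = marc_title_to_mods(MARC_XML)
print(ET.tostring(mods, encoding="unicode"))
```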
![Page 9: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/9.jpg)
Let's tackle the basics… processing
Safety Deposit Box (SDB) is the place where we store digital stuff. Ingest is automatically initiated by Goobi. SDB is a database that associates objects with their DMD & AMD, and is the source for dissemination.
Digital Repositories offer a convenient infrastructure through which to store, manage, re-use and curate digital materials. They are used by a variety of communities, may carry out many different functions, and can take many forms.
![Page 10: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/10.jpg)
Let's tackle the basics… processing
METS: metadata about structure & pagination is created by humans; the METS file itself is built automatically.
A Metadata Encoding & Transmission Standard (METS) file is an aggregated collection of DMD & AMD (a file list with structure) that provides a mechanism for managed access. A METS file allows metadata from different systems to be combined into a portable format.
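The "file list with structure" idea can be sketched as below. This is a skeleton only and has not been validated against the METS schema. It does not reflect the METS profile actually produced by Goobi, which the slides do not show. The element and attribute names are standard METS. The function name, IDs and file names are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(pages: list[tuple[str, str]]) -> ET.Element:
    """Build a skeleton METS tree from (file name, page label) pairs."""
    mets = ET.Element(m("mets"))
    ET.SubElement(mets, m("dmdSec"), ID="DMD1")  # DMD (e.g. MODS) would be wrapped here
    ET.SubElement(mets, m("amdSec"), ID="AMD1")  # AMD would be wrapped here
    file_grp = ET.SubElement(ET.SubElement(mets, m("fileSec")), m("fileGrp"))
    struct_map = ET.SubElement(mets, m("structMap"), TYPE="PHYSICAL")
    item = ET.SubElement(struct_map, m("div"), TYPE="item", DMDID="DMD1")
    for order, (file_name, label) in enumerate(pages, start=1):
        file_id = f"FILE_{order:04d}"
        # File list: one entry per image, pointing at its location.
        file_el = ET.SubElement(file_grp, m("file"), ID=file_id)
        ET.SubElement(
            file_el, m("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": file_name}
        )
        # Structure: page order plus the label as printed on the original.
        page = ET.SubElement(
            item, m("div"), TYPE="page", ORDER=str(order), ORDERLABEL=label
        )
        ET.SubElement(page, m("fptr"), FILEID=file_id)
    return mets


mets = build_mets([("0001.jp2", "i"), ("0002.jp2", "1")])
print(ET.tostring(mets, encoding="unicode"))
```

Keeping the human-entered page label (`ORDERLABEL`) separate from the machine sequence (`ORDER`) is what lets pagination follow the original while the files stay in a simple numbered order.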
![Page 11: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/11.jpg)
The formats
• JPEG2000 is our master image format.
• We create dissemination images (JPEG) on the fly.
• We also use PDF, MPEG-2 & MP3.
![Page 12: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/12.jpg)
The systems
• Goobi. Manages & tracks the production of digitised content.
• SDB. Repository that stores digitised content along with its DMD & AMD.
• Player. User interface to view digitised material.
![Page 13: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/13.jpg)
How Goobi works – the basics
• Project based.
• Workflow driven.
• Users accept ‘tasks’.
• A user's role determines what projects they belong to & what roles they have.
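The project, workflow and task ideas above can be sketched as a small model. This is not Goobi's API or data model. Every class, field and role name here is hypothetical, and accepting a task is collapsed into completing it to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    name: str
    required_role: str
    done: bool = False


@dataclass
class User:
    name: str
    roles: set[str]
    projects: set[str]


@dataclass
class Process:
    """One item moving through a project's workflow (an ordered task list)."""
    project: str
    tasks: list[Task] = field(default_factory=list)

    def next_task(self) -> Optional[Task]:
        """Return the first task that is not yet done, if any."""
        return next((t for t in self.tasks if not t.done), None)

    def accept(self, user: User) -> Task:
        """Let a user take the next open task if project and role allow it."""
        task = self.next_task()
        if task is None:
            raise ValueError("workflow already complete")
        if self.project not in user.projects or task.required_role not in user.roles:
            raise PermissionError(f"{user.name} may not accept '{task.name}'")
        task.done = True  # sketch: accepting and completing are one step here
        return task


process = Process(
    "Example project",
    [Task("Scan", "photographer"), Task("Edit METS", "editor")],
)
photographer = User("Sam", {"photographer"}, {"Example project"})
first = process.accept(photographer)
print(first.name, "->", process.next_task().name)
```

A second `process.accept(photographer)` would raise `PermissionError`, because the next task needs the editor role.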
![Page 14: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/14.jpg)
How Goobi works – a workflow
![Page 15: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/15.jpg)
How Goobi works – METS editing
Pagination as per original
Descriptive metadata
Structure
![Page 16: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/16.jpg)
Lessons from Goobi
• Design your workflows in advance. But be flexible.
• Automate as much as possible; it saves time & is more efficient.
• Document processes & procedures.
• Share what you learn.
![Page 17: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/17.jpg)
How SDB works – the basics
• Workflow based; easily ‘talks’ to other systems.
• Content agnostic.
• Creates administrative metadata on ingest.
• Preservation orientated.
![Page 18: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/18.jpg)
How SDB works
![Page 19: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/19.jpg)
How SDB works – behind the scenes
• No public access to SDB.
• Little direct staff access to SDB content.
• High levels of automation of ingest, initiated by Goobi.
• Platform for dissemination mediated by the player.
![Page 20: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/20.jpg)
Lessons from SDB
• Plan your systems integration, which system talks to which, and how.
• Plan workflows & processes.
• Have a data management plan. Your eggs are in one basket.
• Plan what you’ll do when it all turns to custard.
![Page 21: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/21.jpg)
How the player works – the basics
![Page 22: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/22.jpg)
How the player works
• Makes HTTP request to SDB for content.
• Draws access conditions from METS file.
• Permitted actions drawn from METS.
• Draws DMD from live catalogue.
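As a rough sketch of drawing access conditions from a METS file, the example below reads MODS `accessCondition` elements from a METS document and applies a simple rule. The slides do not say how the Wellcome METS files encode access conditions or permitted actions. The sample document, the value "Open", the function names and the display rule are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "mods": "http://www.loc.gov/mods/v3",
}

# Hypothetical METS fragment; a real file would carry far more.
SAMPLE_METS = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:mods="http://www.loc.gov/mods/v3">
  <mets:dmdSec ID="DMD1">
    <mets:mdWrap MDTYPE="MODS">
      <mets:xmlData>
        <mods:mods>
          <mods:accessCondition type="status">Open</mods:accessCondition>
        </mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>
"""


def access_conditions(mets_xml: str) -> list[str]:
    """Collect the text of every MODS accessCondition in a METS document."""
    root = ET.fromstring(mets_xml.strip())
    return [
        el.text.strip()
        for el in root.iterfind(".//mods:accessCondition", NS)
        if el.text
    ]


def may_display(conditions: list[str]) -> bool:
    """Hypothetical policy: show the item only if it is marked 'Open'."""
    return "Open" in conditions


conditions = access_conditions(SAMPLE_METS)
print(conditions, may_display(conditions))
```

Because the rule lives in the METS file rather than in the player, changing an item's access status would mean updating the metadata, not the viewer.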
![Page 23: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/23.jpg)
![Page 24: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/24.jpg)
Summary
• Digitisation is an end to end process that brings together objects & metadata.
• You have to think about the whole system to deliver results. The process is one of combining metadata from different systems.
• Document plans & document process.
• Be prepared to be flexible & to change as necessary. But try to stick to the plan!
![Page 25: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/25.jpg)
Further reading
• Wellcome Library – http://wellcomelibrary.org
• Metadata Encoding & Transmission Standard at the Library of Congress - http://www.loc.gov/standards/mets/
• Reference Model for an Open Archival Information System (OAIS). Magenta Book. Issue 2. June 2012 - http://public.ccsds.org/publications/RefModel.aspx
• Tessella, Safety Deposit Box - http://www.tessella.com/tag/safety-deposit-box/
• Data management planning - http://www.dcc.ac.uk/resources/data-management-plans
• Repository Software Comparison: Building Digital Library Infrastructure at LSE - http://www.ariadne.ac.uk/issue64/fay
![Page 26: Systems, processes & how we stop the wheels falling off](https://reader033.vdocuments.site/reader033/viewer/2022061201/5479accfb379597b2b8b4810/html5/thumbnails/26.jpg)
Thank you
Questions now, questions later…?
Dave Thompson, Digital Curator, Wellcome Library
[email protected] - #welldigi
http://wellcomelibrary.org/