EDGE CASES
Digitizing and delivering undescribed items within encoded archival descriptions
ARCHIVES ONLINE @ IU
Since 2002
27 repositories and around 1000 public finding aids
Web-based submission/workflow tool
XTF-based delivery site
EXISTING DIGITIZATION WORKFLOW
Large batch of items digitized
Processed into our repository
Given PURL-resolvable landing pages
DAO links added to finding aid by XSLT
EXISTING WORKFLOW SHORTCOMINGS
Too much overhead when only a small number of items are digitized
Does not support the digitization of undescribed items
NEW WORKFLOW GOALS
As automatic as possible
Recreate the experience of opening a folder and flipping through the content
Preserve the order of undescribed items
DEVELOPMENT AND DESIGN TEAM
Jenn Riley, Randall Floyd, David Jiao, Julie Hardesty, Dot Porter, Mike Durbin
NEW WORKFLOW: DIGITIZATION
One or more items are selected for digitization
The item’s parent component in the EAD and its order relative to other digitized items are recorded in the collection spreadsheet, along with the newly digitized item’s identifier
The material is scanned, and page ordering is encoded in filenames
The updated spreadsheet and master files are placed in a drop-box for automatic processing
Parent Id       Item Id           Sequence
VAB8339-0001    VAB8339-200000    1
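The spreadsheet row above can be read as "item VAB8339-200000 is the first digitized item under component VAB8339-0001". A minimal sketch of how such rows might be grouped by parent component and ordered by sequence follows. It assumes a tab-separated export with the three column names shown; the function name and the second item identifier (VAB8339-200001) are hypothetical and only illustrate the ordering.

```python
import csv
import io
from collections import defaultdict


def group_items_by_parent(spreadsheet_text):
    """Return {parent_id: [item_id, ...]} with items sorted by sequence number."""
    # Assumption: the spreadsheet is exported as tab-separated text.
    reader = csv.DictReader(io.StringIO(spreadsheet_text), delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        grouped[row["Parent Id"]].append((int(row["Sequence"]), row["Item Id"]))
    # Sort each parent's items by sequence, then keep only the item identifiers.
    return {
        parent: [item_id for _, item_id in sorted(rows)]
        for parent, rows in grouped.items()
    }


# Hypothetical sample: two items under the same parent, listed out of order.
sample = (
    "Parent Id\tItem Id\tSequence\n"
    "VAB8339-0001\tVAB8339-200001\t2\n"
    "VAB8339-0001\tVAB8339-200000\t1\n"
)
items_by_parent = group_items_by_parent(sample)
print(items_by_parent)
```

Keeping the sequence in the spreadsheet, rather than inferring it from identifiers, is what lets undescribed items keep their physical order within the folder.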
NEW WORKFLOW: PROCESSING AND AUTOMATIC QUALITY CONTROL
Digitized image files are run through quality control checks to determine that they meet the digitization standards
Upon failure, an email is sent to the collection manager
Master files are moved to archival storage; extracted metadata (MIX) and derivative images are passed to another drop-box to be ingested into our Fedora repository
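The quality control step can be pictured as a set of checks over each image's technical metadata, with any failures collected into a message for the collection manager. The sketch below is only an illustration: the slides do not state the digitization standards, so the thresholds, field names, and function name here are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ImageMetadata:
    filename: str
    resolution_ppi: int
    bits_per_sample: int


# Hypothetical thresholds; the actual digitization standards are not given in the slides.
MIN_RESOLUTION_PPI = 400
MIN_BITS_PER_SAMPLE = 8


def qc_failures(images):
    """Return a list of human-readable failure messages, empty if all images pass."""
    failures = []
    for image in images:
        if image.resolution_ppi < MIN_RESOLUTION_PPI:
            failures.append(
                f"{image.filename}: resolution {image.resolution_ppi} ppi "
                f"is below {MIN_RESOLUTION_PPI} ppi"
            )
        if image.bits_per_sample < MIN_BITS_PER_SAMPLE:
            failures.append(
                f"{image.filename}: bit depth {image.bits_per_sample} "
                f"is below {MIN_BITS_PER_SAMPLE}"
            )
    return failures


# Hypothetical filenames, used only to exercise the checks.
batch = [
    ImageMetadata("page-001.tif", 600, 8),
    ImageMetadata("page-002.tif", 300, 8),
]
problems = qc_failures(batch)
print(problems)
```

A non-empty list would be the trigger for the failure email; an empty list lets the batch continue to ingest.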
NEW WORKFLOW: FEDORA INGEST
Objects are stored to Fedora
Collection Level Object: latest version of the spreadsheet, latest version of the EAD*
Archival Component Level Object: METS (struct map, drives our page-turning application)
Item Level Objects: PDF, METS
Page Level Objects: image derivatives, master image link, MIX metadata
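The component-level METS struct map is what carries the item order to the page-turning application. As a rough sketch of the idea, the code below builds a METS `structMap` whose outer `div` stands for the archival component and whose inner `div` elements carry an `ORDER` attribute for each item. The `TYPE` values, the use of identifiers as `ID` attributes, and the function name are assumptions made for illustration, not the project's actual METS profile.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def build_struct_map(component_id, item_ids):
    """Build a METS structMap listing items in order under one component."""
    # TYPE values below are illustrative assumptions, not a documented profile.
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    component = ET.SubElement(
        struct_map, f"{{{METS_NS}}}div", {"TYPE": "component", "ID": component_id}
    )
    for order, item_id in enumerate(item_ids, start=1):
        ET.SubElement(
            component,
            f"{{{METS_NS}}}div",
            {"TYPE": "item", "ID": item_id, "ORDER": str(order)},
        )
    return struct_map


# The second item identifier is hypothetical.
struct_map = build_struct_map("VAB8339-0001", ["VAB8339-200000", "VAB8339-200001"])
print(ET.tostring(struct_map, encoding="unicode"))
```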
An e-mail is sent to the collection manager
Lists updated archival component objects
Lists ingested items
Includes reports of any problems/inconsistencies between the spreadsheet and digitized files
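One simple way to detect inconsistencies between the spreadsheet and the digitized files is to compare the two sets of item identifiers. The sketch below assumes each digitized item can be mapped back to its item identifier; the function name, report keys, and sample identifiers beyond VAB8339-200000 are hypothetical.

```python
def consistency_report(spreadsheet_item_ids, digitized_item_ids):
    """Compare item identifiers listed in the spreadsheet with those found on disk."""
    listed = set(spreadsheet_item_ids)
    found = set(digitized_item_ids)
    return {
        "ingestable": sorted(listed & found),      # listed and digitized
        "missing_files": sorted(listed - found),   # listed but no files delivered
        "unlisted_files": sorted(found - listed),  # files with no spreadsheet row
    }


report = consistency_report(
    ["VAB8339-200000", "VAB8339-200001"],
    ["VAB8339-200000", "VAB8339-200002"],
)
print(report)
```

The last two lists are the kind of problems a collection manager would want to see in the ingest e-mail.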
NEW WORKFLOW: PUBLICATION
References are added to the EAD file
XTF reindexes the EAD file and transforms those references into links to display the components
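The slides do not say what form the added references take. Purely as an illustration, the sketch below assumes a DTD-style (un-namespaced) EAD 2002 file and adds a `dao` element with an `href` attribute to the component whose `id` matches the parent identifier from the spreadsheet. The function name, the sample EAD fragment, and the example URL are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_reference(ead_root, component_id, url):
    """Append a <dao> reference to the component with the given id.

    Returns True if a matching component was found, False otherwise.
    """
    for element in ead_root.iter():
        if element.get("id") == component_id:
            # Assumption: the reference is expressed as a <dao href="..."/> child.
            ET.SubElement(element, "dao", {"href": url})
            return True
    return False


# Hypothetical EAD fragment with a single folder-level component.
ead = ET.fromstring(
    '<ead><archdesc><dsc>'
    '<c01 id="VAB8339-0001"><did><unittitle>Folder 1</unittitle></did></c01>'
    '</dsc></archdesc></ead>'
)
added = add_reference(ead, "VAB8339-0001", "http://example.org/VAB8339-0001")
print(ET.tostring(ead, encoding="unicode"))
```

Because the reference is attached to the existing component rather than to a new item-level description, the undescribed items stay undescribed in the finding aid while still being reachable from it.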
THANKS! QUESTIONS?
Archives Online @ IU http://webapp1.dlib.indiana.edu/findingaids/