Digitizing Nevada Newspapers: Workflow
Nevada Digital Newspaper Project
Dana Bullinger (Project Coordinator) and Melissa Stoner (Project Technician)
Title Selection
● Advisory Board selects qualified titles
○ Research Value
○ Geographic Representation
○ Temporal Coverage
○ Diversity
NDNP Title Guidelines
● Complete (or majority of) title run should be available
on microfilm without restrictions
● Technical factors to consider:
○ Quality of original text and microfilm capture
○ Reduction ratio (the lower the better; below 20x)
○ Camera master negative microfilm to be duplicated should have
resolution test patterns readable at 5.0 or higher
○ Density variations of no more than 0.2 within images and between exposures
○ Confidence level through OCR testing of sample page images
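The numeric thresholds above can be turned into a quick screening check when evaluating a candidate reel. The following is a minimal sketch only: the `FilmSample` class and its field names are hypothetical, and reading the 0.2 variation limit as a density variation is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FilmSample:
    """Hypothetical record of readings taken from one microfilm reel."""
    reduction_ratio: float     # e.g. 18.0 means 18x
    resolution_pattern: float  # highest readable resolution test pattern
    density_variation: float   # assumed: largest density variation observed


def screening_issues(sample: FilmSample) -> list[str]:
    """Return the guideline thresholds that the sample does not meet."""
    issues = []
    if sample.reduction_ratio >= 20:
        issues.append("reduction ratio is not below 20x")
    if sample.resolution_pattern < 5.0:
        issues.append("resolution test pattern is not readable at 5.0 or higher")
    if sample.density_variation > 0.2:
        issues.append("variation exceeds 0.2")
    return issues


# A reel that meets all three thresholds produces an empty list.
print(screening_issues(FilmSample(18.0, 5.0, 0.2)))
```

An empty result does not by itself qualify a title; text quality and OCR confidence still need to be judged from sample page images.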
Deliverables
For Each Title
•Up-to-date MARC record from the
CONSER OCLC database
•Additional title-level metadata (Reel-Level
Metadata spreadsheet example)
•Newspaper History Essay - 500 words per
title
For each issue
•Structural metadata for issues digitized and
organized by date (Page-Level Metadata
spreadsheet example)
Deliverables
For each newspaper page
- Page image in two formats
  - Grayscale, scanned at 300-400 dpi, uncompressed TIFF 6.0 image file
  - Same image, compressed as JPEG2000 (.JP2)
- OCR text using the ALTO schema (1 file per page)
- PDF image with hidden text
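Because every page must arrive in four files (TIFF, JP2, ALTO XML, PDF), a completeness check over a list of file names can catch gaps early. This is a rough sketch: the function name is hypothetical, and it assumes that the files for one page share a base name and use the extensions `.tif`, `.jp2`, `.xml`, and `.pdf`.

```python
from pathlib import PurePath

# Assumed extensions for the four per-page deliverables.
REQUIRED = {".tif", ".jp2", ".xml", ".pdf"}


def missing_formats(filenames):
    """Map each page base name to the required extensions it lacks."""
    found = {}
    for name in filenames:
        path = PurePath(name)
        found.setdefault(path.stem, set()).add(path.suffix.lower())
    return {
        stem: sorted(REQUIRED - extensions)
        for stem, extensions in found.items()
        if REQUIRED - extensions
    }


files = ["0001.tif", "0001.jp2", "0001.xml", "0001.pdf", "0002.tif", "0002.xml"]
print(missing_formats(files))  # {'0002': ['.jp2', '.pdf']}
```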
Selected Titles
● Research Library of Congress Control Numbers (LCCNs)
and OCLC numbers for all titles
● Accurate LCCNs critical for
data management
● Fill in spreadsheet
● Send to LC for approval
Before Duplication Begins...
● Set up purchase order with selected
digitization vendor (iArchives)
● Research and order microfilm reader
● Send work plan to NEH
● Order ten 1 TB hard drives for our
deliverables
Microfilm Reader and Software
•14MP Image Sensor
•Light Source
•File Output
•Lens with 7x to 105x
magnification
Sample Batch
● A sample batch allows the Library of Congress to
identify potential problems and ensures that
technical specifications are being implemented
● Tonopah Daily Bonanza (1901-1903)
● Negative and Positive Reels duplicated by
NSLA and sent to UNLV
● Apply LC-provided barcodes on Negative Reel
boxes
○ Barcode connects digital content to physical
reel deposited at LC
MasterFile
● Document everything in the MasterFile and Reel-Level Spreadsheet
○ Title, Year, LCCN, Barcode/Reel Number, unique name for iArchives,
metadata received from NSLA
Collation: Reel-Level
UNLV
● Unique Name
● LCCN
● Reel Number
● Location of Publication
● Start/End Date
● Digital Responsible Institution
NSLA
● Title
● Source Repository
● Density Readings
● Reduction Ratio
● Average Density
Collation: Page-Level
● Use template
● One page-level spreadsheet = one reel
● Page count
● Anomalies
- Missing issues or pages
- Duplicate issues or pages
- Mutilated pages
- Other abnormalities (e.g. pages out of
order, incorrect dates)
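Duplicate and missing issues can be flagged automatically once the issue dates from a page-level spreadsheet are in hand. The sketch below is illustrative only: the function name is hypothetical, and it assumes a daily publication schedule, so every gap it reports is a candidate to check against the film rather than a confirmed missing issue (a daily may not have published on Sundays or holidays).

```python
from collections import Counter
from datetime import date, timedelta


def collation_anomalies(issue_dates):
    """Return (duplicate dates, dates with no issue) for one reel."""
    counts = Counter(issue_dates)
    duplicates = sorted(d for d, n in counts.items() if n > 1)

    present = sorted(counts)
    gaps = []
    # Walk each pair of consecutive issue dates and list the days between them.
    for earlier, later in zip(present, present[1:]):
        day = earlier + timedelta(days=1)
        while day < later:
            gaps.append(day)
            day += timedelta(days=1)
    return duplicates, gaps


dates = [date(1902, 1, 1), date(1902, 1, 2), date(1902, 1, 2), date(1902, 1, 5)]
duplicates, gaps = collation_anomalies(dates)
print(duplicates)  # [datetime.date(1902, 1, 2)]
print(gaps)        # [datetime.date(1902, 1, 3), datetime.date(1902, 1, 4)]
```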
Quality Review: Before Delivery to Vendor
● Revisit collation sheet and reel
metadata line-by-line
● Confirm for accuracy
● Check delivered page count against
● Check all notation for standardization
and clarity
● Metadata properly formatted
iArchives
● iArchives Portal
○ Upload Reel-Level and Page-Level
metadata in .CSV format
● Ship negative reels for digitization,
along with a blank hard drive
Scanning Specifications
● Scan from clean second-generation
duplicate silver negative microfilm
(to be deposited at the Library of
Congress at the end of the award
period)
● Capture specifications are 8-bit
grayscale, between 300 and 400
dpi
● Target film strip should be
scanned at the start of each
session
● Provide the master page images,
delivered to LC, as uncompressed
images in TIFF 6.0 format
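The capture specifications above reduce to a few properties that can be checked per master image. This is a minimal sketch under stated assumptions: `ScanProperties` and its fields are hypothetical, and the values are presumed to have been read from each TIFF header by whatever tool the project uses.

```python
from typing import NamedTuple


class ScanProperties(NamedTuple):
    """Hypothetical summary of one master image's header values."""
    bits_per_sample: int
    samples_per_pixel: int  # 1 means a single grayscale channel
    dpi: int
    compressed: bool


def meets_capture_spec(scan: ScanProperties) -> bool:
    """True for uncompressed 8-bit grayscale scanned at 300 to 400 dpi."""
    return (
        scan.bits_per_sample == 8
        and scan.samples_per_pixel == 1
        and 300 <= scan.dpi <= 400
        and not scan.compressed
    )


print(meets_capture_spec(ScanProperties(8, 1, 400, False)))  # True
print(meets_capture_spec(ScanProperties(8, 1, 200, False)))  # False
```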
Quality Review
- The Quality Review process ensures that NDNP specifications are met
by checking for image quality, irregularities, and correct
bibliographic software
- Digital Viewer and Validator (DVV)
  - Allows awardees and vendors to view data and
    validate technical aspects of files
  - Verification checks digital signatures of all
    files in a batch
Quality Review
● Verify Batch
● Double check dates using Calendar View
in DVV, cross reference with Reel-Level
and Page-Level data
● View thumbnails
● Check OCR (10% of pages)
● Verify Batch with DVV for a second time
● Email Tonijala Penn (LC Liaison) and Deb
Thomas (Project Coordinator for NDNP)
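For the 10% OCR check in the list above, the pages to review can be drawn at random so that the check is not biased toward the start of a reel. This is a sketch only: the function name, the page identifiers, and the choice to round the sample size up are all assumptions rather than part of the NDNP procedure.

```python
import math
import random


def ocr_review_sample(page_ids, percent=10, seed=None):
    """Pick a random sample of pages for manual OCR review."""
    pages = list(page_ids)
    if not pages:
        return []
    # Round up so that small batches still get at least one page reviewed.
    sample_size = max(1, math.ceil(len(pages) * percent / 100))
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return sorted(rng.sample(pages, sample_size))


pages = [f"{n:04d}" for n in range(1, 201)]  # hypothetical page identifiers
selected = ocr_review_sample(pages, seed=1)
print(len(selected))  # 20
```

Recording the seed alongside the batch notes lets a second reviewer reproduce the same selection.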
Library of Congress
● Ship to LC
○ Hard Drive
○ Shipping Manifest
○ Use fluorescent stickers!
● Receives and processes batch
● 6-8 weeks turnaround time
● If accepted, batch is ingested
into Chronicling America