1
Large-scale collaborative digitisation
19th Century Pamphlets Online
Mar-2007 – Feb-2009
Grant [email protected]
Project Manager, 19th Century Pamphlets Online, University of Southampton & RLUK
Digitisation & Digital Preservation Specialist, Cambridge University Library
2
Overview
Pamphlets – what’s interesting about pamphlets?
Project – what’s interesting about the project?
Resource – what’s on offer for users?
Lessons – what lessons are there for digitisers?
3
1. What’s interesting about pamphlets?
4
1. What’s interesting about pamphlets?
Key means of getting message out
Informative and opinionated
Debates over time
Collected and kept
Complement other forms of publication
Underutilised scholarly resource
5
More than just printed text!
1. What’s interesting about pamphlets?
6
2. What’s interesting about the project?
Large partnership involved
Substantial & significant content
Builds on previous work
Business model for sustainability & preservation
Resource discovery model
7
Large partnership (12)
JISC – major funder
RLUK – sponsor & funder
University of Southampton / BOPCRIS unit – lead, digitisation
JSTOR – resource discovery, delivery and preservation
Mimas – resource discovery
RLUK Libraries – pamphlet contributors
• Bristol
• Durham
• Liverpool
• LSE
• Manchester
• Newcastle
• UCL
8
Significant content
Library: Collections
Durham: Earls Grey – Family collection
Liverpool: Earls of Derby – Family collection
UCL: Joseph Hume (1777-1855) – Personal collection
Newcastle: Joseph Cowen (1829-1900) – Personal collection
Manchester: Foreign Office & Colonial Office collections – Government collections of overseas pamphlets
Manchester: Selections from 19th Century collection – Strong on slavery and local issues
Bristol: Selections from 19th Century collection – Strong on political parties
LSE: Selections from 19th Century collection – Strong on pressure groups
9
Substantial content
23,000+ pamphlets
1 million+ pages
3 million+ files
£1.1 million budget (£780K from JISC)
10
Substantial content
Per page:
Image
OCR text (plain & co-ordinated)
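As a rough consistency check, the per-page outputs listed here can be multiplied out against the totals quoted on the previous slide. The sketch below assumes exactly three files per page (one image, one plain OCR file, one co-ordinated OCR file), which the slides imply but do not state outright, and it uses the rounded lower bounds from the slides. The variable names are illustrative only.

```python
# Rounded lower bounds quoted on the slides ("23,000+", "1 million+").
pamphlets = 23_000
pages = 1_000_000

# Assumption: each page yields one image, one plain-text OCR file and
# one co-ordinated OCR file, i.e. three files per page.
files_per_page = 3

total_files = pages * files_per_page
avg_pages_per_pamphlet = pages / pamphlets

print(f"Estimated files: {total_files:,}")
print(f"Average pages per pamphlet: {avg_pages_per_pamphlet:.1f}")
```

Under that assumption the estimate matches the "3 million+ files" figure, and the average pamphlet comes out at a little over 43 pages.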
11
Substantial content
Per pamphlet:
XML metadata: MODS, MIX and PREMIS in a METS wrapper
Folder of image and OCR files
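To make the "METS wrapper" idea concrete, here is a minimal sketch of how MODS, MIX and PREMIS records can sit inside one METS document alongside pointers to the page files. It is not the project's actual profile: the slides do not say which METS sections were used, so the choice of `techMD` for MIX and `digiprovMD` for PREMIS, the `ID` values, the `USE` label and the file names are all assumptions made for illustration. The metadata payloads themselves are left as empty `xmlData` slots.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(tag: str) -> str:
    """Return a tag name qualified with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets_skeleton(pamphlet_id: str, page_files: list[str]) -> ET.Element:
    """Build an empty METS wrapper with slots for MODS, MIX and PREMIS."""
    root = ET.Element(m("mets"), {"OBJID": pamphlet_id})

    # Descriptive metadata: a MODS record would be placed inside xmlData.
    dmd = ET.SubElement(root, m("dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, m("mdWrap"), {"MDTYPE": "MODS"})
    ET.SubElement(wrap, m("xmlData"))

    # Administrative metadata: MIX (METS type NISOIMG) and PREMIS slots.
    amd = ET.SubElement(root, m("amdSec"), {"ID": "AMD1"})
    tech = ET.SubElement(amd, m("techMD"), {"ID": "TECH1"})
    wrap = ET.SubElement(tech, m("mdWrap"), {"MDTYPE": "NISOIMG"})
    ET.SubElement(wrap, m("xmlData"))
    prov = ET.SubElement(amd, m("digiprovMD"), {"ID": "PROV1"})
    wrap = ET.SubElement(prov, m("mdWrap"), {"MDTYPE": "PREMIS"})
    ET.SubElement(wrap, m("xmlData"))

    # File section lists the page files; the structural map orders them.
    file_sec = ET.SubElement(root, m("fileSec"))
    grp = ET.SubElement(file_sec, m("fileGrp"), {"USE": "page images"})
    struct = ET.SubElement(root, m("structMap"), {"TYPE": "physical"})
    top = ET.SubElement(struct, m("div"), {"TYPE": "pamphlet"})
    for order, href in enumerate(page_files, start=1):
        file_id = f"IMG{order:04d}"
        f = ET.SubElement(grp, m("file"), {"ID": file_id})
        ET.SubElement(
            f, m("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": href}
        )
        page = ET.SubElement(top, m("div"), {"TYPE": "page", "ORDER": str(order)})
        ET.SubElement(page, m("fptr"), {"FILEID": file_id})
    return root


# Hypothetical pamphlet identifier and file names, for illustration only.
example = build_mets_skeleton("pamphlet-0001", ["0001.tif", "0002.tif"])
print(ET.tostring(example, encoding="unicode"))
```

A real record would also carry the OCR files, most likely as further file groups, and would need validating against the METS schema and whichever versions of MIX and PREMIS are in use.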
12
Builds on previous work
Metadata – RSLP/CURL 19th Century Pamphlets Cataloguing Project (1999-2002, £800K)
Digitisation infrastructure – BOPCRIS digitisation unit
Delivery & preservation infrastructure – JSTOR
Relationships – RLUK membership
13
Interesting business model
Partners license all content to RLUK
RLUK-JSTOR agreement for 25 years
• JSTOR provides free archiving & delivery for UK in exchange for commercialisation elsewhere
Only exclusive for 5 years. After this…
• Libraries could deliver digital copies of their own pamphlets via open access
• RLUK could enter into further agreements over use of the content
14
Interesting resource discovery model
Pamphlet Collection
Google Scholar Search
Copac Academic & National Library Catalogue
Catalogues of libraries holding pamphlets
JSTOR’s search interface
19th Century Pamphlets Web Guide
Pamphlet level (bibliographic)
Full text search
JSTOR
Mimas
Links from other JSTOR content
Regular Google Search
Many other services, resources & collections
CrossRef, OAI…
15
3. What’s on offer for users?
From early February: c. 7,000 pamphlets in initial release from JSTOR
From early March: www.pamphlets.ac.uk - online guide to pamphlets for researchers and educators
20 March - Formal launch at conference in Liverpool (free academic event)
16
3. What’s on offer for users?
17
4. What lessons are there for digitisers?
The headlines:
Projects don’t go to plan – things go wrong and opportunities arise
Projects depend on people as well as technology – good communication and trust are vital
18
4. What lessons are there for digitisers?
…about digitising pamphlets:
Scholars view pamphlets differently (intellectual content vs archival objects; individual items vs collections)
Libraries treat pamphlets differently (definition, location, binding, handling)
19
4. What lessons are there for digitisers?
… about the workflow:
Sampling & piloting are helpful but not foolproof
Time & motion is important – every second counts when undertaking large-scale digitisation
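The "every second counts" point can be illustrated with simple arithmetic on the project's own page count. The sketch below uses the "1 million+ pages" figure from the slides; the one-second saving per page and the 7.5-hour working day are hypothetical values chosen for illustration, not figures reported by the project.

```python
pages = 1_000_000            # "1 million+ pages" from the slides
seconds_saved_per_page = 1   # hypothetical saving per page
hours_per_working_day = 7.5  # assumption, not from the slides

hours_saved = pages * seconds_saved_per_page / 3600
working_days_saved = hours_saved / hours_per_working_day

print(f"{hours_saved:.0f} operator hours, about {working_days_saved:.0f} working days")
```

On these assumptions, trimming a single second from the handling of each page frees up roughly 278 operator hours, or about 37 working days, over the life of the project.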
20
4. What lessons are there for digitisers?
… about IPR:
Important to accept some element of risk with copyright (<1% vs >25%)
Licensing arrangements can be extremely complex and protracted (9 separate agreements required)
21
4. What lessons are there for digitisers?
… about the use of standards:
Not always clear (e.g. different ways to mark-up with METS)
Not always stable (MIX and PREMIS were updated during course of project)
22
4. What lessons are there for digitisers?
… about working collaboratively:
Can pose challenges & require work (differing priorities, cultures, timezones)
Can provide opportunities & flexibility (pool of skills/experience to draw on, ‘extra-curricular’ activities)