nyam print preservation conference
TRANSCRIPT
Constance Malpas
Program Officer, OCLC Research
Conference on Collections of Historical &
Unique Value in the Middle Atlantic
Region (NN/LM MAR)
Managing Legacy Collections
in the Mass-digitized Library
Environment: the role of
regional print repositories
New York Academy of Medicine
25 March 2011
My role today
• Provide ‘system-wide’ perspective on changes in the
library environment as they pertain to print collections
management
• Highlight some recent research and its implications for
NN/LM libraries in the Middle Atlantic Region
• Offer a point of view on the respective roles and
responsibilities of academic and special libraries (and
NN/LM) in cooperative print preservation
OCLC Research: what we do
Special focus on libraries in research institutions:
in US, libraries supporting doctoral-level education account for
<20% of academic libraries; >70% of library spending
changes in this sector impact library system as a whole;
collective preservation and access goals, shared infrastructure, &c.
Support global cooperative by providing internal data and
process analyses to inform enterprise service development
(R&D) and deploying collective research capacity to deepen
public understanding of the evolving library system
OCLC Research: who we are
~45 FTE with offices in Ohio, California and Leiden
Sponsored by OCLC and a partnership of research libraries
around the world that share:
• A strong motivation to effect system-wide change
• A commitment to collaboration as a means of achieving collective gains
• A desire to engage internationally
• Senior management ready to provide leadership within the transnational
research library community
• Deep and rich collections and a mandate to make them accessible
• The capacity and the will to contribute
System-wide organization
• Characterization of the aggregate library resource
Collections, services, user behaviors, institutional capacity
• Re-organization of individual libraries in network context
Institutions adapting to changes in system-wide organization
• Re-organization of the library system in network context
‘Multi-institutional’ library framework, collective adaptation
Research theme addresses “big picture” questions about the
future of libraries in the network environment; implications
for collections, services, institutions embedded in complex
networks of collaboration, cooperation and exchange
Defining characteristics of SO activities
Emphasis on analytic frameworks and heuristic models that
characterize (academic) library service environment as a
whole
Identifying and interpreting patterns in distribution,
character, use and value of library resource; implications
for future organization of collections and services
Provides context for decision-making, not prescriptive
judgments about a single, best course of action
Shared understanding of how network environment is
transforming library organization on micro and macro level
Collections Grid
[Figure: two-by-two grid. Axes: low to high stewardship; in few to in many collections. Labels: Licensed; Purchased. Quadrants: purchased materials and licensed e-resources; research & learning materials; open Web resources; special collections and local digitization.]
Credit: Dempsey, Childress (OCLC Research. 2003)
Library attention and investment are shifting
[Figure: collections grid (axes: low to high stewardship; in few to in many collections; labels: Licensed, Purchased) annotated with levels of library attention. Annotations: Limited; High attention; Less attention; Limited; Aspirational; Occasional; Intentional.]
Impeding this transition is not a good survival strategy
Academic institutions, today and tomorrow
[Figure: collections grid (axes: low to high stewardship; in few to in many collections). Labels: Licensed; Purchased; Redirection of library resource; Today; In 5 yrs.]
An Equal and Opposite Reaction
As an increasing share of library spending is directed
toward licensed content . . .
Pressure on print management costs increases
Fewer institutions to uphold preservation mandate
Stewardship roles must be reassessed
Shared infrastructure requirements will change
What’s special about Medicine?
• Disintermediation of library began earlier and has
advanced further than in many other disciplines
• Web-scale discovery is uncontested; PubMed exercises
significant gravitational pull
• Commercial (and clinical, research) value of current
content is exceptionally high; licensed electronic content
is indispensable to research and practice
• Print collections are correspondingly under-resourced
[apart from the fact that it saves lives and reduces suffering]
A dramatic shift in library resourcing
OCLC Research. Derived from ARL Statistics and ARL Academic Health Sciences Library Statistics (2003/2004-2008/2009)
In 5 years, proportional spending on e-resources
at ARL Health Science Libraries has more than doubled
Median $1.5M
Median $5.9M
Duplication rates are high . . .
[Bar chart: items per manifestation (axis 1 to 8) by LC medical subject class, listed below. Callout: above-average duplication rates in most medical subject areas.]
Medicine: Periodicals, Societies, General Topics [R 1-130]
Medicine: History, Medical Expeditions [R 131-687]
Medicine: Special Subjects [R 690-920]
Medicine and the State [RA 3-420]
Public Health [RA 421-790]
Medical Geography [RA 791-955]
Medical Centers, Hospitals [RA 960-1000.5]
Forensic Medicine, Medical Jurisprudence [RA 1001-1171]
Toxicology [RA 1190-1270]
Pathology [RB]
Internal Medicine, Medical Practice: General Works [RC 1-106]
Infectious and Parasitic Diseases [RC 110-253]
Neoplasms, Neoplastic Diseases [RC 254-298]
Tuberculosis [RC 306-320]
Neurology [RC 321-431]
Psychiatry, Psychopathology [RC 435-576]
Allergic, Metabolic, Nutritional Diseases [RC 578-632]
Diseases of Organs, Glands, Systems [RC 633-935]
Diseases of Regions of the Body [RC 936-951]
Geriatrics [RC 952-954]
Arctic and Tropical Medicine, Industrial Medicine, Space Medicine, Sports …
Other Internal Medicine, Practice of Medicine [any other RC]
Surgery [RD]
Ophthalmology [RE]
Otorhinolaryngology [RF]
Gynecology and Obstetrics [RG]
Pediatrics [RJ]
Dentistry [RK]
Dermatology [RL]
Therapeutics, Pharmacology [RM]
Pharmacy and Materia Medica [RS]
Nursing [RT]
Botanic, Thomsonian, Eclectic Medicine [RV]
Homeopathy [RX]
Other Systems of Medicine [RZ]
OCLC Research. Derived from O’Neill et al. OhioLINK Analysis (2010)
. . . demand (even in the aggregate) is low
[Bar chart: circulation rate (total annual circulations / total circulating items; axis 0.00 to 0.30) by LC medical subject class, R through RZ. Callout: below-average circulation rates in most medical subject areas.]
OCLC Research. Derived from O’Neill et al. OhioLINK Analysis (2010)
… many books have simply ‘aged out’
[Bar chart: average age (in years) per item by LC medical subject class; axis 0 to 160. Labeled values:]
Medical Geography [RA 791-955]: 56.9
Infectious and Parasitic Diseases [RC 110-253]: 46.7
Neoplasms, Neoplastic Diseases [RC 254-298]: 21.6
Tuberculosis [RC 306-320]: 59.5
Nursing [RT]: 23.8
Botanic, Thomsonian, Eclectic Medicine [RV]: 136.4
Average age per item (in 2010) = 20.5 years
OCLC Research. Derived from O’Neill et al. OhioLINK Analysis (2010)
‘Value’ of legacy print is a matter of opinion
When I talk to the Dean of the Medical School about the importance of the old collections, he looks at me like I’m not only from Mars, but should probably go back there…
Paul Courant, “Economic Perspectives on Academic Libraries,” OCLC Distinguished Seminar Series (July 2010)
http://vidego.multicastmedia.com/player.php?p=vi151eg0 [41.57]
Why all this emphasis on academic libraries?
• NN/LM resource-sharing network is heavily reliant on
academic health science libraries
• Shift to electronic resources and declining use of physical
inventory makes it increasingly difficult for academic
libraries to uphold traditional print preservation mandate
Can a network established to provide equitable access to
health information be mobilized in support of collective
preservation goals? What incentives are needed to ensure
that cooperative preservation is sustainable?
A ‘big shift’ in academic collections
Analysis of mass-digitized book corpus
& academic print collections (2009-2010) found:
• avg. ~30% of ARL library collection
duplicated in HathiTrust digital
repository as of June 2010
• up to 75% of mass-digitized book
collection already held in print
storage facilities
• shared print repositories represent
critical infrastructure during p- to e-
transition, enabling significant library
space savings and cost avoidance
Mellon-funded project with
HathiTrust, NYU, & ReCAP partners
Working hypothesis
Emergence of large scale shared print and digital
repositories creates opportunity for strategic
externalization of core library operations
• Reduce costs of preserving scholarly record
• Enable reallocation of institutional resources
• New business relationships among library partners
Network (inter)dependencies
OCLC Research Analysis of WorldCat snapshot. Data current as of March 2011.
For academic HSL especially, concerns about sustainability of
current investment in redundant print inventory
N = 28 libraries
8.8M holdings
6.3M unique titles
Snapshot of the mass-digitized library environment
[Bar chart: Medical Literature in the HathiTrust Digital Library, N = 119,217 titles (<3% of titles in HathiTrust Library). Unique titles / editions (axis 0 to 60,000) by subject category: Medicine; Medicine By Discipline; Preclinical Sciences; Health Facilities, Nursing; Medicine By Body System; Communicable Diseases & Misc. Bars split into full-view and search-only. Chart label: medical 13%.]
OCLC Research Analysis of WorldCat and HathiTrust snapshots. Data current as of March 2011.
Time for a game!
How much of the aggregate collection of NN/LM MAR
Resource Libraries is already duplicated in the
HathiTrust Digital Library?
a) <10%
b) 10-15%
c) >30%
The correct answer is . . .
MAR Resource Libraries: Collective Collection
8.8 million WorldCat holdings in 28 MAR Resource Libraries
>290K holdings (33%)
duplicated in HathiTrust
Digital Library
OCLC Research Analysis of WorldCat and HathiTrust snapshots. Data current as of March 2011.
Round Two . . .
I’d guess the percentage overlap between the HathiTrust
Digital Library and my library is:
a) 20-30%
b) 30-50%
c) 50% or more
The answer is . . .
A view of the MAR Resource Library network
OCLC Research Analysis of WorldCat and HathiTrust snapshots. Data current as of March 2011.
Median overlap = 38%
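The overlap figures above come from matching WorldCat holdings against a HathiTrust snapshot. As a rough illustration only (not the OCLC Research method), the sketch below assumes each collection can be reduced to a set of matched title identifiers; the library names, identifiers, and the helper `overlap_rate` are all hypothetical.

```python
from statistics import median

def overlap_rate(library_titles, digitized_titles):
    """Share of a library's titles that also appear in the digital repository."""
    if not library_titles:
        return 0.0
    return len(library_titles & digitized_titles) / len(library_titles)

# Hypothetical identifiers standing in for matched title records.
digitized = {"t1", "t2", "t3", "t4", "t5"}
libraries = {
    "Library A": {"t1", "t2", "t9", "t10"},            # 2 of 4 titles digitized
    "Library B": {"t3", "t11", "t12", "t13", "t14"},   # 1 of 5 titles digitized
    "Library C": {"t4", "t5", "t6", "t15"},            # 2 of 4 titles digitized
}

# Per-library overlap, then the median across the network.
rates = {name: overlap_rate(titles, digitized) for name, titles in libraries.items()}
median_overlap = median(rates.values())

for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
print(f"Median overlap: {median_overlap:.0%}")
```

The median is used here, as on the slide, because a few very large or very small collections would otherwise dominate a simple average.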
Stewardship and sustainability:
a pragmatic view
Using recent life-cycle adjusted cost model* for library print collections,
$4.25 per volume per year -- on campus
$0.86 per volume per year -- in high-density storage
Cornell University’s Weill Medical College is spending between
[35.6K titles * $0.86 =] $30K to $150K [= 35.6K titles * $4.25] annually
to retain local copies of content preserved by the HathiTrust (which
Cornell is also paying for) . . .
The library is not financially accountable for these costs
but it is responsible for managing them
*Paul Courant and Matthew “Buzzy” Nielsen, “On the Cost of Keeping a Book” in The Idea of Order (CLIR, 2010)
~2.2K linear feet of shelf space
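The retention-cost range on this slide is per-volume arithmetic: number of volumes times an annual rate. A minimal sketch, using the two rates quoted from Courant and Nielsen and assuming roughly 35,600 duplicated volumes (the count at which the quoted $30K to $150K range follows from those rates); the function name is illustrative.

```python
# Per-volume annual retention costs quoted on the slide (Courant and Nielsen, 2010).
COST_ON_CAMPUS = 4.25      # dollars per volume per year, on campus
COST_HIGH_DENSITY = 0.86   # dollars per volume per year, high-density storage

def annual_retention_cost(volumes, cost_per_volume):
    """Annual cost of keeping `volumes` print volumes at a given per-volume rate."""
    return volumes * cost_per_volume

# Assumption: about 35,600 locally held volumes are duplicated in HathiTrust.
duplicated_volumes = 35_600

low = annual_retention_cost(duplicated_volumes, COST_HIGH_DENSITY)
high = annual_retention_cost(duplicated_volumes, COST_ON_CAMPUS)
print(f"${low:,.0f} to ${high:,.0f} per year")
```

The gap between the two bounds is the potential saving from moving the same volumes from campus stacks to high-density storage; withdrawing local copies in favor of a shared archive would avoid the cost altogether.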
Perceptions of (health science) library value
NN/LM MAR Value of Library & Information Services in Patient Care (29 Sep 2010) https://webmeeting.nih.gov/p30767239/
Print is no longer at the center
We need a risk communication strategy
• Public domain in HathiTrust
cf. snippet view in GBS
• Scanned from U Minnesota
• 515 library holdings
Including NYAM, SUNY
Upstate, Columbia HSL, and
UPMC and …
Redundant inventory is associated
with real costs that increase
stress on fragile library system
Where is all of this heading?
• In academic health science libraries especially, retention
of low (and no-) use print collections is increasingly
difficult to justify
• Print stewardship responsibilities will need to be
renegotiated, for journals and books
• This has implications for NN/LM resource sharing
relationships
• New business arrangements between academic and
“other” health science libraries may re-valorize historical
collections and enable sustainable preservation strategy
A new kind of ‘resource’ library?
Print archiving partners enable redistribution of library
investment in print preservation
• Enabling rationalization of regional inventory
• Facilitating revitalization of library service portfolio(s)
• Reducing total cost of library service to health sector
A virtuous circle in which NN/LM delivers maximum
value at minimum cost; library print supply chain is
preserved during ongoing p- to e- transition
Hospital Library
OCLC Research Analysis of WorldCat and HathiTrust snapshots. Data current as of March 2011.
84% of digitized titles held by >99 libraries
Incentive to withdraw local copy increases
Academic Health Sciences Library
75% of digitized titles held by >99 libraries
OCLC Research Analysis of WorldCat and HathiTrust snapshots. Data current as of March 2011.
Market for shared print provision increases
Independent Research Library
OCLC Research Analysis of WorldCat and HathiTrust snapshots. Data current as of March 2011.
53% of digitized titles held by >99 libraries
Value of Hathi preservation increases
What sorts of arrangements are needed?
• Academic libraries will need an assurance of substantial
space savings and guaranteed access (persistence)
• Print archive partners will need to reconfigure collections
to maximize their ‘business’ value to academic institutions
• Deliberate (regional?) strategy to reduce redundant
inventory, maximize visibility of print archive
holdings, cultivate new audiences for legacy print
Appropriate incentives will be needed to motivate
change on both sides, ensure sustainability
Potential Strategies for MAR
• Leverage available infrastructure
• Can academic HSL collections in ReCAP, Pitt, &c. storage
facilities be made available to MAR as de facto print archives?
• Can IRLA and special collection repositories assume new role as
regional preservation agents?
• Develop a ‘risk communication’ plan to assist MAR libraries in
managing down redundant print inventory
• Enlist stakeholders among faculty, students, researchers
• Divide & conquer
• Different approaches may be required for cooperative
preservation of serials, monographs, and primary resources
A de facto archive in the making?
As of February 2011:
Columbia Health Sciences holdings in ReCAP: 162,120 items
• HR Health Science Restricted 107,167 items
• HS Health Science 39,761 items
• HX Health Science Special 15,192 items
~5% of 3.5M items deposited by Columbia University
18,227 retrievals for these items over 8 years or ~1%
circulation rate annually
8,223 titles duplicated in HathiTrust repository; median 95
library holdings per title; 86% books; 43% public domain
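The ReCAP figures above can be cross-checked with a few lines of arithmetic. This sketch uses only the numbers on the slide; the variable names are illustrative, and the annual rate applies the definition used earlier in the talk (annual circulations divided by circulating items).

```python
# Columbia Health Sciences items in ReCAP, by collection code (February 2011).
items_by_code = {
    "HR": 107_167,  # Health Science Restricted
    "HS": 39_761,   # Health Science
    "HX": 15_192,   # Health Science Special
}
total_items = sum(items_by_code.values())

columbia_deposits = 3_500_000   # all items deposited by Columbia University
retrievals = 18_227             # retrievals of the health sciences items
years = 8                       # period over which retrievals were counted

share_of_deposits = total_items / columbia_deposits
annual_circulation_rate = retrievals / years / total_items

print(f"Total health sciences items: {total_items:,}")
print(f"Share of Columbia deposits: {share_of_deposits:.1%}")
print(f"Annual circulation rate: {annual_circulation_rate:.1%}")
```

The three collection codes sum to the stated 162,120 items, and the computed share and annual rate are consistent with the rounded ~5% and ~1% on the slide.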
Roles and responsibilities in MAR
Academic HSL
• Embrace cooperative sourcing of collections as a strategy for
legacy print management; maximize reliance on shared
infrastructure
Special libraries
• Cultivate brand as cultural and community resource; if your
collections aren’t demonstrably enhancing institutional
reputation, they are at risk (no matter how ‘valuable’)
Archival repositories
• Exercise your expertise in appraisal by helping colleagues
understand ‘how much (or little) is enough’
How can NN/LM help?
• Create incentives for member participation in cooperative print
archiving efforts
• e.g. a special designation for libraries that assume responsibility
for preservation of resources that enables a shift in resourcing
across the network
• Make regional preservation needs-assessment a priority in next
RML contract cycle
• Variable distribution of aggregate HSL resource means different
strategies will be needed in the 8 regions
• Increase visibility of NLM as part of shared preservation
infrastructure by elevating selective archiving commitments
• A national strategy that distinguishes between the ends and
means of print collections
For discussion
What would it take to enable e.g. Columbia HSL to make
ReCAP holdings available to NN/LM MAR as shared
preservation infrastructure?
What would it take for e.g. NYAM to make digitized
monographic collections available to NN/LM as shared
print archive?
• What tangible benefits accrue to NN/LM members?
• What incentives are needed to motivate Resource Libraries
to share this infrastructure?
• Can some number of ‘free riders’ be accommodated?
See for yourself . . .
http://www.stats.oclc.org/cusp/login
Use your FirstSearch, Resource Sharing or Connexion Autho
(Pie chart is masked here)
Thanks for your attention.
Comments and questions are welcome
@ConstanceM
Selected Resources
OCLC Research – Shared Print activities – Constance Malpas
• Cloud-sourcing Research Collections: www.oclc.org/research/publications/library/2011/2011-01.pdf
• MARC 583 for print archives disclosure: http://docs.google.com/View?id=dc2djpm6_46cbq4kfgd
OCLC Print Archive Pilot – Kathryn Harnish
• http://www.oclc.org/us/en/productworks/coopprintarchiving.htm
CRL Print Archives Registry – Lizanne Payne
• http://archivereg.crl.edu/
UK Research Reserve Journal Archiving Project – Frances Boyle
• http://www.ukrr.ac.uk/
R2 Sustainable Collections – Rick Lugg
• http://sampleandhold-r2.blogspot.com/
Ithaka S+R – Roger Schonfeld
• http://www.ithaka.org/ithaka-s-r