Preserving Content from Your Institutional Repository
DESCRIPTION
Between institutional repositories and hosting journals, many libraries are becoming responsible for scholarly content in new ways. While PDFs are the most common format today, the unique, local, serial content may be in a variety of formats. These items may be digitized text, born-digital text, audio, video, or images. This presentation will discuss formats that will remain accessible through time (PDF/A, txt, xml) so that content is not locked in proprietary formats. It will also discuss options for backing up items and associated metadata, including simple backups, off-site storage of files, LOCKSS, Private LOCKSS Networks, and Portico. The presenters will offer suggestions for how to ensure your local content is being preserved properly.

Carol Ann Borchert, Coordinator for Serials, University of South Florida
Carol Ann Borchert has been the Coordinator for Serials at the University of South Florida (USF) since 2004. Previously, she was in the Reference and Government Documents departments at USF, and in several areas of the James B. Duke Library at Furman University. She holds an MLS from the University of Kentucky and an M.A. in Spanish from USF.

Wendy Robertson, University of Iowa
Wendy Robertson, Digital Scholarship Librarian, has worked as a librarian at The University of Iowa Libraries since 2001. Her previous positions include Electronic Resources Systems Librarian in Enterprise Applications, Electronic Resources Management Unit Head in Technical Services, and Electronic Resources Technical Services Librarian in Serials. She holds an MLS from The University of Iowa.

TRANSCRIPT
![Page 1: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/1.jpg)
Preserving Content from Your
Institutional Repository
Wendy C Robertson and Carol Ann Borchert
NASIG, Buffalo, N.Y., June 8 2013
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.
![Page 2: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/2.jpg)
The Signal
http://blogs.loc.gov/digitalpreservation/
![Page 3: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/3.jpg)
An institutional repository is…
“a permanent, institution-wide repository of diverse, locally produced digital works (e.g., article preprints and postprints, data sets, electronic theses and dissertations, learning objects, and technical reports) that is available for public use and supports metadata harvesting.”
University of Houston Libraries, Institutional Repository Task Force. Institutional Repositories. SPEC Kit 292. July 2006. p.13
![Page 4: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/4.jpg)
An institutional repository is not…
Most IRs currently are not preservation repositories; they do not meet all the criteria in Trustworthy Repositories Audit & Certification (TRAC) or other audits.
![Page 5: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/5.jpg)
10 basic characteristics of digital preservation repositories (CRL)
1. The repository commits to continuing maintenance of digital objects for identified community/communities.
2. Demonstrates organizational fitness (including financial, staffing, and processes) to fulfill its commitment.
3. Acquires and maintains requisite contractual and legal rights and fulfills responsibilities.
4. Has an effective and efficient policy framework.
![Page 6: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/6.jpg)
10 basic characteristics (cont.)
5. Acquires and ingests digital objects based upon stated criteria that correspond to its commitments and capabilities.
6. Maintains/ensures the integrity, authenticity and usability of digital objects it holds over time.
![Page 7: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/7.jpg)
10 basic characteristics (cont.)
7. Creates and maintains requisite metadata about actions taken on digital objects during preservation as well as about the relevant production, access support, and usage process contexts before preservation.
8. Fulfills requisite dissemination requirements.
9. Has a strategic program for preservation planning and action.
10. Has technical infrastructure adequate to continuing maintenance and security of its digital objects.
![Page 8: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/8.jpg)
The year is 2100. Can you read your files?
![Page 9: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/9.jpg)
Our questions for you
• Who has an IR?
• What platform are you using?
• Who’s backing it up?
• Who’s part of a PLN?
• Who’s having their IR journals preserved in LOCKSS or Portico?
Question mark sign by Colin_K, on Flickr
![Page 10: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/10.jpg)
Localized disasters
![Page 11: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/11.jpg)
Fire
Flood
http://chronicle.com/blogs/wiredcampus/what-katrina-can-teach-libraries-about-sandy-and-other-disasters/40986
Hurricane
Tornado
http://blog.al.com/spotnews/2011/10/plans_to_rebuild_birmingham_li.html
http://www.ncsml.org/Content/About-Us/Museum-History/2008-Flood.aspx
http://news.bbc.co.uk/onthisday/hi/dates/stories/august/1/newsid_2526000/2526839.stm
![Page 12: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/12.jpg)
War
Tsunami
Earthquake
© 2011 UMD Libraries
http://savemlak.jp/wiki/saveMLAK/en?lang=en&uselang=en
http://www.flickr.com/photos/umd_libraries/6075914283/in/set-72157627383474133
http://www.bostonglobe.com/business/2013/02/17/the-smuggled-hard-drives-timbuktu/rCyv0QL1FdLLkw4tjv6hDO/story.html
![Page 13: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/13.jpg)
Disasters with warning
Moving servers out of the University of Iowa Libraries, 2008.
© 2008 The University of Iowa
http://digital.lib.uiowa.edu/cdm/ref/collection/flood/id/3414
![Page 14: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/14.jpg)
Disasters with no warning
University of South Florida, very localized flood
http://lib.usf.edu/offtheshelf/tampa-library/the-flood-of-09dedication-in-the-face-of-disaster/
![Page 15: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/15.jpg)
Backups vs. preservation
“Disaster recovery strategies and backup systems are not sufficient to ensure survival and access to authentic digital resources over time. A backup is a short-term data recovery solution following loss or corruption and is fundamentally different to an electronic preservation archive.”
JISC. Digital Preservation: Continued Access to Authentic Digital Assets (November 2006)
![Page 16: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/16.jpg)
Exit strategy
Make sure you can easily migrate all your content and metadata out of your system in a usable format.
![Page 17: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/17.jpg)
Test, test and test some more
Test that all files are as expected regarding structure and completeness.
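One way to make this kind of test repeatable is a small script run against an export. The sketch below is only an illustration, not the presenters' method: the function name `check_export` and the idea of a list of expected file names are assumptions, and the only structural test shown is that a `.pdf` file begins with the standard `%PDF-` signature.

```python
from pathlib import Path


def check_export(export_dir, expected_names):
    """Return a list of problems found in an exported set of files.

    `expected_names` is a hypothetical list of file names that the
    export should contain (for example, taken from a metadata export).
    """
    problems = []
    root = Path(export_dir)
    for name in expected_names:
        path = root / name
        # Completeness: every expected file must be present.
        if not path.is_file():
            problems.append(f"missing: {name}")
            continue
        # Completeness: a zero-byte file usually means a failed copy.
        if path.stat().st_size == 0:
            problems.append(f"empty: {name}")
            continue
        # Structure: a PDF file starts with the bytes "%PDF-".
        if path.suffix.lower() == ".pdf":
            with path.open("rb") as fh:
                if fh.read(5) != b"%PDF-":
                    problems.append(f"not a PDF: {name}")
    return problems
```

An empty list means every expected file was found, was non-empty, and (for PDFs) had the right signature. Deeper checks, such as validating PDF/A conformance, need a dedicated validator and are outside this sketch.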
![Page 18: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/18.jpg)
Persistent identifiers
Using persistent identifiers now will help if you move to a new repository in the future.
![Page 19: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/19.jpg)
Preserving the Web
You may want to archive institutional content that is not appropriate for an IR but which is appropriate for the library’s mission.
http://dx.doi.org/10.7207/twr13-01
![Page 20: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/20.jpg)
Archive-It
Archive-It can preserve journals and other scholarly work from your institution that doesn’t go into your repository.
http://archive-it.org/collections/824
![Page 21: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/21.jpg)
Internet Archive
“The Montana State Library (MSL) last year moved a copy of its collection of 3000 born digital state publications to the Internet Archive (IA).”—Chris Stockwell for Montana State Library, 12/29/2010
http://archive.org/post/340223/how-montana-state-library-uploaded-batches-of-digital-objects-to-the-internet-archive
http://archive.org/details/MontanaStateLibrary
![Page 22: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/22.jpg)
IRs are a bit different…
The copy of the document in the repository often is the only version you have.
![Page 23: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/23.jpg)
Access copy vs. preservation copy
Digitized content may have a preservation scan as well as the version which displays to the public.
![Page 24: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/24.jpg)
IRs have special problems…
Automatically adding a cover page to brand and identify content changes the file, perhaps even removing accessibility features.
![Page 25: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/25.jpg)
File formats
When possible, use open file formats so content will remain accessible long into the future, but will you turn down content in other formats?
![Page 26: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/26.jpg)
PDF/A (ISO 19005-1:2005)
PDF/A is an ISO standard “which provides a mechanism for representing electronic documents in a manner that preserves their visual appearance over time, independent of the tools and systems for creating or rendering the files.”
http://www.pdfa.org/publication/pdfa-in-a-nutshell-2-0/
![Page 27: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/27.jpg)
U Iowa electronic theses & dissertations
1931 PDFs and 7 XML documents
Supplemented by:
21 .avi
1 .avp
8 .doc
2 .mov
2 .mp3
1 .mp4
4 .mpg
1 .mxf
3 .NTS
2 .pde
6 .pdf
4 .txt
3 .wmv
18 .xls
2 .zip
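An inventory like the one above can be produced by counting file extensions in a directory of exported files. This is a minimal sketch under assumptions: the function name `count_extensions` is hypothetical, and extensions are lower-cased, so `.NTS` and `.nts` would be counted together.

```python
from collections import Counter
from pathlib import Path


def count_extensions(root):
    """Count files under `root` (recursively) by lower-cased extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files with no extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts
```

Calling `count_extensions(...).most_common()` on an export directory lists the formats from most to least frequent, which shows at a glance which formats may need attention. An extension is only a hint about the real format, so format identification tools are still needed for a reliable answer.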
![Page 28: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/28.jpg)
Public preservation policy
Make your preservation and submission policy clear so that contributors understand the risks of contributing a non-open format.
http://services.ideals.illinois.edu/wiki/bin/view/IDEALS/PreservationSupportPolicy
![Page 29: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/29.jpg)
Preservation metadata
PREMIS (PREservation Metadata Implementation Strategies)
“Preservation metadata supports activities intended to ensure the long-term usability of a digital resource.”—Caplan, p.3
http://www.loc.gov/standards/premis/understanding-premis.pdf
![Page 30: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/30.jpg)
Digital provenance
“Metadata can help support authenticity by documenting the digital provenance of the resource — its chain of custody and authorized change history.”
Caplan, Priscilla. Understanding PREMIS. Library of Congress, ©2009. p.3
![Page 31: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/31.jpg)
Methods of preserving data
• Refreshing data
• Migrating data
• Emulating software platform
• Replicating
• Validating data integrity
• Metadata
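Validating data integrity is commonly done by recording a checksum for each file and recomputing it later. The sketch below is one possible approach, not something the slides prescribe: the choice of SHA-256 and the function names `sha256_of`, `build_manifest`, and `verify_manifest` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Compare the files under `root` with a previously saved manifest."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

A manifest built at deposit time can be saved alongside the metadata and checked on a schedule. Any entry under `missing` or `changed` signals that a copy should be repaired from another replica.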
![Page 32: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/32.jpg)
Long-term preservation options
•Global LOCKSS Network
• Private LOCKSS Network
• Portico
![Page 33: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/33.jpg)
Global LOCKSS Network
• For e-journal content
• Preserves the format as well as the content
• Light archive
• Adding journals to LOCKSS
• Notify LOCKSS of metadata/file changes
• Not all serials are appropriate for Global LOCKSS
![Page 34: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/34.jpg)
Private LOCKSS Network
• All material from the IR
• Need at least 7 nodes/destinations
• Each should be a LOCKSS Alliance member
• Set up policies and governance for the PLN
![Page 35: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/35.jpg)
Setting up policies for a PLN
• How long is initial commitment?
• How much notice to withdraw?
• How do members remove data for withdrawn institution?
• Does the group need a governing body or steering committee?
• Will the PLN be a dark or light archive?
• Do any of the members have embargoed materials?
![Page 36: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/36.jpg)
Examples of PLNs
![Page 37: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/37.jpg)
Portico
• For e-books and e-journals
• Source files converted to an archive format
• Dark archive
• Portico is responsible for future content migrations
• Adding journals to Portico
• Not all serials are appropriate for Portico
![Page 38: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/38.jpg)
Factors to consider in developing a formal preservation plan
• Organizational & financial commitment
• Stakeholders
• Local backups vs. long-term preservation
• Storage needs
• Roles & responsibilities
• Data ingestion
• Policy on deletion of or embargoes for materials
• Funding
• Staff
![Page 39: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/39.jpg)
Organizational & financial commitment
•What is the long-term financial commitment from your library or institution?
•Do you have the support of the organization? From what level of administration?
![Page 40: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/40.jpg)
Stakeholders
•Producers
•Users
•Owners
•Managers
•Funding authorities
•Other parties?
![Page 41: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/41.jpg)
Local backups vs. long-term preservation
•Definition of backups versus preservation
•Metadata, content, software, or all of these?
•How often and who is responsible?
•PLN or other option for long-term preservation
![Page 42: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/42.jpg)
Storage needs
Disk space
• How much space do you need?
• Who is responsible for maintaining disks?
Software
• Which software will be required?
• Who migrates information as software needs change?
Equipment
• What equipment will you need?
• Who will fund the equipment, set it up, maintain it?
![Page 43: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/43.jpg)
Roles & responsibilities
•Who is implementing the plan?
•Who is maintaining the data and how?
•Who is providing support for accessing material and troubleshooting issues?
![Page 44: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/44.jpg)
Data ingestion
•How are you getting data into the system for preservation or backup?
•Will this be done in-house or outsourced to a third party?
•How frequently and in what format?
![Page 45: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/45.jpg)
Funding vs. staffing
• Is it easier to fund these efforts at your organization or staff them?
• How well-staffed is your organization?
•What kind of expertise do you have (or not have) in the library?
•What level of commitment does your organization have to preserve digital information?
![Page 46: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/46.jpg)
Questions?
Wendy Robertson
Digital Scholarship Librarian
University of Iowa
[email protected]
@wendycr_

Carol Ann Borchert
Coordinator for Serials
University of South Florida
[email protected]
![Page 47: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/47.jpg)
Sources
Ball, Alex. Preservation and Curation in Institutional Repositories. Digital Curation Centre, UKOLN, 2010. Version 1.3 http://www.dcc.ac.uk/sites/default/files/documents/reports/irpc-report-v1.3.pdf
Caplan, Priscilla. Understanding PREMIS. Library of Congress, ©2009. http://www.loc.gov/standards/premis/understanding-premis.pdf
Digital Repository Audit Method Based On Risk Assessment (DRAMBORA). Glasgow, 2009. http://www.dcc.ac.uk/resources/repository-audit-and-assessment/drambora
JISC. Digital Preservation: Continued Access to Authentic Digital Assets (Nov. 2006) http://www.jisc.ac.uk/publications/briefingpapers/2006/pub_digipreservationbp.aspx
![Page 48: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/48.jpg)
Sources
Nestor Working Group. Catalogue of Criteria for Trusted Digital Repositories. Frankfurt am Main, Dec. 2006. Urn: de:0008-2006060703
OpenDOAR Policies Tool. http://www.opendoar.org/tools/en/policies.php
Oettler, Alexandra. PDF/A in a Nutshell 2.0: PDF for long-term archiving. Berlin: Association for Digital Document Standards e. V., ©2013. http://www.pdfa.org/wp-content/uploads/2013/04/PDFA_in_a_Nutshell_21.pdf
Pennock, Maureen. Web-Archiving. DPC Technology Watch Report 12-01 March 2013. DOI: http://dx.doi.org/10.7207/twr13-01
Reference Model for an Open Archival Information System (OAIS). Recommended Practice CCSDS 650.0-M-2. Magenta Book, June 2012. http://public.ccsds.org/publications/archive/650x0m2.pdf
![Page 49: Preserving Content from Your Institutional Repository](https://reader036.vdocuments.site/reader036/viewer/2022062312/5562adbfd8b42a6e4f8b52b3/html5/thumbnails/49.jpg)
Sources
Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC). Version 1.0. Feb 2007. http://www.crl.edu/archiving-preservation/digital-archives/metrics-assessing-and-certifying/trac
University of Houston Libraries, Institutional Repository Task Force. Institutional Repositories. SPEC Kit 292. July 2006. http://publications.arl.org/Institutional-Repositories-SPEC-Kit-292/3
University of Illinois at Urbana-Champaign. “IDEALS Digital Preservation Support Policy.” ©2013 https://services.ideals.illinois.edu/wiki/bin/view/IDEALS/PreservationSupportPolicy
University of Illinois at Urbana-Champaign. “Preparing Items for Deposit into IDEALS. File Format Recommendations” ©2013 https://services.ideals.illinois.edu/wiki/bin/view/IDEALS/SubmissionPrep#File_Format_Recommendations