Long-term Preservation of Digital Scholarly Literature
Craig Van Dyck
NISO-NFAIS Virtual Conference: Making Certain Digital Content is Preserved
7 December 2016
Why Preservation Matters
• End users
• Libraries
• Publishers
• Grant funders
• Research institutes
Stakeholders
• Scholars rely on permanent access to digital materials
• The scholarly literature is long-lived
• Libraries as the stewards of preservation
• Libraries may not own copies of the digital literature
• Publisher-provided access can be unstable
Stakeholders, cont’d
• Funders want the output from their funding to remain available
• Research institutes need their faculty to have access to materials; and need to be sure that their faculty’s output will be accessible
How CLOCKSS Works
• Introduction
• Technology; LOCKSS
• Processes
• Governance
• Statistics
• Triggers
• Challenges
• Priorities
CLOCKSS: Controlled LOCKSS (Lots of Copies Keep Stuff Safe)
• Began operations in 2006
• Ensuring long-term access to scholarly literature for researchers
• A diverse, robust ecosystem of digital preservation solutions
• CLOCKSS preserves and archives on behalf of libraries
• Libraries have insisted that publishers archive their content
CLOCKSS -- Technology
• CLOCKSS uses the open source LOCKSS technology, with 12 library server nodes:
- NA: Indiana, OCLC, Rice, Stanford, Virginia, Alberta
- Europe: Edinburgh, Humboldt/Germany, Universita Cattolica/Italy
- APac: Hong Kong U, NII/Japan, Australia National U
CLOCKSS – Technology, cont’d
• CLOCKSS is certified as a Trusted Digital Repository by the Center for Research Libraries
• TRAC audit perfect score for technology; see David Rosenthal blog: http://blog.dshr.org/2014/07/trac-certification-of-clockss-archive.html
A word about LOCKSS
• From the Stanford University Library
• Unique technology solution: multiple servers constantly cross-checking each other, ensuring the preserved data is valid
• Many instances:
- Global LOCKSS Network: 150 nodes, each with their own collection; post-cancellation access
- 14 Private LOCKSS Networks, e.g. CLOCKSS, Public Knowledge Project, Canadian Government Information, CARINIANA (Brazil), ADPN, US government documents
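As a rough illustration of the cross-checking idea on this slide, the sketch below compares SHA-256 digests of the same item as held on several nodes and flags any copy that disagrees with the majority. It is a deliberately simplified stand-in, not the actual LOCKSS polling protocol; the function name `find_damaged_copies` and the node labels are hypothetical.

```python
import hashlib
from collections import Counter


def find_damaged_copies(copies: dict[str, bytes]) -> list[str]:
    """Return the nodes whose copy disagrees with the majority digest.

    `copies` maps a (hypothetical) node name to the bytes that node holds
    for one preserved item. This is an illustrative majority vote only.
    """
    # Hash each node's copy so that only digests need to be compared.
    digests = {
        node: hashlib.sha256(data).hexdigest() for node, data in copies.items()
    }

    # Find the digest that most nodes agree on.
    majority_digest, count = Counter(digests.values()).most_common(1)[0]

    # Without a strict majority there is no trustworthy reference copy.
    if count * 2 <= len(copies):
        raise ValueError("no majority agreement among nodes")

    # Any node that differs from the majority is a candidate for repair.
    return sorted(node for node, d in digests.items() if d != majority_digest)


copies = {
    "node_a": b"article bytes",
    "node_b": b"article bytes",
    "node_c": b"article bytez",  # simulated corruption
}
damaged = find_damaged_copies(copies)
print(damaged)
```

In this toy setup a flagged node would then fetch a fresh copy from a node in the majority; how repair actually happens in LOCKSS is not covered by the slides.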
CLOCKSS -- Processes
• Content submission via file transfer or web harvest:
- https://www.clockss.org/clocksswiki/files/File_Transfer_Guidelines_-_CLOCKSS.pdf
- https://www.clockss.org/clocksswiki/files/Web_Harvest_Guidelines_-_CLOCKSS.pdf
• Web harvest is particularly useful with “long tail” publishers
CLOCKSS -- Governance
• CLOCKSS is a “dark” archive
• Triggered content is made available as open access
• What does “trigger” mean?
- When digital content ceases to be available to end users
- Access must be ensured, to support scholarship
CLOCKSS – Governance, cont’d
• Community governance: equal number of libraries and publishers on the Board of Directors
• Funded by publisher fees and voluntary library contributions
• Free-standing 501(c)(3) non-profit
• Financially stable
CLOCKSS -- Statistics
• 200 publisher participants, 750 library supporters
• 15 million journal articles and books, adding ~4 million/year
• 5 largest publishers = 70% of the content
• “long tail” publishers = 65% of the publishers
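To put the percentages on this slide into absolute terms, the short calculation below applies them to the slide's rounded totals. The results are only as precise as those rounded inputs, and the variable and function names are illustrative.

```python
# Rounded figures taken from the slide.
publishers = 200
items = 15_000_000  # journal articles and books


def percent_of(total: int, percent: int) -> int:
    """Integer percentage, avoiding floating-point rounding noise."""
    return total * percent // 100


# Roughly how many items come from the 5 largest publishers (70% of content).
items_from_top_five = percent_of(items, 70)

# Roughly how many participating publishers are "long tail" (65% of publishers).
long_tail_publishers = percent_of(publishers, 65)

print(items_from_top_five, long_tail_publishers)
```

On these figures, about 10.5 million items come from the five largest publishers, and about 130 of the 200 participating publishers are in the long tail.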
CLOCKSS – Triggering Content for Access
• Rigorous rules and practices
• Bylaws require 75% Board vote in favor, with no more than 2 voting against a trigger
• 29 triggered journals; 1 million downloads this year
• Triggered journals are open access via CLOCKSS, at Stanford and Edinburgh
• Creative Commons Attribution-Noncommercial-No Derivative Works License
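The voting rule on this slide can be sketched as a small check. The slide does not say what the 75% is measured against, so this sketch assumes it is 75% of all Board members taking part in the vote, abstentions included; that reading, and the function name `trigger_approved`, are assumptions rather than anything taken from the CLOCKSS bylaws.

```python
def trigger_approved(votes_for: int, votes_against: int, abstentions: int = 0) -> bool:
    """Check a trigger vote against the rule stated on the slide.

    Assumption: the 75% threshold is measured against everyone taking part
    in the vote (for + against + abstaining). The bylaws may define it
    differently.
    """
    total = votes_for + votes_against + abstentions
    if total == 0:
        return False

    # At least 75% in favor, written with integers: for / total >= 3 / 4.
    enough_support = votes_for * 4 >= total * 3

    # No more than 2 votes against, regardless of the share in favor.
    few_objections = votes_against <= 2

    return enough_support and few_objections


print(trigger_approved(10, 2))     # 10 of 12 in favor, 2 against
print(trigger_approved(9, 3))      # 75% in favor, but 3 against
print(trigger_approved(8, 2, 2))   # only 8 of 12 in favor
```

The second condition matters on its own: a large Board could reach 75% in favor while three or more members object, and the rule as stated would still block the trigger.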
Challenges: Two Asks
1. Preserving the “Long Tail”:
- Long tail journals are the most at-risk, and the hardest to find and work with
- We need libraries’ priorities for what to archive that is not yet archived
2. Financial support for a diverse and robust digital preservation environment
CLOCKSS – 2017 Priorities
• Investing in hardware and software: capacity, timeliness
• Adding more large backfiles
• New content types, e.g. datasets, video, databases
• Stronger transparency
• Increased outreach
CLOCKSS, concluded
CLOCKSS holdings are publicly reported in the Keepers Registry: https://thekeepers.org/
https://[email protected]