The Convergence of Information Technology, Data, and Management in a Library Imaging Program
Author(s): Fenella G. France, Doug Emery, and Michael B. Toth
Source: The Library Quarterly, Vol. 80, No. 1 (January 2010), pp. 33-59
Published by: The University of Chicago Press
Stable URL: http://www.jstor.org/stable/10.1086/648462
Accessed: 17/06/2014 21:20
The University of Chicago Press is collaborating with JSTOR to digitize, preserve and extend access to The Library Quarterly.




[Library Quarterly, vol. 80, no. 1, pp. 33–59]

© 2010 by The University of Chicago. All rights reserved.

0024-2519/2010/8001-0002$10.00

THE CONVERGENCE OF INFORMATION TECHNOLOGY, DATA, AND MANAGEMENT IN A LIBRARY IMAGING PROGRAM¹

Fenella G. France,² Doug Emery,³ and Michael B. Toth⁴

Integrating advanced imaging and processing capabilities in libraries, archives, and museums requires effective systems and information management to ensure that the large amounts of digital data about cultural artifacts can be readily acquired, stored, archived, accessed, processed, and linked to other data. The Library of Congress is developing advanced digital imaging capabilities into effective preservation and verification tools as part of a larger scientific preservation study, processing, and research system for implementation by other institutions. An integrated hyperspectral imaging system provides a critical nondestructive tool for research into paper, parchment, photographic, and other objects within the Library and other cultural heritage institutions. This article addresses the implementation of this capability and appropriate metadata standards and the impact on the development and convergence of information architectures and systems in libraries, archives, and museums.

1. The authors would like to thank the management and staff of the Library of Congress, in particular the Chief of PRTD, Eric F. Hansen, and the Preservation Directorate Director, Dianne van der Reyden. The maps were made available thanks to the support of the Library’s Geography and Maps Division, led by John Hebert. We would also like to thank the library conservation team from the Conservation Division for the significant conservation support before and during the imaging effort. Imaging scientists include Bill Christens-Barry from Equipoise Imaging, Roger Easton Jr. from the Rochester Institute of Technology, and Keith Knox from Boeing Corporation. Ken Boydston of MegaVision Imaging provided the camera and software for the imaging, with technical support from Richard Chang. We appreciate the comments and support of Will Noel and others on the Archimedes Palimpsest project.

2. Preservation Research and Testing Division, Library of Congress, 101 Independence Ave. SE, Washington, DC 20540; Telephone 202-707-5525; E-mail [email protected].

3. Emery IT, 3623 Yolando Road, Baltimore, MD 21218; E-mail [email protected].

4. R. B. Toth Associates, 10606 Vale Road, Oakton, VA 22124; E-mail [email protected].


Introduction

The application of mature imaging hardware, software, and standards from earth-sensing, astronomical, medical, national security, and consumer applications is decreasing risk in collecting and managing digital images and data in libraries, archives, and museums. Early navigators, astronomers, and surveyors used both established and pioneering scientific techniques during early maritime and terrestrial exploration of new worlds, stars, and planets. Centuries later, advanced imaging techniques provided mankind with the first opportunity to clearly see and survey vast regions of earth and space. Just as mankind is now able to survey and study the planet and universe with a range of digital imaging technologies, imaging and preservation scientists are turning the same digital imaging techniques to the study of historic and cultural treasures. Imaging science has expanded the historical record of works of art, historic documents, and national treasures.

Digital imaging in discrete spectral bands to support cultural heritage institutions has progressed to the point where it is now a key laboratory tool. There are very real challenges and consequences with the application of new technologies to libraries, archives, and museums. All three have significant collections of fragile items, and while digital imaging increases access, the adoption of techniques for analysis requires a high level of accountability to develop a truly nondestructive, noninvasive, and enduring methodology. The successful integration of spectral imaging for cultural heritage is underway simultaneously, and collaboratively, in a number of countries throughout the world. As C. Fischer and I. Kakoulli noted in 2006: “Primarily used for scientific investigations of paintings, it has also been successfully applied to the study of documents, the evaluation of conservation treatments and digital imaging for documentation” [1, p. 3]. While many developments depend upon advances in technology, others reflect the ability to build upon the efforts of other researchers [2, 3]. The analysis for transcription and translation of deteriorated ancient texts requires collaboration between conservators and scholars for effective translation, while accurate determination of parameters associated with substrates and media without sampling is critical to the assessment and preservation of many international items of cultural heritage [4].

The successful transfer of spectral imaging to cultural heritage institutions depends on how effectively the new technology can be integrated into work processes, study methods, and current systems. The real challenge for successful application of new technologies with information technology and data management is the integration of people and processes with the technology. The convergence of advanced digital imaging and lighting systems with data standards and effective management has improved access to and allowed efficient handling of large volumes of data for libraries, archives, and museums.

The Implementation of a Digital Imaging System at the Library of Congress

Like other libraries and museums, the Library of Congress has worked with a range of basic spectral imaging systems to study objects in the collection, including ultraviolet (UV) and infrared (IR) light and a broadband low-resolution multispectral system with two broad, distinct bands in the UV, one each in the red, green, and blue regions of the visible spectrum, and three bands in the IR. These lighting systems were used to assess the UV fluorescence of various compounds and to increase the visibility of background or low-visibility information, mainly for conservation assessment and treatment decisions. Researchers—scientific, conservation, and scholarly—in cultural heritage organizations worldwide often utilize UV and IR lighting to enhance features not clearly visible to the naked eye.

Advances over the past decade have imparted the capability to utilize a robust spectral imaging system that provides large-format, high-quality images and standardized data output with commercial off-the-shelf components, including integrated collection and data storage software. The reduced costs and increased capabilities take spectral imaging from an exotic activity to a nondestructive working tool. Building on the pioneering research and development efforts in systems design, spectral illumination and imaging, and data management in the Archimedes Palimpsest Program [5], the Library contracted for a hyperspectral imaging study of a key object in the collection to assess the capability and versatility of a fully integrated hyperspectral imaging system.

Hyperspectral imaging—taking a series of digital images in nanometer bandwidths of the visible and nonvisible spectrum, from ultraviolet through visible to infrared, often in contiguous wavelengths—can detect variances in components or material composition of the object being imaged at any wavelength or combination of wavelengths. The resulting images may be digitally combined with or subtracted from each other to form images for scientific or scholarly analysis. These hyperspectral images contain a wealth of information but require significant interpretation to process and analyze the collected data. Advances in technology have greatly impacted the field of spectral imaging, which has been further enhanced through improvements in image capture techniques and specific postimage processing (see fig. 1, in the online version of this article), offering libraries, as well as museums and archives, a useful and efficient tool for the study of cultural objects [6].
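The notion of an image cube, and of digitally combining or subtracting registered bands, can be sketched as follows. This is a minimal illustration with synthetic data; the band list and array sizes are assumptions for the example, not those of the Library’s system.

```python
import numpy as np

# Synthetic "image cube": one registered grayscale image per illumination band.
# Real captures run to tens of megapixels; a 4x4 grid suffices to illustrate.
wavelengths_nm = [365, 445, 625, 870]           # illustrative bands only
rng = np.random.default_rng(0)
cube = rng.random((len(wavelengths_nm), 4, 4))  # shape: (bands, rows, cols)

# A feature faint in one band may respond strongly in another; subtracting
# two registered bands suppresses what they share and reveals what differs.
ir = cube[wavelengths_nm.index(870)]
uv = cube[wavelengths_nm.index(365)]
difference = ir - uv

# Bands may also be combined, for example averaged, to improve signal-to-noise.
combined = cube.mean(axis=0)
```

Because every band is a registered view of the same object, these per-pixel operations are meaningful: each pixel position refers to the same physical point in every band.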


Hyperspectral imaging of cultural objects requires well-developed metadata to capture object identification, scientific, and spatial information to ensure long-term access, management, and use of the objects. The special nature of hyperspectral imaging is the creation of multiple images of the same cultural object that are related to one another along multiple axes, among them content identification, spatial registration, and illumination parameters. This convergence of scientific techniques and the creation of large digital data sets with library and museum objects requires a metadata standard that encompasses multiple domains and goes beyond the provisions of any single existing metadata standard. Data acquisition and format must anticipate the needs of libraries, archives, scholars, scientists, and other potential users of the data.
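The kind of multi-domain record such a standard must accommodate can be sketched as a simple data structure. The field names and values below are illustrative assumptions, not the Library’s actual schema.

```python
# One record per captured image, spanning the domains named above:
# content identification, spatial registration, and illumination parameters.
image_record = {
    "object_id": "example-sheet-06",  # hypothetical identifier
    "content": {"title": "Example map sheet", "repository": "Example Library"},
    "spatial": {"x_offset_px": 0, "y_offset_px": 0, "dpi": 300},
    "illumination": {"wavelength_nm": 625, "source": "LED", "raking": False},
}

def related_along_axis(a, b, axis):
    """Two captures of the same object are related along an axis
    (e.g. illumination) when they share object identity but differ there."""
    return a["object_id"] == b["object_id"] and a[axis] != b[axis]

# A second capture of the same sheet under a different wavelength.
other = dict(image_record,
             illumination={"wavelength_nm": 870, "source": "LED", "raking": False})
same_object_new_band = related_along_axis(image_record, other, "illumination")
```

The point of the sketch is the relational structure: many images of one object, each distinguished by the axis along which it was captured.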

As both an international library and the de facto national library of the United States, the Library of Congress is a major repository and archive of knowledge and creativity in the United States. Its extensive collection includes three types of media that reflect the chronology of their technological introduction: historic and ancient traditional materials (e.g., stone, papyrus, parchment, and paper), more modern nineteenth- and twentieth-century analog audiovisual materials (e.g., photographic negatives, magnetic tape, motion pictures), and contemporary digital materials (optical discs, storage devices, and hardware for machine readability). It is the host of an increasing number of permanent and temporary exhibits that allow public access to a rich array of significant items of cultural heritage. The hyperspectral imaging study was in part related to the latter role, since the Waldseemüller 1507 “World Map” (the first map to use the term “America”) must always be available for public viewing as required by the terms of its purchase by the Library of Congress. As outreach, a permanent exhibit was created in the Jefferson Building to house this map as part of the exhibit Exploring the Early Americas [7].

As a response to the requirements of a permanent exhibit, the map was housed in an environmentally controlled aluminum encasement in an anoxic environment to minimize the potential degradation caused by the presence of oxygen. Maintaining control of relative humidity and oxygen demanded a permanent hermetic seal that restricted access to the actual map. The high-resolution hyperspectral imaging was undertaken to provide baseline information about the map and its condition, as well as to create an extensive library of images that could be used for research into printing and construction techniques and scholarly interpretations; to enhance lost information; and to characterize colorants, inks, and other media and substrates present. This challenge was exacerbated by the requirement to complete the work within a limited time frame prior to the installation of the exhibit in December 2007.

In November 2007, the Library’s Preservation, Research and Testing Division (PRTD) conducted the hyperspectral digital imaging studies of the Waldseemüller 1507 paper map and other maps from the Geography and Maps Division, including the Carta Marina paper map and the parchment “Map with Ship.” A joint imaging team imaged the twelve sheets of 18 × 24-inch paper on which the Waldseemüller map was printed and other items. Regular meetings were held with Library of Congress personnel before, during, and after the imaging process to define the Library’s study requirements and optimize the imaging studies to meet these requirements.

In a convergence of library and museum requirements, the Library of Congress chose to utilize the same illumination system pioneered during the Archimedes Palimpsest Program at the Walters Art Museum, in order to minimize project risk. Illumination was provided by the latest Equipoise Imaging EurekaLight LED illuminators and SideLong raking light illuminators (see fig. 2, in the online version of this article). Raking illumination was provided in two spectral bands (470 and 910 nanometers) to illuminate from either side of the object at a low, oblique angle. While this technique is commonly used to reveal impressions and topographic features in the paper or parchment, only the current imaging allows the potential to combine this information with other spectral regions of interest. Cool-light LED sources provided illumination at 12 specific wavelengths from two illumination panels: UV (365 nanometers), visible (445, 470, 505, 530, 570, 617, 625 nanometers), and IR (700, 735, 780, 870 nanometers).

The glass camera lens did not allow imaging of reflected UV light but allowed the capture of UV fluorescence from the excitation of elements in the object. Use of UV light was limited to the specific image capture to minimize damage to the object. Reflected and fluorescing light was captured in individual registered images from each spectral band, yielding an image cube of multiple images for processing. Registration of the images—aligning all features on an image so that they match perfectly—ensures that processed images are sharp and crisp when overlaid or combined.
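One standard way to verify or estimate the alignment between two captures is phase correlation, sketched below on synthetic data. This is a generic registration technique offered for illustration; the article does not describe the team’s actual registration pipeline.

```python
import numpy as np

def estimate_shift(reference, moved):
    """Estimate the (row, col) translation from `reference` to `moved`
    via phase correlation: the normalized cross-power spectrum of two
    translated images has an inverse FFT that peaks at the shift."""
    f1 = np.fft.fft2(reference)
    f2 = np.fft.fft2(moved)
    cross_power = f2 * np.conj(f1)
    cross_power /= np.abs(cross_power) + 1e-12   # keep phase, drop magnitude
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (wrap-around).
    shifts = [int(p) if p <= s // 2 else int(p) - s
              for p, s in zip(peak, correlation.shape)]
    return tuple(shifts)

rng = np.random.default_rng(1)
base = rng.random((64, 64))
shifted = np.roll(base, shift=(3, -5), axis=(0, 1))  # known displacement
recovered = estimate_shift(base, shifted)            # recovers (3, -5)
```

A perfectly registered pair returns a shift of (0, 0); any nonzero result flags misalignment before bands are overlaid or combined.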

To minimize the impact of light on fragile items, the length of exposure was carefully controlled and the image histogram maximized individually for each exposure. The cool-light LED panels reduced another concern with the imaging of fragile artifacts: the increase in temperature from lights, which reduces the relative humidity or moisture content of the materials, causing dimensional changes that ultimately lead to microfractures and points of weakness in the organic materials. One of the concerns with historic documents is minimizing the effects of photodegradation, since photography and imaging systems often expose the item to the high levels of light required to obtain a high-quality image. The graph (see fig. 3, in the online version of this article) clearly illustrates the reduction of light exposure from the normal room lights (which are usually enhanced by floodlights to increase this up to four times) at 350 lux to approximately 3.5 lux. To make sense of these numbers, it should be noted that recommended lighting levels for display of light-sensitive materials are not to exceed 50 lux, with lower levels preferred for very fragile, light-sensitive items and long-term displays. All light damage is cumulative, so lower levels of light employed in analyses as well as in exhibition greatly increase the lifetime of the item and its component materials.
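Because light damage is cumulative, the comparison is naturally expressed as a dose (lux multiplied by hours). The arithmetic below uses the figures from the text; the session length is a hypothetical value for comparison only.

```python
# Figures from the discussion above: normal room lighting of 350 lux,
# boosted up to four times by floodlights; about 3.5 lux from the LED system.
room_lux = 350
floodlit_lux = 4 * room_lux      # up to 1,400 lux under conventional photography
led_lux = 3.5

session_hours = 2                # hypothetical imaging-session length
conventional_dose = floodlit_lux * session_hours   # lux-hours
led_dose = led_lux * session_hours                 # lux-hours

# The LED system delivers one-hundredth of ordinary room-light exposure.
reduction_factor = room_lux / led_lux
```

Against the recommended 50 lux display ceiling for light-sensitive materials, the roughly 3.5 lux imaging level is an order of magnitude lower still.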

The LED illumination system was integrated with a monochrome camera to minimize problems encountered with the Bayer pattern of small colored filters used on color cameras. Color cameras that collect red-green-blue (RGB) images require the use of a color filter array (a Bayer filter) that uses twice as many green elements as red or blue to mimic the peak response of the human eye to green wavelengths. With this array, each pixel is filtered to record only one of three colors; therefore, only one-third of the color data are collected at each pixel. To obtain a full-color image, demosaicing algorithms are required to interpolate the full range of color for each point, thereby resulting in final images of varying quality. Utilizing a monochrome camera allows all spectral information from the object being imaged to be captured by the camera. By offering imaging in distinct spectral bands or combinations of spectral bands without filters between the camera and the image, this system does not limit the light reaching the camera elements or change the registration of the images. Monochrome images offered the following advantages for artifact studies: (1) no residual artifacts are left from the color sensor’s Bayer pattern filtration; (2) without the Bayer array filtering light, the monochrome camera utilizes over twice the light from the scene, resulting in over twice the signal-to-noise ratio and less light on the cultural object; (3) the need for postprocessing and the anti-aliasing filter common to Bayer pattern digital single-lens reflex devices is eliminated.
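The sampling geometry of a Bayer filter can be made concrete with a few lines of code. This is a generic RGGB layout sketch, not a model of any particular sensor.

```python
import numpy as np

def bayer_mask(rows, cols):
    """Per-pixel channel labels for an RGGB mosaic: each 2x2 tile holds
    one red, two green, and one blue element."""
    mask = np.empty((rows, cols), dtype="<U1")
    mask[0::2, 0::2] = "R"
    mask[0::2, 1::2] = "G"
    mask[1::2, 0::2] = "G"
    mask[1::2, 1::2] = "B"
    return mask

mask = bayer_mask(4, 4)
green_fraction = (mask == "G").mean()   # half the pixels, vs one quarter each R, B
# Each pixel records exactly one channel, so only one-third of full RGB data
# is measured; the rest must be interpolated (demosaiced). A monochrome
# sensor instead records the full scene at every pixel in each spectral band.
```

The one-third figure in the text follows directly: at every pixel, one of three channels is measured and two are estimated.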

The imaging team used a MegaVision digital imaging system that provided a digital image size of about 40 megapixels, offering good resolution and file size for efficient processing and analysis. It utilized an integrated MegaVision Monochrome E6 39-megapixel back and a camera with a Schneider Apo Macro 120-millimeter f/5.6 lens that provided a dynamic range of 12 bits per channel with a CCD array of 7216 × 5412 pixels and 6.8-micron pixel size.

Each sheet of the map was imaged verso and recto at each of the twelve wavelengths to obtain a total of twenty-four macro images per sheet, as well as four raking illumination images (verso and recto). The imaging team produced 300 dots per inch (dpi) 8-bit TIFF images for all illuminations with associated metadata for each image. Higher resolution 600 dpi images were also taken and stitched together to create twenty-four stitched, full-page, high-resolution images per sheet.
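The per-sheet tally above can be checked with simple arithmetic, following the counts as the text gives them (the sheet count is stated earlier in the article).

```python
# Per-sheet image tally, per the counts in the text.
spectral_bands = 12                    # LED wavelengths
sides = 2                              # recto and verso
macro_images = spectral_bands * sides  # twenty-four macro images per sheet

raking_images = 4                      # raking illumination images (verso and recto)

images_per_sheet = macro_images + raking_images
sheets = 12                            # sheets of the Waldseemüller map
total_captures = images_per_sheet * sheets
```

Even at this modest count, twelve sheets yield hundreds of large TIFF files before any processed derivatives are produced, which motivates the data management discussion later in the article.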


The imaging also captured spatial information for each image, an approach built on concepts established during the Archimedes Palimpsest project, in what was dubbed “scriptospatial imaging.” In the same manner and following the same standards, the Library of Congress is establishing the ability to spatially link data derived from the images to specific points on the images themselves. This includes preservation and conservation annotations and data about specific regions on the document, visible and “hidden” text, and data from other studies. Prior to imaging the Waldseemüller map at the Library of Congress, the data and program manager established a coordinate system for the images. With this standardized system, standard geospatial software can be used to graphically link the images and text for data integration and access.
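The core of the scriptospatial idea, linking annotations to pixel coordinates in a shared frame, can be sketched as follows. The field names and the sample annotation are illustrative assumptions, not the project’s actual data model.

```python
# Annotations tied to rectangular pixel regions in a shared coordinate system,
# so a conservation note can point at a specific spot on a specific sheet.
annotations = []

def annotate(sheet, x, y, width, height, note):
    """Record a note against a rectangular region of a sheet's image."""
    annotations.append({"sheet": sheet, "bbox": (x, y, width, height), "note": note})

def notes_at(sheet, px, py):
    """Retrieve every note whose region contains pixel (px, py)."""
    hits = []
    for a in annotations:
        if a["sheet"] != sheet:
            continue
        x, y, w, h = a["bbox"]
        if x <= px < x + w and y <= py < y + h:
            hits.append(a["note"])
    return hits

annotate("sheet-06", 1200, 800, 300, 200, "red grid line, colorant under study")
```

Because every spectral band shares the same coordinate frame, a single annotation automatically applies to all registered images of the sheet; this is also why standard geospatial software can serve as the linking layer.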

To image the Waldseemüller map at 600 dpi resolution, the imaging team had to take six overlapping images of each 18 × 24-inch sheet of the map and stitch them together to create a single, large image (see fig. 4, in the online version of this article). This was accomplished by positioning the map on a computer-controlled x,y table that translated on the x and y axes to precisely move each sheet through the six positions. The sixteen spectral and raking images were captured at each position along with the spatial coordinates for that position. Following the imaging and image processing, a geospatial scientist digitally stitched the 600 dpi images together to create single images of each sheet.
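The geometry of a six-position capture can be sketched as a tiling computation. The camera field size, grid arrangement, and overlap below are illustrative assumptions; only the sheet size and the count of six positions come from the text.

```python
def tile_origins(sheet_w, sheet_h, cols, rows, field_w, field_h):
    """Evenly spaced tile origins (in inches) so that adjacent camera
    fields overlap, leaving common features for the stitching step."""
    step_x = (sheet_w - field_w) / (cols - 1)
    step_y = (sheet_h - field_h) / (rows - 1)
    return [(round(c * step_x, 2), round(r * step_y, 2))
            for r in range(rows) for c in range(cols)]

# A 24 x 18 inch sheet covered by a 3 x 2 grid of hypothetical 10 x 10 inch
# fields: the x,y table visits each origin in turn.
origins = tile_origins(24, 18, cols=3, rows=2, field_w=10, field_h=10)
```

With these assumed numbers, adjacent columns overlap by 3 inches and the two rows by 2 inches, which is the redundancy the stitching software needs to align the tiles into one seamless sheet image.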

Following the initial imaging session, the imaging team met with the Library of Congress principals to identify the key areas of interest for preservation and scholarly studies. This included experimental image processing to provide insights into the details that could be provided by various types of processing. The imaging team then took the individual spectral images for each map sheet or sheet section and processed them to reveal features of interest to the Library of Congress. The processed images were used for a range of library studies: identifying the response of specific characteristics of the map, including printers’ inks, pigments, iron gall, effects of treatments, and other aspects as identified by conservators/scientists prior to the imaging session, and using the multispectral raking incidence illumination to enhance the visibility of any “topographic” features relating to the woodblock printing and any other mechanical contact.

In October 2008, PRTD conducted studies of L’Enfant’s 1791 “Plan of Washington D.C.” and some daguerreotypes with an enhanced MegaVision-Equipoise Imaging system. This was similar to the imaging system used in the earlier studies, with the addition of a 1,050 nanometer spectral band in the infrared range, as well as enhanced software to allow additional control of the camera and illumination. Building on these studies and imaging systems, the Library is now integrating digital imaging systems in the PRTD Optical Laboratory capable of producing numerous large images. This new technology will be used for preservation studies to support cultural heritage institutions worldwide that have already expressed interest in furthering their capacities for nondestructive enhanced spectral imaging. Digital imaging studies of parchment and paper, colorants and inks, daguerreotypes, and encasement materials are currently underway. A data management system to integrate, access, store, retrieve, and process images is critical to developing this capacity as a standard analytical tool for libraries and museums.

Image Processing

The composite of the twelve discrete spectral imaging wavelengths and four raking combinations creates the image cube (see fig. 5, in the online version of this article). This illustrates the spectral response of the colorant utilized to create the red grid lines on the Waldseemüller map (sheets 6 and 7), where the images captured in the data cube show the change in response relative to the specific wavelength illuminations. Once the spectral regions that best reveal a given feature are identified and utilized in the image processing, they can then be used to enhance this feature for scholarly and preservation research.

For some scholarly, scientific, and preservation studies, digitally imaging an object in specific spectra isolated with either filters or narrow bands of illumination will suffice in revealing underlying areas of interest [8]. However, for many studies, advanced digital processing of images from multiple spectral bands is required to improve visibility of regions of interest by suppressing some features and enhancing others, as well as ascribing certain colors to certain features for better visibility [9, 10].

The collection of multiple registered images across a range of spectral bands offers tremendous opportunities for digital processing of the images to reveal information not readily discernible in the original images. The true added value of imaging in multiple spectral bands lies in the ability to combine images and enhance features, details, and underlying information that are not visible to the naked eye or in a single spectral band. This is accomplished by mathematically combining and suppressing the output from multiple spectral bands, exploiting the spectral signatures at specific wavelengths of light to produce an enhanced image of specific elements of the object.

One of the leading processing techniques used to reveal areas of interest is pseudocolor processing, which was used effectively in the Archimedes Palimpsest Program to reveal the undertext of Archimedes’ work and to separate or suppress the overtext from the Greek book of prayer. At the Library of Congress the imaging team created pseudocolor images using advanced processing techniques to map specific components and areas of interest on the Waldseemüller map and other maps. Without going into the advanced mathematics, suffice it to note that the imaging scientists developed and refined mathematical algorithms that could be applied to each 300 dpi or 600 dpi image cube to yield processed images, including pseudocolor and “embossed” images, the latter presenting a representation of what the original woodblock would have looked like. Utilizing differences in the spectral signatures of the reflected light, the imaging scientists processed the images to highlight different classes of response and produced images with enhanced visibility of areas of interest [11]. This included enhanced pseudocolor images of red grid lines on two of the Waldseemüller map sheets (see fig. 6, in the online version of this article) and various annotations and historical information on the maps. The scientists also processed multiple spectral bands to reveal printing and construction techniques on the Waldseemüller map in “embossed” images (see fig. 7, in the online version of this article).
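In its simplest form, pseudocolor processing assigns different spectral bands (or processed results) to the red, green, and blue channels of a display image, so that materials with distinct spectral signatures take on distinct colors. The sketch below uses synthetic data and arbitrary band choices; it is not the imaging team’s actual recipe.

```python
import numpy as np

# Synthetic image cube: 12 registered bands of an 8x8 scene.
rng = np.random.default_rng(2)
bands, rows, cols = 12, 8, 8
cube = rng.random((bands, rows, cols))

def pseudocolor(red_band, green_band, blue_band):
    """Stack three chosen bands into an RGB image, rescaled to [0, 1].
    Features strong in one band and weak in the others appear tinted."""
    rgb = np.stack([cube[red_band], cube[green_band], cube[blue_band]], axis=-1)
    rgb = rgb - rgb.min()
    return rgb / rgb.max()

# Arbitrary illustrative choice: an IR band to red, visible to green, UV to blue.
image = pseudocolor(red_band=11, green_band=5, blue_band=0)
```

Practical pseudocolor work replaces raw bands with processed combinations (differences, ratios, statistical components), but the channel-assignment step shown here is the same.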

Additional 1200 dpi high-resolution images were taken for processing of specific areas of interest to reveal hidden text. The team also captured transmitted light images to develop a technique for capturing watermarks from the sheets (see fig. 8, in the online version of this article). Such imaging is required for the analysis of papers: it illuminates not only the watermark itself, which can trace the paper to a specific location of production, but also reveals the paper’s construction. Prior to about 1800, all paper was made by hand, and laid paper was made on a mold consisting of a frame that had a fine mesh of wires that ran parallel to the long sides of the mold. These were intersected by thicker wires—the chain lines—that ran parallel to the shorter sides of the mold. After the water with suspended paper fibers was evenly distributed over the wire mesh, the mold would then be inverted, creating two distinct sides to the paper: the mold side carrying the impression of the wire and chain lines. The identification and study of watermarks is critical to analysis and establishment of the provenance of library and archive documents, and this development is being used to establish a technique that meets the current safety requirements of the Library of Congress.

The hyperspectral imaging of L’Enfant’s 1791 “Plan of Washington D.C.” clearly illustrates the convergence of the utilization of this new technology, not only for libraries but also to address the needs of archives and museums to enhance and reveal lost information. Figure 9 (in the online version of this article) shows the comparison of what is currently visible to the naked eye and the same section of the map as revealed under infrared. The neatly penciled street grid and other features stand out in stark contrast. An understanding of treatments or damage that occurred in previous centuries begins to be revealed through other spectral combinations (see fig. 10, in the online version of this article), which is critical for conservation professionals’ work with the documents and artifacts to extend their lifetimes and prevent further deterioration relating to specific chemical reactions that may be occurring.

The capacity to reveal hidden and lost text is of immense value to cultural heritage institutions, providing the ability to confirm provenance, recover lost information, and allow researchers to confirm or disprove theories about the techniques of artists, underpaintings, overwritten text, and the nondestructive identification of inks and colorants (see fig. 11, in the online version of this article). Researchers are continually searching for nondestructive methods of analysis to provide increasing levels of information, and this technique ably demonstrates the challenges faced in the convergence of acquiring information, making it accessible, and managing the large volumes of information generated.

Further work with principal component analysis and with topographic and other processing techniques available in open-source software is currently being pursued for image processing. Further development work on other artifact formats, such as early photographic processes, is proving useful in documenting condition and identifying deterioration mechanisms in daguerreotypes, with the utilization of special imaging procedures providing additional protection for fragile, ungilded images. Figure 12 (in the online version of this article) illustrates the clear images obtained at 638 nanometers of a daguerreotype not fully visible to the naked eye, as well as other deterioration processes apparent on the plate at other spectral wavelengths.
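Principal component analysis treats each pixel as a spectrum (one value per band) and finds the orthogonal combinations of bands with the greatest variance; the early components often concentrate the most informative contrast. A minimal sketch on synthetic data, not a description of the Library’s processing:

```python
import numpy as np

# Synthetic image cube: 12 bands of a 16x16 scene.
rng = np.random.default_rng(3)
bands, rows, cols = 12, 16, 16
cube = rng.random((bands, rows, cols))

pixels = cube.reshape(bands, -1).T          # one 12-band spectrum per pixel
centered = pixels - pixels.mean(axis=0)     # remove the mean spectrum

# Eigen-decomposition of the band covariance matrix (ascending from eigh).
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]           # strongest component first
components = eigvecs[:, order]

# Project every pixel onto the first principal component and view as an image.
pc1_image = (centered @ components[:, 0]).reshape(rows, cols)
```

The projection image for the first component captures the dominant variation across bands; later components often isolate subtler responses, such as a faded ink or a treatment residue.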

Data Management

Installation of an effective imaging system in a library, archive, or museum requires more than good imaging equipment, training, and efficient operation. The Library has learned through experience and work with other imaging programs that simply imaging objects does not address the key issues in providing digital products for users. A full imaging system that offers useful products for a major library must include robust data management capabilities and an information technology infrastructure that ensure users and systems are not swamped with data and can readily access needed images. The study, preservation, storage, and display of historic paper, parchment, and other artifacts require the integration of large amounts of data to provide information of value to key users in the Library of Congress. Retrieving data and information about parchment, paper, and other objects; preserving the data for current and future use; and contributing to the creation of knowledge from the retrieved information and digital data require effective systems development that includes the technology, work processes, and trained personnel. There has been a shift over the past few decades from cultural heritage organizations as repositories of objects to repositories of knowledge [12], so there is a real need to understand the potential impact of these new imaging capabilities on libraries, archives, and museums, including the value added for researchers, as well as public outreach and keeping all systems current with technology and standards.

The Library has implemented a range of integration and management activities to effectively use these advanced imaging systems based on current and future requirements. This includes identification of user needs for data and information at the Library, identification and integration of the technologies and work processes needed to address these needs, and specification of the data and metadata standards and requirements to ensure data portability and access.

Members of the preservation, technical, scholarly, and scientific disciplines all contribute to the collection and study of digital information on a range of parchment, paper, and other objects in the Library. Experience in the Library has shown that the following developments must be addressed in support of future imaging studies to collect and fully exploit imaging digital data for the preservation of cultural objects and with a standard imaging analysis tool:

1. Standardized methods and procedures for hyperspectral imaging and collection of metadata elements for cultural objects. This includes digital imaging studies with the MegaVision-Equipoise system as well as data from other instruments. Appropriate standardized metadata are being defined to ensure that they meet the Library’s standards. Collection methodologies are being established with optimized equipment configurations and aligned with developments in digital data storage and retrieval standards. This information is being integrated with technical and processing information and documented for Library reference and use.

2. Image processing and data management techniques that offer the best information to address the Library studies utilizing hyperspectral imaging. Experts are working with Library personnel in the development of full life cycle, digital image processing capabilities. This includes addressing the application and preservation of metadata and data management techniques, following accepted standard practices.

3. Advanced digital image processing techniques with standard commercial off-the-shelf software (e.g., Photoshop and ImageJ) for digitally enhancing hyperspectral images [13]. This is in response to processing challenges for accessing software capable of carrying out the imaging manipulations required for preservation research and the training required to develop these capabilities for personnel acquiring skills in advanced image processing.
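As a rough illustration of the kind of band arithmetic such tools perform, the following sketch ratios two registered spectral bands to stretch contrast between features that respond differently at different wavelengths. The arrays and values are invented for the example and do not come from the Library’s data:

```python
def band_ratio(band_a, band_b, eps=1e-6):
    """Divide one spectral band by another, then rescale to 0..1.
    Ratioing suppresses features common to both bands (the substrate)
    while stretching features that differ (e.g., faded ink)."""
    ratio = [[a / (b + eps) for a, b in zip(ra, rb)]
             for ra, rb in zip(band_a, band_b)]
    flat = [v for row in ratio for v in row]
    lo, hi = min(flat), max(flat)
    return [[(v - lo) / (hi - lo + eps) for v in row] for row in ratio]

# Toy 2x2 bands: the ink is faint in the visible band, dark in infrared
visible  = [[0.8, 0.8], [0.8, 0.4]]
infrared = [[0.8, 0.8], [0.8, 0.1]]
enhanced = band_ratio(visible, infrared)
print(enhanced[1][1] > 0.99)  # the faded stroke now dominates the contrast
```

The same operation is available interactively in ImageJ’s image calculator; the point of the sketch is only the underlying arithmetic.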


Metadata

The Library is integrating broad imaging and metadata standards, work processes, and technical systems with those in other libraries, archives, and museums. With collaboration across the library community, advanced digital imaging and information systems make major contributions to artifact studies and the information needs of a range of cultural heritage organizations, including museums and archives.

For the Waldseemuller imaging project, the Library employed the metadata scheme developed for the Archimedes Palimpsest project. The Archimedes Palimpsest Metadata Standard (APMS) [14] is based on Dublin Core and the Content Standard for Digital Geospatial Metadata [15]. It also incorporates extensions to the standard where they are needed to complete the scientific description of an image’s capture or creation. The standard is derived from existing standards including the Dublin Core Metadata Element Set 1.1 [16] and the Dublin Core Metadata Initiative Terms [17]. There are six types of information defined in the standard: (1) identification, (2) spatial data reference, (3) imaging and spectral data reference, (4) data type, (5) data content, and (6) metadata reference. The Archimedes Palimpsest Metadata Standard uses the Dublin Core elements and terms to identify the image. The elements based on the geospatial standard, those under items 2, 3, and 4, also describe the image itself. The content of the imaged object—the manuscript page or map—is provided in item 5. The standard holds, first, that its focus is the description of the digital object—from the Metadata Standard’s introduction: “1. Objectives. This standard is intended to provide a common set of terminology and definitions for the archival documentation of digital multispectral imagery data” [14]. As such, the Dublin Core data describe the digital object: each image is assigned its own identifier (the Dublin Core dc:identifier); the creators of the object (dc:creator) are the imagers, not Archimedes, Martin Waldseemuller, or Pierre Charles L’Enfant; and the keywords (dc:subject) describe the image. In this last case, among the subject entries a value will be “Map image” and not “Map.” The distinction is a subtle one, perhaps, but it is critical.
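The digital-object focus described above might be sketched as follows. The field values are hypothetical stand-ins built for illustration, not entries from the actual APMS records:

```python
# Hypothetical Dublin Core entries for ONE digital image (the values are
# illustrative stand-ins, not taken from the actual project metadata).
dc = {
    "dc:identifier": "300_8_r_0_365_pack8",        # one identifier per image
    "dc:creator": "Imaging team",                  # the imagers, not the mapmaker
    "dc:subject": ["Map image", "multispectral"],  # describes the image...
}
# ..."Map image", never "Map": the record describes the digital object,
# not the physical document it captures.
assert "Map" not in dc["dc:subject"]
print(dc["dc:identifier"])
```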

Second, the APMS adds elements that relate scientific parameters of the imaging, apply geospatial imaging data elements to document imaging, and add content elements that allow multiple images of the same object to be related to one another. Sample element descriptions are shown in the appendixes (available in the online version of this article). Appendixes A and B show sample spatial metadata for an image of the Waldseemuller map, with the metadata based on both Dublin Core [16] and the Content Standard for Digital Geospatial Metadata [18]. The standard defines the spatial reference in terms of the dimensions of the object. The pixel dimensions of the image can be determined by multiplying the resolution by the lower-right x and y coordinates. The folio position and count elements locate the image within a grid for high-resolution images of objects too large to image in a single shot and coordinate one image with other images of the same tile of the same object.
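That calculation is simple arithmetic; the resolution and corner coordinates below are hypothetical values chosen only to show it:

```python
def pixel_dimensions(dpi, lower_right_x, lower_right_y):
    """Pixel size of an image whose spatial reference is expressed in
    physical units: resolution (dots per inch) multiplied by the
    lower-right corner coordinates (inches) of the imaged area."""
    return round(dpi * lower_right_x), round(dpi * lower_right_y)

# Hypothetical tile: a 300 dpi capture of an 18 x 12.5 inch area
print(pixel_dimensions(300, 18.0, 12.5))  # (5400, 3750)
```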

Appendix C (in the online version of this article) shows the metadata fields that implement the imaging and spectral data information elements; these include the wavelength of the light or lights used, their wattage, number, and angles of incidence at the four azimuthal points around the imaged object. Appendix D (in the online version of this article) shows a fragment of APMS section 5, content description information. The content description information defines, through the foliation scheme and foliation numbers, the identity of the image subject and relates images of a single subject together. The APMS data elements provide other critical metadata, but these examples demonstrate how the multiple elements work together to select and coordinate images from a larger data set based on imaging, spatial, and content parameters.

The metadata standard meets the specific needs of metadata for hyperspectral imaging of paper and parchment documents. Key to identifying related images, discovering their spectral and spatial characteristics, and performing computational work using them is a standard that describes spatial, scientific, and content information, as the APMS does. For the Waldseemuller project, the metadata are stored in each image file’s ImageDescription TIFF header. Through the ID_File_Name metadata element, the metadata are connected to the image. As it happens, the image file names include descriptions according to a standardized naming convention. File names for 300 dpi images of the recto side of sheet 8 of the Waldseemuller map, for example, include 300_8_r_0_365_pack8.tif, 300_8_r_0_445_pack8.tif, 300_8_r_0_470_pack8.tif, 300_8_r_0_embossed.tif, 300_8_r_0_rkLft445_pack8.tif, and so forth.

The names provide adequate metadata to distinguish the multiple images of a sheet side from one another. The metadata are provided merely as a convenience to the user, and, while it would be possible to use simple serial numbers for each image, this schema provides quick reference information. The definitive descriptive and scientific metadata are provided in the TIFF header. The complete set of images of a single side of a sheet forms an image cube (see fig. 5 in the online version of this article). While the Dublin Core metadata elements identify the individual digital object, the other metadata elements (spatial, imaging and illumination, and data content) describe the matrix of relationships between the images and locate a single image within that matrix.
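One plausible reading of this naming convention (the field meanings here are inferred from the listed examples, not from a published specification) can be captured with a small parser:

```python
import re

# Inferred pattern, for illustration only:
# <dpi>_<sheet>_<side>_<rotation>_<illumination>[_<pack>].tif
NAME = re.compile(
    r"(?P<dpi>\d+)_(?P<sheet>\d+)_(?P<side>[rv])_(?P<rotation>\d+)_"
    r"(?P<illumination>[A-Za-z0-9]+)(?:_(?P<pack>pack\d+))?\.tif$"
)

def parse_name(filename):
    """Split a file name into its quick-reference fields, or None."""
    m = NAME.match(filename)
    return m.groupdict() if m else None

info = parse_name("300_8_r_0_365_pack8.tif")
print(info["sheet"], info["side"], info["illumination"])  # 8 r 365
```

As the text notes, such names are only a convenience; the authoritative metadata live in the TIFF header, not in the file name.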

Standard library and museum cataloging and metadata standards are not designed to store hyperspectral imaging metadata or other types of specialized scientific data. Dublin Core, a very common and popular metadata scheme designed for the World Wide Web, provides elements for identifying objects using a simplified core set of elements. It is intended to be extremely general and applicable to all types of resources [19]. While Dublin Core allows users to locate resources on the Web, MAchine Readable Cataloging (MARC) is a more detailed standard designed specifically for cataloging documents [20]. The Museum System (TMS) is focused on cataloging objects in museum collections. None of these systems is designed to accommodate or enable discovery of the type of scientific and geospatial information stored in hyperspectral imaging, nor should they be. Even a system like TMS, which allows considerable customization of user fields, should not be adapted to the needs of any single sort of scientific data. The Metadata for Images in XML schema (MIX) [21] provides a rich set of metadata elements for digital images but does not cover the special needs of hyperspectral imaging.

There is a special distinction that can be made between MARC and the APMS. MARC is intended to refer to a document. If hyperspectral images were simple surrogates of the imaged document, they could be grouped together and treated as an alternate edition of the document, and this is possible if the Library chooses to add hyperspectral data sets to its collections. However, as pointed out above, the APMS describes the individual digital object. The individual digital object is not the same as the document but is a scientific sample of the object. The metadata that describe the physical object, its MARC record (see appendix E, in the online version of this article), have some similarities to the metadata of the digital object, but their purposes and structure are different.

No one system of cataloging or metadata will be able to address the needs of this special data set, nor will any one of them be able to address the needs of the range of existing or forthcoming types of scientific and other structured data and metadata that libraries may store in digital repositories. As the Library of Congress PRTD continues its use of hyperspectral imaging to extend its knowledge of the condition and content of documents and other objects in the Library’s collection, it has the problem of preserving and accessing the digital data through its metadata and maintaining links between the new data and catalog records for the objects supplemented with new digital data. For example, the staff of PRTD should be able to locate hyperspectral images of the Waldseemuller map based on geospatial or illumination information, move easily from the digital object metadata to the Library’s catalog, and vice versa. Or, preservation staff should be able to locate all digital objects created via hyperspectral imaging or a subset of the objects created using a certain wavelength of light. This repository would not be limited to hyperspectral imaging and would include other scientific analytical and scholarly linked data. For example, the Archimedes Palimpsest project used X-ray fluorescence (XRF) to image a number of folios where text was obscured by twentieth-century forgeries. For these images a special extension of the metadata standard was created to handle some fifty-plus extra data elements required to record the parameters of XRF imaging [22]. The Library’s repository of scientific data will be able to accommodate a range of types of scientific metadata that comply with a core metadata component and requisite extension data. Though the data stored in the repository will be of different types and formats, they are all collected and used for the purpose of scientific preservation.
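The wavelength-based discovery scenario described above amounts to filtering image records on a scientific metadata field. A minimal sketch, with illustrative records rather than the Library’s actual schema:

```python
# Illustrative image records; field names are invented for the example.
records = [
    {"id": "img-001", "wavelength_nm": 365, "object": "Waldseemuller map"},
    {"id": "img-002", "wavelength_nm": 445, "object": "Waldseemuller map"},
    {"id": "img-003", "wavelength_nm": 365, "object": "L'Enfant plan"},
]

def by_wavelength(recs, nm):
    """Return the ids of all images captured at the given wavelength."""
    return [r["id"] for r in recs if r["wavelength_nm"] == nm]

print(by_wavelength(records, 365))  # ['img-001', 'img-003']
```

A real repository would expose this kind of query through indexed search over the stored scientific metadata rather than a linear scan, but the discovery requirement is the same.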

A storage system that complies with PREMIS (Preservation Metadata: Implementation Strategies) guidelines may be adapted to support a range of digital data and metadata. The PREMIS system provides structures for ingesting, preserving, and disseminating packets of information [23]. The PREMIS data model and data dictionary are based on the Reference Model for an Open Archival Information System (OAIS) produced by the Consultative Committee for Space Data Systems [24]. The PREMIS system defines metadata for the preservation of digital objects, such as identifiers, checksums, bit structure, and the relationship between objects. It allows for storing additional metadata relating to objects.

A common method for including additional descriptive and structural metadata is to use METS XML files [25]. The Metadata Encoding and Transmission Standard (METS) is “a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium” [26]. A PREMIS-compliant system is concerned with preserving the digital data and maintaining it over time. The PREMIS system does not make prescriptions for descriptive metadata—metadata that describe the content of digital objects, as opposed to their bits—or for what in OAIS is called Preservation Description Information (PDI). Alternate metadata belonging to the domain of the data content can accompany the digital object, and this is often done using METS, as depicted in figure 13 (in the online version of this article). As there is some overlap between METS and PREMIS, it is necessary to define how administrative and other data are stored in a system employing both METS and PREMIS [27]. For example, both METS and PREMIS provide fields for checksum data, and a common approach is to use redundancy and place checksum information in both schemas.
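The redundant checksum placement can be sketched as follows. The field names follow the METS file attributes and PREMIS fixity units, but both are simplified into plain dictionaries for illustration:

```python
import hashlib

# Placeholder payload standing in for an archived image file's bytes
image_bytes = b"spectral image bytes"
digest = hashlib.md5(image_bytes).hexdigest()

# The same fixity value recorded redundantly in both schemas
# (real METS and PREMIS records are XML; dictionaries keep this short).
mets_file_entry = {"CHECKSUM": digest, "CHECKSUMTYPE": "MD5"}
premis_fixity = {"messageDigest": digest, "messageDigestAlgorithm": "MD5"}

print(mets_file_entry["CHECKSUM"] == premis_fixity["messageDigest"])  # True
```

The redundancy is deliberate: either schema can then be consumed on its own and still carry the fixity information needed to verify the file.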

The integration of a PREMIS digital repository of scientific information into a library setting creates special problems for linking systems and making decisions about the division of responsibilities between those systems. Scientific images are not mere surrogates of documents in a collection. They provide specialized information for a user group within the library community that may want to locate collection items either through existing library catalogs or based on scientific criteria that are discoverable only through systems built on scientific metadata, the descriptive metadata mentioned above. In each scenario, users must be able to navigate from one system to another. This is a different problem from PREMIS systems that store digital representations (e.g., digital publications) or surrogates (e.g., images of special collections items) of documents from a library’s collection. Scientific images are not “another view” of an object but data in their own right. As discussed above, catalog tools and standards are not designed to store scientific data and should not be.

The different foci of the library catalog and the proposed repository raise questions about the division of responsibilities. It must be decided what overlap there should be between the library catalog and the descriptive metadata stored with scientific images of library documents. There is value and convenience in having some of the same content information found in a library catalog stored in a repository’s metadata. A common use for this would be to export Dublin Core metadata that included, among several others, dc:subject values identifying an image as a hyperspectral image and as an image of a woodblock map. For this purpose, it is convenient to have content subject keywords stored in the object’s descriptive metadata. However, it can be argued that object content information, like authorship, provenance, and date of origin already cataloged in another library system, does not belong in descriptive metadata stored in a PREMIS system. This is seen most clearly in the case of objects that are not yet well understood, as with a map of disputed date and provenance. A library will maintain the data in its catalog, updating an object’s record as data change. Without a clear mechanism or process for updating repository metadata, overlapping data in the repository fall out of sync with the catalog and may be incomplete or completely incorrect.

The repository should have clearly defined boundaries between all systems that provide access to the data it stores. A repository that may hold multiple types of scientific data, with variant metadata formats, cannot integrate techniques for discovering data based on the content of any one metadata scheme. The repository has the responsibility of maintaining the integrity of the data it contains but can expose metadata that allow data that are heterogeneous (within certain limits) to be searched and browsed by multiple external tools (see fig. 14, in the online version of this article).

Systems developed by the Library of Congress PRTD to preserve and maintain hyperspectral imaging and other digital scientific data are being developed with these challenges in mind. As important as the preservation of the bits of the data collected in digital form by the PRTD is the preservation of the metadata that enables user understanding of the content of those objects. The systems and guidelines being developed will provide connections between existing and new Library tools and will define the boundaries and relationships between the data stored in these separate systems. The success of the convergence of new technology with the preservation of the cultural treasures of the Library depends on the successful management of the data collected: the integration of effective data management and work processes, the development of data discovery systems based on user needs, and effective training of personnel to work with the system (see fig. 15, in the online version of this article).

Data Infrastructure

Effective use of images, data, metadata, and associated studies requires not only effective data management but also an infrastructure to handle the large files, access the metadata, and allow processing and manipulation of the images. This requires capabilities that may or may not be available from the information services of a cultural institution, including (1) data storage: sufficient storage devices, either local or enterprise-wide, to store multiple large image and data files for timely access; (2) dissemination and access: systems that allow multiple users to access and disseminate data and images for processing, studies, and data integration; and (3) processing: capabilities with sufficient speed and memory to digitally process large image and data files.

Depending on resources available for the funding, operation, and maintenance of the information technology infrastructure, a key question that must be addressed is the number of individuals or groups who require access to the images and data. Once the size of the user population is defined, the institution can then determine the appropriate capability to provide, with consideration of cost and efficiency. The options include (1) an enterprise capability, (2) a limited research and study capability, or (3) individual capabilities. An enterprise approach is required to provide large servers, high-bandwidth connectivity, and powerful workstations across a large organization. A more limited storage and dissemination capability can be utilized in the establishment of a smaller but robust “research network” to allow the movement and processing of images and data among a smaller cadre with more powerful workstations involved in the research and study. An even more limited capability can be provided for just a few individuals in a smaller organization or group, with powerful workstations for processing but dissemination and access limited to a “sneaker-net,” provided by transferring hard drives loaded with data and images. To maintain the integrity of the imaging data and information, a “gold standard” data set is maintained offline with the original TIFF images and metadata. These data sets are used to create the duplicates that are accessed by researchers and to which research analyses and processed data, as well as other data, are added.
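Verifying a researcher’s duplicate against the offline gold master is, at its core, a fixity comparison. A minimal sketch, with throwaway files standing in for the TIFF masters (this is not the Library’s actual tooling):

```python
import hashlib
import os
import tempfile

def file_digest(path):
    """SHA-256 of a file, read in chunks so large image files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_copy(gold_path, working_path):
    """A working copy is trusted only if it matches the offline gold master."""
    return file_digest(gold_path) == file_digest(working_path)

# Demonstration with temporary files standing in for gold/working TIFFs
with tempfile.TemporaryDirectory() as d:
    gold = os.path.join(d, "gold.tif")
    copy = os.path.join(d, "copy.tif")
    for p in (gold, copy):
        with open(p, "wb") as f:
            f.write(b"spectral image bytes")
    same = verify_copy(gold, copy)       # True: faithful duplicate
    with open(copy, "ab") as f:
        f.write(b"corruption")
    different = verify_copy(gold, copy)  # False: the copy has drifted
print(same, different)
```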

The Library of Congress is continuing to take an evolutionary approach to hyperspectral imaging and data management as funding and resources allow, starting with individual hard drives and workstations, with plans in place for a research Intranet across the core research team in PRTD. Ultimately, this could lead to a broader enterprise approach for data storage, processing, and access across multiple users within and outside the Library of Congress. A critical component of this development is the integration of like-minded cultural heritage institutions into the effort, and collaborators from international and national libraries, archives, and museums have all indicated interest in pursuing this convergence.

Work Processes and Training

A key step in implementing an advanced, integrated digital imaging capability is defining the work processes. As part of developing and integrating the technologies associated with a digital imaging system, the key work processes and work flow required to successfully provide images for research and study are being defined and validated. This includes discussion of new information and imagery technology for the collection and storage of, and access to, increasing amounts of information.

Imaging technologies and capabilities have a wide-reaching impact on library personnel, researchers, users, and the existing work processes as these capabilities are integrated into existing systems. This includes the development of required work flow processes, staff training, and information management structures. There is often resistance to modifying existing methods, along with unrealistic expectations for new methodologies. Implementing new technologies can involve steep learning curves for staff and management, as well as the very real challenge of coordinating and managing large volumes of data that require new methods of interpretation. To support this developing capability, databases of reference materials are needed to understand the new information being generated. Effective data management for advanced imaging must also address personnel requirements by developing professionals with highly developed skills who can navigate the traditional boundaries between disciplines and institutions to achieve the integration required to meet user needs. The potential impact on the existing personnel and work processes can be significant. This needs to be balanced with the explosion of new information and the potential capabilities of the cultural heritage institution to advance current knowledge and support the preservation of historic documents and manuscripts. Understanding the impact of new capabilities on cultural heritage organizations includes successful implementation, assessment, and understanding of the value of the technology and keeping up-to-date with requirements for the transfer of the technology and the needs of the institution.

Work processes for parchment, paper, and other object image studies in the Library include the following work tasks to collect and fully exploit scientific and scholarly analysis:

• Conservation and management—preparing the objects for imaging and managing the overall work flow and data management.

• Data and image collection—gathering various data elements for an object and digital images from various optical and electronic sources, including metadata about the images, image collection processes, and the object under study.

◦ Development of a standardized methodology for imaging setup, image acquisition processes, notation of changes in resolution, lighting exposure, spatial location, and file and folder nomenclature.

• Image and data processing—digitally manipulating the various spectral images to reveal information not available from any one image, as well as formatting and linking additional information about the data to the collected data set, according to agreed vocabulary, data, and content standards.

◦ Development of a standardized format for collection of processing metadata and linking with associated scientific analyses.

• Personnel training—training in data acquisition and processing requires an innate knowledge and understanding of the document or object under study to ensure that processed images accurately reflect features in the object itself. This requires skilled interpretation and processing to generate accurate and useful information.

• Data management—assessing the data elements and associated metadata to ensure they meet the system standards and entering or linking the data according to agreed digital storage and retrieval standards.

• Integrated research—study of the images and data, including linking data elements and establishing relationships between data elements by knowledgeable users to derive additional information and knowledge.

◦ Development of a spectral database allowing matching of spectral responses of inks, colorants, other media, and treatments on specific substrates for nondestructive identification and analysis.

◦ Standardized terminology and vocabulary for imaging processes.

• Information storage and access—storing and retrieving data and information electronically by remote end users on the Library Intranet or on the Web by means of electronic public access, while maintaining the integrity of the data.
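The spectral-database matching mentioned in the list above could, in its simplest form, compare a measured spectrum against stored reference spectra. The reference values here are invented for illustration; a real system would use calibrated measurements across the capture wavelengths:

```python
import math

# Toy spectral "database": mean reflectance of known media at the capture
# wavelengths (values are illustrative only, not measured data).
REFERENCE = {
    "iron gall ink": [0.30, 0.25, 0.20, 0.60, 0.80],
    "carbon ink":    [0.10, 0.10, 0.10, 0.12, 0.15],
    "graphite":      [0.35, 0.34, 0.33, 0.35, 0.36],
}

def cosine(a, b):
    """Cosine similarity between two spectra (1.0 = identical shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def best_match(spectrum):
    """Nondestructive identification sketch: nearest reference spectrum."""
    return max(REFERENCE, key=lambda name: cosine(spectrum, REFERENCE[name]))

print(best_match([0.28, 0.26, 0.22, 0.55, 0.75]))  # iron gall ink
```

Cosine similarity compares spectral shape rather than absolute brightness, which makes it tolerant of exposure differences between captures; other distance measures would serve equally well in a sketch like this.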

The integration of preservation and scientific techniques with advanced imaging and processing techniques in the Library required effective systems and information management to ensure that the large amounts of digital data could be readily collected, processed, stored, and accessed.


Digital Imaging and the Convergence of Cultural Heritage Organizations

Digital imaging has a major impact on the user experience and data access across all types of cultural institutions. With the growing convergence of cultural heritage organizations, the interest in both the artifact as an object and digital images of the artifact brings to the forefront the roles of institutions with a traditional focus as repositories of objects versus the current trend of being seen as repositories of knowledge. While many researchers lean toward the latter perspective, the focus on the original object is apparent not only from the continued presence and requests of researchers to see original documents but also from the large crowds at exhibits of significant cultural artifacts where the original item is on limited display. The actual object itself plays a critical role for a range of users—researchers and public alike [28]—as was evident with the six-month display of the handwritten copy of the rough draft of the Declaration of Independence in the Library of Congress exhibit Creating the United States. Visitor numbers reached record levels in the final few weeks, when it was advertised that the document was about to be taken off exhibit. Visitors enjoyed the interactive experience of being able to navigate their way around the virtual version of the document on display, but it was the object itself that provided the impetus for the emotional connection of their visit.

This reality is documented in “InterConnections, the IMLS National Study on the Use of Libraries, Museums and the Internet” [29], which notes that over 95 percent of visitors to libraries and museums continue to visit in person, even when they frequently visit the institutional Web site, indicating that online visits are not replacing in-person visits to the institution. Conversely, visitors are using the online connection to gain knowledge and view images prior to their visit so that they can gain maximum benefit from it. The report notes that people trust information from libraries and museums more than any other sources of information, including government, commercial, and private individual Web sites. This imposes a high level of responsibility on cultural heritage institutions to accurately document whether an object is an original, a reproduction, or a facsimile. The Library of Congress Web site, http://www.myloc.gov, actively encourages repeat visits by providing planning information prior to visiting the Library and posting interactive images and experiences through downloads linked to the scanned personal “passport” provided in the exhibits. This positive complementary relationship between cultural heritage institutions and their digital images on Web sites is important for establishing a connection between visitors both online and in person [30].

Exhibitions containing both originals and facsimiles carefully note those items that are reproductions; imaging cannot replace the original document. However, the enhanced scientific and scholarly information contained in a hyperspectral data cube literally adds another dimension to the artifact itself. The ubiquity of interactive displays as part of exhibits and information references in cultural heritage institutions is evidence of the growing interest among a range of users in doing more than simply viewing the item. The use of persuasive technology and imaging has promoted interest through the media, movies, and TV series, where entire chapters of history or crimes can be discovered and analyzed in a weekly segment [31]. This enhanced awareness of the capacity of technology has led to greater demand from both researchers and the public, and the convergence is evident as libraries, archives, and museums integrate increasingly sophisticated software and hardware to provide a seamless user interface with their artifacts in person and online.

Providing access is a common point of reference for all cultural institutions, with the growing use of the Internet offering the first access point for researchers, the public, and other users. The convergence of objects in libraries, archives, and museums with digital images and data shared by these institutions and private service providers requires all organizations providing digital images, information, and access to utilize the available cyberinfrastructure and to manage data effectively to allow greater access. Deanna Marcum, associate librarian for library services at the Library of Congress, noted that "making much more content much more accessible is a great and worthy goal." Advances in technology allow the customization of information resources for users, with the goal of future digital libraries being a comprehensive collection of resources for scholarship, teaching, and learning with easy access for a range of users. She notes that access to this collection is "managed and maintained by professionals who see their role as stewards of the intellectual and cultural heritages of the world" [32]. The increased usage of and demand for digital resources in cultural heritage institutions is demonstrated by Web site statistics, with users utilizing available search engines to access data that may be linked or accessed from a range of sources, not necessarily the home institution. Linkages between Web sites encourage users to visit associated sites to meet their information requirements.

A key challenge for effective stewardship is determining how to manage digital resources, to relate them to the objects they document, and to develop tools that connect, yet maintain as discrete, the data about the digital and the physical objects. The cataloging of digital data, whether archival surrogate images, scientific images, or other data, is a task in its own right that must be coordinated with existing cataloging systems. However, the metadata problem manifested in the difference between the MARC or TMS catalog of physical holdings and the METS or Dublin Core record attached to a digital object must be addressed by institutions in the course of determining how they will use digital collections in conjunction with their traditional ones. While the successful development and preservation of digital repositories relies on a management program based on planning for long-term use and preservation of the data, the requirements of use and long-term storage must shape the methods and processes of digital collection and capture.
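The gap between a MARC catalog record and a Dublin Core record attached to a digital object is typically bridged with a field-by-field crosswalk. The sketch below illustrates the idea in Python; the flat record dictionary and function name are hypothetical simplifications (real MARC records carry indicators and subfields), though the field-to-element mappings shown follow the Library of Congress MARC-to-Dublin Core crosswalk.

```python
# Illustrative MARC-to-Dublin Core crosswalk: the tag-to-element mappings
# (245 -> title, 100 -> creator, 650 -> subject) follow the Library of
# Congress crosswalk guidance; the flat record shape is a simplification.

MARC_TO_DC = {
    "245": "title",    # Title Statement
    "100": "creator",  # Main Entry - Personal Name
    "260": "date",     # Publication information (subfield $c holds the date)
    "650": "subject",  # Topical subject heading
}

def marc_to_dublin_core(marc_record):
    """Map a flat {tag: value} MARC record to Dublin Core elements."""
    dc = {}
    for tag, value in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element:
            # Dublin Core elements are repeatable, so collect into lists.
            dc.setdefault(element, []).append(value)
    return dc

record = {"245": "The Archimedes Codex", "100": "Netz, Reviel", "650": "Manuscripts"}
print(marc_to_dublin_core(record))
```

A production crosswalk would also have to decide how to handle fields with no Dublin Core equivalent, which is where the institutional policy questions discussed above arise.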

The fundamental change that has driven this convergence of institutions with technology relates to the previously limited access to information. In the past, power was derived from the control of information and data. In the changing global and collaborative environment, power is now (and should be) derived from how well information is used and how much value is added to data and information that are made accessible to many. While the changing role of libraries is to provide the best access possible, they must also create structures that enable users to readily access, relate, and utilize this information. The National Science Foundation coined the term "data scientist," and professionals with these integrated skills will be critical in establishing effective access and networks.

Conclusions

Convergences between information technology and data and information management in advanced imaging systems illustrate common challenges and opportunities across cultural heritage institutions. Preservation professionals, researchers, and scholars in libraries, archives, and museums share common needs for access to the original object, research images, integrated data, and information structures and systems. These are best supported by the integration of imaging and information management systems and advanced image collection and processing capabilities. An effective imaging program requires more than installing imaging equipment, imaging objects, and storing digital products for users. It also requires that well-developed metadata and data management, storage, and access be effectively integrated across institutions following broadly accepted international consensus standards and protocols. The development of systems that ensure flexibility, accessibility, and integrity of scientific images and data provides a strong preservation tool for all cultural heritage institutions.

Building on previous efforts in the field at other institutions, the Library of Congress is collaborating to develop a standardized model for handling large, complex spectral image files and associated data. The Library is integrating advanced spectral imaging capabilities to collect images and a range of scientific information about its manuscripts, documents, and other objects; to process the images; and to store and disseminate the digital products for end users. Over time, standardization of data and metadata structures, work processes, and personnel capabilities will provide improved system efficiency, limited budget impacts, and broader availability of and increased confidence in the imaging study products. Developing standardized methodologies will allow the technology to be adopted by a wider range of users. Establishment of a spectral database of reference colorants, inks, pigments, and substrates will help extend the application of this capability to objects beyond the specific institutional holdings. Linking specific spectral responses with identified materials allows a truly nondestructive technique to advance the knowledge base and to preserve an extensive collection of increasingly fragile documents and artifacts.

A continued focus on collaboration among people, data, and work processes, and on the tools and technologies for efficient access, is critical to promoting the free flow of information across cultural institutions and service providers. Improved access allows greater management of information collections, achieving the goal of libraries to expand access to collections globally while also addressing the quality of the information and interaction. Data collections and the algorithms developed to process advanced images need to be considered part of the "product"—rather than just a tool to create a final digital object. This will allow a greater focus on and understanding of the informatics required to structure and organize large volumes of data. Enabling users to effectively use and integrate the results of new technology will generate new capabilities with greater utilization of the cyberinfrastructure, information and personnel capabilities, and technical tools. The maintenance and integrity of image and data collections reinforce the enhanced capability these integrated information systems can provide over time. Effectively managing this integration will maximize the capacity of systems to meet the needs of a wider, changing, and growing generation of information users. Collaboration of information, scientific, and data professionals, and convergence across their disciplines, will assist in the integration of the technical, organizational, and social processes required to implement these systems.

The data deluge of the past decade has highlighted the need for data management, with inherent risks arising from a lack of focus on the requirement for data access for generations to come. The first and greatest impact of this explosion of digital data is on the institution. The risk is that terabytes of data will be collected that exist for years without being accessible to users in a structured way—available only to people who were there when the data were collected and are able to comb through directories on a hard drive to retrieve files manually as needed. Clearly, in this situation, the ability to use and access data deteriorates over time, as personnel move on to new projects and interests and human recollection of older data begins to fade. A first step is to standardize data collection practices to integrate metadata and to standardize interim storage practices and archive structures, but this is just a first step. It assumes the existence or development of systems to store and manage the data and anticipates something more than external hard drives or folders on servers. It anticipates trained personnel with special skills, frameworks, and dedicated computer hardware to store, manage, preserve, and locate the data. Because most data sets have special needs, standards must be selected and implementation plans created. Even off-the-shelf software packages for data management must be customized by personnel with considerable technical expertise. Typically, multiple processes occur at the same time. The Library of Congress PRTD now has proven standards for the collection of hyperspectral images and is working on processes for collecting and maintaining the data. At the same time, research and planning are taking place for a repository for long-term storage of the data and interfaces for access. Data being collected now must be stored for entry into a repository that has not yet been built, yet must remain usable by PRTD personnel before the repository is in place.
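As a concrete illustration of that first step—standardizing capture so that metadata and fixity information travel with each image file—the sketch below writes a JSON "sidecar" record next to an image at collection time. The field names and function are hypothetical, not the PRTD's actual scheme; a production system would follow an agreed standard such as PREMIS for recording fixity.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def write_sidecar(image_path, wavelength_nm, operator):
    """Write a JSON sidecar recording capture metadata and a fixity checksum.

    The field names here are illustrative, not a published standard; a real
    implementation would follow an agreed schema (e.g., PREMIS-style fixity).
    """
    image_path = Path(image_path)
    digest = hashlib.sha256(image_path.read_bytes()).hexdigest()
    sidecar = {
        "file": image_path.name,
        "sha256": digest,                # fixity value for long-term storage
        "wavelength_nm": wavelength_nm,  # spectral band captured
        "operator": operator,
        "captured": datetime.now(timezone.utc).isoformat(),
    }
    out = image_path.with_suffix(".json")
    out.write_text(json.dumps(sidecar, indent=2))
    return out
```

Because the checksum and capture context are stored beside the file itself, the data remain interpretable even before the eventual repository exists, rather than depending on the memory of whoever collected them.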

While government and industrial programs have for decades converged on the integration of complex technical capabilities and processes, integrating new technologies into the converging library, museum, and preservation environments requires applying a new set of requirements, not only to the technologies but also to the data structures and work processes. As always, the risk is that the institutional challenges will not be met because of a lack of time, will, or personnel. Once the commitment has been made to undertake such a project, the typical risks associated with any project apply. Costs, schedules, and expectations must be managed. In other words, technical projects require project management.

The integration and standardization of processes, terminology, ontologies, and metadata to support integrated imaging and data management systems and the cyberinfrastructure of accessible repositories is critical to advancing this undertaking, with the Library of Congress PRTD working to integrate broad imaging and metadata standards, work processes, and technical systems with those in other libraries and cultural heritage institutions. For the Library, the development of systems for storing and sharing data will be conducted in collaboration with other institutions to construct standards for the exchange of preservation data. As part of meeting the challenge of coping with a growing collection of digital data, institutions are working to make the systems they build fit the needs of users. For this Library project, the users include a range of institutions around the world, with volumes of data and commensurate access and integration issues, all converging toward a common goal of preserving objects, preservation data, and knowledge.

With collaborative input and leadership, the combination of advanced preservation spectral imaging and integrated data and information management systems can make major contributions to artifact studies and to the information needs of a range of cultural heritage organizations. The changing roles of staff and other professionals in libraries, archives, and museums demand significant attention to the information needs of external users, as well as to internal staff requirements and the needs of collections. This can be attained through management and maintenance of quality data, while allowing access to an increasing volume of integrated images and data within a structured metadata scheme. By maintaining focus on the content of the original artifact or document, the new digital objects created from advanced spectral images and data for scientific, scholarly, and preservation knowledge will allow access, interpretation, and preservation of fragile items of significant cultural heritage. This will allow libraries, archives, and museums to converge with information technology and data management in supporting their common role as effective stewards of these artifacts for generations to come.

REFERENCES

1. Fischer, C., and Kakoulli, I. "Multispectral and Hyperspectral Imaging Technologies in Conservation: Current Research and Potential Applications." Reviews in Conservation 7 (2006): 3–16.

2. Plaza, A.; Benediktsson, J. A.; Boardman, J.; Brazile, J.; Bruzzone, L.; Camps-Valls, G.; Chanussot, J.; Fauvel, M.; Gamba, P.; Gualtieri, A.; Marconcini, M.; Tilton, J. C.; Triani, G. "Recent Advances in Techniques for Hyperspectral Image Processing." Elsevier Science (July 2007): S110–S122. http://disi.unitn.it/rslab/papers/R67-RSE-Plaza.pdf.

3. Tong, Q.; Zhu, Y.; and Zhu, Z. Multispectral and Hyperspectral Image Acquisition and Processing: Proceedings of SPIE—the International Society for Optical Engineering. Wuhan, China: International Society for Optical Engineering, 2001.

4. Down, Jane L.; Young, Gregory S.; Williams, R. Scott; MacDonald, Maureen A. "Analysis of the Archimedes Palimpsest." In Works of Art on Paper, Books, Documents and Photographs: The International Institute for Conservation, Contributions to the Baltimore Congress, 2–6 September 2002, edited by Vincent Daniels, Alan Donnithorne, and Perry Smith, pp. 52–58. London: International Institute for Conservation of Historic and Artistic Works, 2002.

5. Netz, Reviel, and Noel, William. The Archimedes Codex. London: Weidenfeld & Nicolson, 2007.

6. Knox, K. T.; Easton, R. L., Jr.; and Christens-Barry, W. A. "Image Restoration of Damaged or Erased Manuscripts." Paper presented at the "European Signal Processing Conference," Lausanne, August 2008. http://www.eurasip.org/Proceedings/Eusipco/Eusipco2008/papers/1569105284.pdf.

7. Library of Congress. "Exploring the Early Americas, First Step in the Library of Congress Experience." Library of Congress Information Bulletin, January–February 2008. http://www.loc.gov/loc/lcib/08012/exhibit.html.

8. Ware, G. A.; Chabries, D. M.; Christiansen, R. W.; and Martin, C. E. "Multispectral Document Enhancement: Ancient Carbonized Scrolls." Geoscience and Remote Sensing Symposium, Proc. IEEE 6 (2000): 2486–88.

9. Hosse, K., and Schilcher, M. "Temporal GIS for Analysis and Visualization of Cultural Heritage." Paper presented at "New Perspectives to Save Cultural Heritage," CIPA 2003 Nineteenth International Symposium, Antalya, Turkey, September 30–October 4, 2003. http://www.cipa.icomos.org/fileadmin/papers/antalya/8.pdf.

10. Akca, D., and Gruen, A. "Re-sequencing a Historical Palm Leaf Manuscript with Boundary-Based Shape Descriptors." Paper presented at "New Perspectives to Save Cultural Heritage," CIPA 2003 Nineteenth International Symposium, Antalya, Turkey, September 30–October 4, 2003. http://www.cipa.icomos.org/fileadmin/papers/antalya/12.pdf.

11. Knox, K. T.; Easton, R. L., Jr.; and Christens-Barry, W. A. "Multispectral Imaging of the Archimedes Palimpsest." 2003 AMOS Conference, Air Force Maui Optical and Supercomputing Site, Maui, HI, September 2003.

12. Reilly, B. F., Jr. "Developing Print Repositories: Models for Shared Preservation and Access." Center for Research Libraries, June 2003. http://www.clir.org/pubs/reports/pub117/pub117.pdf.

13. Russ, J. C. The Image Processing Handbook. 4th ed. Boca Raton, FL: CRC Press, 2002.

14. Archimedes Palimpsest Program. "Archimedes Palimpsest Metadata Standard 1.0." Revision 5. Walters Art Museum, Baltimore, June 7, 2006.

15. Federal Geographic Data Committee (FGDC-STD-001-1998). "Content Standard for Digital Geospatial Metadata." Revised June 1998. http://www.fgdc.gov/standards/status/csdgm_rs_ex.html.

16. Dublin Core Metadata Initiative (DCMI). Dublin Core Metadata Element Set, Version 1.1: Reference Description, 2000–2008, 2008. http://www.dublincore.org/documents/dces/.

17. Dublin Core Metadata Initiative (DCMI). DCMI Metadata Terms, 2000–2008, 2008. http://www.dublincore.org/documents/dcmi-terms/.

18. Federal Geographic Data Committee (FGDC-STD-001-1998). "Content Standard for Digital Geospatial Metadata." Revised June 1998. http://www.fgdc.gov/standards/projects/FGDC-standards-projects/metadata/base-metadata/v2_0698.pdf.

19. Dublin Core Metadata Initiative (DCMI). Using Dublin Core, November 7, 2005. http://www.dublincore.org/documents/usageguide/.

20. Library of Congress Network Development and MARC Standards Office. MARC Standards, March 13, 2008. http://www.loc.gov/marc/.

21. Library of Congress Network Development and MARC Standards Office. NISO Metadata for Images in XML (NISO MIX), May 13, 2008. http://www.loc.gov/standards/mix/.

22. Archimedes Palimpsest Program. "Archimedes Palimpsest Metadata Standard XRF Extensions." Draft. Walters Art Museum, Baltimore, June 7, 2006. http://www.archimedespalimpsest.org/pdf/Metadata%20Standard%20XRF%20Extensions%20Final%20Draft.txt.

23. PREMIS Editorial Committee, Library of Congress. PREMIS Data Dictionary for Preservation Metadata, Version 2.0. Washington, DC, 2008. http://www.loc.gov/standards/premis.

24. Consultative Committee for Space Data Systems. Reference Model for an Open Archival Information System (OAIS), 2002. Washington, DC, December 1, 2008. http://www.public.ccsds.org/publications/archive/650x0b1.pdf.

25. Guenther, R. S. "Battle of the Buzzwords: Flexibility vs. Interoperability When Implementing PREMIS in METS." D-Lib Magazine 14, nos. 7/8 (July/August 2008). http://www.dlib.org/dlib/july08/guenther/07guenther.html.

26. Library of Congress Network Development and MARC Standards Office. Metadata Encoding and Transmission Standard. http://www.loc.gov/standards/mets/.

27. Dappert, A., and Enders, M. "Using METS, PREMIS and MODS for Archiving eJournals." D-Lib Magazine 14, nos. 9/10 (September/October 2008). http://www.dlib.org/dlib/september08/dappert/09dappert.html.

28. Bee, R. "The Importance of Preserving Paper-Based Artifacts in a Digital Age." Library Quarterly 78, no. 2 (2008): 179–94.

29. Griffiths, Jose-Marie, and King, Donald W. "InterConnections: The IMLS National Study on the Use of Libraries, Museums and the Internet; Conclusions." Report, February 2008. http://www.interconnectionsreport.org/reports/ConclusionsFullRptB.pdf.

30. Marty, P. F. "Museum Websites and Museum Visitors: Before and After the Museum Visit." Museum Management and Curatorship 22, no. 4 (December 2007): 337–60.

31. Fogg, B. J. Persuasive Technology: Using Computers to Change What We Think and Do. San Francisco: Elsevier/Morgan Kaufmann, 2003.

32. Marcum, D. "Requirements for the Future Digital Library: An Address to the Elsevier Digital Libraries Symposium." Philadelphia, January 25, 2003. http://www.clir.org/pubs/resources/dbm_elsevier2003.html.
