![Page 1: Home-Grown Digital Library System Built Upon Open Source XML Technologies and Metadata Standards David Lacy Villanova University david.lacy@villanova.edu](https://reader035.vdocuments.site/reader035/viewer/2022062404/5514bb8b550346f06e8b6705/html5/thumbnails/1.jpg)
Home-Grown Digital Library System
Built Upon Open Source XML Technologies and Metadata Standards
David Lacy
Villanova University
Why Did We Do This?
Seriously, Why Did We Do This?
System Components
• A METS Metadata Editor
• A series of batch-process service image generation tools
• An XML Database repository
• A file server
• An OAI server
• A series of VuFind Record Drivers
Architecture Components
• METS XML
• eXist-db
• Orbeon Forms (XForms Processor)
• Tesseract (OCR)
• ImageMagick
METS (Metadata Encoding and Transmission Standard)
• <metsHdr>
• <dmdSec>
• <amdSec>
• <fileSec>
• <structMap>
• <structLink>
• <behaviorSec>
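
As a rough illustration of how these seven sections sit inside one METS document, the Python sketch below builds an empty skeleton with the standard library. The function name and the identifier are hypothetical, and the result is deliberately not schema-valid (real sections need IDs and content); it only shows the top-level layout.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

# Top-level METS sections, in the order they appear on the slide
# (which is also the order the METS schema expects).
SECTIONS = ["metsHdr", "dmdSec", "amdSec", "fileSec",
            "structMap", "structLink", "behaviorSec"]


def empty_mets(obj_id):
    """Return a <mets:mets> root holding one empty child per section."""
    root = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": obj_id})
    for name in SECTIONS:
        ET.SubElement(root, f"{{{METS_NS}}}{name}")
    return root


skeleton = empty_mets("example-item-1")  # hypothetical identifier
print(ET.tostring(skeleton, encoding="unicode"))
```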
Orbeon Forms (XML & XForms Processor)
• Browser-independent, plugin-free XForms processor
• AJAX-driven interface controls
• XML Database (eXist) integration
• XML pipeline (XPL) engine for processing XML
XPL Pipelines
• Vocabulary for describing a processing model for XML
  – File System Controls
  – XQuery Submissions
  – Session Management
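<xforms:submission>
<!-- In the model: posts the rename instance to the XPL pipeline -->
<xforms:submission id="batch-attach-submission"
    method="post" replace="none"
    ref="instance('rename-file-instance')"
    action="/rename-file.xpl">
  <!-- error handling stuff -->
</xforms:submission>
<!-- In the view: activating the trigger sends the submission -->
<xforms:trigger>
  <xforms:action ev:event="DOMActivate">
    <xforms:send submission="batch-attach-submission"/>
  </xforms:action>
</xforms:trigger>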
XPL File Processor
<p:processor name="oxf:xslt">
  <p:input name="data" href="#instance"/>
  <p:input name="config">
    <xsl:stylesheet version="2.0">
      <rename>
        ….
        Filename
        Directory
        New Filename
        New Directory
      </rename>
    </xsl:stylesheet>
  </p:input>
  <p:output name="data" id="rename-info"/>
</p:processor>
<p:processor name="oxf:file">
  <p:input name="config" href="#rename-info"/>
</p:processor>
Collection Development
• Special Collections Material
• Strategic Partnerships
• Catholica
• United States Irish History
• Regional History
• Faculty and Alumni Scholarly Material
• > 9000 items
(Rapid) Work-flow
• Select item
• Scan TIFFs
• Process service images
• Instantiate Digital Item
• Batch-Attach TIFFs and Service Images
• Add Metadata
• Index into VuFind
Service Images
• Process Scanned Images (Cron)
• OCR (Tesseract)
• Produce Service Images (ImageMagick)
  – Large
  – Medium
  – Thumbnail
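
A cron-driven step like this could be sketched in Python as below. This is only an illustration under stated assumptions: the pixel sizes, file naming, and directory names are hypothetical, and the function only builds the command lines rather than running them. `tesseract <image> <output base>` and ImageMagick's `convert ... -resize WxH ...` are the standard command-line forms (newer ImageMagick releases also provide the `magick` command).

```python
from pathlib import Path

# Hypothetical maximum pixel dimensions for each service image size.
SIZES = {"large": 1600, "medium": 800, "thumbnail": 150}


def derivative_commands(tiff, out_dir):
    """Build (but do not run) the OCR and resize commands for one scan."""
    tiff, out_dir = Path(tiff), Path(out_dir)
    stem = tiff.stem
    # tesseract <image> <output base> writes the text to <output base>.txt
    commands = [["tesseract", str(tiff), str(out_dir / stem)]]
    for label, pixels in SIZES.items():
        target = out_dir / f"{stem}-{label}.jpg"
        # -resize WxH fits the image inside the box, keeping aspect ratio
        commands.append(["convert", str(tiff), "-resize",
                         f"{pixels}x{pixels}", str(target)])
    return commands


for command in derivative_commands("scans/page-0001.tif", "processed"):
    print(" ".join(command))
```

Each returned list can be passed to `subprocess.run(command, check=True)` once the tools are installed.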
Collection View
• Add Collections
• Add Resources / Items
• Edit Metadata
• Batch-Attach Files
• View Raw METS XML
• Relocate Item
• Delete Item
Resources and Collections View
Batch Attach
• Read Processed Images (via oxf:directory-scanner)
• Add nodes to <fileSec> (via xforms:insert)
• Move Files to File Server (via oxf:file pipeline)
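
The middle step, adding nodes to <fileSec>, is done in the system with xforms:insert. The Python sketch below shows the same idea outside Orbeon, purely for illustration: the function name, the ID scheme, the USE value, and the example URLs are assumptions, not the system's actual conventions.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def attach_files(file_sec, use, urls):
    """Append one <fileGrp USE=...> holding a <file>/<FLocat> per URL."""
    group = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"USE": use})
    for number, url in enumerate(urls, start=1):
        # Hypothetical ID scheme: the USE value plus a zero-padded counter.
        file_el = ET.SubElement(group, f"{{{METS_NS}}}file",
                                {"ID": f"{use}-{number:04d}"})
        ET.SubElement(file_el, f"{{{METS_NS}}}FLocat",
                      {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url})
    return group


file_sec = ET.Element(f"{{{METS_NS}}}fileSec")
attach_files(file_sec, "THUMBNAIL", [
    "http://files.example.org/item-1/0001-thumbnail.jpg",  # placeholder URLs
    "http://files.example.org/item-1/0002-thumbnail.jpg",
])
print(ET.tostring(file_sec, encoding="unicode"))
```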
Batch Attach
Metadata - <metsHdr>
• Completion Status
• Agent Information
  – Editors
  – IP Owners
  – Disseminators
  – Etc.
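
For illustration, a header carrying this information might be assembled as in the Python sketch below. It assumes the completion status maps to the RECORDSTATUS attribute, which the slides do not confirm; EDITOR, IPOWNER, and DISSEMINATOR are role values defined by the METS schema, while the function name, status string, and agent names are placeholders.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def mets_header(status, agents):
    """Build a <metsHdr> with a record status and one <agent> per pair."""
    header = ET.Element(f"{{{METS_NS}}}metsHdr", {"RECORDSTATUS": status})
    for role, name in agents:
        agent = ET.SubElement(header, f"{{{METS_NS}}}agent", {"ROLE": role})
        ET.SubElement(agent, f"{{{METS_NS}}}name").text = name
    return header


header = mets_header("complete", [          # placeholder status and names
    ("EDITOR", "Example Editor"),
    ("IPOWNER", "Example University"),
    ("DISSEMINATOR", "Example Digital Library"),
])
print(ET.tostring(header, encoding="unicode"))
```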
Metadata - <dmdSec>
• Descriptive Metadata
• Dublin Core (DC)
• Looking to expand this area to other descriptive standards
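
One common way to carry Dublin Core inside METS is to wrap it in <mdWrap MDTYPE="DC"> within the <dmdSec>. The Python sketch below shows that pattern; whether the editor stores DC exactly this way is an assumption, and the function name, section ID, and field values are invented for the example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("dc", DC_NS)


def dublin_core_section(section_id, fields):
    """Wrap (element name, value) pairs as Dublin Core inside a <dmdSec>."""
    dmd = ET.Element(f"{{{METS_NS}}}dmdSec", {"ID": section_id})
    wrap = ET.SubElement(dmd, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
    for name, value in fields:
        ET.SubElement(xml_data, f"{{{DC_NS}}}{name}").text = value
    return dmd


dmd_sec = dublin_core_section("dmd-1", [     # placeholder ID and values
    ("title", "An Example Pamphlet"),
    ("creator", "Example Author"),
    ("date", "1890"),
])
print(ET.tostring(dmd_sec, encoding="unicode"))
```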
Metadata - <fileSec> and <structMap>
• Physical description
• Control Order
• Add / Delete files
• Edit Labels
Metadata - <fileSec> and <structMap>
• 2 levels of file association
  – Page Level
  – Document Level
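
The two levels can be pictured as file pointers on the document's root <div> versus file pointers on each page <div> of a physical <structMap>. The Python sketch below shows one plausible encoding; the slides do not spell out the system's actual layout, so the TYPE values, function name, and file IDs are assumptions. ORDER and LABEL correspond to the order and label controls mentioned earlier.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def physical_struct_map(document_file_ids, pages):
    """Build a physical <structMap>.

    document_file_ids: file IDs tied to the whole item (e.g. a PDF).
    pages: list of (label, [file IDs]) in reading order.
    """
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    doc = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "document"})
    # Document-level pointers go first: METS expects <fptr> before child <div>.
    for file_id in document_file_ids:
        ET.SubElement(doc, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    for order, (label, file_ids) in enumerate(pages, start=1):
        page = ET.SubElement(doc, f"{{{METS_NS}}}div",
                             {"TYPE": "page", "ORDER": str(order), "LABEL": label})
        for file_id in file_ids:
            ET.SubElement(page, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return struct_map


struct_map = physical_struct_map(
    ["PDF-0001"],                                   # hypothetical file IDs
    [("Front cover", ["MASTER-0001", "THUMBNAIL-0001"]),
     ("Page 1", ["MASTER-0002", "THUMBNAIL-0002"])],
)
print(ET.tostring(struct_map, encoding="unicode"))
```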
Problems
• XML file size / Large Volumes
  – Orbeon document serialization and XML processing occur during several events
    • Could disable this at cost of AJAX functionality
  – Solved
    • Paginate the table displaying page/line items
    • Retrieve only the relevant rows/items from the repository
    • Save document using XQuery Update
• Infinite METS Flexibility
  – Not solved
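
The pagination fix comes down to asking the repository for one window of rows at a time. As a minimal sketch (the function name and the page size are assumptions), the Python below computes the 1-based start position and length for a given page, in the same terms as XQuery's subsequence($items, $start, $length):

```python
def page_window(total_items, page, page_size):
    """Return (start, length) for a 1-based page of a 1-based sequence."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    start = (page - 1) * page_size + 1
    # The last page may be short; pages past the end are empty.
    length = max(0, min(page_size, total_items - start + 1))
    return start, length


# A 95-page volume shown 20 rows at a time (hypothetical numbers).
for page in (1, 5, 6):
    print(page, page_window(95, page, 20))
```

Only the rows in that window need to be loaded into the form, which keeps the document Orbeon serializes small regardless of the volume's size.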
Front End
• Expose Content via OAI-PMH
• Index into VuFind
• Search Metadata and OCR/Full Text
• Digital Object Viewer and Page Turner
  – Page items
  – Document items
OAI-PMH Server
• Written in XQuery
• METS or DC
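
From a harvester's side, choosing between the two formats is a matter of the metadataPrefix argument in the OAI-PMH request. The Python sketch below only builds request URLs; the base URL is a placeholder, `oai_dc` is the prefix the protocol requires every repository to support, and `mets` is assumed here as the prefix for the METS format (the server's actual prefix is not given in the slides).

```python
from urllib.parse import urlencode


def oai_request(base_url, verb, **arguments):
    """Build an OAI-PMH GET request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


BASE_URL = "http://digital.example.org/oai"  # placeholder endpoint

dc_url = oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
mets_url = oai_request(BASE_URL, "ListRecords", metadataPrefix="mets")
print(dc_url)
print(mets_url)
```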
Roadmap
• Incorporate Other Metadata
  – MODS, TEI, PREMIS
• Breakout METS Metadata Editor
• Alternative Repository Integration
• JPEG2000 Support
• Document Delivery (PDF wrappers, ePub)
• Logical <structMap>
Roadmap
• CONTENTdm Migration