NCSU Libraries
Tools Development and Demonstration: North Carolina Geospatial Data Archiving Project
Jim Tuttle
North Carolina State University Libraries
Process Overview
• Data transfer
• Threat and format analysis, validation
• Archive package organization
• Selective format migration
• Metadata normalization and supplementation
• Source metadata translation
• Statistics collection
• Extra-repository AIP management
Data Transfer
• Python md5sum comparison
• 'Transfer set' metadata capture in 'Seed file'
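The checksum comparison step could look roughly like the sketch below. It is not the project's actual script: the function names `md5_of_file` and `verify_transfer`, and the idea of a manifest that maps relative paths to expected MD5 digests, are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(manifest, root):
    """Compare received files against a manifest (hypothetical format).

    manifest: dict mapping relative path -> expected MD5 hex digest.
    Returns a list of (relative path, problem) tuples; empty means all good.
    """
    problems = []
    for rel_path, expected in manifest.items():
        target = Path(root) / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing"))
        elif md5_of_file(target) != expected.lower():
            problems.append((rel_path, "checksum mismatch"))
    return problems
```

Reading in chunks keeps memory use flat for large raster files, which is why the sketch avoids loading each file whole.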
Threat and format analysis, validation
Python wrappers for the following:
• Virus – ClamAV
• Compressed files (tar, zip, gzip, bzip)
• Geodatabases (extension and size)
• Executable files (magic numbers)
• JHOVE validation
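As an illustration of the magic-number checks, the sketch below classifies a file by its leading bytes. The signature table covers only a few well-known formats, and the names `SIGNATURES`, `classify_header`, and `classify_file` are hypothetical; the project's own wrappers may work differently.

```python
# Well-known leading byte signatures ("magic numbers").
SIGNATURES = [
    (b"\x7fELF", "executable (ELF)"),
    (b"MZ", "executable (DOS/Windows)"),
    (b"PK\x03\x04", "compressed (zip)"),
    (b"\x1f\x8b", "compressed (gzip)"),
    (b"BZh", "compressed (bzip2)"),
]


def classify_header(header):
    """Return a label for the first bytes of a file, or None if unrecognized."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    # POSIX tar archives carry the string "ustar" at byte offset 257.
    if header[257:262] == b"ustar":
        return "archive (tar)"
    return None


def classify_file(path):
    """Classify a file on disk from its first 512 bytes."""
    with open(path, "rb") as handle:
        return classify_header(handle.read(512))
```

Checking content rather than extension catches executables or archives that arrive with a misleading file name.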
Archive package organization
• ESRI ArcGIS toolbar for selected formats
Archive package organization
• Rule-based Python logic
  – filestem – extension relationships (multi-file format validation)
  – directory structure
• Manual intervention
  – metadata.doc
• NOID assignment
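The filestem and extension rule can be sketched for one multi-file format. A shapefile needs at least `.shp`, `.shx`, and `.dbf` files that share a stem, so the example below groups file names by stem and reports incomplete sets. The rule table `REQUIRED_COMPANIONS` and the function `missing_components` are assumptions for illustration, not the project's code.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Primary extension -> companion extensions that must share its filestem.
# Only the shapefile rule is shown; other formats would add entries here.
REQUIRED_COMPANIONS = {".shp": {".shx", ".dbf"}}


def missing_components(filenames):
    """Map each incomplete primary file to its sorted missing extensions."""
    extensions_by_stem = defaultdict(set)
    for name in filenames:
        path = PurePosixPath(name)
        stem = str(path.with_suffix(""))
        extensions_by_stem[stem].add(path.suffix.lower())

    report = {}
    for stem, extensions in extensions_by_stem.items():
        for primary, required in REQUIRED_COMPANIONS.items():
            if primary in extensions:
                missing = sorted(required - extensions)
                if missing:
                    report[stem + primary] = missing
    return report
```

Files flagged this way are natural candidates for the manual intervention step.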
Selective Format Migration
• Conversions using ArcGIS toolbar
  – e00 interchange to coverage to shapefile
  – geodatabase to raster, shapefile, etc.
• Original files retained
Metadata Normalization & Supplementation
• Agency-specific XML templates in ArcCatalog with synchronization flags
• Provenance and curation metadata scripted
Source Metadata Translation
• Hub-and-spoke model a la Echo Depository
  – repository agnostic
  – modular conversion hub
  – facilitate repository software migration & inter-archive exchange
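The hub-and-spoke idea can be shown in miniature: each format gets one reader into a neutral hub record and one writer out of it, so N formats need N readers and N writers instead of a converter for every pair. Everything named below (`READERS`, `WRITERS`, the format labels `source_a` and `target_b`, and all field names) is hypothetical and does not reflect the project's schemas.

```python
READERS = {}  # format name -> function(record) -> hub record
WRITERS = {}  # format name -> function(hub record) -> record


def reader(fmt):
    """Register a spoke that converts a source format into the hub form."""
    def register(func):
        READERS[fmt] = func
        return func
    return register


def writer(fmt):
    """Register a spoke that converts the hub form into a target format."""
    def register(func):
        WRITERS[fmt] = func
        return func
    return register


@reader("source_a")  # hypothetical source metadata format
def read_source_a(record):
    return {"title": record["Title"], "creator": record["Originator"]}


@writer("target_b")  # hypothetical repository-specific format
def write_target_b(hub_record):
    return {"name": hub_record["title"], "author": hub_record["creator"]}


def translate(record, source, target):
    """Translate via the hub: source spoke in, target spoke out."""
    return WRITERS[target](READERS[source](record))
```

Supporting a new repository then means adding one writer, which is what keeps the approach repository agnostic.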
Statistics Collection
• Python scripted statistics generation:
  – number of files by format
  – cumulative size by format
  – mean file size
  – collection size
  – agency contribution
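A minimal sketch of how the listed statistics might be computed is given below. It assumes each file is described by an `(agency, filename, size_bytes)` tuple and treats the file extension as a stand-in for format; both are simplifications chosen for the example, and `summarize` is a hypothetical name.

```python
import os
from collections import Counter


def summarize(records):
    """Compute collection statistics from (agency, filename, size_bytes) tuples."""
    files_by_format = Counter()
    size_by_format = Counter()
    size_by_agency = Counter()

    for agency, filename, size in records:
        # The extension stands in for the format in this sketch.
        fmt = os.path.splitext(filename)[1].lower() or "(none)"
        files_by_format[fmt] += 1
        size_by_format[fmt] += size
        size_by_agency[agency] += size

    total_files = sum(files_by_format.values())
    total_size = sum(size_by_format.values())
    return {
        "files_by_format": dict(files_by_format),
        "size_by_format": dict(size_by_format),
        "mean_file_size": total_size / total_files if total_files else 0.0,
        "collection_size": total_size,
        "size_by_agency": dict(size_by_agency),
    }
```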
Extra-repository AIP management
• Workflow Management Database populated as a spoke on the metadata/ingest hub
• External tracking of NOID, Handle, ISO keywords, other metadata for interaction with other systems
Questions?
Jim Tuttle
Geospatial Data Librarian & Project Coordinator
NCGDAP
NCSU Libraries
jim_tuttle at ncsu dot edu
http://www.lib.ncsu.edu/ncgdap/