Page 1: NCSU Libraries 9 October 2006 EPA Meeting Preservation Partnership with Library of Congress: NDIIPP and the North Carolina Geospatial Data Archiving Project

9 October 2006 EPA Meeting

NCSU Libraries

Preservation Partnership with Library of Congress:

NDIIPP and the North Carolina Geospatial Data Archiving Project

Steve Morris

Jim Tuttle

Rob Farrell

Jeff Essic

Page 2:

What is NDIIPP? Why NCSU Libraries?

• NDIIPP = National Digital Information Infrastructure and Preservation Program

• Responding to concern that we might be in the middle of a “digital dark age,” Congress earmarked $100 million for digital preservation efforts through 2010

• Timeline
  – Aug. 2003: Library of Congress (LC) puts out call for proposals for “preservation partners”
  – Sept. 2004: LC finalizes agreements with eight principal partners, including NCSU
  – Oct. 2004: the three-year projects begin

• A cooperative agreement … not a grant
  – emphasis on ongoing interaction with LC and other partners, with transfer of learning experience to LC as primary outcome

Page 3:

NC Geospatial Data Archiving Project (NCGDAP)

• Partner: NC Center for Geographic Information & Analysis (state agency)

• Focus: State and local agency digital geospatial data in NC as state demonstration

• Objective: Engage existing spatial data infrastructure (SDI) in the problem of preservation

• Tied to the NC OneMap initiative, which provides for seamless access to data, metadata, and inventories

Page 4:

Geospatial Data Types: Vector & Attribute Data

Time series: Parcel Boundary Changes 2001-2004

North Raleigh, NC

Page 5:

Geospatial Data Types: Vector Data

Time series: Parcel Boundary Changes 2001-2004

North Raleigh, NC

Page 6:

Geospatial Data Types: Aerial Imagery

Page 7:

Geospatial Data Types: Aerial Imagery

Page 8:

Geospatial Data Types: Aerial Imagery

• 85+ NC counties with orthophotos

• 1-5 flights per county

• 30-200 GB per flight
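
The figures on this slide imply a rough statewide storage range. The sketch below multiplies them out; it assumes exactly 85 counties (the slide says "85+", so this is a floor) and decimal terabytes (1 TB = 1000 GB), neither of which the slide states.

```python
# Back-of-the-envelope storage range for county orthophotos.
# Assumptions (not from the slide): exactly 85 counties, 1 TB = 1000 GB.
GB_PER_TB = 1000

COUNTIES = 85                 # "85+" on the slide, so a lower bound
FLIGHTS_PER_COUNTY = (1, 5)   # min, max flights per county
GB_PER_FLIGHT = (30, 200)     # min, max size of one flight


def estimate_tb(counties: int, flights: int, gb_per_flight: int) -> float:
    """Total volume in TB for the given counts."""
    return counties * flights * gb_per_flight / GB_PER_TB


low_tb = estimate_tb(COUNTIES, FLIGHTS_PER_COUNTY[0], GB_PER_FLIGHT[0])
high_tb = estimate_tb(COUNTIES, FLIGHTS_PER_COUNTY[1], GB_PER_FLIGHT[1])
print(f"{low_tb:.2f} TB to {high_tb:.0f} TB")
```

Under these assumptions the range runs from about 2.55 TB to 85 TB, which is why large data volumes come up later in the deck as an ingest concern.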

Page 9:

Today’s Geospatial Data as Tomorrow’s Cultural Heritage

Future uses of data are difficult to anticipate (as with Sanborn Maps).

Page 10:

Digital Preservation Points of Failure

• Data is not saved, or …

• can’t be found, or …

• media is obsolete, or …

• media is corrupt, or …

• format is obsolete, or …

• file is corrupt, or …

• meaning is lost

Solutions:

• Migration

• Emulation

• Encapsulation

• XML

Page 11:

Risks to Digital Geospatial Data

• Producer focus on current data
  – Data overwrite as common practice

• Future support of data formats in question
  – No open, supported format for vector data

• Shift to web services-based access
  – Data becoming more ephemeral

• Inadequate or nonexistent metadata
  – Impedes discovery and use

• Increasing use of spatial databases for data management
  – Complex entities: the whole is greater than the sum of the parts

Page 12:

NCGDAP Approach to Preservation

• Technical solutions: How do we archive acquired content over the long term?
  – Build a data repository: not as an end in itself but as a catalyst for discussion within the data community
  – Develop a repository ingest workflow: create technical points of engagement with the NDIIPP partners

• Cultural/Organizational solutions: How do we make the data more preservable (and more likely to be archived) from the point of production?
  – Engage data producer community and spatial data infrastructure through outreach and engagement; influence practice
  – Sell the problem to software vendors and standards development
  – Find overlap with more compelling business problems: disaster preparedness, business continuity, road building, etc.
  – Start a discussion about roles at the local, state, and federal level

Page 13:

Repository Ingest Workflow

• Flexible, extensible processes

• Clear, documented procedures

• Adherence to standard practices, where they exist

• Automation

Page 14:

Technical Solution: Building a Digital Repository

• Three “Rights”:
  – Right format
  – Right tags (metadata)
  – Right relationship

Oh, and of course, valid for the rest of the Digital Age!

NCGDAP is about researching methodologies…

Page 15:

What is the “Right” Format???

• Well, it’s complicated…

• Databases

• Multi-part datasets

• Open Source

• Developments

• Web Services

Page 16:

Our Format Methodology

• Decide on archival format(s)

• Migrate non-archival formats

• Archive both versions of the data set

We need a methodology that can do this a few hundred thousand times… initially.
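
The three steps on this slide can be read as a small policy lookup that runs once per data set. The sketch below is only an illustration: the format names in `ARCHIVAL_FORMATS` and `MIGRATION_TARGETS`, the `Plan` type, and `plan_for` are all hypothetical and do not reflect the formats NCGDAP actually chose.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

# ILLUSTRATIVE ONLY: these tables are assumptions made for the example,
# not the project's real format decisions.
ARCHIVAL_FORMATS: Set[str] = {"shp", "tif"}
MIGRATION_TARGETS: Dict[str, str] = {"e00": "shp", "sid": "tif"}


@dataclass(frozen=True)
class Plan:
    original: str               # format the data set arrived in (always kept)
    migrate_to: Optional[str]   # archival format to create, if any
    needs_review: bool          # True when no policy decision exists yet


def plan_for(fmt: str) -> Plan:
    """Decide what to archive for a data set in format `fmt`."""
    fmt = fmt.lower()
    if fmt in ARCHIVAL_FORMATS:
        # Already archival: keep the original only.
        return Plan(fmt, None, False)
    if fmt in MIGRATION_TARGETS:
        # Non-archival: keep the original and add a migrated version.
        return Plan(fmt, MIGRATION_TARGETS[fmt], False)
    # Unknown format: keep the original and flag it for a human decision.
    return Plan(fmt, None, True)
```

Keeping the decision in a table rather than in branching code is one way to make the same step repeatable across a few hundred thousand data sets, since policy changes then touch data and not logic.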

Page 17:

The Zip Codes Example

Page 18:

Where Is the Data Set?

Page 19:

Here Is One!

Page 20:

Needles in the Haystack

• Computer Programs Written
  – Utilize functionality of GIS
  – Iterate through the data sets
  – Create “bundles” for deposit

• Process Steps
  1. Locate a data set
  2. Determine the format
  3. Make appropriate conversion
  4. Create and isolate “bundle” with new and original format
  5. Repeat
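
The process steps above can be sketched as a loop over a directory tree. This is a minimal illustration, not the project's tooling: the slide says the real programs use GIS functionality, so conversion here is a caller-supplied function, and the grouping rule (files sharing a stem form one data set), the bundle layout (`original/` and `converted/`), and all function names are assumptions.

```python
import shutil
from pathlib import Path
from typing import Callable, Dict, Iterator, List

# A converter receives the data set's files and a destination folder.
Converter = Callable[[List[Path], Path], None]


def locate_data_sets(root: Path) -> Iterator[List[Path]]:
    """Step 1: group files that share a folder and stem into one data set.

    Assumption: this is how multi-part sets such as .shp/.shx/.dbf are found.
    """
    groups: Dict[Path, List[Path]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups.setdefault(path.with_suffix(""), []).append(path)
    for key in sorted(groups):
        yield groups[key]


def determine_format(parts: List[Path]) -> str:
    """Step 2: guess the format from file extensions (a simplification)."""
    suffixes = {p.suffix.lower() for p in parts}
    if ".shp" in suffixes:
        return "shapefile"
    if len(suffixes) == 1:
        return next(iter(suffixes)).lstrip(".") or "unknown"
    return "unknown"


def build_bundles(root: Path, out: Path,
                  converters: Dict[str, Converter]) -> List[Path]:
    """Steps 3-5: convert where a converter exists, then bundle both versions.

    `out` should live outside `root`. Bundle names are stem + format, so
    same-named data sets in different folders would collide in this sketch.
    """
    bundles: List[Path] = []
    for parts in locate_data_sets(root):
        fmt = determine_format(parts)
        bundle = out / f"{parts[0].stem}_{fmt}"
        original = bundle / "original"
        original.mkdir(parents=True, exist_ok=True)
        for part in parts:
            shutil.copy2(part, original / part.name)  # keep original format
        convert = converters.get(fmt)
        if convert is not None:
            converted = bundle / "converted"
            converted.mkdir(exist_ok=True)
            convert(parts, converted)                 # add the new format
        bundles.append(bundle)
    return bundles
```

A caller would pass a mapping such as `{"shapefile": my_converter}`, where `my_converter` wraps whatever GIS tool performs the actual migration.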

Page 21:

Custom Tools

Page 22:

Custom Tools

Page 23:

Hub-and-spoke Metadata Transformation

Page 24:

Hub-and-spoke Metadata Transformation
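
Only the slide title survives in this transcript, so the following is a general illustration of the hub-and-spoke pattern rather than the project's implementation. Each metadata schema gets one converter into a shared hub record and one converter out of it, so N schemas need 2N converters instead of N(N-1) pairwise ones. The schema names, field names, and the `transform` helper are all hypothetical.

```python
from typing import Callable, Dict

Record = Dict[str, str]

# Spokes into the hub: one function per (hypothetical) source schema.
TO_HUB: Dict[str, Callable[[Record], Record]] = {
    "schema_a": lambda r: {"title": r["Title"], "creator": r["Originator"]},
    "schema_b": lambda r: {"title": r["dc_title"], "creator": r["dc_creator"]},
}

# Spokes out of the hub: one function per (hypothetical) target schema.
FROM_HUB: Dict[str, Callable[[Record], Record]] = {
    "schema_a": lambda h: {"Title": h["title"], "Originator": h["creator"]},
    "schema_b": lambda h: {"dc_title": h["title"], "dc_creator": h["creator"]},
}


def transform(record: Record, source: str, target: str) -> Record:
    """Convert a record between schemas by passing through the hub form."""
    hub_record = TO_HUB[source](record)
    return FROM_HUB[target](hub_record)
```

Adding a third schema in this arrangement means writing two new functions, and every existing schema can then be converted to and from it without further work.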

Page 25:

Preserving Local Collections

Page 26:

Preserving Local Collections

Page 27:

Preserving Local Collections

Page 28:

Preserving Local Collections

Page 29:

Geologic and Historic Topographic Maps: Georeferencing and Preservation

Page 32:

Historic Topographic Map Preservation

• 165 Historic 15-minute series topographic maps for NC

• Date range: 1892-1959

• Documentation at http://www.lib.ncsu.edu/gis/historictopos.html

• Available on NCSU Libraries Geodata server

Page 33:

Geologic Map Preservation

• 290 Geologic Maps for NC

• Map sources are the U.S. Geological Survey, the NC Geological Survey, and theses and dissertations

• Documentation at http://www.lib.ncsu.edu/gis/geolmaps.html

• Public download at http://wfs.enr.state.nc.us/NCGeologicMaps/

Page 34:

Geologic Map Preservation

1,200 – 24,000

1:500,000 – 1:2.5 M

1:31,680 – 1:430,000

Page 42:

NCGS Project Summary

• Project came to us - workplan and intern identified

• Preservation risk - data was stored on external drive

• Content is in high demand by patrons, hardcopy only, scarce to obtain

• Collection acquired at no cost to Libraries

• Data files publicly available for download

• Partnership with NC Dept. of Environment and Natural Resources; increasing interest in preservation

• Early raster dataset for NCGDAP – test for large data volumes, ingest process, metadata creation

• NCGS Open File Report forthcoming

Page 43:

NCGDAP: Engagement with the Data Community

• Engaging spatial data infrastructure
  – Evaluating metadata and content standard adherence
  – Cultivation of content exchange networks
  – Sept. 2006 survey of current practice in local agencies

• External partnerships
  – Partners on JISC-funded effort in the UK (Edinburgh)

• Engaging software vendors
  – Meetings with ESRI development teams

• Engaging standards development processes
  – Nov. 2005, partnered with University of Edinburgh on presenting the preservation problem to the Open Geospatial Consortium (OGC) Technical Committee
  – Oct. 2006, partnered with NARA on initiating a formal working group on digital preservation within the OGC

Page 44:

NCGDAP on the Road

Presentations, posters, and workshops

Jan. 2005 - Sept. 2006

Highlights:

• O’Reilly Where 2.0

• OGC Meeting (Germany)

• Digital Curation Center (UK)

• IS&T Archiving (Canada)

• IASSIST (UK)

• ESRI International

• Joint NDIIPP & JISC Meeting

National/International: 37

State/Local: 21

Page 45:

NCGDAP: Future Directions

• Project shifting to data acquisition mode

• Current contract ends Oct. 2007

• Likely continuation of project funding through Oct. 2010

• Four responses to additional LC “Requests for Expression of Interest (RFEI)”
  – Development of content exchange networks
  – Development of tool for automated capture of web mapping services
  – Participation in repository exchange tests
  – Multi-state project involving State Archives

… RFEI status pending

Page 46:

Questions?

North Carolina Geospatial Data Archiving Project website

http://www.lib.ncsu.edu/ncgdap

Library of Congress NDIIPP website

http://www.digitalpreservation.gov/