Next Generation Archives: The NC Geospatial Data Archiving Project

Jeff Essic, North Carolina State University Libraries
Zsolt Nagy, North Carolina Center for Geographic Information and Analysis

Coastal Geotools ’09, March 3, 2009


Page 1

Next Generation Archives: The NC Geospatial Data Archiving Project

Jeff Essic, North Carolina State University Libraries

Zsolt Nagy, North Carolina Center for Geographic Information and Analysis

Coastal Geotools ’09, March 3, 2009

Page 2

NC Geospatial Data Archiving Project (NCGDAP)

Three year partnership between university library (NCSU) and state agency (NCCGIA), with Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP)

One of 8 initial NDIIPP collection building partnerships

Focus on state and local geospatial content in North Carolina (state demonstration)

Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories

Page 3

NCGDAP Specifics

Funding:
• $520,000 for 2005-2007
• $500,000 for 18 month extension

Staff:
• 1.5 FTE at NCSU
• Approx. same at NCCGIA

Website: http://www.lib.ncsu.edu/ncgdap

Page 4

Selected Geospatial Data Archive Projects

Project | Organizations | Funding
Persistent Archives Testbed | San Diego Supercomputer Center, NARA | NARA
VanMap | San Diego Supercomputer Center | Inter-PARES
Geospatial Repository for Academic Deposit & Extraction | EDINA | JISC
Geospatial Electronic Records | CIESIN | NHPRC
various | Carleton University | various
National Geospatial Digital Archive | UC Santa Barbara | NDIIPP
Maine GeoArchives | State of Maine | NHPRC

Page 5

Project Roots: NCSU Libraries Data Directory

Tracking data, map servers, and web services since 2000

Earliest use: Links to local data contacts and downloads

Now: Ranked 3rd in traffic among entry points to entire library website

Community help in site maintenance

Page 6

County Map and Data Services in NC

[Chart: number of counties (0 to 100) offering Map Server, Data Download, and WMS services, by year, 2000-2008. There are 100 counties in North Carolina.]

Page 7

Value in Older Data: Cultural Heritage

Future uses of data are difficult to anticipate (as with Sanborn Maps)

Page 8

Geospatial Data: Compelling Issues

Dynamic content
• Constantly updated information
• Data versioning

Digital object complexity
• Spatially enabled databases
• Complicated, multi-component formats
• Proprietary formats

Page 9

Digital Preservation Points of Failure

Data is not saved, or …
can’t be found, or …
media is obsolete, or …
media is corrupt, or …
format is obsolete, or …
file is corrupt, or …
meaning is lost
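One widely used guard against the "file is corrupt" failure is fixity checking: record a checksum for every file at capture time, then recompute and compare later. The following is a minimal sketch of that idea, not something described in the presentation; the function names (`sha256_of`, `build_manifest`, `verify_manifest`) are hypothetical, and the choice of SHA-256 is an assumption.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Return {relative_path: problem} for files that are missing or changed."""
    problems = {}
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            problems[rel_path] = "missing"
        elif sha256_of(target) != expected:
            problems[rel_path] = "checksum mismatch"
    return problems
```

A manifest built when a snapshot is staged can be stored alongside it and re-verified on a schedule. This only detects corruption or loss; it does not address obsolete formats or lost meaning.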

Page 10

Risks to Geospatial Data

Producer focus on current data
• Data overwrite as common practice

Future support of data formats in question
• No open, supported format for vector data

Shift to web services-based access
• Data becoming more ephemeral

Inadequate or nonexistent metadata
• Impedes discovery and use

Increasing use of spatial databases for data management
• The whole is greater than the sum of the parts

Page 11

Preservation Business Case

Land use change analysis
Site location analysis
Real estate trends analysis
Disaster response
Resolution of legal challenges
Impervious surface change mapping

Page 12

Business Case: Identifying Land Use Changes

Use case: Land use and impervious surface change analysis

[Image series labeled 1993, 1998, 1999, 2002, 2005]

Page 13

Geospatial Data Preservation Challenges

Data Capture
• Backups are common, but not long-term archives
• Producer focus is on current data
• Shift to web services-based access

Inadequate or Nonexistent Metadata
• Consistent NC survey stats: Only 40% of data producers create and maintain metadata

Page 14

Challenge: Vector Data Formats

No widely-supported, open vector formats for geospatial data
• Spatial Data Transfer Standard (SDTS) not widely supported
• Geography Markup Language (GML) – diversity of application schemas and profiles a challenge for “permanent access”

Spatial Databases
• The whole is more than the sum of the parts, and the whole is very difficult to preserve
• Can export individual data layers for curation, but relationships and other context are lost

Page 15

Challenge: Other Data Types

Cartographic Representation
• Software Project Files, PDFs, GeoPDFs, WMS images

Web 2.0 content
• Street views, Mashups

Oblique Imagery

3D Models

Page 16

Other Challenges

Rights management
Data versioning
Digital Object Complexity
Semantic issues
Content Packaging
Large scale content transfer
Integrating older analog materials
More …

Page 17

Different Ways to Approach Preservation

Technical solutions: How do we preserve acquired content over the long term?

Cultural/Organizational solutions: How do we make the data more preservable—and more prone to be preserved—from point of production?

Current use and data sharing requirements – not archiving needs – are most likely to drive improved preservability of content and improvement of metadata

Page 18

Question: Frequency of Capture?

Content Exchange – Getting Data in Motion

Repository Development

Repository of Temporal Data Snapshots

Page 19

Frequency of Capture

Issue: How frequently should county and municipal vector data layers be captured in archives?

Parcels, centerlines, jurisdictions, zoning, …

Parcel Boundary Changes 2001-2004, North Raleigh, NC

Page 20

Frequency of Capture Surveys

How often should continually changing vector datasets be captured?
Tap into data custodian understanding of production patterns and uses
Tap into local innovation
Learn about local business drivers for data archiving

2006 and 2008 surveys of NC cities and counties
2008 survey of archival practice in state agencies in NC
Planned survey of data users in NC

Page 21

Frequency of Capture (FOC) 2006 Survey Results: Overview

58% response rate; two-thirds of respondents create and retain periodic snapshots
Long-term retention more common in counties with larger populations
Storage environments vary, with servers and CD-ROMs most common
Wide variation in frequencies of capture
Offsite storage (or both onsite and offsite) is used by nearly half of the respondents
Popularity of historic images has resulted in scanning and geo-referencing of hardcopy aerial photos among one-third of the respondents

Page 22

Content Exchange Infrastructure

High volume of state/federal requests for local data
Solving the present-day problems of data sharing is a prerequisite to solving the problem of long-term access
Nov. 2007: NC Geographic Information Coordinating Council (GICC) approved “Ten Recommendations in Support of Geospatial Data Sharing”
http://www.ncgicc.org/

Page 23

Getting the Data in Motion

Most costly part of archive development is identifying, negotiating acquisition, and then transferring data

Important Objectives
• Minimize Direct Contact
• Provide Metadata
• Clarify Rights
• Routinize Transfers

Leverage other business uses that put data in motion:
• Continuity of operations
• Highway Planning
• Floodplain Mapping
• Census

Page 24

Getting the Data in Motion

Orthophoto Data Distribution System – “sneakernet”
• Transfer of large quantities of imagery

Street Centerline Data Distribution System
• Efficient transfer of data from 100 counties, with metadata and clarified rights
• http://www.ncstreetmap.com

NC GIS Inventory
• Efficient data identification
• Adding preservation elements

NC OneMap Data Download and Viewer
• Public access
• Data visualization

Page 25

Repository Development

Downloading or acquiring “low hanging fruit”
Tapping into current data flows
Developing our own metadata when necessary
Converting and preserving vector data in shapefile format
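Because a shapefile is a multi-component format (the main `.shp` file needs its `.shx` index and `.dbf` attribute table to be usable), an archive that preserves shapefiles may want to confirm that every dataset arrived complete. The sketch below is one hypothetical way to audit a staging directory; it is not taken from the project's workflow, and the function names are illustrative.

```python
from pathlib import Path

# The three mandatory parts of a shapefile: geometry, index, attribute table.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_components(shp_path: Path) -> list[str]:
    """List the required extensions absent next to the given .shp file."""
    present = {
        p.suffix.lower()
        for p in shp_path.parent.iterdir()
        if p.stem == shp_path.stem
    }
    return [ext for ext in REQUIRED_EXTENSIONS if ext not in present]


def audit_directory(root: Path) -> dict[str, list[str]]:
    """Map each incomplete shapefile (path relative to root) to its missing parts."""
    report = {}
    # Note: the "*.shp" pattern is case-sensitive on case-sensitive file systems.
    for shp in sorted(root.rglob("*.shp")):
        missing = missing_components(shp)
        if missing:
            report[shp.relative_to(root).as_posix()] = missing
    return report
```

An empty report means every `.shp` found has its index and attribute table beside it. A fuller check might also flag a missing `.prj` file, since without it the coordinate system is undocumented.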

Page 26

Repository Status

Acquired 6+ TB of data with more on the way

Disk space being used initially for “data staging”
• Inventorying

In the process of ingesting content into DSpace
• Metadata generation

Page 27

Data Preservation Like Fruit Desiccation?

Complex data representations can be made more preservable (yet less useful) through simplification.
• Conversion of various formats to shp
• Image outputs (web services, PDF maps, map image files)

Open GeoPDF standard
• Analogous to paper maps
• Combines data, symbology, annotation
• More data intelligence than simple image
• PDF content retained in addition to, NOT instead of data

Page 28

Engaging Spatial Data Infrastructure

Cultural/Organizational solutions: How do we make the data more preservable—and more prone to be archived—from point of production?

Engage and reach out to the data producer community and SDI
Sell the problem to software vendors and standards development
Find overlap with more compelling business problems: disaster preparedness, business continuity, road building, etc.
Discuss roles at the local, state, and federal level

Page 29

SDI Role in Data Preservation

Data inventories support content identification

Metadata standards support discoverability and use

Content standards support data interoperability over time and help eliminate semantic confusion

Data exchange networks:
• Minimize need to make contact
• Add technical, administrative, descriptive metadata
• Establish rights and provenance

Page 30

NC Spatial Data Infrastructure: NC OneMap

Next generation mechanism to coordinate and disseminate geographic information in North Carolina and interact with the NSDI.
NC GICC
Inventory for all geospatial data holdings – http://nc.gisinventory.net
Develop content standards for key data themes

One of the defined characteristics of NC OneMap is that “Historic and temporal data will be maintained and available”.

Page 31

Archival and Long Term Access Working Group

Initiated by NC Geographic Information Coordinating Council in 2008 to address growing concerns of state and local agencies about long-term access to data
Federal, state, regional, and local agency representation
Key focus
• Best practices for data snapshots and retention
• State Archives processes: appraisal, selection, retention schedules, etc.

Valuable outcome of NCGDAP – multiple parties and levels discussing data archiving on their own.

Page 32

Archival and Long Term Access Working Group

Final Report approved by NC GICC in November 2008

Best Practices for:
• Archiving Schedule
• Inventory
• Storage Medium
• Formats
• Naming
• Metadata
• Distribution
• Periodic Review
• Data Integrity
• Publicity

http://www.ncgicc.org/

Page 33

How to Recognize a Retention Schedule: Sample Schedule Item from NC OneMap

Page 34

Sample Proposed Local Schedule—County Management Schedule

Page 37

NDIIPP Multi-State Geospatial Project

Lead organizations: North Carolina Center for Geographic Information & Analysis (NCCGIA) and State Archives of NC

Partners:
• Leading state geospatial organizations of Kentucky and Utah
• State Archives of Kentucky and Utah
• NCSU Libraries in catalytic/advisory role

State-to-state and geo-to-Archives collaboration
2 year project: Nov. 2007-Dec. 2009
Archives as part of Spatial Data Infrastructure

Page 38

OGC Data Preservation Working Group

Formed Dec. 2006
Engage archival community
Find points of intersection with other OGC activities:
• GML for archiving
• Content packaging
• Large scale data transfers
• Time in decision support

Page 39

Cultural: Changing Industry Thinking

Is the geospatial industry “temporally-impaired”?
• Lack of access to older data
• Lack of tool/model support for temporal analysis
• Metadata: poor support for changing data
• Education: building class projects around available data (i.e., not temporal)

Increased interest now in temporal applications?
• Increased demand for temporal data?
• Improved tool support: ArcGIS 9.2 animation tools; Geodatabase History, etc.
• Emerging commercial market in older data

Page 40

Conclusions

“Supporting temporal analysis requirements” gets more attention than “archiving and preservation”
Leverage existing infrastructure
Current data sharing needs drive infrastructure improvements that help archiving
Leverage business needs that are more compelling than preservation (e.g., continuity of operations)
Facilitate stakeholder ownership of the solutions
Mine state and local archiving innovations

Page 41

Slide Presentation:

http://www.lib.ncsu.edu/ncgdap/presentations.html

Steve Morris
Head, Digital Library Initiatives
NCSU Libraries
ph: (919) 515-1361

Jeff Essic
Geospatial Data Services Librarian
NCSU Libraries
ph: (919) …

Page 42

Getting the Data in Motion

Harvesting use cases for older data as part of outreach

[Chart: Factors Driving Capture of Temporal Data. Y axis: % of respondents (0% to 60%). Categories: IT policy, Records retention policy, Tax admin rules, Land use change analysis, Resolution of legal issues, Historic mapping, Other. Source: Survey of current archiving practice among NC counties and municipalities]