Presentation to ARGIS - Atlanta Region GIS User Group October 30, 2013 Data Archiving & Preservation: Best Practices for GIS Jennifer Doty | jennifer.doty@emory.edu Data Management Specialist Emory Center for Digital Scholarship


Page 1: Presentation to ARGIS - Atlanta Region GIS User Group October 30, 2013 Jennifer Doty | jennifer.doty@emory.edu Data Management Specialist Emory Center

Presentation to ARGIS - Atlanta Region GIS User Group
October 30, 2013

Data Archiving & Preservation: Best Practices for GIS

Jennifer Doty | jennifer.doty@emory.edu
Data Management Specialist

Emory Center for Digital Scholarship


Overview

Best practices for managing geospatial data:
• File formats
• Naming conventions
• Folder structure
• Storage and backup
• Documentation

Trends in geospatial data archiving:
• Federal funding agencies’ requirements
• State initiatives for preservation


Best Practices: File Formats

Type of data: Geospatial data (vector and raster data)

Acceptable formats for sharing, reuse and preservation:
• ESRI Shapefile (essential: .shp, .shx, .dbf; optional: .prj, .sbx, .sbn)
• geo-referenced TIFF (.tif, .tfw)
• CAD data (.dwg)
• tabular GIS attribute data

Other acceptable formats for data preservation:
• ESRI Geodatabase format (.mdb, .gdb)
• MapInfo Interchange Format (.mif) for vector data
• Keyhole Mark-up Language (KML) (.kml)
• Adobe Illustrator (.ai), CAD data (.dxf or .svg)
• binary formats of GIS and CAD packages

UK Data Archive File Formats guide, http://www.data-archive.ac.uk/create-manage/format/formats-table
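Because a shapefile is really a set of sibling files, a quick completeness check before archiving can catch a missing component. The following is a minimal sketch, not a tool from the presentation: the function name `shapefile_components` is hypothetical, and the essential and optional extension lists are simply copied from the table above.

```python
from pathlib import Path

# Extensions as listed on the slide: three essential parts, three optional.
ESSENTIAL = (".shp", ".shx", ".dbf")
OPTIONAL = (".prj", ".sbx", ".sbn")


def shapefile_components(shp_path):
    """Report which component files sit next to a .shp file.

    Returns a dict with the essential extensions that are missing and
    the optional extensions that are present.
    """
    shp = Path(shp_path)
    # Collect the extensions of all files sharing the same base name,
    # compared case-insensitively (e.g. roads.SHP, roads.Dbf).
    found = {
        p.suffix.lower()
        for p in shp.parent.iterdir()
        if p.is_file() and p.stem == shp.stem
    }
    return {
        "missing_essential": [ext for ext in ESSENTIAL if ext not in found],
        "optional_present": [ext for ext in OPTIONAL if ext in found],
    }
```

An empty `missing_essential` list suggests the dataset is at least structurally complete; it says nothing about whether the contents are valid.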


Best Practices: File Formats

GeoMAPP Geospatial Data File Formats Reference Guide:
• provides a quick reference of common geospatial raster and vector dataset types
• serves as a tool to identify geospatial format types based on file extensions
• also includes information on standards and specifications for documenting geospatial data

http://www.geomapp.net/docs/GeoMAPP_Geospatial_data_file_formats_FINAL_20110701.xls


Best Practices: Naming Conventions

• Create meaningful but brief naming conventions for your project
• Use file names to classify broad types of files
• Avoid using spaces and special characters
• Begin names with letters, not numbers
  e.g. Census2010_blockgroups_GA, not 2010Census…
• Avoid very long file names
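These rules are easy to check automatically. The sketch below is one possible reading of them, under stated assumptions: the function name `check_name` is hypothetical, "special characters" is taken to mean anything other than letters, digits, underscores and hyphens, and the 40-character limit is an arbitrary stand-in for "very long".

```python
import os
import re

# Assumed limit; the slide only says to avoid very long names.
MAX_STEM_LENGTH = 40


def check_name(filename):
    """Return a list of naming problems for a file name (empty list = OK)."""
    stem, _ext = os.path.splitext(filename)
    problems = []
    # Rule: begin names with letters, not numbers.
    if not re.match(r"[A-Za-z]", stem):
        problems.append("does not begin with a letter")
    # Rule: avoid spaces and special characters (assumed allowed set below).
    if re.search(r"[^A-Za-z0-9_-]", stem):
        problems.append("contains spaces or special characters")
    # Rule: avoid very long file names.
    if len(stem) > MAX_STEM_LENGTH:
        problems.append("longer than %d characters" % MAX_STEM_LENGTH)
    return problems
```

For example, `check_name("Census2010_blockgroups_GA.shp")` returns an empty list, while a name starting with `2010Census` is flagged for its leading digit.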


Best Practices: Naming Conventions

Example: keyword_steward_extent_date.ext

• Keyword (essential)—be as descriptive of the contents of the data as possible by using a word or short phrase

• Steward (essential)—either the creator of the dataset or the last one to make a significant modification to a dataset

• Extent (optional)—may be included to indicate resolution of the data (e.g. county, state, or international)

• Date (optional)—may be used to indicate the date of creation or the age range of the content. Recommended format is YYYYMMDD

Indiana Geographic Information Council, http://www.igic.org/standards/namingstandard.pdf
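The keyword_steward_extent_date.ext pattern can be assembled programmatically so that the optional parts are handled consistently. This is a minimal sketch assuming underscore separators as in the example above; the function name `build_name` and the sample values are hypothetical.

```python
from datetime import date


def build_name(keyword, steward, ext, extent=None, when=None):
    """Assemble keyword_steward[_extent][_date].ext.

    keyword and steward are essential; extent and date are optional
    and are left out when not supplied.
    """
    if not keyword or not steward:
        raise ValueError("keyword and steward are essential")
    parts = [keyword, steward]
    if extent:
        parts.append(extent)
    if when:
        # Recommended date format from the slide: YYYYMMDD.
        parts.append(when.strftime("%Y%m%d"))
    return "_".join(parts) + "." + ext.lstrip(".")


# Hypothetical example values.
print(build_name("roads", "indot", "shp", extent="state", when=date(2013, 10, 30)))
```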


Best Practices: Naming Conventions

Versioning:
• useful to indicate file revisions or edits, especially in collaborations
• can be through discrete or continuous numbering, depending on minor or major revisions
  – think of software versioning: ArcGIS 10 was a significant change from 9.x, but ArcGIS 10.1 was a (relatively) minor change to 10
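One way to apply major and minor numbering to file names is a version suffix that is incremented on each revision. The sketch below assumes a `_vMAJOR-MINOR` suffix (a hyphen rather than a dot, to stay clear of special characters); both the suffix format and the function name `bump_version` are illustrative choices, not something the presentation prescribes.

```python
import re

# Assumed suffix format: _v<major>-<minor> at the end of the file stem.
_VERSION = re.compile(r"_v(\d+)-(\d+)$")


def bump_version(stem, major=False):
    """Return the stem with its version suffix incremented.

    A stem without a suffix is treated as unversioned and gets _v1-0.
    A major revision resets the minor number to 0.
    """
    match = _VERSION.search(stem)
    if match is None:
        return stem + "_v1-0"
    major_no, minor_no = int(match.group(1)), int(match.group(2))
    if major:
        major_no, minor_no = major_no + 1, 0
    else:
        minor_no += 1
    return stem[: match.start()] + "_v%d-%d" % (major_no, minor_no)
```

A small edit would take `parcels_fulton_v1-0` to `parcels_fulton_v1-1`; a substantial rework would call `bump_version(stem, major=True)` instead.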


Best Practices: Folder Structure

• Separate directories for scratch workspace and final data

• Hierarchy—is deep or shallow best for your project?
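Setting up the same skeleton for every project makes the scratch/final separation a habit rather than a decision. The following sketch creates one possible layout; the folder names in `LAYOUT` and the function name `create_project` are assumptions for illustration, and a deeper or shallower hierarchy may suit a given project better.

```python
from pathlib import Path

# Hypothetical layout: the only requirement taken from the slide is that
# scratch workspace and final data live in separate directories.
LAYOUT = ("scratch", "final/vector", "final/raster", "docs")


def create_project(root):
    """Create the project skeleton under root and return the folder paths."""
    root = Path(root)
    created = []
    for rel in LAYOUT:
        folder = root / rel
        # exist_ok=True makes the function safe to re-run on an existing project.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```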


Tape library, CERN, Geneva by Cory Doctorow / CC BY-SA 2.0


Best Practices: Storage & Backup

Storage Considerations:
• Accessibility
• Read/Write speed
• Size limits: overall vs. file size

Options:
• Local: PC drive, flash drive, external hard drive
• Server: department/organization server space
• Cloud: Dropbox, Google Drive, etc.


Best Practices: Storage & Backup

Backup Considerations:
• Accessibility (local, server, cloud)
• Redundancy (rule of thumb: here, near, far)

Options:
• Incremental/Snapshot
• Automated
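To make the incremental and "here, near, far" ideas concrete, the sketch below copies only new or changed files from a working folder into several backup locations, using a checksum to decide what has changed. It is a simplified illustration using only the Python standard library: the function names are hypothetical, deletions are not propagated, and dedicated backup software will generally be more robust. Scheduling it (cron, Task Scheduler) would cover the "automated" option.

```python
import hashlib
import shutil
from pathlib import Path


def sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def incremental_backup(source, destinations):
    """Copy new or changed files from source into each destination.

    Returns a list of (destination, relative path) pairs that were copied.
    """
    source = Path(source)
    copied = []
    for dest in map(Path, destinations):
        for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
            rel = src_file.relative_to(source)
            target = dest / rel
            # Skip files whose backup copy already has identical content.
            if target.exists() and sha256(target) == sha256(src_file):
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, target)
            copied.append((dest, rel))
    return copied
```

Passing, say, an external drive and a mounted server share as `destinations` gives the "near" and "far" copies, with the working folder itself serving as "here".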


Metadata is a love note… by sarah0s / CC BY-NC-ND 2.0


Best Practices: Documentation

“When thoughtfully populated, geospatial metadata can be a critical resource for understanding and managing geospatial data for current and future GIS practitioners and those trying to preserve the data.”
- Utilizing Geospatial Metadata to Support Data Preservation Practices, January 2011, GeoMAPP (http://www.geomapp.net/publications_categories.htm)


Best Practices: Documentation

Metadata—represents the who, what, when, where, why and how

Standards:
• CSDGM (FGDC)
• ISO 19115:2003 / ISO 19139


FGDC’s Content Standard for Digital Geospatial Metadata (CSDGM)

http://www.fgdc.gov/csdgmgraphical/index.html


CSDGM Fields for Preservation


Checklist: CSDGM Fields for Preservation

Identification Information - basic info about data set, including:
• party responsible: usually creator
• publication date: date the data set is completed and ready for use
• title: “where” “what” “when”
• maintenance/update frequency: annually, as needed, based on census, etc.
• bounding coordinates
• keywords (theme and place)
• access and use constraints: any restrictions, disclaimers, or guidance on data set attribution
• contact details

GeoMAPP, Utilizing Geospatial Metadata to Support Data Preservation Practices http://www.geomapp.net/docs/GeoMetadata_Items_for_Preservation_2011_0110.pdf


Checklist: CSDGM Fields for Preservation

Data Quality Information - provides historical lineage and source descriptions for the data used in the creation of the data set, including:
• originator
• publisher, publication date & place
• “currentness” of source data
• process description

GeoMAPP, Utilizing Geospatial Metadata to Support Data Preservation Practices http://www.geomapp.net/docs/GeoMetadata_Items_for_Preservation_2011_0110.pdf


Checklist: CSDGM Fields for Preservation

Spatial Reference Information - description of the reference frame for, and the means to encode, coordinates in the data set, including:
• map projection name
• coordinate system name
• unit of measure
• geodetic model: datum, ellipsoid

GeoMAPP, Utilizing Geospatial Metadata to Support Data Preservation Practices http://www.geomapp.net/docs/GeoMetadata_Items_for_Preservation_2011_0110.pdf


Checklist: CSDGM Fields for Preservation

Entity and Attribute Information - details about content of the data set (the entities, their attributes, and domains from which attribute values may be assigned), including:
• entity label
• attribute label and description

GeoMAPP, Utilizing Geospatial Metadata to Support Data Preservation Practices http://www.geomapp.net/docs/GeoMetadata_Items_for_Preservation_2011_0110.pdf


Checklist: CSDGM Fields for Preservation

Metadata Reference Information - information on the party responsible for creating the metadata and the currentness of the metadata:
• metadata standard name
• metadata standard version

GeoMAPP, Utilizing Geospatial Metadata to Support Data Preservation Practices http://www.geomapp.net/docs/GeoMetadata_Items_for_Preservation_2011_0110.pdf
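The checklist slides above lend themselves to an automated completeness check. The sketch below assumes a metadata record stored as CSDGM XML with the standard's short tag names under a `metadata` root; the selection of paths in `CHECKLIST` is one interpretation of the checklist rather than the GeoMAPP list itself, and the function name `missing_items` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Checklist item -> element path below the <metadata> root (CSDGM short names).
# This mapping is an illustrative subset, not an official list.
CHECKLIST = {
    "originator": "idinfo/citation/citeinfo/origin",
    "publication date": "idinfo/citation/citeinfo/pubdate",
    "title": "idinfo/citation/citeinfo/title",
    "update frequency": "idinfo/status/update",
    "bounding coordinates": "idinfo/spdom/bounding",
    "theme keyword": "idinfo/keywords/theme/themekey",
    "place keyword": "idinfo/keywords/place/placekey",
    "access constraints": "idinfo/accconst",
    "use constraints": "idinfo/useconst",
    "contact": "idinfo/ptcontac",
    "process description": "dataqual/lineage/procstep/procdesc",
    "spatial reference": "spref/horizsys",
    "attribute label": "eainfo/detailed/attr/attrlabl",
    "metadata standard name": "metainfo/metstdn",
    "metadata standard version": "metainfo/metstdv",
}


def missing_items(xml_text):
    """Return the checklist items that are absent or empty in a record."""
    root = ET.fromstring(xml_text)
    missing = []
    for item, path in CHECKLIST.items():
        element = root.find(path)
        # Leaf elements need text; container elements need child elements.
        has_text = element is not None and bool((element.text or "").strip())
        has_children = element is not None and len(element) > 0
        if not (has_text or has_children):
            missing.append(item)
    return missing
```

A check like this only confirms that fields are populated; whether they are populated thoughtfully still needs a human reader.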


Data Management Initiatives

Federal agency mandates for sponsored research:
• NSF & NIH requirements for DM plans
• GIS Inventory (Ramona) & Federal Grants data sharing plans: gisinventory.net

Other related initiatives:
• USGS DM working group
• DM training for early career researchers


FGDC Geospatial Data Lifecycle Model

http://www.fgdc.gov/policyandplanning/a-16/stages-of-geospatial-data-lifecycle-a16.pdf


State & National Initiatives in Geospatial Data Archiving

GeoMAPP - Geospatial Multistate Archive and Preservation Partnership (www.geomapp.net):
• federally funded partnership between the Library of Congress and state geospatial and archives staff from North Carolina, Kentucky, Montana, and Utah

National Digital Stewardship Alliance (NDSA), Geospatial Content Team (www.digitalpreservation.gov/ndsa):
• report identifying appraisal and selection activities as they affect decisions defining geospatial content of enduring value for the nation


Open GeoPortal @ Emory

NASA Goddard Photo and Video / CC BY


Green Question Mark by mikecogh on Flickr / CC BY


Contact Information:

Jennifer Doty | jennifer.doty@emory.edu
Data Management Specialist

Michael Page | [email protected] & Geospatial Data Librarian

Emory Center for Digital Scholarship
digitalscholarship.emory.edu