managing your data paget

Data management Advances in data management practices and technologies for ecosystem science Ecological Society of Australia, Wed 1 October 2014 Matt Paget – AusCover data and systems coordinator CSIRO Land & Water, Canberra

Upload: terrestrial-ecosystem-research-network

Post on 24-Dec-2014

DESCRIPTION

Managing your data. Presentation by Matt Paget at ESA conference workshop 1 October 2014

TRANSCRIPT

Page 1: Managing your data paget

Data management
Advances in data management practices and technologies for ecosystem science
Ecological Society of Australia, Wed 1 October 2014

Matt Paget, AusCover data and systems coordinator
CSIRO Land & Water, Canberra

Page 2: Managing your data paget

Introduction and Overview

What is data management?
From raw data to managed data
Role of a data manager
Why managed data?
What does this mean for you?
What is a data management plan?
What does managed data look like?

Matt Paget, data and systems coordinator for AusCover

AusCover is the satellite remote sensing data facility of TERN
Biophysical remote sensing products, ground and airborne validation

Page 3: Managing your data paget

What is data management?

Metadata
• Information and context

Formats
• Read and use

Location
• Discover and access

Page 4: Managing your data paget

From raw to managed data

Individual
• Format: Raw
• Metadata: Activity
• Location: Local

Page 5: Managing your data paget

From raw to managed data

Individual
• Format: Raw
• Metadata: Activity
• Location: Local

Community
• Format: Common, Usable
• Metadata: Activity, Who, When
• Location: Shared storage

Page 6: Managing your data paget

From raw to managed data

Individual
• Format: Raw
• Metadata: Activity
• Location: Local

Community
• Format: Common, Usable
• Metadata: Activity, Who, When
• Location: Shared storage

Multi-discipline, Published
• Format: Standard, Open Access
• Metadata: Who, What, Why, When, Where
• Location: Managed storage

Page 7: Managing your data paget

Role of a data manager

Make data more usable and discoverable

Liaise with researchers
• Find a solution that works for everyone

Keep abreast of latest practices
• Understand evolving standards

Provide tools and resources to help researchers manage their own data
• License, Metadata, Format conversion, Storage

Promote data discovery and interoperability
• Harvestable metadata, Web-based data access

Page 8: Managing your data paget

Why managed data?

Open data, Open science, Good science citizen

Citations and Published data
• Exposure
• Credit
• Sharing your research

Data management plans
• ARC
• Organisations
• Collaborations

Future Proof
• Safety of data
• Protect investment
• Counter memory loss

Interoperability
• Value adding
• Uptake and Relevance
• Alternative use

Page 9: Managing your data paget

What does this mean for you?

Help is available
• Team or Community data managers

Collect metadata early
• Use a field data collection tool (ALA, ODK) or record carefully

Insist on a data management plan
• What is a data management plan?

Data management is a skill
• The more you do it the easier it is

Page 10: Managing your data paget

What is a data management plan?

A data management plan or DMP is a formal document that outlines how you will handle your data both during your research, and after the project is completed.

http://en.wikipedia.org/wiki/Data_management_plan

Data coordination
• Who will manage and collate the data

Protocols and sampling strategy
• How will the data be collected
• Refer, cite and credit as required

Raw data to products
• What do you expect to derive or deduce

Metadata and data
• Collate relevant information
• Consider data formats for sharing/reuse

Long-term storage and management
• Where will the data be stored
• Who will manage the data
• How will updates or changes be managed
• Who will pay for these services

Access and use policies
• License
• Embargo requirements
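
The six parts of a plan listed on this slide can be kept as a simple checklist. The sketch below is only an illustration: the class name, the field names and the example answers are hypothetical, not part of the presentation or of any DMP standard.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """Hypothetical container with one free-text field per DMP part."""
    data_coordination: str = ""        # who will manage and collate the data
    protocols_and_sampling: str = ""   # how the data will be collected
    raw_data_to_products: str = ""     # what you expect to derive or deduce
    metadata_and_data: str = ""        # relevant information, formats for reuse
    long_term_storage: str = ""        # where stored, who manages, who pays
    access_and_use: str = ""           # license, embargo requirements


def missing_sections(plan: DataManagementPlan) -> list[str]:
    """Return the names of the parts that are still empty."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


# Invented example answers, for illustration only.
plan = DataManagementPlan(
    data_coordination="Project data manager collates field sheets weekly",
    access_and_use="CC BY, no embargo",
)
print(missing_sections(plan))
```

Running the check early in a project shows which parts of the plan still need an answer.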

Page 11: Managing your data paget

What does managed data look like? #1

Metadata heading: Recommendation

Title: Clear and complete. Spell out acronyms. Include spatial reference if relevant

Abstract: Summary of the work undertaken and the resulting data product

License: Select a license type. Recommend CC BY. Get advice on using the right words.

Point(s) of Contact: Name, Institution and Email at a minimum. Lead author and Data manager/contact.

Space and time: Representative space and time bounds

Sampling strategy or algorithm: Description of the sampling process or algorithm

Data quality: Description of the quality, limitations and relevance of the data

Keywords: Selected keywords taken from your discipline and/or a vocabulary (e.g., GCMD)

Link(s) to the data: URL for web-enabled data access

References: List of publications and other data products that relate to this data

Side labels: Primary information, Secondary information
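
A metadata record built from these headings can be checked before it is published. This is a minimal sketch under stated assumptions: the key names, the choice of which headings count as required, and all sample values are invented for illustration and do not come from the slides or from a metadata standard.

```python
# Headings treated here as required; this selection is an assumption.
REQUIRED = ["title", "abstract", "license", "contacts", "bounds", "data_links"]


def bounds_from_samples(samples):
    """Derive space and time bounds from (lat, lon, iso_date) tuples."""
    lats = [s[0] for s in samples]
    lons = [s[1] for s in samples]
    dates = [s[2] for s in samples]  # ISO 8601 date strings sort chronologically
    return {
        "south": min(lats), "north": max(lats),
        "west": min(lons), "east": max(lons),
        "start": min(dates), "end": max(dates),
    }


def check_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = [f"missing: {key}" for key in REQUIRED if not record.get(key)]
    for contact in record.get("contacts", []):
        for part in ("name", "institution", "email"):
            if not contact.get(part):
                problems.append(f"contact lacks {part}")
    return problems


# Invented sample values, for illustration only.
samples = [(-35.3, 149.1, "2014-03-02"), (-35.1, 149.4, "2014-09-17")]
record = {
    "title": "Example vegetation cover plots, Canberra region",
    "abstract": "Field estimates of vegetation cover at example plots.",
    "license": "CC BY",
    "contacts": [{"name": "A. Researcher", "institution": "Example University"}],
    "bounds": bounds_from_samples(samples),
    "data_links": [],
}
print(check_record(record))
```

Here the check reports the empty data link list and the contact without an email address, the two gaps left in the example record.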

Page 12: Managing your data paget

What does managed data look like? #2

Common data format: Recommendation

Notebooks, non-electronic: Susceptible to data loss. Consider translating to an electronic form

Text, CSV: Mostly fine. Easily readable. Consider a ReadMe file for metadata

MS Excel (xls): Not too bad because most people have the software. Susceptible to change and data loss

Database: Very popular. Good for stable, searchable data. Web access. Requires knowledge of the software

ArcGIS/View, ERDAS, ENVI: Proprietary GIS and raster software. Ok to work with. Generally avoid these formats for data publishing, as expensive software is required

GeoTIFF, shape files, netCDF, etc: Supported by the Open Geospatial Consortium. Multiple software and web services
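
For the Text/CSV row, the ReadMe suggestion could look like the sketch below, which writes a CSV file and a plain-text ReadMe next to it. The function name, the file naming pattern and the example values are hypothetical choices made for this illustration.

```python
import csv
import tempfile
from pathlib import Path


def write_csv_with_readme(folder, name, header, rows, notes):
    """Write <name>.csv plus <name>_ReadMe.txt describing it; return both paths."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    data_path = folder / f"{name}.csv"
    with data_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    # One "Key: value" line per metadata note, then the column names.
    lines = [f"{key}: {value}" for key, value in notes.items()]
    lines.append("Columns: " + ", ".join(header))
    readme_path = folder / f"{name}_ReadMe.txt"
    readme_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return data_path, readme_path


# Invented example data, written to a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    data_path, readme_path = write_csv_with_readme(
        tmp,
        "plots",
        ["site", "date", "cover_percent"],
        [["A1", "2014-03-02", 42], ["A2", "2014-09-17", 57]],
        {"Title": "Example vegetation cover plots", "License": "CC BY"},
    )
    print(readme_path.read_text(encoding="utf-8"))
```

Keeping the ReadMe in the same folder as the data means the metadata travels with the file when it is copied or shared.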

Page 13: Managing your data paget

What does managed data look like? #3

Storage location: Recommendation

Desktop and/or external hard drive: Susceptible to data loss. Highly managed data and hardware

Shared machines: Mostly fine. Shareable within a group or team. Generally managed/redundant hardware

Institution/community repository: Shareable within a community. Access is variable. Easy to divorce data from the owner

Web services: Shareable to all. Machine and human access. Access should be as easy for the owner as anyone else

Page 14: Managing your data paget

Resources

Metadata

Data collection
• ALA field tools .. http://www.ala.org.au/get-involved/citizen-science/fielddata-software
• ODK forms .. http://www.auscover.org.au/xwiki/bin/view/Field+Sites/ODK+Forms
• AusPlots field handbook

Metadata preparation
• SHaRED .. http://www.shared.org.au
• Metadata template .. http://www.auscover.org.au/xwiki/bin/view/Product+pages/Product+page+template+-+Field+data

Formats
• Community practices
• Open Geospatial Consortium .. http://www.opengeospatial.org

Storage
• Home institution
• Australian eResearch Organisations .. http://www.aero.edu.au

Page 15: Managing your data paget

Questions and advice

Matt Paget [email protected] data and systems coordinator

Anita Smyth [email protected], SHaRED data facilitator

Siddeswara Guru [email protected] data coordinator