Managing your data
Data management
Advances in data management practices and technologies for ecosystem science
Ecological Society of Australia, Wed 1 October 2014
Matt Paget – AusCover data and systems coordinator
CSIRO Land & Water, Canberra
Introduction and Overview
• What is data management?
• From raw data to managed data
• Role of a data manager
• Why managed data?
• What does this mean for you?
• What is a data management plan?
• What does managed data look like?
Matt Paget – Data and systems coordinator for AusCover
AusCover is the satellite remote sensing data facility of TERN
• Biophysical remote sensing products
• Ground and airborne validation
What is data management?
• Metadata: Information and context
• Formats: Read and use
• Location: Discover and access
From raw to managed data
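Each stage of data is described by its format, metadata and location:
• Individual. Format: raw. Metadata: activity. Location: local.
• Community. Format: common, usable. Metadata: activity, who, when. Location: shared storage.
• Multi-discipline, published. Format: standard, open access. Metadata: who, what, why, when, where. Location: managed storage.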
Role of a data manager
Make data more usable and discoverable
Liaise with researchers
• Find a solution that works for everyone
Keep abreast of latest practices
• Understand evolving standards
Provide tools and resources to help researchers manage their own data
• License, metadata, format conversion, storage
Promote data discovery and interoperability
• Harvestable metadata, web-based data access
Why managed data?
Open data, Open science, Good science citizen
Citations and Published data
• Exposure
• Credit
• Sharing your research
Data management plans
• ARC
• Organisations
• Collaborations
Future Proof
• Safety of data
• Protect investment
• Counter memory loss
Interoperability
• Value adding
• Uptake and Relevance
• Alternative use
What does this mean for you?
Help is available
• Team or Community data managers
Collect metadata early
• Use a field data collection tool (ALA, ODK) or record carefully
Insist on a data management plan
• What is a data management plan?
Data management is a skill
• The more you do it the easier it is
What is a data management plan?
A data management plan or DMP is a formal document that outlines how you will handle your data both during your research, and after the project is completed.
Source: http://en.wikipedia.org/wiki/Data_management_plan
Data coordination
• Who will manage and collate the data
Protocols and sampling strategy
• How will the data be collected
• Refer, cite and credit as required
Raw data to products
• What do you expect to derive or deduce
Metadata and data
• Collate relevant information
• Consider data formats for sharing/reuse
Long-term storage and management
• Where will the data be stored
• Who will manage the data
• How will updates or changes be managed
• Who will pay for these services
Access and use policies
• License
• Embargo requirements
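The six plan components listed above can also be kept as a simple checklist, so that unanswered questions are easy to spot while a plan is being drafted. The following is a minimal sketch only: the section names come from the slide, but the question wording, the `DMP_SECTIONS` table and the `unanswered` helper are illustrative assumptions, not part of any DMP standard or funder template.

```python
# Section names follow the six components on the slide. The question
# wording and all names in this sketch are illustrative assumptions.
DMP_SECTIONS = {
    "Data coordination": ["Who will manage and collate the data?"],
    "Protocols and sampling strategy": ["How will the data be collected?"],
    "Raw data to products": ["What do you expect to derive or deduce?"],
    "Metadata and data": ["Which data formats suit sharing and reuse?"],
    "Long-term storage and management": [
        "Where will the data be stored?",
        "Who will manage the data?",
        "How will updates or changes be managed?",
        "Who will pay for these services?",
    ],
    "Access and use policies": [
        "Which license applies?",
        "Are there embargo requirements?",
    ],
}


def unanswered(answers: dict) -> list:
    """Return every checklist question that has no non-empty answer."""
    missing = []
    for section, questions in DMP_SECTIONS.items():
        for question in questions:
            # Blank or whitespace-only answers count as unanswered.
            if not str(answers.get(question, "")).strip():
                missing.append(f"{section}: {question}")
    return missing


if __name__ == "__main__":
    # Hypothetical draft with only two questions answered so far.
    draft = {
        "Where will the data be stored?": "Institution repository",
        "Which license applies?": "CC BY",
    }
    for item in unanswered(draft):
        print("TODO", item)
```

Keeping the plan as structured text like this is one option among many; a shared document with the same six headings serves the same purpose.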
What does managed data look like? #1
Metadata headings and recommendations
• Title: Clear and complete. Spell out acronyms. Include spatial reference if relevant.
• Abstract: Summary of the work undertaken and the resulting data product.
• License: Select a license type. Recommend CC BY. Get advice on using the right words.
• Point(s) of Contact: Name, Institution and Email at a minimum. Lead author and Data manager/contact.
• Space and time: Representative space and time bounds.
• Sampling strategy or algorithm: Description of the sampling process or algorithm.
• Data quality: Description of the quality, limitations and relevance of the data.
• Keywords: Selected keywords taken from your discipline and/or a vocabulary (e.g., GCMD).
• Link(s) to the data: URL for web-enabled data access.
• References: List of publications and other data products that relate to this data.
Row groups on the slide: Primary information, Secondary information
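The metadata headings above map naturally onto a small structured record. The sketch below shows one possible way to hold such a record and list the headings that are still empty. The key names, the choice to treat every heading as required, and all example values are assumptions made for illustration; they do not follow any particular metadata standard.

```python
import json

# Keys mirror the metadata headings on the slide. Treating every
# heading as required is an assumption made for this sketch.
REQUIRED_HEADINGS = [
    "title",
    "abstract",
    "license",
    "points_of_contact",
    "space_and_time",
    "sampling_strategy",
    "data_quality",
    "keywords",
    "data_links",
    "references",
]


def missing_headings(record: dict) -> list:
    """List headings that are absent or empty in a metadata record."""
    return [h for h in REQUIRED_HEADINGS if not record.get(h)]


# All values below are made-up placeholders, not a real data product.
example = {
    "title": "Example vegetation survey at a hypothetical field site",
    "abstract": "Summary of the work undertaken and the resulting data product.",
    "license": "CC BY",
    "points_of_contact": [
        {
            "name": "A. Researcher",
            "institution": "Example University",
            "email": "a.researcher@example.org",
        }
    ],
    "keywords": ["vegetation", "field survey"],
}

if __name__ == "__main__":
    # JSON is used here only as one convenient text serialisation.
    print(json.dumps(example, indent=2))
    print("Still to fill in:", missing_headings(example))
```

A record with no related publications would show `references` as missing under this rule, so a real checker would probably distinguish required from optional headings.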
What does managed data look like? #2
Common data formats and recommendations
• Notebooks, non-electronic: Susceptible to data loss. Consider translating to an electronic form.
• Text, CSV: Mostly fine. Easily readable. Consider a ReadMe file for metadata.
• MS Excel (xls): Not too bad because most people have the software. Susceptible to change and data loss.
• Database: Very popular. Good for stable, searchable data. Web access. Requires knowledge of the software.
• ArcGIS/View, ERDAS, ENVI: Proprietary GIS and raster software. OK to work with. Generally avoid these formats for data publishing because expensive software is required.
• GeoTIFF, shape files, netCDF, etc.: Supported by the Open Geospatial Consortium. Multiple software and web services.
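The recommendation above to pair a CSV file with a ReadMe file for metadata can be sketched with the Python standard library alone. The file names `data.csv` and `README.txt`, the `write_csv_with_readme` helper, and the example row are all hypothetical choices for illustration.

```python
import csv
import tempfile
from pathlib import Path


def write_csv_with_readme(folder, rows, fieldnames, notes):
    """Write data.csv and a plain-text README.txt that describes it."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    # newline="" lets the csv module control line endings itself.
    data_path = folder / "data.csv"
    with data_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # The ReadMe holds "Heading: value" lines plus the column names.
    lines = [f"{key}: {value}" for key, value in notes.items()]
    lines.append("Columns: " + ", ".join(fieldnames))
    readme_path = folder / "README.txt"
    readme_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return data_path, readme_path


if __name__ == "__main__":
    # Made-up example data written to a temporary folder.
    with tempfile.TemporaryDirectory() as tmp:
        data, readme = write_csv_with_readme(
            tmp,
            rows=[{"site": "A1", "cover_percent": 42}],
            fieldnames=["site", "cover_percent"],
            notes={"Title": "Example ground cover survey", "License": "CC BY"},
        )
        print(readme.read_text(encoding="utf-8"))
```

Keeping the ReadMe next to the data file means the metadata travels with the data when the folder is copied or shared.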
What does managed data look like? #3
Storage locations and recommendations
• Desktop and/or external hard drive: Susceptible to data loss. Highly managed data and hardware.
• Shared machines: Mostly fine. Sharable within a group or team. Generally managed/redundant hardware.
• Institution/community repository: Shareable within a community. Access is variable. Easy to divorce data from the owner.
• Web services: Sharable to all. Machine and human access. Access should be as easy for the owner as anyone else.
Resources
Metadata
Data collection
• ALA field tools .. http://www.ala.org.au/get-involved/citizen-science/fielddata-software
• ODK forms .. http://www.auscover.org.au/xwiki/bin/view/Field+Sites/ODK+Forms
• AusPlots field handbook
Metadata preparation
• SHaRED .. http://www.shared.org.au
• Metadata template .. http://www.auscover.org.au/xwiki/bin/view/Product+pages/Product+page+template+-+Field+data
Formats
• Community practices
• Open Geospatial Consortium .. http://www.opengeospatial.org
Storage
• Home institution
• Australian eResearch Organisations .. http://www.aero.edu.au
Questions and advice
Matt Paget [email protected] data and systems coordinator
Anita Smyth [email protected], SHaRED data facilitator
Siddeswara Guru [email protected] data coordinator