sediment experimentalist network (sen): sharing and reusing methods and data in earth surface...
DESCRIPTION
Presentation given to the Summer Institute for Earth Surface Dynamics (SIESD) 2014 at St. Anthony Falls Laboratory, University of Minnesota, about the Sediment Experimentalist Network (SEN). SEN is an EarthCube Research Coordination Network, whose goal is to integrate the efforts of sediment experimentalists and build a knowledge base for guidance on best practices for data collection and management.
TRANSCRIPT
Sediment Experimentalist Network (SEN): Sharing and reusing methods and data in Earth surface process experiments
Leslie Hsu (IEDA, Lamont-Doherty Earth Observatory), Wonsuck Kim (UT Austin), Brandon McElroy (U Wyoming), Raleigh Martin (UCLA)
Charles Nguyen, Danny Im, UMN
August 2014, NCED SIESD at UMN SAFL
Do we need a Sediment Experimentalist Network?
1. Expectations about research data are evolving
1. Data deluge: There is more data from new technologies.
2. Funding agencies are asking for data management plans.
3. Journals are asking for links to archived full datasets.
4. Metrics for datasets are being developed, allowing better attribution.
2. Experimental data is long tail data
Long Tail Characteristics
• More specialised
• Low volume
• On C drives
• Hard to find
• Heterogeneous
• Collected by many people
• Citizen science
• Etc.
Long Tail: Environmental and Earth sciences
The Head: Astronomy, Climate, High Energy Physics, Genomics
L. Wyborn
http://juliegood.wordpress.com/tag/long-tail/
3. Grand challenges in experimental geomorphology require data syntheses
• Repeatability
• Scalability
• Autogenic vs. allogenic processes
4. Informatics programs are growing
• NSF ACI: Advanced Cyberinfrastructure
• AGU ESSI: Earth and Space Science Informatics
• NSF EarthCube program
• ESIP: Earth Science Information Partners
• SEAD: data services for managing data
• GeoSoft: documenting and sharing software and scripts
• CINERGI: providing community access to tools and resources
Yes, we need SEN.
SEN’s goal is to integrate the efforts of sediment experimentalists and build a knowledge base for guidance on best practices for data collection and management, and to be the liaison to cyberinfrastructure and geoinformatics communities.
Three components of SEN
SEN-EC Experimental Collaboratories
• Facilitate collaboration between experimental laboratories
• Develop collaborative infrastructure
• Broadcast experiments
• Distributed experiments
• Experimental reproducibility
SEN-KB Knowledge Base
• Develop online resources for experimental data management
• SEN data catalog and wiki
• SEN-Wiki
• Recruit datasets for inclusion in online repositories
SEN-ED Education & Data Standards
• Facilitate community discussion of data practices and standards
• Disseminate guidelines
• Provide training about data management and sharing
Today’s objectives
• What is SEN, why do we need it?
• What is the data life cycle?
• How can SEN help you?
• Tell SEN what you need
• SEN data challenge
Normal degradation of data
Michener et al. (1997)
The Data Life Cycle
1. Proposal Development and Data Management Plans
2. Project Start‐up
3. Data Collection and File Creation
4. Data Analysis
5. Data Sharing (through publication)
6. Depositing Data
7. …Discovery and back to 1…
http://www.icpsr.umich.edu/files/ICPSR/access/dataprep.pdf
Conception
Gestation
Birth
Maturation
Adult Life
Death, Burial
Planning
2012
Jorge Cham
The funding agency wants your plan and the promise that the data will be available forever.
Data Management Plans
Many funding agencies now require a data management plan
Many resources exist for creating data management plans, e.g.
• IEDA Data management plan tool
• CDL (California Digital Library) tool
• MIT Data Management Plans page
Proposal Information
http://www.iedadata.org/compliance/plan
Data Acquisition / Processing Summary
Proposed Data Products
Data collection
Preparing data for sharing
Science 11 February 2011: Vol. 331 no. 6018 pp. 692-693
Preparing data for sharing
Data templates and standards exist
• Sample descriptions
• Analytical geochemical data
• Cruise data (dives, sensors)
Standards are discipline-specific and must be developed with community input
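As an illustration of what a data template might look like for a sediment experiment, here is a minimal sketch in Python. The field names, the example record, and the `missing_or_invalid` helper are all hypothetical and are not a SEN or IEDA standard; an actual template would have to come out of community discussion.

```python
import json

# Hypothetical required fields for an experiment metadata record.
# These names are assumptions for illustration, not a community standard.
REQUIRED_FIELDS = {
    "title": str,
    "creators": list,
    "facility": str,
    "start_date": str,   # e.g. "2014-08-01"
    "variables": list,   # each entry: {"name": ..., "units": ...}
}


def missing_or_invalid(record):
    """Return a sorted list of problems found in a metadata record."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type: {field}")
    # Every measured variable should state its units.
    variables = record.get("variables")
    if isinstance(variables, list):
        for var in variables:
            if not isinstance(var, dict) or "units" not in var:
                name = var.get("name", "?") if isinstance(var, dict) else "?"
                problems.append(f"variable without units: {name}")
    return sorted(problems)


# Made-up example record; the second variable deliberately lacks units.
example = {
    "title": "Delta growth run 1 (illustrative)",
    "creators": ["A. Researcher"],
    "facility": "Example flume laboratory",
    "start_date": "2014-08-01",
    "variables": [
        {"name": "water_discharge", "units": "L/s"},
        {"name": "bed_elevation"},
    ],
}

if __name__ == "__main__":
    print(missing_or_invalid(example))
    print(json.dumps(example, indent=2))
```

Running a check like this before sharing catches gaps, such as a variable recorded without units, while the person who ran the experiment can still fill them in.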
Depositing data
In the recent past, the most popular place to deposit datasets with long-term accountability was in journal appendices (Supplemental Material)
Formatting requirements inhibited data archiving (e.g. plain text, page size limits)
Depositing data
Now, there are many options for depositing datasets, e.g.
• Discipline-specific online repositories
• General, institutional repositories
• Self-publishing with e.g. FigShare
Data and software publication
• Data papers
• Dataset publication and peer review
• Persistent identifiers
• Software publication and licenses
• Altmetrics (e.g. impactstory.org)
SEN and the data lifecycle
The experimental life cycle parallels the data life cycle.
SEN activities are designed to help in each step.
S. Ahn
SEN Activities
• Workshops
• Training
• Tools
• News
• Experiments
• Discussion list
Future events
● Nov 2014: Utrecht workshop
● Dec 2014: Proposed AGU Town Hall – Sharing and publishing data in Earth Surface Process Science
● Fall 2015: Binghamton Geomorphology Symposium – Experimental Geomorphology: Sharing and Reusing Data
Tell SEN what you need
SEN Data Catalog, http://sedexp.net
Create user
or
sedimentexpsiesd2014
SEN Wiki, http://sedexp.net
SEN Sediment and Instrument Lists
Where have others bought sediment and instruments?
goo.gl/NUA5mS (Tabs 1 and 2)
SEN and CINERGI Resource Viewer
What websites, databases, and journals might help me?
goo.gl/Yp5Aud
SEN 2014 Challenge
http://goo.gl/YKTsNm
For a trip to an upcoming SEN workshop! (students at a U.S. institution)
Summary
1. What is SEN, why do we need it?
2. What is the data life cycle?
3. What are some tools that SEN provides to help me discover and manage data?
4. How can I earn a trip to a SEN workshop?
Links and References
SEN homepage: workspace.earthcube.org/sen
EarthCube Program: www.earthcube.org
Michener, William, James Brunt, John Helly, Thomas Kirchner, Susan Stafford. (1997). “Nongeospatial Metadata for the Ecological Sciences.” Ecological Applications, Vol. 7, No. 1, pp. 330–342.
http://www.icpsr.umich.edu/files/ICPSR/access/dataprep.pdf
Nature Articles
• http://www.nature.com/ngeo/journal/v4/n9/pdf/ngeo1259.pdf
• http://www.nature.com/ngeo/journal/v4/n9/pdf/ngeo1248.pdf
Science Special Data Issue: http://www.sciencemag.org/content/331/6018/692.short
Data Planning
• http://www.iedadata.org/compliance/plan
• https://dmp.cdlib.org/
• http://libraries.mit.edu/guides/subjects/data-management/plans.html
Contact: [email protected]