MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Upload: kerstin-lehnert

Post on 15-Jul-2015

Page 1: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

www.iedadata.org

LPSC 2015 Workshop: Restoration and Synthesis of Planetary Geochemical Data

Page 2: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Agenda

• Introduction
• Overview of the MoonDB Project
• Overview of relevant EarthChem Systems & Services
  • Data Publication & the EarthChem Library
  • Data Rescue
  • Data Synthesis - PetDB
• Discussion, questions
  • What data do you have?
  • What help do you need?
  • How should we stay in touch?

Page 3: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 4: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

MoonDB’s Goals

• advance preservation, access and utility of lunar sample data
• help investigators ‘rescue’ (restore) and share unpublished data
• compile data from the literature into a PetDB-type synthesis
• provide a platform for future data to be made openly accessible, while being seamlessly integrated with the historical data and equivalent data for terrestrial samples

Page 5: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

MoonDB’s Deliverables

• Development of MoonDB
  • compile lunar sample data from the literature into an online searchable synthesis database
• Rescue of Lunar Sample Data at Risk
  • rescue unpublished legacy data and metadata that are in danger of being lost to complement the published datasets
• MoonDB Reference Catalog
  • consolidate the various reference databases

[Slide labels: PetDB; IEDA Data Rescue; EarthChem Library; AACO Databases]

Page 6: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Timeline

Quarters Q1 to Q8, running from 10/1/2015 to 9/30/2017. Milestones:

• Reference Catalog prototype released
• MoonDB User Interface Beta version
• Ingest first submitted datasets
• Release of MoonDB Reference Catalog
• Release of MoonDB full version
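The eight quarters on this timeline can be laid out with simple date arithmetic. The sketch below assumes that Q1 starts on 10/1/2015 and that each quarter spans three calendar months; the helper name `quarter_bounds` is made up for this illustration, and the slide does not say which milestone falls in which quarter.

```python
from datetime import date, timedelta

def quarter_bounds(start: date, n_quarters: int):
    """Return (label, first_day, last_day) for consecutive 3-month quarters.

    Assumes the schedule starts on the first day of start's month.
    """
    bounds = []
    year, month = start.year, start.month
    for q in range(1, n_quarters + 1):
        first = date(year, month, 1)
        # Advance three months, rolling over the year when needed.
        month += 3
        if month > 12:
            month -= 12
            year += 1
        # The quarter ends the day before the next quarter begins.
        last = date(year, month, 1) - timedelta(days=1)
        bounds.append((f"Q{q}", first, last))
    return bounds

schedule = quarter_bounds(date(2015, 10, 1), 8)
for label, first, last in schedule:
    print(label, first.isoformat(), last.isoformat())
```

Under these assumptions the eighth quarter ends on 2017-09-30.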

Page 7: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Two Steps to Make Data Useful

1. Restore the data

2. Synthesize the data

Page 8: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Restoration of “Data at Risk”

• “Data at Risk” are scientific data that are not in formats that permit full electronic access to the information they contain.
• Data at Risk may be
  • non-digital (e.g., handwritten or photographic),
  • on near-obsolete digital media (such as floppy disks),
  • or insufficiently described (lacking metadata).

• Some born-digital data are considered "at risk" if they cannot be ingested into managed databases because they lack adequate formatting or metadata.

Definition from the ICSU CODATA Data at Risk Task Group (DARTG)
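The criteria in this definition can be read as a simple checklist. A minimal sketch in Python, assuming a hypothetical `DatasetRecord` description of a legacy dataset; the field names and the list of near-obsolete media are illustrative and not part of the DARTG definition.

```python
from dataclasses import dataclass
from typing import List

# Assumed examples of near-obsolete media; not an official list.
NEAR_OBSOLETE_MEDIA = {"floppy disk", "magnetic tape", "zip disk"}

@dataclass
class DatasetRecord:
    """Hypothetical description of a legacy dataset (illustrative fields)."""
    name: str
    is_digital: bool
    medium: str        # e.g. "paper", "floppy disk", "hard drive"
    has_metadata: bool
    ingestible: bool   # can a managed database ingest it as-is?

def risk_reasons(rec: DatasetRecord) -> List[str]:
    """Return the reasons, if any, why a dataset counts as 'at risk'."""
    reasons = []
    if not rec.is_digital:
        reasons.append("non-digital")
    if rec.medium.lower() in NEAR_OBSOLETE_MEDIA:
        reasons.append("near-obsolete medium")
    if not rec.has_metadata:
        reasons.append("insufficiently described")
    if rec.is_digital and not rec.ingestible:
        reasons.append("born-digital but not ingestible")
    return reasons

notebook = DatasetRecord("lab notebook", False, "paper", False, False)
disk = DatasetRecord("old spreadsheet", True, "floppy disk", True, True)
print(risk_reasons(notebook))  # ['non-digital', 'insufficiently described']
print(risk_reasons(disk))      # ['near-obsolete medium']
```

An empty list means none of the listed risk criteria apply to the record.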

Page 9: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Rescue Initiatives

• ‘Data at Risk’ Task Group @ CODATA
  • Phase I: Build inventory of data that are at risk
  • Phase II: Design missions to rescue that information
• ‘Heritage Data’ Interest Group @ Research Data Alliance
• International Data Rescue Award in the Geosciences
  • Joint initiative of IEDA (Integrated Earth Data Applications) and Elsevier
  • First award in 2013
  • 2015 award to be announced at EGU 2015
• IEDA Data Rescue Mini-Awards

Page 10: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

IEDA Data Rescue

Page 11: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

IEDA Data Rescue

Page 12: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Lessons Learned

• Investigators' Lessons
  • Take ownership of your own legacy
  • Data curation by others may not be complete or correct
  • Data rescue of an entire career does not need to be overwhelming
    • Start with small steps
    • Disciplinary repositories will help and guide you to what is needed
  • Despite the time investment, data rescue is worth it
    • Others will now be able to re-use the data
    • Notes taken years ago actually explain anomalies
• Repository Lessons
  • For Long Tail Data, every project is different
  • A small incentive will motivate investigators
  • Data Rescue missions help the repository determine next steps for development of tools and services

Page 13: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 14: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

EarthChem Data Systems

[Diagram. Labels: Data & Metadata; [.xls]; [XML]; EarthChem Library; Synthesis DB’s; EarthChem Portal; Search. Panels: Data Restoration; Data Synthesis.]

Page 15: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

EarthChem: Making Data Useful for Science

• Map of basalt samples from mid-ocean ridges
• Color scaled to the ⁸⁷Sr/⁸⁶Sr ratio measured on these samples
• Data from >300 references compiled within 2 minutes (visualization with GeoMapApp: add another 2 minutes)

Page 16: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Synthesis Database: PetDB

as of 3/15/2015

Page 17: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

PetDB: Impact in Science

Meyzen et al. (2007): “Isotopic portrayal of the Earth's upper mantle flow field.”

Gale, A; Dalton, C A; Langmuir, C H; Su, Y; Schilling, J-G (2013): “The mean composition of ocean ridge basalts”

As of 3/2015, PetDB has been cited in >550 published articles.

Page 18: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Search: Geospatial

Page 19: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Search: Geo-Feature

Page 20: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Search: Lithology

Page 21: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Search: Expedition/cruise

Page 22: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Search: Data Availability

Page 23: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 24: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Filter by Data Quality

Page 25: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

EarthChem Library

• Data repository for geochemical data and related data types
• Operated as part of IEDA (Interdisciplinary Earth Data Alliance)
  • Sustainable funding through a Cooperative Agreement with NSF
  • Community governance & guidance
• Follows Leading Practices for data publication
  • Persistent identification of data & samples (DOI, IGSN)
  • Agreements with data centers for long-term archiving
  • Easy data submission
  • Release dates set by contributors (moratorium of up to 2 years)
  • Links between different versions of a dataset
  • Cross-referencing with publishers, data citation index, etc.
  • Links to awards (compliance!)

Page 26: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 27: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Provenance & Quality

Page 28: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 29: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Summary of a Sample

Page 30: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Getting the data for the synthesis

• Many challenges
  • Data are dispersed throughout the literature.
  • Many data were never published.
  • Data are not sufficiently documented (inconsistent or missing metadata).
• Solutions
  • Data rescue
  • Data managers
  • Students/interns

Page 31: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Restoration Needs

• Digitization – transcribe from analog media into spreadsheets; help from students?

• Documentation – samples, provenance (lab, instrument, etc.), data quality; EarthChem data templates

• Standardization – MoonDB vocabularies & data templates

• Accessibility – data publication (ECL), links between systems

• Citability – DOIs, example citations

• Guidance/Training – calls and emails with disciplinary repository staff, regular communication (meetings, webinars)
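The Documentation and Standardization items amount to checking each transcribed row against a template. A rough sketch, assuming a made-up set of required columns (the actual EarthChem and MoonDB templates are not reproduced in this transcript and may differ) and made-up sample values:

```python
import csv
import io

# Hypothetical required columns; the real templates may use other names.
REQUIRED_COLUMNS = ["sample_name", "analyte", "value", "unit", "method", "laboratory"]

def check_rows(csv_text: str):
    """Return a list of (row_number, missing_fields) for incomplete rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        missing = [c for c in REQUIRED_COLUMNS if not (row.get(c) or "").strip()]
        if missing:
            problems.append((row_number, missing))
    return problems

# Invented rows for illustration only; not real analyses.
sample = """sample_name,analyte,value,unit,method,laboratory
SAMPLE-A,SiO2,45.1,wt%,XRF,Lab A
SAMPLE-B,MgO,9.8,,,
"""
problems = check_rows(sample)
print(problems)  # [(3, ['unit', 'method', 'laboratory'])]
```

A report like this gives a student or data manager a concrete list of gaps to take back to the investigator before the data are submitted.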

Page 32: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Templates

Page 33: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Templates

Page 34: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

IEDA Data Rescue Initiative

• preserve valuable legacy data sets that are in danger because of impending retirement or degradation

• augment data collections maintained by IEDA

• improve procedures and tools for user contributions

• 2013 & 2015 International Data Rescue Award in the Geosciences

• Town Hall at EGU General Assembly 2015

• IEDA Data Rescue Mini-Awards

• Data Rescue Process Study (collaboration with Elsevier Research Data Services)

Page 35: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

IEDA Data Rescue Mini-awards

• $7,000 awards to support investigators for properly compiling, documenting, and transferring data that are in danger

• Open competition, announced across the Earth Sciences
  • Proposals evaluated by IEDA User Committee
  • Criteria: highest impact on future research based on quality, size, rarity, unique location or data type
  • Requirement: Data need to be made accessible to the community for re-use by inclusion in IEDA data collections (EarthChem, MGDS, SESAR)

Page 36: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

EarthChem Library

• Mechanism for easy data submission

• Review of contributed data by data managers – QA/QC

• Data become citable – credit for contributors!

• Mechanism for making data persistently discoverable & accessible

• Long-term archiving at PDS

Page 37: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

ECL Data Submission

Page 38: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Discovery & Access

Page 39: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 40: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 41: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 42: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Journals

Page 43: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Data Publication Example

Page 44: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 45: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 46: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

The Future of Data in Scientific Articles

http://www.copdess.org

Page 47: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Commitment of Publishers

“Earth and space science data should, to the greatest extent possible, be stored in appropriate domain repositories that are widely recognized and used by the community, follow leading practices, and can provide additional data services.”

Page 48: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

COPDESS Signatories

Publishers

• American Astronomical Society
• American Geophysical Union
• American Meteorological Society
• Center for Open Science
• Elsevier
• European Geosciences Union
• Geochemical Society
• ICSU World Data System
• John Wiley and Sons
• Meteoritical Society
• Mineralogical Society of America
• Nature Publishing Group
• Paleontological Society
• Proceedings of the National Academy of Sciences
• Science

Data Facilities

• BCO-DMO
• CLIVAR and Carbon Hydrographic Data Office (CCHDO)
• CINERGI
• CUAHSI
• Continental Scientific Drilling Coordination Office (CSDCO)
• Council of Data Facilities
• Geological Data Center of Scripps Institution of Oceanography
• IRIS
• IEDA
• LacCore: National Lacustrine Core Facility
• Magnetics Information Consortium (MagIC)
• Neotoma Paleoecology Database
• National Snow and Ice Data Center
• OpenTopography
• Rolling Deck to Repository (R2R) Program
• UNAVCO

Page 49: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

MoonDB: The Data Restoration Team

• Richard Carlson, Carnegie Institution of Washington

• Erik Hauri, Carnegie Institution of Washington

• Bradley L. Jolliff, Washington University in St. Louis

• Clive Neal, University of Notre Dame

• Marc Norman, Australian National University

• Larry Nyquist, NASA JSC

• Charles Shearer, University of New Mexico

• Chi-Yu Shih, NASA JSC

• Lawrence A. Taylor, University of Tennessee in Knoxville

• G. Jeffrey Taylor, University of Hawaii

• Paul Warren, University of California Los Angeles

Join!

Page 50: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 51: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Links to Literature

Page 52: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

International Geo Sample Number (IGSN)

• A globally unique and persistent identifier for physical objects in the Earth Sciences that is guaranteed to be unique via a centralized control mechanism.

• Resolves to virtual sample representations (sample metadata profiles) managed at federated IGSN Allocating Agents.

The EarthChem Portal shows 75 publications with geochemical data referenced to a sample with the name M1 (or M-1). The map shows the locations of M1 samples. (www.earthchem.org)

Page 53: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

IGSN

Unique, persistent and resolvable identifier for sampling features

Governed by an international non-profit organization, IGSN e.V.

Objective: ensure proper citation of samples and persistent access to sample metadata

Page 54: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

IGSN Metadata

• Identification
  • Sample name(s), registrant
• Description
  • Material, classification, age, size, comments
• Geospatial information
  • Geographical names, coordinates
• Collection
  • Expedition/cruise, platform, date, collector, technique
• Archiving/access
  • Physical location of sample (repository), contact
• Relationship to other (sub-)samples
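These metadata groups map naturally onto a record type. The sketch below is one possible Python representation, not the actual IGSN or SESAR schema: the class name `SampleProfile`, all field names, the helper `lineage`, and the `DEMO...` identifiers are invented for illustration. The `parent_igsn` field stands in for the relationship to other (sub-)samples.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class SampleProfile:
    # Identification
    igsn: str
    sample_names: List[str]
    registrant: str
    # Description
    material: Optional[str] = None
    classification: Optional[str] = None
    age: Optional[str] = None
    size: Optional[str] = None
    comments: Optional[str] = None
    # Geospatial information
    geographic_names: List[str] = field(default_factory=list)
    coordinates: Optional[Tuple[float, float]] = None  # (latitude, longitude)
    # Collection
    expedition: Optional[str] = None
    platform: Optional[str] = None
    collection_date: Optional[str] = None
    collector: Optional[str] = None
    technique: Optional[str] = None
    # Archiving/access
    repository: Optional[str] = None
    contact: Optional[str] = None
    # Relationship to other (sub-)samples
    parent_igsn: Optional[str] = None

def lineage(igsn: str, profiles: Dict[str, SampleProfile]) -> List[str]:
    """Walk from a sub-sample up to its root parent, guarding against cycles."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = igsn
    while current is not None and current in profiles and current not in seen:
        chain.append(current)
        seen.add(current)
        current = profiles[current].parent_igsn
    return chain

profiles = {
    p.igsn: p
    for p in [
        SampleProfile("DEMO00001", ["rock A"], "registrant X"),
        SampleProfile("DEMO00002", ["rock A, chip 1"], "registrant X",
                      parent_igsn="DEMO00001"),
        SampleProfile("DEMO00003", ["rock A, chip 1, powder"], "registrant X",
                      parent_igsn="DEMO00002"),
    ]
}
print(lineage("DEMO00003", profiles))  # ['DEMO00003', 'DEMO00002', 'DEMO00001']
```

Keeping only a parent pointer on each record is enough to reconstruct the chain from a powder or chip back to the original sample.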

Page 55: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Sample Genealogy

Page 56: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

SESAR Services: Links to Relevant Resources

• Images

• Documents (.pdf, .xls, .doc)

• Publications

• Data

Page 57: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Page 58: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Linking Samples, Data, & Publications

Page 59: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

SESAR: System for Earth Sample Registration

• Registry for the International Geo Sample Number (IGSN)
• Catalog of sample metadata
  • Search for samples
  • Access to object metadata profiles
• Tools for sample registration and metadata management: MySESAR
  • User interface to submit sample metadata (registration)
  • User interface for metadata management (e.g., edit metadata, transfer ownership, etc.)

Page 60: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

IGSN: Registration

[Diagram. Labels: Lab or field; Allocating Agent; IGSN e.V. Registry; Sample Label. Metadata shown: Sample Name, Location, Sample type, …. Identifier shown: IGSN:XYZ08H7JG]

1. Submit metadata
2. Create IGSN, store metadata
3. Register IGSN
4. Confirm uniqueness
5. Send to user
6. Use IGSN
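The registration steps can be mimicked with two small classes. This is a toy model under stated assumptions, not the IGSN e.V. service or its API: the `Registry` and `AllocatingAgent` classes, the `DEMO` prefix, and the sequential numbering scheme are all invented here. For simplicity the agent stores the metadata once the registry has confirmed the identifier.

```python
import itertools

class Registry:
    """Stand-in for the central registry: it only tracks which IGSNs exist."""
    def __init__(self):
        self._registered = set()

    def register(self, igsn: str) -> bool:
        # Step 4: confirm uniqueness before accepting the identifier.
        if igsn in self._registered:
            return False
        self._registered.add(igsn)
        return True

class AllocatingAgent:
    def __init__(self, prefix: str, registry: Registry):
        self.prefix = prefix
        self.registry = registry
        self.metadata = {}
        self._counter = itertools.count(1)

    def submit(self, sample_metadata: dict) -> str:
        # Step 1: the investigator submits sample metadata to the agent.
        while True:
            # Step 2: create a candidate IGSN (invented numbering scheme).
            igsn = f"{self.prefix}{next(self._counter):06d}"
            # Steps 3-4: register it; retry if the registry reports a clash.
            if self.registry.register(igsn):
                break
        self.metadata[igsn] = dict(sample_metadata)  # store the profile
        return igsn  # Step 5: send the IGSN back to the user

registry = Registry()
agent_a = AllocatingAgent("DEMO", registry)
agent_b = AllocatingAgent("DEMO", registry)  # same prefix, to force a clash
first = agent_a.submit({"sample_name": "rock A", "sample_type": "rock"})
second = agent_b.submit({"sample_name": "rock B", "sample_type": "rock"})
print(first, second)  # DEMO000001 DEMO000002
```

Step 6, using the IGSN, is whatever the investigator then does with the returned string, for example printing it on the sample label. The second agent's first candidate collides with an identifier that already exists, so the central check is what keeps the two agents from issuing the same IGSN.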

Page 61: MoonDB: Restoration & Synthesis of Planetary Geochemical Data

Governance: IGSN e.V.

• Non-profit organization registered in Germany (“eingetragener Verein”) to operate an IGSN registration service with a distributed infrastructure for use by and benefit of its members

• Currently 14 members from US, Germany, Australia

• By-laws modeled after the DataCite Consortium

• Membership required for organizations that want to set up their own Allocating Agent.

• Membership NOT required to use IGSNs.
