BEER Workshop November 9, 2008 Has Data Management Gone Mainstream? Presented at the BEER Workshop Coconut Grove (Miami), Florida November 9, 2008 Robert C. Groman


Has Data Management Gone Mainstream?       

Presented at the

BEER Workshop

Coconut Grove (Miami), Florida November 9, 2008

Robert C. Groman


Talk Overview

• Has data management gone mainstream?

• “Data” is a plural noun = facts, statistics, or items of information.

• Metadata = motherhood and apple pie

• Accessing data: Is a picture worth a thousand bytes?

• Data Interoperability


Purpose

• Raise level of awareness (and appreciation) for data management

• “Lighter and informative”

• Want to use some formulas

• Difference between an engineer and a mathematician


Venn Diagram: Data and Metadata

• D = all data and information necessary to use the data

• d = data (facts, statistics, or items of information)

• m = metadata

• Set theory: D ≠ m + d


Probability of having all the data and information necessary to reuse someone else's data

Second-order effects:

• Length of cruise

• Success of cruise

• Participants

• Immediate activity following the cruise


Theorems†

• Theorem 1: The probability that all the necessary data and information are collected and preserved to allow another researcher to properly use your data is inversely proportional to the time since the data were collected.

• Corollary: Unless data and information are collected and preserved during the experiment (cruise), subsequent researchers will have a difficult time using your data.

• Theorem 2: The longer the time since the data were collected the less likely the data will ever be considered “final”.

†Proofs are left to the reader as an exercise.
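In the same spirit as the theorems, here is a minimal sketch of Theorem 1 read literally. The slide gives no formula, so the constant `k`, the choice of years as the time unit, and the cap at 1 (a probability cannot exceed 1) are all assumptions made for illustration.

```python
def reuse_probability(years_since_collection: float, k: float = 1.0) -> float:
    """Theorem 1 taken literally: P is proportional to 1/t.

    `k` is a hypothetical proportionality constant. The result is capped
    at 1.0 so that it stays a valid probability for small t.
    """
    if years_since_collection <= 0:
        # At collection time everything needed is (in principle) still at hand.
        return 1.0
    return min(1.0, k / years_since_collection)


if __name__ == "__main__":
    for years in (0.5, 1, 2, 5, 10):
        p = reuse_probability(years)
        print(f"{years:>4} years after the cruise: P = {p:.2f}")
```

The corollary falls out of the shape of the curve: the only time the probability is near 1 is during and right after the cruise.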


Seeing Versus Using Someone’s Data

• Maybe you don’t want others to use your data. Hard to believe, but this does happen. For example:
– I’m not done publishing my papers based on my data
– My graduate student is almost done analyzing the data
– It’s not final yet (no, but they still may be useful)
– My dog ate it (no, I haven’t heard this one yet)

• Old policies and practices about data archiving

• New policies about data sharing, data publishing and data archiving
– Web accessible
– NSF mandate (It is for real this time.)
– The whole is greater than the sum of its parts


The more people use your data, the better they get.

• The Heisenberg Uncertainty Principle (HUP) does NOT seem to apply

• If Δx and Δp are the uncertainties in the measurements of position and momentum, then the product ΔxΔp is at least on the order of Planck's constant.

• When measuring conjugate quantities, the product of their standard deviations must be at least h / 4π

• Not to be confused with the observer effect (OE), which refers to changes that the act of observing makes to the phenomenon being observed.
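As a worked example of the bound ΔxΔp ≥ h / 4π, the sketch below computes the smallest momentum uncertainty allowed for a given position uncertainty. The function name and the 1e-10 m example value (roughly the size of an atom) are illustrative choices, not part of the talk.

```python
import math

# Planck's constant in joule-seconds (exact by the SI definition).
PLANCK_H = 6.62607015e-34


def min_momentum_uncertainty(delta_x_m: float) -> float:
    """Return the smallest delta-p (kg*m/s) allowed by dx * dp >= h / (4*pi)."""
    if delta_x_m <= 0:
        raise ValueError("position uncertainty must be positive")
    return PLANCK_H / (4 * math.pi * delta_x_m)


if __name__ == "__main__":
    delta_x = 1e-10  # metres, an illustrative position uncertainty
    delta_p = min_momentum_uncertainty(delta_x)
    print(f"dx = {delta_x:.1e} m  ->  dp >= {delta_p:.3e} kg*m/s")
```

Halving Δx doubles the minimum Δp, which is the trade-off the HUP describes. Data, happily, show no such trade-off when more people look at them.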


Biological and Chemical Oceanography Data Management Office

BCO-DMO

• NSF-funded three-year project to provide short- and medium-term data management, including web-based access, to all NSF-funded projects from their biological and chemical oceanographic programs

• Large NSF projects are expected to have their own data management offices

• Web site: http://www.bco-dmo.org/


Data Stewardship

• “a concern for creation and preservation of data and all intermediate phases - focuses … on the management of data over the long term” [Baker and Chandler, 2008];

• Data quality control;

• Treatment of all information as data fosters data re-use;

• Data that lack sufficient metadata have limited value beyond the research program for which they were collected; and

• Metadata should include sufficient information to support discovery, value assessment, and accurate re-use of the data.
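One way to make "sufficient metadata" concrete is a completeness check run before a data set is accepted. The sketch below is only an illustration: the field names, their grouping under discovery, value assessment and re-use, and the sample record are all hypothetical and do not describe any actual BCO-DMO schema.

```python
# Hypothetical required fields, grouped by the purpose they serve.
REQUIRED_FIELDS = {
    "discovery": ["title", "investigator", "cruise_id"],
    "value assessment": ["collection_dates", "quality_flags"],
    "re-use": ["units", "instrument", "processing_steps"],
}


def missing_metadata(record: dict) -> dict:
    """Return {purpose: [missing or empty fields]} for a metadata record."""
    gaps = {}
    for purpose, fields in REQUIRED_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[purpose] = missing
    return gaps


if __name__ == "__main__":
    # A made-up record that is easy to find but hard to re-use.
    record = {
        "title": "CTD casts",
        "investigator": "A. Researcher",
        "cruise_id": "CRUISE01",
        "units": {"depth": "m", "temperature": "degC"},
    }
    for purpose, fields in missing_metadata(record).items():
        print(f"Cannot support {purpose}: missing {', '.join(fields)}")
```

Running the check during the cruise, while the people who know the answers are still on board, is the point of the corollary to Theorem 1.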


MapServer interface and interoperability enhancements

• Provides access to geo-referenced scientific data and metadata

• Presents distributed data sets in a unified way

• Uses MapServer as the visualization application

• Visualize data with graphics generated on-the-fly

• Request custom subsets of measurements in a variety of file formats

• Compare data from different sources


Interoperability

• Ability to get someone else's data and use it on your system. (How easy is this really?)

• True interoperability: get someone else's data and use it directly in your application. Do the units match? Do the data acquisition and processing steps match yours, or are the differences (including instrumentation differences) accounted for?
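The "do the units match" question is the easiest part to automate. The following sketch normalizes depth and temperature from two sources before they are combined; the record layout (`depth`, `depth_unit`, `temp`, `temp_unit`) and the unit labels are assumptions for illustration. It says nothing about acquisition, processing or instrumentation differences, which still need metadata and human judgment.

```python
# Conversion factors to metres (exact definitions).
TO_METERS = {"m": 1.0, "ft": 0.3048, "fathom": 1.8288}


def depth_in_meters(value: float, unit: str) -> float:
    """Convert a depth to metres, refusing units we do not recognize."""
    if unit not in TO_METERS:
        raise ValueError(f"unknown depth unit: {unit!r}")
    return value * TO_METERS[unit]


def temperature_in_celsius(value: float, unit: str) -> float:
    """Convert a temperature to degrees Celsius."""
    if unit == "degC":
        return value
    if unit == "degF":
        return (value - 32.0) * 5.0 / 9.0
    raise ValueError(f"unknown temperature unit: {unit!r}")


def harmonize(records: list) -> list:
    """Return records with depth in metres and temperature in Celsius."""
    return [
        {
            "depth_m": depth_in_meters(r["depth"], r["depth_unit"]),
            "temp_c": temperature_in_celsius(r["temp"], r["temp_unit"]),
        }
        for r in records
    ]


if __name__ == "__main__":
    # Made-up values from two hypothetical sources with different conventions.
    mine = [{"depth": 10.0, "depth_unit": "m", "temp": 12.5, "temp_unit": "degC"}]
    theirs = [{"depth": 100.0, "depth_unit": "ft", "temp": 50.0, "temp_unit": "degF"}]
    for row in harmonize(mine + theirs):
        print(row)
```

Raising an error on an unrecognized unit, rather than guessing, is deliberate: a silent wrong conversion is worse than a refused merge.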


JGOFS/GLOBEC Data Management System


http://globec.whoi.edu/map



Cruise Tracks


Select 5 Cruises


Click on “Show Data” Button


Select CD data in EN307


Shows stations and optional grid lines


EN307 graph it options


Depth versus salinity and versus temperature


Select another cruise: AL9906


Select MOC1 data set


Map it options for abundances


Interoperability features (for free)


MapServer Supports Interoperability Features

• Open Geospatial Consortium standards
– Web Map Service (WMS): “Show me the data”
– Web Feature Service (WFS): “Get me the data”

• Retains the functionality of the JGOFS/GLOBEC Data Management System
– Download data as ASCII, CSV, MATLAB, NetCDF
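To show what "show me the data" and "get me the data" look like on the wire, here is a minimal sketch that builds a WMS GetMap request and a WFS GetFeature request using the standard OGC key-value parameters. The endpoint URL, the layer names and the bounding box are hypothetical; a real MapServer installation will publish its own layer names and may need extra parameters (for example a `map=` entry naming the mapfile).

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the server's real service URL.
BASE_URL = "https://example.org/cgi-bin/mapserv"


def wms_getmap_url(layer: str, bbox: tuple, width: int = 600, height: int = 400) -> str:
    """Build a WMS 1.1.1 GetMap URL ("show me the data" as an image).

    bbox is (min_lon, min_lat, max_lon, max_lat) in EPSG:4326.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{BASE_URL}?{urlencode(params)}"


def wfs_getfeature_url(type_name: str) -> str:
    """Build a WFS 1.0.0 GetFeature URL ("get me the data" as features)."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "1.0.0",
        "REQUEST": "GetFeature",
        "TYPENAME": type_name,
    }
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Layer names and the bounding box are made up for illustration.
    print(wms_getmap_url("cruise_tracks", (-72.0, 38.0, -64.0, 44.0)))
    print(wfs_getfeature_url("stations"))
```

Because both requests are plain URLs, any client that speaks these standards can use the same server, which is why the slide calls the interoperability "for free".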


Related Activities

• MMI: Marine Metadata Interoperability
– “Promoting the exchange, integration and use of marine data through enhanced data publishing, discovery, documentation and accessibility.”

• UNOLS Subcommittee to Report on Best Practices for the Collection of Data and Metadata at Sea to Promote Public Dissemination
– Too new to even have its own web site

• The Working Group on Zooplankton Ecology (WGZE), with guidance from the Working Group on Marine Data Management (WGMDM), is providing these general metadata guidelines for plankton data collected and submitted to ICES. (2003)

• Sensor Interoperability Metadata Workshop (2006)

• ICES ASC 2006 and 2008 theme sessions on data management, data sharing and related topics

• NOAA Coastal Services Center Data Transport Laboratory (DTL)
– Integrated Ocean Observing System (IOOS)
– Ocean.US data management and communications (DMAC) strategy

• Gulf of Maine Ocean Data Partnership

• Many, many more ….


Metadata Schema

The print size is small to protect the innocent and guilty.


What is the difference between an engineer and a mathematician?


References

• Karen S. Baker and Cynthia L. Chandler, Enabling long-term oceanographic research: Changing data practices, information management strategies and informatics, Deep-Sea Research II, 55 (2008), 2132-2142.