Data Management in the U.S. GLOBEC Program

SCOR/IGBP Meeting on Data Management for Marine Research Projects. Robert C. Groman, Woods Hole Oceanographic Institution. 8 – 10 December 2003.


TRANSCRIPT

Page 1: Data Management in the  U.S. GLOBEC Program

Data Management in the U.S. GLOBEC Program

SCOR/IGBP Meeting on Data Management for Marine Research Projects

Robert C. Groman

Woods Hole Oceanographic Institution

8 – 10 December 2003

Click here for PowerPoint version.

Page 2: Data Management in the  U.S. GLOBEC Program

U.S. GLOBEC: Goal

• To understand population dynamics. Ultimately, the aim is to predict changes in the distribution and abundance of key species resulting from changes in the physical and biotic environment, such as those caused by climate change.

Page 3: Data Management in the  U.S. GLOBEC Program

Three U.S. Programs

• Georges Bank – field program started in 1995 with some cruises earlier; field program ended in 1999.

• Northeast Pacific (including the Gulf of Alaska) – field program started in 2000.

• Southern Ocean – field program started in 2001 and ended in 2003.

Page 4: Data Management in the  U.S. GLOBEC Program

Georges Bank Study Area

Page 5: Data Management in the  U.S. GLOBEC Program

Northeast Pacific Study Area

Page 6: Data Management in the  U.S. GLOBEC Program

Southern Ocean Study Area

Page 7: Data Management in the  U.S. GLOBEC Program

Project Components

• Field program: Georges Bank project completed 120 cruises with 360 days at sea.

• Laboratory experiments

• Retrospective studies

• Analysis and synthesis

Page 8: Data Management in the  U.S. GLOBEC Program

Georges Bank Data

120 Cruises in inventory

118 Cruise reports printed

56 Cruise reports on-line

Data objects on-line: See web site

Missing data: Zooplankton counts, VPR data, Acoustics data, Mooring data

Page 9: Data Management in the  U.S. GLOBEC Program

Northeast Pacific/CGOA

Coastal Gulf of Alaska (CGOA)

35 Cruise reports on-line; 16 will be on-line soon; 36 to come later

35 Event logs on-line

11 from LTOP cruises [29 missing]

9 from Haldorson Trawling cruises [8 missing]

10 from Process/Survey cruises [5 missing]

5 from SECM cruises (via on-line report) [10 missing]

Page 10: Data Management in the  U.S. GLOBEC Program

Northeast Pacific/CCS

California Current System (CCS)

46 Cruise reports on-line; 2 more in the works

47 Event logs on-line

36 from CCS - LTOP cruises (via on-line report)

11 from CCS – Process/Other cruises

Page 11: Data Management in the  U.S. GLOBEC Program

Northeast Pacific Summary

Available on-line data include CTD, SST, alongtrack, bottle, SeaSoar, nutrients, pigments, and event logs.

Data soon to be added include CTD, nutrients, and zooplankton.

Page 12: Data Management in the  U.S. GLOBEC Program

Southern Ocean

11 Cruises in inventory

11 Cruise reports on-line

11 Event logs on-line

110 data objects on-line, locally or linked remotely, including:

• Ice core and water column bacteria studies

• Bird studies

• BIOMAPERII Chlorophyll, irradiance and productivity studies

• MOCNESS CTD data

• Nutrient data

• Sea ice data

• Whale sonobuoy data

• 120 kHz acoustic backscattering data

• ADCP data

• Alongtrack data

• Seal tracking and biology

• Bathymetry

Page 13: Data Management in the  U.S. GLOBEC Program

Southern Ocean Data in the Works

• IWC Whale data

• CTD rosette data

• MOC1/MOC10/net collection data

• XCTD/XBT data

• Penguin studies (exists on SO GLOBEC website)

• ROV data

• Mooring data

Page 14: Data Management in the  U.S. GLOBEC Program

Data Policy

• Dissemination of data to scientific investigators and others on a timely basis

• Make available when useful (not necessarily only when finalized)

• Serve data and information, such as reports, papers, and other program documentation

Page 15: Data Management in the  U.S. GLOBEC Program

Data Characteristics and Distribution Approach

• Data from many, distributed, researchers (greater than 100 contributors)

• Open access – read only by everyone

• Restricted access supported, but rarely used

• Quality control is contributors’ responsibility and on-going

• Emphasis on access to data and information as early as possible

• Data sets most useful when used with other data

Page 16: Data Management in the  U.S. GLOBEC Program

Data Acknowledgement Policy

• Any person making substantial use of a data set must communicate with the investigator(s) who acquired the data prior to publication and anticipate that the data collector(s) will be co-author(s) of published results. This extends to model results and to data organized for retrospective studies.

• See on-line policy statement

Page 17: Data Management in the  U.S. GLOBEC Program

Data are accessed from the U.S. GLOBEC Data Server, http://globec.whoi.edu.

Page 18: Data Management in the  U.S. GLOBEC Program

Data can be viewed

Page 19: Data Management in the  U.S. GLOBEC Program

Or plotted

Page 20: Data Management in the  U.S. GLOBEC Program

Track plot of NBP0202

Page 21: Data Management in the  U.S. GLOBEC Program

Or downloaded

Page 23: Data Management in the  U.S. GLOBEC Program

Instruments

• CTDs

• Rosette

• MOCNESS (3 flavors)

• Bongo tows

• Acoustic biomass measurements

• Video Plankton Recorder

• Drifters, MET packages, . . .

Page 24: Data Management in the  U.S. GLOBEC Program

Sensors and Computed Parameters

• Conductivity, temperature, pressure, fluorescence, transmittance, acoustics, light (PAR), video, wind speed/direction, AVHRR, . . .

• Biomass, taxonomic composition/size distribution, species (counts, size, stage, status, rates, behavior), density, currents, stratification, heat flux, nutrients, turbulence, chlorophyll, . . .

Page 25: Data Management in the  U.S. GLOBEC Program

Data Access

• Using the JGOFS Data Management System developed by G. Flierl, J. Bishop, D. Glover, and S. Paranjpe

• Distributed access via standard web browsers

Page 27: Data Management in the  U.S. GLOBEC Program

Distributed Data

• Ten distributed data servers use the US JGOFS software

• Uses the Web's HTTP protocol – integrates very well with standard web pages

• Handles tabular data in ASCII, Matlab, and user-supplied formats by means of “methods” (format-specific access routines). It is object oriented and data driven.
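A minimal sketch of what a client of such a server might do with a flat ASCII listing retrieved over HTTP. Everything specific here is an assumption for illustration: the comma delimiter, the column names, the "nd" missing-value marker, and the helper names `parse_table` and `fetch_table` are hypothetical and are not taken from the JGOFS software.

```python
import csv
import io
from urllib.request import urlopen

# Hypothetical sample listing; cruise ID and column names are made up.
SAMPLE = """cruise,station,depth_m,temp_c
EX0001,1,5,8.21
EX0001,1,20,7.95
EX0001,2,5,nd
"""


def parse_table(text, missing="nd"):
    """Parse comma-delimited ASCII text into a list of dicts.

    Values equal to the (assumed) missing marker become None,
    numeric strings become floats, everything else stays a string.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        clean = {}
        for key, value in record.items():
            value = value.strip()
            if value == missing:
                clean[key] = None
            else:
                try:
                    clean[key] = float(value)
                except ValueError:
                    clean[key] = value
        rows.append(clean)
    return rows


def fetch_table(url):
    """Fetch a listing over HTTP and parse it; the caller supplies the URL."""
    with urlopen(url) as response:
        return parse_table(response.read().decode("utf-8"))


rows = parse_table(SAMPLE)
print(len(rows), rows[0]["temp_c"], rows[2]["temp_c"])
```

Keeping the parser separate from the HTTP fetch means the same routine can be applied to a downloaded file or to a live response.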

Page 28: Data Management in the  U.S. GLOBEC Program

Nostalgia

In the Olden Days …

• Reformatting and processing data was a common activity

• Merging navigation with measured and computed results also took time

• First data management system used 9 track tapes for data storage, run in batch

• Second system used data on disk with techniques to locate data within degree squares to improve performance
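The degree-square idea can be sketched as follows. This is only an illustration of the general technique, not a reconstruction of the original system: the record layout, the function names, and the sample positions are all hypothetical, and the sketch ignores queries that cross the date line.

```python
import math
from collections import defaultdict


def degree_square(lat, lon):
    """Return the one-degree square (south-west corner) containing a position."""
    return (math.floor(lat), math.floor(lon))


def build_index(records):
    """Group records by the degree square they fall in."""
    index = defaultdict(list)
    for rec in records:
        index[degree_square(rec["lat"], rec["lon"])].append(rec)
    return index


def query(index, lat_min, lat_max, lon_min, lon_max):
    """Return records inside a bounding box, visiting only the squares it overlaps."""
    hits = []
    for lat_sq in range(math.floor(lat_min), math.floor(lat_max) + 1):
        for lon_sq in range(math.floor(lon_min), math.floor(lon_max) + 1):
            for rec in index.get((lat_sq, lon_sq), []):
                if lat_min <= rec["lat"] <= lat_max and lon_min <= rec["lon"] <= lon_max:
                    hits.append(rec)
    return hits


# Made-up positions for illustration only.
records = [
    {"id": "a", "lat": 41.2, "lon": -67.5},
    {"id": "b", "lat": 41.9, "lon": -66.1},
    {"id": "c", "lat": -65.3, "lon": -68.8},
]
index = build_index(records)
found = [rec["id"] for rec in query(index, 41.0, 42.0, -68.0, -66.0)]
print(found)
```

The gain comes from skipping every square outside the requested box instead of scanning all records.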

Page 29: Data Management in the  U.S. GLOBEC Program

Meta-data

• Data about data.

• Document information about data elements or attributes (name, size, data type, etc.), about records or data structures (length, fields, columns, etc.), and about data (where it is located, ownership, etc.). Meta-data may include descriptive information about the context, quality and condition, or characteristics of the data.
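As a rough illustration of the three levels listed above (data elements, record structure, and the data set itself), a meta-data record could be modelled like this. The class and attribute names are hypothetical and do not correspond to any particular standard such as DIF.

```python
from dataclasses import dataclass, field


@dataclass
class FieldInfo:
    """Meta-data about one data element (column)."""
    name: str
    data_type: str
    units: str
    description: str = ""


@dataclass
class DatasetMetadata:
    """Meta-data about a data set as a whole."""
    title: str
    owner: str
    location: str
    fields: list = field(default_factory=list)
    quality_notes: str = ""

    def missing_items(self):
        """List the entries a contributor still needs to fill in."""
        problems = []
        for attr in ("title", "owner", "location"):
            if not getattr(self, attr):
                problems.append(attr)
        for info in self.fields:
            if not info.units:
                problems.append(f"units for {info.name}")
        return problems


# Hypothetical example record; none of these values come from the program.
meta = DatasetMetadata(
    title="Example CTD profiles",
    owner="",
    location="example-server/ctd",
    fields=[
        FieldInfo("temp", "float", "degC"),
        FieldInfo("depth", "float", ""),
    ],
)
print(meta.missing_items())
```

A completeness check of this kind is one small way tools can reduce the burden of preparing meta-data.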

Page 30: Data Management in the  U.S. GLOBEC Program

Detailed Meta-data

• Pros – required for full understanding of data within a database management system. Required if others want to use the data

• Cons – pain in the neck to prepare, maintain, and enter (Best to take advantage of tools)

• Currently completing Global Change Master Directory’s DIF records

Page 31: Data Management in the  U.S. GLOBEC Program

What’s Happening

• Organizations creating systems to access their own meta-data and/or data.

• Umbrella databases linking to other people’s meta-data and/or data. (OBIS, GMBIS, …)

• Linking to meta-data is more manageable than is linking to other people’s data.

Page 32: Data Management in the  U.S. GLOBEC Program

Other Efforts

• LabNet – consortium of marine organizations to make their data available (uses 4D Geobrowser “index cards”)

• Ocean Data View - access WOCE, NGDC, and other data sets. CTD, bottle, XBT …

• OBIS – “portal” (aggregation server) for biological data (using Darwin Core 2 – OBIS)

Page 33: Data Management in the  U.S. GLOBEC Program

Other Efforts, continued

• ZOPE – object oriented application server

• LAS – web-based, active-image based data interface for registered data. Used by US JGOFS Program

• uBio – (Universal Biological Indexer and Organizer) a networked information service for biological information resources based on the Taxonomic Name Server (TNS), a thesaurus; an index.

Page 34: Data Management in the  U.S. GLOBEC Program

Other Efforts, continued

• Hexacoral – biggest user in OBIS; uses DiGIR (D.G. Fautin, et al.)

• DiGIR – Distributed Generic Information Retrieval. Uses XML protocol to get the data. Extends XML to do queries. Uses the PHP software package to execute the code. Supports 14 or 15 databases, e.g., SQL-based ones. Three options for JGOFS: export to flat file, export to MySQL, or write our own Perl script to interface directly to DiGIR (ZooGene -> OBIS)

Page 35: Data Management in the  U.S. GLOBEC Program

Other Efforts, continued

• Oregon State University, Randy Keller and Paul Johnson, mapping specialist at HMRG

• Steve Hankin, “An Implementation Plan for the Data and Communication Subsystem of the U.S. Integrated Ocean Observing System”

• Margo Edwards at HIG and Dawn Wright at OSU

Page 36: Data Management in the  U.S. GLOBEC Program

Other Efforts, continued

• RIDGE, petrological data. Endeavor Observatory website, Lamont’s PetDB

• SIO Ocean Exploration data portal, http://sioexplorer.ucsd.edu

• University of Washington’s Endeavor GIS and Portal to Endeavor Data (PED)

Page 37: Data Management in the  U.S. GLOBEC Program

Educational “Tools”

• Virtual Research Vessel, University of Oregon and Oregon State University

• REVEL, University of Washington

• Dive and Discover, WHOI

Page 38: Data Management in the  U.S. GLOBEC Program

Protocols

• OPeNDAP (formerly DODS) – runs over HTTP

• DiGIR – uses XML, which is too verbose for physical data. OBIS may use OPeNDAP for physical data.

• JGOFS – runs over HTTP

Page 39: Data Management in the  U.S. GLOBEC Program

Other Projects and Protocols

• Apologies for references I’ve missed.

• There are many other efforts underway in all these areas.

Page 40: Data Management in the  U.S. GLOBEC Program

In the Trenches

• What temperature: Sea surface, air, at depth?

• Units?

• How collected?

• How calibrated?

• Data quality control still labor intensive even though we can collect and store gigabytes of data daily
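The questions above suggest why a bare number is not enough. A hedged sketch of a reading that names its parameter and units explicitly, with a simple plausibility check, might look like the following. The parameter names, the unit strings, and the limits are assumptions chosen for illustration, not values used by the program.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    parameter: str  # e.g. "sea_surface_temperature", not just "temperature"
    value: float
    units: str      # "degC" or "degF" in this sketch


# Illustrative plausibility limits in degrees Celsius (assumed values).
LIMITS_C = {
    "sea_surface_temperature": (-2.5, 40.0),
    "air_temperature": (-90.0, 60.0),
}


def to_celsius(reading):
    """Convert a reading to degrees Celsius, or raise if the units are unknown."""
    if reading.units == "degC":
        return reading.value
    if reading.units == "degF":
        return (reading.value - 32.0) * 5.0 / 9.0
    raise ValueError(f"unknown units: {reading.units!r}")


def qc_flag(reading):
    """Return 'good', 'suspect', 'bad_units', or 'unchecked' for a reading."""
    if reading.parameter not in LIMITS_C:
        return "unchecked"  # an ambiguous name cannot be checked automatically
    low, high = LIMITS_C[reading.parameter]
    try:
        value_c = to_celsius(reading)
    except ValueError:
        return "bad_units"
    return "good" if low <= value_c <= high else "suspect"


print(qc_flag(Reading("sea_surface_temperature", 46.8, "degF")))
print(qc_flag(Reading("sea_surface_temperature", 46.8, "degC")))
```

Automated checks like this only catch gross errors; judging how the data were collected and calibrated still falls to the contributor.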

Page 41: Data Management in the  U.S. GLOBEC Program

Future Data Management and Display Efforts

• Enhance data search capabilities

• Add additional graphical display (visualization) options

• Improve interface between data system and visualization/analysis tool

• Consider other protocols, such as OPeNDAP

Page 42: Data Management in the  U.S. GLOBEC Program

End