Transcript
  • Streamlining Oceanic Biogeochemical Dataset Assembly in Support of Global Data Products

    Eugene F. Burger1, Benjamin Pfeil3, Kevin O’Brien2, Linus Kamb2, Steve Jones3, Karl Smith2
    1NOAA/PMEL, Seattle, WA; 2University of Washington/JISAO, Seattle, WA; 3Bjerknes Climate Data Centre (BCDC), Bergen, Norway

    Goal
    These tools streamline OA data processing, quality control and archival by bridging the data workflow gap between data collection and data archival of biogeochemical data and metadata used by researchers. These tools add value to the data by delivering high quality datasets.

    This application extends the web-based tools developed for SOCAT with a richer feature set applied to a broader range of biogeochemical variables that are measured by the Ocean Acidification research community. The workflow will contribute to the timely production of scientific indicators that are dependent upon these datasets, including synthesis products such as GLODAP.

    Built-in Data Sanity Check
    With the data properly identified, the sanity check warns the user if data are outside the bounds of pre-set data limits. Examples of data checks include out-of-bounds values, inconsistent latitude, longitude, or depth values, and data submitted in an incorrect unit. Columns or individual records with errors are highlighted to indicate flagged values. This development is led by the Bjerknes Climate Data Centre (BCDC) developers.
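    A bounds check of this kind can be sketched in a few lines. The following is only an illustration, not the dashboard's code: the column names, the limits in `LIMITS`, and the `sanity_check` function are all hypothetical, since the poster does not list the pre-set limits.

```python
# Hypothetical limits; the poster does not state the real pre-set bounds.
LIMITS = {
    "latitude": (-90.0, 90.0),
    "longitude": (-180.0, 180.0),
    "depth_m": (0.0, 11000.0),
    "pH": (6.5, 9.0),
}


def sanity_check(rows):
    """Return (row_index, column, raw_value, reason) for each suspect value."""
    problems = []
    for i, row in enumerate(rows):
        for column, (low, high) in LIMITS.items():
            if column not in row:
                continue  # only check columns that were submitted
            raw = row[column]
            try:
                value = float(raw)
            except (TypeError, ValueError):
                problems.append((i, column, raw, "not a number"))
                continue
            if not low <= value <= high:
                problems.append((i, column, raw, "out of bounds"))
    return problems


rows = [
    {"latitude": "47.6", "longitude": "-122.3", "depth_m": "5", "pH": "8.05"},
    {"latitude": "147.6", "longitude": "-122.3", "depth_m": "-5", "pH": "8.05"},
]
problems = sanity_check(rows)
for index, column, raw, reason in problems:
    print(f"row {index}: {column}={raw} ({reason})")
```

    Returning the row index and column name for each problem is what would let a front end highlight the affected record or column, as the poster describes.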

    ESS12.10

    Pre-QC Data Preview
    A collection of preview plots allows the user to assess data integrity. Plots showing overview information, such as observation locations, and a selection of property-property plots can highlight obvious data errors, which the user can correct before resubmitting the data. This step improves data quality by reducing common data mistakes before the data are submitted to more rigorous quality control.

    1. Easy Data Ingest and Data Check
    Data Submission
    Data can be submitted in human-readable and easily editable comma-separated value (CSV) or Excel format. The data submission tool recognizes frequently submitted variables and identifies these in submitted data. This allows ease and flexibility in data submission.
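    One simple way to recognize frequently submitted variables is to match normalized column headers against a table of known aliases. The sketch below assumes that approach; the `ALIASES` table and the function names are hypothetical, and the poster does not say how the submission tool actually identifies variables.

```python
import csv
import io

# Hypothetical alias table mapping a canonical variable name to header spellings.
ALIASES = {
    "latitude": {"lat", "latitude"},
    "longitude": {"lon", "long", "longitude"},
    "depth": {"depth", "sample_depth"},
    "dic": {"dic", "tco2"},
}


def normalize(header):
    """Lower-case a header and replace spaces so spelling variants compare equal."""
    return header.strip().lower().replace(" ", "_")


def identify_columns(csv_text):
    """Map each submitted header to a canonical name, or None if unrecognized."""
    headers = next(csv.reader(io.StringIO(csv_text)))
    mapping = {}
    for header in headers:
        key = normalize(header)
        mapping[header] = next(
            (name for name, aliases in ALIASES.items() if key in aliases), None
        )
    return mapping


sample = "Lat,Lon,Sample Depth,TCO2,Station\n47.6,-122.3,5,2050.1,A1\n"
mapping = identify_columns(sample)
print(mapping)
```

    Headers that map to `None` would be the ones a submitter is asked to identify by hand.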

    High Quality Datasets, Low Data Management Burden
    These tools and workflow reduce the data management burden for scientists, while at the same time delivering high quality data in interoperable and standards-based formats that promote easier use of these high-value data. These data processes will help scientists meet their obligations for data documentation, data access, and archival.

    2. Integrated Metadata Entry Tool
    Where possible, metadata are extracted from uploaded data. These extracted metadata are pre-populated in the metadata tool integrated with the data upload dashboard. Completed metadata as well as base reusable templates can be uploaded in Excel, CSV, or XML formats.
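    As an illustration of metadata that can be derived from the data themselves, the sketch below computes a geographic bounding box and a date range from uploaded rows. The field names and the `extract_metadata` function are hypothetical; the poster does not specify which metadata the tool extracts.

```python
def extract_metadata(rows):
    """Derive spatial and temporal coverage from rows of submitted data.

    Assumes each row has 'latitude', 'longitude' and an ISO 8601 'date' string.
    The simple min/max on longitude does not handle tracks that cross the
    antimeridian.
    """
    lats = [float(r["latitude"]) for r in rows]
    lons = [float(r["longitude"]) for r in rows]
    dates = sorted(r["date"] for r in rows)  # ISO 8601 dates sort chronologically
    return {
        "south": min(lats),
        "north": max(lats),
        "west": min(lons),
        "east": max(lons),
        "start_date": dates[0],
        "end_date": dates[-1],
    }


rows = [
    {"latitude": "47.6", "longitude": "-122.3", "date": "2015-06-02"},
    {"latitude": "50.1", "longitude": "-130.0", "date": "2015-06-05"},
    {"latitude": "48.9", "longitude": "-125.5", "date": "2015-06-03"},
]
metadata = extract_metadata(rows)
print(metadata)
```

    Values computed this way could pre-fill the corresponding metadata fields, leaving the submitter to confirm them and supply what cannot be derived.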

    3. Quality Control Console
    Quality control functionality being added to the dashboard allows the user to interactively review and set data quality flags for selected data points for a subset of biogeochemical variables.
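    Setting flags on a selection of points amounts to updating a flag column at the selected indices. The sketch below assumes WOCE-style flag values (2 good, 3 questionable, 4 bad); the poster does not name the flag scheme, and `set_flags` is a hypothetical helper.

```python
# Assumed WOCE-style quality flags; the poster does not name the scheme used.
GOOD, QUESTIONABLE, BAD = 2, 3, 4
VALID_FLAGS = {GOOD, QUESTIONABLE, BAD}


def set_flags(flags, selected, new_flag):
    """Return a copy of flags with new_flag applied at the selected indices."""
    if new_flag not in VALID_FLAGS:
        raise ValueError(f"unknown flag: {new_flag}")
    updated = list(flags)  # leave the caller's list untouched
    for index in selected:
        updated[index] = new_flag
    return updated


flags = [GOOD] * 5
reviewed = set_flags(flags, selected=[1, 3], new_flag=QUESTIONABLE)
print(reviewed)
```

    Returning a new list rather than editing in place keeps the previous flags available, which makes an undo step in an interactive console straightforward.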

    4. Streamlined Data Archival
    Archiving the high quality, high value data and metadata to a National Archive Center of choice ensures long term preservation. Services developed in collaboration with NCEI (for US submitters) reduce the data submission effort to a few button clicks. Streamlined archival processes reduce the overhead for scientists to meet their data management obligations. User options will be added for the submitter to select the archive destination.

    The data ingest dashboard

    At right: The quality control functionality that will be extended to incorporate a broader range of data used with ocean acidification research

    [Diagram: Data Handling Processes. Stages: Measure, Retrieve, Process, Archive, Analyze. Labels: Collection; Level 1 & 2 QC. Steps: Data Ingest, Verify, Metadata, QC, Archive.]

    The Data Processing Gap
    Data assembly in support of global data products, such as GLODAP, and submission of data to national data centers for long-term preservation, demand significant effort. Delays in data assembly can negatively affect the timely production of scientific indicators that are dependent upon these datasets and data products.

    What if data submission, metadata assembly and quality control could be combined into a single application? To support more streamlined data management processes, developers at NOAA’s Pacific Marine Environmental Laboratory (PMEL), with support from the NOAA Ocean Acidification Program (OAP), and at the Bjerknes Climate Data Centre (BCDC) within the Bjerknes Centre for Climate Research (BCCR) are developing such an application. It could also serve a broader community, including the GLODAP collaborators.

    Contact
    Eugene F. Burger, [email protected], +1 206.526.4586
    Benjamin Pfeil, [email protected], +47 55 58 98 39

    The metadata entry and upload tool is integrated with other components

    Errors detected by the Sanity Check are highlighted

    An example of a preview plot, before data are submitted for quality control

