An Overview of the Earth System Bridge Project (and a bit of background about CSDMS) Scott D. Peckham Senior Research Scientist at INSTAAR Lead PI for Earth System Bridge Former Chief Software Architect for CSDMS University of Colorado, Boulder April 21, 2015


TRANSCRIPT

Page 1:

An Overview of the Earth System Bridge Project (and a bit of background about CSDMS)

Scott D. Peckham
Senior Research Scientist at INSTAAR
Lead PI for Earth System Bridge
Former Chief Software Architect for CSDMS
University of Colorado, Boulder

April 21, 2015

Page 2:
Page 3:

Earth System Bridge, CoG page: http://www.earthsystemcog.org/projects/earthsystembridge

Page 4:

Background: Community Surface Dynamics Modeling System (CSDMS)

Page 5:

http://csdms.colorado.edu

Page 6:

http://csdms.colorado.edu/wiki/Model_download_portal

Page 7:

CSDMS: 1289 Members, 5 Working Groups, 6 Focus Research Groups

Page 8:

Linking Component-based Models: How Can Two Models Differ?

• Programming language (C, C++, Fortran, Java, Python, etc.)
  Solution: Babel and Bocca (CCA toolchain)

• Computational grid (triangles, rectangles, Voronoi, points, etc.)
  Solution: ESMF regridder (parallel, spatial interpolation)

• Timestepping scheme (fixed, adaptive, local)
  Solution: Temporal interpolation tool

• Variable names
  Need some means of “semantic mediation”
  Solution: CSDMS Standard Names

• Variable units
  Solution: UDUNITS (Unidata)
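
As a rough illustration of the timestepping item above, here is a minimal sketch of temporal interpolation between a provider model with a coarse time step and a user model with a finer one. The slides do not describe how the CSDMS temporal interpolation tool works; linear interpolation, the function name `interpolate_in_time` and the sample numbers are all assumptions made for this example.

```python
def interpolate_in_time(t0, v0, t1, v1, t):
    """Linearly interpolate between (t0, v0) and (t1, v1) at time t.

    Assumption: linear interpolation is adequate; the real tool may differ.
    """
    if t0 >= t1 or not (t0 <= t <= t1):
        raise ValueError("need t0 < t1 and t0 <= t <= t1")
    weight = (t - t0) / (t1 - t0)
    return v0 + weight * (v1 - v0)


# Hypothetical case: the provider reports a value at t = 0 s and t = 40 s,
# while the user model needs a value every 10 s.
values = [interpolate_in_time(0.0, 2.0, 40.0, 10.0, t)
          for t in (0.0, 10.0, 20.0, 30.0, 40.0)]
```

A framework could call a helper like this whenever the user model asks for a value at a time that falls between two provider time steps.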

Page 9:

Language Interoperability with Babel

Language interoperability is a powerful feature of the CSDMS framework. Components written in different languages can be rapidly linked in HPC applications with hardly any performance cost. This allows us to “shop” for open-source solutions (e.g. libraries), gives us access to both procedural and object-oriented strategies (legacy and modern code), and allows us to add graphics & GUIs at will.

Page 10:

CSDMS Regridding Tools

• ESMF Regrid (multi-proc., Fortran)
• OpenMI Regrid (single-proc., Java)
• Custom-built regridders

Page 11:

Model Coupling Metadata and the CSDMS Standard Names

http://csdms.colorado.edu/wiki/CSDMS_Standard_Names
http://csdms.colorado.edu/wiki/CSN_Metadata_Names

Page 12:

Semantic Matching for Model Variables

Hydro Model A

Output variables:

• streamflow
• rainrate

Hydro Model B

Input variables:

• discharge
• precip_rate

CSDMS Standard Names

• channel_exit_water_x-section__volume_flow_rate
• atmosphere_water__rainfall_volume_flux

Goal: Remove ambiguity so that the framework can automatically match outputs to inputs.
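
The matching described on this slide can be sketched in a few lines. The internal variable names and the two standard names are taken from the slide; the dictionary layout and the function `match_outputs_to_inputs` are hypothetical and only illustrate the idea, not how the CSDMS framework is implemented.

```python
# Each model maps its own internal names to CSDMS Standard Names.
MODEL_A_OUTPUTS = {
    "streamflow": "channel_exit_water_x-section__volume_flow_rate",
    "rainrate": "atmosphere_water__rainfall_volume_flux",
}
MODEL_B_INPUTS = {
    "discharge": "channel_exit_water_x-section__volume_flow_rate",
    "precip_rate": "atmosphere_water__rainfall_volume_flux",
}


def match_outputs_to_inputs(provider_outputs, user_inputs):
    """Return {user internal name: provider internal name} for shared standard names."""
    by_standard_name = {std: internal for internal, std in provider_outputs.items()}
    return {
        internal: by_standard_name[std]
        for internal, std in user_inputs.items()
        if std in by_standard_name
    }


links = match_outputs_to_inputs(MODEL_A_OUTPUTS, MODEL_B_INPUTS)
```

Neither model needs to know the other's internal names; the standard name is the only shared key.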

Page 13:

The CSDMS Standard Names

Data models like RDF and EAV use triples like: Subject + Predicate + Object, and Entity/Object + Attribute + Value (object-oriented).

CSDMS Standard Names use a similar pattern for creating unambiguous and easily understood standard variable names or “preferred labels” according to a set of rules. These are then used to retrieve values and metadata. The pattern is:

Object name + [Operation name] + Quantity name

Examples:
atmosphere_carbon-dioxide__partial_pressure
atmosphere_water__rainfall_volume_flux
earth_ellipsoid__equatorial_radius
soil__saturated_hydraulic_conductivity

We have also started building a set of standard Attribute and Process Names.
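
In the examples above, a double underscore separates the object part from the quantity part. A minimal sketch of splitting a name on that separator follows; the function name `split_standard_name` is hypothetical, and the sketch leaves any optional operation name inside the quantity part because the rules for recognizing operations are not given in these slides.

```python
def split_standard_name(name):
    """Split a standard name into (object part, quantity part) at the '__' separator."""
    object_part, separator, quantity_part = name.partition("__")
    if not separator or not object_part or not quantity_part or "__" in quantity_part:
        raise ValueError(f"expected exactly one '__' separator in {name!r}")
    return object_part, quantity_part


examples = [
    "atmosphere_carbon-dioxide__partial_pressure",
    "atmosphere_water__rainfall_volume_flux",
    "earth_ellipsoid__equatorial_radius",
    "soil__saturated_hydraulic_conductivity",
]
parts = [split_standard_name(name) for name in examples]
```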

Page 14:

Standard Assumption Names

Assumption Type: Example

Boundary conditions: no_slip_boundary_condition
Conserved quantities: momentum_conserved
Coordinate system: cartesian_coordinate_system
Angle conventions: clockwise_from_north_convention
Dimensionality: 2_dimensional
Equations used: navier_stokes_equation
Closures: eddy_viscosity_turbulence_closure
Flow-type assumptions: laminar_flow
Fluid-type assumptions: herschel_bulkley_fluid
Geometry assumptions: trapezoid_shaped
Named model assumptions: green_ampt_infiltration_model
Thermodynamic processes: isenthalpic_process
Approximations: boussinesq_approximation
Averaging methods: reynolds_averaged
Numerical methods used: arakawa_c_grid
State of matter: liquid_phase

Page 15:

The CSDMS Standard Names

Main Page: csdms.colorado.edu/wiki/CSDMS_Standard_Names
Basic Rules: csdms.colorado.edu/wiki/CSN_Basic_Rules
Object Names: csdms.colorado.edu/wiki/CSN_Object_Templates
Operation Names: csdms.colorado.edu/wiki/CSN_Operation_Templates
Quantity Names: csdms.colorado.edu/wiki/CSN_Quantity_Templates
Process Names: csdms.colorado.edu/wiki/CSN_Process_Names
Assumption Names: csdms.colorado.edu/wiki/CSN_Assumption_Names
Metadata Names: csdms.colorado.edu/wiki/CSN_Metadata_Names
Model Metadata Files: csdms.colorado.edu/wiki/CSN_MMF_Example

The CSDMS Standard Names can be viewed as a lingua franca that provides a bridge for mapping variable names between models. They play an important role in the Basic Model Interface (BMI). Model developers are asked to provide a BMI interface that includes a mapping of their model's internal variable names to CSDMS Standard Names and a Model Coupling Metadata (MCM) file that provides model assumptions and other information.

IMPORTANT: Model developers continue to use whatever variable names they want to in their code, but then "map" each of their internal variable names to the appropriate CSDMS standard name in their BMI implementation.

Page 16:

Standard Name Crosswalks, Etc.

Develop crosswalks between the CSDMS Standard Names (2573 names) and the:

• CUAHSI VariableName CV (786 names, 50%)
• CF Convention Standard Names (2630 names, 60%)

Example (CF name, then the corresponding CSDMS name):
tendency_of_mass_concentration_of_ammonia_in_air
atmosphere_air_ammonia__time_derivative_of_mass_concentration

Work with deep Earth process modeling community to develop new CSDMS Standard Names (PSU meeting)

Work with hydrologic modeling community (CUAHSI) to further develop CSDMS Standard Names (NFIE mtg)
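
A crosswalk of this kind can be thought of as a lookup table from one naming convention to another. The sketch below uses the single name pair shown on this slide; the table layout, the helper names `crosswalk` and `coverage`, and the unmapped sample name are assumptions for illustration, and the real crosswalks are far larger.

```python
# CF Convention name -> CSDMS Standard Name (only the pair from the slide).
CF_TO_CSDMS = {
    "tendency_of_mass_concentration_of_ammonia_in_air":
        "atmosphere_air_ammonia__time_derivative_of_mass_concentration",
}


def crosswalk(cf_name, table=CF_TO_CSDMS):
    """Return the CSDMS name for a CF name, or None if no entry exists yet."""
    return table.get(cf_name)


def coverage(cf_names, table=CF_TO_CSDMS):
    """Fraction of the given CF names that already have a crosswalk entry."""
    if not cf_names:
        return 0.0
    return sum(name in table for name in cf_names) / len(cf_names)


# "air_temperature" stands in for a name that has not been mapped in this toy table.
sample = ["tendency_of_mass_concentration_of_ammonia_in_air", "air_temperature"]
fraction_mapped = coverage(sample)
```

A coverage figure computed this way is one possible reading of the percentages quoted on the slide, though the slide does not say how they were obtained.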

Page 17:

Metadata for Model: TopoFlow – Meteorology

Author: Scott D. Peckham
Email: [email protected]
Domains: Hydrology, meteorology
Language: Python
License: MIT
Purpose: TopoFlow is a spatially-distributed hydrologic model that consists of many stand-alone components. This component provides meteorology variables, such as precipitation rate, temperature, relative humidity, and shortwave and longwave radiation.

Goto Model Objects Page

MCM Tool

A prototype or mock-up of a smart phone app based on Model Coupling Metadata (MCM) for describing a model.

(under development)

Page 18:

The Basic Model Interface (BMI)

http://csdms.colorado.edu/wiki/BMI_Description
https://github.com/csdms/bmi/blob/master/bmi.sidl

Page 19:
Page 20:
Page 21:

The Basic Model Interface (BMI)

BMI requires a developer to make some relatively simple, noninvasive, and framework-independent changes to his/her model source code, mostly by adding some new functions. Functions can be grouped into:

• Model Control Functions (initialize, update, finalize)
• Model Information Functions (e.g. time-stepping scheme)
• Variable Information Functions (e.g. dimensions, data type, units)
• Variable Getters and Setters
• Grid Information Functions

These functions make the model self-describing (so the framework can “see” and reconcile model differences) and fully controllable.

CSDMS can automatically wrap BMI-enabled models to provide them with a CMI interface which allows them to be used in the CSDMS framework and to gain other new capabilities, including NetCDF output.
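
To make the function groups above concrete, here is a minimal sketch of a toy model with a BMI-style interface. It is not the official BMI definition (see bmi.sidl for that): the class `ToyReservoir`, its physics, the dict-based `initialize` argument and the unit strings are assumptions, the two standard names are borrowed from the semantic-matching slide, and grid information functions are left out for brevity.

```python
class ToyReservoir:
    """Hypothetical toy model with a BMI-style interface (a sketch, not the official BMI)."""

    # Internal variable name -> (CSDMS Standard Name, units); the pairing is illustrative.
    _INPUTS = {"rainrate": ("atmosphere_water__rainfall_volume_flux", "m s-1")}
    _OUTPUTS = {"streamflow": ("channel_exit_water_x-section__volume_flow_rate", "m3 s-1")}

    # --- Model control functions ---
    def initialize(self, config):
        # Assumption: config is a dict; a real model would usually read a file.
        self._dt, self._area, self._k = config["dt"], config["area"], config["k"]
        self._time = 0.0
        self._storage = 0.0
        self._values = {"rainrate": 0.0, "streamflow": 0.0}
        both = {**self._INPUTS, **self._OUTPUTS}
        self._internal = {std: name for name, (std, _) in both.items()}
        self._units = {std: units for std, units in both.values()}

    def update(self):
        # Advance one time step: storage gains rainfall and loses outflow.
        inflow = self._values["rainrate"] * self._area
        self._storage += (inflow - self._values["streamflow"]) * self._dt
        self._values["streamflow"] = self._k * self._storage
        self._time += self._dt

    def finalize(self):
        self._values.clear()

    # --- Model information functions ---
    def get_time_step(self):
        return self._dt

    def get_current_time(self):
        return self._time

    def get_input_var_names(self):
        return tuple(std for std, _ in self._INPUTS.values())

    def get_output_var_names(self):
        return tuple(std for std, _ in self._OUTPUTS.values())

    # --- Variable information, getters and setters ---
    def get_var_units(self, std_name):
        return self._units[std_name]

    def get_value(self, std_name):
        return self._values[self._internal[std_name]]

    def set_value(self, std_name, value):
        self._values[self._internal[std_name]] = value


RAIN = "atmosphere_water__rainfall_volume_flux"
FLOW = "channel_exit_water_x-section__volume_flow_rate"

model = ToyReservoir()
model.initialize({"dt": 1.0, "area": 1.0, "k": 0.5})
model.set_value(RAIN, 2.0)
model.update()
model.update()
flow_after_two_steps = model.get_value(FLOW)
time_after_two_steps = model.get_current_time()
model.finalize()
```

The caller only ever uses standard names; the mapping to the internal names `rainrate` and `streamflow` stays inside the model.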

Page 22:

How BMI and CMI Work

Page 23:

The BMI Breakthrough: For Modelers

• Requires minimal effort.

• Is noninvasive (adds no dependencies on CSDMS data structures or code). No interference with developer’s design.

• Allows model to still be used as before in a stand-alone mode.

• Requires no new code intended to accommodate the needs of other models (unlike OpenMI 1.4).

• Is framework agnostic and requires no modeling-framework specific knowledge (e.g. the CCA concept of ports); developers do not “code to” the framework.

• Only requires developer to do things that would be necessary for the model to be used in any modeling framework.

• Provides added value, e.g. ability to couple to other models and to write output variables to standardized NetCDF files.

Page 24:

BMI Breakthrough: For SE at CSDMS

• One “CMI wrapper” (code) can be used for any BMI-enabled model, instead of unique wrapper code for each model.

• Makes it possible to largely automate the conversion of contributed process models to CSDMS components.

• Dramatically reduces code maintenance time.

• Addresses the “frozen code” problem, i.e. minimal additional effort is needed to bring a new version of the same model (e.g. with enhancements or bug fixes) into the framework.

• Minimal impact on performance of the model.

• Makes it possible for the CSDMS framework to automatically accommodate differences between models by calling service components when needed.

• The SE no longer needs to learn all about each model.

Page 25:

Earth System Bridge – BMI Adapters

[Diagram labels: BMI, CSDMS, ESMF, Pyre, OMS, OpenMI]

Adapters will allow a BMI-enabled model to be used in any component-based modeling framework.

Page 26:

Earth System – Framework Description Language (ES - FDL)
https://earthsystemcog.org/projects/es-fdl/

Page 27:

ES - Framework Description Language

Page 28:

Earth System Bridge Demonstration Projects

Demonstrate the new capabilities provided by Earth System Bridge technology by using them to couple operational/federal atmosphere-ocean models at the global-to-regional scale to academic (or operational) hydrologic/land/coastal models at the regional-to-local scale.

•ESMF: Couple the MPIPOM-TC ocean model to a local-scale hydrologic or inundation model. (MPIPOM-TC = Message Passing Interface Princeton Ocean Model for Tropical Cyclones)

•CSDMS: Complete implementation and testing of a Basic Model Interface (BMI) for WRF model (Weather Research and Forecasting) and link it within the CSDMS framework to the local-scale, spatial hydrologic TopoFlow model.

Page 29:

For More Information

• Peckham, S.D., E.W.H. Hutton and B. Norris (2013) A component-based approach to integrated modeling in the geosciences: The Design of CSDMS, Computers & Geosciences, special issue: Modeling for Environmental Change, 53, 3-12.

• Peckham, S.D. (2014) The CSDMS Standard Names: Cross-domain naming conventions for describing process models, data sets and their associated variables, Proceedings of the 7th Intl. Congress on Env. Modelling and Software, International Environmental Modelling and Software Society (iEMSs), San Diego, CA. (Eds. D.P. Ames, N.W.T. Quinn, A.E. Rizzoli).

• Peckham, S.D. (2014) EMELI 1.0: An experimental smart modeling framework for automatic coupling of self-describing models, Proceedings of HIC 2014, 11th International Conf. on Hydroinformatics, New York, NY.

Page 30:
Page 31:

Model Coupling Metadata

Models are based on a number of real-world objects, e.g. atmosphere, river channel. (Think object-oriented.)

There are “assumptions” that apply to the model, its computational grid, its objects and the attributes/quantities that describe those objects.

MCM uses the CSDMS Standard Assumption Names and the CSDMS Standard Variable Names. The assumption names are used to “decorate” or “tag” other model entities (above).

Page 32:

Taming Heterogeneity with Interfaces

Before:
• Each resource is unique.
• Own ways of doing things.
• Respond differently.
• Can become unstable.
• Difficult to control.

After:
• Uniform outward appearance.
• Respond to same commands.
• Interchangeable units.
• Have a chain of command.
• Work as a team.

Page 33:

Reconciling Differences with Standards

If we reconcile differences between the resources in a pairwise manner, the amount of work, etc. grows fast: Cost(N) = N (N-1) / 2 ~ N^2.

vs.

Introduce a new, generic or standard representation (the “hub”), then map resources to and from it. The amount of work, maintenance, etc. drops to: Cost(N) = N.
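
The two cost formulas on this slide are easy to compare numerically. The short sketch below evaluates both for a few values of N; the function names and the sample sizes are chosen for illustration only.

```python
def pairwise_cost(n):
    """Number of distinct pairs among n resources: n (n - 1) / 2."""
    return n * (n - 1) // 2


def hub_cost(n):
    """Number of mappings when every resource maps to one shared standard, as on the slide."""
    return n


# (N, pairwise cost, hub cost) for a few hypothetical sizes.
comparison = [(n, pairwise_cost(n), hub_cost(n)) for n in (2, 10, 100)]
```

For N = 100 the pairwise approach needs 4950 reconciliations, while the hub approach needs 100 mappings.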

Page 34:

Building a Modeling Framework

CSDMS has integrated a variety of powerful, open-source tools to build its modeling framework, such as:

Babel – Language interoperability (C, C++, Java, Python, Fortran)
Bocca – Component preparation and project management
Ccaffeine – Low-level model coupling (parallel environ.)
ESMF Regrid – Multi-processor spatial regridding
OpenMI Regrid – Single-processor spatial regridding
NetCDF – Scientific data format (self-describing, etc.)
VisIt – Visualization of large data sets (multi-proc.)

We developed our own interface standards, BMI & CMI, and greatly extended the original Ccaffeine GUI to create our CSDMS Web Modeling Tool for interactive model coupling.

Page 35:

Some Key Concepts and Terms

Model: State variables for some system of interest are discretized in space (on a computational grid) and new values are computed from previous values by marching forward in time according to a set of rules (e.g. laws of physics).

Model Component: A model that has been specially prepared for plug-and-play reusability. (i.e. a standard interface & compiled for language interoperability).

Interface: A standardized set of functions, which a caller uses to interact with a resource such as a model, database or service. Heterogeneous resources wrapped with a standard interface then operate in a “familiar” way.

Modeling Framework: A software container in which pre-compiled model components are instantiated, configured and dynamically linked to rapidly create a customized model. (With a “birds-eye view” of all components.)

Workflow: Software that allows a chain of steps to be saved, modified and re-executed. Each step uses an application to create an intermediate product that is passed along to the next step, often as a file.

Page 36:

How Can a Model be Converted to a Reusable, Plug-and-Play Component?

CSDMS has developed two model interfaces, one called Basic Model Interface (BMI) and one called Component Model Interface (CMI) for this purpose.

BMI requires a developer to make some relatively simple, noninvasive, and framework-independent changes to his/her model source code, mostly by adding some new functions. These provide a caller with 3 key things: (1) fine-grained control (i.e. IUF: initialize, update, finalize), (2) variable getters and setters and (3) model metadata. The model can still be used in “stand-alone mode,” just as before.

CSDMS automatically wraps BMI-enabled models to provide them with a CMI interface. CMI makes function calls to the (1) framework, (2) service components (the “reconcilers”), (3) the BMI of the wrapped model and (4) the CMI of other plug-and-play components. CMI gives a model many new capabilities, including NetCDF output.
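
The idea that one wrapper can serve any BMI-enabled model can be sketched as follows. This is not the actual CMI: the classes `ConstantRain`, `RainGauge` and `GenericWrapper` are hypothetical, the models are deliberately trivial, and the service components mentioned above (regridding, unit conversion, time interpolation) are omitted. The standard name is taken from the semantic-matching slide.

```python
RAIN = "atmosphere_water__rainfall_volume_flux"  # standard name taken from the slides


class ConstantRain:
    """Hypothetical provider model: reports a fixed rain rate."""

    def initialize(self, config):
        self._rate = config["rate"]

    def update(self):
        pass  # nothing changes between steps

    def finalize(self):
        pass

    def get_input_var_names(self):
        return ()

    def get_output_var_names(self):
        return (RAIN,)

    def get_value(self, name):
        if name != RAIN:
            raise KeyError(name)
        return self._rate


class RainGauge:
    """Hypothetical user model: accumulates rain depth from its input."""

    def initialize(self, config):
        self._dt = config["dt"]
        self._rate = 0.0
        self.total_depth = 0.0

    def update(self):
        self.total_depth += self._rate * self._dt

    def finalize(self):
        pass

    def get_input_var_names(self):
        return (RAIN,)

    def get_output_var_names(self):
        return ()

    def set_value(self, name, value):
        if name != RAIN:
            raise KeyError(name)
        self._rate = value


class GenericWrapper:
    """One wrapper class reused unchanged for every BMI-style model."""

    def __init__(self, model, config):
        self.model = model
        self.model.initialize(config)

    def step(self, providers=()):
        # Fetch each input from the first provider that outputs the same standard name.
        for name in self.model.get_input_var_names():
            for provider in providers:
                if name in provider.model.get_output_var_names():
                    self.model.set_value(name, provider.model.get_value(name))
                    break
        self.model.update()


rain = GenericWrapper(ConstantRain(), {"rate": 0.5})
gauge = GenericWrapper(RainGauge(), {"dt": 1.0})
for _ in range(3):
    rain.step()
    gauge.step(providers=[rain])
```

The wrapper contains no model-specific code; it relies only on the BMI-style functions that both toy models expose.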

Page 37:

Metadata for Model: TopoFlow – Meteorology

Author: Scott D. Peckham
Email: [email protected]
Domains: Hydrology, meteorology
Language: Python
License: MIT
Purpose: TopoFlow is a spatially-distributed hydrologic model that consists of many stand-alone components. This component provides meteorology variables, such as precipitation rate, temperature, relative humidity, and shortwave and longwave radiation.

Goto Model Objects Page

MCM Tool

A prototype or mock-up of a smart phone app based on Model Coupling Metadata (MCM) for describing a model.

(under development)

Page 38:

Objects for Model: TopoFlow - Meteorology

atmosphere_aerosol_dust
atmosphere_air-column_water~vapor
atmosphere_bottom_air
atmosphere_bottom_air_flow
atmosphere_bottom_air_land_heat~net~latent
atmosphere_bottom_air_land_heat~net~sensible
atmosphere_bottom_air_water~vapor
atmosphere_water
earth
land_surface
model_grid
physics
snowpack
water

Quantities for Object: land_surface

Quantity            SI Units   Grid   Type
albedo              1          G1     config, output
aspect_angle        radians    G1     config, output
emissivity          1          G2     output
geodetic_latitude   degrees    G1     config, output
longitude           degrees    G1     config, output
slope_angle         radians    G1     output
temperature         deg_C      G2     output

Page 39:

Assumptions for Quantity: aspect_angle

Sign_and_Angle_Conventions

counter-clockwise_from_east
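
Tagging a quantity with its angle convention lets a framework convert between conventions when two models disagree. As a minimal sketch, the function below converts an angle in radians from the counter-clockwise-from-east convention to the clockwise-from-north convention listed among the Standard Assumption Names; the function name is hypothetical, and the slides do not say that the framework performs this conversion in this way.

```python
import math


def ccw_from_east_to_cw_from_north(angle):
    """Convert radians measured counter-clockwise from east to clockwise from north.

    The mapping is its own inverse, so the same formula converts back.
    """
    return (math.pi / 2.0 - angle) % (2.0 * math.pi)


due_east = ccw_from_east_to_cw_from_north(0.0)           # east is a quarter turn clockwise from north
due_north = ccw_from_east_to_cw_from_north(math.pi / 2)  # north maps to zero
due_west = ccw_from_east_to_cw_from_north(math.pi)       # west is three quarter turns clockwise from north
```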

Page 40:

Objects for Model: TopoFlow - Meteorology

atmosphere_aerosol_dust
atmosphere_air-column_water~vapor
atmosphere_bottom_air
atmosphere_bottom_air_flow
atmosphere_bottom_air_land_heat~net~latent
atmosphere_bottom_air_land_heat~net~sensib
atmosphere_bottom_air_water~vapor
atmosphere_water
earth
land_surface
model_grid
physics
snowpack
water

Assumptions for Model: TopoFlow - Meteorology

Equations

some_named_equation