

An Extensible Python User Environment

Jeff Daily, Karen Schuchardt (PI), Todd Elsethagen, Jared Chase

H41G-0956

Website

http://subsurface.pnl.gov/salssa

Acknowledgements

The research reported here is supported by the U.S. Department of Energy through “Process Integration, Data Management, and Visualization for Subsurface Sciences,” Scientific Discovery through Advanced Computing (SciDAC).

About Pacific Northwest National Laboratory

The Pacific Northwest National Laboratory, located in southeastern Washington State, is a U.S. Department of Energy Office of Science laboratory that solves complex problems in energy, national security and the environment, and advances scientific frontiers in the chemical, biological, materials, environmental and computational sciences. The Laboratory employs 4,000 staff members, has a $760 million annual budget, and has been managed by Ohio-based Battelle since 1965.

For more information about the science you see here, please contact:

Jeff Daily

Pacific Northwest National Laboratory

P.O. Box 999, MS K7-90

Richland, WA 99354

(509) 372-6548

[email protected]

Collaborators

PNNL-SA-63866

The Support Architecture for Large-Scale Subsurface Analysis (SALSSA) provides a sophisticated graphical user interface (GUI) and the underlying data management framework enabling scientists to efficiently set up groundwater simulation models and store, retrieve, and analyze the rapidly growing volumes of data produced by their research.

Our SALSSA Organizer integrates everything seen here and runs on any modern platform such as Windows, Mac OS X, and Linux.

Front to back:

Visualization of pore-scale fluid flow computed using the parallel Smoothed Particle Hydrodynamics code developed under the Hybrid Numerical Methods for Multiscale Simulations of Subsurface Biogeochemical Processes project. Colors represent local fluid velocity. Visualization created by Kwan-Liu Ma and Chad Jones of the Institute for Ultra-Scale Visualization, University of California at Davis.

Tecplot™ output from a calcite precipitation study.

A custom GUI visualizing a STOMP input grid.

Provenance Tracking and Data Management

Keeps track of everything a user does without burdening them with where to store their data.

Job Launching

Concurrent jobs to multiple machines, load balancing, and real-time updates all with the push of a button.

Process Integration

Seamless end-to-end synergy of computational models and desktop tools.

All processes known to our environment are organized into a task tree with user-specified category labels.

Tasks may include multiple simulation models, input file generators, text editors, visualization tools, etc.

Additional tasks are added by editing our text-based registry or through auto-discovery.
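As a rough illustration of how a text-based registry could feed a task tree, the sketch below parses INI-style text and groups tasks by category label. The file format, the key names (`category`, `command`) and the task entries are assumptions made for this example; the poster does not describe the actual registry syntax.

```python
import configparser
from collections import defaultdict

# Hypothetical registry text: each section is a task. The keys shown here
# are illustrative only and are not SALSSA's real registry format.
REGISTRY_TEXT = """
[STOMP]
category = Simulation Models
command = stomp

[Text Editor]
category = Desktop Tools
command = editor

[Tecplot]
category = Visualization
command = tecplot
"""


def load_registry(text):
    """Parse registry text into {task name: {key: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def build_task_tree(registry):
    """Group task names under their user-specified category label."""
    tree = defaultdict(list)
    for name, entry in registry.items():
        tree[entry["category"]].append(name)
    return {category: sorted(names) for category, names in tree.items()}


registry = load_registry(REGISTRY_TEXT)
task_tree = build_task_tree(registry)
for category, names in sorted(task_tree.items()):
    print(category, "->", ", ".join(names))
```

With a layout like this, adding a task amounts to adding one more section to the text, which is one plausible reading of "editing our text-based registry".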

Visualization tools are registered just like any other task and can be desktop tools like Tecplot™ or tools that require a supercomputer to produce their output.

Users can begin new tasks by double-clicking on their data items or within the task tree.

Unknown data items can be automatically registered, or new tasks can be browsed for on the user’s desktop through our “Open With” dialog.

Plug in custom wxPython GUIs for complex processing or visualization.
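The double-click behavior described above might be sketched as a lookup from a data item to its registered tasks, with a fallback for unknown items. Everything here is hypothetical: the extension-based matching, the function names, and the `choose_tool` callback that stands in for an "Open With" dialog.

```python
import os


def tasks_for_item(filename, registry):
    """Return the tasks registered for a data item, or [] if it is unknown."""
    extension = os.path.splitext(filename)[1].lower()
    return list(registry.get(extension, []))


def open_item(filename, registry, choose_tool):
    """Pick a task for a double-clicked data item.

    Unknown items fall back to choose_tool(filename), a stand-in for an
    "Open With" dialog, and the chosen tool is registered for next time.
    """
    tasks = tasks_for_item(filename, registry)
    if tasks:
        return tasks[0]
    tool = choose_tool(filename)
    extension = os.path.splitext(filename)[1].lower()
    registry.setdefault(extension, []).append(tool)
    return tool


# Hypothetical mapping from file extension to registered task names.
example_registry = {".in": ["STOMP", "Text Editor"], ".plt": ["Tecplot"]}
print(open_item("grid.in", example_registry, lambda name: "Viewer"))
print(open_item("notes.dat", example_registry, lambda name: "Viewer"))
```

Matching on file extension is only one possible rule; a real registry could equally match on metadata or content type.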

Visualize provenance using an interactive graph

Submits remote jobs to UNIX and Linux workstations, Linux clusters, and supercomputers.

Launches multiple jobs concurrently, to multiple machines.

Monitors remote job progress with the option to terminate.

Generates boilerplate text for job submission scripts, which can be customized on a per-machine, per-task basis.
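One way to picture per-machine, per-task script customization together with concurrent launching is sketched below. The template text, the machine and task names, and the caller-supplied `submit` function are all assumptions for illustration; the poster does not show how SALSSA actually generates or submits its scripts.

```python
from concurrent.futures import ThreadPoolExecutor
from string import Template

# Hypothetical default boilerplate; real scripts depend on each machine's scheduler.
DEFAULT_TEMPLATE = Template("#!/bin/sh\ncd $workdir\n$command $input_file\n")

# Hypothetical overrides keyed by (machine, task).
OVERRIDES = {
    ("cluster-a", "STOMP"): Template(
        "#!/bin/sh\ncd $workdir\nmpirun -np $nprocs $command $input_file\n"
    ),
}


def make_script(machine, task, **values):
    """Fill in the template registered for this machine and task, else the default."""
    template = OVERRIDES.get((machine, task), DEFAULT_TEMPLATE)
    return template.substitute(values)


def launch_all(jobs, submit, max_workers=4):
    """Submit every job concurrently and return results in job order.

    submit(machine, script) is supplied by the caller; it stands in for
    whatever remote-execution mechanism is actually used.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(submit, job["machine"], make_script(**job)) for job in jobs
        ]
        return [future.result() for future in futures]


jobs = [
    {"machine": "cluster-a", "task": "STOMP", "workdir": "/tmp/run1",
     "command": "stomp", "input_file": "input", "nprocs": 8},
    {"machine": "workstation-b", "task": "STOMP", "workdir": "/tmp/run2",
     "command": "stomp", "input_file": "input", "nprocs": 1},
]
for machine, script in launch_all(jobs, lambda machine, script: (machine, script)):
    print(machine)
    print(script)
```

Threads suit this kind of work because each submission mostly waits on a remote machine rather than on local computation.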

Computational Models

STOMP: a general-purpose tool for simulating subsurface flow and transport

Smoothed Particle Hydrodynamics (SPH): a Lagrangian particle method for solving systems of partial differential equations

Our user environment enables the integration of these two models.

Summary

Although we support the STOMP and SPH codes directly, we have developed a general user environment that can be applied to any application domain.

We have enjoyed using Python to implement our software. It is great for Rapid Application Development since there is no need to compile it and it has clear, intuitive syntax. It is inherently cross-platform and is Free and Open Source. However, it has its limitations. Errors that are typically caught by compilers in other languages become runtime errors with Python. Its support of the Web Service stack is limited. Lastly, it doesn’t integrate well with languages besides C/C++ and FORTRAN.
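The point about compile-time errors becoming runtime errors can be seen in a small, generic example (not taken from the SALSSA code base): a misspelled variable name goes unnoticed until the branch that contains it actually executes.

```python
def summarize(values, verbose=False):
    total = sum(values)
    if verbose:
        # Typo: 'totl' is undefined, but Python only notices when this line runs.
        print("total:", totl)
    return total


# The faulty branch is skipped, so this call succeeds.
print(summarize([1, 2, 3]))

try:
    summarize([1, 2, 3], verbose=True)
except NameError as error:
    print("caught at run time:", error)
```

A compiler for a statically checked language would typically reject the undefined name before the program ever ran; in Python, tests or static analysis tools have to fill that role.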

Future Work

Graphical timeline view

Registration wizard

Reduce graph complexity

Integration with archival storage systems to support large datasets

Automation of non-interactive task sequences

Higher-level wizards for executing common task sequences

Add annotations to your tasks and mouse over them in the graph to display them.

Automatically transfers all inputs for staging to remote workstations and then stores outputs back to the Alfresco Content Management System.

Stores metadata as Resource Description Framework triples in OpenRDF using Open Provenance Model semantics.
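To make the idea of provenance stored as RDF triples more concrete, here is a minimal sketch that records one task run as N-Triples statements using the Open Provenance Model's "used" and "wasGeneratedBy" edges. The namespaces are placeholders and the helper names are hypothetical; the vocabulary URIs and storage calls SALSSA uses with OpenRDF are not given in the poster.

```python
# Placeholder namespaces; real OPM vocabularies and SALSSA's URIs may differ.
OPM = "http://example.org/opm#"
DATA = "http://example.org/salssa/"


def triple(subject, predicate, obj):
    """Format one N-Triples statement whose three terms are all URIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def record_run(process, inputs, outputs):
    """Describe one task run with OPM-style 'used' and 'wasGeneratedBy' edges."""
    triples = [triple(DATA + process, OPM + "used", DATA + name) for name in inputs]
    triples += [
        triple(DATA + name, OPM + "wasGeneratedBy", DATA + process)
        for name in outputs
    ]
    return triples


for statement in record_run("run1", ["input"], ["plot.dat"]):
    print(statement)
```

Because every edge is a plain subject-predicate-object statement, the same data can back both queries and an interactive provenance graph.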