TRANSCRIPT
SIAM CSE15, March 15, 2015
Overview: Software Productivity Challenges for Extreme-Scale Science
Lois Curfman McInnes, Mike Heroux, Hans Johansen
In collaboration with IDEAS project members
Outline
- Confluence of trends: complexity!
  - Multiphysics and multiscale simulations
  - Emerging extreme-scale architectures
- Software productivity crisis
  - Recent DOE activities: SWP for extreme-scale science
  - Related community efforts
- Introduction to IDEAS project: Interoperable Design of Extreme-scale Application Software
New architectures provide unprecedented opportunities for new science
[Figure: coupled Earth system model with a coupler to exchange fluxes from one component to another: Land Model, Sea Ice Model, Atmosphere Model, Ice Sheet Model, Ocean Model, and future components (e.g., subsurface)]
Image credits: A. Siegel (ANL); E. Myra (Univ. of Michigan); K. Lee (SLAC); K. Evans (ORNL); A. Hakim (PPPL); E. Kaxiras (Harvard)
“The great frontier of computational physics and engineering is in the challenge posed by high-fidelity simulations of real-world systems … typically characterized by multiple, interacting physical processes (multiphysics), interactions that occur on a wide range of both temporal and spatial scales.” − The Opportunities and Challenges of Exascale Computing, ASCAC Subcommittee on Exascale Computing, R. Rosner (Chair), Office of Science, U.S. Department of Energy, 2010
Multiphysics: more than one component, each governed by its own principle(s) for evolution or equilibrium
Also: a broad class of coarsely partitioned problems possesses similarities to multiphysics problems
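As a toy illustration of this definition (not from the talk), consider two scalar components, each advanced by its own update rule, coupled by a simple first-order operator split; all equations and coefficients below are invented for illustration.

```python
# Toy multiphysics sketch (illustrative only): two "components",
#   du/dt = -a*u + c*v   (component 1, e.g. a reaction process)
#   dv/dt = -b*v + c*u   (component 2, e.g. a relaxation process)
# Each component owns its own update; a naive first-order operator
# split advances them in sequence, exchanging state (the "coupling").

def step_component1(u, v, a, c, dt):
    # Component 1 evolves u with v held frozen.
    return u + dt * (-a * u + c * v)

def step_component2(u, v, b, c, dt):
    # Component 2 evolves v with u held frozen.
    return v + dt * (-b * v + c * u)

def integrate(u, v, a=1.0, b=0.5, c=0.1, dt=1e-3, steps=1000):
    for _ in range(steps):
        u = step_component1(u, v, a, c, dt)   # advance component 1
        v = step_component2(u, v, b, c, dt)   # advance component 2 with updated u
    return u, v

u, v = integrate(1.0, 0.0)  # u decays; v is fed by u through the coupling
```

The splitting order (component 1 before component 2) is itself a modeling choice; for stiff coupling it can dominate accuracy, which is one reason the talk argues coupled codes must be designed deliberately rather than assembled ad hoc.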
Software productivity challenges permeate HPC multiphysics, multiscale applications
Multiphysics simulations sit at the intersection of applications, applied math, and software/hardware. Example application areas: fluid-structure interaction, fission reactor fuel performance, reactor core modeling, crack propagation, fusion, subsurface science and hydrology, climate, radiation hydrodynamics, geodynamics, accelerator design.
IJHPCA special issue on multiphysics simulations, Feb 2013, doi:10.1177/1094342012468181
Flexible multiphysics/multiscale software is essential
We must fundamentally rethink approaches to multiphysics models, algorithms, and solvers with attention to data motion, data structure conversion, and overall application design.
Challenges:
- Enabling the introduction of new models, algorithms, and data structures
- Addressing CS issues for coupled codes, e.g., mapping codes to machine topologies, load balancing, resilience strategies
- Competing goals of software interface stability and software reuse with the ability to innovate algorithmically and develop new physical models
- Composability, sharing methods and code, common infrastructure
“The way you get programmer productivity is by eliminating lines of code you have to write.”
– Steve Jobs, Apple World Wide Developers Conference, Closing Keynote Q&A, 1997
What works well for each component (separately) may not be optimal for coupled codes!
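The composability goal above can be sketched as components sharing a minimal common interface so that a generic driver handles the coupling. This is a hedged illustration only: the interface, component names, and relaxation equations below are invented and are not from any IDEAS library or climate code.

```python
class Component:
    """Minimal common interface (hypothetical) that each physics
    component implements so one generic driver can couple them all."""
    def advance(self, dt, inputs):
        raise NotImplementedError

class Land(Component):
    def __init__(self):
        self.state = {"soil_temp": 280.0}
    def advance(self, dt, inputs):
        # Relax soil temperature toward the air temperature supplied by
        # the atmosphere component (a stand-in for flux exchange).
        air = inputs.get("air_temp", self.state["soil_temp"])
        self.state["soil_temp"] += dt * 0.1 * (air - self.state["soil_temp"])
        return {"soil_temp": self.state["soil_temp"]}

class Atmosphere(Component):
    def __init__(self):
        self.state = {"air_temp": 290.0}
    def advance(self, dt, inputs):
        soil = inputs.get("soil_temp", self.state["air_temp"])
        self.state["air_temp"] += dt * 0.05 * (soil - self.state["air_temp"])
        return {"air_temp": self.state["air_temp"]}

def couple(components, dt, nsteps):
    # Generic driver: gathers each component's outputs and feeds the
    # union to every component on the next step (the "coupler").
    exchanged = {}
    for _ in range(nsteps):
        outputs = {}
        for comp in components:
            outputs.update(comp.advance(dt, exchanged))
        exchanged = outputs  # Jacobi-style exchange: all see last step's state
    return exchanged

state = couple([Land(), Atmosphere()], dt=0.1, nsteps=100)
```

Because the driver depends only on the shared interface, adding a third component requires no new coupling code, which is the sense in which composability "eliminates lines of code you have to write."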
Increasing complexity of extreme-scale architectures
[Plot: Rmax (Gflops/sec) vs. time, 1992-2016, for heavyweight, lightweight, and hybrid systems; trend CAGR = 1.88]
Machine peak flops grow steadily …
[Plot: node processor clock and accelerator core clock (MHz) vs. time, 1992-2016, for heavyweight (H), lightweight (L), and mixed (M) systems]
… but that growth has not come from clock speed, which plateaued more than 10 years ago
Reference: P. Kogge (Notre Dame), John Shalf (LBNL), preprint
[Diagram: evolving node architectures, from multi-socket nodes with DIMM memory and router/I/O sockets to hybrid nodes that pair processor sockets (DIMM/DRAM) with GPU sockets (GDDR)]
Peak performance has come from growing core counts (MPI parallelism) and accelerators (threads) … and with them, more programming difficulties with distributed memory and communication.
Software engineering and HPC: Efficiency vs. other quality metrics
Source: Code Complete, Steve McConnell
Confluence of trends
Fundamental trends:
- Disruptive HW changes: require thorough algorithm/code refactoring
- Demands for coupling: multiphysics, multiscale

Challenges:
- Need for refactorings: really, continuous change
- Modest app development funding: no monolithic apps
- Requirements are unfolding, evolving, not fully known a priori

Opportunities:
- Better design and SW practices & tools are available
- Better SW architectures: toolkits, libraries, frameworks

Basic strategy: focus on productivity
Recent DOE activities: Exploring the crisis in SW productivity for extreme-scale science
- Pre-history: DARPA-HPCS, DOE community meetings, SciDAC, SC, ICSE-CSE, etc.; climate and environment NRC reports, FSP planning
- Summit: Extreme-Scale Application Software Productivity, Feb 2013
- Whitepaper: Extreme-Scale Scientific Application Software Productivity: Harnessing the Full Capability of Extreme-Scale Computing, Sept 2013
- Workshop and report: Software Productivity for Extreme-Scale Science, Jan 2014
- Minisymposium at SIAM PP14: Software Productivity for the Next Generation of Scientific Applications (8 presentations)
Whitepapers and related reading available via http://www.orau.gov/swproductivity2014
Related community efforts
WSSSPE: Working towards Sustainable Software for Science wssspe.researchcomputing.org.uk
Software Carpentry software-carpentry.org
CSE15 Minitutorial: Lab Skills for Scientific Computing, Greg Wilson MT3, MT4: Tues, March 17, 2:15-6:05 pm
Software Engineering for Science www.se4science.org
Software Engineering for HPC in CSE (SEHPCCSE)
Software Engineering for CSE (SECSE)
Computational Science Stack Exchange SciComp.StackExchange.com
Outline
- Confluence of trends: complexity!
  - Multiphysics and multiscale simulations
  - Emerging extreme-scale architectures
- Software productivity crisis
  - Recent DOE activities: SWP for extreme-scale science
  - Related community efforts
- Introduction to IDEAS project: Interoperable Design of Extreme-scale Application Software
Interoperable Design of Extreme-scale Application Software (IDEAS)
Motivation: Enable increased scientific productivity, realizing the potential of extreme-scale computing, through a new interdisciplinary and agile approach to the scientific software ecosystem.

Objectives:
- Address the confluence of trends in hardware and increasing demands for predictive multiscale, multiphysics simulations.
- Respond to the trend of continuous refactoring with efficient agile software engineering methodologies and improved software design.

Approach:
- ASCR/BER partnership ensures delivery of both crosscutting methodologies and metrics, with impact on real applications and programs.
- Interdisciplinary multi-lab team (ANL, LANL, LBNL, LLNL, ORNL, PNNL, SNL)
- ASCR co-leads: Mike Heroux (SNL) and Lois Curfman McInnes (ANL); BER lead: David Moulton (LANL); topic leads: David Bernholdt (ORNL) and Hans Johansen (LBNL)
- Integration and synergistic advances in three communities deliver scientific productivity; outreach establishes a new holistic perspective for the broader scientific community.

Impact on applications & programs: Terrestrial ecosystem use cases tie IDEAS to modeling and simulation goals in two Science Focus Area (SFA) programs and both Next Generation Ecosystem Experiment (NGEE) programs in DOE Biological and Environmental Research (BER).
[Diagram: Software Productivity for Extreme-Scale Science, comprising Methodologies for Software Productivity; Use Cases: Terrestrial Modeling; and the Extreme-Scale Scientific Software Development Kit (xSDK)]
www.ideas-productivity.org
IDEAS project structure and interactions
Context: DOE extreme-scale programs (ASCR Math & CS, Exascale Co-Design, SciDAC, Exascale Roadmap), DOE computing facilities (ALCF, NERSC, OLCF), and BER terrestrial programs (SFAs, CLM, NGEE, ACME)

IDEAS: Interoperable Design of Extreme-scale Application Software
ASCR co-leads: Mike Heroux (SNL) and Lois Curfman McInnes (ANL); BER lead: J. David Moulton (LANL)
Crosscutting lead: Lois Curfman McInnes (ANL)
Executive Advisory Board; DOE program managers: Thomas Ndousse-Fetter (ASCR), David Lesmes (BER)

Use Cases for Terrestrial Modeling
  Lead: J. David Moulton (LANL)
  Members: Tim Scheibe (PNNL), Carl Steefel (LBNL), Glenn Hammond (SNL), Reed Maxwell (CSM), Scott Painter (ORNL), Ethan Coon (LANL), Xiaofan Yang (PNNL)

Extreme-Scale Scientific Software Development Kit
  Lead: Mike Heroux (SNL)
  Members: Jed Brown (ANL), Irina Demeshko (SNL), Kerstin Kleese-Van Dam (PNNL), Sherry Li (LBNL), Vijay Mahadevan (ANL), Daniel Osei-Kuffuor (LLNL), Barry Smith (ANL), Ulrike Yang (LLNL)

Methodologies for Software Productivity
  Lead: Hans Johansen (LBNL)
  Members: Roscoe Bartlett (ORNL), Todd Gamblin* (LLNL), Christos Kartsaklis (ORNL), Pat McCormick (LANL), Sri Hari Krishna Narayanan (ANL), Andrew Salinger* (SNL), Jason Sarich (ANL), Dali Wang (ORNL), Jim Willenbring (SNL)

Outreach and Community
  Lead: David Bernholdt (ORNL)
  Members: Katie Antypas* (NERSC), Lisa Childers* (ALCF), Judy Hill* (OLCF)

* Liaison
Use cases: Multiscale, multiphysics representation of watershed dynamics
Use Case 1: Hydrological and biogeochemical cycling in the Colorado River System
Use Case 2: Thermal hydrology and carbon cycling in tundra at the Barrow Environmental Observatory

Leverage and complement existing SBR and TES programs: LBNL and PNNL SFAs, NGEE Arctic and Tropics

General approach:
- Leverage existing open source application codes
- Improve software development practices
- Targeted refactoring of interfaces, data structures, and key components to facilitate interoperability
- Modernize management of multiphysics integration and multiscale coupling
IDEAS interconnections
- Use cases: drive efforts; traceability from all efforts, but generalized for future efforts
- Methodologies ("HowTo") for SWP
- Metrics: define for all levels of the project; track progress
- Workflows, lifecycles: document and formalize; identify best practices
- xSDK: frameworks + components + libraries; build apps by aggregation and composition
- Outreach: foster communication, adoption, interaction
First of a kind: Focus on software productivity
Extreme-Scale Scientific Software Ecosystem

Extreme-Scale Scientific Software Development Kit (xSDK):
- Libraries: solvers, etc.; interoperable
- Frameworks & tools: doc generators; test, build framework
- SW engineering: productivity tools; models, processes
- Domain components: reacting flow, etc.; reusable

Extreme-scale science applications:
- Documentation content: source markup; embedded examples
- Testing content: unit tests; test fixtures
- Build content: rules; parameters
- Library interfaces: parameter lists; interface adapters; function calls
- Shared data objects: meshes; matrices, vectors
- Native code & data objects: single-use code; coordinated component use; application specific
- Domain component interfaces: data mediator interactions; hierarchical organization; multiscale/multiphysics coupling
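The "testing content" item above (unit tests and test fixtures) can be made concrete with a minimal sketch; the routine under test is invented for illustration and stands in for a library kernel an application would ship tests alongside.

```python
import unittest

def axpy(alpha, x, y):
    # Stand-in "library" routine under test: returns alpha*x + y elementwise.
    return [alpha * xi + yi for xi, yi in zip(x, y)]

class AxpyFixture(unittest.TestCase):
    # Test fixture: shared setup reused by every test case in the class.
    def setUp(self):
        self.x = [1.0, 2.0, 3.0]
        self.y = [4.0, 5.0, 6.0]

    def test_zero_scale_is_identity(self):
        # alpha = 0 must leave y unchanged.
        self.assertEqual(axpy(0.0, self.x, self.y), self.y)

    def test_axpy_values(self):
        self.assertEqual(axpy(2.0, self.x, self.y), [6.0, 9.0, 12.0])

# Run the fixture's tests programmatically, so the sketch is self-contained.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AxpyFixture)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Shipping such fixtures with the code is what lets a build framework rerun them automatically whenever a dependency in the ecosystem changes.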
xSDK focus
Common configure and link capabilities
- Initial emphasis: Chombo, hypre, PETSc, SuperLU, Trilinos
- Approach:
  - Determine a common definition of configure arguments; eliminate namespace collisions
  - Develop an approach that can be adapted by any library development team for a standardized configure/link process
  - Develop testing capabilities to ensure that configure/link processes continue to work

Library interoperability
Designing for performance portability
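One way to picture the common-configure idea is a single shared option set translated into each package's native flags. This is a hedged sketch only: every option name, flag, and package label below is invented for illustration and is not the actual convention of Chombo, hypre, PETSc, SuperLU, or Trilinos.

```python
# Hypothetical sketch: one common, namespaced option set is translated
# into each package's native configure flags, so a single specification
# drives every build and flag-name collisions are handled in one place.

COMMON_OPTIONS = {
    "prefix": "/opt/xsdk",
    "blas_lib": "/usr/lib/libblas.a",
}

# Per-package translation tables (names invented for illustration).
TRANSLATIONS = {
    "libA": {  # an autotools-style package
        "prefix": "--prefix={}",
        "blas_lib": "--with-blas-lib={}",
    },
    "libB": {  # a CMake-style package
        "prefix": "-DCMAKE_INSTALL_PREFIX={}",
        "blas_lib": "-DBLAS_LIBRARIES={}",
    },
}

def configure_args(package, options):
    """Translate common options into one package's native configure flags."""
    table = TRANSLATIONS[package]
    return [table[key].format(value) for key, value in options.items()
            if key in table]

args_a = configure_args("libA", COMMON_OPTIONS)
args_b = configure_args("libB", COMMON_OPTIONS)
```

Keeping the translation tables next to a shared test that exercises every package's configure/link step is one way the "continue to work" requirement above could be automated.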
Agile, iterative, and incremental cycles for extreme-scale science
Issues for extreme-scale software: software engineering, science/library testing, refactoring, performance portability
Methodology relevant for lifecycle needs
Goal: Put steps in place to encourage adoption and reuse of research libraries, and improve the longevity of ASCR software investments through refactoring and componentization.
[Plot: importance vs. HPC software maturity, from initial concept to broadest adoption]
Outreach and community
Begin changing the way computational and domain science communities think about software development.
- Training: bring practical information about techniques and tools to software developers, plus advice on tailoring; target BER and the broader DOE (via SciDAC, INCITE, ALCC, NNSA, ACTT, etc.) with 2 trainings per year.
- Community development: online tools to facilitate conversation about productivity issues and solutions; incrementally build and refine the community via outreach to different programs and offices.
- Leverage computing facility liaisons and the team's existing relationships with other programs for outreach.
Better software productivity is essential for extreme-scale CSE
Better SW productivity can give us better, faster, and cheaper:
- Better: science, portability, robustness, composability
- Faster: execution, development, dissemination
- Cheaper: fewer staff hours and lines of code
IDEAS project:
- Enabling production of high-quality science results, rapidly and efficiently
  - Multiscale terrestrial ecosystem science
  - Broadly: DOE extreme-scale scientific apps
- Delivering a first-of-a-kind extreme-scale scientific software ecosystem
  - xSDK
  - SWP methodologies ("HowTo")
  - Outreach and community
- An essential mechanism for progress in a time of disruptive change and in the presence of multiple design tradeoffs
Coming up …
1:55-2:15 Software Productivity Application: Integrated Modeling for Fusion Energy David Bernholdt
2:20-2:40 Software Productivity Challenges in Environmental Applications David Moulton
2:45-3:05 Software Productivity Community Input: Concerns and Priorities Jeff Carver and Mike Heroux