Computing Needs Assessment: Methodology and Practice
Karen Petraska, NASA Office of the CIO
James McCabe, Computing Consultant
August 2011
Outline
Methodology
• Introduction
• Objectives
• Proving and Scaling the Process
• Preliminary Work
Practice
• Interviewed Scientists from the NASA ESM Community
• Whiteboard Sessions
• Use-Case Scenarios
• Describing Workflows
Practice (Cont)
• Process is Providing Valuable Insight
• Found Varying Degrees of Complexity in ES Models
• Generalizing Earth Science Modeling and Analysis
• There are a Number of Interesting Topics that are Emerging
• Future Work
• Appendices
METHODOLOGY
Introduction
Computing needs assessment is similar to a requirements analysis process, and consists of:
• Targeting NASA Mission Communities
• Conducting in-depth interviews and whiteboard sessions with scientists and engineers
• Learning how missions get done, on which IT assets, and what is needed, when and where, throughout scientists’ entire project lifecycles
• Using these data to characterize mission IT workflows, developing use case scenarios and models to express general characteristics
• Recommending refinements to NASA’s IT system to reflect the evolving IT needs of missions
Objectives
• Ensure IT assets are available when needed
• Optimize investments in IT assets for mission purposes
• Balance overall performance in mission computing and analysis
• Identify and address evolving needs of missions for IT
• Develop processes and templates to assess computing needs throughout NASA
• Go beyond “more, bigger, faster” to understand system-wide needs
• Determine what is common across mission groups, as well as what is unique for each group
Our primary working assumption is that the most valuable resource is scientists’ and engineers’ time; whatever we can do to improve their effectiveness will pay off in improved mission results.
Proving and Scaling the Process
• NASA Mission Communities are very large; it is highly unlikely that a few individuals could interview them all
• Therefore, we are working on interview questions and templates that could be used by others, to parallelize this process
• In order to prove this process, we decided to start small with a focus group, the SMD Earth Science Modeling (ESM) Community
• We began in June with preliminary work:
– Development of draft interview questions
– Defining important terms for science IT
– Laying foundation to describe science workflows
• Interviewing scientists throughout July and August
At this point we are prepared to discuss our methodology and some early findings, and we will deliver a full set of results at the December American Geophysical Union meeting.
Preliminary Work
Draft Interview Questions
Categories of Topics Discussed with ESM Scientists:
• Work Patterns and Behaviors
– How, when, and where do scientists receive, generate, transport, analyze, store, and visualize data; how can science be done better
• Computing Job Characteristics
– What their computing and analysis jobs need in order to run, how jobs could be run more effectively, and how systems can be better utilized by them
• Environments and Support
• Enhancements, Evolution, and Revolution
– What can be done from an IT perspective to improve science
• Computing, Analysis, Visualization Characterizations
– What do the various systems that they use look like
Lists of questions associated with these topics are provided in Appendix A.
Preliminary Work
Defining Important Terms
High-End Computing: The application of specialized capabilities and large amounts of processing power (tens of thousands of processors) per job to solve the largest and most complex problems of that period (e.g. Grand Challenge class problems).
Mid-Range Computing: Large computing clusters (on the order of thousands of nodes) that may have some of the specialized capabilities of high-end computing, but are generally smaller and less specialized. As such, they operate earlier in the project lifecycle, on subsets of Grand Challenge class problems.
Low-End Computing: Departmental, project, and individual computing resources, ranging from a few to tens of processors, with little to no specialization.
Preliminary Work
Describing Science Modeling Workflows
• Based on coding models from Biegel and Kepner
• Used as a starting point to begin discussions with scientists
• For example, model for running codes:
Initial Run → Scale and Optimize → Full Run
PRACTICE
Interviewed Scientists from NASA Earth Science Modeling Community
LaRC
• Atmospheric Sciences
GSFC
• Atmospheric Chemistry
• Global Modeling and Assimilation Office (GMAO)
• Global Modeling Initiative (GMI)
• NASA Unified Weather Research and Forecasting (NU-WRF)
ARC
• NASA Earth Exchange (NEX)
JPL
• Ice Sheet System Model (ISSM)
• Estimating the Circulation and Climate of the Ocean (ECCO)
• Carbon Monitoring
• Climate
A list of scientists interviewed to date is provided in Appendix B
Whiteboard Sessions
Using our preliminary work, we held whiteboard sessions with individuals and groups, in which they outlined their science, workflows, and IT assets used
Example: NASA Unified Weather Research and Forecasting (NU-WRF)
LIS: Land Information System
GOCART: Goddard Chemistry Aerosol Radiation and Transport
SDSU: Satellite Data Simulator Unit
RAD: Radiation
Use-Case Scenarios
For each group interviewed, we developed a Use-Case Scenario that describes their:
• Work and science
• Workflows
• IT assets used
• Issues
• Desirables and potential impacts
• Future IT needs
Describing Workflows
When possible, steps in the process of developing science data are captured, along with any IT-related information
Example: Cloud Modeling at LaRC
NCAR: National Center for Atmospheric Research
ASDC: Atmospheric Sciences Data Center
GrADS: Grid Analysis and Display System
C3M: CERES, CALIPSO, CloudSat, MODIS
Process is Providing Valuable Insight
Found Varying Degrees of Complexity in Earth Science Models
Increasingly complex science models stress the performance levels of underlying IT infrastructure
LIS: Land Information System
GOCART: Goddard Chemistry Aerosol Radiation and Transport
SDSU: Satellite Data Simulator Unit
RAD: Radiation
Generalizing Earth Science Modeling and Analysis
Earth Science Model Processing and Analysis
There are a Number of Interesting Topics that are Emerging
• Science data
– Growth in science data set sizes is outpacing our transport capabilities
– Data set sizes are approaching the petabyte (PB) scale
– Possible architectural options include: co-locating resources with data sets, improving transport, or replicating data sets
• Processing queues
– Queue wait times increase time to solution by up to 300%
• Security
– Security implementations reduce end-to-end performance; there is a need to better balance the two
• Support to scientists
– Technical support across project lifecycle critical to scientists’ success
• Environments (Processors, Storage, Communications, Libraries, Compilers)
– As model complexities increase, it’s becoming harder for scientists to integrate across systems
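To give a rough feel for the science data and processing queue topics above, the following Python sketch estimates how long a petabyte-scale data set takes to move and how queue wait inflates time to solution. It is only an illustration under stated assumptions: the function names, the 10 Gb/s link rate, the efficiency factor, and the 10-hour run are hypothetical and do not come from the interviews, and "up to 300%" is read here as queue wait adding up to three times the run time.

```python
SECONDS_PER_DAY = 86_400
PETABYTE = 1e15  # bytes, decimal definition


def transfer_days(dataset_bytes: float, link_bits_per_s: float,
                  efficiency: float = 1.0) -> float:
    """Days needed to move a data set over a link.

    efficiency is the assumed fraction of the nominal link rate
    that is actually sustained end to end (hypothetical parameter).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = dataset_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / SECONDS_PER_DAY


def time_to_solution(run_hours: float, queue_overhead: float) -> float:
    """Run time inflated by a fractional queue overhead (3.0 means +300%)."""
    if queue_overhead < 0:
        raise ValueError("queue_overhead must be non-negative")
    return run_hours * (1 + queue_overhead)


if __name__ == "__main__":
    # Assumed 10 Gb/s link, first at full rate and then at 50% sustained rate.
    print(f"1 PB at 10 Gb/s: {transfer_days(PETABYTE, 10e9):.1f} days")
    print(f"1 PB at 10 Gb/s, 50% efficient: "
          f"{transfer_days(PETABYTE, 10e9, 0.5):.1f} days")
    # Assumed 10-hour job with the worst-case overhead quoted above.
    print(f"10 h job with +300% queue overhead: "
          f"{time_to_solution(10, 3.0):.0f} h")
```

Under these assumptions a single petabyte occupies an ideal 10 Gb/s link for more than nine days, which is one way to see why co-locating resources with data sets or replicating data sets appear among the architectural options.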
Future Work
Near-Term – Completing Assessment of Earth Science Modeling
• Gather and derive more data to improve performance of NASA IT assets for science communities
• Round out scope of NASA ESM Community
– Other SMD ESM groups (Goddard Institute for Space Studies)
– ESM at other agencies (Los Alamos and Oak Ridge National Labs), for calibration purposes
• Compare and contrast results
Results from this work will be presented at the American Geophysical Union meeting in December
Future Work
Longer Term – Expanding Scope to Other Missions
At the end of August we will begin rolling out the assessment process throughout other directorates in NASA
This will consist of:
• Identifying points of contact in each mission to lead this effort
• Conducting workshops to introduce process templates and questions, and teach the assessment process
• Working with leads to facilitate applying process to their subject groups
• Assisting with the processing of data and synthesizing into agency results
BACKUP SLIDES
Appendix A – List of Interview Questions
Work Patterns and Behaviors:
– Work breakdown – categorizing major work components
– Time to solution (intermediate and final)
– How, when, and where do they receive, generate, process, store, and display data
– Interactions with computing facilities and personnel
– Other, non-science skills needed
Job Characteristics:
– Numerical schemes
– Degree of parallelism
– Interprocessor latency
– Internal and external communications
– Characterization of data sets
– Levels of precision
– Provenance and logging
– Use of metadata
– Locality of data
– Level of data sensitivity
– Mission-critical jobs
– Time-critical jobs
Resource Utilization:
– Scheduling of computing resources
– Degree of interactivity with resources
– Billing units and usage
– Issues with using computing resources
Enhancements and Evolution:
– Possible ways to enhance work
– Potential value of enhancements
– Technologies of interest and why
– Desired computing platforms and why
Software, Tools, and Utilities:
– Languages and compilers
– Tools and utilities
– COTS software
– Specialized software
– Software engineering needed
– Customizations to any of the above
Visualization Characterization:
– Locality of visualization resources
– Real-time visualization
Computing Characterization:
– Interconnect type
– Interprocessor communications speed
– Average/Peak processing loads
– Current computing hardware types
– Architectures: shared or single memory
– Number of processors, cores
– Types of processors
– Per-processor and total memory
– Local cache
– Local storage size and type
– Locality of computing resources
– Specialized hardware
– Throughput
– Latency/Delay
– Reliability/Operational Availability
– Job management
Storage Characterization:
– Locality of storage resources
– Quantity and types of storage
Appendix B – List of Scientists Interviewed to Date
LaRC
• Atmospheric Sciences
– Kuan-Man Xu
– Anning Cheng
GSFC
• NU-WRF
– Robert Burns
– Jim Geiger
– Joe Santonello
– Sujay Kumar
– Toshi Matsui
• Atmospheric Chemistry
– Qian Tan
• GMAO
– Michele Rienecker
– Ron Gelaro
– Arlindo de Silva
– Bill Putman
ARC
• NEX
– Rama Nemani
JPL
• ISSM
– Eric Larour
– Mathieu Morlighem
– Helene Seroussi
• ECCO
– Ichiro Fukumori
– Benny Cheng
– Ou Wang
• Carbon Monitoring
– Kevin Bowman
– Robert Ferraro
– Frank Li