TRANSCRIPT
SDM SPA/Utah AHM/Mar05– NC State 1
On Large Data-Flow Scientific Workflows: An Astrophysics Case Study
Integration of Heterogeneous Datasets Using Scientific Workflow Engineering
Presenter: Mladen A. Vouk
SDM SPA/Utah AHM/Mar05– NC State 2
Team (Scientific Process Automation - SPA)
Sangeeta Bhagwanani (MS student - GUI interfaces)
John Blondin (NCSU Faculty, TSI PI)
Zhengang Cheng (PhD student - services, V&V)
Dan Colonnese (MS student, graduated - workflow grid and reliability issues)
Ruben Lobo (PhD student - packaging)
Pierre Moualem (MS student - fault-tolerance)
Jason Kekas (PhD student - Technical Support)
Phoemphun Oothongsap (NCSU Postdoc - high-throughput flows)
Elliot Peele (NCSU - Technical Support)
Mladen A. Vouk (NCSU faculty - SPA PI)
Brent Marinello (NCSU - workflow extensions)
Others …
SDM SPA/Utah AHM/Mar05– NC State 3
NC State researchers are simulating the death of a massive star leading to a supernova explosion. Of particular interest is the dynamics of the shock wave generated by the initial implosion of the star which ultimately destroys the star as a highly energetic supernova.
SDM SPA/Utah AHM/Mar05– NC State 4
Key Current Task
Emulating “live” workflows
SDM SPA/Utah AHM/Mar05– NC State 5
Key Issue
It is very important to distinguish between a custom-made workflow solution and a more canonical set of operations, methods, and solutions that can be composed into a scientific workflow.
The trade-offs are complexity, the skill level needed to implement, usability, maintainability, and “standardization”: e.g., sort, uniq, grep, ftp, and ssh on Unix boxes vs. SAS (which can do sorting), a home-made sort, SABUL, bbcp (free, but not standard), etc.
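As a minimal, purely illustrative sketch of what composing canonical operations means in practice (the input file and the remote host below are hypothetical), a one-off analyze-and-ship step can be built entirely from standard Unix pieces:

    #!/bin/sh
    # Illustrative only: input file and remote host are hypothetical.
    # Every operation here is a standard, composable Unix tool.
    grep "shock" run01.log | sort | uniq -c | sort -rn > shock_summary.txt
    scp shock_summary.txt analyst@remote.example.edu:/data/summaries/

A site-specific custom tool could do the same job, but it would be harder for the next group to reuse, verify, or maintain.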
SDM SPA/Utah AHM/Mar05– NC State 6
Topic – Computational Astrophysics
Dr. Blondin is carrying out research in the field of circumstellar gas dynamics. The numerical hydrodynamical code VH-1 is used on supercomputers to study a vast array of objects observed by astronomers both from ground-based observatories and from orbiting satellites. The two primary subjects under investigation are interacting binary stars (including normal stars like the Algol binary and compact-object systems like the high-mass X-ray binary SMC X-1) and supernova remnants (from very young, like SNR 1987A, to older remnants like the Cygnus Loop).
Other astrophysical processes of current interest include radiatively driven winds from hot stars, the interaction of stellar winds with the interstellar medium, the stability of radiative shockwaves, the propagation of jets from young stellar objects, and the formation of globular clusters.
SDM SPA/Utah AHM/Mar05– NC State 7
[Diagram: end-to-end data flow]
Input Data -> Highly Parallel Compute -> Output (~500 x 500 files) -> Aggregate to ~500 files (< 10+ GB each) -> HPSS archive / Data Depot / Logistical Network (L-Bone) -> Local Mass Storage (14+ TB) -> Aggregate to one file (~1 TB each) -> Viz Wall / Viz Client (local 44-processor data cluster; data sits on local nodes for weeks) -> Viz Software
SDM SPA/Utah AHM/Mar05– NC State 8
Workflow - Abstraction
[Diagram: workflow abstraction]
Compute side: Model (parallel computation) -> Merge & Backup -> Send Data, through Head Node Services and Mass Storage (Fibre Channel or local NFS)
Data Mover Channel (e.g., LORS, bbcp, SABUL, FC over SONET)
Visualization side: Recv Data -> Split & Viz -> Parallel Visualization -> to Viz Wall, through Head Node Services
Top-level steps: Model, Merge, Backup, Move, Split, Viz
Control (Web or Client GUI / Web Services): Construct, Orchestrate, Monitor/Steer, Change, Stop/Start
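A minimal head-node sketch of the Model -> Merge & Backup -> Move -> Split & Viz chain named in the diagram (the helper script names, hosts, and paths are assumptions for illustration, not the actual SPA implementation; only bsub, bbcp, and ssh are real commands):

    #!/bin/sh
    # Illustrative sketch only: run_vh1.sh, merge_step.sh, backup_step.sh,
    # split_for_viz.sh, start_viz.sh, and viz.example.edu are hypothetical.
    set -e
    RUN=run042
    STEP=$1                                    # time-slice index from the orchestrator

    # Model: run the parallel computation through LSF and wait for it to finish (-K).
    bsub -K -n 140 -J "vh1_${RUN}_${STEP}" ./run_vh1.sh "$RUN" "$STEP"

    # Merge & Backup: combine the per-processor files into one netCDF file, then archive it.
    ./merge_step.sh "$RUN" "$STEP"             # produces ${RUN}_${STEP}.nc
    ./backup_step.sh "${RUN}_${STEP}.nc"       # e.g., copy to mass storage / HPSS

    # Move: ship the merged file to the visualization site over the data mover channel.
    bbcp -s 8 "${RUN}_${STEP}.nc" viz.example.edu:/data/incoming/

    # Split & Viz: split the file for the parallel viz cluster and start visualization.
    ssh viz.example.edu "./split_for_viz.sh ${RUN}_${STEP}.nc && ./start_viz.sh ${RUN}_${STEP}"

In the abstraction above, the Construct/Orchestrate, Monitor/Steer, and Stop/Start controls would wrap around a script like this through the web-service layer rather than live inside it.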
SDM SPA/Utah AHM/Mar05– NC State 12
Current and Future Bottlenecks
[Chart: elapsed hours (0-350) for 10, 100, and 500 time slices, serial vs. parallel, broken down into Job Wait, Run, Merge, MT, Transfer, Viz split, and Viz run]
Computing Resources and Computational Speed (1000+ Cray X1 processors, compute times of 30+ hrs, wait time)
Storage and Disks (14+ TB, reliable and sustainable transfer speeds of 300+ MB/s)
Automation
Reliable and Sustainable Network Transfer Rates (300+ MB/s)
SDM SPA/Utah AHM/Mar05– NC State 13
Bottlenecks (B-specific): Supercomputer, Storage, HPSS, EnSight Memory
Average per-job wait time is 24-48 hrs (it could be longer if more processors are requested or more time slices are calculated).
One run of about 6 hrs (run time) on the Cray X1 currently uses 140 processors and produces 10 time steps. Each time step consists of 140 Fortran binary files (28 GB total), so one 6-hr run currently produces about 280 GB. A full visualization takes about 300 to 500 time slices (30 to 50 runs), i.e., about 28 GB x (300 to 500) = roughly 10 to 14 TB of space.
The 140 files of a time step are merged into one netCDF file (this takes about 10 min). BBCP moves the file to NCSU at about 30 MB/s, or about 15 min per time slice (this can be done in parallel with the next time-slice computation). In the future, network transfer speeds and disk access speeds may become an issue.
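A back-of-the-envelope check of those numbers, using nothing beyond the figures quoted on this slide:

    #!/bin/sh
    # Sanity-check the sizes and times quoted above.
    awk 'BEGIN {
      step_gb  = 28                        # one time step: 140 files, 28 GB total
      run_gb   = 10 * step_gb              # 10 time steps per 6-hr run  -> 280 GB
      full_tb  = 500 * step_gb / 1000      # ~500 slices for a full viz  -> ~14 TB
      xfer_min = step_gb * 1000 / 30 / 60  # 28 GB at ~30 MB/s           -> ~16 min
      printf "run: %d GB, full viz: about %.0f TB, transfer: about %.0f min/slice\n", run_gb, full_tb, xfer_min
    }'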
SDM SPA/Utah AHM/Mar05– NC State 14
B-specific Top-Level W/F Operations
Operators: Create W/F (reserve resources), Run Model, Backup Output, PostProcess Output (e.g., Merge, Split), MoveData, AnalyzeData (Viz, other?), Monitor Progress (state, audit, backtrack, errors, provenance), Modify Parameters
States: Modeling, Backup, Postprocessing (A, ..., Z), MovingData, Analyzing Remotely
Creators: CreateWF, Model?, Expand
Modifiers: Merge, Split, Move, Backup, Start, Stop, ModifyParameters
Behaviors: Monitor, Audit, Visualize, Error/Exception Handling, Data Provenance, …
SDM SPA/Utah AHM/Mar05– NC State 15
Goal: Ubiquitous Canonical Operations for Scientific W/F
Support:
Fast data transfer from A to B (e.g., LORS, SABUL, GridFTP, BBCP?, other …)
Database access
Stream merging and splitting
Flow monitoring
Tracking, auditing, provenance
Verification and Validation
Communication service (web services, grid services, xmlrpc, etc.)
Other …
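One way such canonical operations could eventually be packaged is as a small library of interchangeable wrappers (a hedged sketch; the wrapper names and the MOVER switch are hypothetical, and only bbcp, scp, and date are real commands):

    #!/bin/sh
    # Hypothetical 'canonical operation' wrappers; the point is that the
    # underlying tool is swappable without changing the workflow that calls them.

    move_data() {   # move_data <source> <host:destination>
        case "${MOVER:-bbcp}" in
            bbcp) bbcp -s 8 "$1" "$2" ;;   # multi-stream transfer
            *)    scp "$1" "$2" ;;         # plain ssh copy as a fallback
        esac
        # LORS, SABUL, or GridFTP endpoints could slot in here the same way.
    }

    log_provenance() {  # log_provenance <operation> <status>
        echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $1 $2" >> workflow_audit.log
    }

    move_data merged_step.nc viz.example.edu:/data/incoming/ && log_provenance move_data ok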
SDM SPA/Utah AHM/Mar05– NC State 16
Issues (1)
Communication Coupling (loose, tight, very tight, code-level) and Granularity (fine, medium?, coarse)
Communication Methods (e.g., ssh tunnels, xmlrpc, snmp, web/grid services, etc.) - e.g., apparently poor support for the Cray
Storage issues (e.g., p-netcdf support, bandwidth)
Direct and Indirect Data Flows (functionality, throughput, delays, other QoS parameters)
End-to-end performance
Level of abstraction
Workflow description language(s) and exchange issues - interoperability
“Standard” scientific computing “W/F functions”
SDM SPA/Utah AHM/Mar05– NC State 17
Issues (2)
The problem is currently similar to old-time punched-card job submissions (long turn-around time, can be expensive due to a front-end computational-resource I/O bottleneck) - up-front verification and validation is needed; things will change
Back-end bottleneck due to hierarchical storage issues (e.g., retrieval from HPSS)
Long-term workflow state preservation - needed
Recovery (transfers, other failures) - more is needed
Tracking of data and files - needed
Who maintains equipment, storage, data, scripts, and workflow elements? Elegant solutions may not be good solutions from the perspective of autonomy.
EXTREMELY IMPORTANT!!! We are trying to get out of the business of totally custom-made solutions.
SDM SPA/Utah AHM/Mar05– NC State 18
Workflow - Abstraction
[Diagram: the same workflow abstraction as above (Model -> Merge & Backup -> Data Mover Channel (e.g., LORS, SABUL, FC over SONET) -> Split & Viz -> Parallel Visualization, with the Web Services control layer), now annotated with end-to-end performance goals.]
Goal: 2-3 Gbps transfer rates end-to-end
Goal: 1 TB per night
SDM SPA/Utah AHM/Mar05– NC State 19
Communications
Web/Java-based GUI
Web Services for orchestration - overall and for less-than-tightly-coupled sub-workflows
LSF and MPI for parallel computation
Scripts (in this example csh/sh based; could be Perl, Python, etc.) on local machines - interpreted language
High-level programming language for simulations, complex data-movement algorithms, and similar - compiled language
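As a small, hedged example of the LSF/MPI layer (queue name, processor count, input deck, and executable name are assumptions; on the actual Cray X1 the parallel launch command would differ):

    #!/bin/sh
    # Hypothetical LSF submission of the MPI simulation from the script layer.
    bsub -q batch -n 140 -i input.deck -o vh1.%J.out -e vh1.%J.err mpirun -np 140 ./vh1

The web-service orchestration sits above a submission like this, while the compiled simulation code (VH-1) and any complex data-movement code sit below it.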
SDM SPA/Utah AHM/Mar05– NC State 21