parallel tomography shava smallen cse dept. u.c. san diego
TRANSCRIPT
Parallel Tomography
Shava Smallen
CSE Dept.
U.C. San Diego
Parallel Tomography
Tomography: reconstruction of 3D image from 2D projections
Used for electron microscopy at National Center for Microscopy and Imaging Research (NCMIR)
Coarse-grain, embarrassingly parallel application
Goal of project: Achieve performance using AppLeS scheduling and Globus services
Parallel Tomography Structure
driver
reader
ptomo
ptomo
ptomo
writer
preprocessor
destsource
Off-line
Solid lines = data flowdashed lines = control
Parallel Tomography Structure
driver
reader
ptomo
ptomo
ptomo
writer
preprocessor
destsource
On-line
Solid lines = data flowdashed lines = control
Parallel Tomography Structure
driver: directs communication among processes, controls work queue
preprocessor: serially formats raw data from microscope for parallel processing
reader: reads preprocessed data and passes it to a ptomo process
ptomo: performs tomography processing writer: writes processed data to destination (i.e.
visualization device, tape)
NCMIR Environment Platform: collection of heterogeneous resources
workstations (e.g. sgi, sun) NPACI supercomputer time
(e.g. SP-2, T3E)
How can users achieve execution performance in heterogeneous, multi-user environments?
Performance and Scheduling
Users want to achieve minimum turnaround time in the following scenarios: off-line: already collected data set on-line: data streamed from electron microscope
Goal of our project is to develop adaptive scheduling strategies which promote both performance and flexibility for tomography in multi-user Globus environments
Adaptive Scheduling strategy Develop schedule which adapts to deliverable resource
performance at execution time Application scheduler will dynamically
select sets of resources based on user-defined performance measure
plan possible schedules for each set of feasible resources
predict the performance for each schedule implement best predicted schedule on selected
infrastructure
AppLeS = Application-Level Scheduling Each AppLeS is an adaptive application scheduler
AppLeS+application = self-scheduling application
scheduling decisions based on dynamic information (i.e. resource load forecasts) static application and system information
Resource Selection available resources
workstations run immediately execution may be slow due to load
supercomputers may have to wait in a queue execution fast on dedicated nodes
We want to schedule using both types of resources together for an improved execution performance
Allocation Strategy We have developed a strategy which
simultaneously schedules on both workstations immediately available supercomputer nodes
avoid wait time in the batch queue information is exported by batch scheduler
Overall, this strategy performs better than running on either type of resource alone
Preliminary Globus/AppLeS Tomography Experiments
Resources 6 workstations available at Parallel
Computation Laboratory (PCL) at UCSD immediately available nodes on SDSC
SP-2 (128 nodes) Maui scheduler exports the number of
immediately available nodes• e.g. 5 nodes available for the next 30 mins
10 nodes available for the next 10 mins
Globus installed everywhere
W
W
W W
W
WNCMIR
SDSC
SP-2
W
W
W
W W
W
PCL
Allocation Strategies compared 4 strategies compared:
SP2Immed/WS: workstations and immediately available SP-2 nodes
WS: workstations only SP2Immed: immediately available SP-2 nodes only SP2Queue(n): traditional batch queue submit using n nodes
experiments performed in production environment ran experiments in sets, each set contains all strategies
e.g. SP2Immed, SP2Immed/WS, WS, SP2Queue(8) within a set, experiments ran back-to-back
Results (8 nodes on SP-2)
Results (8 nodes on SP-2)
Results (16 nodes on SP-2)
Results (16 nodes on SP-2)
Results (32 nodes on SP-2)
Results (32 nodes on SP-2)
Targeting Globus
AppLeS uses Globus services GRAM, GSI, & RSL
process startup on workstations and batch queue systems
remote process control Nexus
interprocess communication multi-threaded callback functions for fault tolerance
Experiences with Globus
What we would like to see improved: free node information in MDS (e.g. time availability,
frequency) steep learning curve to initial installation
knowledge of underlying Globus infrastructure startup scripts
more documentation more flexible configuration
Experiences with Globus
What worked: once installed, software works well responsiveness and willingness to help from Globus
team mailing list web pages
Future work
Develop contention model to address network overloading which includes network capacity information model of application communication requirements
Expansion of allocation policy to include additional resources other Globus supercomputers, reserved resources (GARA) availability of resources
NPACI Alpha project with NCMIR and Globus
AppLeS
AppLeS/NCMIR/Globus Tomography Project Shava Smallen, Jim Hayes, Fran Berman, Rich
Wolski, Walfredo Cirne (AppLeS) Mark Ellisman, Marty Hadida-Hassan, Jaime Frey
(NCMIR), Carl Kesselman, Mei-Hui Su (Globus)
AppLeS Home Page www-cse.ucsd.edu/groups/hpcl/apples