Experiences with the Globus Toolkit on AIX and deploying the Large Scale Air Pollution Model as a grid service
Ashish Thandavan
Advanced Computing and Emerging Technologies Centre
Outline
– eMinerals project
– OGSA Testbed project
– Air Pollution Model
– Air Pollution Model as a grid service
– GT3 on the IBM
– Air Pollution Model as grid service (again)
– Current status & Future work
Our IBM pSeries parallel computer
Four IBM pSeries p655 nodes, each with
– Eight 1.5 GHz POWER4 processors
– 16 GB RAM
– 2 x 36 GB internal storage
Private Gigabit ethernet
AIX 5.2 and SLES 8.0 (dual boot)
IBM C, C++ & Fortran compilers, IBM’s POE & LoadLeveler scheduler
Environment from the Molecular Level
Using molecular simulation techniques to investigate fundamental problems associated with key environmental issues such as nuclear waste storage, pollution & weathering
Researchers & resources from 7 UK institutions
Funded by NERC
www.eminerals.org
OGSA Testbed
Test and evaluate the first implementation of the OGSA core by installing the GT3 toolkit on a service-based grid testbed spanning organisational boundaries
Deploy a set of applications on this testbed as grid services & document experiences
5 project partners & 2 associate members
Funded by EPSRC
dsg.port.ac.uk/projects/ogsa-testbed/
Applications from Reading
Phylogenetic Tree Construction
– Sequential & parallel (MPI) versions in C
– Runs on clusters & on IBM machine
Large Scale Air Pollution Model
– Only parallel implementation (MPI) in Fortran 77
– Runs only on IBM machine
Air Pollution Model
Developed by the Danish National Environmental Research Institute (NERI)
Mathematically represents interaction of physical & chemical processes:
– Horizontal transport
– Horizontal diffusion
– Chemical transformations due to emissions
– Deposition of pollutants to surface
– Vertical exchange
Air Pollution Model
[Diagram: Physical and Chemical Processes -> System of Partial Differential Equations -> Splitting procedure]
Sub-models obtained by splitting:
– Horizontal Transport
– Horizontal Diffusion
– Chemistry & Emissions
– Deposition
– Vertical Exchange
Grouped as:
– Horizontal Transport & Horizontal Diffusion
– Vertical Exchange
– Chemical Reactions & Emissions & Deposition
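The splitting idea can be illustrated on a toy problem. The sketch below is not the NERI model: it assumes a 1-D periodic grid, first-order upwind advection for transport, explicit central differences for diffusion, and a linear decay term standing in for chemistry and deposition. All function names and parameter values are hypothetical.

```python
def transport(c, courant):
    """First-order upwind advection on a periodic grid (0 <= courant <= 1)."""
    n = len(c)
    return [c[i] - courant * (c[i] - c[i - 1]) for i in range(n)]


def diffusion(c, d):
    """Explicit central-difference diffusion on a periodic grid (d <= 0.5)."""
    n = len(c)
    return [c[i] + d * (c[(i + 1) % n] - 2.0 * c[i] + c[i - 1]) for i in range(n)]


def chemistry(c, k_dt):
    """Linear decay, a stand-in for chemistry and deposition (k_dt < 1)."""
    return [(1.0 - k_dt) * v for v in c]


def split_step(c, courant=0.5, d=0.25, k_dt=0.1):
    """Advance one time step by applying the sub-models one after another."""
    for sub_model in (lambda x: transport(x, courant),
                      lambda x: diffusion(x, d),
                      lambda x: chemistry(x, k_dt)):
        c = sub_model(c)
    return c


field = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
after = split_step(field)
```

In this toy setting the transport and diffusion steps conserve total mass, so only the decay step changes the sum of the field. Each sub-model can be swapped or parallelised on its own, which is the practical appeal of splitting.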
Running the Air Pollution Model
Hundreds of Fortran 77 programs organised logically into sub-directories (each a module)
Makefile provided; compiler xlf_r is used
A ‘main’ directory contains executable & configuration file (with initial input params)
LoadLeveler job submission file calls poe with appropriate parameters
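As a rough illustration of the last point, the sketch below renders a minimal LoadLeveler job command file that starts an executable under poe. The slides do not show the actual file: the job name, executable name, output file names and the node and task counts (chosen here to match four nodes with eight processors each) are assumptions, and the poe parameters the real file passes are not reproduced.

```python
def render_job_file(executable, nodes=4, tasks_per_node=8, job_name="apm_run"):
    """Return the text of a LoadLeveler job command file that starts
    `executable` under poe. All values are illustrative placeholders."""
    lines = [
        "#!/bin/sh",
        f"# @ job_name = {job_name}",
        "# @ job_type = parallel",
        f"# @ node = {nodes}",
        f"# @ tasks_per_node = {tasks_per_node}",
        "# @ output = $(job_name).$(jobid).out",
        "# @ error = $(job_name).$(jobid).err",
        "# @ queue",
        f"poe {executable}",  # real poe parameters are not given in the slides
    ]
    return "\n".join(lines) + "\n"


job_text = render_job_file("./apm")  # "./apm" is a hypothetical executable name
```

A file like this would typically be handed to the scheduler with `llsubmit`.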
Air Pollution Model as Grid Service
IBM machine has GT3 installed
1. APM available as a grid service that clients talk to
2. APMGS invokes MMJFS
3. MMJFS submits job to LoadLeveler
… ideal scenario!!
[Diagram: three clients connect over the Internet to the IBM Resource]
GT3 on the IBM machine
A non-starter…
– GT 3.0.2: didn’t build successfully
– GT 3.2beta: didn’t build successfully
– GT 3.2.1: didn’t build successfully
Tried different compilers (xlc, gcc) & ‘flavors’ (<g|vendor>cc<32|64>[dbg][pthr])
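To make the flavor pattern concrete, the short sketch below expands `<g|vendor>cc<32|64>[dbg][pthr]` into every name it can produce. It only enumerates the names; whether a given flavor is available or builds on AIX is not something the pattern tells you.

```python
from itertools import product


def flavor_names():
    """Expand <g|vendor>cc<32|64>[dbg][pthr] into every flavor name."""
    return [
        f"{compiler}cc{bits}{dbg}{pthr}"
        for compiler, bits, dbg, pthr in product(
            ("g", "vendor"), ("32", "64"), ("", "dbg"), ("", "pthr")
        )
    ]


flavors = flavor_names()
```

The pattern yields 16 names, from `gcc32` through `vendorcc64dbgpthr`.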
GT3 on the IBM machine
NSF Middleware Initiative provided pre-compiled binaries of GT3.2.1 for AIX 5.2
Release notes of latest version (5.1) confirmed problems with
– globus_grim (due to which container cannot be started) on AIX
– MMJFS on AIX
– Other minor issues
Possible solution
GT3 installed on intermediary
Accepts client requests
Performs authentication & authorisation
Mediates job requests between client and IBM resource
[Diagram: three clients connect over the Internet to the Globus gatekeeper, which sits in front of the IBM Resource]
Air Pollution Model as a grid service: job submit, cancel, hold, release, etc.
[Diagram: Client, Globus Machine and IBM Resource connected over the Local LAN; interactions numbered 1 to 8, with components labelled A and B]
Air Pollution Model as a grid service: job and system status monitoring
[Diagram: Client, Globus Machine and IBM Resource connected over the Local LAN; interactions numbered 1 to 13, with components labelled A to D]
Constituent parts
Three parts to this solution
1. Grid Service
2. Server-resource communication
– Job submission, hold, release, cancel
– Job & system status
3. Client Interface
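One way the server-resource part could translate requests into scheduler actions is sketched below. The slides do not describe the message format or the commands run on the IBM resource, so both are assumptions here: requests are taken to be small JSON objects with hypothetical `op` and `target` fields, and operations are mapped onto the standard LoadLeveler command-line tools. The function only builds an argument list and runs nothing.

```python
import json

# Assumed mapping from a request's "op" field to a LoadLeveler command line.
COMMANDS = {
    "submit": ["llsubmit"],
    "cancel": ["llcancel"],
    "hold": ["llhold"],
    "release": ["llhold", "-r"],
    "job_status": ["llq"],
    "system_status": ["llstatus"],
}


def build_command(request_text):
    """Translate a JSON request (hypothetical format) into an argv list."""
    request = json.loads(request_text)
    op = request["op"]
    if op not in COMMANDS:
        raise ValueError(f"unsupported operation: {op}")
    argv = list(COMMANDS[op])
    if "target" in request:  # job command file for submit, job id otherwise
        argv.append(request["target"])
    return argv


# "node1.42.0" is a made-up job identifier used only for illustration.
argv = build_command('{"op": "release", "target": "node1.42.0"}')
```

Restricting requests to a fixed table of operations keeps arbitrary client input from reaching a shell; a real service would also need to validate `target` before use.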
Current status
Resource communication
– Job submission of simple commands implemented
– Job & system status under development
Grid service part partially implemented
Command line client interface (developed & used mostly for testing the other parts)
Future Work
Generic RemoteResourceJobService
– will support different schedulers like Condor, PBS, SGE, LSF, etc.
WSRF implementation of the grid service
Option to encrypt server-resource traffic
Client interface: portal & application versions
XML messaging, more robust