DIANE – Distributed Analysis Environment
Jakub T. Moscicki, CERN IT/API, [email protected]
CERN Computing Seminar, 11 Sept 2002, CERN IT/API, [email protected]
Distributed Analysis: Motivation
why do we want distributed data analysis?
move processing close to the data: for example, an ntuple job description is ~kB while the data itself is ~MB, GB, TB ...; rather than downloading gigabytes of data, let the remote server do the job
do it in parallel - faster: clusters of cheap PCs
this is the view of the analysis application provider
Computing Models
desktop computing: a personal computing resource; may lack CPU and high-speed access to networked databases, ...
"mainframe" computing: a shared supercomputer in a LAN; expensive and may have scalability problems
cluster computing: a collection of nodes in a LAN; complex and harder to manage
grid computing: a WAN collection of computing elements; even more complex
Cluster Computing at CERN
batch data analysis: e.g. lxbatch, currently in production; workload management system (e.g. LSF); automatic scheduling and load-balancing; batch jobs take hours or days to complete
interactive data analysis: currently on the desktop, will have to be distributed for LHC; tried in the past for ntuple analysis with PIAF (Parallel Interactive Analysis Facility), running copies of PAW on behalf of the user; 8 nodes and tight coupling with the application layer (PAW)
semi-interactive analysis becomes more important: minutes ... hours
HEP public/workgroup clusters
features: many users, many jobs; diverse applications (ntuple analysis, simulation, ...); interactive ... semi-interactive ... batch; ~100s of machines
dynamic environment: users may submit their own analysis code; mixed CPU- and I/O-intensive workloads; some applications may be preconfigured (general analysis, e.g. ntuple projections, or experiment-specific apps); load balancing is important
thanks to Anaphe team
Topology of I/O-Intensive Applications
ntuple analysis is mostly I/O-intensive rather than CPU-intensive
fast DB access from the cluster, slow network from the user to the cluster
very small amount of data exchanged between the tasks in comparison to the "input" data
Parallel Ntuple Analysis
data driven: all workers perform the same task (similar to SPMD); synchronization is quite simple (independent workers); master/worker model
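A minimal sketch of this data-driven master/worker idea (plain Python, not DIANE's actual API; all names are illustrative): every worker runs the same projection task on its own disjoint slice of the ntuple rows, and the master merges the partial histograms.

```python
def project(rows, nbins, lo, hi):
    """Worker task: histogram one column slice (same code on every worker)."""
    hist = [0] * nbins
    width = (hi - lo) / nbins
    for x in rows:
        if lo <= x < hi:
            hist[int((x - lo) / width)] += 1
    return hist

def merge(partials):
    """Master: sum the independent partial histograms."""
    return [sum(bins) for bins in zip(*partials)]

# the master splits the "ntuple" into disjoint row slices, one per worker
data = [i / 10.0 for i in range(100)]          # fake ntuple column, 100 rows
chunks = [data[i::4] for i in range(4)]        # 4 workers, disjoint slices
partials = [project(c, nbins=5, lo=0.0, hi=10.0) for c in chunks]
total = merge(partials)
assert total == project(data, 5, 0.0, 10.0)    # same result as a single scan
print(total)
```

Because the workers never talk to each other, only the split and the merge need any coordination, which is exactly why the synchronization stays simple.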
Simulation in Medical Applications
example: brachytherapy; optimization of treatment planning by MC simulation
features: CPU intensive; few users, few jobs; one preconfigured application; interactive: seconds .. minutes; ~10s of machines
ongoing collaboration with G4 and hospital units in Torino, Italy
thanks to M.G. Pia
Simulation in Space Science
LISA: MC simulation for a gravitational-wave experiment
Bepi Colombo mission: HERMES experiment
features: CPU intensive; big jobs (10 processor-years); preconfigured applications; batch: days; 1000+ machines
requirements: error recovery important; monitoring and diagnostics
thanks to A. Howard
Master/Worker model
the applications share the same computation model, so they also share a big part of the framework code, but they have different non-functional requirements
What DIANE is?
R&D project in IT/API: semi-interactive parallel analysis for LHC
middleware technology evaluation & choice: CORBA, MPI, Condor, LSF, ...; also see how to integrate API products with the GRID
prototyping (focus on ntuple analysis)
time scale and resources: Jan 2001: start (<1 FTE); June 2002: running prototype exists
sample ntuple analysis with Anaphe; event-level parallel Geant4 simulation
What DIANE is?
framework for parallel cluster computation
application-oriented: master-worker model common in HEP applications
application-independent: apps dynamically loaded in a plugin style; callbacks to applications via abstract interfaces
component-based: subsystems and services packaged into component libraries; core architecture uses CORBA and CCM (CORBA Component Model)
integration layer between applications and the GRID environment and deployment tools
What DIANE is not?
DIANE is not: a replacement for a GRID and its services; a hardwired analysis toolkit
DIANE and GRID
DIANE as a GRID computing element: via a gateway that understands Grid/JDL; Grid/JDL must be able to describe parallel jobs/tasks
DIANE as a user of (low-level) Grid services: authentication, security, load balancing, ...; and profit from existing 3rd-party implementations
the python environment is a rapid prototyping platform and may provide a convenient connection between DIANE and the Globus Toolkit via the pyGlobus API
Architecture Overview
layering: abstract middleware interfaces and components; plugin-style application loading
Client Side DIANE
thin client / lightweight XML job description protocol: just create a well-formed job description in XML; send it and read the results back as XML data messages
connection scenarios:
standalone clients: C++, python client apps; explicit connection from a shell prompt; flexibility and choice of command-line tools
clients integrated into an analysis framework: e.g. Lizard/python; hidden connection behind the scenes
Web access: Java-CORBA binding, SOAP (?); universal and easy access
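A sketch of what such a thin-client exchange could look like in python; the element and attribute names here (`job`, `ntuple`, `histogram`, the reply format) are invented for illustration, not DIANE's actual job-description schema.

```python
import xml.etree.ElementTree as ET

# build a well-formed job description (hypothetical schema)
job = ET.Element("job", application="NtupleProjection")
ET.SubElement(job, "ntuple", path="/data/sample.hbook")
ET.SubElement(job, "histogram", column="px", bins="100")
request = ET.tostring(job, encoding="unicode")
print(request)

# the thin client only needs to parse the XML data message that comes back
reply = '<result status="ok"><entries>37000</entries></result>'
root = ET.fromstring(reply)
assert root.get("status") == "ok"
print(root.findtext("entries"))
```

The point of the design is that the client needs nothing beyond an XML library and a transport: no analysis framework has to be installed on the client side.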
Data Exchange Protocol (1)
XDR concept in C++
specify the data format: type and order of data fields
data messages: sender and receiver agree on the format; the message is sent as an opaque object (any); the C++ type may be different on each side
interfaces with flexible data types: e.g. store a list of identifiers (unknown type)
Data Exchange Protocol (2)

class A : public DXP::DataObject
{
public:
  DXP::String name;   // predefined fundamental types
  DXP::Long index;
  DXP::SequenceDataObject<DXP::plain_Double> ratio;
  B b;                // nested complex object

  A(DXP::DataObject *parent)
    : DXP::DataObject(parent), name(this), index(this), ratio(this), b(this) {}
};
Data Exchange Protocol (3)
external streaming supported, e.g.: serialize as CORBA::byte_sequence; serialize to XML (ASCII string); Visitor pattern - new formats are easy
handles opaque objects (any) and typed objects - safe "casts":

DXP::TypedDataObject<A> a1, a2;  // explicit format
DXP::AnyDataObject x = a1;       // opaque object
a2 = x;
if (a2.isValid())                // "cast" successful
Server Side Architecture
CORBA Component Model (CCM): pluggable components & services make a truly component-based system on top of the core architecture
common interface to the service components: difficult due to the different nature of the service implementations
example: load-balancing service: Condor - process migration; LSF - black-box load balancing; custom PULL implementation - active load balancing
but first results show that it is feasible
DIANE & CORBA
CORBA: industry standard (mature and tested); scalable (we need 1000s of nodes and processes); language- and platform-independent (IDL): C, C++, Java, python, ...; many implementations, commercial and open source; directly supports OO and abstract interfaces
CORBA facilities: naming service, trading service, etc.
CORBA Component Model: supports component programming (evolution of OO)
Component Technology
components are not classes!
components are deployment units: they live in libraries, object files and binaries; they interact with the external world only via an abstract interface; total separation from the underlying implementation
classes are source-code organization units: they exist on different design levels and support different semantics: utility classes (e.g. STL vectors or smart pointers), mathematical classes (e.g. HepMatrix), complex domain classes (e.g. FML::Fitter)
but a class may implement a component
OO fails to reuse; component technology might help (hopefully)
Component Technology
component-container idiom: the run-time context is external to the definition of the component; components may be flexibly connected via ports to other components at run-time
[CCM component diagram: a business component exposes an interface and attributes; its ports - facets, receptacles, event sources and event sinks - are grouped into OFFERED and REQUIRED sides]
thanks to P. Merle / OMG
24
Server Side DIANE
CORBA and XML in Practice
inter-operability (shown in the prototype ntuple application)
cross-release (many thanks, XML!): client running Lizard/Anaphe 3.6.6, server running 4.0.0-pre1
cross-language (many thanks, CORBA!): python CORBA client (~30 lines), C++ CORBA server
compact XML data messages: 500 bytes to the server, 22k bytes of XML description from the server; a factor of 10^6 less than the original data (30 MB ntuple)
thin client: no need to run Lizard on the client side, as an alternative use-case scenario
Load balancing service
black-box (e.g. LSF): limited control -> submit jobs (black box); job queues with CPU limits; automatic load balancing and scheduling (task creation and dispatch); prototype: deployed (~10s of workers)
explicit PULL LB: custom daemons; more control -> explicit creation of tasks; load-balancing callbacks into the specific application; prototype: custom PULL load balancing (~10s of workers)
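The PULL scheme can be sketched with a shared task queue: idle workers pull the next task as soon as they finish one, so faster workers automatically end up processing more tasks. A toy standard-library version (not DIANE's daemons; the queue and worker names are illustrative):

```python
import queue
import threading

tasks = queue.Queue()
for t in range(20):                 # master creates 20 independent tasks
    tasks.put(t)

done = []
lock = threading.Lock()

def worker(wid):
    """Pull-style worker: ask for work whenever idle, stop when none is left."""
    while True:
        try:
            t = tasks.get_nowait()
        except queue.Empty:
            return
        with lock:
            done.append((wid, t))   # record which worker processed which task

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for th in threads:
    th.start()
for th in threads:
    th.join()

assert sorted(t for _, t in done) == list(range(20))  # each task done exactly once
```

This is "active" load balancing in the slide's sense: the balancing emerges from workers pulling, with no central scheduler deciding placements up front.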
Dedicated Interactive Cluster (1)
daemons per node; dynamic process allocation
Dedicated Interactive Cluster (2)
daemons per user per node; thread pools, per-user policies
Error Recovery Service
the mechanisms:
daemon control layer: make sure that the core framework processes are alive; periodical ping - needs to be hierarchized to be scalable
worker sandbox: protect from seg-faults in the user applications (memory corruption, exceptions, signals); based on standard Unix mechanisms: child processes and signals
thanks to G. Chwajol
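The worker-sandbox idea, i.e. running user code in a child process so the framework survives a crash and can see which signal killed the worker, can be sketched with standard POSIX semantics (Linux/Unix only; the crashing "user code" is simulated, not real DIANE worker code):

```python
import signal
import subprocess
import sys

# run the (simulated) user code in a separate child process
crashing_code = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
child = subprocess.run([sys.executable, "-c", crashing_code])

# on POSIX, a negative return code means the child died from that signal;
# the parent framework process is untouched and can reschedule the task
if child.returncode < 0:
    print("worker killed by signal", signal.Signals(-child.returncode).name)
else:
    print("worker exited normally with", child.returncode)
```

The key design point is isolation: a memory corruption in the plugin can only take down the child, and the parent learns about it through the ordinary wait status.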
Other Services
interactive data analysis: connection-oriented vs connectionless; monitoring and fault recovery
user environment replication: do not rely on a common filesystem (e.g. AFS)
distribution of application code: binary exchange possible for homogeneous clusters
distribution of local setup data: configuration files, etc.; binary dependencies (shared libraries, etc.)
Optimization
optimizing distributed I/O access to data: clustering of the data in the DB on a per-task basis; depends on the experiment-specific I/O solution
load balancing: the framework does not directly address low-level issues, but the design must be LB-aware: partition the initial data set and assign data chunks to tasks; how big should the chunks be? static or adaptive algorithm? push vs pull model for dispatching tasks, etc.
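One concrete answer to "how big should the chunks be?" is static near-equal partitioning of the row range; a simple illustrative helper (an assumption for this sketch, not DIANE code):

```python
def partition(n_rows, n_chunks):
    """Split row indices 0..n_rows-1 into n_chunks contiguous, near-equal ranges."""
    base, extra = divmod(n_rows, n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)   # spread the remainder evenly
        chunks.append((start, start + size))
        start += size
    return chunks

# e.g. the 37K-row ntuple from the projection example over 6 workers
ranges = partition(37000, 6)
print(ranges)
assert sum(e - s for s, e in ranges) == 37000   # every row assigned exactly once
```

Static chunks like these suit a push model; an adaptive scheme would instead hand out smaller and smaller chunks as workers pull, trading dispatch overhead against tail latency.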
Further Evolution
expect full integration and collaboration with LCG according to their schedule
software evolution and policy: distributed technology (CORBA, RMI, DCOM, sockets, ...); persistency technology (LCG RTAGs -> ODBMS, RDBMS, RIO); programming/scripting languages (C++, Java, python, ...)
evolution of GRID technologies and services: Globus, LCG, DataGrid, CrossGrid (interactive apps), ...
Limitations
model limited to Master/Worker: some particular CPU-intensive applications require more complex, fine-grained synchronization patterns between workers - this is NOT provided by the framework and must be achieved by other means (e.g. MPI)
intra-cluster scope: NOT a global metacomputer; a Grid-enabled gateway to enter the Grid universe; otherwise the framework is independent thanks to abstract interfaces
Similar Projects in HEP
PIAF (historical): using PAW
TOP-C: G4 examples for parallelism at event level
BlueOx: Java, using JAS for analysis; some space for commonality via AIDA
PROOF: based on ROOT
Summary
first prototype ready and working: proof of concept for up to 50 workers; ~1000 workers still needs to be checked
initial deployment: integration with the Lizard analysis tool; Geant4 simulation
active R&D in component architecture
relation to LCG - to be established
That's about it
cern.ch/moscicki/work cern.ch/anaphe aida.freehep.org
Facade for end-user analysis
3 groups of user roles:
developers of distributed analysis applications: brand-new applications, e.g. simulation
advanced users with custom ntuple analysis code: similar to a Lizard Analyzer; execute a custom algorithm on the parallel ntuple scan
interactive users doing the standard projections: just specify the histogram and the ntuple to project
user-friendly means: show only the relevant details; hide the complexity of the underlying system
Ntuple Projection Example
example of semi-interactive analysis
data: 30 MB HBOOK ntuple / 37K rows / 160 columns; time: minutes .. hours
timings:
desktop (400 MHz, 128 MB RAM) - ca. 4 minutes
standalone lxplus (800 MHz, SMP, 512 MB RAM) - ca. 45 sec
6 lxplus workers - ca. 18 sec
why do 6 workers give 18 sec rather than 45/6 ~ 7.5 sec? the job is small, so a big fraction of the time is compilation and dll loading rather than computation; pre-installing the application would improve the speed; caveat: the example runs on AFS and public machines
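The quoted timings are consistent with a simple fixed-overhead model (an assumption fitted to the two measurements, not something measured in the talk): each job pays a constant startup cost o (compilation, dll loading) plus perfectly parallel work w, so o + w = 45 s on one lxplus node and o + w/6 = 18 s on six workers. Solving the two equations:

```python
# two-point fixed-overhead model, fitted to the quoted 45 s / 18 s timings
t1, t6, n = 45.0, 18.0, 6
w = (t1 - t6) * n / (n - 1)   # parallel work: from o + w = t1 and o + w/n = t6
o = t1 - w                    # fixed startup overhead per job
print(f"overhead ~ {o:.1f} s, parallel work ~ {w:.1f} s")
```

Under this model roughly 12-13 s of every run is non-parallelizable startup, which is why pre-installing the application would help far more than adding workers.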