LCG /AA/ROOT 1
Proposal for Improvements
Rene Brun, 14 January 2004
LCG/AA/ROOT Relationship
Some slides of this talk were presented at the Architects Forum 30 October 2003
LCG /AA/ROOT relationship 2
Applications Area Organisation
[Organisation chart. Boxes: Applications manager, Architects forum, Applications area meeting; Simulation, PI, SEAL, POOL and SPI projects; ROOT. Link labels: decisions, strategy, consultation, user-provider.]
The ‘user/provider’ relationship is working
Good ROOT/POOL cooperation. POOL gets the modifications it needs, ROOT gets debugging/development input
ROOT will be the principal analysis tool; full access to its capability is a blueprint requirement
ALICE directly using ROOT as their framework
Torre
LCG /AA/ROOT relationship 3
User/Provider relationship
• It works in the sense that teams did not show unwillingness to cooperate.
• The cooperation is ambiguous. The wheel is reinvented many times.
• The duplication of efforts will give problems in the near future (dictionaries, plug-in managers, collections and many more (coming slides))
• Manpower would be better used in improving critical areas.
• Alice has not joined the train.
LCG /AA/ROOT relationship 4
User/Provider relationship
• The current orientation is OK if the idea is to use ROOT as a back-end in a few places and alternative solutions are seriously considered with clear deliverables.
• If ROOT is the choice for the two essential areas, event storage and interactive data analysis, this has important implications.
• In this case the user/provider relationship is not appropriate:
  • ROOT must be better integrated in the LCG. This has implications for the LCG/AA plans and also for the ROOT planning.
LCG /AA/ROOT relationship 5
Motivation for this presentation
• We have two options in front of us:
  • Continue the current process, assuming that everything is OK in the best of all worlds. ROOT is happy, LCG/AA is happy.
  • Take advantage of the useful internal review to rethink the general orientation.
• We have a unique opportunity now, with enough experience with all the projects, to take the necessary actions to decrease the entropy in the interest of the LHC and also non-LHC users.
• We must capitalize on one year of useful experience in AA to set up a convergent and coherent process.
LCG /AA/ROOT relationship 6
MAIN Motivation
Make it simpler for our users
The current system is too complex
Far too many layers
LCG /AA/ROOT relationship 7
Plan of talk
• In the following slides, I review the main projects (POOL, SEAL, SIMU, PI and ARDA) with a proposal for a better integration with ROOT.
• I start with a few slides indicating where we are with ROOT. Our current developments are relevant to the LCG work.
SEAL: single dictionary, plug-in manager, mathlibs
POOL: collections, performance, goals
SIMU: VMC, geometry and geometries interfaces
PI: what next?
ARDA: Distributed Analysis and ROOT/PROOF
SPI: using/moving to the infrastructure
LCG /AA/ROOT 1
ROOT status
• Version 3.05/07 released in July 2003
• Version 3.10/02 released in December
• Working on major new version 4.0
LCG /AA/ROOT relationship 9
ROOT version 4 Highlights (1)
• Support for automatic schema evolution for foreign classes without using a class version number.
• Support for large files (> 2 GBytes).
• New data type Double32_t (double in memory, saved as float on the file).
• Native support for STL: automatic Streamers, no code generation anymore.
• Tree split mode with STL vector/list.
• Plug-in Manager upgrade (rlibmap) with automatic library/class discovery/load.
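The Double32_t item above (double in memory, float on the file) can be illustrated with a minimal C++ sketch. This is not ROOT's streamer code: the helper names `round_trip_as_float` and `relative_error` are made up for illustration, and the sketch only shows the precision trade-off of narrowing a double to a 32-bit float for storage and widening it again on read.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not a ROOT function): emulate writing a double as a
// 32-bit float and reading it back into a double.
double round_trip_as_float(double in_memory) {
    const float on_file = static_cast<float>(in_memory);  // narrowing on write
    return static_cast<double>(on_file);                  // widening on read
}

// Relative error introduced by the round trip; x must be non-zero.
double relative_error(double x) {
    assert(x != 0.0);
    return std::fabs(round_trip_as_float(x) - x) / std::fabs(x);
}
```

Values that are exactly representable as a float, such as 0.5, survive the round trip unchanged; other values keep roughly seven significant decimal digits, which is the price paid for halving the space on file.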
LCG /AA/ROOT relationship 10
ROOT version 4 Highlights (2)
• PROOF/Alien in production
• Xrootd (collaboration with Babar)
• New Linear Algebra package
• Geometry interface to OpenGL/Coin3D
• Support for Qt (alternative option to X11)
• GUI builder with GUI code generation
• New GUI Histogram editor
• Interface with Ruby
First development release just before the ROOT workshop (25 February SLAC)
Final PRO release scheduled for June.
LCG /AA/ROOT relationship 13
ROOT and SPI
• If the model evolves from a “user-provider” relationship to a real and effective integration of ROOT in the LCG plans, it will become obvious that ROOT should use the same infrastructure (SPI).
• The current work from Torre is an essential ingredient to simplify the development and build procedures, a prerequisite for convergence.
• It is too early to take a practical decision, as it depends on the acceptance of this plan and on real achievements.
LCG /AA/ROOT relationship 15
SEAL: Duplications
• Due to well-known historical reasons, SEAL is duplicating systems already provided by ROOT, e.g.:
  • Object dictionary
  • Plug-in manager
  • Regular expressions
  • Compression algorithms
• In the following, I will discuss only the dictionary and the plug-in manager.
LCG /AA/ROOT relationship 16
Seal libraries size and dependencies
SealBase 6.60 MB
SealUtil 0.85 MB
SealServices 1.58 MB
IOTools 1.29 MB
SealZIP 2.15 MB
SealKernel 1.62 MB
ReflectionBuilder 1.02 MB
Reflection 2.40 MB
PluginManager 1.28 MB
SealCLHEPdict 4.09 MB
CLHEP 1.50 MB
SealSTLdict 5.13 MB
GMinuit 2.45 MB
LCG /AA/ROOT relationship 17
SEAL Dictionary: Reminder
[Diagram: dictionary generation. ROOT side: .h files, ROOTCINT, CINT dictionary code, CINT dictionary, data I/O. LCG side: .h files, GCC-XML, .xml, Code Generator, LCG dictionary code, LCG dictionary, Gateway, Reflection, other clients. Label: "Technology dependent".]
Hum! All boxes are technology dependent!
LCG /AA/ROOT relationship 18
SEAL: The dictionary saga
• There were 4 reasons to develop an alternative dictionary:
  • Make it independent of ROOT/CINT.
  • Make it available with other languages.
  • Remove parsing limitations of rootcint.
  • Necessary for POOL alternative backend.
• The alternative language is a false problem. All collaborations are heavily investing in C++, and anyhow the SEAL dictionary is not appropriate for languages coming with introspection/reflection capabilities.
• The other 3 reasons must be seen from a different angle if ROOT is the choice for storage manager and analysis engine.
• Everybody agrees that having 2 dictionaries is a nightmare, a source of more and more conflicts and new problems.
LCG /AA/ROOT relationship 19
LCG Dictionary size Atlas (Nov version)
In November, we investigated the size of the LCG dictionary in case of Atlas, CMS and ROOT itself. LHCb were not in a position to estimate the size because they did not have the code generator yet.
As a positive effect of this exercise, the SEAL team has meanwhile been able to gain a factor 3 in the size of the dictionary on disk, but there is no estimate of the gain (if any) in memory.
Library                  Classes.o   LCGdict.o   LCGdict/class   CINTdict.o
---------------------------------------------------------------------------
SimpleTrack              10.7k       144k        13.45
EventHeader              12.7k       89k         7.00
FourMom                  49k         13k         0.26
GenerateObject(HepMC)    388k        326k        0.84
LArSimEvent              26k         88k         3.38
EventInfo                33k         120k        3.63            65k
ATLAS (27 classes)                               4.7 +- 4.4
LCG /AA/ROOT relationship 20
LCG Dictionary size CMS (Nov version)
• Bill compared the sizes of the same CMS dictionary object files (*.o) (COBRA+ORCA) on disk produced by lcgdict versus that for rootcint produced dictionaries.
• Total number of dictionaries = 30
• Total number of classes = 359
• Average data members per class = 435/359 = 1.2
• Average functions per class = 1868/359 = 5.2
• All were compiled with gcc_3.2.3 with the -O2 option, and all the symbols were stripped (with strip) for the purpose of this comparison.
• The size ratios are quite consistent across dictionaries, so we give the total sizes.
• ROOT: 3.45 MB
• POOL: 5.37 MB
So the lcg dictionary files are approx. 55% larger (5.37/3.45 = 1.56). Note that the CMS classes above are only the base classes of the framework.
It would have been interesting to have more statistics based on concrete application classes with more data members and functions.
LCG /AA/ROOT relationship 21
LCG Dictionary size ROOT (Nov version)
• It was easy to generate the dictionaries for about one half of all ROOT classes (320/600)
• In order to evaluate the impact in memory of the LCG dictionary, I linked the dictionaries with the ROOT executable module.
  • Full ROOT process memory size = 12.30 MBytes
  • Same + lcg dictionary = 28.30 MBytes
Remark 1: The lcg dictionary for 1/2 of the ROOT classes adds 16.0 MBytes, i.e. it is 1.3 times bigger than ROOT itself.
Remark2: The LCG dictionary does not contain all the information available in the CINT dictionary.
LCG /AA/ROOT relationship 22
ROOT Dictionary size
If all classes have a dictionary, the size of the dictionary may become a large fraction of the executable module!
LCG /AA/ROOT relationship 23
The CINT dictionary
[Diagram. Components of CINT: C++ parser(s) and rootcint; data structures (data members, functions) with the GClassInfo API; byte code generator; byte code interpreter.]
The CINT library is small: 1.5 MByte.
CINT is more than just a parser and an API to the dictionary.
LCG /AA/ROOT relationship 24
The CINT dictionary evolution
• Data Members
  • Already supports all C++ features (no important features such as typedef or enum are missing)
  • The future is to look into XTI in case there is progress with the C++ committee
• Parser/Code generator
  • The number of failing cases has dropped considerably. We treat parsing failures with high priority. They are in general fixed in the “next week” CINT release.
LCG /AA/ROOT relationship 25
Dictionary: How to make progress
• Review asap the functionality provided by LCGdict and CINT.
• Collect info from CMS, Atlas and others on the size of dictionaries.
• Investigate how many classes (*.h) can be parsed by gccxml and not by rootcint.
• Compare the two APIs and data structures.
• Investigate the feasibility of supporting two parsers with one single dictionary in memory.
• Investigate the portability of gccxml on all ROOT supported platforms.
LCG /AA/ROOT relationship 26
Dictionary: which options?
• Start from LCG dict
  • Requires lcgdict to be available on all platforms where CINT runs today
  • Requires deep changes in the byte code and in the interpreter.
• Start from CINT dictionary
  • Improving the API
  • Keeping/Improving rootcint
  • Adapting gccxml as an alternative parser
• Both options
Following discussions in Nov/Dec, a proposal for a common C++ API to the CINT dictionary is in preparation. Because the user must see only C++ objects, this requires also a mini C++ data structure (must be small compared to CINT)
LCG /AA/ROOT relationship 27
Dictionaries : root only
[Diagram. Boxes: X.h, rootcint, XDictcint.cxx, CINT DS (data structures), CINT API, Root meta C++, ROOT, CINT.]
LCG /AA/ROOT relationship 28
Dictionaries : situation today
[Diagram. ROOT side: X.h, rootcint, XDictcint.cxx, CINT DS, CINT API, Root meta C++, ROOT, CINT. LCG side: gccxml, X.xml, lcgdict, XDictlcg.cxx, LCGDICT DS, LCG API, POOL.]
LCG /AA/ROOT relationship 29
Dictionaries : step 1 gain space
[Diagram. ROOT side: X.h, rootcint, XDictcint.cxx, CINT DS, CINT API, Root meta C++, ROOT, CINT. LCG side: gccxml, X.xml, lcgdict, XDictlcg.cxx, LCGDICT DS C++, LCG2 API, POOL.]
LCG /AA/ROOT relationship 30
Dictionaries : step 2 simplification
[Diagram. Boxes: X.h, rootcint, XDictcint.cxx, CINT DS, meta DS C++, CINT API, LCG ROOT API, POOL, ROOT, CINT.]
LCG /AA/ROOT relationship 31
Dictionaries : step 3 coherency
[Diagram. Boxes: X.h, gccxml, rootcint, XDict.cxx, CINT DS, meta DS C++, CINT API, LCG ROOT API, POOL, ROOT, CINT.]
LCG /AA/ROOT relationship 32
Plug-in Manager(s)
• A plug-in manager is an essential tool that helps make a system more modular.
• It simplifies dynamic linking and unlinking.
• It would be nice to converge on one single manager to minimize side-effects.
• The ROOT plug-in manager is very powerful and simple to use (see next slide).
• It does not require an object factory machinery. The interpreter is already doing it for free.
• It is being extended to automate/simplify several operations, such as automatic discovery of the shared lib containing a class.
LCG /AA/ROOT relationship 33
Definition of plug-ins in ROOT
Fields after the base class: name, class, shared lib, how to call

Plugin.TFile: ^rfio: TRFIOFile RFIO "TRFIOFile(const char*,Option_t*,const char*,Int_t)"
+Plugin.TFile: ^castor: TCastorFile RFIO "TCastorFile(const char*,Option_t*,const char*,Int_t,Int_t)"
+Plugin.TFile: ^dcache: TDCacheFile DCache "TDCacheFile(const char*,Option_t*,const char*,Int_t)"
+Plugin.TFile: ^chirp: TChirpFile Chirp "TChirpFile(const char*,Option_t*,const char*,Int_t)"
Plugin.TSystem: ^rfio: TRFIOSystem RFIO "TRFIOSystem()"
Plugin.TSQLServer: ^mysql: TMySQLServer MySQL "TMySQLServer(const char*,const char*,const char*)"
+Plugin.TSQLServer: ^pgsql: TPgSQLServer PgSQL "TPgSQLServer(const char*,const char*,const char*)"
+Plugin.TSQLServer: ^sapdb: TSapDBServer SapDB "TSapDBServer(const char*,const char*,const char*)"
+Plugin.TSQLServer: ^oracle: TOracleServer Oracle "TOracleServer(const char*,const char*,const char*)"
Plugin.TGrid: ^alien TAlien RAliEn "TAlien(const char*,const char*,const char*,const char*)"
Plugin.TVirtualPad: * TPad Gpad "TPad()"
Plugin.TVirtualHistPainter: * THistPainter HistPainter "THistPainter()"
Plugin.TVirtualTreePlayer: * TTreePlayer TreePlayer "TTreePlayer()"
Plugin.TVirtualTreeViewer: * TTreeViewer TreeViewer "TTreeViewer(const TTree*)"
Plugin.TVirtualGeoPainter: * TGeoPainter GeomPainter "TGeoPainter()"
Plugin.TVirtualUtil3D: * TUtil3D Graf3d "TUtil3D()"
Plugin.TVirtualUtilHist: * TUtilHist Hist "TUtilHist()"
Plugin.TVirtualUtilPad: * TUtilPad Gpad "TUtilPad()"
Plugin.TVirtualFitter: Minuit TFitter Minuit "TFitter(Int_t)"
+Plugin.TVirtualFitter: Fumili TFumili Fumili "TFumili(Int_t)"
Plugin.TVirtualPS: ps TPostScript Postscript "TPostScript()"
+Plugin.TVirtualPS: svg TSVG Postscript "TSVG()"
Plugin.TViewerX3D: x11 TViewerX3D X3d "TViewerX3D(TVirtualPad*,Option_t*)"
+Plugin.TViewerX3D: qt TQtViewerX3D QtX3d "TQtViewerX3D(TVirtualPad*,Option_t*)"
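The lookup that such a table enables can be sketched in a few lines of C++. This is only an illustration of the idea (match a base class and a URI against the name field, then learn which class and shared library to load); it is not the ROOT plug-in manager, and `PluginEntry`, `find_handler` and `sample_table` are hypothetical names. The sketch assumes that the name field is a regular expression and that "*" means "always matches".

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical record mirroring the fields of one Plugin.* line.
struct PluginEntry {
    std::string base;     // base class, e.g. "TFile"
    std::string pattern;  // regexp matched against the URI, or "*"
    std::string cls;      // class to instantiate
    std::string lib;      // shared library providing that class
};

// Return the first entry for `base` whose pattern matches `uri`,
// or nullptr when no entry applies.
const PluginEntry* find_handler(const std::vector<PluginEntry>& table,
                                const std::string& base,
                                const std::string& uri) {
    assert(!base.empty());
    for (const PluginEntry& e : table) {
        if (e.base != base) continue;
        if (e.pattern == "*" || std::regex_search(uri, std::regex(e.pattern)))
            return &e;
    }
    return nullptr;
}

// A few entries copied from the listing above.
std::vector<PluginEntry> sample_table() {
    return {
        {"TFile", "^rfio:", "TRFIOFile", "RFIO"},
        {"TFile", "^castor:", "TCastorFile", "RFIO"},
        {"TFile", "^dcache:", "TDCacheFile", "DCache"},
        {"TVirtualPad", "*", "TPad", "Gpad"},
    };
}
```

With this table, a request for a TFile on a "castor:" URI resolves to class TCastorFile in library RFIO, while an unknown protocol yields no handler. Loading the library and calling the constructor are left out of the sketch.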
LCG /AA/ROOT relationship 34
MathLibs
• It is important for HEP to have one well identified Math library (source, libs), with
  • Full control of the source
  • Portability to as many platforms as possible
  • A good test suite and documentation
• This does not mean that we have to develop new algorithms/classes/functions.
• In Nov/Dec we had a few meetings to discuss a proposal for a Mathlib in C++, an alternative to a proposal by SEAL.
LCG /AA/ROOT relationship 35
Mathlibs (2)
[Diagram: Kernlib/Mathlib, CLHEP, ROOT (TMath, TMatrix, TCL) and a GSL subset feed a New Mathlib that is open source and not HEP/LCG restricted.]
Annotations:
• Convert only on demand what is not already converted by TCL
• Give to GSL our mods as C/GSL functions
• Take small subset and freeze
• From GSL, import functions not found elsewhere. Wrap C functions in classes like in TMath
• ROOT Linear algebra is being extended and improved
LCG /AA/ROOT relationship 36
Mathlibs proposals
A: SEAL proposal: Install GSL, collaborate with the GSL team.
B: Rene/Eddy proposal: copies available
LCG /AA/ROOT relationship 37
Why a Mathlib in C++
1. We want to interact with real objects (data and algorithms), not just algorithms.
2. We want to provide higher level interfaces hiding the implementation details (algorithms). A true Object-Oriented API should remain stable if internal storage or algorithms change. One can imagine the Mathlib classes being improved over time, or adapted to standard algorithms that could come with the new C++ versions.
3. Many classes require a good graphics interface. A large subset of CERNLIB or GSL has to do with functions. Visualizing a function requires knowing some of its properties, e.g. singularities or asymptotic behaviors. This does not mean that the function classes must have built-in graphics, but they must be able to call graphics service classes able to exploit the algorithms in the functions library.
4. Many objects need operators (matrices, vectors, physics vectors, etc).
5. We want to embed these objects in a data model. Users have started to request that the math library take care of memory management and/or persistency of the objects. See for instance the LHC-feedback [5], where persistency of CLHEP was requested. The user would like to save and restore random-generator seeds etc.
6. We want to have an interactive interface from our interpreters, hence a dictionary.
LCG /AA/ROOT relationship 38
C/Fortran/GSL versus C++
Object-Oriented API vs Procedural API
gsl style:
  double gsl_sf_gamma(double x)
  int gsl_sf_gamma_e(double x, gsl_sf_result* result)
root style:
  TF1 gamma(TMath::Gamma,0,1)
  gamma.Eval(x)
  gamma.Derivative(x)
  gamma.Integral(from,to)
  gamma.GetRandom()
  gamma.Draw()
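The contrast between the two styles can be made concrete with a small, self-contained C++ sketch. It is not TF1: the class name `Function1D` and the helper `make_gamma` are hypothetical, `std::tgamma` from the standard library stands in for a procedural routine such as `gsl_sf_gamma`, and the numerical methods (central difference, composite Simpson rule) were chosen for brevity rather than taken from ROOT.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <utility>

// Hypothetical object wrapping a plain function and exposing higher-level
// operations through a stable interface.
class Function1D {
public:
    explicit Function1D(std::function<double(double)> f) : f_(std::move(f)) {}

    double Eval(double x) const { return f_(x); }

    // Central finite difference with step h.
    double Derivative(double x, double h = 1e-5) const {
        return (f_(x + h) - f_(x - h)) / (2.0 * h);
    }

    // Composite Simpson rule on [a, b] with n subintervals (n must be even).
    double Integral(double a, double b, int n = 1000) const {
        assert(n > 0 && n % 2 == 0);
        const double h = (b - a) / n;
        double sum = f_(a) + f_(b);
        for (int i = 1; i < n; ++i)
            sum += f_(a + i * h) * (i % 2 == 1 ? 4.0 : 2.0);
        return sum * h / 3.0;
    }

private:
    std::function<double(double)> f_;
};

// The procedural building block becomes an object with Eval/Derivative/Integral.
inline Function1D make_gamma() {
    return Function1D([](double x) { return std::tgamma(x); });
}
```

The point of the design is that callers depend only on `Eval`, `Derivative` and `Integral`; the algorithms behind them can be replaced without touching user code, which is argument 2 of the previous slide.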
LCG /AA/ROOT relationship 39
Mathlib Proposal picture
[Diagram: layers of the proposed Mathlib, on top of the ROOT libraries.]
• libGSL++.so: contains the full standard GSL + CINT dictionary
• TMath or/and TMath-like C++ static functions: contain the most used math functions
• High level C++ classes: Functions (a la TF1), Physics Vectors, Linear Algebra, Random Numbers, Minimisation
• Callable from interpreter(s)
• Persistency
Summary of proposal B
• Install standard gsl: libGSL.so
• Provide a CINT front-end (say libGSL++.so)
  • Nearly done, thanks Lorenzo
• Extend TMath with more static functions from CERNLIB, GSL, ...
• New Linear Algebra from Eddy (see later)
• Extend the function classes (TF1 and the like) with more algorithms.
2/3 of the estimated total work is already done. The main work is the development of a test/benchmark suite.
LCG /AA/ROOT relationship 43
CLHEP linear algebra problems
• CLHEP inversion:
  • sizes <= 6: limited-precision Cramer algorithm
  • sizes > 6: unscaled LU factorization (Cernlib DFACT)
• Suppose the Hilbert matrix A(i,j) = 1/(i+j+1), i,j = 0,..,4, and calculate E = A * A^-1
  • Cramer: for i != j, E(i,j) < 10e-7, while
  • scaled LU: for i != j, E(i,j) < 10e-13
• Of course the inaccuracy is worse for larger matrices. Scaling the matrix with a large or small number will make Cramer under/overflow. Unscaled LU factorization can also under/overflow
  • example: for a Hilbert matrix of size > 12, the routine will return an error
• CLHEP is not thread-safe
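The Hilbert-matrix check described above (build A, invert it, look at the off-diagonal elements of E = A * A^-1) can be reproduced with a short standalone C++ sketch. It deliberately uses neither CLHEP nor ROOT: the inversion below is a plain Gauss-Jordan elimination with partial pivoting in double precision, swapped in for brevity, so its error level is not the Cramer or scaled-LU figure quoted on the slide. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hilbert matrix A(i,j) = 1/(i+j+1), i,j = 0..n-1.
Matrix hilbert(std::size_t n) {
    Matrix a(n, std::vector<double>(n));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            a[i][j] = 1.0 / static_cast<double>(i + j + 1);
    return a;
}

// Gauss-Jordan inversion with partial pivoting (works on a copy of a).
Matrix invert(Matrix a) {
    const std::size_t n = a.size();
    Matrix inv(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) inv[i][i] = 1.0;
    for (std::size_t col = 0; col < n; ++col) {
        std::size_t pivot = col;  // row with the largest entry in this column
        for (std::size_t r = col + 1; r < n; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
        assert(a[pivot][col] != 0.0);  // matrix must not be singular
        std::swap(a[col], a[pivot]);
        std::swap(inv[col], inv[pivot]);
        const double d = a[col][col];
        for (std::size_t k = 0; k < n; ++k) { a[col][k] /= d; inv[col][k] /= d; }
        for (std::size_t r = 0; r < n; ++r) {
            if (r == col) continue;
            const double f = a[r][col];
            for (std::size_t k = 0; k < n; ++k) {
                a[r][k] -= f * a[col][k];
                inv[r][k] -= f * inv[col][k];
            }
        }
    }
    return inv;
}

// Largest |E(i,j)| with i != j, where E = A * A^-1 for the n x n Hilbert matrix.
double max_offdiagonal(std::size_t n) {
    const Matrix a = hilbert(n);
    const Matrix inv = invert(a);
    double worst = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            if (i == j) continue;
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) sum += a[i][k] * inv[k][j];
            if (std::fabs(sum) > worst) worst = std::fabs(sum);
        }
    return worst;
}
```

Calling `max_offdiagonal(5)` gives the quantity compared on the slide for the 5 x 5 case; increasing n shows how quickly the ill-conditioning of the Hilbert matrix degrades any inversion method.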
LCG /AA/ROOT relationship 44
Features found only in ROOT4.0
• In-place matrix multiplication
• Passing of lazy matrix (recipe without instantiation)
• Eigen-vector/value analysis for symmetric and non-symmetric matrix
• Condition number for arbitrary matrix (Hager algorithm)
• Many decomposition classes: LU, Chol, QRH, SVD, each allowing repeated solutions without decomposing again
• Thread safe
• Persistency
LCG /AA/ROOT relationship 45
More tests and benchmarks
• As for the linear algebra classes, similar test suites and benchmarks should be implemented for:
  • Basic algorithms (TMath like)
  • Statistical analysis and probabilities
  • Functions: integrals, derivatives, root-finding
  • Interpolations, approximations
  • Random numbers: basic, functions, histograms
  • Physics vectors
  • Minimization algorithms
LCG /AA/ROOT relationship 47
POOL Objectives (Dirk’s slide)
• To allow the multi-PB of experiment data and associated meta data to be stored in a distributed and Grid enabled fashion
• various types of data of different volumes (event data, physics and detector simulation, detector data and bookkeeping data)
• Hybrid technology approach, combining
  • C++ object streaming technology, such as Root I/O, for the bulk data
  • transactional safe Relational Database (RDBMS) services, such as MySQL, for catalogs, collections and meta data
• In particular, it provides
  • Persistency for C++ transient objects
  • Transparent navigation from one object across file and technology boundaries
    - Integrated with an external File Catalog to keep track of the file physical location, allowing files to be moved or replicated
Annotations on the slide: "Source of problems and misunderstanding"; "Two catalogs?"
LCG /AA/ROOT relationship 48
POOL Objectives
• Hybrid technology approach, combining
  • C++ object streaming technology, such as Root I/O, for the bulk data
  • transactional safe Relational Database (RDBMS) services, such as MySQL, for catalogs, collections and meta data
If an alternative solution is in mind, it must be a complete solution. In particular, an automatic schema evolution algorithm has to be part of POOL itself.
An alternative solution prevents exploiting more features of the current back-end.
Concentrating on one back-end will eliminate unnecessary overheads and duplicated code.
It is urgent to come back to the blueprint objective: combining ROOT as an event store with an RDBMS-based catalog.
LCG /AA/ROOT relationship 49
POOL Objectives
• Hybrid technology approach, combining
  • C++ object streaming technology, such as Root I/O, for the bulk data
  • transactional safe Relational Database (RDBMS) services, such as MySQL, for catalogs, collections and meta data
ROOT I/O is much more than a simple object streaming technology.
- It supports automatic schema evolution (a large fraction of the code).
- It supports collections (directories of keys, Trees with containers appropriate for queries in interactive analysis).
- It supports “object-streaming” with sockets, shared-memory.
- It supports access to remote files and is GRID-aware.
- Collections are designed to work in a parallel/GRID setup with PROOF.
LCG /AA/ROOT relationship 50
POOL libraries size and dependencies
SealBase 6.60 MB
SealKernel 1.62 MB
ReflectionBuilder 1.02 MB
Reflection 2.40 MB
PluginManager 1.28 MB
AttributeList 0.15 MB
FileCatalog 0.13 MB
EDGCatalog 3.46 MB
DataSvc 0.21 MB
Collection 0.96 MB
RootStorageSvc 2.22 MB
RootCollection 0.18 MB
PersistencySvc 0.29 MB
PoolCore 0.43 MB
StorageSvc 1.97 MB
libCore 6.40 MB
libCint 1.40 MB
libTree 1.24 MB
Seal: 12.92 MB
Pool: 6.54 MB
Root: 9.04 MB
Tot: 28.72 MB
LCG /AA/ROOT relationship 51
POOL/ROOT performance problems
Test case   Container   Operation   POOL     ROOT     POOL vs ROOT
25 1        tree        write       1.91     1.01     +89%
25 1        tree        read        1.19     0.43     +176%
25 1        key         write       2.04     1.25     +63%
25 1        key         read        1.72     0.55     +212%
25 10       tree        write       1.93     1.02     +89%
25 10       tree        read        1.10     0.40     +175%
25 10       key         write       1.67     1.25     +33%
25 10       key         read        1.6      0.57     +180%
25 200      tree        write       1.56     1.00     +56%
25 200      tree        read        1.07     0.42     +154%
25 200      key         write       1.61     1.26     +27%
25 200      key         read        1.6      0.57     +180%
500 50      tree        write       22.38    18.79    +19%
500 50      tree        read        5.23     4.09     +27%
500 50      key         write       19.72    19.04    +3.5%
500 50      key         read        5.77     4.46     +29%
The current POOL/ROOT mapping has performance problems that must be understood (not just a few per cent!)
Numbers from Ioannis, Markus
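The percentages quoted with the timings are the extra time of POOL relative to ROOT for the same operation. A one-function C++ sketch of that arithmetic is given here; the function name is hypothetical, and the observation that the quoted figures look truncated rather than rounded (for example 27.8 is shown as +27%) is an inference from the numbers, not something the slide states.

```cpp
#include <cassert>

// Extra time of POOL relative to ROOT for the same operation, in per cent.
double overhead_percent(double pool_time, double root_time) {
    assert(root_time > 0.0);
    return (pool_time / root_time - 1.0) * 100.0;
}
```

For the first tree-write measurement, `overhead_percent(1.91, 1.01)` is a little above 89, matching the +89% on the slide.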
LCG /AA/ROOT relationship 52
ROOT foreign classes & POOL
A checksum algorithm is implemented in ROOT 4. It provides automatic schema evolution without having to specify a class version number. It must be tested in POOL.
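The idea of identifying a class layout by a checksum instead of a hand-maintained version number can be sketched as follows. This is not the ROOT 4 algorithm, whose details are not given here: the sketch simply hashes the class name and the type and name of each data member with the generic FNV-1a hash, and `Member`, `fnv1a` and `layout_checksum` are illustrative names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one data member.
struct Member {
    std::string type;
    std::string name;
};

// 32-bit FNV-1a, a generic string hash used here only for illustration.
std::uint32_t fnv1a(const std::string& text, std::uint32_t hash = 2166136261u) {
    for (unsigned char c : text) {
        hash ^= c;
        hash *= 16777619u;
    }
    return hash;
}

// Checksum of a class layout: changes whenever a member is added, removed,
// renamed, retyped or reordered.
std::uint32_t layout_checksum(const std::string& class_name,
                              const std::vector<Member>& members) {
    assert(!class_name.empty());
    std::uint32_t hash = fnv1a(class_name);
    for (const Member& m : members) {
        hash = fnv1a(m.type, hash);
        hash = fnv1a(" ", hash);
        hash = fnv1a(m.name, hash);
        hash = fnv1a(";", hash);
    }
    return hash;
}
```

A reader that stores this number next to the data can detect that the class on file differs from the class in memory and select the matching layout description, without the author of a foreign class ever assigning a version number.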
Must look at possible performance problems due to the fact that POOL does not use ClassDef (important function IsA missing for POOL). This could explain the very poor POOL performance for LHCb when using vector<T*>.
It is very strange that this performance problem has not yet been seen by CMS and Atlas!
LCG /AA/ROOT relationship 53
POOL: ref<T> and collections
• If ref<T> and collections are not understood by ROOT, it will be a source of constant troubles and misunderstanding.
• The development of these two entities should have been done in collaboration to optimize the implementation. Remember the early discussions about TRef, TLongRef, TUUID and ref<T>.
• The existing POOL collections are mapped on ROOT Trees (any bonus compared to native Trees?).
• If new collections are required (to be seen!), they must be designed with data analysis in mind, including parallelism.
• Progress in this area requires a close cooperation with the experiments with prototyping of a few implementations using the different solutions.
• We already have interfaces of ROOT collections with many RDBMS systems, including queries. (MySQL, Oracle, SapDB, PostGres)
LCG /AA/ROOT relationship 54
POOL: caching
• There is a confusion between “commit” that guarantees data base integrity and “buffering” to improve performance.
• The cache with “I take ownership” is intrusive and with consequences on the user framework.
• The solution with “no ownership” is not optimum. It implies multiple copies and duplicates the efficient buffering implemented in ROOT.
• A review of the POOL/ROOT communication will have to address these problems by removing unnecessary layers.
LCG /AA/ROOT relationship 55
POOL: Access to the catalog
• A coherent system will require a good interface between ROOT and the POOL catalog (both C++ and CINT).
• ROOT has already an abstract interface TGrid with an implementation with Alien.
• POOL will not be the only catalog around. It is important to consider the generality and variety of interfaces.
LCG /AA/ROOT relationship 57
ROOT and the Simulation project
• The VMC has 2 goals:
  • Experiments define their geometry once only
  • The comparison between physics packages is facilitated.
• The VMC proposes 3 standard interfaces:
  • Standardize the interface to the generators and the particle stack.
  • Standardize the interface to the step manager (hits scoring)
  • Standardize the interface to the geometry
    - Definition and Validation (checker)
    - Navigator in detector simulation (fast and slow)
    - Queries from a reconstruction algorithm
    - Graphics
LCG /AA/ROOT relationship 58
Geometry and Geometries
[Diagram: a geometry in memory at the centre. Connected boxes: XML files (e.g. GDML), ROOT file, Geant3 rz file, C++ classes; G3, G4, Fluka and Recons (reconstruction), each with its own geometry.]
LCG /AA/ROOT relationship 59
Geometries: not the same goal !
XML files (e.g. GDML)
• External description only
• Used as input to a real geometry (G4, ROOT)
• Checker, Viewer may be implemented
• Requires some data structure in memory
This has very limited functionality. Interesting (maybe?) for input. Too much emphasis on this solution.

Geometry in memory (G3, G4, ROOT)
• Simulation/Reconstruction oriented
• C++ API for the construction
• Input can be via first solution
• Checker, Viewer must be (are) implemented
• Provide interface to navigators
THIS IS THE MAIN HORSE TO BE OPTIMIZED
LCG /AA/ROOT relationship 61
ROOT and PI (1)
• The PI-AIDA interface to ROOT exists. It still requires some consolidation, but it should not be expanded to other areas.
• A generic PyRoot interface (in SEAL) must be optimized and automated. Examples of PyRoot illustrating its complementarity to CINT (instead of an alternative) should be written and used in tutorials.
• The PyRoot interface belongs logically to the Root source.
• Ruby-Root (superior to PyRoot?)
LCG /AA/ROOT relationship 64
PI libraries size and dependencies
SealBase 6.60 MB
PluginManager 1.28 MB
AIDA_Plugin 0.038 MB
AIDA_Proxy 0.74 MB
PluginTreeROOT 0.40 MB
AIDA_CPP 0.08 MB
AIDA_ROOT 0.49 MB
PluginHistogramROOT 0.30 MB
libCore 6.40 MB
libCint 1.40 MB
libTree 1.24 MB
libHist 2.77 MB
Seal: 7.88 MB
PI: 2.05 MB
Root: 11.81 MB
Tot: 21.74 MB
AIDAUtilities xx MB
CLHEP 1.50 MB
LCG /AA/ROOT relationship 65
ROOT and PI (3) ARDA
• The project scope should be different.
• I would like to see more cooperation on the development of ROOT/PROOF, with close participation of the experiments.
• ROOT/PROOF also requires a close collaboration with the file catalog providers: POOL, Alien, others.
• Data challenges are the best opportunity to develop a coherent model.
Instead of abstract interfaces common to all data analysis packages, it is more efficient and less bureaucratic to develop file exchange formats.
LCG /AA/ROOT relationship 67
ROOT & ARDA
• Our program of work clearly has a very large overlap with the proposed ARDA project:
  • Distributed computing (move process to data): PROOF
  • Distributed data access: xrootd/PROOF
• However, Data Analysis is not just an extension of Distributed Production.
• Data Analysis will be Batch AND Interactive, with more and more Interactivity
LCG /AA/ROOT relationship 68
Data Volume & Organisation
[Diagram. Data volume axis: 100MB, 1GB, 10GB, 100GB, 1TB, 10TB, 100TB, 1PB. Number of files axis: 1, 1, 5, 50, 500, 5000, 50000. Regions labelled TTree and TChain.]
A TChain is a collection of TTrees or/and TChains
A TFile typically contains 1 TTree (or a few)
A TChain is typically the result of a query to the file catalogue
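The statement that a chain is a collection of trees or of other chains describes a composite structure, which the following C++ sketch makes explicit. The classes `Dataset`, `Tree` and `Chain` are hypothetical stand-ins, not the ROOT TTree and TChain classes, and counting entries is used only as the simplest operation that has to work identically on both.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Common interface: anything that can report how many entries it holds.
class Dataset {
public:
    virtual ~Dataset() = default;
    virtual std::size_t entries() const = 0;
};

// Leaf: one tree, typically stored in one file.
class Tree : public Dataset {
public:
    explicit Tree(std::size_t n) : n_(n) {}
    std::size_t entries() const override { return n_; }
private:
    std::size_t n_;
};

// Composite: a chain of trees and/or other chains.
class Chain : public Dataset {
public:
    void add(std::shared_ptr<Dataset> part) {
        assert(part != nullptr);
        parts_.push_back(std::move(part));
    }
    std::size_t entries() const override {
        std::size_t total = 0;
        for (const auto& p : parts_) total += p->entries();
        return total;
    }
private:
    std::vector<std::shared_ptr<Dataset>> parts_;
};
```

Because a chain is itself a dataset, the result of one catalogue query (a chain) can be added to a larger chain, and analysis code written against the common interface does not need to know how many files lie underneath.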
LCG /AA/ROOT relationship 69
Data Volume & Processing Time
Using technology available in 2004
1” 10” 1’ 10’ 1h 10h 1day 1month
1” 1” 10” 1’ 10’ 1h 10h 1day 10days
1” 1” 1” 10” 1’ 10’ 1h 10h 1day
1’ 10’ 1h 10h
100MB 1GB 10GB 100GB 1TB 10TB 100TB 1PB
ROOT, 1 processor (P IV 2.4 GHz, 2004): time for one query using 10 per cent of data
Interactive / batch
PROOF, 10 processors
PROOF, 100 processors
PROOF/ALIEN, 1000 processors
LCG /AA/ROOT relationship 70
Data Volume & Processing Time
Using technology available in 2010
1” 1” 1” 10” 1’ 10’ 1h 10h 1day
1’ 10’ 1h
100MB 1GB 10GB 100GB 1TB 10TB 100TB 1PB
ROOT, 1 processor (XXXXX, 2010): time for one query using 10 per cent of data
Interactive / batch
PROOF, 10 processors
PROOF, 100 processors
PROOF/ALIEN, 1000 processors
1” 1” 10” 1’ 10’ 1h 10h 1day 10days
1” 1” 1” 1” 10” 1’ 10’ 1h 10h
LCG /AA/ROOT relationship 71
GRID: Interactive Analysis, Case 1
• Data transfer to user’s laptop
• Optional Run/File catalog
• Optional GRID software
[Diagram: optional run/file catalog; remote file server (e.g. rootd) holding Trees; Trees on the local machine.]
Analysis scripts are interpreted or compiled on the local machine
LCG /AA/ROOT relationship 72
GRID: Interactive Analysis, Case 2
• Remote data processing
• Optional Run/File catalog
• Optional GRID software
[Diagram: optional run/file catalog; remote data analyzer (e.g. proofd) with its Trees; commands and scripts are sent to it, histograms come back.]
Analysis scripts are interpreted or compiled on the remote machine
LCG /AA/ROOT relationship 73
GRID: Interactive Analysis, Case 3
• Remote data processing
• Run/File catalog
• Full GRID software
[Diagram: Run/File catalog; remote data analyzer (e.g. proofd) with six slaves, each with its Trees; commands and scripts are sent to it, histograms and trees come back.]
Analysis scripts are interpreted or compiled on the remote master(s)
LCG /AA/ROOT relationship 74
Proof Alien (Interactive and batch)
ALIEN PROOF ?
by Nina C.Fulford
This is a story about fear! Fear of the Government! Fear of your company boss! Fear of ones career! Fear for ones life maybe? Whatever way you look at it, it is based on fear. How did we allow ourselves to be ruled by fear in this day and age of so-called knowledge, progress and science? You tell me, because I want to know! For years every so called expert on this planet has claimed to have two goals: find (1) proof of alien presence on this world, either in the form of an artifact beyond our present science or in the form of an alien life form. (2) Find the missing links to our past. In order to do the second one they use millions of dollars in public money for their research. Did you think they paid their own way? No! You support this with your taxes! If they can spend thousands wandering around Africa looking for bones and getting all excited when some animals skull turns up they want to claim is an ancestor of man, then why aren't they interested in an equally old skull that is out of this world figuratively and physically?
LCG /AA/ROOT relationship 75
Summary-1: technicalities
Several important changes are proposed to optimize the relationship between ROOT and LCG/AA.
The proposed changes will reduce considerably the complexity of the system and will improve drastically the performance. This is good for developers and end-users.
If the idea is accepted, a concrete plan for implementation, starting with the most urgent tasks (dictionary, POOL, mathlib), could be set up very soon. Very positive meetings so far.
LCG /AA/ROOT relationship 76
Summary-2: organisation
The current development model
• Experts design/implement/release
• Experiments validate the product
should be changed to:
• Experts discuss with experiments to understand their event models, possibly influencing their design
• They prototype the different models
• They integrate and release (fast iterative process subject to fewer surprises)
This will improve the feedback mechanism.
It will reduce risks.
Simplified structures should be put in place.