TRANSCRIPT
3rd May’03 Nick Brook – 4th LHC Symposium 1
Data Analysis – Present & Future
Nick Brook
University of Bristol
• Generic Requirements & Introduction
• Expt specific approaches:
3rd May’03 Nick Brook – 4th LHC Symposium 2
Complexity of the Problem
Detectors: ~2 orders of magnitude more channels than today
Triggers must choose correctly only 1 event in every 400,000
High Level Triggers are software-based
Computer resources will not be available in a single location
3rd May’03 Nick Brook – 4th LHC Symposium 3
Complexity of the Problem
Major challenges associated with:
• Communication and collaboration at a distance
• Distributed computing resources
• Remote software development and physics analysis
3rd May’03 Nick Brook – 4th LHC Symposium 4
Analysis Software System
[Figure: the analysis chain – Reconstruction → Selection → Analysis, plus Monte Carlo production – with its running conditions]
• Reconstruction: new detector calibrations or understanding; re-processing 3 per year; Experiment-Wide Activity (10^9 events); 30 kSI2000 sec/event, 3 jobs per year
• Selection: trigger-based and physics-based refinements; iterative selection once per month; ~20 Groups' Activity (10^9 – 10^7 events); 0.25 kSI2000 sec/event, ~20 jobs per month
• Analysis: algorithms applied to data to get results; different physics cuts & MC comparison ~1 time per day; ~25 Individual per Group Activity (10^6 – 10^8 events); 0.1 kSI2000 sec/event, ~500 jobs per day
• Monte Carlo: 50 kSI2000 sec/event (the figure also shows a 30 kSI2000 sec/event, 1 job-year box)
(2 GHz ~ 700 SI2000)
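For scale, a rough back-of-the-envelope check using the figure's own numbers (assuming, for illustration only, a single 2 GHz CPU ~ 700 SI2000 and one full pass over the data): reconstruction at 30 kSI2000 sec/event costs 30,000 / 700 ≈ 43 CPU-seconds per event on such a machine, so one re-processing pass over 10^9 events needs roughly 4.3 × 10^10 CPU-seconds, i.e. of order 1,400 CPU-years – which is why the resources cannot sit in a single location.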
3rd May’03 Nick Brook – 4th LHC Symposium 5
Analysis Software System
[Figure: layered view of the analysis software system – a Stable User Interface on top of the Expt tools (Detector/Event Display, Data Browser, Analysis job wizards, Generic analysis tools, Reconstruction, Simulation), built over the LCG tools (Framework, Data Management tools) – a coherent set of basic tools and mechanisms, with software development and installation – and the GRID (Distributed Data Store & Computing Infrastructure) underneath]
3rd May’03 Nick Brook – 4th LHC Symposium 6
Philosophy
• we want to perform analysis from day 1 (now)!
• building on Grid tools/concepts to simplify the distributed environment
[Figure: Gartner "hype cycle" – Hype vs Time: (Technology) Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity]
3rd May’03 Nick Brook – 4th LHC Symposium 7
Data Challenges & Production Tools
All experiments have well-developed production tools for co-ordinated data challenges – e.g. CHEP talks on:
DIRAC – Distributed Infrastructure with Remote Agent Control
Tools provide management of workflows, job submission, monitoring, book-keeping, …
3rd May’03 Nick Brook – 4th LHC Symposium 8
AliEn (ALIce ENvironment) is an attempt to gradually approach and tackle computing problems at the LHC scale and to implement the ALICE Computing Model.
Main features:
– Distributed file catalogue built on top of an RDBMS
– File replica and cache manager with interface to MSS (CASTOR, HPSS, HIS, …)
– AliEnFS – a Linux file system that uses the AliEn File Catalogue and replica manager
– SASL-based authentication which supports various authentication mechanisms (including Globus/GSSAPI)
– Resource Broker with interface to batch systems (LSF, PBS, Condor, BQS, …)
– Various user interfaces: command line, GUI, Web portal
– Package manager (dependencies, distribution, …)
– Metadata catalogue
– C/C++/perl/java API
– ROOT interface (TAliEn)
– SOAP/Web Services
– EDG-compatible user interface: common authentication; compatible JDL (Job Description Language) based on Condor ClassAds
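To illustrate the last feature, a minimal sketch of what a ClassAds-based JDL job description can look like; the executable, sandbox file names and requirement below are purely hypothetical placeholders, not taken from AliEn or the talk:

  Executable    = "aliroot.sh";
  Arguments     = "--run 1234";
  StdOutput     = "stdout.log";
  StdError      = "stderr.log";
  InputSandbox  = {"aliroot.sh", "Config.C"};
  OutputSandbox = {"stdout.log", "stderr.log"};
  Requirements  = other.OpSys == "LINUX";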
3rd May’03 Nick Brook – 4th LHC Symposium 9
AliEn Architecture
[Figure: AliEn architecture, low level → high level – external software: RDBMS (MySQL), LDAP, external libraries, Perl core, Perl modules, V.O. packages & commands; AliEn core components & services: database proxy (DBD/DBI/ADBI), authentication, logger, file & metadata catalogue, CE, SE, Resource Broker (RB), Config Mgr, Package Mgr; interfaces: SOAP/XML, user interface, API (C/C++/perl), CLI, GUI, Web portal, FS; on top, the user application]
3rd May’03 Nick Brook – 4th LHC Symposium 10
ALICE have deployed a distributed computing environment which meets their experimental needs:
• Simulation & Reconstruction
• Event mixing
• Analysis
Using Open Source components (representing 99% of the code), internet standards (SOAP, XML, PKI, …) and a scripting language (Perl) has been a key element – quick prototyping and very fast development cycles.
Close to finalizing the AliEn architecture and API.
OpenAliEn?
3rd May’03 Nick Brook – 4th LHC Symposium 11
PROOF – The Parallel ROOT Facility
Collaboration between core ROOT group at CERN and MIT Heavy Ion Group
Part of, and based on, the ROOT framework
Makes heavy use of ROOT networking and other infrastructure classes
Currently no external technologies
Motivation:
o interactive analysis of very large sets of ROOT data files on a cluster of computers
o speed up query processing by employing parallelism
o extend from a local cluster to a wide-area "virtual cluster" – the GRID
o analyse a globally distributed data set and get back a "single" result with a "single" query
3rd May’03 Nick Brook – 4th LHC Symposium 12
PROOF – Parallel Script Execution
[Figure: a local PC running ROOT ($ root, ana.C, stdout/obj, TFile/TNetFile) connected to a remote PROOF cluster – a proof master server and proof slave servers on node1…node4, each reading local *.root files; #proof.conf lists the slave nodes]
Local session:
  $ root
  root [0] .x ana.C
Local session, connecting to PROOF:
  $ root
  root [0] .x ana.C
  root [1] gROOT->Proof("remote")
Processing a chain on the remote PROOF cluster:
  $ root
  root [0] tree->Process("ana.C")
  root [1] gROOT->Proof("remote")
  root [2] chain->Process("ana.C")
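To make the same pattern concrete, a minimal PyROOT sketch; the tree name "Events" and the data_*.root file names are hypothetical, and the commented-out Proof() call simply mirrors the C++ call shown on the slide:

  # minimal PyROOT sketch of the chain-processing pattern shown above
  import ROOT

  chain = ROOT.TChain("Events")        # "Events" is a hypothetical tree name
  chain.Add("data_1.root")             # hypothetical input files
  chain.Add("data_2.root")

  # Without PROOF, the chain is processed locally:
  chain.Process("ana.C")

  # With PROOF (as on the slide), connect to the master first; the same
  # call is then executed in parallel on the slave servers:
  # ROOT.gROOT.Proof("remote")
  # chain.Process("ana.C")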
3rd May’03 Nick Brook – 4th LHC Symposium 13
PROOF & the Grid
3rd May’03 Nick Brook – 4th LHC Symposium 14
Gaudi – ATLAS/LHCb software framework
[Figure: Gaudi architecture – an Application Manager steering Algorithms; the Event Data Service with its Transient Event Store, the Detector Data Service with its Transient Detector Store and the Histogram Service with its Transient Histogram Store are each backed by a Persistency Service, Converters and Data Files; plus the Message Service, JobOptions Service, Particle Properties Service and Other Services]
3rd May’03 Nick Brook – 4th LHC Symposium 15
GANGA: Gaudi ANd Grid Alliance – a joint ATLAS and LHCb project
[Figure: the GANGA GUI sits between the GAUDI Program (JobOptions, Algorithms, Histograms/Monitoring/Results) and Collective & Resource Grid Services]
Based on the concept of a Python bus:
• use whichever modules are required to provide the full functionality of the interface
• use Python to glue these modules together, i.e. allow interaction and communication between them
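As an illustration of this Python-bus idea, a minimal sketch with entirely hypothetical module names, showing independent modules glued together and communicating through a shared Python object:

  # illustrative sketch of a Python software bus (all names hypothetical)
  class Bus(object):
      """Registry through which otherwise independent modules find each other."""
      def __init__(self):
          self.modules = {}
      def register(self, name, module):
          self.modules[name] = module
      def get(self, name):
          return self.modules[name]

  class JobOptionsModule(object):
      def options_for(self, job):
          return {"application": "Gaudi", "job": job}

  class SubmitterModule(object):
      def __init__(self, bus):
          self.bus = bus
      def submit(self, job):
          opts = self.bus.get("joboptions").options_for(job)
          print("submitting %s with %s" % (job, opts))  # a real module would call a batch/Grid back-end

  bus = Bus()
  bus.register("joboptions", JobOptionsModule())
  bus.register("submitter", SubmitterModule(bus))
  bus.get("submitter").submit("myAnalysis")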
3rd May’03 Nick Brook – 4th LHC Symposium 16
Python Software Bus
[Figure: GANGA architecture around the Python software bus – on the remote user (client) side, a GUI, the GANGA Core Module, XML-RPC module, OS module, GaudiPython/PythonROOT and Athena\GAUDI sit on the PYTHON SW BUS, with a local Job DB and Job Configuration DB; over LAN/WAN an XML-RPC server connects to a server hosting the Bookkeeping DB, Production DB and EDG UI; jobs are sent to the GRID or a local LRMS]
3rd May’03 Nick Brook – 4th LHC Symposium 17
Current Status
• Most of the base classes are developed. Serialization of objects (user jobs) is implemented with the Python pickle module.
• The GaudiApplicationHandler can access the Configuration DB for some Gaudi applications (Brunel); it is implemented with the xmlrpclib module. Ganga can create user-customized Job Options files using this DB.
• DaVinci and AtlFast application handlers are implemented.
• Various LRMS are implemented – jobs can be submitted, and simple monitoring information retrieved, on several batch systems.
• Much of the GRID-related functionality is already implemented in the GridJobHandler using EDG testbed 1.4 software. Ganga can submit, monitor, and get output from GRID jobs.
• The JobsRegistry class provides job monitoring via a multithreaded environment based on the Python threading module.
• GUI available, using the wxPython extension module.
• ALPHA release available.
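A minimal sketch of the pickle-based job serialization mentioned above; the job fields and file name are hypothetical, not Ganga's actual job class:

  # illustrative sketch: persist and restore a job description with the pickle module
  import pickle

  job = {"application": "DaVinci", "options": "myopts.opts", "backend": "LSF"}

  f = open("job.pkl", "wb")      # save the job, e.g. into a local job DB area
  pickle.dump(job, f)
  f.close()

  f = open("job.pkl", "rb")      # later: restore it for resubmission or monitoring
  restored = pickle.load(f)
  f.close()
  assert restored == job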
3rd May’03 Nick Brook – 4th LHC Symposium 18
CMS analysis/production chain
[Figure: Event generation (PYTHIA) → Detector simulation (OSCAR) → detector hits → Digitization (ORCA), mixing minimum-bias (MB) events per bunch crossing (bx) → Digis (raw data) → Reconstruction, L1 and HLT (ORCA) → DST → DST stripping (ORCA) into calibration and physics streams (b/…, e/…, JetMet) → Analysis (Iguana/Root/PAW) → Ntuples (MC info, tracks, etc.), alongside MC ntuples]
3rd May’03 Nick Brook – 4th LHC Symposium 19
CMS components and data flows
[Figure: the production system and data repositories at Tier 0/1/2 feed ORCA analysis farm(s) (or a distributed `farm' using grid queues) and RDBMS-based data warehouse(s) at Tier 1/2, via TAG and AOD extraction/conversion/transport services; PIAF/Proof-type analysis farm(s), data-extraction web service(s) and query web service(s) serve the user at Tier 3/4/5, whose local analysis tool (Iguana/ROOT/…, web browser with a tool plugin module) works from local disk; flows shown: production data flow, TAGs/AODs data flow, physics query flow]
3rd May’03 Nick Brook – 4th LHC Symposium 20
CLARENS – a CMS Grid Portal
Grid-enabling the working environment for physicists' data analysis.
Clarens consists of a server communicating with various clients via the commodity XML-RPC protocol; this ensures implementation independence.
[Figure: client ↔ RPC over http/https ↔ web server hosting the Clarens service]
The server will provide a remote API to Grid tools:
– the Virtual Data Toolkit: object collection access
– data movement between Tier centres using GSI-FTP
– CMS analysis software (ORCA/COBRA)
– security services provided by the Grid (GSI)
No Globus needed on the client side, only a certificate.
The current prototype is running on the Caltech proto-Tier2.
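To illustrate the client side of this XML-RPC scheme, a minimal Python sketch; the endpoint URL and remote method name are hypothetical placeholders, not the actual Clarens API (xmlrpclib is the Python 2 module name, xmlrpc.client in Python 3):

  # illustrative XML-RPC client sketch for a Clarens-style service
  import xmlrpclib

  # hypothetical endpoint exposed by the web server over http/https
  server = xmlrpclib.ServerProxy("https://tier2.example.org:8443/clarens")

  # a hypothetical remote call; XML-RPC marshals arguments and results as XML over HTTP
  # result = server.file.ls("/store/user/example")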
3rd May’03 Nick Brook – 4th LHC Symposium 21
CLARENS
Several web-service applications have been built on the Clarens web-service architecture:
– Proxy escrow
– Client access available from a wide variety of languages: Python, C/C++, Java application, Java/Javascript browser-based client
– Access to JetMET data via SQL2ROOT
– ROOT access to remote data files
– Access to files managed by the San Diego SC storage resource broker (SRB)
3rd May’03 Nick Brook – 4th LHC Symposium 22
Summary
• all 4 expts have successfully “managed” distributed production
  • many lessons learnt – not only by the expts, but useful feedback to m/w providers
  • a large degree of automation achieved
• Expts moving on to the next challenge – analysis
  • Chaotic, unmanaged access to data & resources
  • Tools already (being) developed to aid Joe Bloggs
• Success will be measured in terms of:
  • Simplicity, stability & effectiveness
  • Access to resources
  • Management & access to data
  • Ease of development of user applications