TRANSCRIPT
ALICE Operations short summary
LHCC Referees meeting June 12, 2012
Data taking in 2012
Stable operation, steady data taking, good performance of the LHC and ALICE
Accumulation of RAW data since the beginning of the 2012 run: 450 TB of physics data in total
Data processing
2011 pp data: processed in Pass1; most of the periods in Pass2; some special samples (2.76 TeV reference data) in Pass3
2011 PbPb data: Pass2 completed in time for “Hard Probes”; Pass3 in preparation for QM’2012 in August
2012 pp data: processing with a new calibration schema; presently running large MC productions
… and user analysis
Production job efficiencies
MC (aliprod), RAW (alidaq), QA and AOD filtering. Averages: aliprod 90%, alidaq 75%, global 82% (a sketch of the job-weighted average follows below)
(Plot: production job efficiencies for the LHC11h Pass1 and Pass2 processing)
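The quoted global figure is consistent with a job-weighted average of the per-activity efficiencies. A minimal sketch of that arithmetic; only the 90%/75%/82% figures come from the talk, the equal wall-time shares are an assumption:

    # Sketch only: job-weighted global CPU/wall efficiency from per-activity
    # averages. The equal wall-time shares below are assumptions.

    def global_efficiency(activities):
        """Aggregate CPU/wall ratio over (cpu_hours, wall_hours) totals."""
        cpu = sum(c for c, _ in activities)
        wall = sum(w for _, w in activities)
        return cpu / wall

    aliprod = (90.0, 100.0)   # MC production: 90% CPU/wall
    alidaq = (75.0, 100.0)    # RAW reconstruction: 75% CPU/wall
    print(f"{global_efficiency([aliprod, alidaq]):.0%}")  # 82% for equal shares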
Chaotic and organized analysis
The efficiency remains almost constant (trains)
March – average 10K jobs, 7.9 GB/sec read from the SEs
Last month – 11K jobs (+10%), 9.8 GB/sec read from the SEs (+20%)
Analysis efficiencies
Analysis uses ~1/3 of available CPU resources
Average efficiency ~30%, with large variations
I/O-intensive jobs are not ‘inefficient’; they simply use other aspects of the hardware…
… that said, the CPU/wall ratio is what is measured and reported, and thus we work to improve it
Efficiency strongly depends on the type of analysis (see the sketch below)
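To make the metric concrete, a hedged sketch of the CPU/wall ratio for two hypothetical jobs, one CPU-bound and one I/O-bound; the accounting fields and numbers are illustrative, not ALICE’s actual accounting schema:

    # Hypothetical accounting records; only the ~30% average comes from the talk.
    from dataclasses import dataclass

    @dataclass
    class Job:
        cpu_s: float    # consumed CPU time, seconds
        wall_s: float   # elapsed wall-clock time, seconds
        read_gb: float  # data read from the storage elements, GB

    def cpu_wall(job):
        """The CPU/wall ratio that is measured and reported."""
        return job.cpu_s / job.wall_s

    cpu_bound = Job(cpu_s=9000, wall_s=10000, read_gb=5)   # e.g. fitting, toy MC
    io_bound = Job(cpu_s=3000, wall_s=10000, read_gb=400)  # e.g. AOD scanning

    for name, job in (("CPU-bound", cpu_bound), ("I/O-bound", io_bound)):
        rate = job.read_gb * 1e3 / job.wall_s  # MB/s
        print(f"{name}: CPU/wall = {cpu_wall(job):.0%}, read rate = {rate:.0f} MB/s")
    # The I/O-bound job reports 30% CPU/wall while keeping network and disks busy.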
The Analysis Trains
Pooling together many user analysis tasks (wagons) into a single set of Grid jobs (the train)
Managed through a web interface by a Physics Working Group conductor (ALICE has 8 PWGs)
Provides a configuration and test platform (functionality, memory, efficiency) and a submission/monitoring interface
Speed – a few days to go through a complete period (a sketch of the train principle follows the diagram below)
(Diagram: train system components – AliROOT Analysis Framework, LPM web interface, MonALISA, AliEn, Grid jobs)
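A pseudocode sketch of the train principle only – the real wagons are AliROOT analysis tasks steered through the components above; the names here are illustrative:

    # Pseudocode of the shared-pass idea; wagon objects stand in for AliROOT tasks.
    def run_train(wagons, events):
        """All wagons share a single read pass over the common input data."""
        for wagon in wagons:
            wagon.begin()               # per-wagon initialization
        for event in events:            # input is read once for the whole train
            for wagon in wagons:
                wagon.process(event)    # each user task sees every event
        return [wagon.finish() for wagon in wagons]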
Status: trains created for 6 PWGs
4 PWGs already actively submitting trains
2 in testing phase
Up to 30 wagons (individual tasks) for the most advanced
Even for non-optimized trains, the average efficiency is >60%
Future of trains
Only viable method for running large-scale analysis (many hundreds of TB of input data) on a regular basis
~300 TB is what the users have to go through today to analyze the 2010 and 2011 PbPb data samples plus the associated MC (see the estimate below)
Works equally well with smaller samples – the time gain over chaotic user analysis is significant
Does not fully replace chaotic analysis, but individual users do not need to run over everything…
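A back-of-envelope estimate of why a train pass over the full sample is feasible, assuming (unrealistically) that the whole aggregate SE rate quoted earlier serves a single train:

    # Idealized reading time for one full pass over the sample.
    sample_tb = 300   # 2010+2011 PbPb data plus associated MC, from the talk
    rate_gb_s = 9.8   # aggregate SE read rate quoted earlier
    hours = sample_tb * 1e3 / rate_gb_s / 3600
    print(f"~{hours:.0f} h of aggregate reading")
    # ~9 h in the ideal case; scheduling and job turnaround stretch a complete
    # pass to the few days quoted above, still far ahead of chaotic analysis.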
Storage
Following a T1-T2 workshop in January, sites call for a significant improvement of SE monitoring and control
xrootd development is ongoing, with new MonALISA tools for the servers
These will be presented and refined during the 2nd Federated Data Workshop at Lyon CCIN2P3 (September 2012)
We expect that the combined system will significantly improve the efficiency of the jobs by detecting bottlenecks and service overloads (a sketch of such a check follows)
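A hedged sketch of the kind of per-server overload check such tools could perform; the metric names and thresholds are invented for illustration:

    # Invented metric names and thresholds, for illustration only.
    THRESHOLDS = {"active_connections": 500, "load1": 16.0, "net_out_mb_s": 900.0}

    def overloaded(metrics):
        """Return the metrics on an xrootd server that exceed their threshold."""
        return [name for name, limit in THRESHOLDS.items()
                if metrics.get(name, 0) > limit]

    server = {"active_connections": 620, "load1": 7.2, "net_out_mb_s": 950.0}
    alerts = overloaded(server)
    if alerts:
        print("SE server overloaded:", ", ".join(alerts))  # e.g. steer jobs away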
Services summary
Stable operation of all components
Minor hiccups, very good support at all centers
Storage remains critical
Continuous cleanup of replicas of older productions, and keeping fewer replicas of ESDs (analysis is firmly on AODs); see the sketch after this list
The ongoing installation of 2012 pledges is reducing the urgency at some sites
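An illustrative replica-count policy; the talk does not spell out the actual cleanup rules, and the target counts are assumptions:

    # Assumed target replica counts per file class; not ALICE's actual policy.
    TARGET_REPLICAS = {"AOD": 3, "ESD": 1, "RAW": 2}  # analysis is firmly on AODs

    def replicas_to_drop(file_class, current):
        """Replicas of a file from an older production that can be cleaned up."""
        return max(0, current - TARGET_REPLICAS.get(file_class, 1))

    print(replicas_to_drop("ESD", 3))  # 2 of 3 ESD replicas can be removed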
CPU requirements update
(Table: updated CPU requirements for 2012 and 2013 – old vs. new estimates)
Summary
The 2011 pp and heavy-ion data are processed and being analyzed
2012 pp data taking and processing ongoing
Many thanks to the T0/T1/T2 centers for the excellent performance and support