TRANSCRIPT
ALICE Operations short summary
LHCC Referees meeting June 12, 2012
Data taking in 2012
Stable operation, steady data taking, good performance of the LHC and ALICE
Accumulation of RAW data since the beginning of the 2012 run: 450 TB of physics data in total
Data processing
2011 pp data: processed in Pass1; most of the periods in Pass2; some special samples (2.76 TeV reference data) in Pass3
2011 PbPb data: Pass2 completed in time for “Hard Probes”; Pass3 in preparation for QM’2012 in August
2012 pp data: processing with a new calibration schema; presently running large MC productions
… and user analysis
Production job efficiencies
MC (aliprod), RAW (alidaq), QA and AOD filtering. Averages: aliprod 90%, alidaq 75%, global 82% (a sketch of the job-weighted average follows below)
(Plot: production job efficiencies for the LHC11h Pass1 and Pass2 processing)
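The quoted global figure is consistent with a job-weighted average of the per-activity efficiencies. A minimal sketch of that arithmetic; only the 90%/75%/82% figures come from the talk, the equal wall-time shares are an assumption:

    # Sketch only: job-weighted global CPU/wall efficiency from per-activity
    # averages. The equal wall-time shares below are assumptions.

    def global_efficiency(activities):
        """Aggregate CPU/wall ratio over (cpu_hours, wall_hours) totals."""
        cpu = sum(c for c, _ in activities)
        wall = sum(w for _, w in activities)
        return cpu / wall

    aliprod = (90.0, 100.0)   # MC production: 90% CPU/wall
    alidaq = (75.0, 100.0)    # RAW reconstruction: 75% CPU/wall
    print(f"{global_efficiency([aliprod, alidaq]):.0%}")  # 82% for equal shares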
Chaotic and organized analysis
The efficiency remains almost constant (trains)
March – average 10K jobs, 7.9 GB/sec read from the SEs
Last month – 11K jobs (+10%), 9.8 GB/sec read from the SEs (+20%)
Analysis efficiencies
Analysis uses ~1/3 of available CPU resources
Average efficiency ~30%, with large variations
I/O-intensive jobs are not ‘inefficient’; they simply use other aspects of the hardware…
… that said, the CPU/wall ratio is what is measured and reported, and thus we work to improve it
Efficiency strongly depends on the type of analysis (see the sketch below)
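To make the metric concrete, a hedged sketch of the CPU/wall ratio for two hypothetical jobs, one CPU-bound and one I/O-bound; the accounting fields and numbers are illustrative, not ALICE’s actual accounting schema:

    # Hypothetical accounting records; only the ~30% average comes from the talk.
    from dataclasses import dataclass

    @dataclass
    class Job:
        cpu_s: float    # consumed CPU time, seconds
        wall_s: float   # elapsed wall-clock time, seconds
        read_gb: float  # data read from the storage elements, GB

    def cpu_wall(job):
        """The CPU/wall ratio that is measured and reported."""
        return job.cpu_s / job.wall_s

    cpu_bound = Job(cpu_s=9000, wall_s=10000, read_gb=5)   # e.g. fitting, toy MC
    io_bound = Job(cpu_s=3000, wall_s=10000, read_gb=400)  # e.g. AOD scanning

    for name, job in (("CPU-bound", cpu_bound), ("I/O-bound", io_bound)):
        rate = job.read_gb * 1e3 / job.wall_s  # MB/s
        print(f"{name}: CPU/wall = {cpu_wall(job):.0%}, read rate = {rate:.0f} MB/s")
    # The I/O-bound job reports 30% CPU/wall while keeping network and disks busy.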
The Analysis Trains
Pooling together many user analysis tasks (wagons) into a single set of Grid jobs (the train)
Managed through a web interface by a Physics Working Group conductor (ALICE has 8 PWGs)
Provides a configuration and test platform (functionality, memory, efficiency) and a submission/monitoring interface
Speed – a few days to go through a complete period (a sketch of the train principle follows the diagram below)
(Diagram: train system components – AliROOT Analysis Framework, LPM web interface, MonALISA, AliEn, Grid jobs)
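A pseudocode sketch of the train principle only – the real wagons are AliROOT analysis tasks steered through the components above; the names here are illustrative:

    # Pseudocode of the shared-pass idea; wagon objects stand in for AliROOT tasks.
    def run_train(wagons, events):
        """All wagons share a single read pass over the common input data."""
        for wagon in wagons:
            wagon.begin()               # per-wagon initialization
        for event in events:            # input is read once for the whole train
            for wagon in wagons:
                wagon.process(event)    # each user task sees every event
        return [wagon.finish() for wagon in wagons]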
Status: trains created for 6 PWGs
4 PWGs already actively submitting trains
2 in testing phase
Up to 30 wagons (individual tasks) for the most advanced
Even for non-optimized trains, the average efficiency is >60%
Future of trains
Only viable method for running large-scale analysis (many hundreds of TB of input data) on a regular basis
~300 TB is what the users have to go through today to analyze the 2010 and 2011 PbPb data samples plus the associated MC (see the estimate below)
Works equally well with smaller samples – the time gain over chaotic user analysis is significant
Does not fully replace chaotic analysis, but individual users do not need to run over everything…
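A back-of-envelope estimate of why a train pass over the full sample is feasible, assuming (unrealistically) that the whole aggregate SE rate quoted earlier serves a single train:

    # Idealized reading time for one full pass over the sample.
    sample_tb = 300   # 2010+2011 PbPb data plus associated MC, from the talk
    rate_gb_s = 9.8   # aggregate SE read rate quoted earlier
    hours = sample_tb * 1e3 / rate_gb_s / 3600
    print(f"~{hours:.0f} h of aggregate reading")
    # ~9 h in the ideal case; scheduling and job turnaround stretch a complete
    # pass to the few days quoted above, still far ahead of chaotic analysis.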
Storage
Following a T1-T2 workshop in January, sites call for a significant improvement of SE monitoring and control
xrootd development is ongoing, with new MonALISA tools for the servers
These will be presented and refined during the 2nd Federated Data Workshop at Lyon CCIN2P3 (September 2012)
We expect that the combined system will significantly improve the efficiency of the jobs by detecting bottlenecks and service overloads (a sketch of such a check follows)
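A hedged sketch of the kind of per-server overload check such tools could perform; the metric names and thresholds are invented for illustration:

    # Invented metric names and thresholds, for illustration only.
    THRESHOLDS = {"active_connections": 500, "load1": 16.0, "net_out_mb_s": 900.0}

    def overloaded(metrics):
        """Return the metrics on an xrootd server that exceed their threshold."""
        return [name for name, limit in THRESHOLDS.items()
                if metrics.get(name, 0) > limit]

    server = {"active_connections": 620, "load1": 7.2, "net_out_mb_s": 950.0}
    alerts = overloaded(server)
    if alerts:
        print("SE server overloaded:", ", ".join(alerts))  # e.g. steer jobs away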
Services summary
Stable operation of all components
Minor hiccups, very good support at all centers
Storage remains critical
Continuous cleanup of replicas of older productions, and keeping fewer replicas of ESDs (analysis is firmly on AODs); see the sketch after this list
The ongoing installation of 2012 pledges is reducing the urgency at some sites
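An illustrative replica-count policy; the talk does not spell out the actual cleanup rules, and the target counts are assumptions:

    # Assumed target replica counts per file class; not ALICE's actual policy.
    TARGET_REPLICAS = {"AOD": 3, "ESD": 1, "RAW": 2}  # analysis is firmly on AODs

    def replicas_to_drop(file_class, current):
        """Replicas of a file from an older production that can be cleaned up."""
        return max(0, current - TARGET_REPLICAS.get(file_class, 1))

    print(replicas_to_drop("ESD", 3))  # 2 of 3 ESD replicas can be removed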
CPU requirements update
(Table: updated CPU requirements for 2012 and 2013 – old vs. new estimates)
Summary
The 2011 pp and heavy-ion data are processed and being analyzed
2012 pp data taking and processing ongoing
Many thanks to the T0/T1/T2 centers for the excellent performance and support