Enabling Grids for E-sciencE — www.eu-egee.org

Experience Supporting the Integration of LHC Experiments Computing Systems with the LCG Middleware

Simone Campana, LCG Experiment Integration and Support, CERN-IT / INFN-CNAF



CHEP06 – 12-17 February 2006 – Mumbai (India)


Mandate of the LCG/EIS Team

EIS: Experiment Integration and Support Team.

Helps the LHC experiments integrate their production environment with the Grid middleware and utilities.

Offers support during all steps of the integration process: understanding the middleware functionality, testing new prototype components, and getting onto the LCG infrastructure.

One person is dedicated to each LHC experiment.

Production is the main focus: experiment support does not mean user support, and it does not mean the GOC.


Main Tasks

Integration: middleware functionality and usage; functionality tests; customized distributions and missing tools; discussing requirements and bringing them to the attention of the developers.

Experiment and user support: documentation (manuals, guides, FAQs); first-line user support; monitoring of experiment-specific production systems.

Infrastructure expertise: monitoring and managing services, both generic Grid and experiment-specific; solving site-related problems; Service Challenge second-level support (on shift).


Tools… Tools… Tools…

Data management: customized versions of the LCG data management clients.

Workload management: monitoring of a job's standard error and standard output; g-peek, to estimate the normalized CPU and wall-clock time a job has left.

Information system: a generic C++ API (with LDAP and R-GMA backends) and user-friendly querying tools.

A generic framework for job submission, used intensively by GEANT4.

Many others…

Several functionalities provided by these tools have since been integrated into the middleware (see, for example, the g-peek functionality).
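To illustrate the g-peek idea, the sketch below estimates the normalized wall-clock time a running job has left from its start time, the queue limit, and a CPU normalization factor. This is a hypothetical illustration of the concept only — the function name and normalization scheme are assumptions, not the actual tool.

```python
import time

def g_peek_time_left(start_epoch: float, wallclock_limit_s: float,
                     cpu_speed_factor: float = 1.0) -> float:
    """Estimate the normalized wall-clock seconds a running job has left.

    start_epoch:       job start time, in seconds since the epoch
    wallclock_limit_s: the queue's wall-clock limit, in seconds
    cpu_speed_factor:  speed of this worker node relative to a reference
                       CPU (a normalization assumption for illustration)
    """
    elapsed = time.time() - start_epoch
    remaining = max(0.0, wallclock_limit_s - elapsed)
    # A faster CPU (factor > 1) delivers more reference-CPU seconds
    # within the same remaining wall-clock window.
    return remaining * cpu_speed_factor

# Example: a job that started an hour ago on a 4-hour queue
estimate = g_peek_time_left(time.time() - 3600, 4 * 3600)
```

A job wrapper could print this estimate periodically alongside the tail of the job's standard output and error.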


Monitoring Tools

[Screenshots: the ATLAS SC3 service monitor, an LHCb-specific monitor, and the Site Functional Tests display]


Experiment Software Installation

[Diagram: the software-installation tool chain across node types — lcg-ManageSoftware, lcg-ManageVOTag and Tank&Spark on the CE and WN; gssklog; lcg-asis on the UI]


VO-BOX

The first prototype was developed and packaged by EIS:

- evaluation of the Globus GSI-enabled ssh server and its configuration
- development of an ad-hoc proxy renewal server, with the corresponding user-level tool
- overall configuration of the node type, including the UI clients and gssklog

EIS also follows up installation issues and further discussion of possible evolutions.
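The core of a proxy renewal service is a periodic check-and-renew cycle. The sketch below shows that cycle in a testable form; it is a hypothetical illustration, not the EIS server, and the threshold and callable names are assumptions.

```python
RENEW_THRESHOLD_S = 3600  # renew when less than one hour of lifetime remains

def renewal_cycle(proxy_time_left_s, renew, log=print):
    """One pass of a proxy-renewal loop (hypothetical sketch).

    proxy_time_left_s: remaining lifetime of the delegated proxy, in seconds
    renew:             callable that fetches a fresh proxy (in a real
                       service, e.g. from a MyProxy repository)
    Returns True if a renewal was triggered on this pass.
    """
    if proxy_time_left_s < RENEW_THRESHOLD_S:
        log("proxy close to expiry, renewing")
        renew()
        return True
    return False

# A daemon would run this periodically (say every 10 minutes), reading
# the remaining lifetime with a command such as `voms-proxy-info -timeleft`.
```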


EIS on ALICE

During Data Challenges 04 and 05, EIS:

- offered support for integrating the ALICE framework with LCG services: integration with existing services and development of new tools
- followed up the production exercise: providing solutions for site-specific problems and following the deployment of services at the sites
- collected ALICE requirements for the middleware developers


EIS on ALICE

Development of ALICE-specific user-level tools:

- integration of the MonALISA monitoring system with LCG; the tools were later generalized for other use cases
- an FTS transfer-handling client, subsequently integrated into the ALICE framework
- publication of VO-specific services in the information system, later included as part of the VO-BOX middleware component


Some Results of the last PDC04

Statistics after phase 1 (ended April 4, 2004): ALICE::CERN::LCG is the interface to LCG-2; ALICE::Torino::LCG is the interface to GRID.IT.

~1.3 million files, 26 TB data volume (S. Bagnasco, SC3 Detailed Planning Workshop, CERN, 13 June 2005).


EIS in ATLAS

Support in the development of the ATLAS framework: data management and workload management.

Operational support:

- exclusion of problematic sites
- follow-up of site configuration problems
- understanding of failures and suggestion of solutions

[Figure: number of ATLAS jobs per day on LCG, June 2004 to July 2005, peaking near 8000 jobs/day. The plot marks the Data Challenge 2 and Rome production periods — the large event production for the Rome physics workshop — both covered by EIS support activities.]


Rome Production experience on LCG

Jobs distributed to 45 different computing resources

The ratio, generally proportional to the size of each cluster, indicates an overall good job distribution. No single site ran a large majority of the jobs: the site with the largest number of CPU resources (CERN) contributed about 11% of the ATLAS production, and the other major sites ran between 5% and 8% of the jobs each.

This is an achievement toward a more robust and fault-tolerant system, one that does not rely on a small number of large computing centers.

[Pie chart: the percentage of ATLAS jobs run at each LCG site — cern.ch 11%, cnaf.infn.it 7%, rl.ac.uk 7%, roma1.infn.it 5%, in2p3-cc.fr 5%, shef.ac.uk 5%, fzk.de 5%, nikhef.nl 5%, ft.uam.es 5%, others infn.it 5%, others 5%, lnl.infn.it 4%, ific.uv.es 4%, prague.cz 3%, sinica.edu.tw 3%, ba.infn.it 2%, mi.infn.it 2%, ihep.su 2%, sara.nl 2%, triumf.ca 2%, ox.ac.uk 2%, others ac.uk 2%, in2p3-cppm.fr 1%, others fr 1%, grnet.gr 1%, ifae.es 1%]


EIS in ATLAS

Service Challenge 3: support for the ATLAS Data Management System.

- File Transfer Service (FTS) and LCG File Catalog (LFC)
- a prototype Data Location Interface (DLI), developed for ATLAS WMS and DDM integration

Role in the technical coordination of the ATLAS Service Challenge activities: ensuring the readiness of the sites before and during the exercise, and following up issues with the different services.

Testing of several new gLite components (WMS, g-pbox, FTS, …), in the context of the task force and in collaboration with ARDA.

User support: analysis of LCG-produced data.


EIS in CMS

LFC evaluation as a POOL file catalog (use case: local file catalog); performance tests. Results: LFC and POOL_LFC interface issues were discovered and fixed.

LFC evaluation as a Data Location System (DLS): implementation of a Python API; performance tests. Results: LFC was found to be a valid implementation of a DLS; performance issues were discovered and fixed.
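A DLS answers the question "which sites hold a replica of this data block?". The sketch below shows the shape of such a Python API with an in-memory store; all class and method names are hypothetical illustrations, not the actual CMS interface, whose backing store in this evaluation was the LFC catalog.

```python
class InMemoryDLS:
    """Minimal Data Location System sketch: maps a data block to the
    set of sites holding a replica. Hypothetical API for illustration."""

    def __init__(self):
        self._replicas = {}  # block name -> set of site names

    def add_replica(self, block: str, site: str) -> None:
        self._replicas.setdefault(block, set()).add(site)

    def delete_replica(self, block: str, site: str) -> None:
        self._replicas.get(block, set()).discard(site)

    def get_locations(self, block: str) -> list[str]:
        """The core DLS query: which sites host this block?"""
        return sorted(self._replicas.get(block, set()))

# Usage sketch
dls = InMemoryDLS()
dls.add_replica("/cms/dataset-A/block-001", "cern.ch")
dls.add_replica("/cms/dataset-A/block-001", "cnaf.infn.it")
locations = dls.get_locations("/cms/dataset-A/block-001")
```

In the LFC-backed variant, a block would map to a catalog entry and each replica record would name the hosting storage element.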


EIS in CMS

Service Challenge 3: fake analysis job submission; analysis of job failures and related statistics. Results: a much better understanding of the stability of the LCG infrastructure under intensive use.

Support: active in the solution of Grid-related problems for the MC production and user analysis (CRAB) activities; CMS VO management.


The CMS Analysis Jobs

[Plot taken from the CMS Dashboard (ARDA)]


EIS in LHCb

EIS supported LHCb across many activities: Data Challenge 04, Service Challenge 3, and the analysis exercise.

Operational support: chasing and tackling site- and middleware-related problems; developing experiment-specific monitoring tools (a T1-T1 transfer monitor for SC3, and VO-oriented plug-ins for the SFT).


EIS in LHCb

Integration of the LHCb framework with the LCG middleware: offering suggestions for optimized middleware usage; development of user-level tools to query the information system and to interact with SRM, LFC and the DLI; repackaging or customizing existing tools (lcg_utils and GFAL).

User support, especially for analysis users, through the GGUS portal.

Testing of new components: CREAM CE, g-pbox, WMS, …


The LHCb Data Challenge

[Plot: number of jobs run versus time, for jobs run on LCG and at DIRAC-only sites — 1.8×10^6/day with DIRAC alone, 3-5×10^6/day with LCG in action; annotations mark LCG paused, LCG restarted, and the completion of phase 1. In total: 187 M produced events, with 61% efficiency for LCG.]


WISDOM: research on malaria medical care, and a major success in EGEE. One million potential medicines were tested in one week, with 1000 CPUs employed in EGEE/LCG.

Support for the biomedical community and the WISDOM project: Biomed is the first non-HEP VO supported by EIS, with different needs, access patterns and user scenarios, and a scattered, heterogeneous community. Main support activities for Biomed:

- improvement of the job submission strategy
- adaptation of the application to the Grid environment
- operational support and user support

Biomedical Data Challenge in July-August 2005: ~70000 jobs run, 1 TB of data produced, the equivalent of ~70 CPU years computed.


GEANT4

GEANT4: simulation of particle interactions with matter, used in HEP and nuclear experiments and in medical, accelerator and space physics. Three major productions on LCG: the first two hosted by the dteam and alice VOs, the third as a real VO of its own, aimed at testing a new version of the software.

EIS supported the GEANT4 "gridification" process:

- development of tools for job submission and handling, later extended and generalized for other VOs
- creation and administration of the GEANT4 VO
- contact point for the EGEE ROC managers
- operational support during production
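A job-submission framework of this kind typically generates the middleware's Job Description Language (JDL) for each job. The fragment below is only an illustrative sketch of such generated JDL; the executable, file names and software tag are hypothetical, not the actual GEANT4 production setup.

```
[
  // Hypothetical JDL for one GEANT4 validation job (names are illustrative)
  Executable    = "run-geant4-test.sh";
  Arguments     = "--events 1000";
  StdOutput     = "job.out";
  StdError      = "job.err";
  InputSandbox  = {"run-geant4-test.sh", "macro.mac"};
  OutputSandbox = {"job.out", "job.err", "results.tar.gz"};
  VirtualOrganisation = "geant4";
  // Match only sites publishing the (hypothetical) software tag
  Requirements  = Member("VO-geant4-test",
                         other.GlueHostApplicationSoftwareRunTimeEnvironment);
]
```

The framework's job is then to generate one such description per work unit, submit the batch, and track and resubmit failures.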


Relief Projects of UNOSAT

Case study: Indian Ocean tsunami relief and development. 29 December 2004: first map distributed online to field users. January 2005: 200,000 tsunami maps downloaded in total.

UNOSAT has a huge amount of data to be stored, and CERN provides a good amount of storage. Running jobs and storing data on LCG/EGEE can certainly assist UNOSAT in its purposes; the collaboration with LCG started in summer 2005.

The gridification process was similar to the GEANT4 experience.


Summary

Our mailing list: [email protected] — web site: http://lcg.web.cern.ch/LCG/eis.htm

EIS provides help integrating VO-specific software environments with the Grid middleware: direct experiment support via contact persons, special middleware distributions, documentation, and user support.

Data Challenges, Service Challenges and distributed productions: follow-up of operational issues, maintaining experiment-specific services, and assisting sites with configuration problems. These are no longer "sporadic" exercises.

Overall, a very interesting and productive experience: the LHC experiments and other VOs seem to find the EIS team very supportive.