AliEn @GRID: ALICE Environment on the GRID. Pablo Saiz/CERN, P. Buncic, J-E. Revsbech, R. Piskac, V. Sego, L. Aphecetche. ALICE Collaboration.

Post on 19-Dec-2015


AliEn @GRID

ALICE Environment on the GRID

Pablo Saiz/CERN

P. Buncic, J-E. Revsbech, R. Piskac, V. Sego, L. Aphecetche

ALICE Collaboration

04/18/23 [email protected] 2

Content

• Alice at LHC
• Alice Computing Model
• Building AliEn
• AliEn Components
• Deploying AliEn
• AliEn Roadmap
• Conclusions


ALICE @ LHC

CERN - LHC

Construction

Problem

• Typical next-generation HEP experiment
  • Large-scale simulation and reconstruction effort
  • Heavily distributed processing and event storage
• ~1000 scientists in ~100 institutions
  • Complex analyses of distributed data
  • Large files (one event up to 2 GB)
  • 10^9 files/year (x n, n > 2)
  • 2 PB/year
• Experiment lifetime: 20-25 years
• GRID: widely accepted as a solution
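To get a feel for the scale quoted above, the figures can be combined in a quick back-of-the-envelope calculation. This is only a sketch: it assumes decimal units (1 PB = 10^15 bytes) and ignores the "x n" replication factor mentioned on the slide.

```python
# Back-of-the-envelope check of the data volumes quoted on the slide.
# Assumption: decimal units (1 PB = 10**15 bytes); the "x n" factor is ignored.
FILES_PER_YEAR = 10**9
BYTES_PER_YEAR = 2 * 10**15        # 2 PB/year
LIFETIME_YEARS = (20, 25)          # experiment lifetime range

# Average size of one catalogue entry, in bytes
avg_file_bytes = BYTES_PER_YEAR / FILES_PER_YEAR

# Total volume over the lifetime of the experiment, in PB
lifetime_pb = [years * BYTES_PER_YEAR / 10**15 for years in LIFETIME_YEARS]

print(f"average file size: {avg_file_bytes / 10**6:.0f} MB")
print(f"lifetime volume: {lifetime_pb[0]:.0f}-{lifetime_pb[1]:.0f} PB")
```

Under these assumptions the average file is a few megabytes, while the catalogue has to track billions of entries per year, which is why the file count matters as much as the raw volume.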

Alice Use Cases

• Simulation, Data Challenges & Reconstruction
  • Centrally managed production of background events
  • Distributed processing and event storage
• Event mixing
  • Not necessarily centrally managed
  • Once background events exist, subsequent requests for event mixing must be routed to the location which holds the required input
• Analysis
  • Using the AliEn API, PROOF will locate the optimal site(s) for macro execution, try to execute the macro in parallel, collect the output and return it to the user (or register it in the catalogue)
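The event-mixing rule above (send the request to where the input already is) can be illustrated with a small sketch. The function name, the replica map and the "most inputs wins" policy are assumptions made for illustration; the slides do not describe how AliEn actually chooses a site.

```python
from collections import Counter

def pick_site(input_files, replicas):
    """Return the site holding the largest number of the requested input files.

    `replicas` maps a file name to the sites that store a copy of it
    (a hypothetical structure, not the AliEn catalogue schema).
    """
    counts = Counter(site for f in input_files for site in replicas.get(f, ()))
    if not counts:
        raise LookupError("no site holds any of the requested inputs")
    # Most inputs first; ties are broken alphabetically so the result is deterministic
    return min(counts, key=lambda site: (-counts[site], site))

# Illustrative replica map with made-up file names
replicas = {
    "background-001": ["CERN", "Torino"],
    "background-002": ["CERN"],
}
chosen = pick_site(["background-001", "background-002"], replicas)
print(chosen)
```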


ALICE Computing Model

[Figure: AliROOT layered on ROOT. User: Simulation, Reconstruction, Calibration, Analysis (C++). System: GUI, Persistent IO, Utility libs (C++). World: Interfaces & Distributed computing environment (anything). Speech bubble: "Nice! I only have to learn C++"]

Challenge

Can we provide, building on top of available public domain and open source components and standards, a functional distributed computing infrastructure to the community of ALICE users which will remain operational even if the underlying technologies keep changing?


Building AliEn


Open Source Components

• SASL/OpenSSL/OpenCA as authentication protocol
• Globus/GSS as an implementation of authentication compatible with other Grid projects
• CONDOR ClassAds language for job description (compatible with EU DataGrid)
• OpenLDAP for configuration management
• Apache for Web Portal
• MySQL as relational database backend
• bbFTP as file transfer protocol

Gluing it together…

• Already existing pieces of code (NA49 file catalogue) in perl5
  • Good interface to different databases
  • Easy Web integration
• Simple Object Access Protocol (SOAP, also known as Service Oriented Access Protocol)
  • Good Perl implementation (SOAP::Lite) on client and server side
  • Possibility to provide client access from many different platforms and languages (Java, C, C++…)
  • Provides standard means to invoke procedures (services) in a distributed environment
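To make the "standard means to invoke procedures" point concrete, here is a minimal sketch of what a SOAP 1.1 request looks like on the wire. It is written in Python with the standard library rather than Perl/SOAP::Lite, and the service namespace, the method name `ls` and its `path` parameter are hypothetical, not taken from the AliEn services.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:Catalogue"                     # hypothetical service namespace

def build_request(method, **params):
    """Serialize a remote procedure call as a SOAP 1.1 envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

def parse_request(xml_text):
    """Server side: recover the method name and its parameters."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]   # first element inside Body
    method = call.tag.split("}", 1)[1]               # strip the "{namespace}" prefix
    return method, {child.tag: child.text for child in call}

request = build_request("ls", path="/alice/simulation")
print(parse_request(request))
```

Because the call is plain XML, any language with an XML parser and an HTTP client can act as a client, which is the portability argument the slide makes.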

Components

[Diagram: AliEn components.
AliEn Services: ClusterMonitor, ProcessMonitor, ComputingElement, StorageElement, FTD, Information Service, DBProxy, Logger, Authen, CPUServer, WebPortal, RB.
Modules & libraries (RPM packages): AliEn-Base, AliEn-Client, AliEn-Portal, AliEn-SE, AliEn-CE, AliEn-Server, AliEn-Alice.
User Application (C/C++) via the C/C++ API.]

"Web of Services"

[Diagram labels: UserInterface, ClusterMonitor, ProcessMonitor, ComputingElement (CE: Alice, Atlas), StorageElement (SE), FTD, IS, DBProxy, Logger, Authen, CPUServer, WebPortal (Portal), RB, each drawn on top of BaseClient, with Server boxes; DBDriver and RDBMS; LDAP; User Application (C/C++).]

Statistics…

SLOCCount is Open Source Software/Free Software, licensed under the FSF GPL. Please credit this data as "generated using 'SLOCCount' by David A. Wheeler."

Total Physical Source Lines of Code (SLOC): 21,428
Development Effort Estimate, Person-Years: 5
Estimated Average Number of Developers: 5.06
Total Estimated Cost to Develop: 674,796

[Chart: AliEn Components: AliEn, Portal, scripts, SASL, Monitor, GUI, API, DBD, GSS, Alice, Html]
[Chart: Total SLOC grouped by language: perl, ansic, sh]

Statistics…

SLOCCount is Open Source Software/Free Software, licensed under the FSF GPL. Please credit this data as "generated using 'SLOCCount' by David A. Wheeler."

Total Physical Source Lines of Code (SLOC): 1,711,001
Development Effort Estimate, Person-Years: 496.53
Estimated Average Number of Developers: 87.63
Total Estimated Cost to Develop: 67,074,030

[Chart: Total SLOC grouped by language: ansic, perl, sh, cpp, php, asm, yacc, lisp, java, lex, python]
[Chart: Modules: apache, perl-5.6.1, globus, openldap-2.0.23, Gtk-Perl-0.7008, openssl-0.9.6b, freetype-1.3.1, SWIG-1.3.11, imlib2-1.0.5, gpt-1.0, edb-1.0.2]

AliEn vs OpenSource

[Chart: share of total code: AliEn 1%, Modules 99%]

Benefits of development based on OpenSource components are more than obvious…
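The effort, developer and cost figures on the two Statistics slides are consistent with the basic COCOMO model (organic mode) that SLOCCount applies by default. The sketch below recomputes them from the SLOC totals; the salary (56,286 per year) and overhead factor (2.4) are SLOCCount's documented defaults and are assumed here, since the slides do not state which settings were used.

```python
def basic_cocomo(sloc, salary=56286.0, overhead=2.4):
    """Basic COCOMO, organic mode, with SLOCCount's default salary and overhead."""
    ksloc = sloc / 1000.0
    effort_pm = 2.4 * ksloc ** 1.05        # effort in person-months
    schedule_m = 2.5 * effort_pm ** 0.38   # schedule in months
    return {
        "person_years": effort_pm / 12,
        "developers": effort_pm / schedule_m,
        "cost": effort_pm / 12 * salary * overhead,
    }

ALIEN_SLOC = 21_428        # AliEn's own code
MODULES_SLOC = 1_711_001   # open source modules it builds on

alien = basic_cocomo(ALIEN_SLOC)
modules = basic_cocomo(MODULES_SLOC)
share = ALIEN_SLOC / (ALIEN_SLOC + MODULES_SLOC)

for name, est in (("AliEn", alien), ("Modules", modules)):
    print(f"{name}: {est['person_years']:.2f} person-years, "
          f"{est['developers']:.2f} developers, cost {est['cost']:,.0f}")
print(f"AliEn share of total SLOC: {share:.1%}")
```

The last line gives the ratio behind the 1% / 99% split shown in the chart.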


AliEn Components

AliEn SASL implementation

• SASL is the Simple Authentication and Security Layer, a method for adding authentication support to connection-based protocols
• AliEn now has a Perl module with a GSSAPI implementation
• This allows us to use
  • all SASL authentication schemes
  • old AliEn authentication (token, AFS password, SSH)
  • X509 certificates
  • Globus/GSI (credential delegation)
• The AliEn distribution includes the necessary Globus/MDS/GSI software
• This allows us to develop secure Peer-To-Peer File Transfers based on machine/protocol/user certificates and LDAP-based configuration management

Authentication

[Diagram: Client, Proxy Server, Database, LDAP. Exchange: Request methods; List of methods; SASL Authentication; Checking if user exists; Data.]

Supported methods:

• X509 (AliEn/Globus)
• PKI/RSA (ssh)
• Token (AliEn)
• AFS password
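The exchange in the diagram (request methods, receive the list, authenticate, check that the user exists) follows the usual SASL negotiation pattern. A minimal sketch of that pattern is given below; the mechanism labels, function names and the `verify` callback are illustrative assumptions, not AliEn identifiers.

```python
# Client preference order; the names are illustrative labels only.
CLIENT_PREFERENCE = ["X509", "SSH", "TOKEN", "AFS"]

def choose_mechanism(server_offers, client_preference=CLIENT_PREFERENCE):
    """Pick the first client-preferred mechanism that the server also offers."""
    offered = set(server_offers)
    for mechanism in client_preference:
        if mechanism in offered:
            return mechanism
    raise ValueError("no common authentication mechanism")

def authenticate(user, mechanism, known_users, verify):
    """Proxy-side check: the user must exist and the credential must verify.

    `known_users` stands in for the LDAP/database lookup and `verify` for the
    mechanism-specific credential check; both are placeholders.
    """
    if user not in known_users:
        return False
    return bool(verify(user, mechanism))

mechanism = choose_mechanism(["TOKEN", "SSH"])
granted = authenticate("psaiz", mechanism, {"psaiz"}, lambda user, mech: True)
print(mechanism, granted)
```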

File catalogue

[Diagram labels: ALICE USERS, ALICE SIM, Tier1, ALICE LOCAL]

|--./
| |--cern.ch/
| | |--user/
| | | |--a/
| | | | |--admin/
| | | | |
| | | | |--aliprod/
| | | |
| | | |--f/
| | | | |--fca/
| | | |
| | | |--p/
| | | | |--psaiz/
| | | | | |--as/
| | | | | |
| | | | | |--dos/
| | | | | |
| | | | | |--local/

| | | |
| | | |--b/
| | | | |--barbera/

|--simulation/
| |--2001-01/
| | |--V3.05/
| | | |--Config.C
| | | |--grun.C

| |--36/
| | |--stderr
| | |--stdin
| | |--stdout
| |
| |--37/
| | |--stderr
| | |--stdin
| | |--stdout
| |
| |--38/
| | |--stderr
| | |--stdin
| | |--stdout

Files, commands (job specification), job input and output, and metadata are all stored in the catalogue.
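A catalogue that looks like a file system but lives in a relational database can be sketched with a single table that maps logical file names to physical locations. The example below uses SQLite in place of MySQL; the table layout, the physical file names and the `output/` subdirectory are assumptions for illustration, not the AliEn schema.

```python
import sqlite3

def make_catalogue():
    """Create an in-memory catalogue: logical file name -> physical file name."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE entries (lfn TEXT PRIMARY KEY, pfn TEXT NOT NULL)")
    return db

def register(db, lfn, pfn):
    db.execute("INSERT INTO entries (lfn, pfn) VALUES (?, ?)", (lfn, pfn))

def whereis(db, lfn):
    """Return the physical location of a logical file, or None if unknown."""
    row = db.execute("SELECT pfn FROM entries WHERE lfn = ?", (lfn,)).fetchone()
    return row[0] if row else None

def ls(db, directory):
    """List the immediate children of a directory, like `ls` on a file system."""
    prefix = directory.rstrip("/") + "/"
    rows = db.execute(
        "SELECT lfn FROM entries WHERE substr(lfn, 1, ?) = ?",
        (len(prefix), prefix),
    )
    children = set()
    for (lfn,) in rows:
        head, sep, _ = lfn[len(prefix):].partition("/")
        children.add(head + sep)   # a trailing "/" marks a subdirectory
    return sorted(children)

db = make_catalogue()
base = "/simulation/2001-01/V3.05"
register(db, base + "/Config.C", "file://se.example/data/0001")       # made-up PFN
register(db, base + "/grun.C", "file://se.example/data/0002")         # made-up PFN
register(db, base + "/output/stdout", "file://se.example/data/0003")  # made-up entry
print(ls(db, base))
```

Directories are implicit here: they exist only as prefixes of registered names, which keeps the sketch short but is a simplification.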


Command Interface

GUI: AliEn Xfiles

Web Portal

http://alien.cern.ch

• Generic Web portal
• Virtual Organizations
  • Alice
  • Atlas
  • NA49
  • Demo
  • Mammogrid

Task Queue

"Pull" rather than "push" architecture

[Diagram: Tier0 holds the TASK QUEUE, CPUServer and ACCT. Each REMOTE SITE has a RemoteQueue, ClusterMonitor and ACCT, running Job 1, Job 2, … Job n, each with its own ProcessMonitor. A remote site can also be ANOTHER GRID: AliEn Server in front of EDG/Globus.]

Resource Broker

[Diagram: Tasks are submitted with "alien job-submit job.jdl"; the Broker compares each task with the CE. Match? Yes: Select. No: Next.]

The CE contacts the CPUServer and presents its own ClassAd. The Resource Broker matches it against the job ClassAds and selects the most appropriate job to run on that CE.
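The "Match? Yes: Select, No: Next" loop of a pull architecture can be sketched as a queue that hands out work only when a CE asks for it. The class and field names below are hypothetical, and the sketch returns the first matching job, whereas the slide speaks of selecting the most appropriate one.

```python
class TaskQueue:
    """Central queue; jobs wait here until a computing element pulls them."""

    def __init__(self):
        self._jobs = []

    def submit(self, job):
        self._jobs.append(job)

    def pull(self, ce, matches):
        """Called when a CE asks for work: return the first matching job or None."""
        for index, job in enumerate(self._jobs):
            if matches(job, ce):          # Match? Yes: select this job
                del self._jobs[index]
                return job
            # No: try the next job in the queue
        return None

# Illustrative match rule: the CE must have the package the job needs
def package_match(job, ce):
    return job["package"] in ce["packages"]

queue = TaskQueue()
queue.submit({"id": 1, "package": "OtherPackage"})
queue.submit({"id": 2, "package": "AliRoot"})

ce = {"name": "Alice::CERN::LXBATCH", "packages": {"AliRoot", "ROOT"}}
first = queue.pull(ce, package_match)
second = queue.pull(ce, package_match)
print(first, second)
```

In a pull model the central service never needs to know which sites are up: an idle or unreachable site simply stops asking, and its share of jobs stays in the queue for others.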

Resource Broker

[Diagram: Resource Broker with Optimizer]

Class Ads & JDL

An example: JDL file to run an Alice simulation job:

Requirements = ( other.Type == "machine" ) && ( member(other.Packages, "AliRoot") );
Packages = "AliRoot";
Arguments = "--round 2002-02 --run 00071 --event 269 --version v3.08.02 --grun G+F";
Executable = "/Alice/bin/AliRoot.sh";
InputFile = { "LF:/alice/simulation/2002-02/v3.08.02/00071/Config.C",
              "LF:/alice/simulation/2002-02/v3.08.02/00071/grun.C" };
Type = "Job";

Class Ads & JDL

Class Ads of CE:

Requirements = ( other.Type == "Job" );
Type = "machine";
Host = "alienx.cern.ch";
CE = "Alice::CERN::LXBATCH";
Packages = { "AliRoot", "ROOT", "AliRoot::3.08.02" };
CloseSE = { "Alice::CERN::Castor", "Alice::CERN::File", "Alice::CERN::scratch" };
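ClassAd matching is symmetric: each ad carries a Requirements expression that is evaluated against the other ad, and the pair matches only if both hold. The sketch below mimics this with Python dictionaries and lambdas using the attribute values from the job and CE ads; it is an illustration of the idea, not the Condor ClassAd evaluator, and the `member` helper is a simplified stand-in for the ClassAd function of the same name.

```python
def member(collection, item):
    """Simplified stand-in for the ClassAd member() test used in the job ad."""
    if isinstance(collection, str):
        collection = [collection]
    return item in collection

# The job ad; Requirements is written as a function of the *other* ad
job_ad = {
    "Type": "Job",
    "Packages": "AliRoot",
    "Executable": "/Alice/bin/AliRoot.sh",
    "Requirements": lambda other: other["Type"] == "machine"
    and member(other["Packages"], "AliRoot"),
}

# The CE ad
ce_ad = {
    "Type": "machine",
    "Host": "alienx.cern.ch",
    "CE": "Alice::CERN::LXBATCH",
    "Packages": ["AliRoot", "ROOT", "AliRoot::3.08.02"],
    "Requirements": lambda other: other["Type"] == "Job",
}

def symmetric_match(a, b):
    """Two ads match when each one's Requirements accepts the other."""
    return bool(a["Requirements"](b)) and bool(b["Requirements"](a))

print(symmetric_match(job_ad, ce_ad))
```

A CE that does not advertise AliRoot in its Packages list would fail the job's Requirements, so the broker would move on to the next candidate.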

Monitoring

[Diagram: Computer → Local Center → GRID CENTER]

• In order to develop and deploy a more refined Resource Broker we need a monitoring framework
• Frequent data updates, large data volume for a large number of computers
• The idea is to implement a hierarchy of clients and servers where each client (child) maintains the history of measurements and reports the summary information to the upper layer (parent) using the SOAP protocol
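The hierarchy described above, where detailed history stays local and only summaries travel upward, can be sketched as follows. The class name, the choice of count and mean as the summary, and the direct method call standing in for the SOAP request are all assumptions made for illustration.

```python
from collections import deque

class MonitorNode:
    """One level of the monitoring hierarchy (computer, local center, grid center)."""

    def __init__(self, name, parent=None, history_size=100):
        self.name = name
        self.parent = parent
        self.history = deque(maxlen=history_size)  # detailed measurements stay local
        self.child_summaries = {}                  # latest summary from each child

    def record(self, value):
        self.history.append(value)

    def summary(self):
        """Combine own measurements with child summaries into a count and a mean."""
        count = len(self.history)
        total = sum(self.history)
        for child in self.child_summaries.values():
            count += child["count"]
            total += child["mean"] * child["count"]
        return {"count": count, "mean": total / count if count else 0.0}

    def report(self):
        """Send the summary to the parent (a SOAP call in the slides, a method call here)."""
        if self.parent is not None:
            self.parent.child_summaries[self.name] = self.summary()

grid = MonitorNode("grid-center")
site = MonitorNode("local-center", parent=grid)
node_a = MonitorNode("computer-a", parent=site)
node_b = MonitorNode("computer-b", parent=site)

node_a.record(1.0)
node_a.record(3.0)
node_b.record(5.0)
for node in (node_a, node_b, site):   # report bottom-up
    node.report()
print(grid.summary())
```

Only one small summary per child crosses each link, so the traffic reaching the top level grows with the number of sites rather than with the number of measurements.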


Deploying AliEn

First implementation of Alice World Computing Model

Production Summary

• 5682 events validated, 118 failed (2%)
• Up to 300 concurrently running jobs worldwide (5 weeks)
• 5 TB of data generated and stored at the sites with mass storage capability (CERN 73%, CCIN2P3 14%, LBL 14%, OSC 1%)
• 10^5 CPU hours
• 13 clusters, 9 sites: CERN, CCIN2P3, LBL, Torino, Catania, Padova, OSC, NIKHEF, Bari
• In addition, ready by now: GSI, Karlsruhe, Dubna, Nantes, Budapest, Bologna, Zagreb, Birmingham, Utrecht, Calcutta


AliEn Roadmap

AliEn as a meta-GRID

[Diagram: AliEn User Interface on top of the AliEn stack, iVDGL stack and EDG stack]

Roadmap…

• Optimization and test suite
• PROOF interface & support for interactive jobs
• EDG interface
• GRID partitioning
• Queue optimization (based on AliEn monitoring)
• Implementation of Web services
  • SOAP (Simple Object Access Protocol)
  • WSDL (Web Services Description Language)
  • UDDI (Universal Description Discovery & Integration)
• Virtual datasets

Summary

• The AliEn framework is a lightweight, simplified but functionally equivalent alternative to a full-blown GRID, based on standard components (SOAP, Web services)
• It has been tested in production and will be continuously developed with the aim of providing a long-term stable interface to GRID(s) for Alice users
• AliEn will be used to provide the GRID component for MammoGRID, a 3-year, 2M Euro project funded by the EC, starting in September

Summary of AliEn features (visit http://alien.cern.ch)

• Authentication module which supports various authentication methods (Globus/GSI)
• Distributed file catalogue built on top of RDBMS with a user interface that mimics the file system
• Secure file transport and replication service
• Task queue which holds commands to be executed in the system, and Resource Broker
• Configuration and Information Service
• Computing and Storage elements
• Metadata catalogue
• Monitoring framework
• C/C++/perl API
• Web portal