Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)


Page 1: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

PNPI HEPD seminar, 4th November 2003
Andrey Shevel

Distributed computing in High Energy Physics with Grid Technologies
(Grid tools at PHENIX)

Page 2: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Topics
- Grid/Globus
- HEP Grid projects
- PHENIX as the example
- Concepts and a scenario for a widely distributed multi-cluster computing environment
- Job submission and job monitoring
- Live demonstration

Page 3: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

What is the Grid?
“Dependable, consistent, pervasive access to [high-end] resources.”
- Dependable: can provide performance and functionality guarantees.
- Consistent: uniform interfaces to a wide variety of resources.
- Pervasive: ability to “plug in” from anywhere.

Page 4: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Another Grid description
Quote from the Information Power Grid (IPG) at NASA:
http://www.ipg.nasa.gov/aboutipg/presentations/PDF_presentations/IPG.AvSafety.VG.1.1up.pdf

Grids are tools, middleware, and services for:
- providing a uniform look and feel to a wide variety of computing and data resources;
- supporting construction, management, and use of widely distributed application systems;
- facilitating human collaboration and remote access and operation of scientific and engineering instrumentation systems;
- managing and securing the computing and data infrastructure.

Page 5: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Basic HEP requirements in distributed computing
- Authentication/authorization/security
- Data transfer
- File/replica cataloging
- Match making/job submission/job monitoring
- System monitoring

Page 6: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

HEP Grid projects
- European DataGrid: www.edg.org
- Grid Physics Network (GriPhyN): www.griphyn.org
- Particle Physics Data Grid: www.ppdg.net
- Many others.

Page 7: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Possible task
Here we are trying to gather computing power from many clusters. The clusters are located at different sites, under different authorities. We use all local rules as they are:
- local schedulers, policies, priorities;
- other local circumstances.
One of many possible scenarios is discussed in this presentation.

Page 8: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

General scheme: jobs are planned to go where the data are, and to less loaded clusters
[Diagram: a User consults a File Catalog; jobs and partial data replicas flow between the RCF main data repository and the remote clusters.]

Page 9: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Base subsystems for the PHENIX Grid
- GridFTP (globus-url-copy)
- Globus job-manager/fork
- Package GSUNY
- Cataloging engine
- BOSS + BODE (job monitoring)
- User jobs
- GT 2.2.4.latest

Page 10: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Concepts
- Master job (script): submitted by the user.
- Satellite job (script): submitted by the master job.
- Major data sets: physics or simulated data.
- Minor data sets: parameters, scripts, etc.
- Input/output sandbox(es).

Page 11: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

The job submission scenario at a remote Grid cluster
1. Determine a qualified computing cluster: available disk space, installed software, etc.
2. Copy/replicate the major data sets to the remote cluster.
3. Copy the minor data sets (scripts, parameters, etc.) to the remote cluster.
4. Start the master job (script), which will submit many jobs through the default batch system.
5. Watch the jobs with the monitoring system, BOSS/BODE.
6. Copy the result data from the remote cluster to the target destination (desktop or RCF).
A sketch of this sequence in shell commands follows.
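A minimal end-to-end sketch of this scenario, using the gsuny commands described later in the talk (gping, gdemo, gcopy, CopyMinorData, gsub-data, gget). The cluster alias "unm" and CopyMinorData's argument form come from the live session later in the talk; the paths and the argument forms of the other commands are assumptions:

    gping                                               # 1. are the gateways alive?
    gdemo                                               #    how loaded is each cluster?
    gcopy local:/phenix/data/run03 unm:/data/run03      # 2. major data sets (form assumed)
    CopyMinorData local:andrey.shevel unm:.             # 3. minor data sets
    gsub-data TbossSuny                                 # 4. master script goes where the data are
    gget <job-id>                                       # 5. fetch job output (form assumed);
                                                        #    watch progress via BOSS/BODE
    gcopy unm:/data/run03/results local:/phenix/results # 6. results back (form assumed)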

Page 12: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Master job-script
The master script is submitted from your desktop and is executed on the Globus gateway (possibly under a group account), under the monitoring tool (assumed to be BOSS).
The master script can expect to find the following information in environment variables:
- CLUSTER_NAME: name of the cluster;
- BATCH_SYSTEM: name of the batch system;
- BATCH_SUBMIT: command for job submission through BATCH_SYSTEM.
A minimal example of such a script is sketched below.
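A minimal sketch of a master script built on these variables; only CLUSTER_NAME, BATCH_SYSTEM, and BATCH_SUBMIT come from the talk, while the satellite script name and the job count are invented:

    #!/bin/sh
    # Master script: executed on the Globus gateway under BOSS;
    # submits satellite jobs through the site's local batch system.
    . /etc/profile

    echo "Master job on cluster $CLUSTER_NAME (batch system: $BATCH_SYSTEM)"

    # Submit N satellite jobs with the site-provided submit command;
    # 'satellite.sh' and N=10 are illustrative.
    N=10
    i=1
    while [ $i -le $N ]; do
        $BATCH_SUBMIT satellite.sh
        i=`expr $i + 1`
    done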

Page 13: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Job submission scenario
[Diagram: the MASTER job is submitted from the local desktop to the Globus gateway through globus-jobmanager/fork; the MASTER job executes on the Globus gateway and submits jobs to the remote cluster with the command $BATCH_SUBMIT.]
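For illustration, with the Globus Toolkit 2 command-line clients, submitting a master script to a gateway's fork jobmanager could look like the sketch below. The gateway host is the one named elsewhere in this talk, the script path is invented, and the talk's gsub command presumably automates this step:

    # Submit the master script to the gateway's fork jobmanager (GT2);
    # prints a job contact that can be used to query the job later.
    globus-job-submit ram3.chem.sunysb.edu/jobmanager-fork /bin/sh master.sh

    # Or run a command interactively to check the gateway:
    globus-job-run ram3.chem.sunysb.edu/jobmanager-fork /bin/hostname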

Page 14: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Transfer of the major data sets
There are a number of methods to transfer major data sets:
- the utility bbftp (without use of GSI) can be used to transfer the data between clusters;
- the utility gcopy (with use of GSI) can be used to copy the data from one cluster to another;
- any third-party data transfer facility (e.g. HRM/SRM).
Illustrative command lines are shown below.
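For illustration only (hosts and paths are invented, and option details vary between installations), a GridFTP copy with GSI and a bbftp transfer might look like:

    # GridFTP copy with GSI authentication (GT2 client)
    globus-url-copy gsiftp://ram3.chem.sunysb.edu/data/run03/file.root \
                    file:///home/shevel/data/file.root

    # bbftp transfer without GSI; '-e' carries the control command
    bbftp -u shevel -e 'get /data/run03/file.root /home/shevel/data/file.root' \
          loslobos.alliance.unm.edu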

Page 15: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Copy the minor data sets
There are at least two alternative methods to copy the minor data sets (scripts, parameters, constants, etc.):
- copy the data to /afs/rhic.bnl.gov/phenix/users/user_account/…
- copy the data with the utility CopyMinorData (part of the package gsuny).
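For example, as shown in the live session later in this talk, copying a local directory to the UNM cluster:

    CopyMinorData local:andrey.shevel unm:.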

Page 16: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Package gsuny: list of scripts
General commands (ftp://ram3.chem.sunysb.edu/pub/suny-gt-2/gsuny.tar.gz):
- GPARAM: configuration description for the set of remote clusters;
- GlobusUserAccountCheck: check the Globus configuration for the local user account;
- gping: test availability of the Globus gateways;
- gdemo: see the load of the remote clusters;
- gsub: submit a job to the less loaded cluster;
- gsub-data: submit a job where the data are;
- gstat, gget, gjobs: get the status of a job, its standard output, and detailed info about jobs.
A short illustrative session follows.
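A short session sketch; gsub's argument form is the one shown in the live session, while the arguments to gstat and gget are assumed:

    GlobusUserAccountCheck     # is the local Globus setup correct?
    gping                      # which gateways answer?
    gdemo                      # current load of the remote clusters
    gsub TbossSuny             # submit the master script to the less loaded cluster
    gstat <job-id>             # job status (argument form assumed)
    gget <job-id>              # standard output (argument form assumed)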

Page 17: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Package gsuny: data transfer
- gcopy: copy the data from one cluster (or the local host) to another;
- CopyMinorData: copy minor data sets from one cluster (or the local host) to another.

Page 18: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Job monitoring
After the initial write-up of the requirements for the monitoring tool (https://www.phenix.bnl.gov/phenix/WWW/p/draft/shevel/TechMeeting4Aug2003/jobsub.pdf), the following packages were found:
- Batch Object Submission System (BOSS) by Claudio Grandi: http://www.bo.infn.it/cms/computing/BOSS/
- BOSS DATABASE EXPLORER (BODE), a web interface by Alexei Filine: http://filine.home.cern.ch/filine/

Page 19: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Basic BOSS components
- boss executable: the BOSS interface to the user;
- MySQL database: where BOSS stores job information;
- jobExecutor executable: the BOSS wrapper around the user job;
- dbUpdator executable: the process that writes to the database while the job is running;
- interface to the local scheduler.
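The user-facing boss commands named in this talk; the submit syntax is the one from the live session on a later slide, while the query and kill argument forms are assumed:

    # Register and run a job under BOSS (syntax from the live session)
    boss submit -jobtype ram3master -executable ~/andrey.shevel/TestRemoteJobs.pl \
         -stdout ~/andrey.shevel/master.out -stderr ~/andrey.shevel/master.err

    boss query                 # inspect job status in the BOSS database
    boss kill <job-id>         # kill a job (argument form assumed)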

Page 20: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Basic job flow
[Diagram: the user runs 'gsub master-script', which goes through the Globus gateway of cluster N into Globus space; BOSS (boss submit / boss query / boss kill) wraps the job, records it in the BOSS DB, and hands it to the local scheduler, which runs it on execution nodes n..m; BODE is the web interface to the BOSS DB.]

Page 21: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

gsub TbossSuny   # submit to the less loaded cluster

[shevel@ram3 shevel]$ cat TbossSuny
. /etc/profile
. ~/.bashrc
echo " This is master JOB"
printenv
boss submit -jobtype ram3master -executable ~/andrey.shevel/TestRemoteJobs.pl \
  -stdout ~/andrey.shevel/master.out -stderr ~/andrey.shevel/master.err

[shevel@ram3 shevel]$ CopyMinorData local:andrey.shevel unm:.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 YOU are copying THE minor DATA sets
              --FROM--                     --TO--
 Gateway   = 'localhost'                   'loslobos.alliance.unm.edu'
 Directory = '/home/shevel/andrey.shevel'  '/users/shevel/.'

Transfer of the file '/tmp/andrey.shevel.tgz5558' was succeeded

Page 22: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Status of the PHENIX Grid
Live info is available on the page http://ram3.chem.sunysb.edu/~shevel/phenix-grid.html
The group account ‘phenix’ is available now at:
- SUNYSB (rserver1.i2net.sunysb.edu)
- UNM (loslobos.alliance.unm.edu)
- IN2P3 (in process now)

Page 23: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Live demo of BOSS job monitoring
http://ram3.chem.sunysb.edu/~magda/BODE
User: guest
Pass: Guest101

Page 24: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

Computing utility (in place of a conclusion)
- A computing utility is a computing cluster built up for the concrete tasks of a collaboration.
- The computing utility can be implemented anywhere in the world.
- The computing utility can be used from anywhere (France, USA, Russia, etc.).
- The most important part of the computing utility is manpower.