Pilots 2.0: DIRAC pilots for all the skies
Federico Stagni, A. McNab, C. Luzzi, A. Tsaregorodtsev
On behalf of the DIRAC consortium and the LHCb collaboration



CHEP2015, Federico Stagni, CERN

History class (with possibly an LHCb bias)


Some time ago

We didn’t have a big variety of resources:

- WLCG sites, AKA the "Grid"
  o EDG, EGEE, EGEE-II, EGEE-III, EGI, EMI, EMI2, EMI3...

Submitting jobs (from a central queue) to:

- The LCG resource broker (aka WMS): a queue for dispatching to
- CEs (LCG, then CREAM): a queue for dispatching to
- Batch queues in front of the WNs
- Finally, running on a WN

Number of queues: 4

- LCG inefficiencies exposed to end users
- High load on LCG brokers

So, pilot jobs came


Some time ago

Pilot jobs also came because VOs wanted their "own" machines for their Grid jobs
  o Or, at least, to privatize them for some hours

Pilot jobs became the first way of privatizing grid WNs:
  o set the environment
  o install the middleware

- (LHCb)DIRAC: installed in every job via the pilot jobs (wget)
- Application software: installed with SAM jobs


Overlay network paradigm

[Figure: agents (A) running on the computing resources: Grid, site clusters, PCs]

Agents form an overlay layer hiding the underlying diversity

(A. Tsaregorodtsev, CHEP 2006, DIRAC presentation)

Let’s (in 2015) substitute the word “agent” with the word “pilot”


Pilot jobs are not only agents

A DIRAC pilot has, at a minimum, to:

- install DIRAC
- configure DIRAC
- run an agent: the “JobAgent”
  o That fetches (matches, in fact) a job from the central job queue
    - Or more than one job
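
The minimal sequence above can be sketched as a toy model. This is not DIRAC code: `TaskQueue`, `match`, `run_pilot`, and the job and resource dictionaries are hypothetical names chosen for illustration, and the install and configure steps are placeholders.

```python
from collections import deque


class TaskQueue:
    """Hypothetical stand-in for the central job queue."""

    def __init__(self, jobs):
        self._jobs = deque(jobs)

    def match(self, resource):
        """Return the first waiting job whose requirements the resource meets, or None."""
        for job in list(self._jobs):
            if all(resource.get(k) == v for k, v in job["requirements"].items()):
                self._jobs.remove(job)
                return job
        return None


def run_pilot(queue, resource, max_jobs=1):
    """Install, configure, then let the agent match up to max_jobs jobs."""
    steps = ["install DIRAC", "configure DIRAC"]  # placeholders for the real work
    executed = []
    for _ in range(max_jobs):
        job = queue.match(resource)
        if job is None:
            break  # nothing in the queue fits this resource
        executed.append(job["id"])
    return steps, executed
```

The point of the sketch is that the job is pulled by the pilot once it is already running on the resource, rather than pushed through a chain of queues.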

In DIRAC, a pilot has to run on each and every computing resource type.

- Pilots (the former "agents") form an overlay layer hiding the underlying diversity
  o …or not?

Well, not completely… until Pilots 2.0


More recently

Many LHC and non-LHC communities started to have a wide variety of resources

- WLCG sites
  o CREAM CE: direct pilot job submission
    - and no more central brokers are needed: one less queue
  o But also other CEs, e.g. ARC CEs are a popular choice among sites
- WNs are VMs that the experiment provides on VAC, Cloud
  o VAC: an IAAC
  o Cloud: an IAAS
  o Condor-based systems
- Various forms of opportunistic computing
  o HLT farms
    - Some experiments made a cloud, others (i.e. LHCb) did not
  o (HPC) opportunistic sites
    - Usually, not used as real HPC, anyway…
  o BOINC
    - both IAAS and IAAC

It seems like the grid is no longer “The Grid”

Heterogeneity is the norm

Track 7, Tue, 14:00 – 16:00

Managing VMs with VAC and Vcycle, A. McNab, Track 7, Mon, 17:00


Exercise: monitoring of pilots as example of heterogeneity

- Exercise: get the Logging Info of the pilot
  o LCG: edg-job-logging-info -v 2 --noint <ref>
  o gLite: glite-wms-job-logging-info -v 3 --noint <ref>
  o CREAM: glite-ce-job-status -L 2 <ref>
  o ARC: a feature request
  o DIRAC CEs: …nope
  o CLOUDs: depends on the cloud and on what you expect
  o BOINC: NA
  o HLT: …not?
  o Opportunistic: …
  o … resources of tomorrow: ??

So, we can keep adding support for each and every type of resource, or we can just embed this functionality in the pilot

- 2nd exercise: get the log output file
  o Same story as above!
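
To make the maintenance cost of the first exercise concrete, here is a minimal sketch of the per-resource lookup that would have to be kept up to date. The command lines are the ones listed on this slide, with the option dashes written as plain hyphens. The table name, the function name, and the use of `None` for resource types without a usable command are assumptions made for illustration.

```python
# Command templates taken from the slide; None marks resource types for which
# the slide lists no usable command.
LOGGING_INFO_COMMANDS = {
    "LCG": "edg-job-logging-info -v 2 --noint {ref}",
    "gLite": "glite-wms-job-logging-info -v 3 --noint {ref}",
    "CREAM": "glite-ce-job-status -L 2 {ref}",
    "ARC": None,
    "DIRAC CE": None,
    "BOINC": None,
}


def logging_info_command(resource_type, ref):
    """Return the command line for a pilot reference, or None if unsupported."""
    template = LOGGING_INFO_COMMANDS.get(resource_type)
    return template.format(ref=ref) if template else None
```

Every new resource type needs a new entry here, and several entries have no answer at all, which is the argument for moving the functionality into the pilot itself.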


Pilots 2.0 as overlay layer


Requirements

A pilot is what creates the possibility to run jobs on a worker node.

- A pilot 2.0 is a standalone script
- Can be sent, as a “pilot job”
  o To all “Grid” CEs
- Can be run as part of the contextualization of a VM
  o Or on an HPC machine
    - Or on a …whatever machine
- Can run on every computing resource, provided that:
  o There is Python 2.6+ on the WN
  o It is an OS onto which we can install DIRAC
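
The Python requirement is the kind of check a pilot can make before doing anything else. A minimal sketch, in which the function name and its parameters are hypothetical and not taken from the DIRAC code base:

```python
import sys


def python_is_recent_enough(version_info=None, minimum=(2, 6)):
    """Return True if the interpreter version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info  # the interpreter running the pilot
    return tuple(version_info[:2]) >= minimum
```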


Pilots running everywhere

- The same pilot is used everywhere

[Figure: pilots (P) running on all the computing resources: Grid, VAC, CLOUD, HLT, opportunistic, (volunteer) PCs]


Coding rules

A toolbox of pilot capabilities (that we will call "commands") is available to the pilot script

Each command implements a single, atomic function, e.g.:
  o Run an environment test
  o Install DIRAC (or its extension)
  o Configure DIRAC
  o Run the JobAgent
  o Run a monitoring thread
  o Report usage
  o ... and whatever else is needed

- Communities can easily extend the content of the toolbox, adding more commands
- If necessary, different computing resource types can run different commands
  o All configurable
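
The toolbox idea can be sketched as a small command pattern. This is an illustration under stated assumptions, not the actual pilot code: the class names, the `params` dictionary, and the `COMMANDS_BY_RESOURCE` table are all hypothetical, and each command only records that it ran.

```python
class PilotCommand:
    """Base class: one command performs one atomic action."""

    def __init__(self, params):
        self.params = params  # shared state passed along the chain of commands

    def execute(self):
        raise NotImplementedError


class CheckEnvironment(PilotCommand):
    def execute(self):
        self.params["log"].append("environment checked")


class InstallDIRAC(PilotCommand):
    def execute(self):
        self.params["log"].append("DIRAC installed")


class ConfigureDIRAC(PilotCommand):
    def execute(self):
        self.params["log"].append("DIRAC configured")


class RunJobAgent(PilotCommand):
    def execute(self):
        self.params["log"].append("JobAgent started")


# Hypothetical configuration: which commands each resource type runs, in order.
COMMANDS_BY_RESOURCE = {
    "default": [CheckEnvironment, InstallDIRAC, ConfigureDIRAC, RunJobAgent],
    "VM": [InstallDIRAC, ConfigureDIRAC, RunJobAgent],
}


def run_commands(resource_type):
    """Run the configured commands for a resource type and return the log."""
    params = {"resource_type": resource_type, "log": []}
    chain = COMMANDS_BY_RESOURCE.get(resource_type, COMMANDS_BY_RESOURCE["default"])
    for command_class in chain:
        command_class(params).execute()
    return params["log"]
```

In a layout like this, a community would extend the toolbox by subclassing `PilotCommand` and adding the new class to the configured list for the resource types that need it.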


LHCb pilots 2.0


Pilots 2.0 can be easily extended

- LHCb requirement: CVMFS
  o For the distribution of all LHCb software, including LHCbDIRAC
  o LHCb Pilots will try to “install” (set up) LHCbDIRAC from CVMFS
    - If it fails (e.g., it’s a just-deployed version and the cache is cold, or it’s a test version), it will fall back to installing the old way
  o LHCb won’t download CAs: instead it uses what is on CVMFS
  o …
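
The CVMFS-first behaviour with a fallback can be sketched as follows. The two callables and the function name are hypothetical placeholders for whatever performs the CVMFS setup and the traditional installation; the choice of exceptions that trigger the fallback is also an assumption.

```python
def set_up_lhcbdirac(setup_from_cvmfs, install_the_old_way):
    """Try the CVMFS setup first; fall back to a regular install if it fails."""
    try:
        setup_from_cvmfs()
        return "cvmfs"
    except (OSError, RuntimeError):
        # e.g. the requested version is not (yet) visible on CVMFS
        install_the_old_way()
        return "fallback"
```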

LHCb experience with running jobs in VMs, A. McNab, Track 7, Tue, 17:30


“Pilots to fly in all the skies”



Summary and prospects

- Pilots 2.0 are the “pilots to fly in all the skies”
- Available to all DIRAC communities as of DIRAC v6r12
  o Some may not have noticed the change…
- Easy to extend
  o Actively developed
- Pilots 2.0 are the real federators
  o WNs look the same everywhere
    - Get a pilot running, and you will monitor all of them the same way
- Extended by LHCb
  o Mainly for fully using CVMFS

Prospects:

- Add more generic commands
- One will be for running SAM jobs


Thanks!

Questions, comments?
