
GANGA PANDA

Dietrich Liko

Motivation

• Access to OSG resources by GANGA

• Collaboration with US colleagues

• Possibly an alternative way of submitting jobs to (some) LCG sites

Does it make sense?

• GANGA has been designed as a backend-neutral system; PANDA is just another option

• We have to stay close to pathena capabilities and add our strengths
  – GUI
  – Job management
  – GANGA Robot
  – AMI etc.

• Other aspects like incomplete datasets are a problem for GANGA and for pathena

• Aim: GANGA has to be an attractive interface also for PANDA users.

How does pathena work

• It is surprisingly similar to GANGA

• Small details are different

• The idea is to run standard pathena jobs via PANDA
  – We just provide the parameters
  – The maintenance of the run script stays with PANDA

Some aspects

• PANDA build jobs

• PANDA supports a workflow and uses a kind of DAG to compile just once

• PANDA backend extended to include an optional PandaBuildJob

PANDA Job

  backend = Panda(
      status = None,
      actualCE = None,
      site = 'ANALY_BNL_ATLAS_1',
      id = 2093001,
      buildjob = PandaBuildJob(status = 'finished', id = 2093000)
  )

• Single jobs
  – Two backend job IDs in one Job

• Subjobs
  – PandaBuildJob is part of the master job
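The layout above can be sketched with two small stand-in classes. This is only an illustration written for modern Python: `PandaBackendSketch`, `BuildJobSketch` and `backend_ids` are hypothetical names, not the GANGA classes, and the fields merely mirror the printed backend shown on the slide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BuildJobSketch:
    """Hypothetical stand-in for the optional build job record."""
    id: int
    status: Optional[str] = None


@dataclass
class PandaBackendSketch:
    """Hypothetical stand-in for the backend object printed on the slide."""
    site: str
    id: Optional[int] = None
    status: Optional[str] = None
    actualCE: Optional[str] = None
    buildjob: Optional[BuildJobSketch] = None

    def backend_ids(self) -> List[int]:
        """Return every backend job ID attached to this job, build job first."""
        ids = []
        if self.buildjob is not None:
            ids.append(self.buildjob.id)
        if self.id is not None:
            ids.append(self.id)
        return ids


# Single job: one Job carries two backend IDs (build job and run job).
single = PandaBackendSketch(
    site="ANALY_BNL_ATLAS_1",
    id=2093001,
    buildjob=BuildJobSketch(id=2093000, status="finished"),
)

# Split job: the build job sits on the master, the subjob has only its own ID.
master = PandaBackendSketch(
    site="ANALY_BNL_ATLAS_1",
    buildjob=BuildJobSketch(id=2093000, status="finished"),
)
subjob = PandaBackendSketch(site="ANALY_BNL_ATLAS_1", id=2093001)
```

Keeping the build job as an optional field lets the same backend type describe both the single-job and the master/subjob case.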

How to define a Panda Job

• Definition of the Job is part of the RunTimeHandler

• Upload tar file to WebDAV server

• Create a job definition that will be pickled and uploaded to the PANDA server

JobSpec

  jspec = JobSpec()
  jspec.jobDefinitionID = job.id
  jspec.jobName = commands.getoutput('uuidgen')
  jspec.AtlasRelease = 'Atlas-%s' % app.atlas_release
  jspec.homepackage = 'AnalysisTransforms'
  jspec.transformation = '%s/runAthena10' % Client.baseURLSUB
  if job.inputdata:
      jspec.prodDBlock = job.inputdata.dataset
  else:
      jspec.prodDBlock = 'NULL'
  jspec.destinationDBlock = self.outputdataset
  jspec.destinationSE = job.backend.site
  jspec.prodSourceLabel = 'user'
  jspec.assignedPriority = 1000
  jspec.computingSite = job.backend.site

FileSpec

  finp = FileSpec()
  finp.lfn = lfn
  finp.GUID = guid
  #finp.fsize =
  #finp.md5sum =
  finp.dataset = job.inputdata.dataset
  finp.prodDBlock = job.inputdata.dataset
  finp.dispatchDBlock = job.inputdata.dataset
  finp.type = 'input'
  finp.status = 'ready'  # data is already present; one can also wait for a file to arrive
  jspec.addFile(finp)

Similar for output and log file
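The "define, attach files, pickle" sequence can be tried out without the PANDA client. The sketch below is written for modern Python and uses `types.SimpleNamespace` as a hypothetical stand-in for JobSpec and FileSpec. The helpers `make_file` and `make_job`, the dataset names and the log file name are invented for illustration. The upload step is left out because the client call is not shown in the slides.

```python
import pickle
from types import SimpleNamespace


def make_file(lfn, dataset, file_type, status=None):
    """Hypothetical stand-in for a FileSpec: a plain attribute container."""
    return SimpleNamespace(lfn=lfn, dataset=dataset, type=file_type, status=status)


def make_job(name, site, input_dataset, output_dataset):
    """Hypothetical stand-in for a JobSpec with an initially empty file list."""
    return SimpleNamespace(
        jobName=name,
        computingSite=site,
        prodDBlock=input_dataset if input_dataset else "NULL",
        destinationDBlock=output_dataset,
        prodSourceLabel="user",
        Files=[],
    )


# Illustrative dataset and file names, not taken from a real job.
out_ds = "user.DietrichLiko.ganga.25"
job = make_job("example-job", "ANALY_BNL_ATLAS_1", "some.input.dataset", out_ds)

# Input file is marked 'ready' because the data is already present.
job.Files.append(make_file("input._00001.root", "some.input.dataset", "input", "ready"))
# Output and log files are attached the same way, with a different type.
job.Files.append(make_file(out_ds + ".PIPPO._00001.root", out_ds, "output"))
job.Files.append(make_file(out_ds + "._00001.log.tgz", out_ds, "log"))

# Serialise the list of job definitions; this byte string is what would be sent.
payload = pickle.dumps([job], protocol=2)

# Round trip, as the receiving side would do.
restored = pickle.loads(payload)
```

The point is only that the job definition and its attached files travel together as one pickled object.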

Issues

• Some Python issues

• datetime and decimal required
  – At least Python 2.3

• outputdata

Outputdata

• pathena extracts the outputdata from the option file
  – Many possibilities
  – ntuple, hist, ESD, AOD, TAG, AANT, THIST, iRoot, EXT, Stream1
  – A fixed schema is used to overload the names

• Ntuple -> dataset name, ntuple name, jobid
  user.DietrichLiko.ganga.25.PIPPO._00001.root

• Manual extraction of these parameters turns out not to be an option
  – Last missing part of the handler

• Can only run in athena environment
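The naming schema for the ntuple case can be illustrated with a small helper. The pattern is inferred from the single example on the slide, so three things are assumptions: that the name splits into dataset name, ntuple name and job number in that order, that the number is zero-padded to five digits, and that the extension is `root`. `output_lfn` is a hypothetical name, not a pathena or GANGA function.

```python
def output_lfn(dataset_name, ntuple_name, job_number, extension="root"):
    """Build an output file name as <dataset>.<ntuple>._<NNNNN>.<ext>.

    The pattern is inferred from one example and may not cover the other
    output types (hist, ESD, AOD, ...).
    """
    return "%s.%s._%05d.%s" % (dataset_name, ntuple_name, job_number, extension)


# Reproduces the example name shown on the slide.
example = output_lfn("user.DietrichLiko.ganga.25", "PIPPO", 1)
```

Deriving the name from these parameters only works once they are known, which is why extracting them from the option file is the missing piece.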

Immediate work

• Finish the last piece

• Testing

• Documentation

• Support for most pathena options

• Afterwards: Iteration with Johannes on further support
