GANGA PANDA
Dietrich Liko
Motivation
• Access to OSG resources by GANGA
• Collaboration with US colleagues
• Possibly an alternative way of submitting jobs to (some) LCG sites
Does it make sense?
• GANGA has been designed as a backend-neutral system; PANDA is just another option
• We have to stay close to pathena capabilities and add our strengths
  – GUI
  – Job management
  – GANGA Robot
  – AMI etc.
• Other aspects like incomplete datasets are a problem for GANGA and for pathena
• Aim: GANGA has to be an attractive interface also for PANDA users.
How does pathena work
• It is surprisingly similar to GANGA
• Small details are different
• The idea is to run standard pathena jobs via PANDA
  – We just provide the parameters
  – The maintenance of the run script stays with PANDA
Some aspects
• PANDA Build jobs
• PANDA supports a workflow and uses a kind of DAG to compile just once
• PANDA backend extended to include an optional PandaBuildJob
PANDA Job
backend = Panda (
    status = None ,
    actualCE = None ,
    site = 'ANALY_BNL_ATLAS_1' ,
    id = 2093001 ,
    buildjob = PandaBuildJob (
        status = 'finished' ,
        id = 2093000
        )
    )
• Single Jobs
  – Two backend jobids in one Job
• Subjobs:
  – PandaBuildJob part of the master job
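The relation between run jobs and the build job can be sketched as follows. This is only an illustration: the two classes are hypothetical stand-ins that mirror the fields printed above, not the actual GANGA schema, the helper `backend_ids` is invented here, and the second subjob id (2093002) is made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PandaBuildJob:
    """Hypothetical stand-in for the optional build (compile) job."""
    id: Optional[int] = None
    status: Optional[str] = None


@dataclass
class Panda:
    """Hypothetical stand-in for the backend record of one job."""
    site: str = "ANALY_BNL_ATLAS_1"
    id: Optional[int] = None
    status: Optional[str] = None
    actualCE: Optional[str] = None
    buildjob: Optional[PandaBuildJob] = None


def backend_ids(backend):
    """Return every PANDA id tracked by a backend, build job first."""
    ids = []
    if backend.buildjob is not None and backend.buildjob.id is not None:
        ids.append(backend.buildjob.id)
    if backend.id is not None:
        ids.append(backend.id)
    return ids


# Single job: the build id and the run id live in the same backend object.
single = Panda(id=2093001,
               buildjob=PandaBuildJob(id=2093000, status="finished"))

# Split job: only the master carries the build job, so the code is
# compiled once; each subjob carries just its own run id.
master = Panda(buildjob=PandaBuildJob(id=2093000, status="finished"))
subjobs = [Panda(id=2093001), Panda(id=2093002)]  # 2093002 is invented
```

Keeping the build job as an optional attribute means a backend without a compile step needs no special case: `buildjob` simply stays `None`.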
How to define a Panda Job
• Definition of the Job is part of the RunTimeHandler
• Upload tar file to WebDAV server
• Create a job definition that will be pickled and uploaded to the PANDA server
JobSpec
    jspec = JobSpec()
    jspec.jobDefinitionID = job.id
    jspec.jobName = commands.getoutput('uuidgen')
    jspec.AtlasRelease = 'Atlas-%s' % app.atlas_release
    jspec.homepackage = 'AnalysisTransforms'
    jspec.transformation = '%s/runAthena10' % Client.baseURLSUB
    if job.inputdata:
        jspec.prodDBlock = job.inputdata.dataset
    else:
        jspec.prodDBlock = 'NULL'
    jspec.destinationDBlock = self.outputdataset
    jspec.destinationSE = job.backend.site
    jspec.prodSourceLabel = 'user'
    jspec.assignedPriority = 1000
    jspec.computingSite = job.backend.site
FileSpec
    finp = FileSpec()
    finp.lfn = lfn
    finp.GUID = guid
    #finp.fsize =
    #finp.md5sum =
    finp.dataset = job.inputdata.dataset
    finp.prodDBlock = job.inputdata.dataset
    finp.dispatchDBlock = job.inputdata.dataset
    finp.type = 'input'
    finp.status = 'ready'   # data is already present; one can also wait for a file to arrive
    jspec.addFile(finp)
Similar for output and log file
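A rough sketch of how output and log entries might be attached and the whole definition pickled before upload. Plain dictionaries stand in for the JobSpec and FileSpec objects, so none of this is the real PANDA client API; the helper names, the dataset name, and the log file name are assumptions made for illustration.

```python
import pickle


def file_entry(lfn, dataset, file_type):
    """Dictionary stand-in for a FileSpec (illustrative fields only)."""
    return {"lfn": lfn, "dataset": dataset, "type": file_type}


def job_entry(name, site, output_dataset, output_lfn, log_lfn):
    """Dictionary stand-in for a JobSpec with one output and one log file."""
    return {
        "jobName": name,
        "computingSite": site,
        "destinationDBlock": output_dataset,
        "files": [
            file_entry(output_lfn, output_dataset, "output"),
            file_entry(log_lfn, output_dataset, "log"),
        ],
    }


job = job_entry(
    name="example-job",                        # hypothetical job name
    site="ANALY_BNL_ATLAS_1",
    output_dataset="user.DietrichLiko.ganga.25",   # assumed dataset name
    output_lfn="user.DietrichLiko.ganga.25.PIPPO._00001.root",
    log_lfn="user.DietrichLiko.ganga.25._00001.log.tgz",  # invented name
)

# Serialize the list of job definitions to bytes, as one would before upload.
payload = pickle.dumps([job])

# The receiving side can rebuild the same structure from the bytes.
restored = pickle.loads(payload)
```

The round trip only works for real JobSpec objects if the class definitions are importable on both the client and the server, which is presumably why the definitions are shared with PANDA.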
Issues
• Some Python issues
• Datetime and decimal required
  – At least Python 2.3
• outputdata
Outputdata
• pathena extracts the outputdata from the option file
  – Many possibilities
  – ntuple, hist, ESD, AOD, TAG, AANT, THIST, iRoot, EXT, Stream1
  – A fixed schema is used to overload the names
• Ntuple -> dataset name, ntuple name, jobid
  user.DietrichLiko.ganga.25.PIPPO._00001.root
• Manual extraction of these parameters turns out not to be an option
  – Last missing part of the handler
• Can only run in athena environment
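The fixed naming schema for ntuples could look roughly like the sketch below. The function name and the split into dataset name, stream name, and serial number are assumptions read off the single example file name on this slide; the actual pathena rule may place the job id differently.

```python
def overloaded_name(dataset, name, serial, extension="root"):
    """Compose an output file name as <dataset>.<name>._<serial>.<ext>.

    The serial number is zero-padded to five digits, matching the
    example file name shown on the slide.
    """
    return "%s.%s._%05d.%s" % (dataset, name, serial, extension)


# Reproduce the example from the slide (PIPPO is the ntuple name).
example = overloaded_name("user.DietrichLiko.ganga.25", "PIPPO", 1)
```

With a rule of this shape the handler only needs the stream names from the option file; everything else comes from the job itself.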
Immediate work
• Finish the last piece
• Testing
• Documentation
• Support for most pathena options
• Afterwards: Iteration with Johannes on further support