Enabling Grids for E-sciencE
www.eu-egee.org
Report for Job Management meeting
Alessandro Maraschini alessandro.maraschini@elsagdatamat
All Hands, NIKHEF (Amsterdam)
2008, February 20-22
AH, Amsterdam, 2008, Feb 20th 2
Summary
• Introduction
• Job Cancellation Management
• Wider LBProxy exploitation
• LB and CREAM Integration
• Purging the LBProxy DB
• AOB
Introduction
• January 17th/18th, Rome (Elsag Datamat)
• http://trinity.datamat.it/projects/EGEE/Meeting/index.htm
• Improve (technical) communication between WMS and LB people
• Mainly brainstorming activity
  – Not possible at AH: too technical
  – Not possible via mail: needs interactivity
Job Cancellation Management
• Problem: it is sometimes difficult to know which component is responsible for cancellation
  – It depends on “where” the job is at the moment
  – May be affected by race conditions
• Check the job state machine for possible race-condition issues:
  – E.g. job cancellation request while registering/submitting
  – Possible solutions:
    – Lock on each job's information
    – Return code (backward compatible) from LB logging calls
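The two possible solutions can be illustrated together with a minimal sketch. It is not the WMS or LB API: `JobRegistry`, the state names and the return-code values are all hypothetical. Each job gets its own lock, and the logging call returns a code that callers are free to ignore, which is what would keep it backward compatible.

```python
import threading
from collections import defaultdict


class JobRegistry:
    """Toy stand-in for job bookkeeping (hypothetical, not the LB API)."""

    def __init__(self):
        self._states = {}
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, job_id):
        with self._guard:
            return self._locks[job_id]

    def log_event(self, job_id, event):
        """Apply an event under the job's own lock and return a code.

        0 = accepted, 1 = rejected (job already cancelled), 2 = unknown event.
        Callers that ignore the return value behave as before.
        """
        with self._lock_for(job_id):
            state = self._states.get(job_id, "NEW")
            if state == "CANCELLED":
                return 1
            if event == "cancel":
                self._states[job_id] = "CANCELLED"
            elif event == "register":
                self._states[job_id] = "REGISTERED"
            elif event == "submit":
                self._states[job_id] = "SUBMITTED"
            else:
                return 2
            return 0

    def state(self, job_id):
        with self._lock_for(job_id):
            return self._states.get(job_id, "NEW")
```

With this shape, a submit that arrives after a cancel is rejected with a non-zero code instead of silently overwriting the cancelled state.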
Wider LBProxy exploitation
• Scope: reducing the persistent information stored by WMS components during their lifecycle by logging to LBProxy
  – 100% secure: a logging call corresponds to an actual DB update
  – No significant latency: comparable to file writing
• Filelist / Jobdir approach:
  – Used by: WMProxy & WM
    – Job management requests
    – E.g.: submission, list-match, cancel, …
  – Used by: JC & LM
    – Active jobs (non-terminated)
    – Timers (manage Condor events)
  – Proposal:
    – First phase: storing Filelist/Jobdir info ALSO to LBProxy for recovery operations
    – Further investigation on possible race conditions will be needed
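A rough sketch of the first-phase proposal, under stated assumptions: `FakeLBProxy` is an in-memory stand-in (the real LBProxy logging call is not shown in these slides), and the file naming and job identifiers are invented for illustration. Each request is written to the jobdir as before and ALSO logged to the proxy, so that lost jobdir entries can be rebuilt.

```python
import json
import os
import tempfile


class FakeLBProxy:
    """In-memory stand-in for LBProxy (hypothetical interface)."""

    def __init__(self):
        self.events = []

    def log(self, job_id, request):
        self.events.append({"job_id": job_id, "request": request})


def _entry_path(jobdir, job_id, request):
    # Hypothetical naming scheme, chosen only for this sketch.
    return os.path.join(jobdir, f"{job_id}.{request}.json")


def enqueue(jobdir, proxy, job_id, request):
    """Write the request to the jobdir AND log it to the proxy."""
    path = _entry_path(jobdir, job_id, request)
    with open(path, "w") as fh:
        json.dump({"job_id": job_id, "request": request}, fh)
    proxy.log(job_id, request)
    return path


def recover(jobdir, proxy):
    """Re-create jobdir entries that were logged but are missing on disk."""
    restored = []
    for event in proxy.events:
        path = _entry_path(jobdir, event["job_id"], event["request"])
        if not os.path.exists(path):
            with open(path, "w") as fh:
                json.dump(event, fh)
            restored.append(path)
    return restored
```

The sketch does not track which requests were already consumed, so a real recovery would need extra bookkeeping; that is one place where the race conditions mentioned above could show up.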
Wider LBProxy exploitation (2)
• Berkeley DB approach, used by ICE
  – Storage of job cache information
  – E.g. active/non-terminated jobs
  – Could be switched to LBProxy logging, but:
    • System currently stable
    • Stress tests survived
    • Manpower needed
LB and CREAM Integration
• Objectives
  – Access the job state uniformly
  – Provide logging-info, L&B notifications
  – Export to JP
  – Help job traceability
• Proposal
  – Install LBProxy on the CREAM node
• Tasks
  – CREAM events and job state machine
    – Definition & implementation
  – LB producer library
    – Preferred language: Java
  – Extension of LBProxy
    – i.e. JobId mapping, (possibly) new events
Purging and LBProxy DB
• Current LBProxy policy
  – Purge the job as soon as it is cleared
    – Cleared events logged by purger activity
• Proposal
  – Purge inactive jobs
    – Query LBProxy for forgotten jobs
      • “Forgotten” criteria have to be carefully evaluated
        o What status, and for how long
        o Timeout needed
  – Ensemble “purgers”
    – Various purger daemons on WMS can be embedded in a single component
      • Wms.purger, Proxy-cache-purger, JC/LM purgers, …
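One possible reading of the “forgotten” criteria (what status, for how long) is a per-status timeout. The sketch below assumes exactly that; the status names, the timeout values and the `JobRecord` type are placeholders, since the actual criteria are what still has to be evaluated.

```python
from dataclasses import dataclass

# Hypothetical per-status timeouts, in seconds. Statuses not listed here
# are never treated as forgotten.
FORGOTTEN_AFTER = {
    "SUBMITTED": 7 * 86400,
    "WAITING": 7 * 86400,
    "DONE": 14 * 86400,
}


@dataclass
class JobRecord:
    job_id: str
    status: str
    last_update: float  # epoch seconds of the last logged event


def forgotten_jobs(records, now, timeouts=FORGOTTEN_AFTER):
    """Return the ids of jobs that stayed in one status beyond its timeout."""
    forgotten = []
    for rec in records:
        limit = timeouts.get(rec.status)
        if limit is not None and now - rec.last_update > limit:
            forgotten.append(rec.job_id)
    return forgotten
```

A single purger component could run such a query periodically and hand the resulting ids to the existing purge logic.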
AOB
• User interface can query JP
  – Information on purged jobs can still be retrieved by a further query on JP
  – Implementation: simple WS interface call performed upon “no results” returned by LB
• Job status notification
  – Users want job output files
  – They keep asking LB for the status until “DONE”
  – Worst approach: a performance killer
  – On output request, the client may subscribe for status notification on the LB server
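Both AOB ideas can be sketched in a few lines. Everything here is illustrative: LB and JP are modelled as plain dictionaries, and `NotificationHub` is a toy stand-in for the notification side of the LB server, not its real interface.

```python
class NotificationHub:
    """Toy stand-in for LB status notifications (hypothetical interface)."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, job_id, callback):
        self._subscribers.setdefault(job_id, []).append(callback)

    def publish(self, job_id, status):
        # Push the new status to every subscriber instead of being polled.
        for callback in self._subscribers.get(job_id, []):
            callback(job_id, status)


def job_status(job_id, lb, jp):
    """Ask LB first; on "no results" fall back to a further query on JP."""
    status = lb.get(job_id)
    if status is None:
        status = jp.get(job_id)
    return status
```

A client that wants the job output would subscribe once and fetch the files when the callback reports “DONE”, rather than querying LB in a loop.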
Next Meeting
• As soon as we have collected a reasonable number of issues to be discussed
  – Where? When? May be agreed in Amsterdam
  – Probably Bologna/Milan/CERN around May/June 2008