WLCG OpsCoord F2F @ CERN, 11th February 2014
Status of Task Forces
Josep Flix (PIC/CIEMAT)
On behalf of the WLCG Operations Coordination Team
WLCG Operations Coordination F2F – CERN [11th February 2014]
Ongoing Task Forces
Today’s detailed talks
‣ I am providing here a brief summary for some other TFs
gLExec deployment
‣ Multi-user pilot jobs should use gLExec to change user identity. The TF aims to coordinate the deployment of gLExec without interfering with current experiment workflows
‣ Each site has its gLExec infrastructure regularly tested through SAM tests (at some point to become critical)
‣ # of closed tickets is 75
‣ # of open tickets is 20
‣ A serious bug in gLExec has been discovered: https://twiki.cern.ch/twiki/bin/view/LCG/GlexecDeployment#Known_issues
‣ CMS will make the gLExec SAM test critical soon
https://twiki.cern.ch/twiki/bin/view/LCG/GlexecDeploymentTracking
perfSONAR deployment
‣ Goal is to encourage all WLCG sites to deploy, configure and register perfSONAR-PS instances gathering network metrics on the network paths for all of the WLCG sites
‣ A new release (3.3.2) is available:
  ‣ sites should upgrade. The procedure is straightforward and requires no re-configuration
‣ A campaign to get remaining sites installed and out-of-date installations upgraded is still ongoing (tickets)
‣ Several sites are behind on this deployment:
  ‣ WLCG OpsCoord is raising the issue with the WLCG management and experiments [I. Bird @ LHCONE WS]
https://twiki.cern.ch/twiki/bin/view/LCG/PerfsonarDeployment
Tracking Tools Evolution I
‣ Developers, deployers, experts of GGUS, SNOW, Savannah, JIRA and the experiments discuss development options for each tool and interfaces between them, when required
‣ GGUS releases:
  ‣ Last release done on 29th Jan. 2014
  ‣ Includes several minor bug fixes, and new WLCG Monitoring SU
  ‣ Next release: 26th of February
  ‣ Prototype of multiple site notification expected before the end of the month in a test instance (hopefully in Prod. by March)
Tracking Tools Evolution II
‣ Developers, deployers, experts of GGUS, SNOW, Savannah, JIRA and the experiments discuss development options for each tool and interfaces between them, when required
‣ Savannah to JIRA migration:
  ‣ Very slow progress in this area
  ‣ Main issue for the 'GGUS Shopping list' tracker (cross-references between tickets) still not solved after more than one year
  ‣ Other trackers do not depend on this functionality, so it might be the moment to accept that these references will be lost during the migration
https://twiki.cern.ch/twiki/bin/view/LCG/TrackingToolsEvolution
XrootD deployment
‣ The aim of this task force is to help the deployment at the WLCG sites of the Xrootd federated data storage for the FAX (ATLAS) and AAA (CMS) projects.
‣ Campaign for publishing xrootd endpoints in GOC/OIM is about to start (tickets!!)
‣ this will ease the operations and monitoring effort
https://twiki.cern.ch/twiki/bin/view/LCG/XrootdDeployment
SHA-2 Migration
‣ How services used by WLCG VOs (ALICE, ATLAS, CMS, DTEAM, LHCb, ops) can be tested for SHA-2 readiness
‣ The EOS SRM for LHCb is not OK yet
  ‣ patch needed to support the "root" protocol expected by LHCb jobs
‣ voms-proxy-init on lxplus crashes when creating SHA-2 RFC proxies (discovered by CMS)
  ‣ works OK with Java-based version provided by voms-clients3
‣ VOMRS:
  ‣ VOMS-Admin test cluster will soon be available
  ‣ host certs of future VOMS service from new SHA-2 CERN CA
  ‣ campaign to get the new servers recognized in LSC files across the Grid (also provide such files in rpms)
https://twiki.cern.ch/twiki/bin/view/LCG/SHA2readinessTesting
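As a small illustration of one check that can accompany such a migration, the sketch below inspects which digest a certificate was signed with. It is not the TF's test procedure (that is about whether services accept SHA-2 credentials); it only parses the text printed by `openssl x509 -noout -text`, assumed to be passed in as a string. The function names and the list of digest names are illustrative assumptions, and certificates whose digest is not part of the algorithm name (e.g. RSASSA-PSS) are reported as not SHA-2 by this simplification.

```python
import re

# Digest names treated as SHA-2 in this sketch (assumption: the digest
# appears inside the algorithm name, e.g. "sha256WithRSAEncryption").
SHA2_DIGESTS = ("sha224", "sha256", "sha384", "sha512")


def signature_algorithms(openssl_text):
    """Return every 'Signature Algorithm' value found in the openssl text dump."""
    return re.findall(r"Signature Algorithm:\s*(\S+)", openssl_text)


def uses_sha2(openssl_text):
    """True if all signature algorithms listed in the dump name a SHA-2 digest."""
    algorithms = signature_algorithms(openssl_text)
    if not algorithms:
        raise ValueError("no 'Signature Algorithm' line found")
    return all(
        any(digest in name.lower() for digest in SHA2_DIGESTS)
        for name in algorithms
    )
```

The text to analyse could come, for example, from running `openssl x509 -in hostcert.pem -noout -text` on a host certificate; the file name here is hypothetical.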
Machine/Job Features
‣ Machine/Job features provide information from a resource provider (batch system, IaaS) to the payload:
  ‣ static (e.g. power of the machine, number of cores, local scratch space)
  ‣ dynamic (e.g. shutdown time of a VM)
‣ Current prototype at CERN lxbatch (bare metal / vWNs):
  ‣ received feedback from ALICE, who were testing mjf on the CERN batch nodes (waiting for feedback from ATLAS/CMS)
‣ For cloud-like installations the TF has decided to look into alternative ways of communicating the features:
  ‣ investigating NoSQL key/value stores as a viable alternative. A test instance has been set up and is being validated right now
https://twiki.cern.ch/twiki/bin/view/LCG/MachineJobFeatures
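To make the idea concrete, here is a minimal sketch of how a payload might read such features. It assumes a directory-of-files convention: an environment variable (here `MACHINEFEATURES`) names a directory holding one small file per key. The key names `shutdowntime` and `jobslots`, and the interpretation of `shutdowntime` as a Unix timestamp, are assumptions for illustration and may not match the prototype described on this slide.

```python
import os
from pathlib import Path


def read_features(env_var, environ=None):
    """Read all key files from the directory named by env_var.

    Returns a dict mapping file name to stripped file content, or an
    empty dict if the variable is not set (no features published).
    """
    environ = os.environ if environ is None else environ
    directory = environ.get(env_var)
    if not directory:
        return {}
    features = {}
    for entry in Path(directory).iterdir():
        if entry.is_file():
            features[entry.name] = entry.read_text().strip()
    return features


def seconds_until_shutdown(features, now):
    """Seconds left before the announced shutdown, or None if not published.

    Assumes the hypothetical 'shutdowntime' key holds a Unix timestamp.
    """
    value = features.get("shutdowntime")
    if value is None:
        return None
    return int(value) - int(now)
```

A payload could call `read_features("MACHINEFEATURES")` at start-up and use `seconds_until_shutdown(features, time.time())` to decide whether another unit of work still fits before the resource goes away.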
IPv6 validation and deployment
‣ The imminent exhaustion of the IPv4 address space will eventually require migrating the WLCG services to an IPv6 infrastructure. The TF works in close relation with the HEPIX IPv6 Working Group
‣ Agreed at the last F2F that it would be beneficial to progress with volunteering sites moving to dual stack
  ‣ trying to understand how to make sure the instability this would cause does not have a negative impact on the site (?)
‣ A document for the MB is being prepared, covering also the case of the MW readiness WG
https://twiki.cern.ch/twiki/bin/view/LCG/WlcgIpv6
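As a rough illustration of what "dual stack" means from the client side, the sketch below classifies a host by the address families its name resolves to, using the standard `socket.getaddrinfo` call. The function names and the default port are assumptions for this example. Resolving to both families only shows that both kinds of address are published; it does not prove the service is reachable or working over IPv6.

```python
import socket


def classify(addrinfo):
    """Classify getaddrinfo-style results by the IP families they contain."""
    families = {
        entry[0]
        for entry in addrinfo
        if entry[0] in (socket.AF_INET, socket.AF_INET6)
    }
    if families == {socket.AF_INET, socket.AF_INET6}:
        return "dual-stack"
    if families == {socket.AF_INET6}:
        return "ipv6-only"
    if families == {socket.AF_INET}:
        return "ipv4-only"
    return "unresolved"


def classify_host(host, port=443):
    """Resolve host and classify it; name resolution failures count as unresolved."""
    try:
        info = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "unresolved"
    return classify(info)
```

Keeping `classify` separate from the DNS lookup allows the logic to be checked with fixed data, without depending on the network of the machine running it.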
Conclusions
‣ All of the TFs are progressing well!
‣ Sites and experiments are encouraged to actively participate in the discussions and the TFs!
‣ WLCG Operations coordination twiki: https://twiki.cern.ch/twiki/bin/view/LCG/WLCGOpsCoordination
‣ Mailing list: [email protected]