ATLAS Sites Jamboree
18-20 Jan 2017

Storage DT
• Reminder from Alessandra
• If storage goes in DT for >= 48h
  – analysis queues will be set to brokeroff 120h before the DT and
  – to offline 72h before,
  – while production queues will just be set offline 48h earlier.
• The policies are here.
  – https://twiki.cern.ch/twiki/bin/view/AtlasComputing/GridDataProcessing#Switcher
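The lead times above can be turned into a small schedule calculator. This is only a sketch: the function name and the tuple layout are made up for illustration, and it assumes that "48h earlier" means 48h before the start of the downtime.

```python
from datetime import datetime, timedelta

# Lead times quoted on the slide, in hours before the downtime (DT) starts.
ANALYSIS_BROKEROFF_H = 120
ANALYSIS_OFFLINE_H = 72
PRODUCTION_OFFLINE_H = 48   # assumption: "48h earlier" = 48h before DT start
LONG_DT_THRESHOLD_H = 48    # the policy applies to DTs of 48h or more


def queue_actions(dt_start, dt_hours):
    """Return a time-ordered list of (when, queue type, new status)."""
    if dt_hours < LONG_DT_THRESHOLD_H:
        return []  # shorter downtimes are not covered by the quoted policy
    actions = [
        (dt_start - timedelta(hours=ANALYSIS_BROKEROFF_H), "analysis", "brokeroff"),
        (dt_start - timedelta(hours=ANALYSIS_OFFLINE_H), "analysis", "offline"),
        (dt_start - timedelta(hours=PRODUCTION_OFFLINE_H), "production", "offline"),
    ]
    return sorted(actions)


# Example: a 72h storage downtime starting on 6 Feb 2017 at 08:00.
for when, queue, status in queue_actions(datetime(2017, 2, 6, 8, 0), 72):
    print(when.isoformat(), queue, status)
```

A site can use such a schedule to check when analysis brokering towards it is expected to stop before a long storage intervention.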

Mature ATLAS Distributed Computing
• Almost 3 weeks of "unattended" production over the Christmas break
• For the 1st time, the ATLAS production and data management system worked on its own
  – Only a short glitch when a new DBRelease was installed without the setup.sh file in cvmfs
  – No intervention was required from either central systems or from production managers
• No reprocessing was running
• But the derivation campaign is of the same order of complexity
• No serious issues with sites. Big Thanks!!

Resources for 2017

Resource    2016 pledges  2017 OLD  2017 Approved  2017 Pledges  Balance wrt pledge  Balance wrt 2016
T0 CPU               257       300            404           404                  0%               57%
T1 CPU               571       682            921           808                -12%               42%
T2 CPU               633       846           1125           928                -18%               47%
SUM CPU             1461      1828           2450          2140                -14%               46%
T0 DISK               17        20             25            25                  0%               18%
T1 DISK               52        57             68            69                  1%               33%
T2 DISK               68        78             83            78                 -6%               15%
SUM DISK             137       155            176           172                 -2%               26%
T0 TAPE               42        53             77            77                  0%               83%
T1 TAPE              119       173            188           174                 -7%               46%
SUM TAPE             161       226            265           251                 -5%               56%
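The balance columns look like plain relative differences. The sketch below assumes that formula (the slide does not state it, and not every row of the table reproduces exactly with the rounded numbers shown), and recomputes a few rows from the table.

```python
def balance(value, reference):
    """Relative difference of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference


# (2016 pledges, 2017 approved, 2017 pledges), copied from the table.
rows = {
    "T1 CPU": (571, 921, 808),
    "T2 CPU": (633, 1125, 928),
    "T1 TAPE": (119, 188, 174),
}

for name, (pledged_2016, approved_2017, pledged_2017) in rows.items():
    # Assumed meaning of the two columns: pledge vs approved request,
    # and 2017 pledge vs 2016 pledge.
    wrt_request = balance(pledged_2017, approved_2017)
    wrt_2016 = balance(pledged_2017, pledged_2016)
    print(f"{name}: {wrt_request:+.1f}% wrt approved, {wrt_2016:+.1f}% wrt 2016")
```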
• From Simone Campana's talk:
  – Flat budget foresees +20% CPU and +15% storage per year. We get much more than that
  – We are short by only 2% in storage and 14% in CPU. FAs did listen to us (we asked to invest in storage)
  – Some FAs that did not pledge more said they will provide more
  – What was "opportunistic" last year might have been pledged this year (so we might see fewer opportunistic resources)
  – Some agencies/sites invested in disk, some in CPU, as we asked one year ago already
  – I think the situation is rather good. Many thanks to all FAs for their effort

Summary of recommendations for sites - David Cameron
• Worker nodes
  – Memory - still 2GB/core
    • Don't kill on vmem
  – Scratch - 20GB/core or 100GB/8 cores
  – Network – 0.25MB/s/core
  – Software – HEP_OSlibs and cvmfs is all you need
• OS/Virtualisation/Containers
  – CentOS7 is still not recommended :(
    • ATLAS is ready but middleware (EMI WN) is not
  – Many (most?) sites using some virtualisation like Openstack
    • We don't see this so don't worry about it
  – Containers:
    • Singularity looks promising, useful to manage CentOS7 transition
• Batch system/CE
  – A modern batch system (SLURM, HTCondor) makes things easier for ATLAS
  – Requested fairshares
    • Analysis: T2: 25%, T1: 5%
    • Production: T2: 75%, T1: 95%
      – SCORE: 20%
      – MCORE: 80%
  – Dynamic SCORE/MCORE partitions are recommended
  – CEs
    • HTCondor-CE (US) and ARC CE (rest of world) are becoming standards
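The per-core figures and fairshares above translate directly into numbers for a given node or site. The sketch below uses hypothetical function names and assumes the SCORE/MCORE split applies inside the production share, as the nesting of the bullets suggests.

```python
# Per-core figures quoted in the worker node recommendations.
MEM_GB_PER_CORE = 2
SCRATCH_GB_PER_CORE = 20      # the slide also accepts 100GB for 8 cores
NET_MB_S_PER_CORE = 0.25

# Requested analysis share per tier; production gets the rest.
ANALYSIS_SHARE = {"T1": 0.05, "T2": 0.25}
SCORE_FRACTION = 0.20         # assumed to be a fraction of production
MCORE_FRACTION = 0.80         # assumed to be a fraction of production


def node_requirements(cores):
    """Minimum memory, scratch and network for a node with the given core count."""
    return {
        "memory_gb": cores * MEM_GB_PER_CORE,
        "scratch_gb": cores * SCRATCH_GB_PER_CORE,
        "network_mb_s": cores * NET_MB_S_PER_CORE,
    }


def slot_targets(total_cores, tier):
    """Split a site's cores into analysis, single-core and multi-core production."""
    analysis = total_cores * ANALYSIS_SHARE[tier]
    production = total_cores - analysis
    return {
        "analysis": analysis,
        "production_score": production * SCORE_FRACTION,
        "production_mcore": production * MCORE_FRACTION,
    }


print(node_requirements(8))
print(slot_targets(1000, "T2"))
```

These are targets for fairshare configuration, not static partitions; the slide recommends dynamic SCORE/MCORE partitioning.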

Summary of recommendations for sites - David Cameron (continued)
• DDM/Storage
  – Consolidation of small storage is encouraged
    • Use an alternative remote SE
    • Or become a cache site (xrootd/ARC cache)
  – Non-SRM disk SE is now possible
  – ATLAS still asks for GridFTP, HTTP and Xrootd to be supported
  – A standard space reporting method is evolving
  – Tapes
    • We will work on file sizes
• Networking
  – No significant changes in usage foreseen for the rest of Run-2
  – Usage is heavily influenced by job brokering strategy; recently this has improved a lot
  – MONARC has really gone, site "closeness" is based on actual measurements
  – For Run-3, increase will be required, 100Gbps is probably ok for Tier-1s
  – IPv6
    • Sites may provide IPv6-only worker nodes after 1st April 2017
    • All ATLAS services will be dual-stack by then
    • Full site IPv6-only: probably 4 years away at least
• Monitoring
  – Current dashboards will stay until new infrastructure is ready
    • But please try to use the new one before you are forced to
  – Heavy development and commissioning of new dashboards ongoing
    • New framework is very flexible (maybe too flexible)
    • Custom dashboards are useful but 1 per person is probably too much
    • ADC will work on official/validated dashboards
• Harvester/Event service
  – ATLAS wants to better use the sites
  – For this it needs more information
    • Info will be taken from the CE or pilot
    • This will be easier on the more modern CE/batch systems
  – One consequence should be fewer (visible) panda queues per site
    • Ideally one
  – Event service
    • Currently being commissioned on Grid sites
    • If you have preemptable queues participation is encouraged

IPv6 Status and Plans - Alastair Dewhurst
• Dual Stack Storage
  – All sites are encouraged to upgrade their storage to dual stack:
    • 10+ sites already upgraded.
    • [email protected].
  – Only CERN FTS is configured to allow FTS transfers via IPv6 currently.
    • BNL will upgrade early February 2017.
    • RAL will upgrade summer 2017
  – If a site upgrades to dual stack, contact DDM support so they can switch you to CERN FTS.
• IPv6 only CPU
  – From April 2017, sites can provide their CPU resources as IPv6 only.
  – If a site wants to do this, please get in contact in advance.
  – By 2018 we would hope this is transparent.
  – QMUL and Brunel have IPv6 only CPU already.

IPv6 Status and Plans - Alastair Dewhurst (continued)
• ATLAS status
  – Assuming the site has dual stack storage, WNs will talk to the following central nodes:
    • For Panda:
      – Production Panda servers: aipanda03[0-7].cern.ch
    • For Rucio
      – Auth nodes: rucio-auth-prod-0[1,2].cern.ch
      – Prod nodes through 3 HAproxy frontends: rucio-lb-prod-0[1-3].cern.ch
    • All use http(s).
  – Rucio
    • At the end of last year, the Rucio team migrated all nodes to CC7.
    • Enabled IPv6 at the same time.
    • All required nodes are now done!
  – Rucio UI web front (rucio-ui.cern.ch) also made dual stack.
  – Panda
    • Panda production nodes still IPv4 only.
    • aipanda007.cern.ch (dev node) is dual stack
      – Pilots running against Brunel/QMUL.
      – Debugging problems with pilot code.
    • Still believe we can meet the April 2017 deadline.
  – Other services (Frontier, APF, AGIS)
    • Plan to upgrade by 2018.
• IPv6 deployment plan will be updated in 2018.
• Allow sites to completely migrate to IPv6 by start of Run 3 (2021)?
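A site preparing IPv6-only worker nodes may want to check which of the central nodes listed above resolve over IPv6. The following sketch expands the bracketed host patterns from the slide and asks the local resolver for address families. The helper names are made up for illustration, the result depends on the local DNS setup, and the 2017 hostnames may no longer exist.

```python
import socket


def panda_servers():
    """aipanda03[0-7].cern.ch"""
    return [f"aipanda03{i}.cern.ch" for i in range(8)]


def rucio_auth_nodes():
    """rucio-auth-prod-0[1,2].cern.ch"""
    return [f"rucio-auth-prod-0{i}.cern.ch" for i in (1, 2)]


def rucio_frontends():
    """rucio-lb-prod-0[1-3].cern.ch"""
    return [f"rucio-lb-prod-0{i}.cern.ch" for i in range(1, 4)]


def address_families(host, port=443):
    """Return the set of IP families ('IPv4', 'IPv6') the resolver reports."""
    families = set()
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return families  # name did not resolve at all
    for family, *_ in infos:
        if family == socket.AF_INET:
            families.add("IPv4")
        elif family == socket.AF_INET6:
            families.add("IPv6")
    return families


def is_dual_stack(host):
    """True if the host resolves to both an IPv4 and an IPv6 address."""
    return address_families(host) == {"IPv4", "IPv6"}
```

Calling `is_dual_stack(h)` for each `h` in `panda_servers() + rucio_auth_nodes() + rucio_frontends()` from a worker node gives a quick view of which services that node could reach without IPv4.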

CentOS7 - Alessandra Forti
• CentOS7 native
  – Work on release 21 started after the summer 2016 Physics validation
  – Once C7 releases are available both platforms will be used until the end of Run2
  – C7 releases cannot be validated on SL6 nodes
  – SL6 and C7 resources cannot be mixed behind the same Panda Queues
• Current testing
  – need to create a new Panda Site for testing
  – add SL7 to the queue names to easily identify them
  – SW validation
  – HC tests
• Middleware status
  – OSG distributes CentOS7 middleware
  – EGI now has an rpm in the UMD testing repositories
  – MWREADY-135
  – UMD CentOS7 testing repository
  – YUM repo file
  – No need to go through recipes to get the rpms in place
  – It needs to be tested by sites
    • TRIUMF started to look into it
  – Tarball version of the UMD rpm available in CVMFS
    • /cvmfs/grid.cern.ch/centos7-wn-preview-v01

CentOS7 - Alessandra Forti (continued)
• Containers & Virtualisation
  – RAL is moving towards running services inside containers, controlled by Mesos.
    • Some services already migrated (FTS and Squid).
  – Batch Farm will be entirely migrated by April next year.
  – C7 machines will run "SL6 WN" inside containers.
  – ATLAS, CMS, LHCb and ALICE jobs work
• HOWTO migrate
  – It is recommended to keep the SL6 and C7 resources separated even now
  – Big bang upgrade: declare a downtime and come back with C7 worker nodes behind the same Panda Site and Panda Queues.
    • sw releases will have to be wiped and revalidated
  – SL5 releases will not be reinstalled
  – Rolling upgrade: declare an "at risk" DT, create a new Panda Site with new master + slave queues.
  – sw releases will automatically be validated
  – In either case the migration has to be communicated to ATLAS ([email protected])
  – Sites using SL6 containers or VMs don't need to announce
• Recommended?
  – Moving the WNs to CentOS7 is NOT yet recommended on EGI resources as the middleware is not in good shape yet.
    • Testing is appreciated and would speed up its release though
  – ATLAS SL6 applications running in compatibility mode have been validated and sites that have to move can move
  – If you have to move please don't set up the nodes without telling ATLAS.
  – https://twiki.cern.ch/twiki/bin/view/AtlasComputing/CentOS7Readiness
  – Ask Alessandra

Lightweight Sites - Cedric Serfon
• Solutions for lightweight sites
  – Different possibilities for lightweight sites:
    • Storage-less site
    • Having a local Storage Element is not a requirement to run production jobs:
      – Even easier now with new sitemover
      – 2 sites already in the process of migrating to storage-less: RO-14 and RO-16
      – More candidates would be good. DDM ops will help in decommissioning the endpoints
    • Site using cache (arc-cache or xrootd cache)
      – Arc cache
      – Rucio supports the concept of caches which are controlled outside itself and may not be consistent.
      – Arc CE cache can publish its content to Rucio through add/delete messages.
      – The cache service can create dumps of cache content, and a separate script runs periodically to calculate the differences and send messages to Rucio.
      – The cache RSEs are associated to the CE's PanDA queue and so PanDA can broker jobs to queues where the data is cached.
      – Arc cache performances
        » Queue using Arc cache running for many months in Durham
        » Efficiency of queues using Arc-cache and local storage very similar (arc cache queue even a bit better)
      – xrootd cache
        » Squid-like cache proxy on surface
          • Use disk to cache the data
          • Work around firewall
          • Easy to use: http proxy, in future root proxy
          • Different under the hood
          • For static large files
          • Multi-thread to handle data intensive load
          • Capable of both whole file caching and file block caching
          • Protocol to client: xroot and http
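The dump-and-diff step described for the Arc cache can be sketched as a set difference between two consecutive dumps. The message layout and the RSE name below are purely illustrative and are not the actual format exchanged between ARC and Rucio.

```python
def diff_cache_dumps(previous, current):
    """Compare two cache dumps (iterables of file names).

    Returns (added, deleted) as sorted lists.
    """
    previous, current = set(previous), set(current)
    return sorted(current - previous), sorted(previous - current)


def build_messages(previous, current, rse):
    """Turn the difference between two dumps into add/delete messages.

    The dictionary keys are hypothetical; a real sender would use whatever
    schema the receiving service expects.
    """
    added, deleted = diff_cache_dumps(previous, current)
    messages = [{"operation": "add", "rse": rse, "file": name} for name in added]
    messages += [{"operation": "delete", "rse": rse, "file": name} for name in deleted]
    return messages


old_dump = ["data/file_a.root", "data/file_b.root"]
new_dump = ["data/file_b.root", "data/file_c.root"]
for msg in build_messages(old_dump, new_dump, "EXAMPLE_ARC_CACHE"):
    print(msg)
```

Because only differences are sent, the catalogue view can lag behind the real cache content between two runs, which is why such caches "may not be consistent".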
    • Distributed Storage
      – A la Nordugrid. Works transparently for Nordic sites (dCache) for years
      – Never tested for DPM but should be doable on well connected sites:
        » One site runs the head node + some disk nodes
        » Other sites only run disk nodes
        » Probably needs some work on the firewall rules
      – If one wants to consolidate existing sites, it is non trivial to merge the DB from the different sites (remember LFC consolidation)
    • Federated Storage
      – WebDAV federation - DynaFed
      – This is not distributed storage, but a hosted service is providing the federation
      – WebDAV is now supported both in ROOT (TDavixFile) and at DDM level (Rucio+Metalink)
      – Provide single volatile RSE as "entry point" to the Federation
      – Federated storage systems must not be registered as RSEs separately
        » Double accounting of data
        » Unavoidable deletion/transfer race conditions
      – Allows the transparent use of Microsoft Azure and Amazon-style S3 cloud storages
      – Private keys under control of site, do not need to be published to ATLAS

Lightweight Sites - Cedric Serfon (continued)
• SRM-less storage
  – Upload/download in the pilot via rucio upload allows use of non-SRM protocols (gsiftp, xroot, WebDAV)
  – 3rd party transfer with gsiftp, xrootd validated. Still need to validate WebDAV
• Proposal
  – Very small sites (<100TB):
    • Keep the panda queues
    • Decommission the Storage element
    • Possibility to set up a cache (arc/xrootd)
  – Small sites (>100TB and <400TB):
    • Try to consolidate with one close site
    • Or go to federated storage
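The proposal amounts to a simple decision by storage size. A minimal sketch follows; the function name is hypothetical, and putting a site with exactly 100TB in the "small" class is an assumption, since the slide leaves that boundary open.

```python
def storage_recommendation(disk_tb):
    """Return (site class, list of proposed options) for a given disk size in TB."""
    if disk_tb < 100:
        return "very small", [
            "keep the panda queues",
            "decommission the storage element",
            "optionally set up a cache (arc/xrootd)",
        ]
    if disk_tb < 400:  # assumption: exactly 100TB is treated as "small"
        return "small", [
            "consolidate with one close site",
            "or go to federated storage",
        ]
    return "not covered by the proposal", []


for size in (60, 250, 900):
    site_class, options = storage_recommendation(size)
    print(size, "TB:", site_class, options)
```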

Recommendations for sites - Alessandra Forti
• Memory
  – from cgroups RSS
    • Smaps PSS: physical memory used by a job without double counting
    • cgroups RSS: physical memory used by the jobs without double counting
• What do batch systems do?
  – Batch systems without cgroups
    • See the same RSS as reported in smaps
    • Kill on vmem, which is NOT a physical memory measure
    • If you insist on this you need to set it to at least 3 times the RAM requested by the job
    • If you kill with the scheduler it is likely to have the same problem
  – Sites with cgroups
    • Can set up soft and hard limits on the values the job reports
    • Soft limit allows the kernel to decide if the job can keep on using the extra RAM or has to swap
    • Hard limit will kill the job based on RAM
    • Often set to 2 or 3 times the RAM requested by the job
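The multipliers above can be written down as a small helper. This is a sketch under assumptions: the function name and return layout are invented, the soft limit is taken to be the RAM the job requested, and the "2 or 3 times" figure is applied to the hard limit; the slide does not spell out either choice.

```python
def memory_limits_mb(requested_ram_mb, use_cgroups, hard_factor=3):
    """Suggest kill/limit thresholds in MB for a job requesting requested_ram_mb."""
    if not use_cgroups:
        # Killing on vmem is discouraged; if done anyway, use at least 3x RAM.
        return {"vmem_kill": 3 * requested_ram_mb}
    if hard_factor not in (2, 3):
        raise ValueError("the slide quotes a factor of 2 or 3")
    return {
        "soft": requested_ram_mb,                # assumption: soft limit = request
        "hard": hard_factor * requested_ram_mb,  # job is killed above this
    }


print(memory_limits_mb(2000, use_cgroups=False))
print(memory_limits_mb(2000, use_cgroups=True, hard_factor=2))
```

For an 8-core MCORE job requesting 8 x 2000 MB, the same helper gives a hard limit of 32000 or 48000 MB depending on the chosen factor.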
• Which batch system
  – ATLAS recommends sites move to a BS supporting cgroups and other more modern features.
    • HTCondor
    • SLURM
• CEs
  – ARC-CE
    • Most used at new sites and sites moving to HTCondor at EGI sites
    • Well integrated with SLURM
  – HTCondor CE
    • Most used in the US
    • If you have an HTCondor batch system it is just an additional layer of configuration
  – CREAM-CE
    • Most used in EGI for legacy reasons but ATM is best integrated with older batch systems like torque/maui and SGE
    • If you change batch system you may want to consider reviewing also your CE
• Shares at sites
  – Analysis: 25%
  – Production: 75%
    • SCORE: 20%
    • MCORE: 80%
  – However this share is not constant in time
    • Static partition setup is NOT recommended
    • Reminder that recipes for more dynamic approaches for 3 batch systems can be found at
    • https://twiki.cern.ch/twiki/bin/view/LCG/DeployMultiCore#Batch_system_related_information

Recommendations for sites - Alessandra Forti (continued)
• WN Hardware
  – About 20GB of disk scratch space
  – For an 8 core MCORE slot, ~80-100GB is sufficient
  – At least 2GB of (physical) RAM
    • Having 3-4GB would be beneficial
    • Enough swap space such that RAM+swap >= 4GB
  – As a rule of thumb, about 0.25Gbit/s of network bandwidth
    • Might want higher for more powerful CPUs.
• Storage
  – Change of storage topology
  – Bigger sites (T1 and T2) with satellites, independently of location
  – Evolution of sites towards caches or federations
  – Consolidate storage
    • 75% of storage at ~30 sites
    • Small sites <400TB discouraged from buying storage unless they can go above or aggregate with other sites
• Squid
  – Condition data and software are accessed through squid
  – Frontier & CVMFS
  – Sites are requested to install at least one squid
    • Two for redundancy and load balancing
  – Frontier squid or OS squid?
    • Frontier squid has some patches to boost performance. It is also a higher version with bug fixes.
    • OS squid is easier to maintain because it is there by default.
    • T2s can get away with the OS squid but ATLAS recommends using the Frontier version
  – Monitoring:
    • http://wlcg-squid-monitor.cern.ch/snmpstats/indexatlas.html
• Traceability
  – Glexec has been dropped
  – WLCG Traceability TF working on other tools and models
    • Singularity container solution being tested at CERN and in OSG
    • 1 single executable, doesn't need a daemon
    • Can isolate payload from pilot environment
    • Cannot do traceability; that will have to be done at VO level
      – ATLAS already does this
        » Site: time/(IP|host) -> VO
        » VO: time/host -> user+payload
    • Being integrated in slurm and HTCondor
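The two-step traceability model (the site maps time and host to a VO, the VO maps time and host to a user and payload) can be illustrated with two lookups. All records, host names and identifiers below are invented for the example; real sites and VOs keep this information in their own accounting and job databases.

```python
from datetime import datetime

# Hypothetical site-side records: (host, start, end, VO).
SITE_LOG = [
    ("wn042.example.org", datetime(2017, 1, 18, 9, 0),
     datetime(2017, 1, 18, 15, 0), "atlas"),
]

# Hypothetical VO-side records: (host, start, end, user, payload id).
VO_LOG = [
    ("wn042.example.org", datetime(2017, 1, 18, 9, 5),
     datetime(2017, 1, 18, 11, 0), "user_a", "job-1001"),
    ("wn042.example.org", datetime(2017, 1, 18, 11, 0),
     datetime(2017, 1, 18, 14, 30), "user_b", "job-1002"),
]


def site_lookup(log, host, when):
    """Site step: which VO was running on this host at this time?"""
    for rec_host, start, end, vo in log:
        if rec_host == host and start <= when < end:
            return vo
    return None


def vo_lookup(log, host, when):
    """VO step: which user and payload ran on this host at this time?"""
    for rec_host, start, end, user, payload in log:
        if rec_host == host and start <= when < end:
            return user, payload
    return None


incident = datetime(2017, 1, 18, 12, 0)
print(site_lookup(SITE_LOG, "wn042.example.org", incident))
print(vo_lookup(VO_LOG, "wn042.example.org", incident))
```

The site never needs to know the end user: it hands the time and host to the VO, which resolves them to a user and payload from its own records.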

Recap – AFS phaseout @CERN
• Driven by slow demise of upstream project
• No "hard" deadline but expect to finalize in LS2 (~4Q2018)
• Docs:
  • https://twiki.cern.ch/twiki/bin/view/IT/AfsPhaseout
  • https://twiki.cern.ch/twiki/bin/viewauth/AtlasComputing/AtlasAFSPhaseOut
• 2017 plans – EOS
  – Known problems:
  – Number of files per instance (AFS: 3.5GB) → new EOS namespace, ETA 3Q2017
  – "Small file" performance (create: untar, compile) → EOS FUSE rewrite, ETA 2Q2017
• Also:
  – Roll-out of "citrine" EOS branch
  – Upgrade to CC7
• Recap 2016 AFS phaseout - "easy"
  – Targets:
    • Software → CVMFS
    • Web sites → EOSWEB (setup, start)
    • FUSE access for PLUS & BATCH
  – Start on /project migration
  – https://its.cern.ch/jira/browse/NOAFS
• 2017 – external AFS disconnection test
  • 2017-02-15 09:00 CET; 24h; ITSSB entry
  • Goals: flush unknown AFS dependencies + create awareness
    – Experiment setup - still refers to CERN AFS?
    – Site setup:
      • prefer AFS over CVMFS?
      • /afs/cern.ch user home directories? Talk to us. Before.
    – User setup:
      • Prefer AFS over CVMFS, EOS?
      • (Nudge towards alternatives)
  • Will repeat, will eventually close completely (early if security issues)
• 2017 plans – AFS "harder stuff"
  • (continue with 2016)
    – Project migrations
    – Actually remove software from AFS (ATLAS problem: CMT nightlies)
    – Webspace migration (also for user pages) – ATLAS Website(s)?
  • /work:
    – 2Q2017: Stop self-service, create via tickets (justification? things not working on EOS)
    – 4Q2017: no longer create AFS workspaces (per-experiment decision), start migration
  • /user: groundwork for later $HOME
    – Split "UNIX" account from "AFS" account
    – Allow non-AFS home directories in LDAP
    – "HEPIX scripts" - ATLAS: take from CVMFS?