Clouds in High Energy Physics
Randall Sobie
University of Victoria
Randall Sobie IPP/Victoria
Overview
• Clouds are an integral part of our HEP computing infrastructure
– Primarily Infrastructure-as-a-Service (IaaS)
• Our use is wide-ranging and diverse
– CERN Agile Infrastructure
– Tier-1 computing at centres such as BNL, FNAL and RAL
– Tier-2 computing around the world
• Expanding use of HEP clouds, private clouds and commercial clouds
Motivation
A wide range of reasons for using clouds
• Ease management of existing infrastructure
• Separation of application and system administration
• Simplifies allocation of resources
• Leverage software development
• Opportunistic computing
• Non-HEP computing centres
• Commercial cloud resources
Types of cloud resources
[Figure labels: dedicated virtual cluster; opportunistic]
Cloud computing in HEP typically provides 5-20% of the processing of current projects
“Dedicated” clouds (owned by HEP)
“Opportunistic” clouds (private and commercial)
Cloud deployments
Traditional bare-metal
Static cloud (e.g. LTDA BaBar, HLT clouds)
Standalone/private cloud (e.g. PNNL, NorduGrid)
Bare-metal or in-house cloud with external cloud (e.g. CERN, BNL)
Distributed clouds (e.g. UK, Canada, Australia, INFN clouds)
D. Giordano WLCG Workshop 9/10/2016
OpenStack Clouds at CERN
In production:
• 4 clouds
• >230K cores
• >8,000 hypervisors
90% of CERN’s compute resources are now delivered on top of OpenStack
A further 42K cores to be installed in next few months
>90% of CERN compute resources are virtualized. Up to 42K cores to be installed in the next few months, subject to funding.
[Diagram labels: central manager (condor_collector, condor_negotiator); ARC/CREAM CEs (condor_schedd); worker nodes (condor_startd); virtual worker nodes (condor_startd); condor_rooster; offline machine ClassAds; draining]
Clouds at RAL
• 892 cores utilizing a Ceph storage backend.
• 3 alternating racks of CPU and storage nodes.
• Tier-1 services now running on cloud VMs.
• Engaging with various European cloud projects (e.g. DataCloud).
S3 and Swift storage
• Storing Docker images for container orchestration via Swift.
Batch work on the cloud
• For ~1.5 years the RAL HTCondor batch system has made opportunistic use of unused cloud resources.
• HTCondor rooster daemon used to provision VMs.
• Running jobs from all 4 LHC experiments and many non-LHC experiments.
OpenStack service under development. Available to LHC VOs next year.
FNAL/CMS AWS January/February 2016
BNL/ATLAS AWS September 2015
Tier-1 cloud bursting onto EC2
Special purpose clouds
BaBar Long Term Data Access (LTDA) System
Ability to preserve data and analysis capability for BaBar (stopped data taking in 2008)
High level trigger farms of the LHC experiments (large multi-10K core systems)
Virtual machines are booted during no-beam periods
Examples of Tier-2 cloud deployments
[Diagram labels: Research Cloud; CREAM CE; Dynamic Torque; TORQUE + Maui; control VMs; distribute jobs via SSH (currently 700 cores); 14,000 HEPSpec (~1400 cores); Australia-ATLAS Tier 2; Australian Belle II Grid Site (Belle II, LCG.Melbourne.au)]
Single CREAM CE services ATLAS Tier-2 (Torque) and Belle II site (Dynamic Torque)
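The capacity figure quoted for the Australian site (14,000 HEPSpec for roughly 1400 cores) implies a per-core rating of about 10 HEPSpec. A minimal sketch of that arithmetic follows; the function names are hypothetical and only the two numbers come from the slide.

```python
import math


def hs06_per_core(total_hs06: float, cores: int) -> float:
    """Average benchmark rating per core for a site."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_hs06 / cores


def cores_for_capacity(total_hs06: float, per_core: float) -> int:
    """Number of cores needed to reach a capacity at a given per-core rating."""
    if per_core <= 0:
        raise ValueError("per_core must be positive")
    return math.ceil(total_hs06 / per_core)


# Figures from the slide: 14,000 HEPSpec and about 1400 cores.
rating = hs06_per_core(14_000, 1400)
print(rating)
print(cores_for_capacity(14_000, rating))
```

The slide gives the core count as approximate, so the per-core value should be read as a rough average rather than a measured rating.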
SWITCHengines – Swiss NREN commercial cloud
Why private cloud?
• Chosen for flexibility, efficient use of compute resources for services
• Provides easy load-balancing and availability features
• Provides templating features
• Easy re-use of templates to test and instantiate new server instances
• Non-systems staff can provision their own instances of services
• Software Defined Networking is more malleable than physical networking, encourages better networking practices, including security
UK/GridPP
• Clouds at HEP institutions (Oxford/Imperial).
• ECDF cloud in Edinburgh has recently been made available to HEP
• UK Vacuum deployment
• Commercial cloud – DataCentred OpenStack cloud
Italy/INFN
• Private OpenStack cloud (Padova-Legnaro) called Cloud Area Padovana
• ~25 user groups/projects
• CMS production
PNNL/Washington
Private OpenStack cloud for Belle II project (KEK) and other local users
Canada
Distributed cloud system for ATLAS and Belle II
• 10-15 clouds
• HTCondor/CloudScheduler
• 4000-5000 cores
[Diagram labels: CloudScheduler; HTCondor; job; VM; compute cloud; VM image repository; arrows: submit user script, start VMs, starts job]
[Plot: ATLAS jobs on cloud for the CA-system, 10 clouds, 4300 cores]
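The diagram labels for the Canadian system (submit user script, start VMs, starts job) suggest a loop in which queued jobs drive VM provisioning across several clouds. The following is a toy sketch of one such planning step, not the actual CloudScheduler code: the `Cloud` class, `plan_vm_starts`, the cloud names, and the first-fit policy are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Cloud:
    """Hypothetical view of one cloud: a VM quota and the VMs already running."""
    name: str
    max_vms: int
    running_vms: int = 0

    def free_slots(self) -> int:
        return max(self.max_vms - self.running_vms, 0)


def plan_vm_starts(idle_jobs: int, clouds: List[Cloud],
                   cores_per_vm: int = 1) -> Dict[str, int]:
    """Decide how many VMs to start on each cloud to cover the idle jobs.

    Clouds are filled in list order (first fit); each VM is assumed to run
    one single-core job per core.
    """
    plan: Dict[str, int] = {}
    remaining = idle_jobs
    for cloud in clouds:
        if remaining <= 0:
            break
        vms_needed = -(-remaining // cores_per_vm)  # ceiling division
        to_start = min(vms_needed, cloud.free_slots())
        if to_start > 0:
            plan[cloud.name] = to_start
            remaining -= to_start * cores_per_vm
    return plan


clouds = [Cloud("cloud-a", max_vms=4, running_vms=3), Cloud("cloud-b", max_vms=10)]
print(plan_vm_starts(5, clouds))
```

In a real deployment the idle-job count would come from the batch queue and the VM requests would go to each cloud's API; both are left out here so the planning step can be read on its own.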
Job scheduling/VM provisioning
• Variety of methods for running HEP workloads on clouds
– VM-DIRAC (LHCb and Belle II)
– VAC/Vcycle (UK)
– HTCondor/CloudScheduler (Canada)
– HTC/GlideinWMS (FNAL), HTC/VM (PNNL), HTC/APF (BNL)
– Dynamic-Torque (Australia)
– Cloud Area Padovana (INFN)
– ARC (NorduGrid)
• Each method has its own merits and often was designed to integrate clouds into an existing infrastructure (e.g. local, WLCG and experiment)
Commercial clouds
• Amazon EC2 and Microsoft Azure
– Short-term multi-10K tests
– Long-term 1K-scale production
• GCE evaluation but no production
• Other commercial OpenStack clouds
– DataCentred (UK), SWITCHengines (Switzerland)
• CERN commercial cloud procurement
Network connectivity
• Amazon and Microsoft clouds are connected to the research networks in North America (probably GCE as well)
• Trans-border or trans-ocean traffic can be an issue
– Has become an important discussion topic in the LHCONE meetings
• Private opportunistic clouds
– Traffic flows over the research network but not the LHCONE network
CPU benchmarks
New suite of “fast” benchmarks
– HEPiX Benchmark Working Group
– Suite available includes “fastHS” (LHCb) and Whetstone benchmarks
• Write to ElasticSearch DB
– Run benchmarks in the pilot job or during the boot of the VM
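As an illustration of running a quick benchmark at VM boot or in a pilot job and recording the result, the sketch below times a short floating-point loop and builds a JSON document of the kind that could be indexed in a search database. It is not the HEPiX suite: the loop, the field names, and the `toy-loop` label are assumptions, and the upload step is deliberately omitted.

```python
import json
import platform
import time


def fast_benchmark(iterations: int = 200_000) -> float:
    """Time a fixed floating-point loop and return iterations per second."""
    start = time.perf_counter()
    x = 0.0
    for i in range(1, iterations + 1):
        x += (i % 7) * 0.5 / i
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    return iterations / elapsed


def benchmark_document(cloud: str, vm_id: str, score: float) -> str:
    """Serialize one benchmark result as a JSON document (hypothetical schema)."""
    doc = {
        "cloud": cloud,
        "vm_id": vm_id,
        "benchmark": "toy-loop",
        "score": round(score, 1),
        "machine": platform.machine(),
        "timestamp": int(time.time()),
    }
    return json.dumps(doc, sort_keys=True)


print(benchmark_document("cloud-a", "vm-0001", fast_benchmark()))
```

Keeping the measurement short matters here because it runs on every VM boot or pilot start, which is the motivation the slide gives for a suite of fast benchmarks.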
Data storage
– Data written to local storage on node and then transferred to selected SE
– UK group has done some work integrating their object store with ATLAS
– BNL using S3 storage on EC2 for T2-SE
Monitoring
Cloud system monitor: Sensu, Munin, RabbitMQ, MongoDB
Application monitor: Panda
Application
Benchmarks and accounting: ElasticSearch DB
Cloud or site monitor
Summary
• Clouds in HEP
– Growing, diverse use of clouds
– Typically integrated into an existing infrastructure
– Seen as a way to better manage multi-user resources
• Opportunistic research clouds
– Easy way to utilize clouds at non-HEP research computing facilities
– No requirement for on-site application specialists or complex software
• Commercial clouds
– EC2/Azure as well as other OpenStack clouds
– Trans-border network connectivity challenges