Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 1/120
adopting cloud technologies
Stefano Bagnasco |INFN Torino
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 2/120
outline
● Leftover: a more realistic IaaS deployment
● Some general musings ● Toolkits & projects ● A simple virtual farm ● Bringing elasticity to the farm ● Flexibility and user-friendliness
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 3/120
a more complex deployment 1
.
Services
Workers
VMs providing critical services: ● Run on a cluster of server-class
redundant hypervisors ● Public & private IP ● Shared system disks on resilient storage
allowing live migration (Services need to run continuously)
VMs providing computing workforce: ● Run on a cluster of compute-optimized,
less expensive hypervisors ● Locally cached image repository for fast
startup (Workers are often reallocated) ● Access to fast storage for data ● Private network only
🐮
🐶
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 4/120
a more complex deployment 2
Storage Servers
Image Repository Datastore
Services Cluster Workers Cluster
● Image Repository Datastore holds the OS images
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 5/120
a more complex deployment 2
Storage Servers
Image Repository Datastore
Gluster Replicated Volume
Services Cluster Workers Cluster
Shared Datastore for running VMs
● Image Repository Datastore holds the OS images ● Services System Datastore is shared to allow live migration
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 6/120
a more complex deployment 2
Storage Servers
Image Repository Datastore
Gluster Replicated Volume
Services Cluster Workers Cluster Cache for imagerepo Datastore
Shared Datastore for running VMs
● Image Repository Datastore holds the OS images ● Services System Datastore is shared to allow live migration ● Workers System Datastore is local to the hypervisors to increase�
I/O capacity. Images repository is locally cached on each hypervisor to reduce startup time
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 7/120
a more complex deployment 2
Storage Servers
Image Repository Datastore
Gluster Replicated Volume
Services Cluster Workers Cluster Cache for imagerepo Datastore
Shared Datastore for running VMs
● Image Repository Datastore holds the OS images ● Services System Datastore is shared to allow live migration ● Workers System Datastore is local to the hypervisors to increase�
I/O capacity. Images repository is locally cached on each hypervisor to reduce startup time
● Persistent Space Datastore is mounted on the relevant hypervisors using the iSCSI Transfer Manager to provide persistent storage to Virtual Farms
iSCSI Datastore
🐷
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 8/120
cloud computing
Cloud Computing technologies promise to help the scientific computing community!
What the cloud is: ● A technology to ease resource management,
provisioning and sharing ● An industrial standard technology
What the cloud is not: ● A magical “resource multiplier” ● A complete end-to-end scientific computing system
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 9/120
why scientific cloud computing?
● Cloud computing is becoming (one of) the industry standard ways to provision computational power ■ We’re not any more the only ones working with Big Data, so
industry is developing tools
● The technology decouples the infrastructure from the application ■ Scientists will not need any more to worry (much) about
infrastructure management
● This is a general trend! ■ 14% of contribution titles at CHEP2015 included the word
“cloud”
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 10/120
development line 1
● Adapting scientific workloads to public clouds ■ Several activities ongoing, specially in large
experiments ■ See e.g. CHEP2013 (Amsterdam) and CHEP2015
(Okinawa) conferences for a number of reports ■ https://indico.cern.ch/event/214784/ ■ https://indico.cern.ch/event/304944/
● Main issues ■ Data access ■ Prices
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 11/120
development line 2
● Creating dedicated local and federated cloud infrastructures to cater to the scientific community’s computing needs ■ Min bias: using the IaaS model to manage a Computer
Centre • Small scale example: INFN-Torino • Large scale example: CERN Agile Infrastructure
■ Then: adapting experiment’s Computing Models to exploit Cloud technologies
• See e.g. Xiaomei’s talk tomorrow • Several more in CHEP2013 and CHEP2015 proceedings
■ Next: Building a large scale federated cloud infrastructure
• Now to complement, eventually to replace the existing Grid infrastructure
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 12/120
a few projects
● A “Grid of Clouds”: the EGI Federated Cloud
● Readily usable toolkit: the CernVM Ecosystem
● An EU-funded development project: INDIGO-DataCloud
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 13/120
the egi federated cloud
● A seamless grid of academic private clouds and virtualised resources, built around open standards and focusing on the requirements of the scientific community. ■ https://www.egi.eu/infrastructure/cloud/
● Features both a federated infastructure and an Application Marketplace ■ https://appdb.egi.eu ■ Holds, manages and populates “Virtual Appliances”
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 14/120
the egi federated cloud
3 28/05/15
EGI Cloud Infrastructure
EGI$Core$Pla,orm$
Federated'AAI'Service'
Registry'Monitoring' Accoun6ng'
EGI$Cloud$Infrastructure$Pla,orm$
VM'Management'
Informa6on'
Discovery'
Storage'Management'
Help$and$Support$
Security$Co:ordina;on$
Training$and$Outreach'
EGI$C
ollabo
ra;o
n$To
ols$
EGI'Applica6on'
DB'
Image'
Repository
'EGI'Cloud'Service'M
arketplace
'
Sustainable$Business$Models'
User'Communi6es'
Monitoring$and$control$of$u;lisa;on$
Technical$Consultancy$and$Support$
Uniform$interfaces$to$Cloud$Compute$and$Storage$
Cloud'Management'Stacks'(OpenStack,'OpenNebula,'Synnefo,'…)'
Cloud'Management'Stacks'(OpenStack,'OpenNebula,'Synnefo,'…)'Resource$Provider$Infrastructure$Resource$Provider$Infrastructure$
Secure'endorsed'Applica6on'and'
Service'Deploym
ent'
EGI Conference 2015
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 15/120
the egi federated cloud
4 28/05/15
On demand compute to run any kind of workloads on virtual machines
VM Management
EGI Conference 2015
VM' VM'
Site'A'
VM' VM'
Site'B'
• OCCI as standard interface – FedCloud contextualization
extension – FedCloud profile extension
• Clients – ruby and java clients – OCCI connectors in brokers
• Servers – rOCCI-server – OCCI-OS – snf-occi
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 16/120
the egi federated cloud
5 28/05/15
Automatic and secure distribution of endorsed VM images for Virtual Organisations
VM Image Management
EGI Conference 2015
Site'B'
VM'Image'
Site'A'
VM'Image' Site'C'
VM'Image'
EGI$AppDB$VM'
Image'
• Web based front-end in AppDB
• OVF for packaging images • HEPiX lists for distribution
– Endorsed by VO managers – Signed metadata – RP subscribe and download
• EGI endorses basic OS images
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 17/120
a higher level of abstraction
15 28/05/15
High Level Tools (PaaS & SaaS)
• Extend the IaaS capabilities of the EGI cloud • ‘Alternatives’ of the OCCI client and API • More than OCCI
• External contributions (! support many other clouds too) • New developments expected (e.g. INDIGO-Datacloud) • SaaS'
• Iden6ty'Federa6on'
Catania'Science'Gateway'
• PaaS'for'automa6ng'deployments'
• Helix'Nebula'Slipstream'
• Programming'framework'for'autoUparallelisa6on'COMPSs'
• Abstrac6on'on'top'of'various'HPC/HTC/cloud'VMDIRAC'
• Workflow'development'and'enactment'WSUPGRADE'
• VM'lifecycle'manager'Vcycle'
EGI Conference 2015
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 18/120
Persistent storage
Services
Workers
Virtual Router cloud-gw-216.to.infn.it 193.205.66.216
172.16.261.254
Virtual private network
LAN
iSCS
I
NFS
VirtualFarm001 172.16.261.1
VirtualFarm002 172.16.261.2
VirtualFarm003 172.16.261.3
Configurable port forwarding
virtual farm provisioning model
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 19/120
172.16.261.3
Services
Workers
Virtual Router cloud-gw-216.to.infn.it 193.205.66.216
172.16.261.254
Virtual private network
LAN
VirtualFarm001 172.16.261.1
VirtualFarm002 172.16.261.2
VirtualFarm003 172.16.261.3
virtual farm provisioning model
Automatic elasticity through elastiq
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 20/120
the VAF
[email protected] - A grounds-up approach to High-Throughput Cloud Computing in High-Energy Physics
The Virtual Analysis Facility
• Configured via a web interface: cernvm-online.cern.ch
• Entire cluster launched with a single command
• User interacts only by submitting jobs
• Elastic Cluster as a Service: elasticity is embedded, no external tools
• PoD and dynamic workers: run PROOF on top of it as a special case
26
PROOF+PoD elastiqCernVM HTCondor
What is the VAF? • A cluster of CernVM virtual machines: one head node, many workers
• Running the HTCondor job scheduler
• Capable of growing and shrinking based on the usage with elastiq
Dar
io B
erza
no’s
talk
@ C
HE
P201
3
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 21/120
the vaf
Dar
io B
erza
no’s
talk
@ C
HE
P201
3
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 22/120
cernvm ecosystem
● Virtual Software Appliance for CERN LHC experiments. ■ A complete, portable and easy to configure
environment for developing and running LHC data analysis locally and on private and public clouds.
■ http://cernvm.cern.ch/portal/
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 23/120
cernvm ecosystem
● Virtual Software Appliance for CERN LHC experiments.
CernVM is a baseline Virtual Software Appliance for the participants of CERN LHC experiments. µCernVM is an even lighter appliance.
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 24/120
cernvm ecosystem
● Virtual Software Appliance for CERN LHC experiments.
CVMFS is a network file system based on HTTP and optimized to deliver experiment software. (See also yesterday’s talk by Tian)
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 25/120
cernvm ecosystem
● Virtual Software Appliance for CERN LHC experiments.
Co-pilot is a framework which allows to instantiate a distributed computing infrastructure on top of virtualized computing resources (sometimes called a “Cloud Orchestrator”)
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 26/120
cernvm ecosystem
● Virtual Software Appliance for CERN LHC experiments. CernVM-Online is an easy contextualization mechanism that allows CernVM instances to be contextualized via a simple web interface. https://cernvm-online.cern.ch/user/login
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 27/120
cernvm online
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 28/120
indigo datacloud
● Develop a data/computing platform targeting scientific communities, deployable on multiple hardware and provisioned over hybrid (private or public) e-infrastructures. ■ Which means cloud infrastructures + HPC clusters ■ https://www.indigo-datacloud.eu/
Key points: ■ Based on Open Source solutions, will develop Open Source
software. ■ Rooted in use cases and support by multi-disciplinar scientific
communities, big and small. ■ Exploitation of available, general solutions rather than on
custom, home-made specific tools or services. ■ Possibility to run the software in a hybrid, distributed Cloud
environment.
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 29/120
literature
● S. Bagnasco, D. Berzano, R. Brunetti, S. Lusso, S. Vallero, “Managing a Tier-2 Computer Centre with a Private Cloud Infrastructure”, Proceedings of ACAT2013, Beijing, J. Phys.: Conf. Ser. 1742-6596 523 012012 (2014)
● S. Bagnasco, D. Berzano, R. Brunetti, S. Lusso, S. Vallero, “Integrating multiple computing needs via a Private Cloud infrastructure”, proceedings of CHEP2013, Amsterdam, J. Phys.: Conf. Ser. 1742-6596 513 032100 (2014)
● D. Berzano, J. Blomer, P. Buncic, I. Charalampidis, G. Ganis, G. Lestaris and R. Meusel, “PROOF as a Service on the Cloud: a Virtual Analysis Facility based on the CernVM ecosystem”, proceedings of CHEP2013, Amsterdam, J. Phys.: Conf. Ser. 1742-6596 513 032007 (2014)
Stefano Bagnasco - INFN Torino . Adopting Cloud Technologies - 30/120
? Thanks
Questions? [email protected]