
Predrag Buncic (CERN/PH-SFT)

CernVM - a virtual software appliance for LHC applications

C. Aguado-Sanchez 1), P. Buncic 1), L. Franco 1), A. Harutyunyan 2), P. Mato 1), Y. Yao 3)

1) CERN, Geneva
2) Yerevan Physics Institute, Yerevan
3) LBNL, Berkeley

Prague, 26/03/2009 - CHEP 2009 - CernVM – A Virtual Machine for LHC Experiments

Talk Outline

• CernVM Project
• Building blocks
• User Interface and API
• Scalability and performance
• Future plans & directions
• Release status
• Conclusions
• Links


CernVM Project

• Portable Analysis Environment using Virtualization Technology (WP9)
  • Approved in 2007 (2+2 years) as R&D activity in CERN/PH Department
  • Started January 2008
  • Sister project to Multicore R&D

• Project goals:
  • Provide a complete, portable and easy to configure user environment for developing and running LHC data analysis locally and on the Grid, independent of physical software and hardware platform (Linux, Windows, MacOS)
  • Decouple application lifecycle from evolution of system infrastructure
  • Reduce effort to install, maintain and keep up to date the experiment software
  • Lower the cost of software development by reducing the number of compiler-platform combinations


• A complete Data Analysis environment available for each experiment
  • Code check-out, editing, compilation, local small test, debugging, …
  • Grid submission, data access, …
  • Event displays, interactive data analysis, …

• No software installation required

• Suspend/resume capability


Key Building Blocks

• rBuilder from rPath (www.rpath.org)
  • A tool to build VM images for various virtualization platforms

• rPath Linux 1
  • Slim Linux OS binary compatible with RH/SLC4

• rAA - rPath Linux Appliance Agent
  • Web user interface
  • XMLRPC API
  • Can be fully customized and extended by means of plugins (401)

• CVMFS - CernVM file system
  • Read only file system optimized for software distribution
  • Aggressive caching
  • Operational in offline mode, for as long as you stay within the cache
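The caching behaviour described for CVMFS (fetch once, then keep working offline for as long as every file you need is already cached) can be illustrated with a toy read-through cache. This is a minimal sketch and not CVMFS code: the class `ReadThroughCache`, the `fetch` callback, the in-memory repository and the example path are all hypothetical names chosen for the illustration.

```python
class ReadThroughCache:
    """Toy read-through cache: fetch a file once, then serve it locally."""

    def __init__(self, fetch):
        # fetch(path) -> bytes is a hypothetical callback standing in for
        # a download from the software repository; it raises OSError
        # when the network is unavailable.
        self._fetch = fetch
        self._store = {}

    def read(self, path):
        if path not in self._store:
            # Cache miss: the only point where the network is needed.
            self._store[path] = self._fetch(path)
        return self._store[path]


def make_repository(files):
    """Return (fetch, state) for an in-memory stand-in repository."""
    state = {"online": True, "requests": 0}

    def fetch(path):
        if not state["online"]:
            raise OSError("network unavailable")
        state["requests"] += 1
        return files[path]

    return fetch, state


fetch, state = make_repository({"/opt/root/lib/libCore.so": b"binary data"})
cache = ReadThroughCache(fetch)
first = cache.read("/opt/root/lib/libCore.so")   # goes to the repository
state["online"] = False                          # simulate going offline
second = cache.read("/opt/root/lib/libCore.so")  # served from the cache
```

Reading a path that was never cached while offline raises `OSError` in this sketch, which is the "as long as you stay within the cache" condition from the slide.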

Build types

• Installable CD/DVD
• Stub Image
• Raw Filesystem Image
• Netboot Image
• Compressed Tar File
• Demo CD/DVD (Live CD/DVD)
• Raw Hard Disk Image
• VMware ® Virtual Appliance
• VMware ® ESX Server Virtual Appliance
• Microsoft ® VHD Virtual Appliance
• Xen Enterprise Virtual Appliance
• Virtual Iron Virtual Appliance
• Parallels Virtual Appliance
• Amazon Machine Image
• Update CD/DVD
• Appliance Installable ISO


Conary Package Manager

class Root(CPackageRecipe):
    name = 'root'
    version = '5.19.02'

    # Packages that must be installed to build ROOT
    buildRequires = ['libpng:devel', 'libpng:devellib', 'krb5:devel',
                     'libstdc++:devel', 'libxml2:devel', 'openssl:devel',
                     'python:devel', 'xorg-x11:devel', 'zlib:devel',
                     'perl:devel', 'perl:runtime']

    def setup(r):
        # %(name)s and %(version)s are filled in from the attributes above
        r.addArchive('ftp://root.cern.ch/root/%(name)s_v%(version)s.source.tar.gz')
        r.Environment('ROOTSYS', '%(builddir)s')
        r.ManualConfigure('--prefix=/opt/root')
        r.Make()
        r.MakeInstall()
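The recipe's `addArchive` argument uses `%(name)s` and `%(version)s` placeholders. Python's dictionary-based string formatting uses the same placeholder syntax, so it gives a rough illustration of the expansion. This only mimics the substitution in plain Python and says nothing about how Conary itself implements it.

```python
# Values taken from the recipe's name and version attributes.
macros = {"name": "root", "version": "5.19.02"}

# The same %(key)s placeholders used in the addArchive() argument.
template = "ftp://root.cern.ch/root/%(name)s_v%(version)s.source.tar.gz"

archive_url = template % macros
print(archive_url)
```

Changing `version` in one place is then enough to point the recipe at a different source archive.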


CernVM Variations

• group-cernvm (core packages)
• group-cernvm-devel (development tools)
• group-slc4 (SLC4 compatibility libs): compat-db4, compat-openssl, compat-libstdc++slc3, compat-libxml2, compat-readline, compat-tcl, compat-tk
• group-<experiment> (groups and extra packages required by experiment)
• group-<experiment>-desktop (lightweight X environment)
• group-cernvm-desktop (X11)

[Diagram label: 100 MB]


As easy as 1,2,3

1. Login to Web interface

2. Create user account

3. Select experiment, appliance flavor and preferences

import xmlrpclib
import os

# rAA XMLRPC endpoint; replace user, password and host with real values
url = 'http://user:password@host:8004/rAA/xmlrpc'
server = xmlrpclib.ServerProxy(url)
r = server.cernvm.Config.configGridUIVersion("3.1.22-0")
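The XMLRPC call on this slide uses the Python 2 module `xmlrpclib`; in Python 3 the same functionality lives in `xmlrpc.client`. The sketch below shows, without contacting any server, what such a call looks like on the wire. The method name and argument are the ones from the slide; nothing else about the rAA API is assumed.

```python
import xmlrpc.client

# Method name and argument taken from the slide's example.
METHOD = "cernvm.Config.configGridUIVersion"
ARGS = ("3.1.22-0",)

# Serialize the call into an XML-RPC request body.
request_body = xmlrpc.client.dumps(ARGS, methodname=METHOD)

# Decode it again to confirm what the server side would receive.
params, method = xmlrpc.client.loads(request_body)
print(method, params)
```

With a reachable appliance, `xmlrpc.client.ServerProxy(url)` would send this body for you when the method is called on the proxy object.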


CVMFS

• CernVM File System (CVMFS) is derived from Parrot (http://www.cctools.org) and its GROW-FS plugin code base, and adapted to run as a FUSE kernel module, adding extra features like:
  • possibility to use multiple file catalogues on the server side
  • transparent file compression under given size threshold
  • dynamic expansion of environment variables embedded in symbolic links
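The last feature, expansion of environment variables embedded in symbolic links, can be sketched as a small string substitution. The slide does not give the placeholder syntax, so the `$(NAME)` form used here is an assumption made for this sketch, as are the function name `expand_link_target`, the variable `PLATFORM` and the example path.

```python
import re

# Assumed placeholder syntax for this sketch: $(NAME).
_PLACEHOLDER = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def expand_link_target(target, env):
    """Replace each $(NAME) in a symlink target with env[NAME].

    Unknown variables are left untouched so the caller can spot them.
    """
    def substitute(match):
        name = match.group(1)
        return env.get(name, match.group(0))

    return _PLACEHOLDER.sub(substitute, target)


# Hypothetical link target and environment, for illustration only.
env = {"PLATFORM": "slc4_ia32_gcc34"}
resolved = expand_link_target("/opt/exp/$(PLATFORM)/lib", env)
print(resolved)
```

The point of such links is that one published directory tree can serve several platforms, with each client resolving the link according to its own environment.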


Scalable Infrastructure

Publishing Releases

1. Each experiment is given a VM to install and test its software using its own installation tools

2. Publishing is an atomic operation
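The slides do not say how atomic publishing is implemented. One common way to get the same guarantee on a POSIX file system is to write the new version to a temporary file and rename it into place, sketched below. The function name `publish_catalog` and the file name `catalog` are hypothetical.

```python
import os
import tempfile


def publish_catalog(directory, name, content):
    """Atomically replace directory/name with new content (bytes)."""
    final_path = os.path.join(directory, name)
    # Write the new version to a temporary file in the same directory,
    # so the final rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=name + ".")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the file in a single step: readers see
        # either the old catalog or the new one, never a partial file.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path


with tempfile.TemporaryDirectory() as repo:
    publish_catalog(repo, "catalog", b"release 1\n")
    publish_catalog(repo, "catalog", b"release 2\n")
    with open(os.path.join(repo, "catalog"), "rb") as f:
        published = f.read()
    leftovers = sorted(os.listdir(repo))
```

A client that reads the catalog while a release is being published therefore never sees a half-installed software tree.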


[Diagram labels: Primary (master) software repository; Secondary software repository; Load balancing; Reverse proxy; Regional reverse proxy; Site reverse proxy; CernVM (thin client)]

The aim is to reduce latency, which is the most important issue for distributed network file systems


Current deployment model

Benchmark setup

ROOT (version 5.22, 100Mb)

• The benchmark runs in Xen VM and includes compilation and execution of ROOT stressHepix suite
  • The system has to find and fetch binaries, libraries and headers from the network file system but, once loaded, the libraries remain in a local cache
• We compare the performance of CVMFS to AFS by artificially introducing network latency and limiting bandwidth (using tc tool)

CVMFS vs AFS

• These plots show the time penalty $t = t_{\mathrm{AFS,CVMFS}} - t_{\mathrm{local}}$ resulting from having the application binaries, search paths, and include paths reside on a network file system
  • Additional performance loss due to virtualization (not shown here) is consistently <5%

• CVMFS shows consistently better performance than AFS in case of 'cold cache'
  • 'Warm cache' performance of both AFS and CVMFS is equal to local file access
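The time penalty used in the comparison is a plain difference between the run time on a network file system and the run time on local disk. The sketch below spells out that arithmetic. The timings are made-up numbers chosen only to illustrate the cold-cache and warm-cache cases; they are not measurements from the talk.

```python
def time_penalty(t_network_fs, t_local):
    """Extra wall-clock time caused by reading from a network file system."""
    return t_network_fs - t_local


def relative_penalty(t_network_fs, t_local):
    """Penalty as a fraction of the local run time."""
    return time_penalty(t_network_fs, t_local) / t_local


# Hypothetical timings in seconds, NOT measurements from the talk.
t_local = 200.0
t_cold_cache = 260.0
t_warm_cache = 200.0

cold = time_penalty(t_cold_cache, t_local)
warm = time_penalty(t_warm_cache, t_local)
```

A warm-cache penalty of zero corresponds to the statement that warm-cache performance equals local file access.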

Removing single point of failure (1)

• Slave servers could be deployed at strategic locations to reduce latency and provide redundancy

[Diagram labels: CernVM, Proxy Server, HTTP server]
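Redundant servers only remove the single point of failure if clients can fall back from one to the next. A minimal sketch of such client-side failover follows. The talk does not describe the actual client logic; `fetch_with_failover` and the two stand-in servers are hypothetical.

```python
def fetch_with_failover(path, servers):
    """Try each server in order and return the first successful reply.

    servers is a list of callables fetch(path) -> bytes that raise
    OSError on failure (hypothetical stand-ins for HTTP mirrors).
    """
    errors = []
    for fetch in servers:
        try:
            return fetch(path)
        except OSError as exc:
            errors.append(exc)
    raise OSError("all %d servers failed for %s" % (len(errors), path))


def broken_server(path):
    raise OSError("connection refused")


def healthy_server(path):
    return b"contents of " + path.encode()


data = fetch_with_failover("/opt/root/bin/root", [broken_server, healthy_server])
```

Ordering the list by expected latency (nearest server first) would also serve the latency goal stated in the talk.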

Removing single point of failure (2)

• WAN: use existing Content Delivery Networks to remove the single point of failure
  • Amazon CloudFront (http://aws.amazon.com/cloudfront/)
  • Coral CDN (http://www.coralcdn.org)

• LAN: use a P2P-like mechanism for discovery of nearby CernVMs and cache sharing between them
  • No need to manually set up proxy servers (but they could still be used where they exist)

[Diagram label: Content Distribution Network]


Bridging Grids & Clouds

• BOINC
  • Open-source software for volunteer computing and grid computing
  • http://boinc.berkeley.edu/

• CernVM CoPilot development
  • Based on BOINC, LHC@HOME experience and CernVM image
  • Image size is of utmost importance to motivate volunteers
  • Can be easily adapted to Pilot Job frameworks (AliEn, DIRAC, PanDA)
    • … or Condor Worker, or proofd..
  • Aims to demonstrate running of ATLAS simulation using BOINC infrastructure and PanDA

[Diagram labels: BOINC, LHC@HOME, PanDA Pilot]


CernVM CoPilot

0. Send host JDL (free disk space, free memory, available packages)

1. Append framework specific information and request a job

2. Send user job JDL from Task Queue

3. Send input files and commands for execution (packages are already there)

4. When the job is done send back the output files (and the result of validation)

5. Register output files

[Diagram labels: Adapter, AliEn/DIRAC/PanDA]
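The numbered CoPilot steps can be walked through with a small in-memory simulation. Which component performs each step is an interpretation of the diagram labels, not something the slide states, and every class, function and field name below (`TaskQueue`, `Adapter`, `run_agent`, the job dictionary, the example command) is hypothetical.

```python
def host_jdl(free_disk_mb, free_memory_mb, packages):
    """Step 0: the agent describes the host it runs on."""
    return {"free_disk_mb": free_disk_mb,
            "free_memory_mb": free_memory_mb,
            "packages": list(packages)}


class TaskQueue:
    """Stand-in for the AliEn/DIRAC/PanDA side (hypothetical interface)."""

    def __init__(self, jobs):
        self.jobs = list(jobs)
        self.registered = []

    def request_job(self, request):
        # Step 2: hand out the first job whose package is available.
        for job in self.jobs:
            if job["package"] in request["packages"]:
                self.jobs.remove(job)
                return job
        return None


class Adapter:
    def __init__(self, framework, queue):
        self.framework = framework
        self.queue = queue

    def get_job(self, jdl):
        # Step 1: append framework-specific information, request a job.
        request = dict(jdl, framework=self.framework)
        job = self.queue.request_job(request)
        # Step 3: pass input files and commands on to the agent.
        return job

    def finish(self, job, outputs, valid):
        # Step 5: register output files (here only when validation passed).
        if valid:
            self.queue.registered.extend(outputs)


def run_agent(adapter, jdl):
    job = adapter.get_job(jdl)
    if job is None:
        return None
    # Step 4: "run" the job and send back outputs plus validation result.
    outputs = [job["name"] + ".out"]
    adapter.finish(job, outputs, valid=True)
    return outputs


queue = TaskQueue([{"name": "sim-001", "package": "AliRoot",
                    "inputs": ["config.C"], "command": "aliroot -b -q config.C"}])
adapter = Adapter("AliEn", queue)
result = run_agent(adapter, host_jdl(5000, 1024, ["AliRoot"]))
```

In this reading, only the adapter knows about the specific pilot-job framework, so the agent running inside CernVM stays the same for AliEn, DIRAC or PanDA.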


Simplifying Grid deployment

• Nimbus (former Globus Workspace Service)
  • Nimbus is a set of open source tools that together provide an "Infrastructure-as-a-Service" (IaaS) cloud computing solution
  • http://workspace.globus.org/

• Google Summer School (hosted at ANL) project to deploy a one-click, auto-configuring virtual Grid overlay for ALICE/AliEn
  • Successfully created virtual AliEn site for ALICE with one command
  • See CHEP'09 contribution by Artem Harutyunyan (poster #214)


1.2.0 Release now available

• Available now for download from http://rbuilder.cern.ch/project/cernvm/releases

• Can be run on
  • Linux (KVM, Xen, VMware Player, VirtualBox)
  • Windows (VMware Player, VirtualBox)
  • Mac (Fusion, Parallels, VirtualBox)

• Release Notes: http://cernvm.web.cern.ch/cernvm/index.cgi?page=ReleaseNotes

• Appliance can be configured and used with ALICE, LHCb, ATLAS (and CMS) software frameworks
  • ALICE (AliEn 2.15, 2.16, AliRoot v4-14-Rev-04, v4-15-Rev-03, v4-16-Rev-01)
  • ATLAS (14.2.20, 14.2.23, 14.2.24, 14.4.0, 14.5.0, 14.5.1)
  • CMS (CMSSW_2_2_5, CMSSW_2_2_6)
  • LHCb (GAUDI_v19r8, GAUDI_v19r9, GAUDI_v20r0, GAUDI_v20r2, GAUDI_v20r3, GAUDI_v10r11, DAVINCI_v20r3, DAVINCI_v19r12, DAVINCI_v20r3, DAVINCI_v22r1)


Summary

• If you work for one of the LHC experiments and you have one or a few of the problems listed below, then CernVM might be what you are looking for:
  • Your experiment software is not compatible with the hardware or software platform running on your laptop/desktop
  • You do not want to spend time to manually install and keep software up to date
  • You want to profit from the latest developments in CPU technology and use your multi/many core CPU to its maximal potential without modifying your application
  • You want to run your software on voluntary resources beyond the current Grid

• CernVM is a 'thin' virtual machine capable of running software of all four LHC experiments
  • Based on rPath Linux, binary compatible with SL4
  • Extensible user interface (plugins)
  • Package groups can be tailored to individual user group

• Experiment software is injected in VM by means of a file system (CVMFS) specially optimized for efficient software distribution
  • This allows us to develop, deploy and maintain only ONE image
  • Easy for end user but also great potential for deployment on cloud & grid
  • Using HTTP protocol allows efficient caching using standard technology

• Version 1.2.0 available at http://rbuilder.cern.ch/project/cernvm/releases


CernVM Links

• Mailing lists
  • [email protected] (open list for announcements and discussion)
  • [email protected] (end-user support for the CernVM project)

• Savannah Portal
  • Please submit bugs and feature requests to Savannah at http://savannah.cern.ch/projects/cernvm

• CernVM Home Page: http://cernvm.cern.ch

• rBuilder & Download Page: http://rbuilder.cern.ch

• ATLAS Wiki: https://twiki.cern.ch/twiki/bin/view/Atlas/CernVM