Xen and the Art of Virtualisation
Or “how do I make my hardware do more for less?”
Adrian Chadd
Virtua-what?
• Virtualisation == “how to expose a different running environment to the physical environment”
• A program needs: CPU time, RAM, Input/Output (and some “love”)
• ... and this covers a lot more than Xen/VMWare/QEMU
Why?
• Safer execution environment
• Future-proof execution environment
• Migration of environments between physical resources
• Most workloads aren’t using 100% of your computer!
Good virtualisation ==
• Main requirements:
• Low overhead
• Unable to break out of virtual environment
• Completely (Mostly) transparent
Examples?
• Virtualising hardware
• OS/390, VMWare, QEmu, Xen
• Virtual software environment
• ‘protected mode’, Xen
• Completely synthetic machines
• Java, Forth, P-Code
Let’s talk about OS/390
• (Not talking from experience!)
• Actually called MVS, dating from the 1970s
• Two parts - a control program (handling scheduling, IO, batch control, etc) and MVS tasks
• The clever bit: an MVS task could occupy a physical machine, or a partition, or a partition inside a partition, etc, etc.
• http://en.wikipedia.org/wiki/MVS
“Stacking”
• As in MVS, OS/390, etc - the concept that a machine can run inside a machine (inside a machine (inside a machine (inside a machine)))
• Why do this?
• better use of resources
• development/production environments can be simulated
• ..because you can..
Java? Forth?
• They sound like “languages” to you, but they’re actually entire synthetic environments
• Java and Forth define both the language and the environment the program runs in
• Java/Forth code can be written to run without requiring physical access to the hardware!
• But it’s not virtualised “hardware” - so can be slow!
Jails? Zones?
• Various other options for virtualisation exist; but they are specific to a particular environment
• FreeBSD jails: same kernel, same hardware; but processes in a “jail” can’t speak to processes in other jails or leave their jailed filesystem/network
• Solaris Zones are kind of like Jails, but with much more fine-grained controls (network/CPU/quota stuff - ie, tight resource management)
protected modes?
• In the beginning, processors ran one program only, and ran it well
• Then people wanted the ability to run multiple tasks - anyone remember CP/M swappers?
• So processors grew extensions to run multiple tasks (user) with a control program to bind them (protected)
What was protected?
• The processor generally split things up into “What was safe for user tasks” and “what was dangerous for user tasks”
• Anything which controlled the CPU or hardware was “dangerous” - eg hardware access, creating tasks, changing memory layout, etc
• So “user” tasks had to talk through the “protected” task to talk to disks, screen, keyboard and such
protected mode on intel
• 8086/8088 (PC/XT): no protected mode, only one task (more if you “fake it”)
• 80286 (PC/AT): 16-bit protected mode, but not able to “pretend to be DOS”
• 80386: 16-bit protected mode, 32 bit protected mode, and VM86 mode
• Ie, “I wish to pretend to be an 8086 in a ‘task’” please
“pretend to be 8086?”
• DOS programs were written to talk directly to the hardware
• Contrast this to MVS where all device access went via a control program..
• So if you wanted to run multiple DOS programs on one processor you had to provide not only a “task” with a virtual CPU and memory, you had to provide “virtual hardware” as well!
.. pretend, hardware?
• What’s involved in actually pretending to be hardware?
• .. A lot.
• Probably out of scope of this discussion
• But the important bit here is that it’s slow to pretend to be physical hardware!
• ... well, slow unless you’re VMWare..
Wait.. user task?
• Now, think back: there’s some stuff which can only be done in the “protected” tasks
• So a “user” task (on intel) can’t pretend to do everything! So wait, you can’t pretend to be a complete machine in a task?
• And VM86 mode was an 8086 in a task, not an entire 80386 in a task..
• So - Intels can’t pretend to be an entire machine
Enter stage left: vmware
• Traditionally, the only way to virtualise a complete intel PC on an intel PC was to emulate the actual CPU and all the hardware
• SLOW
• Then VMWare came along and let you run a complete PC and operating system on a PC, including hardware..
• FAST
pretending where?
• VMWare runs all the program code on the real CPU..
• .. right until it notices you’re about to run a bit of code which wants to do “protected” stuff
• It then emulates the “protected stuff” slowish ..
• .. but most of your programs don’t do the protected stuff, so you don’t notice
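A toy sketch of the idea above, in Python: ordinary instructions "run" directly, and only the privileged ones get handed to a monitor that emulates them against pretend state. The instruction names and the `Monitor` class are invented for illustration; this is not how VMWare is actually implemented, just the shape of the trick.

```python
# Toy model: guest code runs "natively" until it hits a privileged
# instruction, which the monitor emulates against virtual state.
# Instruction names here are invented for illustration.

PRIVILEGED = {"out", "load_page_table", "halt"}


class Monitor:
    def __init__(self):
        self.native = 0          # instructions run directly on the CPU
        self.emulated = 0        # instructions the monitor had to fake
        self.virtual_ports = {}  # pretend IO ports
        self.halted = False

    def emulate(self, op, args):
        self.emulated += 1
        if op == "out":
            port, value = args
            self.virtual_ports[port] = value  # write to pretend hardware
        elif op == "halt":
            self.halted = True                # stop the guest, not the host
        # "load_page_table" is counted but otherwise ignored in this toy

    def run(self, program):
        for op, *args in program:
            if self.halted:
                break
            if op in PRIVILEGED:
                self.emulate(op, args)        # slow path
            else:
                self.native += 1              # fast path: nothing to fake


guest = [("add",), ("mov",), ("out", 0x3F8, 65), ("add",), ("halt",), ("add",)]
vm = Monitor()
vm.run(guest)
print(vm.native, vm.emulated, vm.virtual_ports)
```

The more of the guest's work that stays on the fast path, the less the emulation cost shows.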
pretending what?
• VMware pretends:
• All I/O devices - keyboard, mouse, disk, a pretend BIOS and pretend BIOS services, pretend motherboard chipset, etc.
• Protected mode stuff - talking to hardware, setting up memory management and tasks, etc
• Installing the VMWare tools provides drivers which talk directly to vmware; bypassing the hardware emulation (ie: fast)
Enter stage right: Xen
• A research project from the University of Cambridge in the UK
• A hypervisor controls the CPU and memory resources, scheduling multiple tasks
• A task is generally an operating system
• One of the operating systems (dom0) also provides physical device resources
• Hypervisor: ring0; domain: ring1, (2), 3
• The other operating systems (domU) talk via the hypervisor to dom0 for all IO
Paravirtualisation
• The older style of virtualisation provided by MVS and such - hardware isn’t emulated
• All virtual machines talk to “disks”, “network interfaces”, etc via a software API
• Something then handles the physical hardware access
• No hardware emulation - so fast
• .. but operating systems have to be modified..
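A minimal Python sketch of the "talk to disks via a software API" idea, assuming a made-up request format. `Backend` stands in for whatever owns the physical hardware (dom0 in Xen's case) and `GuestDisk` for the driver a modified guest would use; both names are hypothetical, and real Xen uses shared-memory rings and event channels rather than a direct method call.

```python
# Toy model of paravirtualised disk IO: the guest never touches
# hardware registers; it queues requests through a software API and
# a backend does the real work.
# Class and method names are invented for illustration.

from collections import deque


class Backend:
    """Stands in for whatever owns the physical disk."""

    def __init__(self, nblocks):
        self.blocks = [b""] * nblocks

    def handle(self, request):
        op, block, data = request
        if op == "write":
            self.blocks[block] = data
            return b""
        return self.blocks[block]  # anything else is treated as a read


class GuestDisk:
    """What a modified guest kernel would use as its 'disk'."""

    def __init__(self, backend):
        self.ring = deque()        # request queue between guest and backend
        self.backend = backend

    def submit(self, op, block, data=b""):
        self.ring.append((op, block, data))
        return self.backend.handle(self.ring.popleft())


disk = GuestDisk(Backend(nblocks=8))
disk.submit("write", 3, b"hello")
print(disk.submit("read", 3))
```

Nothing here emulates a disk controller, which is why this style is fast, and also why the guest operating system has to be changed to use it.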
Full Virtualisation?
• Vanderpool (Intel) / Pacifica (AMD)
• Essentially, provide a third “layer” on top of the traditional “protected mode” stuff
• So a hypervisor implements all the high level controls and virtual machines can now play with “protected” code
• .. hardware virtualisation? (IOMMUs)
• Currently though - VMWare is faster than Vanderpool/Pacifica ..
“Pretend” virtualisation
• VMWare (ESX)
• Runs a hypervisor too, but the hypervisor handles hardware access
• Management virtual server in Linux
• Does the “code translation” techniques like normal VMWare does, so you get a decent speed boost
• Doesn’t incur the penalty of running on another normal operating system, like desktop VMWare does
Xen + fake hardware
• .. or, how you run Windows under Xen
• Someone wrote a HAL for Windows 2003 running under Xen Paravirtualisation; so stuff is very quick!
• .. but it can’t be publicly released, thanks to licensing
• Xen “hardware virtualisation” uses Pacifica/Vanderpool to provide virtual services, and QEMU to provide the fake hardware
• The result? .. not that fast; not yet mature
What I can do with VMs
• Run legacy operating system environments on modern hardware
• Better use of resources - most servers aren’t always busy!
• Provide a constant environment so hardware can be upgraded without upgrading the virtual machines
• Migrate virtual machines between servers
Problems with VMs
• Computers might be fast, but IO isn’t always that fast..
• .. and multiple VMs talking to the same disk array turns your disk IO into random disk IO ..
• .. guess what that does to performance
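A rough way to see the effect, as a Python toy: each VM reads its own region of a shared disk sequentially, and we add up how far the head moves when requests are served one VM at a time versus interleaved. The VM count, region size and block numbers are made-up illustration values, and a real disk array is far more complicated than a single head.

```python
# Toy model: each VM reads its own region of a shared disk
# sequentially. Served one VM at a time the head barely moves;
# interleaved, it bounces between regions on every request.
# Block numbers and VM count are made-up illustration values.


def head_travel(requests):
    """Total distance the disk head moves, starting at block 0."""
    position, travel = 0, 0
    for block in requests:
        travel += abs(block - position)
        position = block
    return travel


vms = 4
reads_per_vm = 100
region = 10_000  # each VM owns a 10,000-block region

# One sequential stream of block numbers per VM
streams = [[vm * region + i for i in range(reads_per_vm)] for vm in range(vms)]

one_at_a_time = [b for s in streams for b in s]
interleaved = [s[i] for i in range(reads_per_vm) for s in streams]

print(head_travel(one_at_a_time), head_travel(interleaved))
```

Every VM thinks it is doing nice sequential IO; the disk sees something much closer to random IO.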
More problems..
• Efficient use of resources sure..
• .. but what happens when your physical host crashes or needs to be upgraded? What if you lose your disk array?
• Not insurmountable - you just need to be careful about planning and avoiding horrible service dependencies
VMWare at home
• Cool things to do with VMWare Player at home
• You can test out Operating Systems without wiping your machine
• Kernel hackers: never again need to blow away your laptop’s disk whilst doing kernel hacking
• You can run Windows XP in a VM on Linux just to run Office + Outlook + random application your employer DEMANDS!
Xen at home
• I run multiple Xen VMs for software development - but limited to Linux
• .. but this lets me test Squid under Debian/Ubuntu/CentOS relatively easily!
• Others run a firewall in one VM, file server in another VM, web server in another VM, etc..
Ok, so setting up Xen
• Xen is easy to set up with modern distributions
• step 1: install your favourite OS with Xen Dom0 support
• step 2: install the Xen packages
• step 3: reboot with the Xen hypervisor + Linux dom0 kernel
• step 4: profit!
Under FC6..
• # yum install xen kernel-xen xen-libs
• # more /boot/grub/menu.lst
  title Fedora Core (2.6.19-1.2911.6.5.fc6xen)
      root (hd0,0)
      kernel /xen.gz-2.6.19-1.2911.6.5.fc6 dom0_mem=384000
      module /vmlinuz-2.6.19-1.2911.6.5.fc6xen ro root=/dev/Hosting3/Root
      module /initrd-2.6.19-1.2911.6.5.fc6xen.img
• Then type reboot!
Xen and Dom0
• A straight Xen install will leave your machine:
• Running the Xen hypervisor
• Your existing machine is Dom0 - so accessing hardware directly
• .. but some stuff might not work, like sound, graphics, anything with direct memory mapping for IO
• The question is why? (IO Virtualisation?)
Xen and DomU
• DomUs generally don’t get physical hardware access (but you CAN if you’re scary..)
• They get virtual network devices and virtual disk devices
• Virtual Network devices: are just network interfaces which appear as ethernet interfaces in dom0
• Virtual Disk devices: Loop devices, LVMs, physical partitions, etc..
DomU: disks
• Some people like using large files for DomU filesystems - mounted using the “loop” device
• This allows you to do scary things like mount the images via NFS off a filer..
• .. but performance suffers.
• Pro: easier to handle (they’re files)
• Con: all domU access goes via a dom0 filesystem; so things can get slow.. (FS in FS)
DomU: disks (ctd)
• Or.. you can use LVM
• LVM lets you carve your disks up into subdisks which appear as normal disk block devices
• You can then load a disk block device into Xen
• .. and you can grow it as required..
• This method is very fast, but they’re not “files” so they’re possibly more confusing for people to use when beginning..
• Insert drawings on whiteboard here
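For comparison, here is roughly how the two disk styles look in an xm domain config file (these files use Python syntax). The LVM entry reuses the volume name from the example VM file in this talk; the image file path is a hypothetical placeholder.

```python
# Fragment of an xm domain config file (Python syntax).
# Each entry is "backend:source,device-in-guest,mode".

# LVM-backed: a logical volume handed to the guest as a block device.
disk = ["phy:Hosting3/XEN_xenion_root,sda1,w"]

# File-backed alternative: an image file living in a dom0 filesystem.
# The path below is a made-up example, not from the talk.
# disk = ["file:/var/lib/xen/images/xenion_root.img,sda1,w"]
```

Only one `disk =` line should be active in a real config file; the file-backed one is left commented out here.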
DomU: networking
• Xen gives you two types of network interfaces: routed and bridged
• Both implemented as ethernet devices!
• So each DomU has a network interface which shows up in the Dom0 (as vifX.Y); and various scripts can then either add it to a Linux ethernet bridge group, or put an IP on it and treat it as a point-to-point routed interface
• Insert drawings on whiteboard here
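In the domain config file (Python syntax) the difference shows up in the `vif` line. This is a sketch only: the bridge name and IP address are the ones from the example VM file in this talk, and which options actually get used depends on the scripts configured in xend-config.sxp, so check the Xen manual for your version.

```python
# Fragment of an xm domain config file (Python syntax).

# Bridged: the vif script adds the guest's vifX.Y to this Linux bridge.
vif = ["bridge=xenbr0"]

# Routed alternative: give the vif script an IP so it can set up a
# point-to-point route to the guest instead of joining a bridge.
# vif = ["ip=203.56.168.22"]
```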
Xen commands
• “xm” is the command to use
• “xm create” creates a VM from a file
• “xm list” lists VMs
• “xm destroy/xm shutdown” does what they sound like
• “xm console” opens a text console to a VM
• “xm mem-set” lets you dynamically set RAM for a VM
• Plenty more magic commands to break stuff
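If you end up scripting these, a thin Python wrapper is enough. The sketch below only builds the command line by default (a dry run), so it is safe to try on a machine without Xen; the domain name and config path are hypothetical, and the `xm()` helper is not part of Xen.

```python
# Thin wrapper around the xm commands listed above. By default it only
# builds the argument list (dry run) instead of running anything.
# The domain name and config file path below are hypothetical.

import subprocess


def xm(*args, dry_run=True):
    """Build (and optionally run) an xm command line."""
    argv = ["xm", *[str(a) for a in args]]
    if dry_run:
        return argv
    return subprocess.run(argv, check=True)


print(xm("create", "/etc/xen/xenion"))  # start a VM from its config file
print(xm("list"))                       # show running domains
print(xm("mem-set", "xenion", 256))     # set the VM's RAM, in MB
print(xm("shutdown", "xenion"))         # ask the guest to shut down cleanly
```

Pass `dry_run=False` on a real dom0 (as root) to actually run the command.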
How Xen creates VMs
• This stuff is a bit voodoo; read the Xen manual!
• xend-config.sxp defines two shell script locations - one which creates the top-level interfaces on startup; one which creates sub interfaces (vifs) when a VM is created
• They’re shell scripts - so you can roll your own
• Routed and Bridge examples are given - but you can do other stuff yourself (eg VLANs)
Booting the VM
• The “VM creator” process involves creating a domain, loading in the kernel, creating event channels, allocating memory, then letting it rip
• The ‘old’ way - kernel in Dom0; modules in DomU (loaded once domU is running)
• The ‘new’ way - use pygrub bootloader which prods the DomU filesystem like GRUB and loads kernels/modules from in there
• You can manage domU kernels in domU!
An example VM file
# This is very old-school - please use pygrub!
kernel = "/boot/vmlinuz-2.6.19-1.2911.6.5.fc6xen"
ramdisk = "/boot/initrd-2.6.19-1.2911.6.5.fc6xen.img.1"
memory = 128
name = "xenion"
vif = [ 'bridge=xenbr0, ip=203.56.168.22' ]
disk = [ "phy:Hosting3/XEN_xenion_root,sda1,w", "phy:Hosting3/XEN_xenion_swap,sda2,w" ]
root = "/dev/sda1 ro"
on_crash = "preserve"
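A sketch of what the same VM might look like the 'new' way, with pygrub as the bootloader. The pygrub path is an assumption (it varies by distribution), and this assumes the guest's root filesystem holds a GRUB config and kernel that pygrub can find; the other values are copied from the file above.

```python
# pygrub variant of the old-school config (xm config files use Python
# syntax). The bootloader path is an assumption; check where your
# distribution installs pygrub.
bootloader = "/usr/bin/pygrub"

# No kernel=, ramdisk= or root= lines: pygrub reads the kernel, initrd
# and kernel arguments from the GRUB config inside the guest's own disk.
memory = 128
name = "xenion"
vif = ["bridge=xenbr0, ip=203.56.168.22"]
disk = [
    "phy:Hosting3/XEN_xenion_root,sda1,w",
    "phy:Hosting3/XEN_xenion_swap,sda2,w",
]
on_crash = "preserve"
```

Kernel upgrades then happen inside the domU like on any normal machine, with no need to touch dom0.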
Xen example: firewall
• Here’s where I’ll draw on the whiteboard what an example server setup with a firewall, a web/DNS server and a file server would look like
• There’s plenty of examples on the Internets explaining how this is done; no need to waste space here
• Use the virtual machine management tools for your platform - eg “virsh”, “virt-manager” and pygrub; much easier!
More?
• Want to know more? There’s a LOT out there about Xen and modern virtualisation techniques in general
• Feel free to ask questions now, or on the PLUG mailing list