keir fraser, steve hand, christian limpach, dan ... · ian pratt keir fraser, steve hand, christian...

29
Xen and the Art of Virtualization Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), R Neugebauer (Intel), M Williamson (Intel) Computer Laboratory

Upload: donhu

Post on 26-Aug-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Xen and the Art of Virtualization

Ian PrattKeir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), R Neugebauer (Intel), M Williamson (Intel)

Computer Laboratory

Page 2: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Outline

Virtualization overviewXen 2.0 FeaturesArchitecturePerformanceXen para-virtualized binary interface

Linux 2.6 on Xen/x86Work in Progress

Page 3: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Virtualization Overview

Single OS image: Ensim, Vservers, CKRMGroup user processes into resource containersHard to get strong isolation

Full virtualization: VMware, VirtualPCRun multiple unmodified guest OSesHard to efficiently virtualize x86

Para-virtualization: UML, XenRun multiple guest OSes ported to special archArch Xen/x86 is very close to normal x86

Page 4: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Xen 2.0 Features

Secure isolation between VMsResource control and QoSOnly guest kernel needs to be ported

All user-level apps and libraries run unmodifiedLinux 2.4/2.6, NetBSD, FreeBSD, WinXP

Execution performance is close to nativeLive Migration of VMs between Xen nodesXen hardware support:

SMP; x86 / x86_64 / ia64; all Linux drivers

Page 5: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Xen 1.2 Architecture

Unmodified User-Level Application

Software

Ported ‘Guest’ Operating Systems

Xen Hypervisor

Hardware

Domain 0 Domain 1 Domain 2 Domain 3

Page 6: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Xen 2.0 Architecture

Unmodified User-Level Application

Software

Ported ‘Guest’ Operating Systems

Xen Hypervisor

Hardware

Domain 0 Domain 1 Domain 2 Domain 3

Page 7: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

System Performance

L X V U

SPEC INT2000 (score)

L X V U

Linux build time (s)

L X V U

OSDB-OLTP (tup/s)

L X V U

SPEC WEB99 (score)

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

1.1

Benchmark suite running on Linux (L), Xen (X), VMware Workstation (V), and UML (U)

Page 8: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Xen Para-Virtualization

Arch Xen/x86 – like x86, but replace privileged instructions with Xen hypercalls

Avoids binary rewriting and fault trapping For Linux 2.6, only arch-dep files modified

Modify OS to understand virtualised env.Wall-clock time vs. virtual processor time• Xen provides both types of alarm timer

Expose real resource availability• Enables OS to optimise behaviour

Page 9: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

x86 CPU virtualization

Xen runs in ring 0 (most privileged)Ring 1/2 for guest OS, 3 for user-space

GPF if guest attempts to use privileged instrXen lives in top 64MB of linear addr space

Segmentation used to protect Xen as switching page tables too slow on standard x86

Hypercalls jump to Xen in ring 0Guest OS may install ‘fast trap’ handler

Direct ring user-space to guest OS system callsMMU virtualisation: shadow vs. direct-mode

Page 10: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

MMU Virtualizion : Shadow-Mode

MMU

Accessed &dirty bits

Guest OS

VMMHardware

guest writes

guest reads Virtual → Pseudo-physical

Virtual → Machine

Updates

Page 11: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

MMU Virtualization : Direct-Mode

MMU

Guest OS

Xen VMMHardware

guest writes

guest reads

Virtual → Machine

Page 12: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Para-Virtualizing the MMU

Guest OSes allocate and manage own PTsHypercall to change PT base

Xen must validate PT updates before useUpdates may be queued and batch processed

Validation rules applied to each PTE:1. Guest may only map pages it owns*2. Pagetable pages may only be mapped RO

Xen tracks page ownership and current useL4/L3/L2/L1/Normal (plus ref count)

Page 13: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

MMU Micro-Benchmarks

L X V U

Page fault (µs)L X V U

Process fork (µs)

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

1.1

lmbench results on Linux (L), Xen (X), VMWare Workstation (V), and UML (U)

Page 14: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Queued Update Interface (Xen 1.2)

MMU

Guest OS

Xen VMMHardware

validation

guestwrites

guest readsVirtual → Machine

Page 15: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Writeable Page Tables (1)

MMU

Guest OS

Xen VMMHardware

page fault

first guestwrite

guest reads

Virtual → Machine

Page 16: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Writeable Page Tables (2)

MMU

Guest OS

Xen VMMHardware

guest writes

guest reads

Virtual → MachineX

Page 17: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Writeable Page Tables (3)

MMU

Guest OS

Xen VMMHardware

page fault

guest writes

guest reads

Virtual → MachineX

Page 18: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Writeable Page Tables (4)

MMU

Guest OS

Xen VMMHardware

validate

guest writes

guest reads

Virtual → Machine

Page 19: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Segmentation Support

Segmentation req’d by thread librariesXen supports virtualised GDT and LDTSegment must not overlap Xen 64MB areaNPT TLS library uses 4GB segs with –ve offset!• Emulation plus binary rewriting required

x86_64 has no support for segment limitsForced to use paging, but only have 2 prot levelsXen ring 0; OS and user in ring 3 w/ PT switch• Opteron’s TLB flush filter CAM makes this fast

Page 20: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

I/O Architecture

Xen IO-Spaces delegate guest OSesprotected access to specified h/w devices

Virtual PCI configuration spaceVirtual interrupts

Devices are virtualised and exported to other VMs via Device Channels

Safe asynchronous shared memory transport‘Backend’ drivers export to ‘frontend’ driversNet: use normal bridging, routing, iptablesBlock: export any blk dev e.g. sda4,loop0,vg3

Page 21: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Device Channel Interface

Page 22: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

TCP results

L X V UTx, MTU 1500 (Mbps)

L X V URx, MTU 1500 (Mbps)

L X V UTx, MTU 500 (Mbps)

L X V URx, MTU 500 (Mbps)

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

1.1

TCP bandwidth on Linux (L), Xen (X), VMWare Workstation (V), and UML (U)

Page 23: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Isolated Driver VMs

0

50

100

150

200

250

300

350

0 5 10 15 20 25 30 35 40time (s)

Page 24: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Live migration for clustersPre-copy approach: VM continues to run‘lift’ domain on to shadow page tables

Bitmap of dirtied pages; scan; transmit dirtiedAtomic ‘zero bitmap & make PTEs read-only’

Iterate until no forward progress, then stop VM and transfer remainderRewrite page tables for new MFNs; RestartMigrate MAC or send unsolicited ARP-ReplyDowntime typically 10’s of milliseconds

(though very application dependent)

Page 25: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Scalability

Scalability principally limited by Application resource requirements

several 10’s of VMs on server-class machines

Balloon driver used to control domain memory usage by returning pages to Xen

Normal OS paging mechanisms can deflate quiescent domains to <4MBXen per-guest memory usage <32KB

Additional multiplexing overhead negligible

Page 26: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Scalability

L X2

L X4

L X8

L X16

0

200

400

600

800

1000

Simultaneous SPEC WEB99 Instances on Linux (L) and Xen(X)

Page 27: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Resource Differentation

2 4 8 8(diff)OSDB-IR

2 4 8 8(diff)OSDB-OLTP

0.0

0.5

1.0

1.5

2.0

Simultaneous OSDB-IR and OSDB-OLTP Instances on Xen

Agg

rega

te th

roug

hput

rela

tive

to o

ne in

stan

ce

Page 28: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

On-Going Work

xend web control interfaceCluster management tools

Load balancing

SMP guest OSes (have SMP hosts already) Support for Intel VT/LT x86 extensions

Will enable full virtualization

VM CheckpointingDebugging and fault tolerance

Page 29: Keir Fraser, Steve Hand, Christian Limpach, Dan ... · Ian Pratt Keir Fraser, Steve Hand, Christian Limpach, Dan Magenheimer (HP), Mike Wray (HP), ... Xen IO-Spaces delegate guest

Conclusions

Xen is a complete and robust GPL VMMOutstanding performance and scalabilityExcellent resource control and protectionLinux 2.6 port required no modifications to core code

http://xen.sf.net