Windsor: Domain 0 Disaggregation for XenServer and XCP

August 28, 2012 Windsor Domain 0 disaggregation for XenServer James Bulpin, Director of Technology, XenServer, Citrix

Upload: the-linux-foundation

Post on 19-May-2015

DESCRIPTION

In a traditional Xen configuration domain 0 is used for a large number of different functions including running the toolstack(s), backends for network and disk I/O, running the QEMU device model instances, driving the physical devices in the system, handling guest console/framebuffer I/O and miscellaneous monitoring and management functions. Having all these functions in one domain produces a complex environment which is susceptible to shared fate on the failure of any one function, has complex interactions between functions (including resource contention) which make it difficult to predict performance, and has limited flexibility (such as requiring the same kernel for all device drivers).

"Domain 0 disaggregation" has been discussed for some time as a way to break out domain 0's functions into separate domains. Doing this enables each domain to be tailored to its function, such as using a different kernel or operating system to drive different physical devices. Splitting functions into separate domains removes some of the unintentional interactions, such as in-domain resource contention, and reduces the system impact of the failure of a single function, such as a device driver crash. Although domain 0 disaggregation is not new, it is seldom used in practice and much of its use is focussed on providing enhanced security.

Citrix XenServer will be moving towards a disaggregated domain 0 in order to provide better security, scalability, performance, reliability, supportability and flexibility. This talk will describe XenServer's “Windsor” architecture and explain how it will provide the above benefits to customers and users. We will present an overview of the architecture and some early experimental measurements showing the benefits.

TRANSCRIPT

Page 1: Windsor: Domain 0 Disaggregation for XenServer and XCP

August 28, 2012

Windsor Domain 0 disaggregation for XenServer

James Bulpin, Director of Technology, XenServer, Citrix

Page 2: Windsor: Domain 0 Disaggregation for XenServer and XCP

Introduction

• XenServer/Xen Cloud Platform (XCP): a distribution of Xen, a domain 0 and everything else needed to create a platform for virtualisation

• A platform for server virtualisation

• A platform for virtual desktops (e.g. XenDesktop)

• A platform for the cloud (e.g. OpenStack and CloudStack)

• A platform for virtualised networking (e.g. NetScaler)

• All use cases are tending towards much higher numbers of guest VMs per host

• Current architecture works now but will hit bottlenecks as servers get bigger and more powerful

• Want a flexible, modular solution to scalability

Page 3: Windsor: Domain 0 Disaggregation for XenServer and XCP

Overview

• XenServer architecture evolution – a brief history

• Limitations of the current architecture

• Windsor: 3rd generation XenServer architecture

ᵒ Using domain 0 disaggregation for scalability and performance

This presentation is about the internal technology of XenServer/XCP. It is not a statement about the future feature set or capabilities of any particular XenServer release.

Page 4: Windsor: Domain 0 Disaggregation for XenServer and XCP

XenServer Architecture – a brief history

Page 5: Windsor: Domain 0 Disaggregation for XenServer and XCP

XenServer Architecture – a brief history (1)

• First generation architecture in Burbank – XenEnterprise 3.0.0

• Single host virtualisation: no resource pools, no XenMotion

• Based on open-source xend/xm toolstack

• Basic C host agent driving xend

• XenAdmin Java management application

• Initially used a small, read-only dom0

ᵒ Moved to writable environment in 3.1

Page 6: Windsor: Domain 0 Disaggregation for XenServer and XCP

XenServer Architecture – a brief history (2)

• Second generation architecture in Rio – XenServer 4.0.0

• Current releases, including XenServer 6.1 coming soon, based on this

• All current versions of XCP based on this

• Sophisticated OCaml xapi toolstack implementing the XenAPI

ᵒ Uses only lowest level part of open-source Xen toolstack (libxc)

• Clustered architecture for resource pools, XenMotion and master mobility

• Domain 0 evolved from 1st generation

ᵒ Fairly full Linux distribution based on CentOS

ᵒ Writable environment using RPMs for package management

Page 7: Windsor: Domain 0 Disaggregation for XenServer and XCP

Limitations of the current architecture

Page 8: Windsor: Domain 0 Disaggregation for XenServer and XCP

XenServer 2nd generation host architecture

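[Diagram: dom0 running on the Xen hypervisor above the hardware devices, alongside VM1. Labels inside dom0: xapi (XenAPI, streaming services, XML/RPC and HTTP, links to other pool hosts; storage, network and domains db), SM i/f scripts, xenstore, blktap/blkback, netback, tapdisk, qemu-dm, VM consoles, vswitch, ovs-vswitchd (with an external DVS Controller), and the kernel with Linux storage devices and Linux network devices. VM1 is connected through shared memory pages.]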

Page 9: Windsor: Domain 0 Disaggregation for XenServer and XCP

Limitations of the current architecture

• With future larger, more powerful servers domain 0 will become a bottleneck

ᵒ Will limit scalability and performance

• Lack of failure containment

• Hard to reason about dom0 security

• Non-optimal use of modern NUMA architectures

• Limited third party extensibility

Page 10: Windsor: Domain 0 Disaggregation for XenServer and XCP

“Windsor” – XenServer’s new architecture

Page 11: Windsor: Domain 0 Disaggregation for XenServer and XCP

Goals for the new architecture

• Improved scalability on single XenServer host

• Remove performance bottlenecks

• Cloud scale horizontal scaling of XenServer hosts

• Better isolation for multi-tenancy

• Increased availability and quality of service

• Higher Trusted Computing Base measurement coverage

Page 12: Windsor: Domain 0 Disaggregation for XenServer and XCP

Design principles and overview

• Scale-out, not scale-up – exploit parallelism and locality

• Disaggregate Domain-0 functionality to other domains

• Encapsulate and simplify

• Clean APIs between components

• Flexibility

• Design for scalability and security

Page 13: Windsor: Domain 0 Disaggregation for XenServer and XCP

The approach

• Can we improve performance and scalability with a bigger domain 0? – Yes

• Can we tune domain 0 to better utilise hardware resource? – Yes

• But... this makes domain 0 a bigger and more complex environment

ᵒ Easy to change one thing and have a negative effect elsewhere

ᵒ Multiple individuals and organisations working in the same complex environment

ᵒ Still don’t get containment of failures

• Instead disaggregate domain 0 functionality into multiple system domains

ᵒ Can be thought of as “virtualising dom0” – after all we tell users that it’s better to have one workload per VM and use a hypervisor to run multiple VMs per server

ᵒ Helps disentangle complex behaviour

ᵒ Easier to provision suitable resources, allows for clear parallelism

ᵒ Contains failures
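
The failure-containment argument on this slide can be illustrated with a small sketch. It is not taken from the talk: the domain names and the function list below are hypothetical, chosen only to contrast a monolithic dom0 with a set of system domains. It shows which functions are lost when a single domain crashes.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    # Hypothetical model: a domain is just a name plus the functions it hosts.
    name: str
    functions: set = field(default_factory=set)


def functions_lost(domains, crashed):
    """Return the set of functions hosted by the domain named `crashed`."""
    return set().union(*(d.functions for d in domains if d.name == crashed))


# Illustrative function list, loosely following the dom0 roles named in the talk.
FUNCTIONS = {"toolstack", "netback", "blkback", "qemu", "logging"}

# Traditional layout: everything lives in dom0.
monolithic = [Domain("dom0", set(FUNCTIONS))]

# Disaggregated layout: one (hypothetical) system domain per function.
disaggregated = [
    Domain("xapi-domain", {"toolstack"}),
    Domain("network-driver-domain", {"netback"}),
    Domain("storage-driver-domain", {"blkback"}),
    Domain("qemu-domain", {"qemu"}),
    Domain("logging-domain", {"logging"}),
]

lost_mono = functions_lost(monolithic, "dom0")
lost_disagg = functions_lost(disaggregated, "network-driver-domain")
print(sorted(lost_mono))
print(sorted(lost_disagg))
```

In the monolithic case a crash takes every function with it; in the disaggregated case only the network backend is affected, which is the "contains failures" point above.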

Page 14: Windsor: Domain 0 Disaggregation for XenServer and XCP

Domain 0 disaggregation

• Moving virtualisation functions out of domain 0 into other domains

ᵒ Driver domains contain backend drivers (netback, blkback) and physical device drivers, and use PCI pass-through to get access to the physical device to be virtualised

ᵒ System domains to provide shared services such as logging

ᵒ Domains for running qemu(s) – per VM stub domains or shared qemu domains

ᵒ Monitoring, management and interface domains

• Has been around for a while, mostly with a security focus

ᵒ Improving Xen Security through Disaggregation (Murray et al., VEE08) [http://www.cl.cam.ac.uk/~dgm36/publications/2008-murray2008improving.pdf]

ᵒ Breaking up is hard to do: Xoar (Colp et al., SOSP11) [http://www.cs.ubc.ca/~andy/papers/xoar-sosp-final.pdf]

ᵒ Qubes (driver domains) [http://qubes-os.org]

ᵒ Citrix XenClient XT (network driver domains) [http://www.citrix.com/xenclient]

Page 15: Windsor: Domain 0 Disaggregation for XenServer and XCP

Targets for disaggregation

• Storage driver domains – storage stack (e.g. VHD), ring backends, device drivers

• Network driver domains – network stack (e.g. OVS), ring backends, device drivers

• xenstored domain – busy, central control database for low level functionality

• xapi management domain

• qemu domain (per node, per tenant or per-VM) – emulated BIOS etc.

Page 16: Windsor: Domain 0 Disaggregation for XenServer and XCP

Benefits

• Better VM density

• Better use of scale-out hardware (NUMA, many cores, balanced I/O)

• Improved stability

• Improved availability (fault isolation)

• Opportunity for secure boot etc.

• More extensible, future value-add opportunities

Page 17: Windsor: Domain 0 Disaggregation for XenServer and XCP

[Diagram: Windsor host layout on Xen. Hardware shown: CPUs, RAM, four NICs (or SR-IOV VFs) and RAID, exposed to domains as eth and scsi devices. Domains shown: Dom0, two network driver domains, two NFS/iSCSI driver domains, a local storage driver domain, two qemu domains, an xapi domain, a logging domain and two user VMs. Component labels: domain manager, healthd, xapi, xenopsd, libxl, qemu, vswitch, networkd, tapdisk, blktap3, storaged, syslogd, gntdev, NF, BF, NB, and "dbus over v4v".]

Page 18: Windsor: Domain 0 Disaggregation for XenServer and XCP

Replication to scale-out on big servers

• Scalability

ᵒ More VMs, more system domains

• Isolation for cloud environments

[Diagram: two host layouts, each on hardware and the XenServer hypervisor. The first shows one set of system domains (NIC, Storage, Logging, XAPI, qemu, console, ISV) serving 12 VMs. The second shows Dom 0 and xenstored plus two replicated sets of NIC, Storage and qemu domains, each set shown with 12 VMs.]
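
As a rough illustration of the replication idea on this slide, the sketch below spreads VMs across replicated groups of system domains, always choosing the group that currently serves the fewest VMs. The policy and all names (`assign_vms`, `group-a`, `group-b`) are assumptions made for the example; the slides do not say how VMs are mapped to system domains.

```python
from collections import defaultdict


def assign_vms(vm_names, group_names):
    """Assign each VM to the group of system domains with the fewest VMs so far."""
    load = {g: 0 for g in group_names}
    placement = defaultdict(list)
    for vm in vm_names:
        # min() returns the first minimum, so ties go to the earlier group.
        group = min(group_names, key=lambda g: load[g])
        placement[group].append(vm)
        load[group] += 1
    return dict(placement)


# 24 VMs over two replicated sets of (NIC, storage, qemu) domains.
vms = [f"vm{i}" for i in range(24)]
groups = ["group-a", "group-b"]
placement = assign_vms(vms, groups)
for group, members in placement.items():
    print(group, len(members))
```

Adding a third group to `groups` is all it takes to scale the same host out further, which is the point of replicating system domains rather than growing a single dom0.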

Page 19: Windsor: Domain 0 Disaggregation for XenServer and XCP

Balanced I/O

[Diagram: two NUMA nodes, Socket 0 and Socket 1, linked by QPI. Each socket has four cores, a PCIe I/O hub, its own DIMMs and two NICs (Net 0 and Net 1 respectively), and hosts a network driver domain and a set of user VMs.]

• NUMA affinity and memory placement

• Use driver domain on this node

• Redundant failover to other node

• All I/O processing, DMA, etc. kept within NUMA node
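
The placement rule from this slide (use the driver domain on the VM's own NUMA node, fail over to the other node if it is unavailable) can be sketched as follows. This is a minimal illustration under assumed names (`DriverDomain`, `pick_driver_domain`, `netdd0`, `netdd1`), not XenServer code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DriverDomain:
    # Hypothetical record: which NUMA node the driver domain runs on.
    name: str
    numa_node: int
    healthy: bool = True


def pick_driver_domain(vm_node: int,
                       driver_domains: List[DriverDomain]) -> Optional[DriverDomain]:
    """Prefer a healthy driver domain on the VM's NUMA node, else any healthy one."""
    healthy = [d for d in driver_domains if d.healthy]
    if not healthy:
        return None
    # False sorts before True, so a same-node domain wins when one exists.
    return min(healthy, key=lambda d: d.numa_node != vm_node)


domains = [DriverDomain("netdd0", 0), DriverDomain("netdd1", 1)]

local = pick_driver_domain(1, domains)      # node-local choice
domains[1].healthy = False                  # simulate a driver domain failure
fallback = pick_driver_domain(1, domains)   # fails over to the other node
print(local.name, fallback.name)
```

The fallback path crosses the inter-socket link, so it trades locality for availability, matching the "redundant failover to other node" annotation.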

Page 20: Windsor: Domain 0 Disaggregation for XenServer and XCP

Flexibility – value-add storage appliances (today)

[Diagram: Xen hosting Dom0, a storage appliance and a user VM. Labels: BF, BB, NF, NB, "Clever stuff", iSCSI/NFS, Local SR.]

• Appliance exposes its storage to Dom0 via NFS/iSCSI over local networking

• VM block accesses traverse several rings

Page 21: Windsor: Domain 0 Disaggregation for XenServer and XCP

Flexibility – value-add storage appliances (Windsor)

[Diagram: two Xen hosts, each with Dom0, a driver domain and a user VM. Labels in each host: BF, blktap3, "Clever stuff", scsi.]

• Network link between cluster nodes (PV VIF or PCI passthrough NIC/SR-IOV VF)

• Access to local disks/SSDs

• Local blk ring, faster than NFS/iSCSI via dom0

Page 22: Windsor: Domain 0 Disaggregation for XenServer and XCP

Summary

• Next generation XenServer architecture built to scale out using domain 0 disaggregation techniques

ᵒ Scale with the workload and server size

ᵒ Expect to significantly enhance scalability and aggregate performance

• Well defined APIs between components will allow better extensibility

• Avoids complexity of single, large, multi-function domain 0

ᵒ Easier to reason about

ᵒ Easier to maintain and debug

ᵒ Containment of failures

Page 23: Windsor: Domain 0 Disaggregation for XenServer and XCP

Work better. Live better.