PAC-498B
ESX Server Architectural Directions
Beng-Hong Lim, Director of R&D
This presentation may contain VMware confidential information.
Copyright © 2005 VMware, Inc. All rights reserved. All other marks and names mentioned herein may be trademarks
of their respective companies.
Agenda
  ESX Server overview
  Technology trends in the datacenter
  ESX Server architectural directions
    CPU virtualization
    I/O virtualization
    Scalable performance
    Power efficiency and management
  Conclusions
ESX Server Core Virtualization
[Diagram: Hardware; VMkernel (Virtual Machine Monitor, I/O Stack with Storage Stack and Network Stack, Device Drivers, VMkernel Hardware Interface); four VMs, each with a VMM; Service Console (four VMX processes, Device Drivers).]
ESX Server Enterprise-Class Features
[Diagram: Hardware; VMkernel (Virtual Machine Monitor, I/O Stack with Storage Stack and Network Stack, Device Drivers, VMkernel Hardware Interface); four VMs, each with a VMM; Service Console (four VMX processes, Device Drivers, Third-Party Agents). Added as Enterprise Class Virtualization Functionality: Distributed VM File System; Virtual NIC & Switch; Resource Management (CPU Scheduling, Memory Scheduling, Storage Bandwidth, Network Bandwidth).]
ESX Server in the Datacenter
[Diagram: Hardware; VMkernel (Virtual Machine Monitor, I/O Stack with Storage Stack and Network Stack, Device Drivers, VMkernel Hardware Interface); four VMs, each with a VMM; Service Console (four VMX processes, Device Drivers, Third-Party Agents). Enterprise Class Virtualization Functionality: Distributed VM File System; Virtual NIC & Switch; Resource Management (CPU Scheduling, Memory Scheduling, Storage Bandwidth, Network Bandwidth). Added as Management and Distributed Virtualization Services: SDK, VirtualCenter Agent; Third-party Solutions; VirtualCenter Distributed Services (DRS, VMotion, DAS, Provisioning, Backup).]
Server Technology Trends
  Multi-core CPUs
    16 to 32 CPU cores per server
  64-bit systems
    Multi-terabytes of memory
  Power-aware architectures
    Adaptive throttling of CPUs and server hardware
  Converged I/O fabrics and interfaces
    Shared high-speed interface to network and storage
  Network-based, virtualized storage
    Stateless servers
The Future Datacenter
  Each server:
    Many CPU cores
    64-bits: lots of memory
    Shared, high-bandwidth connection to network and external storage
    Stateless
    Power-hungry
[Diagram: Servers, Network, Storage]
Virtualization is Key
  Abundant compute resources on a server
    Virtualization is inherently scalable and parallel
    The killer app for efficiently utilizing many CPUs and multi-terabytes of memory
  Power management increasingly important
    Higher compute densities and rising utility costs
    Maximize performance per watt across all servers
    Distributed resource scheduling can optimize for this metric
  Transforms system management
    Breaks the bond between hardware and applications
    Ease of scale-up management in a scale-out environment
The Virtual Datacenter
  ESX Server virtualizes individual servers
  VirtualCenter synthesizes virtualized servers into a giant computer
  ESX Server and VirtualCenter map applications and virtual machine topologies onto physical resources
[Diagram caption: A Distributed Virtual Computer]
CPU Virtualization
[Diagram: Hardware; VMkernel (Distributed Virtual Machine File System, Storage Stack, Virtual NIC and Switch, Network Stack, Device Drivers); four VMs, each with a VMM; Service Console (four VMX processes, Device Drivers, I/O Stack).]
CPU Virtualization
  Basic idea: directly execute code until not safe
  Handling unsafe code
    Trap and emulate: classic mainframe
    Avoid unsafe code, call into VMM: paravirtualization
    Dynamically transform to safe code: binary translation
  Tradeoffs among the methods
                     Trap and Emulate   Para-virtualization   Binary Translation
    Performance      Average            Excellent             Good
    Sophistication   Average            Average               High
    Compatibility    Excellent          Poor                  Excellent
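The "directly execute until not safe" idea, with trap and emulate as the handler, can be illustrated with a toy model. This is only a sketch and not how ESX Server implements it: the two-instruction "unsafe" set, the `VirtualCPU` class, and the `run`/`emulate` helpers are all hypothetical names invented for the illustration.

```python
from dataclasses import dataclass

# Hypothetical toy instruction set; the names are illustrative only.
UNSAFE = {"cli", "sti"}  # privileged: they touch the (virtual) interrupt flag


@dataclass
class VirtualCPU:
    acc: int = 0                     # guest-visible accumulator
    interrupts_enabled: bool = True  # virtual interrupt flag kept by the VMM
    traps: int = 0                   # how often the VMM had to step in


def emulate(vcpu: VirtualCPU, op: str) -> None:
    """VMM handler: apply the unsafe instruction to virtual state only."""
    vcpu.traps += 1
    vcpu.interrupts_enabled = (op == "sti")


def run(vcpu: VirtualCPU, program: list[tuple[str, int]]) -> None:
    """Run safe instructions directly; hand unsafe ones to the VMM."""
    for op, arg in program:
        if op in UNSAFE:
            emulate(vcpu, op)   # trap and emulate
        elif op == "add":
            vcpu.acc += arg     # stands in for direct execution of safe code
        else:
            raise ValueError(f"unknown instruction: {op}")


vcpu = VirtualCPU()
run(vcpu, [("add", 2), ("cli", 0), ("add", 3), ("sti", 0)])
```

In this model the other two methods would differ only in where the unsafe instructions are caught: a paravirtualized guest would call `emulate` itself instead of issuing them, and a binary translator would rewrite the program ahead of execution so that it contains the calls.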
CPU Virtualization Directions
  New technologies: 64-bit CPUs, VT/Pacifica, Linux/Windows paravirtualization. Many guest OS types
  Flexible architecture supports mix of guests and VMM types
    Separate VMM per virtual machine
    Simultaneously run 32-bit, 64-bit, and paravirtualized guests
    Use most efficient method for the hardware and guest OS
[Diagram: VMs on the VMkernel, each with its own VMM type: BT VMM-32, VT VMM-64, Para VMM, BT VMM-32, ...]
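A per-VM choice of VMM type could look roughly like the sketch below. The policy is an assumption made for illustration, not VMware's actual selection logic: the `Guest` class and `choose_vmm` function are hypothetical, and the rule that a 64-bit guest requires VT/Pacifica is inferred only from the "VT VMM-64" label in the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guest:
    bits: int              # 32 or 64
    paravirtualized: bool  # guest OS was built to call into the VMM


def choose_vmm(guest: Guest, cpu_has_vt: bool) -> str:
    """Pick a VMM type for one VM (hypothetical policy, for illustration)."""
    if guest.paravirtualized:
        return "Para VMM"
    if guest.bits == 64:
        # Assumption of this sketch: 64-bit guests use hardware assist.
        if not cpu_has_vt:
            raise ValueError("sketch assumes 64-bit guests need VT/Pacifica")
        return "VT VMM-64"
    return "BT VMM-32"  # binary translation for unmodified 32-bit guests


# The mix of guests drawn on the slide: each VM gets its own VMM.
guests = [Guest(32, False), Guest(64, False), Guest(32, True), Guest(32, False)]
vmms = [choose_vmm(g, cpu_has_vt=True) for g in guests]
```

Because the decision is made per virtual machine, one host can run all of these guests side by side, which is the point of keeping a separate VMM per VM.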
I/O Virtualization
[Diagram: Hardware; VMkernel (Distributed Virtual Machine File System, Storage Stack, Virtual NIC and Switch, Network Stack, Device Drivers); four VMs, each with a VMM; four VMX processes; Service Console (Device Drivers, I/O Stack).]
I/O Virtualization Paths
  Paths to physical device
    1. Hosted/Split I/O: via a separate host/VM
    2. Native I/O: via the vmkernel
    3. Passthrough I/O: guest directly drives device. Needs hardware support, sacrifices functionality.
  Which I/O path to use?
[Diagram: the three numbered paths from the Guest OS (I/O Stack, Device Driver) to the physical device, through Device Emulation in the VMM or the VMX, the VMkernel (I/O Stack, Device Driver), and the Service Console (Device Driver).]
Evaluating the I/O paths
  Compatibility
    Hardware vendors can re-use existing device drivers
  Performance (per watt)
    High I/O performance, low CPU occupancy
  Isolation
    Contain device driver faults
  Virtualization Functionality
    Virtual machine portability
    Resource sharing and multiplexing
    Offloading guest functionality into virtualization layer
I/O Virtualization: Compatibility
                    Hosted/Split   Native   Passthrough
    Compatibility   Good           Poor     Good
    Isolation
    Functionality
    Performance
  Hosted/Split and Passthrough can re-use device drivers from existing OSes
  Native requires new or ported drivers. Provide DDK and driver APIs to ease driver development and porting
Hosted/Split I/O Performance
  Microkernel-style communication, context switch and scheduling overhead unless CPUs dedicated
  Scalability limits to Service Console or Driver VM
[Diagram: Virtual Machines 1 to 3, each with an OS and a Frontend Device Driver, on the VMkernel; a Service Console or Trusted Virtual Machine with an OS, the Backend, and Native Device Drivers.]
Native I/O Performance
  Direct calls between frontend and backend
  Backend can run on any CPU, scalable
[Diagram: Virtual Machines 1 to 3, each with an OS and a Frontend Device Driver; VMkernel with the Backend and Native Device Drivers.]
Passthrough I/O Performance
  Guest OS driver drives the device directly
  VMkernel may have to handle/route interrupts
[Diagram: Virtual Machines 1 to 3, each with an OS and its own Device Driver; VMkernel with Interrupt routing.]
I/O Virtualization: Performance
                    Hosted/Split   Native   Passthrough
    Compatibility   Good           Poor     Good
    Isolation
    Functionality
    Performance     Poor           Good     Good
  Hosted/Split incurs switching and scheduling overheads, or consumes dedicated CPUs
  Native and Passthrough are efficient, scalable
  Passthrough avoids an extra driver layer, but runs more code non-natively
I/O Virtualization: Isolation, Today
                    Hosted/Split   Native   Passthrough
    Compatibility   Good           Poor     Good
    Isolation       None           None     N/A
    Functionality
    Performance     Poor           Good     Good
  Passthrough allows malicious guest to crash system, so not an option
  All three methods need I/O MMU to map and protect DMA
I/O Virtualization: Isolation, Future
                    Hosted/Split   Native   Passthrough
    Compatibility   Good           Poor     Good
    Isolation       Good           Good     Good
    Functionality
    Performance     Poor           Good     Good
  Hosted/Split and Passthrough can isolate within virtual machine, use I/O MMUs
  Native can isolate within in-kernel protection domains, use VT/Pacifica and I/O MMUs
  Not a substitute for testing and qualification
I/O Virtualization: Functionality
                    Hosted/Split   Native   Passthrough
    Compatibility   Good           Poor     Good
    Isolation       Good           Good     Good
    Functionality   Good           Good     Poor
    Performance     Poor           Good     Good
  Passthrough precludes offloading functionality from the guest into the virtualization layer, e.g., NIC teaming, SAN multipathing
  Passthrough sacrifices some key virtualization capabilities; VM portability, VMotion
I/O Virtualization Direction
                    Hosted/Split   Native   Passthrough
    Compatibility   Good           Poor     Good
    Isolation       Good           Good     Good
    Functionality   Good           Good     Poor
    Performance     Poor           Good     Good
  Future datacenter implications
    Power-efficient performance favors Native and Passthrough
    Stateless servers and converged I/O interfaces: fewer devices to support, eases compatibility
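The tradeoff table on this slide can be treated as data to see why no single path wins. The sketch below transcribes the ratings as read from the slide (with the "future" isolation row); the `RATINGS` dictionary and the `candidate_paths` helper are hypothetical names used only for this illustration.

```python
# Ratings as read from the slide's summary table ("future" isolation row).
RATINGS = {
    "Hosted/Split": {"compatibility": "Good", "isolation": "Good",
                     "functionality": "Good", "performance": "Poor"},
    "Native":       {"compatibility": "Poor", "isolation": "Good",
                     "functionality": "Good", "performance": "Good"},
    "Passthrough":  {"compatibility": "Good", "isolation": "Good",
                     "functionality": "Poor", "performance": "Good"},
}


def candidate_paths(required: set[str]) -> list[str]:
    """Return the I/O paths rated Good on every required criterion."""
    return [path for path, ratings in RATINGS.items()
            if all(ratings[criterion] == "Good" for criterion in required)]


fast = candidate_paths({"performance"})
fast_and_functional = candidate_paths({"performance", "functionality"})
everything = candidate_paths(
    {"compatibility", "isolation", "functionality", "performance"})
```

Requiring performance alone leaves Native and Passthrough, adding functionality leaves only Native, and requiring all four criteria leaves nothing, which matches the slide's conclusion that power-efficient performance favors Native and Passthrough.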
I/O Virtualization Direction
  Optimize Native I/O for selected devices, driver isolation
  Migrate from Hosted/Split I/O to Passthrough I/O when hardware ready
  Can synthesize Hosted/Split I/O by proxy through Passthrough I/O VM
[Diagram, built up over three slides. First: Guest OS (I/O Stack, Device Driver), VMM (Device Emulation), VMkernel (I/O Stack, Device Driver), and Service Console (VMX with Device Emulation, Device Driver). Second: the Service Console path is no longer shown. Third: a second Guest OS with its own I/O Stack and Device Driver is added.]
Hardware Support for Passthrough
  To preserve key virtualization capabilities
    Device sharing: multiple virtual end points
    Snapshots and VMotion: save/restore device state
    Page sharing, VMotion: demand paging
    Virtual machine portability: standard device abstraction
  Active industry interest in hardware support for Passthrough I/O. Please contact VMware if interested
Other I/O Virtualization Directions
  More network-based storage support
    iSCSI and NAS in ESX Server 3.0
  I/O accelerators
    Offload engines: offload guest or vmkernel I/O
    Intel I/OAT: guest and vmkernel usage
  I/O bandwidth management
    Important for shared interfaces, converged I/O fabrics
  Paravirtualization
    Reduce hardware requirements for Passthrough I/O
    Define a standard paravirtual I/O interface
Scalable Performance
[Diagram, shown in two builds: Hardware; VMkernel (Resource Management, Distributed Virtual Machine File System, Storage Stack, Virtual NIC and Switch, Network Stack, Device Drivers); four VMs, each with a VMM; four VMX processes; Service Console (SDK and VirtualCenter Agent, Third-Party Agents, Device Drivers, I/O Stack). The second build is labeled "Service Console Virtual Machine".]
Scalable Performance
[Chart: Windows 2000 boot time (s) versus number of idle Windows 2000 guests, comparing ESX 2 (VMX on Service Console) with ESX 3 (VMX on VMkernel). Test system: 8-CPU DL-760, 16GB RAM.]
Power Efficiency and Management
  Increasing CPU power consumption
  Increasing compute and power densities with multi-core CPUs and stateless servers
  Significant cost to power and cool a datacenter
  Limits to datacenter power and cooling capability
Server Power Management
  Power consumption varies as cube of voltage x frequency
  New hardware support for dynamically adjusting voltage/frequency
  Load-balance across minimally powered CPUs
  With parallelism, two half-speed CPUs are more efficient than one full-speed CPU
[Diagram: Conventional versus Voltage/Freq]
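The claim about two half-speed CPUs follows from the cubic relationship with a little arithmetic. The sketch below assumes that a CPU's power scales with the cube of its speed (voltage lowered along with frequency) and that the workload parallelizes perfectly, so two half-speed CPUs deliver the same throughput as one full-speed CPU; `relative_power` is a hypothetical helper, not a measured model.

```python
def relative_power(speed: float, n_cpus: int = 1) -> float:
    """Power relative to one full-speed CPU, assuming power ~ speed**3 per CPU."""
    return n_cpus * speed ** 3


one_full = relative_power(1.0)            # one CPU at full speed
two_half = relative_power(0.5, n_cpus=2)  # two CPUs at half speed each
savings = 1 - two_half / one_full         # fraction of power saved
```

Under these assumptions the pair of half-speed CPUs draws a quarter of the power for the same aggregate speed. Real savings would be smaller because fixed power and imperfect parallelism are ignored here.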
Datacenter Power Management
  Dynamic, power-aware load-balancing across servers with VMotion
  Consider fixed power consumption per server
  Balance powering off servers vs. throttling CPUs
  Need an efficient virtualization layer
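The balance between powering off servers and throttling CPUs can be made concrete with a small model. Everything here is an assumption for illustration: each powered-on server is taken to draw a fixed amount plus a dynamic part that grows with the cube of its throttled speed, powered-off servers draw nothing, load is measured in units of one full-speed server, and `cluster_power` and `best_active_count` are hypothetical names.

```python
import math


def cluster_power(load: float, active: int, fixed: float, dynamic: float) -> float:
    """Total power when `load` is spread evenly over `active` servers.

    Assumed model: a powered-on server draws `fixed` plus
    `dynamic * speed**3`, where speed (0..1) is its throttled CPU speed.
    """
    speed = load / active
    return active * (fixed + dynamic * speed ** 3)


def best_active_count(load: float, servers: int, fixed: float, dynamic: float) -> int:
    """Number of powered-on servers that minimizes total power for `load`."""
    fewest = max(1, math.ceil(load))  # enough capacity to carry the load
    if fewest > servers:
        raise ValueError("load exceeds cluster capacity")
    return min(range(fewest, servers + 1),
               key=lambda k: cluster_power(load, k, fixed, dynamic))


# Two servers' worth of load on an 8-server cluster, three fixed-power levels.
no_fixed_cost = best_active_count(2.0, 8, fixed=0.0, dynamic=100.0)
small_fixed_cost = best_active_count(2.0, 8, fixed=10.0, dynamic=100.0)
large_fixed_cost = best_active_count(2.0, 8, fixed=200.0, dynamic=100.0)
```

With no fixed cost the model keeps every server on and throttles them all; as the fixed cost per server grows, it powers more servers off and runs the rest faster. In a real datacenter VMotion would be the mechanism for moving the load onto the chosen servers.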
Recap
  Consider datacenter of the future
    Stateless, power-hungry virtualized servers with many CPUs, lots of memory
    Virtualized network “backplane”
    Virtualized network-based storage
    A global virtual computer
  Impact on ESX Server architecture
    CPU virtualization: multiple VMM types
    I/O virtualization: Native and Passthrough I/O
    Scalable performance: relieve bottlenecks
    Power efficiency: minimize CPU consumption
Role of Virtualization Hardware Support
  Phase 1: Hardware for correctness
    Trap or exit on unsafe code
    Safe device access, driver isolation
  Phase 2: Hardware as accelerator
    Speed up virtualization software
    Fast VM enter/exit, nested paging, I/O offloading
  Does not eliminate the need for virtualization software, just as hardware support does not eliminate the need for operating systems
ESX Server Architecture Today
[Diagram: Hardware; VMkernel (VMkernel Hardware Interface; Resource Management: CPU Scheduling, Memory Scheduling, Storage Bandwidth, Network Bandwidth; Distributed Virtual Machine File System; Storage Stack; Virtual NIC and Switch; Network Stack; Device Drivers); four VMs, each with a VMM; Service Console (four VMX processes, SDK and VirtualCenter Agent, Third-Party Agents, Device Drivers, I/O Stack).]
ESX Server Architecture in the Future
[Diagram: Hardware; VMkernel (VMkernel Hardware Interface; Resource Management: CPU Scheduling, Memory Scheduling, Storage Bandwidth, Network Bandwidth, Power Management; Distributed Virtual Machine File System; Storage Stack; Virtual NIC and Switch; Network Stack; Isolated Device Drivers/Modules; POSIX API); VMs on a mix of VMM types (VMM-32, VMM-64, Para-VMM, VT VMM); four VMX processes; SDK and Management Agents; Third-Party Agents; Passthrough I/O.]
Call to Action
  Hardware vendors
    Build performance-focused virtualization assists
    Build hardware for fully-functional Passthrough I/O
    Work with VMware on Native I/O devices and drivers
  Software vendors
    Support standard interfaces: OS, apps, management
    VMI for transparent paravirtualization
  Datacenter architects and administrators
    Virtualize now, get ready for the future virtual datacenter
  VMware will use relevant technology to provide the broadest, most flexible, highest performance virtualization platform
Backup Slides
This presentation covers potential and uncommitted future directions. Details about future releases of our products are available in select sessions at VMworld, including:
  PAC879: The Next Phase of Virtual Infrastructure: Introducing ESX Server 3.0 and VirtualCenter 2.0
  PAC177: Distributed Availability Services Architecture
  PAC484: Consolidated Backup with ESX Server: In-Depth Review
  PAC485: Managing Data Center Resources Using the VirtualCenter Distributed Resource Scheduler
  PAC532: iSCSI and NAS in ESX Server 3
Overview of ESX Server
  Mature x86 hypervisor-based virtualization
    Considered best server virtualization approach
  Highlights:
    Sophisticated resource management
    Enterprise networking and storage support
    Integrated VMFS for managing virtual disks
    Broadest support for x86 OSes
    Transparent VM migration (VMotion)
I/O Virtualization
  Full virtualization of I/O devices
    Standardized virtual device set provides hardware independence, virtual machine portability
  Intermediate layer for enterprise-class functionality. Examples:
    VMFS: store and manipulate virtual disks
    Storage multipathing: link failover
    Virtual network switch: VLAN, NIC teaming
  How to route data from/to physical devices?
Other scalability factors
  Many CPUs/server
    Scalable scheduler algorithms, large SMP VMs
  Large memory and storage addressing
    64-bit vmkernel
  Many storage targets
    VMFS scaling
    LUN-mapping schemes
  Many servers
    Scalable distributed resource scheduling
Observations
  CPU and I/O virtualization
    No single technique satisfies all the requirements
    Provide transparent choice of best technique for the hardware and customer needs
    Hardware support can help virtualization software
  Evolve towards the vision and requirements of the future datacenter
    Scalability and power management
    Stateless servers and converged I/O fabrics