VMware vSphere Fault Tolerance for Multiprocessor Virtual Machines— Technical Preview
Jim Chow, VMware, Inc.
Shrinand Javadekar, VMware, Inc.
INF-BCO2655
#vmworldinf
2
Disclaimer
This session may contain product features that are currently under development.
This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product.
Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.
Pricing and packaging for any new technologies or features discussed or presented have not been determined.
3
Agenda
vSphere Availability Portfolio
Why Fault Tolerance
Multiprocessor Fault Tolerance details
Live Demo
Performance Numbers
Questions
4
43% of companies experiencing disasters never re-open, and 29% close within two years.
(McGladrey and Pullen)
93% of businesses that lost their data center for 10 days went bankrupt within one year.
(National Archives & Records Administration)
Top executives say 10 hours to recovery; IT managers say up to 30 hours.
(Harris Interactive)
Disasters Happen. Do You Need Protection?
5
Do you need protection?
Server failures happen
• Google released some data about their server failures
• 2% to 4% of servers fail; 1% to 5% of disk drives crash
• 20 rack failures: 40-80 machines instantly disappeared
• 1-6 hours to get back
Sources
• http://content.dell.com/us/en/gen/d/large-business/google-data-center
6
vSphere Offers Protection at Every Level
• Component: NIC Teaming, Storage Multipathing
• Server: High Availability, Fault Tolerance, vMotion, DRS
• Storage: Storage vMotion
• Data: Backup Solutions
• Site: Site Recovery Manager
Protection against hardware failures
Planned maintenance with zero downtime
Protection against unplanned downtime and disasters
7
vSphere Availability Portfolio
[Chart: coverage (Hardware, Guest OS, Application) versus downtime (none to minutes). Items shown: Fault Tolerance, App Monitoring APIs, Guest Monitoring, VM Infrastructure HA]
9
Why Fault Tolerance?
Continuous Availability
• Zero downtime
• Zero data loss
• No loss of TCP connections
• Completely transparent to guest software
• No dependency on Guest OS, applications
• No application specific management and learning
10
Background
2009: vSphere Fault Tolerance in vSphere 4.0
2010: Updates to vSphere Fault Tolerance in vSphere 4.1
2011: Updates to vSphere Fault Tolerance in vSphere 5.0
Details: http://www.vmware.com/products/fault-tolerance/
Problem:
• FT only for uni-processor VMs
• Is FT for multi-processor VMs possible?
• An impressively hard problem
• Concerted effort to find an approach
Reached a key milestone
• We’d like to share it
11
A Starting Point: vSphere FT
FT LOGGING
Shared VMDKs
vLockstep
vSphere ESX
(Primary)
vSphere ESX
(Secondary)
12
A Clean Slate
FT LOGGING
Shared VMDKs
vSphere ESX
(Primary)
vSphere ESX
(Secondary)
13
A Clean Slate
FT LOGGING
vSphere ESX
(Primary)
vSphere ESX
(Secondary)
Next: FT in practice
14
Turning on Multiprocessor FT
Creating two VMs
A new VM, but identical configuration
• vRAM, # vCPUs, vNICs, etc.
Each VM owns a complete set of VM files
• Separate vmdks completely owned by each VM
Primary VM
Disk 2
Config
Disk 1
Secondary VM
Disk 2
Config
Disk 1
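The "identical configuration" requirement can be illustrated with a small check. This is only a sketch: the `VMConfig` class, its field names, and `config_mismatches` are hypothetical and are not part of any vSphere API; they simply mirror the vRAM / vCPU / vNIC items listed on the slide.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class VMConfig:
    """Hypothetical subset of VM settings; field names are illustrative only."""
    vram_mb: int
    num_vcpus: int
    num_vnics: int


def config_mismatches(primary: VMConfig, secondary: VMConfig) -> dict:
    """Return {field: (primary_value, secondary_value)} for every differing field."""
    p, s = asdict(primary), asdict(secondary)
    return {name: (p[name], s[name]) for name in p if p[name] != s[name]}


# A secondary created with the same settings shows no mismatches.
primary = VMConfig(vram_mb=8192, num_vcpus=4, num_vnics=2)
secondary = VMConfig(vram_mb=8192, num_vcpus=4, num_vnics=2)
assert config_mismatches(primary, secondary) == {}
```

An empty result means the two configurations agree on every field being compared; any entry names the setting that would need to be reconciled.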
15
Datastores
Primary VM
Disk 2
Config
Disk 1
Secondary VM
Disk 2
Config
Disk 1
17
Datastores
One datastore must be common
Ensures only one running copy of the VM at any time
Primary VM
Disk 2
Config
Disk 1
Secondary VM
Disk 2
Config
Disk 1
Tie Break Datastore
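The idea of a common datastore acting as a tie-breaker can be sketched with an atomic "first writer wins" claim. The slides do not describe the actual mechanism, so everything here is an assumption made for illustration: the `try_claim` function, the `.owner` file naming, and the host names are hypothetical, and `O_EXCL` file creation on a local temporary directory merely stands in for whatever locking the shared datastore provides.

```python
import os
import tempfile


def try_claim(tiebreak_dir: str, vm_name: str, host: str) -> bool:
    """Atomically claim the right to run vm_name; return True if this host won."""
    path = os.path.join(tiebreak_dir, vm_name + ".owner")
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # caller can ever create it.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(host)
    return True


# Two hosts race for the same VM; only the first claim succeeds.
shared = tempfile.mkdtemp()  # stand-in for the common datastore
results = [try_claim(shared, "vm1", h) for h in ("esx-a", "esx-b")]
assert results == [True, False]
```

Because both hosts must go through the same shared location, at most one of them can hold the claim, which is the property the slide calls "only one running copy of the VM at any time".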
18
19
Initial placement of secondary
Not tied to the host
20
- Intel vs AMD
- vMotion compatible
21
Co-located on single Datastore by default
22
23
24
All done!
25
Live Demo
VMware vCenter Server
Central management server
Continuous availability difficult
Multiprocessor FT makes it simple
• Natural fit
26
Live Failover
vSphere Web Client
Continuous availability through server failure
vCenter Server
27
Backing up FT VMs
Support for vStorage APIs for Data Protection (VADP)
• API for non-disruptive snapshots
Many VADP solutions on the market
28
Live Demo Summary
FT in action
• Principles to keep in mind
• Doing backups of FT VMs
• Ensure continuous availability of multiprocessor workloads
• Presented a good solution
• Client oblivious to FT operation
• Zero downtime, zero data loss
• Taste for performance / bandwidth
But that’s not all
29
Performance Numbers
[Bar chart: % Throughput (FT/non FT), higher is better; y-axis 0 to 100. Workloads: Microsoft SQL Server 2-vCPU, Microsoft SQL Server 4-vCPU, Oracle Swingbench 2-vCPU, Oracle Swingbench 4-vCPU]
Similar configuration to vSphere 4 FT Performance Whitepaper
• Models real-world workloads: 60% CPU utilization
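The chart's metric, "% Throughput (FT/non FT)", is a simple ratio. A minimal sketch of the arithmetic follows; the function name and the sample figures are made up for illustration and are not measurements from the talk.

```python
def relative_throughput(ft_throughput: float, non_ft_throughput: float) -> float:
    """Throughput with FT enabled, as a percentage of the same workload without FT."""
    if non_ft_throughput <= 0:
        raise ValueError("non-FT throughput must be positive")
    return 100.0 * ft_throughput / non_ft_throughput


# Hypothetical numbers: 450 transactions/s with FT versus 500 without.
assert relative_throughput(450, 500) == 90.0
```

A value of 100 would mean FT adds no measurable throughput cost for that workload; lower values indicate overhead.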
30
vSphere FT Summary
Why Fault Tolerance
• Continuous availability
Fault Tolerance for multi-processor VMs
• Good solution to impressively hard problem
• A new design
• Demonstrated similar experience to existing vSphere FT
• But more vCPUs
31
Thank you!
Questions?