Elastic vSphere?

DESCRIPTION
This presentation discusses design considerations around the use of stretched clusters with VMware vSphere. It was presented to the Denver VMUG on September 28, 2010.

TRANSCRIPT
Elastic vSphere? Design Considerations for Building Stretched Clusters
Scott Lowe, vSpecialist, EMC Corporation
VCDX #39, VMware vExpert
http://blog.scottlowe.org
http://twitter.com/scott_lowe
Agenda
• Reasons for building stretched clusters
• Storage configurations for stretched clusters
• Design considerations for stretched clusters
• EMC VPLEX in stretched clusters
• Q&A
Reasons for Building Stretched Clusters
• Valid reasons:
– Provide high availability across sites
– Balance workloads across sites
• Invalid reasons:
– Because you can/because it’s cool (is that a valid business justification?)
– Enable vMotion over distance (stretched clusters are not a prerequisite)
– Use vMotion as a DR mechanism (vMotion is not applicable when both ends aren’t up and available)
Storage Configurations for Stretched Clusters
A review of storage configurations to support stretched cluster designs
Stretched SAN Configuration
• Literally just stretching the SAN fabric between locations
• Requires synchronous replication
• Limited in distance to ~100 km in most cases (see the latency sketch after this list)
• Typically read/write in one location, read-only in the second location
• Implementations with only a single storage controller at each location create a SPoF (single point of failure)
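To make the ~100 km figure concrete, here is a back-of-the-envelope Python sketch; the ~200,000 km/s signal speed in fiber is my assumption, not a number from the slides:

```python
# Rough estimate of the latency synchronous replication adds per write,
# counting only propagation delay in fiber (no switch/array overhead).
# Assumption: signal speed in fiber is ~200,000 km/s (about 2/3 of c).

SPEED_IN_FIBER_KM_PER_MS = 200.0

def sync_write_penalty_ms(distance_km: float) -> float:
    """Added latency per write: one round trip between the two sites."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_PER_MS

for km in (10, 50, 100):
    print(f"{km:>3} km between sites -> +{sync_write_penalty_ms(km):.1f} ms per write")
```

At 100 km that is already about 1 ms of pure propagation delay added to every write, before any switch, protocol, or array overhead, which is why synchronous designs rarely stretch much further.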
Stretched SAN Configuration
[Diagram: stretched SAN fabric spanning two sites; storage is read/write at one site and read-only at the other]
Distributed Virtual Storage Configuration
• Leverages new storage technologies to distribute storage across multiple sites
• Requires synchronous mirroring
• Limited in distance to ~100 km in most cases
• Read/write storage in both locations; employs data locality algorithms (see the sketch after this list)
• Typically uses multiple controllers in a scale-out fashion
• Must address “split brain” scenarios
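To picture how this differs from the stretched SAN model, here is a toy Python model (mine, not VPLEX or any vendor's code): both sites see the same read/write volume, reads are served from the local leg, and writes commit to both legs before acknowledging:

```python
# Toy model of a distributed virtual volume: reads exhibit data locality
# (served from the local mirror leg), while writes are synchronously
# mirrored to both legs before the acknowledgement.

class DistributedVolume:
    def __init__(self):
        self.legs = {"site-a": {}, "site-b": {}}  # one mirror leg per site

    def read(self, site: str, block: int):
        # Data locality: the read never crosses the inter-site link.
        return self.legs[site].get(block)

    def write(self, site: str, block: int, data: bytes):
        # Synchronous mirror: both legs commit before the write is acked,
        # so writes (unlike reads) still pay the inter-site round trip.
        for leg in self.legs.values():
            leg[block] = data

vol = DistributedVolume()
vol.write("site-a", 0, b"hello")
print(vol.read("site-b", 0))  # b'hello' -- read/write at both sites
```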
Distributed Virtual Storage Configuration
[Diagram: distributed virtual storage spanning two sites, with read/write storage at both]
EMC VPLEX Overview
• EMC VPLEX falls into the distributed virtual storage category
• Keeps data synchronized between two locations but provides read/write storage simultaneously at both locations
• Uses a scale-out architecture with multiple engines in a cluster and two clusters in a Metro-Plex
• Supports both EMC and non-EMC arrays behind the VPLEX
Preferred Site in VPLEX Metro
• VPLEX Metro provides read/write storage in two locations at the same time (AccessAnywhere)
• In a failure scenario, VPLEX uses “detach rules” to prevent split brain:
– A preferred site is defined on a per-distributed-virtual-volume basis
– The preferred site remains read/write; I/O is suspended at the non-preferred site
• Invoked only by an entire cluster failure, an entire site failure, or a cluster partition (see the sketch after the diagram)
[Diagram: a distributed virtual volume stretched across the IP/FC links of a Metro-Plex; after the failure, the preferred site remains read/write while I/O is suspended at the non-preferred site]
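A minimal Python sketch of the detach-rule behavior described above (my toy model, not the VPLEX implementation): once a site can no longer reach its peer, whether through cluster failure, site failure, or partition, each surviving site applies the rule independently:

```python
# Toy model of a per-volume detach rule: when the peer cluster is
# unreachable, the preferred site keeps the volume read/write and the
# non-preferred site suspends I/O, preventing split brain.

def volume_state(site: str, preferred_site: str, peer_reachable: bool) -> str:
    """State of a distributed virtual volume at `site` after an event."""
    if peer_reachable:
        return "read/write"  # healthy Metro-Plex: active at both sites
    # Peer gone (site/cluster failure) or partitioned: detach rule decides.
    return "read/write" if site == preferred_site else "I/O suspended"

# Inter-site partition: both sites are up but cannot see each other,
# and each applies the rule on its own.
print(volume_state("site-a", preferred_site="site-a", peer_reachable=False))
# -> read/write (preferred site keeps serving I/O)
print(volume_state("site-b", preferred_site="site-a", peer_reachable=False))
# -> I/O suspended (non-preferred site stops, avoiding split brain)
```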
Design Considerations for Stretched Clusters
A review of design considerations and design impacts when using stretched clusters
Stretched Cluster Considerations #1
Consideration: Without read/write storage at both sites, roughly half the VMs incur a storage performance penalty.
• With stretched SAN configurations:
– VMs running in one site are accessing storage in the other site
– Creates additional latency for every I/O operation
• With distributed virtual storage configurations:
– Read/write storage is provided at both sites, so this doesn’t apply
Stretched Cluster Considerations #2
Consideration: Prior to vSphere 4.1, you can’t control HA/DRS behavior.
• With stretched SAN configurations:
– Additional latency is introduced when VM storage resides in the other location
– Storage vMotion is required to remove this latency
• With distributed virtual storage configurations:
– Need to keep cluster behaviors in mind
– Data is accessed locally due to data locality algorithms
Stretched Cluster Considerations #3
Consideration: With vSphere 4.1, you can use DRS host affinity rules to control HA/DRS behavior (see the sketch after this list).
• With all storage configurations:
– Doesn’t address HA primary/secondary node selection
• With stretched SAN configurations:
– Beware of single-controller implementations
– Storage latency is still present in the event of a controller failure
• With distributed virtual storage configurations:
– Plan for cluster failure/cluster partition behaviors
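To illustrate what a host affinity rule buys you in a stretched cluster, here is a small Python sketch; it is a toy model of the placement constraint, not the DRS code or API, and the host and VM names are made up. A non-mandatory "should run on hosts in group" rule keeps a VM on its home site's hosts while still letting HA restart it elsewhere if that whole site is lost:

```python
# Toy model of a DRS "should run on hosts in group" affinity rule in a
# stretched cluster: VMs prefer their home site's hosts, but a
# non-mandatory rule still permits cross-site restart after a site failure.

hosts = {"esx01": "site-a", "esx02": "site-a",
         "esx03": "site-b", "esx04": "site-b"}
vm_home_site = {"app01": "site-a", "db01": "site-b"}

def candidate_hosts(vm: str, up_hosts: set[str], mandatory: bool = False):
    """Hosts on which `vm` may be placed, given which hosts are up."""
    preferred = [h for h in up_hosts if hosts[h] == vm_home_site[vm]]
    if preferred or mandatory:
        return sorted(preferred)
    # Non-mandatory ("should") rule: fall back to any surviving host.
    return sorted(up_hosts)

print(candidate_hosts("app01", set(hosts)))          # ['esx01', 'esx02']
print(candidate_hosts("app01", {"esx03", "esx04"}))  # site A down:
# -> ['esx03', 'esx04'] with a "should" rule; [] with a "must" rule
```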
Stretched Cluster Considerations #4
Consideration: Host affinity rules don’t affect VMware HA admission control.
• With all storage configurations:
– Must configure admission control for a 50% failure in order to guarantee resource availability (see the calculation after this list)
– Can’t configure “per site” admission control rules
– Impacts the reasons people build stretched clusters, especially workload balancing
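The arithmetic behind the 50% figure, as a short Python sketch (the host counts are illustrative): admission control has no notion of a site, so the reservation has to cover the worst case of losing an entire site:

```python
# Why stretched clusters reserve ~50%: HA admission control is site-blind,
# so the reservation must cover the loss of an entire site's hosts.

def required_failover_capacity_pct(hosts_site_a: int, hosts_site_b: int) -> float:
    """Percent of cluster resources to reserve so that the surviving site
    can restart everything after the other site fails entirely."""
    total = hosts_site_a + hosts_site_b
    worst_loss = max(hosts_site_a, hosts_site_b)
    return 100.0 * worst_loss / total

print(required_failover_capacity_pct(4, 4))  # 50.0 -- half the cluster idle
print(required_failover_capacity_pct(5, 3))  # 62.5 -- asymmetry is worse
```

Keeping half (or more) of the cluster in reserve is exactly what undercuts the workload-balancing motivation for stretching the cluster in the first place.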
Stretched Cluster Considerations #5
Consideration: There is no supported way to control VMware HA primary/secondary node selection.
• With all storage configurations:
– Limits cluster size to 8 hosts, 4 in each site (see the sketch after this list)
– das.preferredprimaries is not a supported mechanism for controlling primary/secondary node selection
– Methods for increasing the number of primary nodes are also not supported by VMware
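The 8-host limit follows from a pigeonhole argument: HA in this era elects 5 primary nodes, and if all 5 land in the failed site, no failovers occur. With at most 4 hosts per site, 5 primaries cannot all fit in one site. A brute-force Python check of that argument (my illustration; the 5-primary count is the documented HA behavior of the time):

```python
# Pigeonhole check: with 5 HA primary nodes and at most 4 hosts per site,
# no possible primary election strands all primaries in a single site.

from itertools import combinations

def one_site_can_hold_all_primaries(hosts_per_site: int,
                                    primaries: int = 5) -> bool:
    hosts = [f"a{i}" for i in range(hosts_per_site)] + \
            [f"b{i}" for i in range(hosts_per_site)]
    return any(all(h[0] == "a" for h in chosen)
               for chosen in combinations(hosts, primaries))

print(one_site_can_hold_all_primaries(4))  # False: 4 per site is safe
print(one_site_can_hold_all_primaries(5))  # True: 5 per site can lose HA
```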
Stretched Cluster Considerations #6
Consideration: Stretched HA/DRS clusters require Layer 2 adjacency (or equivalent) at the network layer.
• With all storage configurations:
– Complicates the network infrastructure
– Involves technologies like OTV and VPLS/Layer 2 VPNs
• With stretched SAN configurations:
– Can’t leverage vMotion at distance without storage latency
• With distributed virtual storage configurations:
– Data locality enables vMotion at distance without the storage latency penalty
Stretched Cluster Considerations #7
Consideration: The network lacks site awareness, so stretched clusters introduce new networking challenges.
• With all storage configurations:
– The movement of VMs from one site to another doesn’t update the network
– VM movement causes “horseshoe routing” (LISP, a future networking standard, helps address this)
– You’ll need to use multiple isolation addresses in your VMware HA configuration (see the sketch after this list)
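As an illustration of that last point, here is the shape of the HA advanced options involved. The option names das.isolationaddressN and das.usedefaultisolationaddress are real vSphere 4.x HA settings; the addresses and the Python dict holding them are a hypothetical sketch. Giving HA a pingable isolation address in each site avoids false isolation responses when the inter-site link drops:

```python
# Hypothetical sketch of HA advanced options for a stretched cluster.
# Option names are real vSphere 4.x HA settings; the IPs are placeholders
# for highly available, locally reachable addresses in each site.

ha_advanced_options = {
    # Don't rely solely on the default gateway, which may sit in one site.
    "das.usedefaultisolationaddress": "false",
    "das.isolationaddress1": "10.1.0.1",  # pingable address in site A
    "das.isolationaddress2": "10.2.0.1",  # pingable address in site B
}

for name, value in ha_advanced_options.items():
    print(f"{name} = {value}")
```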
Horseshoe Routing
[Diagram: after a VM moves between sites, inbound traffic still enters through the original site’s router and hairpins across the inter-site link to reach the VM]
Stretched Cluster Recommendations
• Use separate HA/DRS clusters in each datacenter
• Use separate distributed VMFS datastores for each cluster
• Use vMotion to move VMs as needed between clusters:
– Keep preferred/non-preferred site behavior in mind!
– Try to keep related VMs together in a site
– Change detach rules to switch the preferred site if necessary
• A VMware KB article is available discussing HA/DRS clusters with VPLEX (see “For More Information” below)
For More Information…
• VMware support with NetApp MetroCluster: http://kb.vmware.com/kb/1001783
• Using VPLEX Metro with VMware HA: http://kb.vmware.com/kb/1026692
• vMotion over Distance Support with VPLEX Metro: http://kb.vmware.com/kb/1021215
• The Case For and Against Stretched ESX Clusters: http://virtualgeek.typepad.com/virtual_geek/2008/06/the-case-for-an.html
Q&A
THANK YOU