Business Continuity Planning with SQL Server HADR options
Prem Mehra, Program Manager
Microsoft Corporation
Key Takeaways of the Session
• SQL Server 2008 and SQL Server 2008 R2 can meet very high HA and DR requirements
• Upgrades to SQL Server 2008 and to SQL Server 2008 R2 can be achieved with downtime limited to minutes
• Demanding HA and DR deployments require well-documented operational procedures and highly skilled staff
Current Technologies
• Failover Clustering
– Local server redundancy
• Database Mirroring
– Local server & storage redundancy
– Disaster recovery
• Log Shipping
– Additional disaster sites for databases
– App/user error recovery
• Replication
– Database reporting and read scale out with redundancy
• Always On Partner Solutions
– Highest hardware reliability
[Diagram: Production Database on a Failover Cluster, with Database Mirroring (Hot Standby), Log Shipping (Warm Standby), Log Shipping With Restore Delay (App/User Error Recovery), and Replication (Database Scale Out For Queries)]
Proven HA / DR Architectures: Successfully Deployed by Customers
(Table columns: # · Architecture · Key Distinguishing Scenario Characteristics · Use & Deployment Examples)
• 1: Failover Clustering for HA and Database Mirroring for DR
– Key distinguishing scenario characteristics: A) Single data copy for HA sufficient; B) Positive experience with Failover clustering; C) Comfortable deploying two different technologies for HA & DR
– Use & deployment examples: ServiceU and CareGroup
• 2: Synchronous Database Mirroring for HA/DR and Log Shipping for additional DR
– Key distinguishing scenario characteristics: A) Require deploying fewer (only one) technology for HA & DR; B) Avoid costs associated with Failover clustering; C) For HA, remote data center execution acceptable
– Use & deployment examples: bwin
• 3: Geo-Cluster for HA/DR
– Key distinguishing scenario characteristics: A) Require deploying fewer (only one) technology for HA & DR; B) Positive experience with Geo-Clustering
– Use & deployment examples: Edgenet
• 4: Failover Clustering for HA and SAN-based Replication for DR
– Key distinguishing scenario characteristics: A) Require deploying single DR technology across multiple DBMSs; B) A third party DR technology acceptable
– Use & deployment examples: MySpace
• 5: Peer-to-Peer Replication for HA and DR (and reporting)
– Key distinguishing scenario characteristics: A) Simultaneous data manipulation from multiple sites; B) Potential data loss acceptable
– Use & deployment examples: Enterprise in Travel Industry
[Diagram: Memphis Primary Data Center (preferred PRINCIPAL) and Atlanta Standby Data Center (MIRROR). Each site has DNS, a web farm, and SQL Server infrastructure on Windows 2008 / SQL 2008; the two sites are linked by Asynchronous Database Mirroring. Label: DB Connection to Memphis for Regular Test Exercise.]
Upgrade Process
• Set up a temporary cluster (Windows Server 2008 and SQL Server 2008) in the primary data center
• Establish log shipping to temporary cluster
• Break DBM to the DR data center
• Establish DBM from production cluster to temporary cluster (convert LS to DBM)
• Failover to temporary cluster. Temporary cluster is now production
• Break DBM to old production cluster, and rebuild the old production cluster with Windows Server 2008 and SQL Server 2008
• Establish DBM from temporary production cluster to the newly built cluster
• Failover to newly built cluster. New cluster is now production
• Rebuild the old DR cluster with Windows Server 2008 and SQL Server 2008
• Establish log shipping to the newly built DR cluster
• Break DBM to temporary cluster in the primary data center
• Establish DBM from production cluster to new DR cluster (convert LS to DBM)
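The ordering of the upgrade steps can be checked with a small model. The following is a minimal Python sketch, not the customer's tooling: the `Topology` class, its method names, and the cluster names (`primary_cluster`, `dr_cluster`, `temp_cluster`) are all hypothetical. It only tracks which cluster is production, which one is the mirror, and which ones receive log shipping, and it enforces the rule that a database has a single mirror, which is why each "Break DBM" step has to precede the next "Establish DBM" step.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Topology:
    """Toy model: production cluster, its mirror, and log shipping targets."""
    production: str
    mirror: Optional[str] = None
    log_shipping: Set[str] = field(default_factory=set)
    upgraded: Set[str] = field(default_factory=set)

    def establish_log_shipping(self, target: str) -> None:
        self.log_shipping.add(target)

    def break_mirroring(self) -> None:
        self.mirror = None

    def establish_mirroring(self, target: str) -> None:
        # Only one mirror at a time, so the old session must be broken first.
        if self.mirror is not None:
            raise RuntimeError(f"break mirroring to {self.mirror} first")
        # "Convert LS to DBM": a log shipping secondary becomes the mirror.
        self.log_shipping.discard(target)
        self.mirror = target

    def failover(self) -> None:
        if self.mirror is None:
            raise RuntimeError("no mirror to fail over to")
        self.production, self.mirror = self.mirror, self.production

    def rebuild(self, cluster: str) -> None:
        # A cluster can only be rebuilt once nothing depends on it.
        in_use = {self.production, self.mirror} | self.log_shipping
        if cluster in in_use:
            raise RuntimeError(f"{cluster} is still in use")
        self.upgraded.add(cluster)


def run_upgrade() -> Topology:
    """Replay the upgrade steps in the order listed on the slide."""
    t = Topology(production="primary_cluster", mirror="dr_cluster")
    t.upgraded.add("temp_cluster")            # temp cluster is built on 2008
    t.establish_log_shipping("temp_cluster")
    t.break_mirroring()                       # break DBM to DR
    t.establish_mirroring("temp_cluster")     # convert LS to DBM
    t.failover()                              # temp cluster is production
    t.break_mirroring()                       # break DBM to old production
    t.rebuild("primary_cluster")
    t.establish_mirroring("primary_cluster")
    t.failover()                              # rebuilt cluster is production
    t.rebuild("dr_cluster")
    t.establish_log_shipping("dr_cluster")
    t.break_mirroring()                       # break DBM to temp cluster
    t.establish_mirroring("dr_cluster")       # convert LS to DBM
    return t
```

Calling `run_upgrade()` ends with the rebuilt primary cluster as production and the rebuilt DR cluster as its mirror. Reordering the steps, for example rebuilding a cluster before breaking mirroring to it, raises an error in the model.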
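SQL Server Disaster Recovery
[Diagram: applications connect through Cisco Global Site Selector (GSS) DNS to the Principal Server (SQL Server Cluster), which is mirrored to the Mirror Server at the DR Site]
• Cisco Global Site Selector (GSS) DNS: Alias Name = Green; IPs: 100.10.56.30 (active), 100.85.3.10
• Connect to: Green\SQL1
• Principal Server (SQL Server Cluster): SQLNetworkNameA\SQL1, Active IP: 100.10.56.30
• Mirror Server (DR Site): SQLHostNameB\SQL1, Passive IP: 100.85.3.10
• Applications: 1- SharePoint, 2- SSRS, 3- BlackBerry, 4- Citrix Server, 5- VMware VC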
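The role of the alias in this design can be illustrated with a toy model. The Python sketch below is an assumption-laden illustration and does not reflect how Cisco GSS is actually configured or implemented: `SiteSelector`, `resolve`, `fail_over`, and `connect` are hypothetical names. It only shows the idea that applications keep connecting to `Green\SQL1` while the alias is repointed from the active IP to the passive IP.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SiteSelector:
    """Toy stand-in for a DNS alias that points at the active site."""
    alias: str
    active_ip: str
    passive_ip: str

    def resolve(self, name: str) -> str:
        # Host names are compared case-insensitively, as in DNS.
        if name.lower() != self.alias.lower():
            raise LookupError(f"unknown name: {name}")
        return self.active_ip

    def fail_over(self) -> None:
        # Repoint the alias at the other site.
        self.active_ip, self.passive_ip = self.passive_ip, self.active_ip


def connect(selector: SiteSelector, server: str = "Green\\SQL1") -> Tuple[str, str]:
    """Split 'host\\instance' and resolve the host through the alias."""
    host, instance = server.split("\\", 1)
    return selector.resolve(host), instance


gss = SiteSelector(alias="Green", active_ip="100.10.56.30", passive_ip="100.85.3.10")
before = connect(gss)   # resolves to the principal's IP
gss.fail_over()
after = connect(gss)    # same connection target, now the DR site's IP
```

The point of the design is that the connection target used by SharePoint, SSRS, and the other applications does not change when the mirror takes over.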
Upgrading Failover Cluster: To Windows Server 2008 R2 and SQL Server 2008 R2
• Starting point: Windows Server 2003, SQL Server 2005, 6 nodes Cluster
• Each SQL instance has two preferred owners
• Target: Windows Server 2008 R2, SQL Server 2008 R2
[Diagram labels: Evict (x3), Rebuild (x2), Mirroring (between old and new nodes), Borrowed from Server Team, Give back to Server Team]
Infrastructure Scale Up & Zero Data Loss
[Diagram]
• Principal, Mirror
• Log Shipping, 1h delay
• Log Shipping, 2nd copy
• All Log Backups and Full Backups Days 1,3,5…
• All Log Backups and Full Backups Days 2,4,6…
HA Zero Data Loss Solution Remarks
• Zero data loss is higher priority than Availability, so if we can’t harden the transaction to disk in two datacenters, we put our application offline
• If “Principal” fails, we put our application offline, failover to “Mirror”, break the mirror and promote “Log Shipping Copy 2” to be the new mirror.
• If “Mirror” fails, we put our application offline, let “Log Shipping” catch up and promote it to be the new mirror
• If either of the log shipping secondaries fails, we continue operation
• One of the log shipping secondaries has a one-hour delay so that we can fix human or application errors (like deleting data) quickly; if we do not detect deleted data within an hour, we have to restore one of our backups
• Backup Infrastructure: Each datacenter has one file server optimized to hold large files (but just a few < 10,000) and one to hold small files (but many of them > 1,000,000)
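The failure handling described in these remarks can be written down as a lookup. The following is a minimal Python sketch of the stated rules only; the function names, the role keys, and the action strings are hypothetical, and it performs no real SQL Server operations.

```python
from typing import Dict, List

RESTORE_DELAY_MIN = 60  # the delayed log shipping secondary runs one hour behind

# Actions per failed component, taken from the remarks above.
_PLANS: Dict[str, List[str]] = {
    "principal": [
        "take application offline",
        "fail over to Mirror",
        "break the mirror",
        "promote Log Shipping Copy 2 to new mirror",
    ],
    "mirror": [
        "take application offline",
        "let Log Shipping catch up",
        "promote Log Shipping to new mirror",
    ],
    "log_shipping_secondary": ["continue operation"],
}


def zero_data_loss_plan(failed: str) -> List[str]:
    """Return the ordered actions for a failed component."""
    if failed not in _PLANS:
        raise ValueError(f"unknown component: {failed}")
    return list(_PLANS[failed])


def error_recovery_source(minutes_since_error: float) -> str:
    """Pick where to recover deleted data from, based on detection time."""
    if minutes_since_error < RESTORE_DELAY_MIN:
        return "delayed log shipping secondary"
    return "restore from backup"
```

For example, `error_recovery_source(30)` points at the delayed secondary, while an error noticed after 90 minutes falls back to a backup restore.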
Infrastructure Scale Up and High Availability
[Diagram]
• Principal, Mirror
• Log Shipping, 1h delay
• Log Shipping, 2nd copy
• All Log Backups and Full Backups Days 1,3,5…
• All Log Backups and Full Backups Days 2,4,6…
High Availability Solution Remarks
• Priority is Availability, with the theoretical possibility of losing some data
• “Principal” runs synchronous database mirroring to “Mirror”, and a Witness watches them both
• If “Principal” fails, we automatically fail over to the mirror. A scheduled SQL Server Agent script then assesses the situation; if the failed server does not come back online within a few minutes, it breaks the mirroring session and promotes “Log Shipping Copy 2” to be the new mirror.
• If the second data center fails, we go offline. A scheduled SQL Server Agent script then assesses the situation; if the failed server does not come back online within a few minutes (we give it more time than the principal), it breaks the mirroring session, lets “Log Shipping” catch up, and promotes it to be the new mirror.
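The decision the scheduled script makes can be sketched as a pure function. This is a hypothetical Python illustration, not the actual SQL Server Agent script: the function name, the role keys, and in particular the two grace periods are assumptions, since the source only says "a few minutes" and that the second data center gets more time than the principal.

```python
from typing import Dict, List, Union

# Assumed values; the source does not give the actual grace periods.
PRINCIPAL_GRACE_MIN = 5
SECOND_DC_GRACE_MIN = 10

Decision = Dict[str, Union[str, List[str]]]


def agent_check(failed: str, minutes_down: float) -> Decision:
    """Decide what the scheduled check should do for a failed component."""
    if failed == "principal":
        application = "online after automatic failover to Mirror"
        grace = PRINCIPAL_GRACE_MIN
        recovery = [
            "break mirroring session",
            "promote Log Shipping Copy 2 to new mirror",
        ]
    elif failed == "second_data_center":
        application = "offline"
        grace = SECOND_DC_GRACE_MIN
        recovery = [
            "break mirroring session",
            "let Log Shipping catch up",
            "promote Log Shipping to new mirror",
        ]
    else:
        raise ValueError(f"unknown component: {failed}")

    # Within the grace period the script only waits for the server to return.
    actions = ["wait"] if minutes_down <= grace else recovery
    return {"application": application, "actions": actions}
```

With these assumed thresholds, `agent_check("principal", 2)` only waits, while `agent_check("principal", 8)` breaks the session and promotes the second log shipping copy.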
Cluster Diagram
[Diagram: geo-cluster spanning Milwaukee, WI and Atlanta, GA (850 miles apart), with one Active SQL Server Cluster Node and one Passive SQL Server Cluster Node]
• Link between sites: 300 Mb Ethernet, Stretch VLAN 10.10.10.0/24
• Replication: Asynchronous Replication with EMC RecoverPoint CE, RecoverPoint Appliance(s) at each site
• Server at each site: NEC Express 5800/A1160 MX, 4 Socket – Hex Core (24 Cores), Xeon X7460 – 2.66 GHz, 128 GB RAM
• Software at each site: SQL Server 2008 R2 Enterprise, Windows Server 2008 R2 Enterprise
• SAN fabric at each site: Brocade 5300 8 Gb Switch
• SAN at each site: EMC Clariion CX4-80, 15k Fibre Channel Disk
Moving Databases to New Datacenter: Preparation
[Diagram: Old Data Center (Still Live in Production!) → Snap, Replicate Data → New Data Center]
• On the Target Server
– Add the SAN snapped drives as mount points (eliminates the Drive Letter limitation by allowing as many drives as needed)
– Install SQL Server on the local drive
– Place the System Databases on the SAN Drive under a mount point
– Attach the User Databases, create the jobs, monitoring scripts, etc.
– Verify connectivity by running all the SELECT Stored Procedures
• Add the Target Server to the cluster
• Run the verification procedure on the snapped drives
– Saw a saving of over 2 hours per server
Moving Databases to New Datacenter: After the Prep is Done
• Make another copy of the volumes on the Source Server and replicate them to the Target Server
• Here is the Tricky Part
– Stop SQL Server
– Remove the volumes used to prep the server
– Add the latest volumes with EXACTLY the same layout as the prep volumes
– Start SQL Server
– Test connectivity, call the SPROCs which run SELECT statements
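Because the swap depends on the latest volumes having exactly the same layout as the prep volumes, a file-list comparison before starting SQL Server can catch mistakes. The following is a minimal Python sketch under stated assumptions: `layout_diff` is a hypothetical helper, the file lists are assumed to have been captured separately from the prep and latest volumes, and the example paths are made up. Paths are compared with Windows semantics (case-insensitive).

```python
from pathlib import PureWindowsPath
from typing import Iterable, List, Tuple


def layout_diff(prep_paths: Iterable[str],
                latest_paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) database file paths.

    missing:    files present on the prep volumes but not on the latest ones
    unexpected: files present on the latest volumes but not on the prep ones
    """
    # PureWindowsPath compares case-insensitively, like Windows file systems.
    prep = {PureWindowsPath(p) for p in prep_paths}
    latest = {PureWindowsPath(p) for p in latest_paths}
    missing = sorted(str(p) for p in prep - latest)
    unexpected = sorted(str(p) for p in latest - prep)
    return missing, unexpected


# Hypothetical example: one data file ended up under a different mount point.
prep_files = ["M:\\Data1\\Sales.mdf", "M:\\Log1\\Sales.ldf"]
latest_files = ["M:\\Data2\\Sales.mdf", "m:\\log1\\sales.ldf"]
missing, unexpected = layout_diff(prep_files, latest_files)
```

Both lists being empty would indicate that every file sits where the prepared instance expects it; anything else is worth fixing before SQL Server is started.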
Moving Databases to New Datacenter: Why Does This Work?
• Trickery & Shenanigans
– The sys.sysDatabases stores only where the MDF file is located, so as long as the MDF is where sys.sysDatabases says, no issues will exist
– Since the sys.sysFiles data is stored in each database, the switch can take place seamlessly
Topology Deployed
[Diagram: replication topology across Asia and America]
• Nodes: ASIA CORE 1, ASIA CORE 2, America CORE 1, America CORE 2, Data Warehouse, ASIA Web (x2), America Web (x2), Read Only Copy
• Replication links: P2P Reference, P2P Financial, Tran Reference, Tran Financial, Web Publication
• Hardware
– Asia Core: IBM x3850 2x6 64 GB
– Asia DW: IBM x3850 2x6 128 GB
– America Core: HP DL380 G5’s 2x4 64 GB
– Web Servers: IBM x3650 1x4 8 GB
© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.