TRANSCRIPT
SnapMirror Business Continuity (SM-BC)
Tech Data
Manish Thakur, Principal Product Manager
Cheryl George, Technical Marketing
March 2021
© 2021 NetApp, Inc. All rights reserved. NETAPP CONFIDENTIAL
Agenda
• Evolving with business continuity
• Introducing the new business continuity solution, SM-BC
• Failure scenarios
• The Right BC Solution for You
Business Continuity and Disaster Recovery (BCDR) Technology: NetApp SnapMirror
[Diagram: NetApp® SnapMirror® connects production enterprise applications to secondary storage, tape/VTL (through SMtape), and public cloud, in normal and failover modes. BCDR callouts: data loss avoidance, rapid application recovery, simplicity in management and orchestration.]
NetApp INSIGHT © 2020 NetApp, Inc. All rights reserved. NetApp Confidential – Limited Use Only
What Is Business Continuity?
The ability of an application to fail over to a secondary copy in storage, without application reconnect or user disruption.
[Figure: timeline around a failure event, with an axis running from -36 to 72. Data loss is measured before the event and recovery time after it. Technologies plotted: tape/VTL vaulting, tape/VTL backup, D2D/D2C backup, async replication, and sync replication, grouped under backup and recovery, disaster recovery, and business continuity (sync replication).]
Recovery Point Objective (RPO): time period (amount) of lost data
Recovery Time Objective (RTO): time required to resume business
1. Guaranteed zero data loss (RPO = 0)
2. Transparent application failover to prevent application disruption (RTO = 0)
3. Granular application availability
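To make the two objectives concrete, here is a minimal sketch that computes the achieved recovery point and recovery time for one failure event. The function name, the timestamps, and the 15-minute and 2-hour figures are illustrative assumptions, not values from the slides or from any NetApp tool.

```python
from datetime import datetime, timedelta
from typing import Tuple


def recovery_metrics(last_replicated: datetime,
                     failure: datetime,
                     service_resumed: datetime) -> Tuple[timedelta, timedelta]:
    """Return (data loss window, downtime) for one failure event.

    All three timestamps are hypothetical inputs used for illustration.
    """
    rpo = failure - last_replicated      # writes made in this window are lost
    rto = service_resumed - failure      # time during which business is interrupted
    return rpo, rto


event = datetime(2021, 3, 1, 12, 0, 0)

# Asynchronous replication: last transfer 15 minutes before the event,
# application restarted 2 hours after it (made-up numbers).
async_rpo, async_rto = recovery_metrics(event - timedelta(minutes=15),
                                        event,
                                        event + timedelta(hours=2))

# Synchronous replication: every acknowledged write is already on the mirror,
# so the last replicated point coincides with the event itself.
sync_rpo, sync_rto = recovery_metrics(event, event, event + timedelta(seconds=30))

print("async:", async_rpo, async_rto)
print("sync: ", sync_rpo, sync_rto)
```

The synchronous case yields a zero data loss window by construction, which is the RPO = 0 property the slide attributes to sync replication.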
Evolving with business continuity
Design an Optimal Solution: asymmetric configuration and value from mirror copy
• Consolidate all your workloads on a single cluster
• Protect only the critical SAN workloads
• Secondary cluster hosts the mirror as well as other workloads
• Primary cluster can be high end, with an entry-level system on the secondary
• Leverage mirror copy for development and testing
[Diagram: SnapMirror® Business Continuity between the production application at Site A and the mirror copy at Site B.]
SnapMirror Business Continuity (SM-BC) Overview: granular business continuity solution for SAN applications
• No new license: part of the data protection bundle
• SAN protocols: FC, iSCSI
• Highly resilient: external mediator for transparent failover
• Continuous availability: active workloads on both clusters
• Platform flexibility: any 2-node AFF or ASA clusters
• Easy administration: SnapMirror® simplicity
• Application support: consistency group for monolithic and distributed applications
• Setup simplicity: ONTAP® System Manager simplicity
Continuous Availability for the Application: synchronous replication with transparent application failover
[Diagram: enterprise applications use a primary consistency group (CG) in a storage virtual machine (SVM) at the primary datacenter. SnapMirror Synchronous replicates it to a mirror copy in an SVM at the secondary datacenter. Unmirrored volumes sit outside the relationship. Modes shown: normal and automated failover.]
• Automated failover of business-critical applications
• Simplified application management with granular control
• Consistency Group (CG) for dependent write-order consistency
• Flexibility of DR Test for each application
• Create instantaneous clones of mirror, without impacting application availability
• Optimize cost by using existing ONTAP 9.8 AFF clusters
• Ease of management with intuitive workflows
SnapMirror Business Continuity (SM-BC): transparent application failover for SAN only
[Diagram: enterprise applications see a single virtual device. A primary consistency group in a storage virtual machine (SVM) at the primary data center is replicated with SnapMirror Synchronous to a mirrored consistency group in an SVM at the secondary data center. Unmirrored volumes sit outside the relationship. An ONTAP Mediator runs at a third site. Modes shown: normal and automated failover.]
1. NetApp SnapMirror® Synchronous technology for RPO 0
2. Consistency group for application granularity
3. LUN identity is the same on both sides
4. Transparent application failover
ONTAP Mediator 1.2: enables automated failover during disaster
• Linux physical or virtual server running RHEL or CentOS 7.6 – 8.2
• Establishes quorum
  • Both ONTAP® storage systems (including nodes) periodically send a heartbeat and the status of replication
• Avoids split-brain
  • Scenario in which each cluster may simultaneously try to become the master
• Orchestrates automated failover upon detection of failure
  • Switchover of application host with RTO <= 120 seconds
• ONTAP Mediator can manage a total of 10 cluster pairs
[Diagram: the primary and secondary datacenters report replication status to the ONTAP Mediator at a third site.]
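The heartbeat and quorum role described for the mediator can be sketched as below. This is a toy model, not the ONTAP Mediator protocol: the class name, the 10-second timeout, and the rule that the mirror must have last reported an in-sync state before promotion are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

HEARTBEAT_TIMEOUT_S = 10.0  # hypothetical value; the slides do not give one


@dataclass
class ClusterStatus:
    last_heartbeat: float   # seconds on any monotonic clock
    in_sync: bool           # replication status sent along with the heartbeat


@dataclass
class MediatorSketch:
    master: str
    mirror: str
    status: Dict[str, ClusterStatus] = field(default_factory=dict)

    def heartbeat(self, cluster: str, now: float, in_sync: bool) -> None:
        """Record a heartbeat and replication status from one cluster."""
        self.status[cluster] = ClusterStatus(now, in_sync)

    def is_alive(self, cluster: str, now: float) -> bool:
        s = self.status.get(cluster)
        return s is not None and now - s.last_heartbeat <= HEARTBEAT_TIMEOUT_S

    def failover_target(self, now: float) -> Optional[str]:
        """Return the cluster to promote, or None if no failover is warranted."""
        if self.is_alive(self.master, now):
            return None                       # master keeps serving I/O
        mirror = self.status.get(self.mirror)
        if mirror is not None and self.is_alive(self.mirror, now) and mirror.in_sync:
            return self.mirror                # a single arbiter promotes one side only
        return None


m = MediatorSketch(master="C1", mirror="C2")
m.heartbeat("C1", now=0.0, in_sync=True)
m.heartbeat("C2", now=0.0, in_sync=True)
before = m.failover_target(now=5.0)    # both clusters recently seen

m.heartbeat("C2", now=12.0, in_sync=True)   # C1 has gone silent
after = m.failover_target(now=15.0)
print(before, after)
```

Because the decision is taken by one arbiter at a third site, the two clusters never promote themselves independently, which is the split-brain case the slide describes.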
SM-BC Relationship: relationship between two consistency groups (CG)
• CG preserves write order across volumes
• Relationship is between CGs across 2 clusters
• CG is categorized as master or mirror
  • Master CG is preferred to serve I/O if both copies are available and connectivity between clusters is lost
• All volumes within a CG are part of the relationship
• A cluster can have multiple master or mirror copy CGs
• SVMs containing the related CGs can have different namespaces and LIFs
[Diagram: on Cluster A, the Prod storage virtual machine holds the master CG and unmirrored volumes. On Cluster B, the NDR storage virtual machine holds the mirror CG. Each volume contains one or more LUNs.]
Typical data layout for enterprise applications: Oracle, Microsoft SQL Server, vMSC, etc.
[Diagram: one SVM containing the volumes below.]
FlexVol 1 (datafiles) and FlexVol 2 (log files)
§ Typical enterprise application layout
§ Uses a dedicated vol/LUN for each database
§ Suitable for large critical databases
§ Better for BC, DR and cloning purposes
FlexVol 3 (datafiles on LUN1, log files on LUN2)
§ Small to medium-size non-critical databases
§ Better consolidation
§ Loose application granularity
FlexVol 4 (VMs)
§ Virtualization
FlexVol 5 (binaries)
§ Other application related files
Fan-out Enterprise Application Protection
[Diagram: a primary consistency group (data, logs, VMs) in the production SVM at the primary datacenter is protected by SM-BC, using SnapMirror Synchronous (SM-S) replication, to a mirrored CG in a stand-by SVM at the local disaster recovery datacenter. Distance is ~150 km max with round-trip time (RTT) < 10 ms. Asynchronous SnapMirror protects a further stand-by SVM at the remote disaster recovery datacenter. FlexClone is also shown.]
Manual step: map the mirror LUNs to the respective initiator group.
Host Path to LUNs: ALUA provides the host with the optimal path to the LUN
• Active optimized path to the LUN is through the cluster hosting the master CG
• Both copies of the LUN have the same ID, so the host sees a single virtual LUN
• LUN can be accessed for read/write through the mirror CG
  • Host has an active non-optimized path to it
• Access through the mirror CG is redirected to the master CG
  • This increases the latency perceived by the host
[Diagram: the host uses ALUA across two SVMs. The path to the master CG on Cluster A is active optimized. The path to the mirror CG on Cluster B is active non-optimized and uses a proxy path to the master. The LUN ID is common across copies.]
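The path states above can be modeled in a few lines. This is a minimal sketch of how a host multipath policy might prefer optimized paths. The types and function names are hypothetical and do not correspond to any real multipath driver API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class AluaState(Enum):
    ACTIVE_OPTIMIZED = "active/optimized"
    ACTIVE_NON_OPTIMIZED = "active/non-optimized"


@dataclass(frozen=True)
class Path:
    cluster: str
    state: AluaState


def paths_for_lun(master_cluster: str, clusters: List[str]) -> List[Path]:
    """Both clusters expose the same LUN identity; only the path state differs."""
    return [
        Path(c, AluaState.ACTIVE_OPTIMIZED if c == master_cluster
             else AluaState.ACTIVE_NON_OPTIMIZED)
        for c in clusters
    ]


def preferred_paths(paths: List[Path]) -> List[Path]:
    """Use optimized paths when any exist, otherwise fall back to the rest."""
    optimized = [p for p in paths if p.state is AluaState.ACTIVE_OPTIMIZED]
    return optimized or paths


paths = paths_for_lun("Cluster A", ["Cluster A", "Cluster B"])
chosen = preferred_paths(paths)
print([p.cluster for p in chosen])
```

The non-optimized path stays usable, which is why I/O through the mirror CG still works but is proxied to the master at a latency cost.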
SM-BC Replication: a host write op completes only when the NVLOG on both clusters has been written
• A write to a master CG results in a write to the mirror cluster
• Master and mirror designation of the CG is meant to avoid split-brain
• Host receives an ACK only after the write reaches both master and mirror
• Periodic CG snapshot is scheduled on both clusters
• It is optimal for the host to read from and write to the master CG
[Diagram 1: host write to a CG on the cluster hosting the master copy (Cluster A to Cluster B, via ALUA). Labeled steps: Host Write (1), ONTAP WR (2), Mirror WR (2), ONTAP WR (3), R ACK (4), ACK (5), between master CG NVRAM and mirror CG NVRAM.]
[Diagram 2: host write to a CG on the cluster hosting the mirror copy (Cluster B to Cluster A, via ALUA). Labeled steps: Host Write (1), Proxy Write (2), Write (3), Write (4), ACK (5), R ACK (6), ACK (7), between mirror CG NVRAM and master CG NVRAM.]
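The acknowledgment rule can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions (no failures, no ordering of the numbered diagram steps). The `Cluster` class and `host_write` function are hypothetical names, not ONTAP interfaces.

```python
from typing import Dict, List


class Cluster:
    """Toy stand-in for a cluster with a nonvolatile write log."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.nvlog: List[bytes] = []

    def log_write(self, data: bytes) -> bool:
        self.nvlog.append(data)
        return True


def host_write(entry: Cluster, master: Cluster, mirror: Cluster,
               data: bytes) -> Dict[str, object]:
    """Return the hops taken and whether the host receives an ACK."""
    hops: List[str] = []
    if entry is mirror:
        # Non-optimized path: the write is first proxied to the master.
        hops.append("proxy to master")
    ok_master = master.log_write(data)
    hops.append("master NVLOG")
    ok_mirror = mirror.log_write(data)
    hops.append("mirror NVLOG")
    # The host is acknowledged only when both logs hold the write.
    return {"ack": ok_master and ok_mirror, "hops": hops}


a, b = Cluster("Cluster A"), Cluster("Cluster B")
direct = host_write(entry=a, master=a, mirror=b, data=b"x")
proxied = host_write(entry=b, master=a, mirror=b, data=b"y")
print(direct)
print(proxied)
```

The proxied write takes one more inter-cluster hop than the direct one, which matches the slide's point that writing to the master CG is optimal.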
SM-BC Transparent Failover on a Disaster: the mediator initiates a transparent failover
• Automated failover is initiated by the mediator in case of disaster
• Mirror CG is changed to master after in-flight write ops are committed
• Sole path remains active optimized
  • Formerly non-optimized paths become optimized
• Replication from the master CG is suspended
[Diagram: before the disaster, the master CG on Cluster A (active optimized) and the mirror CG on Cluster B (active non-optimized) are in sync. After the failover, the CG on Cluster B is the master and its path is active optimized.]
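The role change can be sketched as a state transition. The model below assumes the transition is a simple swap of roles with replication suspended. The class, the function names, and the explicit `inflight_committed` flag are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CgRelationship:
    master: str
    mirror: str
    replicating: bool = True


def path_states(rel: CgRelationship, reachable: List[str]) -> Dict[str, str]:
    """Path state the host sees for each reachable cluster."""
    return {
        c: "active/optimized" if c == rel.master else "active/non-optimized"
        for c in reachable
    }


def unplanned_failover(rel: CgRelationship,
                       inflight_committed: bool) -> CgRelationship:
    """Promote the mirror once in-flight writes are committed."""
    if not inflight_committed:
        raise RuntimeError("in-flight writes must be committed before the role change")
    # Roles swap and replication stops until the failed site returns.
    return CgRelationship(master=rel.mirror, mirror=rel.master, replicating=False)


rel = CgRelationship(master="Cluster A", mirror="Cluster B")
before = path_states(rel, ["Cluster A", "Cluster B"])

rel_after = unplanned_failover(rel, inflight_committed=True)
after = path_states(rel_after, ["Cluster B"])   # Cluster A is down
print(before)
print(after)
```

The host keeps the same LUN identity throughout; only the state of the surviving path changes.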
Clustered SAN application: VMware vSphere Metro Storage Cluster – Uniform host access configuration
• Active workloads are served simultaneously from both clusters
  • Bidirectional replication
• Best practice:
  • Plan for VMs to be stored in separate datastores, locally
  • Otherwise,
    • A VM from the secondary site will incur
      • 2x Round Trip Time (RTT) for writes
      • 1x RTT for reads
    • Then, perform a planned failover (PFO) to avoid using the proxy path
[Diagram: a vSphere stretched ESX cluster with hosts ESXi1 to ESXi4. VM1 runs at Site A and VM2 at Site B. VM1-vmdk and VM2-vmdk each exist at both sites.]
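The RTT multipliers quoted above translate into a simple calculation. The sketch below uses an assumed RTT of 4 ms purely as an example; the function name is hypothetical.

```python
# Inter-site round trips incurred by a VM whose datastore master copy
# is at the other site (figures taken from the slide: 2x for writes, 1x for reads).
PROXY_PATH_ROUND_TRIPS = {"write": 2, "read": 1}


def proxy_path_latency_ms(rtt_ms: float, op: str) -> float:
    """Inter-site latency incurred per operation on the proxy path."""
    return PROXY_PATH_ROUND_TRIPS[op] * rtt_ms


rtt_ms = 4.0  # example value, chosen for illustration only
write_penalty = proxy_path_latency_ms(rtt_ms, "write")
read_penalty = proxy_path_latency_ms(rtt_ms, "read")
print(f"write: {write_penalty} ms, read: {read_penalty} ms")
```

Keeping each VM's datastore local to its site, or running a planned failover when a VM moves, avoids paying these round trips on every I/O.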
Failure scenarios
Steady State: Normal Operations
[Diagram: a host with paths to clusters C1 (primary) and C2 (secondary), nodes N1 to N4, and an ONTAP Mediator. SVMs VS1 and VS2 hold LUN copies L1P and L1S, which are in sync. One path is active optimized (AO); the others are active non-optimized (ANO).]
Parameter | Details
Host access to storage | C1 and C2
SM-BC relationship state | In Sync
ONTAP Mediator failure
[Diagram: Mediator, C1 (primary), C2 (secondary), with failure points 1 and 2 marked.]
• ONTAP Mediator failure
  1. VM
  2. Link (single or double)
Parameter | Details
SM-BC action upon failure | NA
Host access to storage | C1, C2 (no disruption to I/O)
SM-BC relationship state | In Sync
Failover operation | Not possible (planned or unplanned failover)
Replication Link failure: link failure between the sites (split-brain scenario)
[Diagram: Mediator, C1 (primary), C2 (secondary).]
§ Replication link failure
  1. Transient
  2. Persistent (tries 3 times every 3 seconds = 9 seconds)
Parameter | Details
SM-BC action upon failure | No action
Host access to storage | C1 after consensus
SM-BC relationship state | Out of sync
Failover operation | NA
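The transient versus persistent distinction can be sketched as a bounded retry check. The retry count and interval come from the slide. How the link is actually probed, and the assumption that the first retry happens one interval after the failure, are not stated in the source and are illustrative only.

```python
from typing import List, Tuple

RETRIES = 3            # from the slide: tries 3 times
RETRY_INTERVAL_S = 3   # from the slide: every 3 seconds


def classify_link_failure(probe_results: List[bool]) -> Tuple[str, int]:
    """Classify a link failure from successive retry outcomes.

    probe_results holds one boolean per retry (True = link is back).
    Returns the classification and the seconds elapsed when it was made.
    """
    for attempt, link_ok in enumerate(probe_results[:RETRIES], start=1):
        if link_ok:
            return "transient", attempt * RETRY_INTERVAL_S
    return "persistent", RETRIES * RETRY_INTERVAL_S


recovered = classify_link_failure([False, True, True])
lost = classify_link_failure([False, False, False])
print(recovered, lost)
```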
Disaster at Site A
[Diagram: Mediator, C1 (primary), C2 (secondary).]
§ SM-BC detects failure at primary and triggers an automatic failover
§ When C1 recovers, automatic resync completes to bring the C2->C1 relationship “In Sync”
§ Planned failback restores normal steady-state operations
Parameter | Details
SM-BC action upon failure | Automatic unplanned failover (AUFO)
Host access to storage | C2 after consensus (mirror copy -> active copy)
SM-BC relationship state | Out of sync
Failover operation | Possible
Disaster at Site B
[Diagram: Mediator, C1 (primary), C2 (secondary).]
§ When C2 recovers, automatic resync completes to bring the C1->C2 relationship “In Sync”
Parameter | Details
SM-BC action upon failure | No action
Host access to storage | C1 after consensus
SM-BC relationship state | Out of sync
Failover operation | NA
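The four failure scenarios can be collected into one lookup table, which makes them easy to compare. The values are copied from the per-scenario tables in this section. The structure and key names are an illustrative choice.

```python
from typing import Dict, List, NamedTuple


class Outcome(NamedTuple):
    action: str               # SM-BC action upon failure
    host_access: str          # cluster(s) serving host I/O
    relationship_state: str   # SM-BC relationship state
    failover: str             # failover operation


SCENARIOS: Dict[str, Outcome] = {
    "mediator_failure": Outcome(
        "NA", "C1, C2", "In Sync", "Not possible"),
    "replication_link_failure": Outcome(
        "No action", "C1", "Out of sync", "NA"),
    "site_a_disaster": Outcome(
        "Automatic unplanned failover (AUFO)", "C2", "Out of sync", "Possible"),
    "site_b_disaster": Outcome(
        "No action", "C1", "Out of sync", "NA"),
}


def scenarios_served_by(cluster: str) -> List[str]:
    """List the scenarios in which the given cluster serves host I/O."""
    return [name for name, o in SCENARIOS.items()
            if cluster in o.host_access.split(", ")]


served_by_c2 = scenarios_served_by("C2")
triggers_failover = [n for n, o in SCENARIOS.items() if "failover" in o.action]
print(served_by_c2)
print(triggers_failover)
```

Only the loss of the primary site triggers an automatic failover. In every other case the primary cluster C1 keeps serving I/O.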
The Right Solution for You
Integrated ONTAP Data Protection: business continuity
Continuous availability
§ RPO 0
§ RTO 0
SnapMirror Business Continuity (SM-BC): select SAN applications protected
MetroCluster (MCC): all workloads protected
Key takeaways
• Zero data loss: SnapMirror Synchronous (SM-S) to achieve zero recovery point objective (RPO)
• Application granularity: consistency group (CG) to maintain dependent write-order consistency
• Transparent application failover: near-zero recovery time objective (RTO) for continuous application availability
NetApp unlocks the best of cloud