2008 SRDF/Star for Open Systems SRG
TRANSCRIPT
Copyright © 2008 EMC Corporation. Do not Copy - All Rights Reserved.
SRDF/Star for Open Systems - 1
© 2008 EMC Corporation. All rights reserved.
SRDF/Star for Open Systems
Welcome to SRDF/Star for Open Systems Training.
Copyright © 2008 EMC Corporation. All rights reserved.

These materials may not be copied without EMC's written consent. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

EMC2, EMC, EMC ControlCenter, AlphaStor, ApplicationXtender, Captiva, Catalog Solution, Celerra, CentraStar, CLARalert, CLARiiON, ClientPak, Connectrix, Co-StandbyServer, Dantz, Direct Matrix Architecture, DiskXtender, DiskXtender 2000, Documentum, EmailXaminer, EmailXtender, EmailXtract, eRoom, FLARE, HighRoad, InputAccel, Navisphere, OpenScale, PowerPath, Rainfinity, RepliStor, ResourcePak, Retrospect, Smarts, SnapShotServer, SnapView/IP, SRDF, Symmetrix, TimeFinder, VisualSAN, VSAM-Assist, WebXtender, where information lives, Xtender, Xtender Solutions are registered trademarks; and EMC Developers Program, EMC OnCourse, EMC Proven, EMC Snap, EMC Storage Administrator, Acartus, Access Logix, ArchiveXtender, Authentic Problems, Automated Resource Manager, AutoStart, AutoSwap, AVALONidm, C-Clip, Celerra Replicator, Centera, CLARevent, Codebook Correlation Technology, EMC Common Information Model, CopyCross, CopyPoint, DatabaseXtender, Direct Matrix, EDM, E-Lab, Enginuity, FarPoint, Global File Virtualization, Graphic Visualization, InfoMover, Infoscape, Invista, Max Retriever, MediaStor, MirrorView, NetWin, NetWorker, nLayers, OnAlert, Powerlink, PowerSnap, RecoverPoint, RepliCare, SafeLine, SAN Advisor, SAN Copy, SAN Manager, SDMS, SnapImage, SnapSure, SnapView, StorageScope, SupportMate, SymmAPI, SymmEnabler, Symmetrix DMX, UltraPoint, UltraScale, Viewlets, VisualSRM are trademarks of EMC Corporation.
All other trademarks used herein are the property of their respective owners.
Course Objectives

Upon completion of this course, you will be able to:
List the benefits of SRDF/Star over other replication technologies
Explain the underlying technologies for SRDF/Star:
– Synchronous SRDF consistency groups using RDF-ECA
– SRDF/A Multi-session Consistency (MSC)
– Special SRDF features in support of Star
Explain Concurrent and Cascaded SRDF/Star concepts
Describe the steps needed to perform:
– Normal Operation
– Transient Fault
– Unplanned switch caused by a major outage
The objectives for this course are shown here. Please take a moment to read them.
Course Outline
Module 1 – Introduction and Overview
Module 2 – Underlying technologies that support SRDF/Star
– Existing SRDF features
– Consistency technology and SRDF
– Special RDF and SDDF capabilities in support of SRDF/Star
Module 3 – Using SRDF/Star
The outline for this course is shown here. Please take a moment to review it.
Introduction and Overview of SRDF/Star

Upon completion of this module, you will be able to:
Describe two kinds of Star configurations
Describe the origins of Star and its place in the SRDF family
The objectives for this module are shown here. Please take a moment to read them.
The History of SRDF/Star
2001: Special version of SRDF/AR for a US company
2003: SRDF/A and concurrent SRDF/S and SRDF/A released
2003: A European company in a similar business visited the American company
– After their visit, the Europeans met with EMC Engineering and requested features that led to Concurrent Star
2004: Release of Concurrent SRDF/Star on Mainframes
2005: Release of Concurrent Star on Open Systems
2008: Release of Cascaded SRDF/Star on Open Systems
In 2001, EMC built a special version of a multi-hop Mainframe SRDF/AR (known at the time as SAR, which stood for Symmetrix Automated Replication), for a New York-based financial services company. This version of Mainframe SRDF/AR maintained a differential relationship between the source (site A) and the remote site (site C). If the bunker site (site B) failed, a full resynchronization between A and C was no longer necessary. By virtue of using DeltaMark (SDDF) sessions, an incremental relationship could be maintained between site A and site C.
In 2003, SRDF/A was released and concurrent SRDF/A and SRDF/S became possible. When a large European company (in the same line of business as the American company) paid a visit to their friends in New York, they got the idea for a product with the functionality of Star. Late in 2003, the Europeans came to Hopkinton and had a long meeting with EMC Engineering in which they outlined their requirements.
EMC decided to implement a product as specified by the European customers and call it STAR, which was supposed to be an acronym for Symmetrix Triangulated Automated Replication. It took two years from the first conversation with the customer, including 18 months of development, to produce a GA version of Star on Mainframe in 2004. The Open Systems version was released in 2005. To conform to EMC’s naming architecture for the SRDF products, the name SRDF/Star was chosen.
In 2008 when Cascaded SRDF was released, Star functionality was enhanced to support this feature.
Concurrent SRDF/Star
3-site disaster recovery over extended distances
Concurrent SRDF: source to two concurrent targets
SRDF link between two targets in standby mode
Synchronous and Asynchronous targets can be differentially synchronized if Workload site fails
[Slide diagram: The Workload Site replicates via SRDF/S (< 200 km) to a Nearby Synchronous Target (short distance, zero data lag) and, concurrently, via SRDF/A (> 200 km) to a Remote Asynchronous Target (extended distance, variable data lag of seconds to minutes, no performance impact). The SRDF link between the Sync and Async targets is in standby mode.]
Concurrent SRDF/Star enables concurrent SRDF/S and SRDF/A operations from the same source volumes.
The primary business benefit of Star is that in the event of a workload site outage, it is possible to undertake a differential resynchronization between the two remaining sites followed by the resumption of production at either site.
Concurrent Star can be reconfigured to run in cascaded mode without stopping replication between the Workload and Synchronous target sites. For example, if the link between the Workload and the Asynchronous target sites fails, a reconfiguration would allow the three sites to run in Cascaded Star mode with Star protection.
Concurrent Star Operating States After Failure
Failure Case | Operating State (without Reconfiguration) | Operating State (with Reconfiguration)
1. Link failure between A & B | A (R11) → C (R2) | n/a
2. Site B failure | A (R11) → C (R2) | n/a
3. Link failure between A & C | A (R11) → B (R2) | A (R1) → B (R21) → C (R2), cascaded Star
4. Site C failure | A (R11) → B (R2) | n/a
5. Link failure between B & C | No change; concurrent Star continues | n/a
6. Site A failure | n/a | B (R11) → C (R2) or C (R11) → B (R2)
This is a list of possible actions after a failure occurs in a concurrent Star setup.
1. If the link between A and B fails, it is still possible to run production at A with remote protection available at site C.
2. The same holds true if site B fails.
3. If the link between A and C fails, there are two possibilities. The first is to continue running production at A with remote protection at B. The second is to reconfigure concurrent Star to cascaded Star and run in Star protected mode.
4. If site C fails, the only option is to continue running at site A with remote protection at B.
5. If the link between B and C fails there is no effect on Star operations because the standby links between B and C are not used unless there is a failure at site A.
6. If site A fails, production has to be switched to site B or site C. This necessitates a reconfiguration of the RDF devices. The devices at the site to which production was switched become R1 devices and the remaining site is reconfigured to become R2 targets to the new production site.
The choice of which location to fail over to depends on customer needs and the location of customer resources.
Cascaded SRDF/Star
3-site disaster recovery over extended distances

Cascaded SRDF: source to two cascaded targets
SRDF link between source and async target in standby mode
Depending on the nature of the failure, can be reconfigured to concurrent Star
[Slide diagram: The Workload Site replicates via SRDF/S (< 200 km) to a Nearby Synchronous Target (short distance, zero data lag), which in turn replicates via SRDF/A (> 200 km) to a Remote Asynchronous Target (extended distance, variable data lag of seconds to minutes, no performance impact). The SRDF link between the Workload Site and the Async target is in standby mode.]
Cascaded SRDF/Star was introduced in 2008 with the release of Enginuity 5773. Cascaded RDF allows a synchronous R2 target to also act as a source for SRDF/A. The long distance site in Cascaded RDF uses this source to receive its data feed. In the event of a failure of the workload site, the synchronous target has up-to-date data. The asynchronous target data is not more than two SRDF/A cycles behind the source site data.
Cascaded Star can be reconfigured to run in concurrent mode. If the link between the Workload and Synchronous target sites fails while running in Star protected mode, the Workload and Asynchronous target sites can be directly connected using differential resynchronization.
Cascaded Star Operating States After Failure
Failure Case | Operating State (without Reconfiguration) | Operating State (with Reconfiguration)
1. Link failure between A & B | n/a | A (R11) → C (R2), concurrent
2. Site B failure | n/a | A (R11) → C (R2), concurrent
3. Link failure between A & C | A (R1) → B (R21) → C (R2); no change | n/a
4. Site C failure | A (R1) → B (R21) | n/a
5. Link failure between B & C | A (R1) → B (R21) | A (R11) → B (R2) and A → C (R2), concurrent Star
6. Site A failure | n/a | B (R1) → C (R2) or C (R1) → B (R2)
This is a list of possible actions after a failure occurs in a cascaded Star setup.
1. If the link between A and B fails, it is still possible to run production at A with remote protection available at site C but only after a reconfiguration to concurrent Star.
2. The same holds true if site B fails.
3. If the link between A and C fails, there is no effect on Star operations because the standby links between A and C are not used unless there is a failure of site A.
4. If site C fails, the only option is to continue running at site A with remote protection at B.
5. If the link between B and C fails, there are two possibilities. The first is to continue running production at A with remote protection at B. The second is to reconfigure cascaded Star to concurrent Star and run in Star protected mode.
6. If site A fails, production has to be switched to site B or site C. This necessitates a reconfiguration of the RDF devices. The devices at the site to which production was switched become R1 devices and the remaining site is reconfigured to become R2 targets to the new production site.
The choice of which location to fail over to depends on customer needs and the location of customer resources. As is obvious from this diagram, 3 of the 6 failure scenarios necessitate an RDF reconfiguration in Cascaded Star.
Benefits of SRDF/Star
If one site fails, production can continue without losing remote data protection
If the workload site fails, the two remaining target sites can be incrementally synchronized
If loss of primary data center is the principal risk, choose cascaded SRDF/Star
If loss of primary data center and the loss of the synchronous target are possible risks, choose concurrent SRDF/Star
The events of Sep. 11, 2001 made businesses more aware of the critical need to recover their data after a disaster. A few years ago at Share, a major bank from New York did a presentation entitled “The Effects of 9/11”. After the attacks on Sep. 11, this bank failed their data processing over to their New Jersey site. Later that week they were asked by federal regulators, “How are you protected now?”
As the importance of information continues to increase, companies are increasingly interested in protecting their data and minimizing their down time after a failure. SRDF/Star offers customers the business benefits that are a high priority for institutions with mission-critical data.
Both cascaded and concurrent Star have their uses depending on the application environment. If the loss of the primary data center is the principal concern, cascaded Star is a good choice.
If there is a risk of losing the synchronous target as well as the workload site, concurrent Star is a better choice.
SRDF Solutions Comparison
Solution | Mode | Recommended Distances | RTO | RPO | Bandwidth | Configuration
SRDF/S | Synchronous | ~200 km | Fast | No Data Loss | High | Two Site
SRDF/A | Asynchronous | Unlimited | Fast | sec/min | Medium | Two Site
SRDF/AR Single Hop | Adaptive Copy | Unlimited | Fast | min/hr | Lowest | Two Site
SRDF/AR Multi-hop | Synchronous ==> Adaptive Copy | 1–200 km Sync; 200 km–Unlimited | Fast | No Data Loss | High / Low | Three Site
Concurrent SRDF/S and SRDF/A | Concurrent SRDF/S and SRDF/A | 1–200 km Sync; 200 km–Unlimited | Fast | No Data Loss to sec/min | High / Medium | Three Site
Cascaded SRDF/S and SRDF/A | Cascaded SRDF/S and SRDF/A | 1–200 km Sync; 200 km–Unlimited | Fast | No Data Loss to sec/min | High / Medium | Three Site
SRDF/Star w/ Concurrent SRDF | Concurrent SRDF/S and SRDF/A | 1–200 km Sync; 200 km–Unlimited | Fast | No Data Loss to sec/min | High / Medium | Three Site
SRDF/Star w/ Cascaded SRDF | Cascaded SRDF/S and SRDF/A | 1–200 km Sync; 200 km–Unlimited | Fast | No Data Loss to sec/min | High / Medium | Three Site
SRDF/DM | Adaptive Copy | Unlimited | Data Migration | Data Migration | Low | Migration
SRDF is a mature and proven technology with over 33,000 licenses sold. This slide shows the comparative merits of each SRDF technology.
Module Summary
Key points covered in this module:
Two kinds of Star configurations
Origins of Star and its place in the SRDF family
These are the key points covered in this module. Please take a moment to review them.
Underlying Technologies for SRDF/Star

After completion of this module you will be able to:
Describe SRDF technologies that support Star:
– Dynamic, concurrent, and cascaded RDF devices and groups
– SRDF/Synchronous and SRDF/Asynchronous
– Synchronous SRDF consistency groups managed by the SRDF daemon
– Cycle switching in an SRDF/A Multi-session Consistency (MSC) environment
– MSC Cleanup
– Special use of SDDF sessions in tracking changes
– Half delete, half swap, and special pair creation commands
The objectives for this module are shown here. Please take a moment to read them.
Lesson 1
Upon completion of this lesson, you will be able to:
Describe dynamic, concurrent, and cascaded SRDF devices and groups
The objective for this lesson is shown here.
Dynamic RDF Devices

The symrdf command permits quick creation and deletion of dynamic RDF pairs
Devices can be:
– R1 capable
– R2 capable
– R1 and R2 capable
Dynamic RDF attribute of a device can be examined in the output of symdev show:
# symdev show 95 -sid 35
Dynamic RDF Capability : RDF1_OR_RDF2_Capable
Dynamic RDF devices can only exist in a Symmetrix that has the Dynamic RDF feature enabled. They can be created to be RDF1 capable, RDF2 capable, or RDF1 and RDF2 capable (as shown above). These devices can be used to build dynamic RDF pairs.
Dynamic RDF pairs can be created or dissolved using the symrdf command. They can belong to static or dynamic RDF groups. Below is a command-line example of how a dynamic RDF pair is created:

symrdf createpair -file <device file> -sid <xx> -rdfg <n> -type RDF1 -establish

where
• the device file contains device pairs
• -sid refers to the Symmetrix ID
• -rdfg refers to the RDF group number
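The device file and the resulting command line can be sketched as follows. This is an illustrative helper, not part of SYMCLI; the file name, Symmetrix ID, and device numbers are hypothetical, and the assumed file layout (one local/remote pair per line, '#' comments ignored) is a simplification.

```python
# Illustrative sketch (not EMC code): parse the contents of a
# createpair device file -- one "local remote" device pair per
# line, with blank lines and '#' comments ignored -- and build
# the matching symrdf command line shown above.

def parse_pairs(text):
    """Return a list of (local, remote) Symmetrix device tuples."""
    pairs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        local, remote = line.split()
        pairs.append((local, remote))
    return pairs

def createpair_cmd(device_file, sid, rdfg, rdf_type="RDF1", establish=True):
    """Assemble the symrdf createpair invocation string."""
    cmd = (f"symrdf createpair -file {device_file} "
           f"-sid {sid} -rdfg {rdfg} -type {rdf_type}")
    return cmd + (" -establish" if establish else "")

pairs = parse_pairs("""
# local  remote
0095     00A5
0096     00A6
""")
cmd = createpair_cmd("pairs.txt", 35, 1)
```

Running the helper over the two-line file above yields two device tuples and the createpair command shown on the slide.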
Dynamic RDF Groups
Allow creation and deletion of SRDF groups using the symrdf command
Work over switched SRDF network
The Symmetrix must allow switched RDF:
Switched RDF Configuration State : Enabled
Dynamic RDF Configuration State : Enabled

Sample command:
symrdf addgrp -label <label> -sid <xx> -rdfg <m> -dir <i> -remote_sid <yy> -remote_rdfg <n> -dir <j>
Additional documentation is located in Chapters 3 and 7 of the SYMCLI SRDF manual
While dynamic groups are not essential for SRDF/Star, they are preferable for ease of reconfiguration.
Concurrent RDF

Permits a single source to communicate with two targets
Allows a combination of:
– Synchronous
– Adaptive Copy
– Asynchronous
Two asynchronous connections are not allowed
The Symmetrix must allow concurrent RDF:
Concurrent RDF Configuration State : Enabled
Dynamic RDF Configuration State : Enabled
Additional documentation is located in Chapters 2 and 7 of the SYMCLI SRDF manual
In an SRDF configuration, a single source (R1) device can concurrently be remotely mirrored to two target (R2) devices. This feature, available with Enginuity Version 5568-based Symmetrix arrays and higher, is known as a concurrent RDF configuration and is supported with ESCON, GigE, and Fibre interfaces. It provides the availability of two remote copies at any point in time. It is valuable for duplicate restarts or disaster recovery, or for increased flexibility in data mobility and migrating applications.
Concurrent RDF technology can use two different RA adapters (RAs, RFs, or REs) in the interface link to achieve the connection between the R1 device and its two concurrent R2 mirrors. Each of the two concurrent mirrors must belong to a different RDF (RA) group.
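The mode rule above, any combination of modes on the two mirrors except two asynchronous legs, can be captured in a small check. This is an illustrative sketch; the mode labels are informal shorthand, not SYMCLI keywords.

```python
# Illustrative sketch: the two concurrent mirrors of an R11 may
# run in synchronous, asynchronous, or adaptive copy mode, but
# not both in asynchronous mode. Labels are informal shorthand.

VALID_MODES = {"sync", "async", "acp"}  # acp = adaptive copy

def concurrent_modes_allowed(mirror1, mirror2):
    """Return True if the pair of mirror modes is permitted."""
    if not {mirror1, mirror2} <= VALID_MODES:
        raise ValueError("unknown RDF mode")
    # The only forbidden combination is async + async.
    return not (mirror1 == "async" and mirror2 == "async")
```

For example, sync + async (the SRDF/Star layout) is allowed, while async + async is rejected.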
Cascaded SRDF

[Slide diagram: R1 → R21 → R2]
Single device assumes dual roles of Primary and Secondary SRDF simultaneously
Data received by this device as a Secondary can be transferred automatically by this device as a Primary
RPO on the order of seconds or minutes, compared to an RPO of hours as in the case of SRDF/AR
Can be used to build SRDF/Star configurations
Cascaded SRDF Device
Prior to Enginuity™ version 5773, an SRDF device could be a source device (R1) or a target device (R2), but could not function in both roles simultaneously. Cascaded SRDF introduces the concept of the dual role R1/R2 device, referred to as an R21 device. The R21 device is both an R1 mirror and an R2 mirror.
The first leg in a cascaded configuration can be set to run in synchronous, asynchronous, or adaptive copy modes.
The second leg may run in asynchronous or adaptive copy disk mode. If the first leg is running in asynchronous mode, the second leg may only run in adaptive copy disk mode.
Cascaded SRDF requires additional cache to support SRDF/A in the middle Symmetrix. If multi-session consistency is enabled from the R21 to the R2, there may be some performance degradation between the R1 and the R21 during MSC cycle switching.
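The leg-mode rules in the notes above can be summarized as a small validity check. This is an illustrative sketch with informal mode labels, not SYMCLI syntax.

```python
# Illustrative sketch of the cascaded SRDF mode rules stated above:
#   first leg (R1 -> R21): sync, async, or adaptive copy;
#   second leg (R21 -> R2): async or adaptive copy disk;
#   if the first leg is async, the second leg may only be
#   adaptive copy disk. Labels are informal shorthand.

def cascaded_modes_allowed(first_leg, second_leg):
    """Return True if the leg-mode combination is permitted."""
    if first_leg not in {"sync", "async", "acp"}:
        return False
    if second_leg not in {"async", "acp_disk"}:
        return False
    if first_leg == "async" and second_leg != "acp_disk":
        return False
    return True
```

So sync → async (the SRDF/Star layout) passes, while async → async does not.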
Lesson 2
Upon completion of this lesson, you will be able to:
Describe SRDF/S and SRDF/A
The objective for this lesson is shown here.
SRDF/S Architecture
[Slide diagram: Source and Target Symmetrix connected by SRDF/S links.]
1. I/O write received from host/server into source cache
2. I/O is transmitted to target cache
3. Receipt acknowledgment is provided by target back to cache of source
4. Ending status is presented to host/server
Synchronous SRDF mode is primarily used in campus environments. In this mode, Symmetrix maintains a real-time mirror image of the data from remotely mirrored volumes.
Data on the source (R1) volumes and target (R2) volumes are always fully synchronized. Data movement is at the block level.
The sequence of operations is:
1. An I/O write is received from the host/server into the source cache.
2. The I/O is transmitted to the target cache.
3. A receipt acknowledgment is provided by the target back to the cache of the source.
4. An ending status is presented to the host/server.
Synchronous mode is one of three modes in which SRDF can operate. The other modes are Asynchronous and Adaptive copy. Unlike competitive products, SRDF can be dynamically switched to operate in another mode without interrupting host I/O.
Like all synchronous replication solutions, synchronous SRDF has architectural limitations that must be understood:
The maximum distance over which Synchronous SRDF can be used is limited by application timeouts and speed-of-light issues. Link bandwidth must be sized for peak workload at all times.
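The four-step sequence above can be sketched as a toy model. Plain Python lists stand in for the source and target caches; this illustrates the ordering of the steps only and does not reflect real Enginuity internals.

```python
# Toy model of the four-step SRDF/S write sequence described
# above. Lists stand in for Symmetrix source and target caches;
# the returned list mirrors the numbered steps on the slide.

def srdf_s_write(block, source_cache, target_cache):
    steps = []
    source_cache.append(block)   # 1. write lands in source cache
    steps.append("1: write received into source cache")
    target_cache.append(block)   # 2. sent over the SRDF/S link
    steps.append("2: write transmitted to target cache")
    steps.append("3: target acknowledges receipt to source")
    steps.append("4: ending status presented to host")
    return steps

src, tgt = [], []
srdf_s_write("track-0095", src, tgt)
# After the call, source and target caches hold identical data:
# the host sees completion only once the remote copy exists.
```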
SRDF/A Architecture
SRDF/A performs Write Folding, which transmits only the final writes from the Capture Delta Set

[Slide diagram: Delta Sets cycle from Source to Target; the sequence Capture → Transmit → Receive → Apply repeats.]
– CAPTURE (N): collects application write I/O
– TRANSMIT (N-1): sends the final set of writes to the target
– RECEIVE (N-1): receives writes from the Transmit Delta Set
– APPLY (N-2): once Receive is complete, data is applied to disk
SRDF/A’s architecture delivers replication over extended distances with no performance impact.
SRDF/A uses Delta Sets to maintain a group of writes over a short period of time. Delta Sets are discrete buckets of data that reside in different sections of the Symmetrix cache. Starting at 1, each Delta Set is assigned a numerical value that is one more than the preceding one.
There are four types of Delta Sets to manage the data flow process.
The Capture Delta Set in the source Symmetrix (numbered N in this example), captures (in cache) all incoming writes to the source volumes in the SRDF/A group.
The Transmit Delta Set in the source Symmetrix (numbered N-1 in this example), contains data from the immediately preceding Delta Set. This data is being transferred to the remote Symmetrix.
The Receive Delta Set in the target system is in the process of receiving data from the transmit Delta Set N-1.
The target Symmetrix contains an older Delta Set, numbered N-2, called the Apply Delta Set. Data from the Apply Delta set is being assigned to the appropriate cache slots ready for de-staging to disk. The data in the Apply Delta set is guaranteed to be consistent and restartable should there be a failure of the source Symmetrix.
The Symmetrix performs a cycle switch once data in the N-1 set is completely received, data in the N-2 set is completely applied, and the 30-second minimum cycle time has elapsed. During the cycle switch, a new Delta Set (N+1) becomes the Capture set, N is promoted to the Transmit/Receive set, and N-1 becomes the Apply Delta Set.
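The delta-set mechanics can be sketched as a toy model. Dicts stand in for the delta sets (a dict naturally "folds" repeated writes to the same track), and the source and target arrays are collapsed into one object for brevity; none of this reflects real Enginuity internals.

```python
# Toy model of SRDF/A delta-set cycling with write folding.
# Dicts stand in for the Capture / Transmit-Receive / Apply
# delta sets; source and target are collapsed into one object.

class SrdfAToy:
    def __init__(self):
        self.capture, self.receive, self.apply = {}, {}, {}
        self.disk = {}

    def write(self, track, data):
        # Write folding: rewriting a track within one capture
        # cycle overwrites the earlier value, so only the final
        # write is sent across the link.
        self.capture[track] = data

    def cycle_switch(self):
        # Assumes N-1 is fully received and N-2 fully applied.
        self.disk.update(self.apply)  # destage old Apply set (N-2)
        self.apply = self.receive     # N-1 becomes the Apply set
        self.receive = self.capture   # N goes onto the link
        self.capture = {}             # new set (N+1) starts capturing
```

Two writes to the same track within one cycle reach the target as a single write; after three cycle switches the folded value has been destaged to the remote disk.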
Dependent Writes Ensure Data Consistency
Dependent write logic:
– If ‘A’ is a predecessor and ‘B’ is a dependent write, any I/O ‘B’ that arrives after I/O ‘A’ has completed must be dependent on ‘A’
SRDF/A ensures that:
– ‘A’ and ‘B’ are in the same Delta Set, or
– ‘B’ is in a later Delta Set
These Delta Sets (cycles) of I/Os, not the I/Os themselves, are ordered by SRDF/A
Symmetrix ensures that dependent write relationships are honored during Delta Set switch or Write Folding
Database application consistency forms the backbone of SRDF/A design. Inherently, all database applications are consistent, which means that a database application does not issue a dependent write unless a predecessor write is completed.
For example, a DBMS does not issue a dependent data write unless a predecessor write to the log was successfully completed. EMC’s consistency technology honors this dependent write logic. By honoring write ordering at the time of the Delta Set switch, SRDF/A guarantees dependent write consistency.
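The guarantee can be stated as a checkable invariant: a dependent write must land in the same delta set as its predecessor or a later one. This is an illustrative sketch; the write names and cycle numbers are hypothetical.

```python
# Illustrative invariant check for the dependent-write rule
# above: if write B depends on write A (B was issued only after
# A completed), B's delta-set number must equal or exceed A's.
# Write names and cycle numbers are hypothetical.

def dependent_writes_ok(cycle_of, dependencies):
    """cycle_of maps write-id -> delta-set number; dependencies
    is a list of (predecessor, dependent) pairs."""
    return all(cycle_of[dep] >= cycle_of[pred]
               for pred, dep in dependencies)

# A DBMS log write and its dependent page (data) write:
same_set  = dependent_writes_ok({"log": 7, "page": 7}, [("log", "page")])
violation = dependent_writes_ok({"log": 7, "page": 6}, [("log", "page")])
```

The first case (log and page in the same delta set) satisfies the rule; the second (page write in an earlier delta set than the log write it depends on) is exactly the ordering SRDF/A never produces.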
Lesson 3
Upon completion of this lesson, you will be able to:
Describe Synchronous SRDF Consistency Groups
The objective for this lesson is shown here.
SRDF Daemon (storrdfd)
System process on Unix and Windows
Interacts with:
– Base Daemon (storapid)
– Enginuity Consistency Assist (RDF-ECA)
– Group Name Services (GNS) daemon
Maintains consistency:
– On SRDF/S composite groups with consistency enabled
– Performs cycle switching in SRDF/A when MSC is active
– Performs MSC Cleanup in SRDF/A
Cooperates with daemons running on other hosts
Storrdfd (pronounced “store” R-D-F-D) is a process that runs as a daemon on Unix systems and as a service in Windows. It is referred to as the SRDF daemon and uses the base daemon for all its communications with the Symmetrix, such as the issuing of syscalls.
In an SRDF/S environment, the RDF daemon cooperates with RDF-ECA to maintain consistency for composite groups.
If GNS is enabled on the host, the SRDF daemon interacts with the GNS daemon to acquire composite group definitions. Otherwise, it gets definitions from the SYMAPI database.
In an SRDF/A environment, the SRDF daemon is responsible for cycle switching when Multi-Session Consistency is enabled.
The RDF daemon is designed for full cooperation with other RDF daemons. Any task for which the daemon is responsible, such as an MSC cycle switch, can be initiated by one RDF daemon and completed by another RDF daemon. At no time is there a single point of failure if there are two or more RDF daemons monitoring the same processes.
It is therefore advisable to have more than one host running the SRDF daemon in an environment where the daemon’s services are necessary. Such a configuration provides redundancy in case one of the daemons stops unexpectedly.
Consistency: The Dependent Write I/O Principle

Logical dependency between write I/Os
– Embedded in the logic of an application, operating system, or DBMS
A write I/O is not issued by an application until a prior related write I/O is completed
– A logical dependency, not a time dependency
– Inherent in all Database Management Systems (DBMS)
Page (data) write is a dependent write I/O based on a successful log write
– Power failures create a dependent write consistent image
– Restart transforms dependent write consistent to transactionally consistent
Ensuring ‘dependent write consistency’ is the basic principle behind all EMC Consistency Technology solutions
– SRDF Consistency Groups
– TimeFinder Consistent Split
– Consistent SNAPs and Consistent Clones
– SRDF/A
– Open Replicator for Symmetrix
Almost all commercial applications, such as databases, are inherently consistent by design. EMC’s consistency technology makes it possible for consistency to be maintained when replicas of production data are made.
All logging database management systems use the consistency principles described on this slide to maintain integrity. This is required for the protection against local power outages, loss of local channel connectivity, or storage devices. There is a logical dependency between I/Os built into database management systems, certain applications, middleware tools such as MQ Series, and operating systems.
EMC can create a dependent write consistent local image with its TimeFinder family of products, whereas SRDF Consistency Groups and SRDF/A create a consistent image on one or more remote Symmetrix arrays.
Enginuity Consistency Assist (ECA)
ECA is a feature that works inside the Symmetrix array
Stalls write I/Os to a user-defined list of Symmetrix devices prior to splitting a source volume and its replica
Used for:
– Open Replicator consistent activation
– TimeFinder consistency
  - TF/Mirror consistent split
  - TF/Clone consistent activation
  - TF/Snap consistent activation
Enginuity Consistency Assist is a feature introduced with Enginuity 5x67. It stalls write I/O to a user-provided list of devices prior to a consistent TimeFinder split or a consistent activation of Open Replicator, TimeFinder Clone, or TimeFinder Snap. Reads are allowed to continue during this time. Once the split or activation is complete, I/O is allowed to flow again. The stalling of write I/Os guarantees that the copy of data being split or activated is dependent write consistent.
RDF-ECA
Used with SRDF/S to hold write I/Os to a consistency group until all relevant links are suspended
Interacts with RDF daemons on one or more control hosts to manage consistency
Can replace PowerPath and MF Consistency Group Task to manage consistency on FBA and CKD volumes
Supports synchronous consistency in concurrent and cascaded RDF composite groups
RDF ECA is an extension to ECA released in Enginuity 5x71. It interacts with the RDF daemon to manage consistency of a user-defined RDF consistency group. RDF ECA can manage consistency for CKD and FBA devices.
ECA Window
ECA is activated by Enginuity when:
– Host issues a TimeFinder consistent split or activate command
– Host issues an Open Replicator consistent activation command
– An SRDF/S I/O directed at a consistency group fails to complete on the remote side
At activation, a 30 second timer (ECA window) starts
While the ECA window is open, Enginuity requests host HBAs to retry write I/Os to affected devices
When the desired action (split/activate/suspend) completes, the ECA window is closed and I/O can flow again
If the action fails to complete within 30 seconds, I/O is allowed to flow again, but an error message is logged
In the context of TimeFinder, ECA window is the name given to a 30 second timer that starts when a consistent split or consistent activation is initiated.
In the context of RDF-ECA, the 30 second timer is started by the Symmetrix after it determines that a write I/O to a device in a consistency group cannot complete on the remote array.
Once the ECA timer starts, Enginuity does not accept write I/Os to the affected devices. Instead, it asks the host HBAs to retry the I/O. When the required action completes, the ECA window is closed and I/O is permitted to flow again.
If, for some unexpected reason, the required action does not complete before 30 seconds are up, Enginuity closes the ECA window. It allows I/O to flow again while recording an error message in the host-based log file.
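The window rules above can be summarized in a small Python sketch. This is a hypothetical model: the real timer lives inside Enginuity, and the 30-second value is the only detail taken from the text.

```python
# Sketch of the ECA window disposition rules described above.
ECA_WINDOW_SECONDS = 30

def eca_window(action_duration, log):
    """Return how long writes were held and whether I/O resumed.

    While the window is open, writes get RETRY responses; once the action
    (split/activate/suspend) completes, or the window expires, I/O resumes.
    """
    if action_duration <= ECA_WINDOW_SECONDS:
        # Normal case: the action completes and the window closes.
        return {"held_for": action_duration, "io_resumed": True}
    # Unexpected case: the window expires first; I/O still resumes,
    # but an error is recorded in the host-based log.
    log.append("ECA window expired before action completed")
    return {"held_for": ECA_WINDOW_SECONDS, "io_resumed": True}

log = []
ok = eca_window(action_duration=2, log=log)    # completes inside the window
late = eca_window(action_duration=45, log=log) # exceeds the window
print(ok, late, log)
```

The key property modeled here is that I/O is never held longer than the window, regardless of whether the action succeeded.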
Composite Groups
Similar to Device Groups
– Four types: Regular, RDF1, RDF2, and RDF21
– Can be used for all Control operations (e.g., TimeFinder, SRDF, etc.)
– Can participate in consistency operations such as TimeFinder/Mirror consistent splits and TimeFinder/Snap activate
Different from Device Groups
– Can span multiple RDF groups and Symmetrix arrays
– Managed by RDF daemon if created with –rdf_consistency option and consistency is enabled
– Required for SRDF/A Multi-session Consistency (MSC)
Composite groups are similar to device groups. They can be type RDF1, RDF21, RDF2, or Regular, and used for every action that is available with device groups.
Composite groups can span RDF groups and Symmetrix arrays. For example, a host running a database application spanning two Symmetrix arrays can use a composite group to perform a consistent TimeFinder split of the application data. If the type of the Composite Group is RDF1, RDF21, or RDF2, the CG (Composite Group) can span RDF groups.
Composite groups are a requirement for building SRDF consistency in SRDF/S and SRDF/A. When a CG is created for SRDF consistency using RDF-ECA, the “–rdf_consistency” option must be specified at group creation time.
RDF Daemon and SRDF/S Consistency
A link failure causes Symmetrix to pause writes to the CG devices in that Symmetrix
[Diagram: host I/O to R1 devices in a consistency group (CG) spanning two Symmetrix arrays, replicating over SRDF links to R2 devices on two remote arrays; Capture/Transmit and Receive/Apply delta sets shown; RDF daemon monitoring]
When synchronous RDF consistency is enabled for a consistency group, the RDF daemon polls the Symmetrix every second to monitor the health of the consistency group.
Assume that the links connecting one of the Symmetrix pairs fail. When the source Symmetrix fails to complete writes to the remote devices, it starts the ECA timer window. All subsequent writes to the devices belonging to the composite group in that Symmetrix are turned back with retry requests issued to the host HBA. During this time, no dependent writes are issued by the application, because the host database application has not been notified of the completion of the predecessor write.
RDF Daemon and SRDF/S Consistency (Cont)
Next, the RDF daemon requests logical link suspension of the remaining devices…
[Diagram: same configuration; the RDF daemon suspends the remaining SRDF links]
When the RDF daemon recognizes that one of the Symmetrix pairs has lost connectivity, it requests the remaining Symmetrix arrays to open an ECA window which will hold incoming writes as well.
Once all ECA windows are open and write I/O is stopped to the entire consistency group, the daemon logically suspends the remaining communication links.
RDF Daemon and SRDF/S Consistency (Cont)
…leaving the R2 devices dependent-write consistent
[Diagram: all SRDF links suspended; the R2 devices are dependent-write consistent; RDF daemon monitoring]
Once all the links have been suspended, the ECA windows are closed and the writes to the local arrays are allowed to complete. Note that writes to the first group of devices were held as soon as the links failed and the remote writes did not complete. Thus, if the host was running a database application, no dependent write could have been issued by the host application between the time that links on the first Symmetrix failed and I/O flow was restored by the RDF daemon to all devices. This makes the target site data consistent.
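The full sequence (Enginuity opens the first window on the failed write; the daemon opens windows on the surviving arrays, suspends the links, then closes all windows) can be modeled in a short Python sketch. The dict-based data model is invented for illustration; the real daemon drives Enginuity, not Python objects.

```python
# Sketch of the trip-handling sequence the RDF daemon coordinates.
def handle_link_failure(arrays, failed):
    """Hold writes everywhere, suspend surviving links, then resume I/O.

    arrays: dict mapping array name -> {"link_up": bool, "eca_open": bool}
    """
    events = []
    # Enginuity opens the first ECA window when the remote write fails
    arrays[failed]["link_up"] = False
    arrays[failed]["eca_open"] = True
    events.append(f"{failed}: ECA window opened on remote write failure")
    # The daemon asks every remaining array to open an ECA window as well
    for name, a in arrays.items():
        if a["link_up"] and not a["eca_open"]:
            a["eca_open"] = True
            events.append(f"{name}: ECA window opened by daemon")
    # With all writes held, logically suspend the surviving links
    for name, a in arrays.items():
        if a["link_up"]:
            a["link_up"] = False
            events.append(f"{name}: link suspended")
    # Close every window; host I/O resumes and the R2 image is consistent
    for name, a in arrays.items():
        a["eca_open"] = False
        events.append(f"{name}: ECA window closed, I/O resumed")
    return events

pairs = {"symm1": {"link_up": True, "eca_open": False},
         "symm2": {"link_up": True, "eca_open": False}}
for event in handle_link_failure(pairs, failed="symm1"):
    print(event)
```

Note the ordering: no window closes until every link has been suspended, which is what guarantees no dependent write slips through to only some of the R2 arrays.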
RDF Daemon and MSC Cycle Switching
[Diagram: two SRDF/A sessions in the MSC composite group “CG: MSC”; host I/O feeds Capture delta sets on the R1 arrays, Transmit delta sets cross the SRDF links to Receive/Apply delta sets on the R2 arrays; RDF daemon monitoring signals “Time to perform cycle switch!”]
Daemon monitors RDF devices that belong to MSC group
A second function of the RDF Daemon is to maintain Multi-session Consistency (MSC) in an SRDF/A environment. MSC is important when consistency must be maintained between multiple production applications running on multiple SRDF groups. The example on this slide illustrates how the RDF daemon maintains consistency while cycle switching during normal MSC operations.
The RDF Daemon (or daemons) monitors all groups and manages cycle switching for all R1 Symmetrix arrays whose sessions are managed by MSC.
When the minimum cycle time, which by default is set to 30 seconds, has elapsed, the RDF Daemon verifies that:
– each R1 Symmetrix array has completed transferring the Transmit Delta Set to the R2s, and
– each R2 Symmetrix array has completed applying the Apply Delta Set.
Until the conditions above are satisfied for each RDF group in each Symmetrix array, the cycle switching is not initiated and the present cycle gets elongated.
Once all RDF groups indicate their readiness to switch, the RDF daemon briefly holds writes to the source arrays and switches the cycles, first on the source and then on the target arrays. Cycle switching is an asynchronous process, so the source and target arrays do not all switch at the same instant; they switch one after the other. Host writes are allowed to flow into a source array as soon as that array has switched, and transmit data is allowed to flow into a target array as soon as that array has switched.
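The readiness check the daemon performs at each minimum-cycle-time boundary can be sketched as follows. The field names and return strings are assumptions made for this illustration; only the 30-second default minimum cycle time comes from the text.

```python
# Readiness check sketch for MSC cycle switching.
def ready_to_switch(groups):
    """True when every RDF group has drained its Transmit and Apply delta sets."""
    return all(g["transmit_done"] and g["apply_done"] for g in groups)

def next_action(groups, elapsed, min_cycle_time=30):
    if elapsed < min_cycle_time:
        return "wait"            # minimum cycle time has not elapsed yet
    if not ready_to_switch(groups):
        return "elongate cycle"  # some array is still transferring or applying
    return "switch cycles"       # hold writes briefly; switch R1s, then R2s

groups = [{"transmit_done": True, "apply_done": True},
          {"transmit_done": True, "apply_done": False}]
print(next_action(groups, elapsed=31))  # one group still applying
```

The "elongate cycle" branch is why a single slow RDF group stretches the current cycle for the whole MSC session.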
Lesson 4
Upon completion of this lesson, you will be able to:
Describe MSC Cleanup using the SRDF Daemon
The objective for this lesson is shown here.
MSC Cleanup After an SRDF/A Trip
A trip can occur at different times in the SRDF/A cycle
From the viewpoint of a single R2 Symmetrix, there are only 2 possible states for a receive delta set:
– The receive delta set is incomplete
  - Symmetrix knows it is incomplete, so it is automatically discarded
  - No MSC Cleanup needed
– The receive delta set is complete
  - Symmetrix marks the session as Needing MSC Cleanup
  - Disposition of the delta set depends on the status of the other R2 Symmetrix arrays in the same SRDF/A MSC protected CG
    - If all are complete, it is all right to commit the delta set
    - If any are incomplete, the delta set must be discarded
The third and final function of the RDF daemon is to manage SRDF/A multi-session consistency in the event of a failure of communication between source and target.
When there are multiple Symmetrix arrays or SRDF groups participating in a multi-session consistency group, the Symmetrix sets the “MSC cleanup required” flag if the receive Delta Set was completely received at the time the failure occurred.
A single Symmetrix, with the SRDF/A MSC flag set, cannot determine the correct action to take for a completely received Delta Set without information from other Symmetrix arrays in the SRDF/A MSC protected consistency group.
MSC Cleanup can be invoked by any of the following methods:
– The RDF daemon performs MSC cleanup automatically if it can communicate with the target arrays.
– The API/CLI automatically performs MSC cleanup during the processing of any RDF control command.
– The user can manually execute MSC cleanup through the CLI.
The MSC Cleanup Needed status is exported to user-visible displays such as query output. MSC Cleanup commits receive cycle data in case of failure during cycle switch instead of discarding it unnecessarily.
RDF Daemon and MSC Cleanup
MSC Cleanup is needed in the bottom Symmetrix only
[Diagram: MSC composite group “CG: MSC” with two SRDF/A sessions; the Receive Delta Set on the top R2 array is incomplete, while the Receive Delta Set on the bottom R2 array is complete; RDF daemon monitoring]
Assume that the links between the source and target arrays have tripped. The Receive Delta set in the top array is incomplete while the Receive delta set in the bottom array is complete.
In the top case, because the Receive Delta set is incomplete, the only valid choice for the Symmetrix is to discard it because the dependent write principle only works for complete Delta Sets.
For the bottom case, the Receive Delta set is complete. Since this is an MSC protected group, the Symmetrix cannot decide what to do on its own.
– If all Receive Delta sets were complete, it would be correct to Apply the data.
– However, if any of the Receive Delta sets are incomplete, then the data must be discarded.
– The Symmetrix sets the MSC Cleanup Needed flag.
In the example displayed on this slide, MSC Cleanup is undertaken by one of the three methods mentioned earlier:
– The RDF daemon;
– Any RDF control command issued by the API/CLI;
– An explicit user-issued “symstar cleanup” command.
MSC Cleanup Logic
All SRDF/A MSC sessions in the CG are inventoried– Is the MSC Cleanup Needed flag set?– What are the Apply and Receive Delta set cycle numbers?
MSC Cleanup logic decides what to do (4 possibilities)
#   R2 Symm                           R2 Symm                           Action
1   No Cleanup Needed                 No Cleanup Needed                 All discarded, no MSC Cleanup
2   MSC Cleanup Needed (A=N)          No Cleanup Needed (A=N)           MSC Cleanup Needed Symms must discard
3   MSC Cleanup Needed (A=N)          MSC Cleanup Needed (A=N)          All complete, all are committed
4   MSC Cleanup Needed (A=N-1, R=N)   No Cleanup Needed (A=N, R=N+1)    Failure occurred during a cycle switch, all are committed
Once the RDF daemon on the source side notices a trip event, it runs the MSC cleanup logic on the target arrays if it can communicate with them. The legend A=N means the Apply Delta set is numbered N. Similarly, R=N+1 means that the number of the Receive Delta Set is N+1. Though the table shown here uses two Symmetrix units, the logic works for larger numbers of arrays.
1. In this case, none of the SRDF/A sessions have the “MSC Cleanup Needed” flag set. This occurs when all the Receive Delta sets were incomplete and all were automatically discarded. There is no Cleanup action to take and it is not invoked automatically.
2. Only some Symmetrix arrays have the “MSC Cleanup Needed” flag raised. Also, ALL Apply Delta set numbers are the same. This means that some Symmetrixes had to discard their incomplete Receive Delta Sets. Consequently, all the Symmetrixes needing MSC Cleanup must discard their completely received Delta Sets.
3. All Symmetrixes have the “MSC Cleanup Needed” flag raised. In this case, ALL Apply Delta Set numbers must be the same. This indicates that all Receive Delta Sets are complete and all the Receive Delta Sets can be applied.
4. Only some Symmetrix units have their flag raised. Also, one or more Symmetrixes with the flag raised has a Receive Delta Set number that matches the Apply cycle number for a Symmetrix which discarded its incomplete Receive cycle. This indicates a failure in the middle of a cycle switch. So, all the completely received Receive Delta Sets in the Symmetrix arrays with the flag raised are applied.
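The four cases above can be expressed as a small decision function. The dictionary shape and return strings are invented for illustration; the case logic follows the narration (`flag` is the "MSC Cleanup Needed" flag, `apply`/`receive` are delta set cycle numbers).

```python
# Sketch of the MSC cleanup decision across the R2 arrays of one MSC group.
def msc_cleanup(symms):
    flagged = [s for s in symms if s["flag"]]
    if not flagged:
        # Case 1: every Receive Delta Set was incomplete and already discarded.
        return "no cleanup needed"
    if len(flagged) == len(symms):
        # Case 3: every Receive Delta Set is complete; apply them all.
        return "commit all receive delta sets"
    unflagged = [s for s in symms if not s["flag"]]
    if any(f["receive"] == u["apply"] for f in flagged for u in unflagged):
        # Case 4: a flagged array's Receive cycle number matches another
        # array's Apply cycle number: the failure hit mid cycle switch,
        # so the complete Receive Delta Sets are committed.
        return "commit flagged receive delta sets"
    # Case 2: Apply numbers agree everywhere; the arrays that discarded an
    # incomplete set force the flagged arrays to discard their complete sets.
    return "discard flagged receive delta sets"

print(msc_cleanup([{"flag": True, "apply": 7, "receive": 8},
                   {"flag": False, "apply": 8, "receive": 9}]))
```

Although the table shows two arrays, the same function works unchanged for any number of arrays in the group.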
Lesson 5
Upon completion of this lesson, you will be able to:
Describe special SDDF and RDF features in support of STAR
The objective for this lesson is shown here.
Symmetrix Differential Data Facility
Each Symmetrix logical volume can support up to 16 sessions
SDDF sessions comprise bitmaps that flip a bit for every track that changed since the session was initiated
SDDF sessions are used to monitor changes in:– Clones– Snaps– BCVs– Change Tracker– Open Replicator
Enhanced to support SRDF/Star
Each Symmetrix logical volume is allotted a quota of 16 SDDF sessions. These sessions allow the Symmetrix to track changes using bitmaps, which flip from a zero to a one whenever a monitored track changes.
SDDF sessions are used to monitor changes in BCVs, Clones, Snaps, Change Tracker, and Open Replicator.
SDDF functionality was enhanced for SRDF/Star to enable differential resynchronization between two target sites. Once Star is enabled, two sessions are created and activated at the Synchronous target site, and one SDDF session is created at the Asynchronous target site.
SDDF Session Usage in Concurrent Star
[Diagram: Site A (R11) replicates via SRDF/S to Site B (R2) and via SRDF/A to Site C (R2), with a passive link between B and C]
Site A – RDF daemon from control host manages SDDF sessions
Site B – 2 Active SDDF Sessions per each device
Site C – 1 Inactive SDDF Session per each device
When Star Protection is enabled, two SDDF sessions are created at site B and one SDDF bitmap is created at site C. The bitmaps at site B are always active during normal Star operation. They are alternately marked and cleared after every two or more SRDF/A MSC cycles elapse between sites A and C.
The bitmap at site C stays inactive during normal Star operation.
Concurrent STAR – When Site A Fails
[Diagram: Site A (R11) fails; the SRDF/S link to Site B and the SRDF/A link to Site C stop; the 2 SDDF bitmaps at B are combined with an inclusive OR (IOR) against the 1 SDDF bitmap at C]
SDDF sessions at Site B frozen, since data flow to B stops
Inclusive OR of 2 SDDF bitmaps at B used to resolve track differences between B and C
If the primary site fails, data transmission to both sites stops simultaneously.
Under these circumstances, the data at Synchronous Target B is more recent than the data at Asynchronous target C.
In the course of recovery, an inclusive OR of the two bitmaps is performed at site B. This operation marks all tracks updated in the current bitmap and all tracks updated in the previous bitmap as owed to site C. Since the bitmap initialization at site B occurs every two plus cycles, it is possible that the inclusive OR will result in more than the minimum required tracks being marked as invalid. This is not a problem since by copying a few more tracks than needed, we err on the side of caution.
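The inclusive OR itself is simple bit arithmetic. The sketch below uses Python integers as track bitmaps (the bit patterns are the ones shown on the slide; the helper function is invented for illustration).

```python
# Inclusive OR of the two site-B SDDF bitmaps: one bit per track.
current  = 0b1001011001001   # tracks changed since the last bitmap reset
previous = 0b0001011100100   # tracks changed in the prior interval
owed_to_c = current | previous   # a track is owed if EITHER bitmap saw a change

def tracks_to_copy(bitmap):
    """List the track numbers flagged in the bitmap (bit 0 = track 0)."""
    return [i for i in range(bitmap.bit_length()) if bitmap >> i & 1]

print(bin(owed_to_c))
print(tracks_to_copy(owed_to_c))
```

Because the OR can only add tracks, never drop them, the result may over-mark slightly, which matches the "err on the side of caution" behavior described above.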
If needed, MSC cleanup is run at site C.
If a business decision is made to run production at site B, the RDF devices at site B are turned into R1 volumes and paired with corresponding R2 volumes at site C. An RDF establish now copies the invalid tracks from site B to site C.
If the decision is to run at site C, the devices at B are turned into R2s and those at site C into R1s. An RDF restore updates the C site with tracks owed by B to C.
Concurrent STAR – Rolling Disaster
[Diagram: Site A (R11) replicates via SRDF/S to Site B and via SRDF/A to Site C; the first failure breaks the A-to-B link, the second failure takes down Site A; the 2 SDDF bitmaps at B and the 1 SDDF bitmap at C are combined with an inclusive OR (IOR)]
When link to Site B fails
– SDDF bitmaps at Site B are frozen since data flow to B stops
– SDDF bitmap and Token Counter at C (not shown in diagram) activated at SRDF/A cycle boundary
– Token counter at Site C counts elapsed cycles since activation
After failure of Site A, inclusive OR of both SDDF bitmaps at B and bitmap at C used to resolve track differences between B and C
The failure described here is often referred to as a rolling disaster, where the first failure is succeeded by a second one. Here, the first fault disrupts the links between A and B. This causes the synchronous consistency group to trip, leaving the data at site B consistent. The SDDF sessions at site B are frozen for later conversion to invalid tracks. Data processing continues at site A, and site C continues to get updated.
When the synchronous link fails, the SDDF session at site C is activated on a cycle boundary just prior to the next cycle switch. This SDDF session records new writes coming into site C. Additionally, a token counter is started at C. It starts counting the number of cycle switches after activation.
Shortly after the first failure, the primary site fails, causing data transmission to site C to stop. If the second failure occurs more than two SRDF/A cycle switches after the first failure (as recorded by the token counter), site C will be more current than site B.
A Star query after the final primary site failure indicates which side is more current.
An inclusive OR between the two SDDF bitmaps at site B and an inclusive OR between the resulting bitmap and the bitmap at site C, creates the invalid track table that must be resolved when the two sides are synchronized.
If data at site C is more current, the synchronization should cause tracks to flow from C to B. If the token counter indicates that B is more current than C, new data flows from B to C.
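The currency decision driven by the token counter reduces to a one-line rule. The site names and the function itself are illustrative; the more-than-two-cycle threshold comes from the text.

```python
# Which side is more current after a rolling disaster?
def more_current_site(cycle_switches_since_trip):
    """Site C pulls ahead only after 2+ SRDF/A cycle switches post-trip."""
    if cycle_switches_since_trip > 2:
        return "C"   # restore: tracks flow from C back to B
    return "B"       # establish: tracks flow from B to C

print(more_current_site(1))   # second failure came quickly: B is more current
print(more_current_site(5))   # A kept updating C for a while: C is more current
```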
SDDF Session Usage in Cascaded Star
[Diagram: Cascaded Star: Site A (R1) replicates via SRDF/S to Site B (R21), which replicates via SRDF/A to Site C (R2), with a passive link between A and C]
Site A – RDF daemon manages 2 Active SDDF sessions per each device
Site C – 1 Inactive SDDF Session per each device
When Star Protection is enabled, two SDDF sessions are created at site A and one SDDF bitmap is created at site C. The bitmaps at site A are always active during normal Star operation. They are alternately marked and cleared after every two or more SRDF/A MSC cycles elapse between sites A and C.
The bitmap at site C stays inactive during normal Star operation.
Cascaded STAR – When Site B Fails
[Diagram: Site B (R21) fails; the 2 SDDF bitmaps at Site A are combined with an inclusive OR (IOR) against the 1 SDDF bitmap at Site C]
If Sites A and C are reconfigured in concurrent mode:
Inclusive OR of 2 SDDF bitmaps at A used to resolve track differences between A and C at reconfiguration time
The failure of site B in Cascaded Star is a major failure, since reconfiguration from cascaded to concurrent Star must be undertaken in order to provide remote data protection. When the link between A and C is activated, the SDDF bitmaps at site A are used to determine the invalid tracks that must be moved from A to C.
Cascaded STAR – When Site A Fails
[Diagram: Site A (R1) fails; Site B (R21) and Site C (R2) remain, connected by the SRDF/A link, with the passive link unused]
SDDF sessions at Site A frozen
Since B and C already have a track table relationship, there is no need for SDDF sessions
If the workload site fails in a cascaded star environment and the decision is made to switch production to either target site, the SDDF sessions are not needed because the differences between the B and C sites are recorded in the track tables.
Half Delete SRDF Pair
Requires 5x71 or later version of Enginuity
Deletes half of the RDF pair relationship
Can be used to dissolve RDF relationships if partner device is unavailable
RDF pair relationship shows up as a half pair
[Diagram: Normal Configuration, Suspended State: R1 paired with R2; after half delete, the R1 becomes a regular 2-Way Mirrored device and the R2 retains its identity]
A half delete operation can be executed on a dynamic RDF pair using SYMCLI commands. After the half delete command is executed, the device in the Symmetrix on the left turns into a regular device, and the one on the right retains its identity. A SYMCLI query shows it as a half pair. The SRDF pair state must be suspended, failed over, split or partitioned before a half delete can be performed.
The half delete of SRDF pairs is used by SRDF/Star in a disaster situation.
The command is also available for general use, but only in special cases. If an existing RDF relationship is rendered null and void by the physical removal of one of the Symmetrix arrays, without the termination of the SRDF relationships, the half delete command can be used to dissolve remaining RDF volumes.
Do not use the half delete command when both arrays in an RDF relationship have visibility to each other.
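The precondition on the pair state can be checked with a trivial lookup. The function is invented for illustration; the list of valid states comes from the text above.

```python
# A half delete is only permitted from these SRDF pair states.
VALID_STATES = {"Suspended", "Failed Over", "Split", "Partitioned"}

def can_half_delete(pair_state):
    """True if the pair state allows a half delete to proceed."""
    return pair_state in VALID_STATES

print(can_half_delete("Suspended"))      # permitted
print(can_half_delete("Synchronized"))   # not permitted: links are active
```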
Half Swap
Changes personality of one side of an RDF relationship
After a half swap
– the RDF pair configuration for the device shows up as “Duplicate”, OR
– the RDF pair configuration shows up as normal if the pair state was “Duplicate”
[Diagrams: BEFORE HALF SWAP – Normal Configuration, Suspended State (R1 paired with R2); AFTER HALF SWAP – Duplicate Configuration (R1 paired with R1). BEFORE HALF SWAP – R1 and R1 in duplicate pair state; AFTER HALF SWAP – Normal Configuration, Suspended State (R1 paired with R2)]
The half swap operation changes the personality of one SRDF volume, irrespective of whether the other RDF volume is visible or not.
There are two uses for the half swap command while reconfiguring devices during a Star action:
1. Sometimes during a site reconfiguration, an R2 device is half swapped so it becomes an R1 device. This makes the pair relationship “duplicate” since there are now two R1 devices in the pair pointing at each other.
2. At other times in the course of a site reconfiguration, a half swap converts a duplicate device pair into a normal device pair by turning one member of the duplicate pair from an R1 to an R2.
Special Create Pair Options
Note: These commands are not available to users
Two forms of RDF pair creation used only in SRDF/Star
createpair with “nocopy” option
– Creates a dynamic RDF pair without copying data
– Declares both sides equal without any tracks being moved
– Used during a planned switch from one workload site to another
createpair with “refresh” option
– Uses SDDF sessions at synchronous and asynchronous targets to perform an incremental resynchronization
The two functions described on this slide were created for the purpose of Star and are not available to users.
Creating a dynamic RDF pair without copying data would risk data corruption if it were not 100% certain that the devices in the pair did, in fact, contain identical data. This function is used in the case of a planned workload site switch, when applications are halted and all three sites are made equal prior to the switch.
Creation of a dynamic RDF pair with an incremental refresh is only possible based on the SDDF bitmaps at the synchronous and asynchronous target sites. This is the key behind SRDF/Star’s ability to switch workload sites without a full refresh.
Module Summary
Key points covered in this module:
– Dynamic, concurrent, and cascaded RDF devices and groups
– SRDF/Synchronous and SRDF/Asynchronous
– Synchronous SRDF consistency groups managed by the SRDF daemon
– Cycle switching in an SRDF/A Multi-session Consistency (MSC) environment
– MSC Cleanup
– Special use of SDDF sessions in tracking changes
– Half delete, half swap, and special pair creation commands
These are the key points covered in this module. Please take a moment to review them.
Using SRDF/Star
Upon completion of this module, you will be able to:
Describe Symmetrix parameters required to run SRDF/Star
List the host software components needed for Star
Explain concurrent SRDF/Star operations
Explain cascaded SRDF/Star operations
The objectives for this module are shown here. Please take a moment to read them.
Minimum Hardware Requirements
3 Symmetrix DMX systems, one for each site
2 SRDF director boards per Symmetrix
Equal number and size of SRDF devices for each Symmetrix at each site
Primary control host with Solutions Enabler at each site from where SRDF/Star will be managed
[Diagrams: Concurrent SRDF/Star Configuration – Symmetrix 1 at Site A (R11) replicates via SRDF/S to Symmetrix 2 at Site B (R2) and via SRDF/A to Symmetrix 3 at Site C (R2), with a passive link between B and C. Cascaded SRDF/Star Configuration – Symmetrix 1 at Site A (R1) replicates via SRDF/S to Symmetrix 2 at Site B (R21), which replicates via SRDF/A to Symmetrix 3 at Site C (R2), with a passive link between A and C]
At EMC, people sometimes use the words “director” and “director board” interchangeably. For the sake of the discussion below, a director board has 8 ports managed by 4 microprocessors. When configured for SRDF, it is recommended practice to assign one microprocessor per RDF connection, leaving the other port on that microprocessor open.
When running concurrent RDF as Star does, it is recommended that at least 2 of the 4 microprocessors on each 8-port director be dedicated to SRDF traffic. This is in accordance with Symmetrix performance engineering guidelines that SRDF/A and SRDF/S traffic should not be allowed to run on the same microprocessor.
Two director boards are required to guarantee redundancy.
In addition to RDF devices, it is highly recommended that an equal number of Clone capable devices (e.g., BCVs) be provisioned for each Symmetrix target to which production could be switched.
It is recommended that redundant control hosts are run at each site from where SRDF/Star will be managed. Redundant hosts allow for redundant RDF daemons, which are necessary to avoid a single point of failure.
Though not shown on this diagram, it is assumed that each site to which production can be switched, is provisioned with hosts capable of running production applications.
System Requirements
Solutions Enabler V6.2 or higher
– Need 6.5 for Cascaded SRDF/Star
Minimum Enginuity level – 5671 or 5771– Need 5773 at Sites A and B for Cascaded Star
Symmetrix level settings– Switched RDF Configuration State is Enabled– Concurrent RDF Configuration State is Enabled– Dynamic RDF Configuration State is Enabled– Concurrent Dynamic RDF Configuration is Enabled– RDF Data Mobility Configuration State is Disabled– RDF Directors are Fibre-Switched or GigE
SRDF Group Level Settings– Prevent Auto Link Recovery is Enabled– Prevent RAs Online Upon Power On is Enabled
The system requirements for Star are listed on this slide. Please take a moment to review them. The information related to the Symmetrix and the SRDF groups can be verified by the use of SYMCLI commands.
SRDF/Star Licensing and Hardware

                         Workload Site           Synchronous Target      Asynchronous Target
Required Hardware        DMX running 5x71/5773   DMX running 5x71/5773   DMX running 5x71
                         2+ remote adapters      2+ remote adapters      2+ remote adapters
                         (GigE or FC)            (GigE or FC)            (GigE or FC)
Recommended Hardware     -                       Available BCV Cap.      Available BCV Cap.
                                                 for all R2s             for all R2s
Required Licenses                                (if failing over        (if failing over
                                                 to this site)           to this site)
                         SE Base license         SE Base license         SE Base license
                         SRDF/Star license       SRDF/Star license       SRDF/Star license
                         SRDF/S license          SRDF/S license          -
                         SRDF/A license          SRDF/A license          SRDF/A license
                         SRDF/CG                 SRDF/CG                 SRDF/CG
Optional (Workload) /
Recommended Licenses     TimeFinder/Clone        TimeFinder/Clone        TimeFinder/Clone
(Targets, if failing
over to this site)       TimeFinder/CG           TimeFinder/CG           TimeFinder/CG
Concurrent Star can be run in environments capable of running Enginuity 5x71. Cascaded Star can be run in environments that support 5773 or later revisions of microcode. If Cascaded Star is never used, the SRDF/CR licenses are not needed.
SRDF/Star for Open Systems - 54
GNS in the SRDF/Star Environment

GNS Advantages
– Consistent common composite group definition maintained for all management hosts
– Reduces possibility of human error when there are several management hosts at each location

GNS Disadvantages
– Concurrent group definitions not propagated over SRDF links

SYMAPI_USE_GNS=ENABLE in options file to start GNS

Command to determine if database is in GNS:
# symcfg -db
...
GNS State : Enabled
When planning an SRDF/Star implementation, a consideration is the use of Global Naming Services. GNS can be started by setting the value of SYMAPI_USE_GNS to ENABLE in the options file on the management host. This file is located on Windows hosts at \Program Files\EMC\SYMAPI\config\options. It is located in /var/symapi/config/options on Unix hosts.
The use of GNS in an SRDF/Star environment can simplify management tasks when there are several management hosts at each site cooperatively managing Star. In such a case, it greatly reduces the chance of errors caused by someone changing a CG definition on one management host, but not on the other.
GNS cannot propagate concurrent CG or DG definitions across SRDF links. Using GNS does not obviate the need for copying the Star definitions file to the other management hosts.
If GNS is enabled, the RDF daemon must be explicitly started on the management host. Details on how to do this are provided on a later page.
More information on GNS is available in the Array Management CLI product guide.
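A minimal sketch of the steps the notes describe, assuming a Unix management host (the options-file path is per the slide; storrdfd is the RDF daemon's service name in Solutions Enabler):

```shell
# Enable GNS in the SYMAPI options file on the management host
echo "SYMAPI_USE_GNS = ENABLE" >> /var/symapi/config/options

# With GNS enabled, start the RDF daemon explicitly
stordaemon start storrdfd

# Confirm GNS is active in the SYMAPI database
symcfg -db
```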
SRDF/Star for Open Systems - 55
Star Options and Internal Definitions Files
Star Options File
– Created with text editor on a host at Workload site
– Used for:
  - Defining site names for the 3 sites
  - Specifying parameters that govern SRDF/Star behavior
  - Creating the SRDF/Star internal definitions file with help of symstar setup command

Star Internal Definitions File
– Created from Star options file
– Copied from Workload site host to other management hosts
– Used by the symstar command
– Should not be modified by user
The Star options file is created by the user with a text editor. It specifies parameters shown on the next page. The setup command translates the contents of the options file and writes them into the Star internal definitions file. This file is used by the symstar command for all its actions. The internal definitions file should not be modified by users. Any changes should be instituted through the options file.
SRDF/Star for Open Systems - 56
Action Categories for SRDF/Star

Normal Operation
– Used for configuration setup
– Connecting, protecting, and enabling configuration
– Isolation of sites

Transient Fault Operation
– Caused by temporary loss of network connectivity or of either target site
– Reset the environment

Unplanned Switch Operation
– Caused by Workload Site fault
– Cleanup
– Unplanned switch, keep local/remote data

Planned Switch Operation
– Purposeful switching of workload to another site
– Halt, Halt -reset
– Planned switch
During normal operation of Star, the list of activities consists of configuring and setting up Star. Connecting, Protecting and Enabling are the steps required to achieve Star protection. Site isolation is available to temporarily isolate a site for maintenance purposes.
A temporary failure caused by an outage of the network or of either remote array is classified as an SRDF/Star transient fault. It does not disrupt the production Workload site, and only requires remote site recovery and protection procedures executed at the Workload site.
A fault caused by a Workload site loss is classified as a disaster. A disaster necessitates an unplanned switch of the workload to either of the remaining remote sites. Even after the move, disaster protection is available because of the asynchronous SRDF relationship created between the remaining remote sites.
A planned switch operation moves the production workload from one site to another in a controlled procedure. It is typically undertaken when returning to the original Workload site after a disaster had forced a move of production activity to one of the target sites.
This type of operation assumes and enforces the following behavior: the customer stops the workload at the current production site, drains and synchronizes both remote sites, halts the system, and then "switches" the workload to either of the remote production sites.
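The planned-switch behavior described above maps onto symstar actions roughly as follows. This is a sketch using this course's composite group and site names; confirm the exact syntax against the symstar manual page for your Solutions Enabler version:

```shell
# Stop the application workload at the current production site first.

# Halt: disable Star consistency protection, stop writes to the R1s,
# and drain invalid tracks and SRDF/A cycles so that all three sites
# hold the same data
symstar -cg ConcStar halt

# Switch the workload to the synchronous target in a planned fashion
symstar -cg ConcStar switch -site ConcStarB

# From the new workload site, connect, protect, and enable to
# re-establish Star protection
```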
SRDF/Star for Open Systems - 57
Actions Used for Normal Operation

setup (workload)
– Reads/verifies the CG definition and options file– Generates the SRDF/Star internal definition file
buildcg (workload or target)– Reads the internal definition file and creates a composite group suitable for the site
connect (workload)– Performs SRDF reconfiguration and starts data flow in Adaptive Copy Disk Mode
disconnect (workload or target)– Suspends SRDF data flow– Places the path in Adaptive Copy Disk Mode
protect (workload)– Transitions to correct mode (sync/async) and enables consistency protection
unprotect (workload)– Deactivates consistency protection and transitions target to Adaptive Copy Disk Mode
enable (workload)– Provides SRDF/Star consistency protection and activates SDDF sessions
disable (workload)– Deactivates SRDF/Star consistency protection and optionally deletes SDDF sessions
isolate (workload)– Isolates a target site and makes the R2 devices R/W enabled
A brief description of the actions used in Star are provided on this and the next two slides.
Unlike RDF commands, which can be issued from the source or target site, Star actions can only be initiated from a particular site. Most actions are allowed from the workload site only, as indicated by the word "workload" in parentheses next to the command.
The disconnect action can be issued from either the source or the target site.
The buildcg action is a utility that assists the user in creating the Star composite group based on the information contained in the internal definition file created at the time of setup. Based on the site from which it is run, buildcg can be used both from the workload or the target sites.
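Taken together, the normal-operation actions above are issued in this order when first bringing up Star protection (the composite group and site names are the examples used later in this course):

```shell
# At the Workload site: generate the internal definition file
symstar -cg ConcStar setup -options ConcStar.opt

# On each target-site management host, after copying the definition file
symstar -cg ConcStar buildcg -site ConcStarB
symstar -cg ConcStar buildcg -site ConcStarC

# Back at the Workload site: connect, protect, and enable
symstar -cg ConcStar connect -site ConcStarB
symstar -cg ConcStar connect -site ConcStarC
symstar -cg ConcStar protect -site ConcStarB
symstar -cg ConcStar protect -site ConcStarC
symstar -cg ConcStar enable
```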
SRDF/Star for Open Systems - 58
Transient Fault and Informational Commands

reset (workload)
– Cleans up internal metadata and Symmetrix cache at remote site after temporary problem (such as loss of connectivity) has been resolved
query (workload or target)– Displays status of the SRDF/Star configuration– Last action performed from that management host
show (workload or target)– Displays SRDF/Star internal definition file contents
– Options selected
– Symmetrix resources
– Optionally, all devices in the configuration
The reset action is used after the loss of a target. It performs MSC Cleanup, if needed. It can only be run from the workload site. The informational commands can be run from any site.
SRDF/Star for Open Systems - 59
Actions for a Planned or Unplanned Switch

cleanup (target)
– Performs MSC Cleanup at the asynchronous target– Allows for “Gold” copy capture prior to resynchronization
switch (target)– Switches workload to a remote site, either synchronous or asynchronous
halt (workload or target)
– Disables Star consistency protection
– Stops application workload from writing to R1 devices
– Allows all invalid tracks and cycles to drain, resulting in all 3 sites having the same data

halt –reset (workload)
– Write enables the R1 devices at a halted workload site
reconfigure (workload)– Changes the Star setup from cascaded to concurrent and vice versa
cleanup is performed on the asynchronous site if the target site is in PathFail;CleanReq state. While reset and some symrdf commands perform MSC Cleanup without requiring the user to issue an explicit cleanup, it is always a good idea to issue a cleanup command after the failure of the asynchronous site.
The switch action is executed to move the workload to either target site. It is used both for a planned as well as an unplanned workload switch.
reconfigure allows the changing of a star configuration from concurrent to cascaded and vice versa.
SRDF/Star for Open Systems - 60
BCV Usage in Star

symstar command does not manage BCVs
BCVs are recommended at target location
BCVs are used to preserve a consistent data copy before resynchronizing a target site in the process of recovering from a link or site failure
Although the symstar command does not manage BCVs at target sites, BCVs are an important piece of Star operation. The purpose of BCVs is to preserve a gold copy of the data at the target site after:
– Loss of connectivity between the source and either target
– Return of connectivity between the source and that target
Using BCVs at the target site preserves a good data copy to guard against a source site disaster during resynchronization.
BCV operations in Star must be managed by the user.
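As a sketch of the user-managed BCV handling described above, using TimeFinder/Mirror syntax against the target-site composite group (this assumes BCVs have already been associated with and established against the R2 devices; verify the options against your TimeFinder documentation):

```shell
# On the target-site management host, before resynchronization begins:
# split the BCVs consistently to preserve a gold copy of the
# point-in-time consistent R2 data
symmir -cg ConcStar split -consistent

# After resynchronization completes and Star protection is re-enabled,
# re-establish the BCVs so a gold copy can be taken at the next incident
symmir -cg ConcStar establish
```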
SRDF/Star for Open Systems - 61
Lesson 1
Upon completion of this lesson, you will be able to:
Describe concurrent Star operations
The objective for this lesson is shown here.
SRDF/Star for Open Systems - 62
SRDF/Star Site Designations
Workload Site: ConcStarA
Synchronous Target Site: ConcStarB
Asynchronous Target Site: ConcStarC
[Diagram: Host I/O enters the Workload Site (Symmetrix ID 000190300992, R1), which replicates to the Synchronous Target Site (Symmetrix ID 000190300994, R2) and the Asynchronous Target Site (Symmetrix ID 000190103734, R2). Labeled RDF groups are 5, 6, 20, 21, 7, and 8, with recovery links between the two target sites.]
The Star documentation uses the names Workload Site, Synchronous Target, and Asynchronous Target to identify the three sites participating in Star. These names are functional descriptors of the sites and not rigidly tied to a geographical location.
The names ConcStarA, ConcStarB, and ConcStarC refer to this specific Star configuration. At customer sites these names could correspond to geographical locations, e.g., New York, New Jersey, and London. In this Star configuration, the Workload Site could move to ConcStarB or ConcStarC after a switch operation.
The Symmetrix IDs are displayed in the diagram. The RDF group number on Symmetrix 92 connecting it to Symmetrix 94 is 5. The RDF group number on Symmetrix 92 connecting it to Symmetrix 34 is 6. The recovery groups between Symmetrix arrays 34 and 94 must be empty. They are used in the event of a workload site disaster.
SRDF/Star for Open Systems - 63
Star Options File
The Star options file is created on host1 (at Workload site)
It names the 3 sites and assigns values to SRDF/Star parameters
Example:
# cat ConcStar.opt
SYMCLI_STAR_WORKLOAD_SITE_NAME = ConcStarA
SYMCLI_STAR_SYNCTARGET_SITE_NAME = ConcStarB
SYMCLI_STAR_ASYNCTARGET_SITE_NAME = ConcStarC
SYMCLI_STAR_ADAPTIVE_COPY_TRACKS = 30000
SYMCLI_STAR_ACTION_TIMEOUT = 1800
SYMCLI_STAR_TERM_SDDF = No
SYMCLI_STAR_ALLOW_CASCADED_CONFIGURATION = Yes
While the names ConcStarA, ConcStarB, and ConcStarC are used for this example, names at customer sites could be chosen according to other criteria, such as geographic locations: New York, New Jersey, and London.
The adaptive copy tracks value is the number of invalid tracks that must accumulate before transitioning from Adaptive Copy mode into synchronous or asynchronous mode. The default is 30,000.
The action timeout value is the maximum time (in seconds) that the system waits for a particular condition before returning a time-out failure. The wait condition may be one of:
– Time to achieve Star consistency after a symstar enable command
– Time for a site to reach protected state after a symstar protect command
The default is 1800 seconds (30 minutes). The smallest value allowed is 300 seconds (5 minutes).
Setting SYMCLI_STAR_TERM_SDDF to Yes terminates SDDF sessions any time Star is disabled. Setting this option to No deactivates the SDDF sessions instead of terminating them. For performance reasons, the default is No.
The last option refers to whether cascaded star configuration should be allowed.
SRDF/Star for Open Systems - 64
Create CG and Define SRDF Group Names

Create a composite group named ConcStar whose consistency will be managed by the RDF daemon:
symcg create ConcStar -type RDF1 -rdf_consistency

Add all concurrent RDF devices belonging to Groups 5 and 6 to the composite group called ConcStar (adding devices from RDFG 5 includes devices in RDF Group 6):
symcg -cg ConcStar addall dev -sid 92 -rdfg 5

Assign the name ConcStarB to the devices belonging to RDFG 5:
symcg -cg ConcStar set -name ConcStarB -rdfg 92:5

Assign the name ConcStarC to the devices belonging to RDFG 6:
symcg -cg ConcStar set -name ConcStarC -rdfg 92:6

In the event of failure, devices from Group 5 on Symm 92 will be inherited by Group 20 on Symm 94:
symcg -cg ConcStar set -recovery_rdfg 20 -rdfg 92:5

In the event of failure, devices from Group 6 on Symm 92 will be inherited by Group 21 on Symm 34:
symcg -cg ConcStar set -recovery_rdfg 21 -rdfg 92:6
[Diagram: R11 devices at site A (Symm 92) pair with R2 devices at site B (Symm 94) over RDF group 5 and at site C (Symm 34) over group 6; recovery groups 20 (on Symm 94) and 21 (on Symm 34) link the target sites, and groups 7 and 8 also appear at the target arrays.]
This graphic illustrates one group of devices that belong to SRDF groups 5 and 6. Group 5 refers to the RDF group which is attached to the Synchronous target. Group 6 connects the same devices to the asynchronous target.
The commands shown create a composite group called ConcStar and place all devices in the RDF groups 5 and 6 in that composite group.
The Recovery group 20 at the Synchronous Target site connects to group 21 at the Asynchronous target site. It must be connected but not contain any devices. The assignment of the recovery group numbers tells SRDF/Star which SRDF groups are used between the Synchronous and Asynchronous targets to communicate with each other in the event of a Workload site failure.
The commands specify that, in the event of a failure:
– The RDF group 20 at the Synchronous site ConcStarB inherits devices in the RDF group 5 in ConcStarA.
– The RDF group 21 at the Asynchronous site ConcStarC inherits devices in the RDF group 6 in ConcStarA.
Since the same devices belong to groups 5 and 6 in ConcStarA, it follows that group 20 in ConcStarB is connected to group 21 in ConcStarC.
SRDF/Star for Open Systems - 65
Perform Setup and Copy File
[Diagram: site ConcStarA (R11) is connected to ConcStarB and ConcStarC (R2) via RDF groups 5 and 6; recovery groups 20 and 21 link the target sites.]

Command issued to set up Star at Workload site ConcStarA:
symstar -cg ConcStar setup -options ConcStar.opt
This creates a Star internal definition file with the CG name ConcStar in the directory:
- /var/symapi/config/STAR/def (Unix)
- C:\Program Files\emc\symapi\config\STAR\def (Windows)
ConcStar is then copied to management hosts at sites ConcStarB and ConcStarC
Command issued on management host at Synchronous Target ConcStarB:
symstar -cg ConcStar buildcg -site ConcStarB

Command issued on host at Asynchronous Target ConcStarC:
symstar -cg ConcStar buildcg -site ConcStarC
The setup command reads and verifies the CG definition and options file. If the Enginuity, Solutions Enabler, and Symmetrix pre-requisites have not been met, the setup command fails. A successful execution of the setup command generates the SRDF/Star internal definition file.
This internal definition file resides on the host at the Workload site in the directory locations shown on this slide. This file should be copied manually from the host on the source site to the management hosts on the target site(s) to the same directory location.
The buildcg command can be issued at each remote site host to which the internal Star definition file, created in the setup state, is copied. This command builds a matching R2 composite group at each target site. This composite group is used in the event of failure.
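On Unix hosts, the manual copy could look like this (the hostnames hostB and hostC are hypothetical; any file-transfer method that preserves the path works):

```shell
# From the Workload-site host, push the internal definition file to the
# target-site management hosts, into the same directory on each
scp /var/symapi/config/STAR/def/ConcStar hostB:/var/symapi/config/STAR/def/
scp /var/symapi/config/STAR/def/ConcStar hostC:/var/symapi/config/STAR/def/
```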
SRDF/Star for Open Systems - 66
Connect Both Target Sites
symstar –cg ConcStar connect –site ConcStarC
[Diagram: each connect action transitions a target site from the Disconnected to the Connected state; ConcStarA (R11) connects to ConcStarB over RDF group 5 and to ConcStarC over group 6, with recovery groups 20 and 21 between the targets.]
Each connect command
• Performs the commands to reconfigure the SRDF devices
• Establishes the SRDF devices in Adaptive Copy Disk Mode
• Transitions the site in question to a ‘Connected’ state
symstar –cg ConcStar connect –site ConcStarB
The next step is to connect the Synchronous and Asynchronous target sites using the connect command. If this command is executed at the beginning of a new Star setup, the connect command begins to establish the SRDF pairs in Adaptive Copy Disk mode if synchronization is necessary.
Later on, we will see that if the Disconnected state is reached from the PathFail state after a failure or from a Halted state, the connect command reconfigures the RDF devices (if needed), and then starts the adaptive copy synchronization.
SRDF/Star for Open Systems - 67
Star Query After Connect

# symstar query -cg ConcStar
Site Name                      : ConcStarA
...
Workload Site                  : ConcStarA
1st Target Site                : ConcStarB
2nd Target Site                : ConcStarC
Composite Group Name           : ConcStar
Composite Group Type           : RDF1
Workload Data Image Consistent : Yes
System State:
{
  1st_Target_Site : Connected
  2nd_Target_Site : Connected
  STAR            : Unprotected
}
...
2nd Target Site Information:
{
  Source Site Name            : ConcStarA
  Target Site Name            : ConcStarC
  RDF Consistency Capability  : MSC
  RDF Consistency Mode        : NONE
  Site Data Image Consistent  : No
...
An excerpt from the Star query command shows the state of the configuration. The 1st target site in the query refers to ConcStarB and the 2nd target site refers to ConcStarC. Both sites are in the connected state.
The ellipsis represents truncation or omission of output.
Another important piece of information is available in the section under 2nd target information. Here we see that ConcStarA is the source for ConcStarC, the asynchronous target. This means that this is a concurrent Star configuration.
If this had been a cascaded star configuration, the source for the asynchronous target would have been ConcStarB.
SRDF/Star for Open Systems - 68
Protect Both Target Sites
The protect command will:
• Set SRDF mode to synchronous for site B, asynchronous for C
• Enable SRDF/S Consistency Protection for site ConcStarB
• Enable MSC protection for site ConcStarC
• Transition each site to the ‘Protected’ state
[Diagram: each protect action transitions a target site from the Connected to the Protected state; ConcStarA (R11) is connected to ConcStarB over RDF group 5 and to ConcStarC over group 6, with recovery groups 20 and 21 between the targets.]
symstar –cg ConcStar protect –site ConcStarC
symstar –cg ConcStar protect –site ConcStarB
The protect command issued on the Synchronous target site first checks to see if the number of invalid tracks is lower than the threshold specified in the SYMCLI_STAR_ADAPTIVE_COPY_TRACKS parameter in the Star Options file.
If the invalid track count is lower than the specified threshold, the command switches the SRDF mode to synchronous and enables RDF-ECA consistency.
If the number of invalid tracks is above the specified threshold, the protect command waits for the invalid track count to fall below the threshold before it executes.
The protect command issued on the Asynchronous target site switches the SRDF mode to asynchronous and enables multi-session consistency if the invalid track count between the Workload site and Asynchronous target site is less than the invalid track count specified in the option file. Otherwise, it waits for the invalid track count to fall below that threshold and then switches the SRDF mode and enables consistency.
SRDF/Star for Open Systems - 69
Star Query After Protect

# symstar query -cg ConcStar
Site Name                      : ConcStarA
...
Workload Site                  : ConcStarA
1st Target Site                : ConcStarB
2nd Target Site                : ConcStarC
Composite Group Name           : ConcStar
Composite Group Type           : RDF1
Workload Data Image Consistent : Yes
System State:
{
  1st_Target_Site : Protected
  2nd_Target_Site : Protected
  STAR            : Unprotected
}
As seen earlier, the excerpt from the Star query command shows the state of the configuration: both target sites are now in the Protected state, while Star itself is still Unprotected.
SRDF/Star for Open Systems - 70
Enable SRDF/Star Protection
symstar -cg ConcStar enable
• Creates and activates SDDF sessions at the target sites
• RDF daemon "ties together" the sync and async consistency
• Target device state allows for future differential resynchronization
• Transitions the Star environment to a ‘Star Protected’ state
[Diagram: SRDF/Synchronous replication from ConcStarA to ConcStarB with synchronous consistency group protection, SRDF/Asynchronous replication to ConcStarC with Multi-Session Consistency group protection, SDDF sessions at all three sites, and SRDF/A recovery links between the target sites; the enable action transitions the environment from Protected to Star Protected.]
The enable command creates the SDDF sessions at the Synchronous and Asynchronous target sites and activates sessions at the Synchronous target. Once the sessions are set up, it becomes possible to differentially resynchronize the Synchronous and Asynchronous target sites and Star protection is achieved.
SRDF/Star for Open Systems - 71
Query After Enabling Star

# symstar query -cg ConcStar
Site Name                      : ConcStarA
...
Workload Site                  : ConcStarA
1st Target Site                : ConcStarB
2nd Target Site                : ConcStarC
Composite Group Name           : ConcStar
Composite Group Type           : RDF1
Workload Data Image Consistent : Yes
System State:
{
  1st_Target_Site : Protected
  2nd_Target_Site : Protected
  STAR            : Protected
}
Last Action Performed : Enable
Last Action Status    : Successful
This query output shows that Star protection is enabled. Consequently, differential resynchronization between ConcStarB and ConcStarC is possible in the event of a disaster.
SRDF/Star for Open Systems - 72
State Flow Diagram: Transient Fault
After a ‘PathFail’ condition occurs, it is recommended to capture the “gold copy” after the reset and prior to the connect command.

Firm lines indicate user actions; dotted lines indicate fault occurrences.
[State flow: Disconnected -> (connect) -> Connected -> (protect) -> Protected -> (enable) -> Star Protected. A transient fault from any of the protected states leads to PathFail; reset returns the failed site to Disconnected.]
The next section deals with the events following a transient fault. A transient fault does not disrupt the production workload site. Thus, remote site recovery and protection procedures can be executed at the workload site.
After a transient fault, the Star state changes from Star enabled to PathFail. In the PathFail state, there is no data flow between the Workload and Target site. The data at both sites is consistent, since consistency protection was in force.
SRDF/Star for Open Systems - 73
Transient Fault: Async Link Failure
[Diagram: the SRDF/Asynchronous links between Workload Site ConcStarA and Asynchronous Target ConcStarC fail; SRDF/Synchronous replication to ConcStarB continues, and ConcStarC transitions from Protected to PathFail while the environment leaves the Star Protected state.]
Asynchronous Link Failure
• RDF daemon performs an MSC “trip”
• Dependent-write consistent image on R2s
• Star environment is now considered “tripped”
• Transitions ConcStarC to PathFail state
Let us presume that the link between the Workload site and Asynchronous target site fails. The target devices are in a consistent state and the state transitions from Star Enabled to PathFail.
If the asynchronous targets were in the middle of a cycle switch when the failure occurred, the state would transition to PathFailCleanReq. In such a case, an MSC cleanup would first be required to assure consistency on the target.
MSC Cleanup can be performed explicitly with a symstar cleanup command. The reset command also performs a cleanup in the course of cleaning up the metadata on the failed site.
SRDF/Star for Open Systems - 74
Reset Fault Condition After Link Restoration
[Diagram: after the asynchronous links are restored, reset is issued at Workload Site ConcStarA; ConcStarC transitions from PathFail to Disconnected while ConcStarB remains Protected.]
symstar -cg ConcStar reset -site ConcStarC
• Initiates cleanup if necessary
• Disables Star protection
• Disables consistency protection of failed site
• Allows for "Gold" copy capture prior to resynchronization
• Transitions ConcStarC to ‘Disconnected’
The reset action discussed earlier performs cleanup at the Asynchronous target site, if necessary. It disables Star protection of the failed Asynchronous target site, unprotects the site and transitions it to the Disconnected state. This is also the time when a BCV copy of the consistent Asynchronous target site data from the point in time of the failure should be made. Once the resynchronization between the Workload and Asynchronous site starts, the consistency is lost.
SRDF/Star for Open Systems - 75
Resume SRDF/Star Protection
Connect asynchronous (failed) site:
symstar -cg ConcStar connect -site ConcStarC

Protect asynchronous (failed) site:
symstar -cg ConcStar protect -site ConcStarC

Enable SRDF/Star protection:
symstar -cg ConcStar enable
[State flow: ConcStarC moves from Disconnected to Connected (connect) to Protected (protect); enable then returns the environment to the Star Protected state.]
Once the transient fault is remedied, the “connect”, “protect”, and “enable” actions shown earlier allow the failed site to rejoin the other two sites in Star Enabled mode.
SRDF/Star for Open Systems - 76
State Flow Diagram: Unplanned Switch
[State flow: from Star Protected, a Workload site loss (STAR Tripped) leaves the synchronous target in PathFail and the asynchronous target in PathFail or PathFail;CleanReq; cleanup transitions PathFail;CleanReq to PathFail. A switch that keeps the local data leaves both remaining sites Disconnected; a switch that keeps the remote data leaves them Disconnected/Connected.]
An Unplanned Switch Operation becomes necessary when the Workload site disaster warrants a move of the production workload to the Synchronous or Asynchronous target site. After the disaster, the system transitions from the Star Protected to the Star Tripped state. The Synchronous target transitions to the PathFail state, the Asynchronous Targets to the PathFail or PathFail; CleanReq state. Recovery operations must be undertaken to start production at one of the target sites.
The distinction between PathFail and PathFail; CleanReq states is the need for MSC cleanup. PathFail; CleanReq indicates the need for MSC cleanup at the Asynchronous Target site.
If it is decided to switch to one of the remote sites and preserve the data at that site, the switch command transitions the sites to the Disconnected state. From that state, it is necessary to issue a connect command to arrive at the Connected state.
If the decision is made to switch to one of the remote sites and preserve the data of the other remote site, the switch command transitions the sites to the Connected state.
SRDF/Star for Open Systems - 77
Workload Site Failure, Switch Variations
There are 4 main variations for Unplanned Workload Switch
– Switch to Sync Target Site, Keep Sync Target data
– Switch to Sync Target Site, Keep Async Target data
– Switch to Async Target Site, Keep Async Target data
– Switch to Async Target Site, Keep Sync Target data
Switch to site, decision based on customer needs
Keep data decision from symstar query output, telling if Async Target data is most current
Best practice is to save "Gold" copy before initiating synchronization
The decision about which site to switch to depends on the customer's infrastructure capabilities and the nature of the disaster, which may have affected the campus Synchronous Target Site.
The Asynchronous Target site can be more up-to-date than the Synchronous target in the case of a rolling disaster. This can happen if the links to the Synchronous target site fail first and the Asynchronous target continues to receive data for a while. Then the Workload site fails completely. The symstar query command can assist in making the decision about which data is most recent and must be preserved.
In the example that follows, Workload is switched to the Asynchronous target site while keeping the data of the Synchronous target site.
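The "keep which data" decision described above can be automated by inspecting the symstar query output. The sketch below assumes the field name quoted in this course's narration ("Asynchronous Target Site Data Most Current"); verify the exact label against your Solutions Enabler release:

```python
# Sketch: decide which remote site's data to preserve after a
# Workload-site loss, by parsing captured `symstar query` output.
def keep_site(query_output: str, sync_site: str, async_site: str) -> str:
    """Return the site whose data should be preserved."""
    for line in query_output.splitlines():
        if "Data Most Current" in line:
            value = line.split(":", 1)[1].strip()
            # "Yes" means the asynchronous target is more recent
            return async_site if value == "Yes" else sync_site
    # Field absent: default to the synchronous target, which is
    # normally the most current in a concurrent configuration.
    return sync_site

sample = """
Workload Site                              : ConcStarA
Asynchronous Target Site Data Most Current : No
"""
print(keep_site(sample, "ConcStarB", "ConcStarC"))  # prints ConcStarB
```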
SRDF/Star for Open Systems - 78
Workload Site Fault: Synch Site More Current
[Diagram: Workload Site ConcStarA fails; both the synchronous link to ConcStarB and the asynchronous link to ConcStarC are down. SDDF sessions at the target sites permit differential resynchronization over the SRDF/A recovery links, and each target-site host holds a copy of the Star internal definition file.]
Workload Site Failure
• Both consistency groups "trip"
  – Dependent-write consistent image on R2s
• Star environment is now considered "tripped"
• Target devices can be differentially synchronized
• ConcStarB site more current than ConcStarC site
In the example shown here, the Workload site has failed. Assume that a query command shows the value of “Asynchronous Target Site Data Most Current” is “No”. This means that the data at the Synchronous Target is more recent.
The system state is StarTripped.
The Synchronous Target site transitions to the PathFail state.
The Asynchronous target site transitions to the PathFail; CleanReq state if the failure occurred in the middle of a Delta Set switch. Otherwise, it transitions to the PathFail state.
Workload Site Disaster: Cleanup Asynch Target
Commands performed at the ConcStarC site (or ConcStarB):

symstar -cg ConcStar cleanup -site ConcStarC

• Cleans up internal metadata and Symmetrix cache at ConcStarC
• Transitions the 2nd target site from 'PathFail;CleanReq' to 'PathFail'
• Allows for "Gold" copy capture prior to resynchronization
[Diagram: cleanup issued against ConcStarC. ConcStarA (R11) and ConcStarB (R2) are in PathFail; ConcStarC (R2) transitions from PathFail;CleanReq to PathFail.]
This step is necessary only if the state of the Asynchronous target site was PathFail; CleanReq after the failure of the Workload site.
The cleanup command can be issued from either remaining site. This performs MSC Cleanup at the Asynchronous target site.
Recall that the earlier buildcg step had already created the consistency groups at both sites and the Star internal definitions file has a record of the site names and locations. After cleanup is performed, a BCV copy of the data should be taken to preserve a consistent data copy from the point of time of the failure.
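The cleanup-then-gold-copy step can be sketched as a dry-run script. The run() wrapper only echoes commands unless DRYRUN=0, and the TimeFinder device group name GoldDg is a hypothetical example, not part of the original configuration:

```shell
# Dry-run wrapper: echo commands instead of executing them unless DRYRUN=0.
run() {
  if [ "${DRYRUN:-1}" = "1" ]; then
    echo "WOULD RUN: $*"
  else
    "$@"
  fi
}

# MSC cleanup at the asynchronous target (needed only from PathFail;CleanReq)
run symstar -cg ConcStar cleanup -site ConcStarC -nop
# Split a previously established BCV pair to preserve a consistent "Gold"
# copy before resynchronization (GoldDg is a hypothetical device group)
run symmir -g GoldDg split -nop
```

Reviewing the echoed sequence before setting DRYRUN=0 gives a chance to confirm site names against the symstar query output.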
Workload Site Fault: Switch to Asynch Target
Switches production to the remote site, ConcStarC, keeping sync data:

symstar -cg ConcStar switch -site ConcStarC -keep_data ConcStarB

• ConcStarC devices swap personality to become R1s
• ConcStarB devices are reconfigured so they become R2s of ConcStarC
• ConcStarC devices are made Read-Write to the host
• Allowed from a 'PathFail;PathFail;Tripped' state
• ConcStarA is 'Disconnected' and ConcStarB is 'Connected'
In the example shown here, the Workload site is being moved from ConcStarA to ConcStarC, while retaining data in ConcStarB. Note that this represents the “Keep Remote Data” option on the Unplanned Switch state flow diagram shown earlier.
The switch command reconfigures the RDF devices at ConcStarB and ConcStarC. Since the data at ConcStarB is more recent, a differential RDF restore operation from ConcStarB to ConcStarC is undertaken. Once the restore is complete, the R1 devices at ConcStarC are enabled for reading and writing.
Workload Site Fault: Protect Synch Target
Initiate remote data protection:

symstar -cg ConcStar protect -site ConcStarB

• Waits for the invalid track count to reach the specified amount
• Sets SRDF mode to asynchronous
• Enables SRDF/A MSC Consistency Protection
• Transitions ConcStarB to a 'Protected' state
The protect command now enables MSC protection between ConcStarC and ConcStarB. Star protection is not possible because three sites are not available. Note that no reconfiguration is undertaken at site ConcStarA, which is expected to be inaccessible after a workload site disaster.
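The switch and protect steps of this unplanned recovery can be sketched together as a dry-run script (commands taken from the slides above; echo-only by default):

```shell
# Dry-run wrapper: echo commands instead of executing them unless DRYRUN=0.
run() { if [ "${DRYRUN:-1}" = "1" ]; then echo "WOULD RUN: $*"; else "$@"; fi; }

# Move the workload to ConcStarC while keeping ConcStarB's (more recent) data
run symstar -cg ConcStar switch -site ConcStarC -keep_data ConcStarB -nop
# Re-establish MSC protection from the new workload site to ConcStarB
run symstar -cg ConcStar protect -site ConcStarB -nop
```

Star protection itself cannot be re-enabled at this point, since only two of the three sites are available.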
Query After Protecting ConcStarB

# symstar -cg ConcStar query
Site Name                      : ConcStarC

Workload Site                  : ConcStarC
1st Target Site                : ConcStarA
2nd Target Site                : ConcStarB

Composite Group Name           : ConcStar
Composite Group Type           : RDF1

Workload Data Image Consistent : Yes
System State:
    {
    1st_Target_Site : Disconnected
    2nd_Target_Site : Protected
    STAR            : Unprotected
    }
The output of the query command shows that the Workload site is now at ConcStarC and that ConcStarA is disconnected. ConcStarA is now referred to as the 1st target site and ConcStarB as the 2nd target site.
Planned Switch Operation: Command Flow
[State flow: Disconnected,Disconnected -> connect -> Connected,Connected -> protect -> Protected,Protected -> enable -> Star Protected. A halt issued from any of these states leads to Halted,Halted; switch then returns the targets to Disconnected,Disconnected.]
The final example shows the steps to switch the Workload site back to the original site, ConcStarA, in a planned fashion. The key command here is halt. A planned workload switch is typically used either to move back home after the resolution of a Workload site failure or in the course of a disaster drill. All site moves are allowed as long as the sites are functional and the RDF connectivity is present.
Permitted Workload site moves include:
A to B
A to C
B to A
B to C
C to A
C to B
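The planned switch back to ConcStarA, walked through on the following slides, can be collected into one dry-run script (echo-only by default; the sequence follows the connect / halt / switch / connect / protect / enable flow above):

```shell
# Dry-run wrapper: echo commands instead of executing them unless DRYRUN=0.
run() { if [ "${DRYRUN:-1}" = "1" ]; then echo "WOULD RUN: $*"; else "$@"; fi; }

run symstar -cg ConcStar connect -site ConcStarA -nop   # resynchronize former workload site
run symstar -cg ConcStar halt -nop                      # drain data; all 3 sites identical
run symstar -cg ConcStar switch -site ConcStarA -nop    # move workload home
run symstar -cg ConcStar connect -site ConcStarB -nop   # rebuild protection from A
run symstar -cg ConcStar connect -site ConcStarC -nop
run symstar -cg ConcStar protect -site ConcStarB -nop
run symstar -cg ConcStar protect -site ConcStarC -nop
run symstar -cg ConcStar enable -nop                    # back to Star Protected
```

Applications at ConcStarC must be shut down before the halt step, as the next slides note.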
Workload Site Problem Resolved: Bring Online
Propagate data back to ConcStarA:

symstar -cg ConcStar connect -site ConcStarA

• Performs commands to reconfigure RDF devices
• Establishes RDF devices in Adaptive Copy Disk Mode
• Transitions ConcStarA to a 'Connected' state
The halt command has been explained earlier as part of a planned switch procedure. It write disables the R1 devices, then drains data from the production site to the two target sites. It therefore ensures that the data at all three sites is identical.
Continuing the example shown earlier, let us assume that the problem that caused the Workload site at ConcStarA to be shut down has been resolved. When the connect command is used at ConcStarC, the RDF volumes in ConcStarA are reconfigured so that they become concurrent targets of the R1 devices in ConcStarC. An adaptive copy synchronization is initiated between ConcStarA and ConcStarC.
Planned Switch to ConcStarA: Halt Replication
Shut down applications, then:

symstar -cg ConcStar halt

• Completely synchronizes both remote target sites
• Allows all invalid tracks and cycles to drain
• Write disables (WD) or makes Not Ready (NR) the R1 devices
• Results in all 3 sites having the same data
Next, the halt command ensures that all three sites are identical and write disables the R1 devices if they are mapped to an FA.
Query after Halt

# symstar query -cg ConcStar
Site Name                      : ConcStarC

Workload Site                  : ConcStarC
1st Target Site                : ConcStarA
2nd Target Site                : ConcStarB

Composite Group Name           : ConcStar
Composite Group Type           : RDF2

Workload Data Image Consistent : Yes
System State:
    {
    1st_Target_Site : Halted
    2nd_Target_Site : Halted
    STAR            : Unprotected
    }

Last Action Performed : Halt
Last Action Status    : Successful
The query shows that the halt was successful. The Workload site still remains at ConcStarC.
Planned Switch Back to ConcStarA
Commands entered at ConcStarA:

symstar -cg ConcStar switch -site ConcStarA

• ConcStarA devices swap personality to become R1s
• ConcStarC devices become R2s
• ConcStarA devices are made Read-Write to the host
• ConcStarB and ConcStarC are in 'Disconnected' state
The switch command now resets the RDF relationships so that ConcStarA devices have the RDF1 attribute and are concurrently connected to ConcStarB and ConcStarC. Both targets transition to the “Disconnected” state. Now, the “connect”, “protect”, and “enable” action sequence transitions the system to the Star protected state.
Lesson 2
Upon completion of this lesson, you will be able to:
Describe Cascaded Star operations
The objective for this lesson is shown here.
Differences Between Cascaded and Concurrent Star
Since the integrity of the asynchronous data depends on the data at the synchronous target:
– Connect the synchronous target before the asynchronous target
– Protect the synchronous target before the asynchronous target
– May not unprotect the synchronous target if the asynchronous target is protected
– May not connect the synchronous target if it is disconnected and the asynchronous target is protected
There are a few important differences between the normal operating conditions of Concurrent Star and Cascaded Star.
1. While connecting the sites from the Disconnected state, the synchronous site must be connected first, the asynchronous site, second.
2. Since the consistency of the asynchronous site data is dependent on the consistency of the synchronous site data, the asynchronous target can only be protected if the synchronous target is protected as well. Consequently, after the two sites have been connected, the synchronous target must be protected first.
3. While both the synchronous and asynchronous targets are in the protected state, an unprotect action on the synchronous site will not be permitted.
4. If the synchronous target is disconnected while the asynchronous target is protected (as can happen after a failure of the links between the workload site and the synchronous target), a connect action will not be permitted on the synchronous target. The asynchronous target must be tripped or unprotected before the connect with the synchronous target is allowed.
5. Since only the asynchronous site can be taken out of service without disrupting remote data protection, it is only permissible to isolate the asynchronous target from the Protected, Protected state.
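The ordering rule above can be modeled with a small guard function. The local SYNC_PROTECTED flag is illustrative only; a real script would parse symstar query output to determine the actual site states:

```shell
# Enforce Cascaded Star ordering: the synchronous target (CascStarB)
# must be protected before the asynchronous target (CascStarC).
SYNC_PROTECTED=0

protect_site() {
  case "$1" in
    CascStarB)
      SYNC_PROTECTED=1
      echo "protect $1: ok"
      ;;
    CascStarC)
      if [ "$SYNC_PROTECTED" = "1" ]; then
        echo "protect $1: ok"
      else
        echo "protect $1: refused (protect CascStarB first)"
        return 1
      fi
      ;;
  esac
}

protect_site CascStarC || true   # refused: sync target not yet protected
protect_site CascStarB           # ok
protect_site CascStarC           # ok now
```

The same pattern could guard the connect ordering, which has the identical sync-before-async constraint.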
Setting Up a Cascaded SRDF/Star Configuration

Create an R1-type composite group:

symcg create CascStar -rdf_consistency -type r1
# Add all devices in RDF group 17 to the CG
symcg -cg CascStar addall dev -rdfg 17
# Assign a name to the B location only
symcg -cg CascStar set -name CascStarB -rdfg 92:17
# Define the recovery group
symcg -cg CascStar -rdfg 92:17 set -recovery_rdfg 37
# Run Star setup
symstar -cg CascStar -options <filename> setup
[Diagram: Site CascStarA (Symm 92) R1 devices replicate via SRDF/S over RDFG 17 to Site CascStarB (Symm 94) R21 devices, which cascade via SRDF/A over RDFG 57 to Site CascStarC (Symm 34) R2 devices. A passive recovery link, RDFG 37, runs from CascStarA to CascStarC.]
The CG creation in Cascaded Star differs from Concurrent Star in a couple of ways.
First, since the source devices have only one set of connections, only the RDF group(s) connecting A to B needs to be named. Secondly, only one recovery group statement is required, because the RDF group(s) from CascStarA to CascStarC is the only one that needs to be identified as such.
Excerpt from Cascaded Star Query

# symstar query -cg CascStar
Site Name                      : CascStarA

Workload Site                  : CascStarA
1st Target Site                : CascStarB
2nd Target Site                : CascStarC

Composite Group Name           : CascStar
Composite Group Type           : RDF1

Workload Data Image Consistent : Yes
System State:
    {
    1st_Target_Site : Disconnected
    2nd_Target_Site : Disconnected
    STAR            : Unprotected
    }

2nd Target Site Information:
    {
    Source Site Name            : CascStarB
    Target Site Name            : CascStarC
    RDF Consistency Capability  : MSC
    RDF Consistency Mode        : NONE
    Site Data Image Consistent  : No
As in the case of Concurrent Star, the two sites are in the Disconnected state, because the RDF links are suspended.
Note that the second target site information displays that the source for that site is CascStarB. This indicates that this is a Cascaded Star configuration.
After setup, Cascaded Star is brought up just like Concurrent Star with use of the connect, protect, and enable commands.
The only caveats to observe are the order in which the synchronous and asynchronous target sites can be connected and protected. The sequence of commands would therefore be:
# symstar -cg CascStar connect -site CascStarB -nop
# symstar -cg CascStar connect -site CascStarC -nop
# symstar -cg CascStar protect -site CascStarB -nop
# symstar -cg CascStar protect -site CascStarC -nop
# symstar -cg CascStar enable -nop
Cascaded Star: Failure of Synchronous Links

This example using composite group CascStar makes the following assumptions:
– Links to site B have failed, i.e., the system state is: PathFail, Protected, Tripped
– Site C is still working
– A to C links still work
# symstar query -cg CascStar

Workload Site   : CascStarA
1st Target Site : CascStarB
2nd Target Site : CascStarC

System State:
    {
    1st_Target_Site : PathFail
    2nd_Target_Site : Protected
    STAR            : Tripped
    }

2nd Target Site Information:
    {
    Source Site Name : CascStarB
    Target Site Name : CascStarC
If the synchronous target of a Cascaded Star configuration goes down, either because of a link or because of a site failure, remote protection is lost. It is possible then to reconfigure Cascaded Star so the recovery links are turned into a live RDF connection between CascStarA and CascStarC. However, this configuration is considered to be a Concurrent Star configuration.
In the example shown, the composite group CascStar has experienced a loss of links to site CascStarB. Star has tripped. The CascStarB to CascStarC links are intact so the second target state is Protected.
The query indicates that the configuration is cascaded, because the source for the second target is CascStarB.
Disconnect and Reconfigure Asynchronous Target

Since the asynchronous links are still up, trip them consistently:

# symstar -cg CascStar disconnect -trip -site CascStarC -nop

Now, a query should reveal the asynchronous site in the PathFail state:

System State:
    {
    1st_Target_Site : PathFail
    2nd_Target_Site : PathFail
    STAR            : Tripped
    }

A reconfiguration activates the A to C links in a Concurrent Star configuration, leaving the C site disconnected:

# symstar -cg CascStar reconfigure -reset -site CascStarC -path CascStarA:CascStarC -nop
Note that a query after the connect and protect steps reveals that the A site is now connected to C:

# symstar -cg CascStar connect -site CascStarC -nop
# symstar -cg CascStar protect -site CascStarC -nop

System State:
    {
    1st_Target_Site : PathFail
    2nd_Target_Site : Protected
    STAR            : Unprotected
    }

2nd Target Site Information:
    {
    Source Site Name : CascStarA
    Target Site Name : CascStarC
If the synchronous target site had failed, both links would appear in the PathFail state and the disconnect –trip action would not have been necessary. In this example the asynchronous target must be tripped with a disconnect –trip command. This leaves the data at the asynchronous target in a consistent, PathFail state.
Now a reconfiguration causes the A to C links to be used for data replication. The query reveals that the second target site is receiving its data feed from site A.
Once the connection between A and B is reinstated, a connect CascStarB, protect CascStarB followed by enable CascStar will result in a Concurrent Star configuration in Star protected state.
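The reconfiguration sequence on this slide, plus the later reinstatement of site B, can be sketched as one dry-run script (echo-only by default; commands as shown above):

```shell
# Dry-run wrapper: echo commands instead of executing them unless DRYRUN=0.
run() { if [ "${DRYRUN:-1}" = "1" ]; then echo "WOULD RUN: $*"; else "$@"; fi; }

# Trip the still-live asynchronous leg consistently, then repoint it at A
run symstar -cg CascStar disconnect -trip -site CascStarC -nop
run symstar -cg CascStar reconfigure -reset -site CascStarC -path CascStarA:CascStarC -nop
run symstar -cg CascStar connect -site CascStarC -nop
run symstar -cg CascStar protect -site CascStarC -nop

# Once A-to-B connectivity is restored, return to Star protection
run symstar -cg CascStar connect -site CascStarB -nop
run symstar -cg CascStar protect -site CascStarB -nop
run symstar -cg CascStar enable -nop
```

Note that after the reconfigure step the topology is Concurrent Star, so the sync-before-async ordering constraint of Cascaded Star no longer applies.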
Reconfiguration of Star
Star can often be reconfigured from Cascaded to Concurrent and vice versa while the Workload site is running
Other than a workload failure, which requires changing of RDF personalities, cases where reconfiguration is practical are:
– Failure cases for Cascaded Star:
  1. Loss of links between A and B
  2. Loss of site B
  5. Loss of links between B and C
– Failure case for Concurrent Star:
  3. Loss of links between A and C
Other than a Workload site failure, which always requires a reconfiguration of the RDF personalities of the source and target sites, there are several failure cases in Cascaded Star and Concurrent Star where a reconfiguration of the Star setup might be desirable. The following failure cases were discussed in the introductory section:
1. Loss of Links between A and B
2. Site B failure
3. Link failure between A and C
4. Site C failure
5. Link failure between B and C
6. Site A failure
In the case of Cascaded star, a reconfiguration to Concurrent Star can allow the Workload site to continue functioning with remote data protection after failure cases 1 and 2.
A reconfiguration from Cascaded to Concurrent after failure case 5, and a reconfiguration from Concurrent to Cascaded in failure case 3, make it possible for the three sites to continue in Star protected mode despite the above-named failures.
Module Summary
Key points covered in this module:
Symmetrix parameters required to run SRDF/Star
Host software components needed for Star
Concurrent SRDF/Star operations
Cascaded SRDF/Star operations
These are the key points covered in this module. Please take a moment to review them.
Course Summary
Key points covered in this course:
Benefits of SRDF/Star over other replication technologies
Underlying technologies for SRDF/Star– Synchronous SRDF consistency groups using RDF-ECA– SRDF/A Multi-session Consistency– Special SRDF features in support of Star
Concurrent and Cascaded SRDF/Star concepts
Steps to perform:– Normal Operation– Transient Fault– Unplanned switch caused by a major outage
These are the key points covered in this training. Please take a moment to review them.
This concludes the training. Please proceed to the Course Completion slide to take the assessment.