8/8/2019 Storage - DS8K HA Best Practices_v1.9
Copyright 2007 IBM Corporation. All rights reserved.
Recommended Best Practices Considerations for High Availability on IBM System Storage DS8000 and DS6000 and IBM TotalStorage ESS
Prepared by: Cam-Thuy Do and John Sing
IBM High Availability Center of Competency
October 2007
IBM Systems and Technology Group
Copyright IBM Corporation 2007. All rights reserved.
Version 1.8
Disclaimers
Copyright 2007 by International Business Machines Corporation.
No part of this document may be reproduced or transmitted in any form without written permission from IBM Corporation.
Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change without notice. This information could include technical inaccuracies or typographical errors. IBM may make improvements and/or changes in the product(s) and/or program(s) at any time without notice.
Any statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Any reference to an IBM Program Product in this document is not intended to state or imply that only that program product may be used. Any functionally equivalent program that does not infringe IBM's intellectual property rights may be used instead. It is the user's responsibility to evaluate and verify the operation of any non-IBM product, program or service.
THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NONINFRINGEMENT. IBM shall have no responsibility to update this information. IBM products are warranted according to the terms and conditions of the agreements (e.g., IBM Customer Agreement, Statement of Limited Warranty, International Program License Agreement, etc.) under which they are provided. IBM is not responsible for the performance or interoperability of any non-IBM products discussed herein.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents or copyrights. Inquiries regarding patent or copyright licenses should be made, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.
Trademarks
IBM, IBM eServer, IBM logo, e-business logo, CICS, DB2, MQ, ESCON, Enterprise Storage Server, GDPS, IMS, MVS, OS/390, Parallel Sysplex, Redbook, Resource Link, S/390, System z9, iSeries, pSeries, xSeries, OS/400, i5/OS, System Storage, TotalStorage, VM/ESA, VSE/ESA, WebSphere, z/OS, z/VM, z/VSE, and zSeries are trademarks or registered trademarks of International Business Machines Corp. in the United States, other countries, or both.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft is a registered trademark of Microsoft Corporation in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States, other countries, or both.
IBM System Storage Enterprise Disk
DS6000: New Standard in Pricing and Packaging
DS8000: New Standard in Functionality, Performance, TCO
ESS 750 / 800
This document provides a summary of recommended High Availability best practice considerations for the DS8000, DS6000, and Enterprise Storage Server disk subsystems.
The reader is assumed to have a baseline understanding of the concepts and facilities of these products.
System Storage Enterprise Disk Practices
Configuration
RAID 5: spreads data across multiple disk drives using parity (P) and spares, thus providing redundancy (e.g. a 6+P+S array consists of six data drives, one parity drive, and one spare)
- Use RAID 5 when the goal is to use less raw storage, at the expense of a longer rebuild time if a drive fails
RAID 10: stripes data across half of the disk drives while the other half of the array mirrors the first set of disk drives
- Use RAID 10 when the goal is highest performance and/or a shorter rebuild time
- At the expense of requiring a larger amount of raw storage
Exploit available hardware options
- Server and storage fail-over/fall-back in a Metro Mirror environment
- Concurrent maintenance
- Minimize single-frame DS8300 purchases, as the first expansion frame upgrade is disruptive
- Distribute host connections across multiple physical adapters on the DS8000
- Verify all host paths are available before upgrading software
- Logical Partitioning (LPAR) capability to distribute workloads
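The raw-storage trade-off between the two RAID levels can be made concrete with a little arithmetic. The sketch below is illustrative only: the eight-drive array follows the 6+P+S example above, while the 146 GB drive size, the function names, and the choice of a RAID 10 array without spares are assumptions made for the example.

```python
def raid5_usable_gb(data_drives: int, drive_gb: float) -> float:
    """Usable capacity of a RAID 5 array: only the data drives count."""
    return data_drives * drive_gb


def raid10_usable_gb(total_drives: int, spares: int, drive_gb: float) -> float:
    """Usable capacity of a RAID 10 array: half of the non-spare drives."""
    mirrored = total_drives - spares
    if mirrored <= 0 or mirrored % 2:
        raise ValueError("RAID 10 needs a positive, even number of non-spare drives")
    return (mirrored // 2) * drive_gb


def efficiency(usable_gb: float, total_drives: int, drive_gb: float) -> float:
    """Fraction of the raw capacity that ends up usable."""
    return usable_gb / (total_drives * drive_gb)


DRIVE_GB = 146.0   # hypothetical drive size
ARRAY_DRIVES = 8   # a 6+P+S array occupies eight drives

raid5 = raid5_usable_gb(data_drives=6, drive_gb=DRIVE_GB)
raid10 = raid10_usable_gb(total_drives=ARRAY_DRIVES, spares=0, drive_gb=DRIVE_GB)

print(f"RAID 5 (6+P+S): {raid5:.0f} GB usable, "
      f"{efficiency(raid5, ARRAY_DRIVES, DRIVE_GB):.0%} of raw")
print(f"RAID 10 (4+4):  {raid10:.0f} GB usable, "
      f"{efficiency(raid10, ARRAY_DRIVES, DRIVE_GB):.0%} of raw")
```

Under these assumptions the same eight drives yield three quarters of their raw capacity with RAID 5 and one half with RAID 10, which is the cost paid for the faster rebuild.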
Multiple redundant Management Control Consoles
Uninterruptible Power Supply
Earthquake Resistant Kit (where applicable)
Consider the IBM Standby Capacity on Demand (Standby CoD) offering for capacity planning
Enable Call Home and Remote Support
Monitor the storage subsystem status
- E-mail notification for a serviceable event
- Simple Network Management Protocol (SNMP) notification
- Service Information Message (SIM) notification (zSeries)
- Review the event log of the DS8000
Maintain Currency
- Create a regular maintenance window for storage and SAN
- Install firmware updates as recommended
- Understand what fixes/upgrades are in a firmware update
- Integrate into Change Control Management
- May install first on less critical systems, prior to production
- Maintain supported combinations of Host Adapter Driver
- Subscribe to MySupport: http://www.ibm.com/support/mySupport
Concurrent Maintenance
- Perform concurrent maintenance operations on the storage subsystem during periods of low activity
- Microcode upgrades are performed by IBM support personnel
Host Based Monitors and Alerts
- GDPS/PPRC HyperSwap Manager
- GDPS/PPRC HyperSwap Monitors & Alerts
- TPC-R
Host Based Collection Facilities
- z/OS LOGREC
Host Based High Availability Options for Data
- DFSMS Dataset Name separation
Host Connections: provide multiple paths from each host to the storage
- MPIO or Subsystem Device Driver (SDD) for Open Systems
- Dynamic Path Selection (DPS) and Dynamic Path Reconnect (DPR) for z/OS
- Distribute paths across multiple physical adapters on the DS8000
System i
- DSCLI commands executed through the i5/OS interface
- Copy Services for System i Toolkit
- Combination of iSeries Navigator and 5250 interface
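The recommendation to give each host multiple paths spread across physical adapters can be checked mechanically from a path inventory. The following is a minimal sketch, assuming the inventory is available as (host, adapter) pairs; the host and adapter names and the function names are hypothetical, and no real multipath driver interface is used.

```python
from collections import defaultdict


def adapters_per_host(paths):
    """Map each host to the set of storage adapters its paths pass through.

    `paths` is an iterable of (host, adapter) pairs, one pair per path.
    """
    usage = defaultdict(set)
    for host, adapter in paths:
        usage[host].add(adapter)
    return dict(usage)


def hosts_at_risk(paths, min_adapters=2):
    """Return hosts whose paths touch fewer than `min_adapters` distinct adapters."""
    usage = adapters_per_host(paths)
    return sorted(host for host, adapters in usage.items()
                  if len(adapters) < min_adapters)


# Hypothetical inventory: hostB has two paths, but both use the same adapter.
PATHS = [
    ("hostA", "adapter-1"),
    ("hostA", "adapter-2"),
    ("hostB", "adapter-1"),
    ("hostB", "adapter-1"),
]

print(hosts_at_risk(PATHS))
```

Counting distinct adapters rather than paths is the point of the check: two paths through one adapter still leave that adapter as a single point of failure.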
Duplicate Storage Subsystems in Campus or Same Data Center Floor
- Can use Metro Mirror for data redundancy, to enable quick re-IPL
- Requires automation software such as TPC-R or GDPS
- IBM Softek TDMF to move data around in real time
- Can perform a local site switch before maintenance actions to reduce the impact of human errors and the impact to production
Know the following IBM System Storage web sites
IBM System Storage support web site
- Starting point for IBM System Storage hardware and software support
- Includes links to subscription services to sign up for email alerts
- Includes links to product docs, contact information, fix search engine
IBM System Storage Interoperation Center: http://www-01.ibm.com/servers/storage/support/config/ess/index.jsp
Fibre Channel host bus adapter firmware and driver level matrix site
System Storage Enterprise Disk Practices: Advanced Copy Functions Overview for Availability
Point in Time Copy (FlashCopy)
Minimize application / database downtime required to make local point in time copies for:
- Production backup, data cloning, data warehouse, test and development
- Disk subsystem microcode creates internal copy of data (FlashCopy)
- Copy initialization of many terabytes of data can be accomplished in seconds
Remote Mirroring (Metro Mirror, Global Mirror, z/OS Global Mirror)
Create real-time, continuously updated remote copies of disk subsystem data
- Campus, metropolitan, or geographically distant site
Data suitable for High Availability fast failover and failback
Supports large amounts of data, at the terabyte level
Disk subsystem microcode mirrors volumes/LUNs to remote disk subsystem
- Synchronous capability (Metro Mirror)
- Asynchronous capability (Global Mirror)
System Storage Enterprise Disk Practices: Point in Time internal Data Replication
Fast Time-Zero internal data replication capability (FlashCopy)
Create internal copies of data for backup, cloning, data mining, etc.
Physical configuration
Assure sufficient target disk space allocated
Usage practices:
Plan databases/applications to be in hot backup mode or quiesce to maintain data integrity
Back up internal volume/LUN required for:
Operating System catalogs, etc.
Database/application metadata
System Storage Enterprise Disk Practices: Metro Mirror - Synchronous Data Replication
Applicability:
- General: provides synchronous data replication of disk subsystem at volume / LUN level
- System z: in combination with GDPS HyperSwap, provides foundation for removal of Parallel Sysplex disk subsystem single point of failure
Physical configuration, link and infrastructure planning
- Must perform initial and ongoing analysis of write workload to determine sufficient SAN/WAN/telecom infrastructure bandwidth
Automation
- Plan for highly automated operational control of mirroring to mask complexity and support reliability, repeatability, testability
Testing and testing resource expectations
- Plan to provide Tertiary Copy storage at remote site
- For every production TB to be mirrored, ideally 2x that TB at remote site
- To provide additional storage for ongoing testing environment, resync protection, golden copy, problem determination, validation
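The write workload analysis called for above can start from a simple estimate. Because each host write in a synchronous mirror waits for the remote copy, the sketch below sizes the link for the peak observed write rate rather than the average. The sample values, the 25% headroom factor, the usable per-link rate, and the function names are all assumptions for illustration, not IBM sizing guidance.

```python
import math


def required_link_mbps(write_samples_mb_per_s, headroom=1.25):
    """Convert the peak write rate (MB/s) into a link requirement (Mbit/s).

    `headroom` is a hypothetical allowance for growth and protocol overhead.
    """
    peak = max(write_samples_mb_per_s)
    return peak * 8 * headroom


def links_needed(required_mbps, usable_mbps_per_link):
    """Number of links of a given usable rate needed to carry the requirement."""
    return math.ceil(required_mbps / usable_mbps_per_link)


# Hypothetical write-rate samples collected over a business cycle, in MB/s.
SAMPLES_MB_PER_S = [45.0, 120.0, 200.0, 80.0]

needed = required_link_mbps(SAMPLES_MB_PER_S)
print(f"Required bandwidth: {needed:.0f} Mbit/s")
print(f"Links at 1000 Mbit/s usable each: {links_needed(needed, 1000.0)}")
```

Repeating the calculation as new samples arrive is one way to make the analysis ongoing rather than a one-time exercise.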
System Storage Enterprise Disk Practices: Global Mirror - Asynchronous Data Replication
Applicability: IBM Global Mirror is usually chosen when
- Asynchronous replication of volumes/LUNs is desired for Open Systems or a mix of z/OS and Open Systems, and when reduced bandwidth is a necessity
Link and infrastructure planning
- Must perform initial and ongoing analysis of write workload to determine sufficient SAN/WAN/telecom infrastructure bandwidth
- Similar speed and throughput characteristics on source and target volumes can provide optimum performance
Automation
- Plan for highly automated operational control of mirroring to mask complexity and support reliability, repeatability, testability
Availability and Testing
- Plan to provide sufficient Tertiary Copy storage at remote site
- For every production TB to be mirrored, ideally 3x that TB at remote site
- To provide storage for ongoing testing environment, resync protection, golden copy, problem determination, validation
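The 3x guideline above translates directly into a remote-site capacity plan. The sketch below is a rough illustration: the split of the three copies into a mirror target, a consistent FlashCopy, and a tertiary test copy is an assumption about how the 3x might be used, and the 40 TB production figure and all names are hypothetical.

```python
# Assumed breakdown of the 3x remote capacity, one production-sized copy each.
REMOTE_COPIES = {
    "mirror target": 1.0,
    "consistent FlashCopy": 1.0,
    "tertiary test copy": 1.0,
}


def remote_capacity_tb(production_tb, copies=REMOTE_COPIES):
    """Return the remote capacity needed per copy, in TB."""
    return {name: production_tb * factor for name, factor in copies.items()}


PRODUCTION_TB = 40.0  # hypothetical mirrored production capacity

plan = remote_capacity_tb(PRODUCTION_TB)
for name, tb in plan.items():
    print(f"{name}: {tb:.0f} TB")
print(f"Total at remote site: {sum(plan.values()):.0f} TB")
```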
System Storage Enterprise Disk Practices: z/OS Global Mirror (XRC) - Asynchronous Data Replication
Applicability of IBM z/OS Global Mirror (XRC):
- General: z/OS Global Mirror is usually chosen when only z/OS data requires asynchronous data replication, or when heterogeneous z/OS disk vendors are required
Physical configuration, link and infrastructure planning
- Must perform initial and ongoing analysis of write workload to determine sufficient SAN/WAN/telecom infrastructure bandwidth
- Similar speed and throughput characteristics on source and target volumes can provide optimum performance
- Plan to provide sufficient System z cycles at remote site for System Data Mover
Automation
- Plan for highly automated operational control of mirroring to mask complexity and support reliability, repeatability, testability
Availability and Testing
- Plan to provide sufficient Tertiary Copy storage at remote site
- For every production TB to be mirrored, ideally 2x that TB at remote site
- To provide ongoing testing environment for setup, validation, and problem determination
System Storage Enterprise Disk Practices: Three site replication
When to use 3 site
- Three site replication is used when the requirement is to combine zero data loss RPO, using local Metro Mirror, with out of region recovery (asynchronous)
Pre-requisites: three site replication is affordable and justifiable to the business when:
- Data center strategy and implementation is already well under way towards Active-Active or Planned Workload Rotation for two sites
Pre-requisite: two site configuration already includes ongoing:
- Automated failover/failback
- Full Tertiary Copy capability for testing, problem determination, validation, automation
- Ongoing WAN / bandwidth / workload capacity planning
System Storage Enterprise Disk Practices: Management of Replication
Plan for highly automated disk mirroring environment
Provides foundation for Reliability, Repeatability, Scalability, Testability
Recommendations for automation software:
System z environment: GDPS
Mixed open platform: GDOC
General disk mirroring mgmt: TPC for Replication
System Storage Enterprise Disk Practices: Resources
System Storage Business Continuity Solutions website
http://www-03.ibm.com/servers/storage/solutions/business_continuity/index.html
System Storage Technology Center
http://www-03.ibm.com/system/storage/
Storage Education
http://www-03.ibm.com/systems/education/cust/crossprod/custcp.html
System Storage Interoperation Center
http://www-01.ibm.com/systems/support/storage/config/ssic/index.jsp
System Storage Services
http://www-03.ibm.com/systems/storage/services/index.html
Redbooks/Redpapers
http://www.redbooks.ibm.com/redbooks.nsf/portals/Storage
- The IBM TotalStorage DS8000 Series: Concepts and Architecture (SG24-6452-00)
- IBM System Storage Business Continuity Solutions Overview (SG24-6684-01)
- IBM System Storage DS8000 Series: Copy Services with IBM System z (SG24-6787-02)
- IBM System Storage DS8000 Series: Copy Services in Open Environments (SG24-6788-02)
- IBM System Storage Solutions Handbook (SG24-5250-06)
White papers
- IBM Storage Infrastructure for Business Continuity Solution
- Global Mirror Technical Whitepaper
Data Corruption Solutions
System Storage Enterprise Disk Practices: Data Corruption
Logical data corruption protection must be designed at the operational and application level
Best practice procedures are:
- Sufficient point in time disk copies of data, to provide adequate known restart points
- Supplemented by operational procedures at the database/application level
Tools include (but are not limited to):
FlashCopy Point in Time Copy
Software: zCDP for DB2 (z/OS 1.8 + DB2 9)
- Eliminates need for DB2 backup windows via DB2 BACKUP utility
- No interruption to DB2 processing to take backups
- DFSMShsm maintains up to 50 backup versions across disk and tape
- DB2 RESTORE utility granularity: system, volume, DB table
Future: zCDP for Storage
- IBM SOD (Statement of Direction) on providing CDP function for all z/OS data
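The value of keeping sufficient point in time copies can be seen by asking which copy would serve as the restart point after corruption is discovered. The following is a minimal sketch, assuming the copy times and the moment corruption began are known; the six-hour schedule, the dates, and the function names are hypothetical.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def latest_clean_restart_point(copy_times, corruption_time):
    """Return the newest copy taken strictly before corruption began, or None."""
    ordered = sorted(copy_times)
    idx = bisect_left(ordered, corruption_time)  # first copy at or after corruption
    return ordered[idx - 1] if idx > 0 else None


def work_to_redo(copy_times, corruption_time):
    """Time span between the chosen restart point and the start of corruption."""
    restart = latest_clean_restart_point(copy_times, corruption_time)
    return None if restart is None else corruption_time - restart


# Hypothetical schedule: a point in time copy every six hours.
DAY = datetime(2007, 10, 1)
COPIES = [DAY + timedelta(hours=h) for h in (0, 6, 12, 18)]
corrupted_at = DAY + timedelta(hours=13, minutes=30)

print("Restart from:", latest_clean_restart_point(COPIES, corrupted_at))
print("Work to redo:", work_to_redo(COPIES, corrupted_at))
```

More frequent copies shorten the span of work that has to be recovered by the database or application procedures, at the cost of more target storage.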
Supplemental Information
FlashCopy: Local Point in Time Data Replication to improve data availability
Copy data command issued: copy is immediately available
Read and write to both source and copy possible
Optional background copy
When copy is complete, relationship between source and target ends
[Figure: timeline of a FlashCopy relationship between Source and Target volumes, with read and write access to both]
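The behavior described above (copy available immediately, reads and writes allowed on both sides, optional background copy) can be illustrated with a toy copy-on-write model. This is a conceptual sketch only and not the disk subsystem's microcode: the class, its methods, and the use of a Python list as a volume are all hypothetical.

```python
class ToyFlashCopy:
    """Toy point in time copy of a list of blocks using copy-on-write."""

    def __init__(self, source):
        self.source = source                    # live source volume (mutable list)
        self.target = [None] * len(source)      # target starts physically empty
        self.copied = [False] * len(source)     # which blocks are already on target

    def _copy_block(self, i):
        """Move the point in time image of block i to the target if still missing."""
        if not self.copied[i]:
            self.target[i] = self.source[i]
            self.copied[i] = True

    def write_source(self, i, data):
        """Preserve the old block on the target before overwriting the source."""
        self._copy_block(i)
        self.source[i] = data

    def read_target(self, i):
        """Read the point in time image, from the source if not yet copied."""
        return self.target[i] if self.copied[i] else self.source[i]

    def write_target(self, i, data):
        """Writing to the target never touches the source."""
        self.target[i] = data
        self.copied[i] = True

    def background_copy(self):
        """Optional background copy of every block not yet copied."""
        for i in range(len(self.source)):
            self._copy_block(i)

    @property
    def complete(self):
        """True once every block is on the target, so the relationship can end."""
        return all(self.copied)


volume = ["a", "b", "c"]
fc = ToyFlashCopy(volume)        # the copy is usable immediately
fc.write_source(0, "A")          # production keeps writing
print(fc.read_target(0))         # the target still shows the original block
fc.background_copy()
print(fc.complete)
```

The model shows why copy initialization can be fast regardless of volume size: establishing the relationship only creates bookkeeping, and data moves later, on demand or in the background.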
FlashCopy Use Cases
- Production backup: regain information from an older level of data; re-establish production in case of any server errors
- Data backup: create backups with the shortest possible application outage
- Data mining: avoid performance impacts on the production system
- Test system: allows testing of new applications with real production data
- Moving and migrating data: move a consistent data set from one host to another with a minimum of downtime for the host application
Metro Mirror: synchronous replication of data between two storage subsystems to improve data availability
Synchronous data mirroring (up to 300 km)
Superior performance
- Low internal Metro Mirror overhead (at zero distance, DS8000 additional overhead is 0.38 ms)
- Optimized protocol exchange
- Each 100 km adds 1 ms
- Plus switch/channel extender overheads
- Generally fewer links required than the competition
Platform environment
- System z: GDPS/PPRC, GDPS/PPRC HyperSwap Manager
- System p: AIX HACMP/XD + Metro Mirror
- System i: High Availability Business Partner software; ASR Toolkit
- Geographically Dispersed Open Clusters (GDOC) for UNIX, Linux and Windows
[Figure: storage mirroring and scalable data integrity across storage, network, and server cluster]
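The latency figures quoted for Metro Mirror (0.38 ms at zero distance, 1 ms per 100 km, up to 300 km) combine into a simple estimate of the time added to each write. The sketch below uses only those figures; the function name and the optional switch/channel extender term, which defaults to zero because no value is given, are assumptions.

```python
BASE_OVERHEAD_MS = 0.38   # DS8000 additional overhead at zero distance
MS_PER_100_KM = 1.0       # each 100 km adds 1 ms
MAX_DISTANCE_KM = 300     # synchronous mirroring distance limit quoted above


def added_write_latency_ms(distance_km, extender_overhead_ms=0.0):
    """Estimate the latency Metro Mirror adds to each write, in milliseconds.

    `extender_overhead_ms` stands in for switch/channel extender overhead,
    for which no figure is quoted, so it must be supplied by the caller.
    """
    if not 0 <= distance_km <= MAX_DISTANCE_KM:
        raise ValueError(f"distance must be between 0 and {MAX_DISTANCE_KM} km")
    return BASE_OVERHEAD_MS + MS_PER_100_KM * distance_km / 100 + extender_overhead_ms


for km in (0, 50, 100, 300):
    print(f"{km:>3} km: +{added_write_latency_ms(km):.2f} ms per write")
```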
GDPS/PPRC HyperSwap Manager and Metro Mirror
Extends Parallel Sysplex availability to z/OS DS8000, DS6000, ESS disk subsystems
- Eliminates disk subsystem as single point of failure in a z/OS Parallel Sysplex
- Masks primary disk subsystem failures by transparently switching to use secondary disks (Unplanned HyperSwap)
- Provides ability to perform disk maintenance without requiring applications to be quiesced (Planned HyperSwap)
- Delivered as IGS Services offering
Technical concept:
- Planned or unplanned HyperSwap will dynamically substitute DS8000, DS6000, or ESS Metro Mirror secondary for primary device
- No operator interaction - GDPS-managed
- Can swap large number of volumes - fast
- Includes volumes with Sysres, page DS, catalogs
- Non-disruptive - applications keep using same device addresses
[Figure: applications access the primary (P) and secondary (S) disks through UCBs; the disks are linked by Metro Mirror]
Global Mirror: Asynchronous data replication between two storage subsystems to improve data availability at global distance
Two site, unlimited global distance
Complete and consistent data mirroring
Consistency groups
- Across z/OS and Open Systems data
- Across up to 16 subsystems
Currency can be configured to as little as 3 to 5 seconds behind host I/O
Native application performance
Platform environment
- System z: GDPS/GM
- Geographically Dispersed Open Clusters (GDOC) for UNIX, Linux and Windows
[Figure: primary hosts writing over a SAN to primary volume A at native performance; Global Copy to secondary volume B at transmission performance; FlashCopy of consistent data available to remote hosts over a SAN]
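The 3 to 5 second currency figure can be turned into a rough estimate of how much recently written data would not yet be at the remote site if the primary were lost. This is a back-of-the-envelope sketch: the 50 MB/s write rate and the function name are hypothetical, and a steady write rate is assumed.

```python
def data_at_risk_mb(write_mb_per_s, lag_s):
    """Data written at the primary but not yet consistent at the remote site."""
    return write_mb_per_s * lag_s


WRITE_MB_PER_S = 50.0  # hypothetical sustained write rate

low = data_at_risk_mb(WRITE_MB_PER_S, 3)
high = data_at_risk_mb(WRITE_MB_PER_S, 5)
print(f"Data at risk: {low:.0f} to {high:.0f} MB")
```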
z/OS Global Mirror (XRC): Asynchronous data replication between two storage subsystems to improve data availability at global distance, using System z MIPS
Productivity tool that integrates management of XRC and FlashCopy
Premium performance & scalability
Data moved by System Data Mover (SDM) address space(s) running on System z
Supports heterogeneous disk subsystems
GDPS/XRC runs in the SDM location
- Manages availability of SDM Sysplex
- Performs fully automated site failover
- Single point of control for multiple / coupled System Data Movers
Supports zSeries and zSeries Linux data
Over 200 installations worldwide
XRC manages secondary consistency
- Across any number of primary subsystems
- All writes time-stamped and sorted before being committed to secondary devices
[Figure: production systems and primary disk subsystems at one site; SDM systems running GDPS/XRC, journals, and secondary disk subsystems at the other]
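The statement that all writes are time-stamped and sorted before being committed to secondary devices can be illustrated with a small merge. The sketch below is a simplified model, not the System Data Mover's implementation: the cutoff rule (apply only writes no newer than the oldest "latest timestamp" across all primaries), the integer timestamps, and all names are assumptions made for illustration.

```python
import heapq


def form_consistency_group(streams):
    """Split time-stamped writes into those safe to apply and those held back.

    `streams` maps a primary subsystem name to its writes, each a
    (timestamp, volume, data) tuple, already in timestamp order.
    Returns (apply_now, held), both in global timestamp order.
    """
    merged = list(heapq.merge(*streams.values(), key=lambda w: w[0]))
    if not streams or any(not writes for writes in streams.values()):
        return [], merged  # a silent primary means nothing can be proven complete
    # Beyond this timestamp, some primary might still report an earlier write.
    cutoff = min(writes[-1][0] for writes in streams.values())
    apply_now = [w for w in merged if w[0] <= cutoff]
    held = [w for w in merged if w[0] > cutoff]
    return apply_now, held


# Hypothetical writes from two primary subsystems.
STREAMS = {
    "primary-1": [(1, "VOL001", "a"), (4, "VOL001", "b"), (9, "VOL002", "c")],
    "primary-2": [(2, "VOL101", "x"), (6, "VOL101", "y")],
}

apply_now, held = form_consistency_group(STREAMS)
print("apply:", [w[0] for w in apply_now])
print("held: ", [w[0] for w in held])
```

Applying writes in one global timestamp order is what keeps dependent writes on different subsystems consistent at the secondary, even though they arrive through separate streams.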
Metro/Global Mirror: IBM three site recovery
Ability to switch production to any site
- Planned/unplanned outage
- Minimal data movement
Protection from local site disaster
- Metro Mirror (Sync PPRC)
- GDPS/MGM with HyperSwap locally
Protection from regional disaster
- Global Mirror (Async PPRC)
- Minimal data loss (3-5 seconds)
Resynchronize any site with incremental changes only
Managed by GDPS/MGM or TPC-R
[Figure: local sites linked by Metro Mirror; remote backup site linked by Global Mirror]
IBM TotalStorage Productivity Center for Replication (TPC-R)
Enable the configuration of complex replication environments, provide feedback on the state of their operations, and make changes easy to accomplish
Provide a common interface
- Single point of control
- Single set of commands and session states
Build on copy services functions to provide our customers a DR solution
- Dynamically monitor Metro Mirror and maintain write order data consistency
- Hide differing hardware technologies and unique Copy Service function implementations
- Automate the Metro/Global Mirror Incremental Resync function
[Figure: TPC-R V3.1 (FlashCopy; Metro Mirror, Global Mirror; session management; consistency groups; replication monitor; copy device interface) and TPC-R Two-Site BC V3.1 (basic function plus high availability, DR management, 3rd party storage), accessed through GUI / CLI / API and supporting ESS 800, DS6K, DS8K, SVC]