unleash the savings: db2 backup & recovery at near zero cost
TRANSCRIPT
Unleash the Savings: DB2 Backup & Recovery at Near Zero Cost
Bryan Smith / Tim Willging Rocket Software
Session Code: G11 Date: Thu, May 02, 2013 (02:15 PM - 03:15 PM) Platform: Data Management, Disaster Recovery, DB2 for z/OS
Click to edit Master title style
Agenda
Fast Replication Overview
Storage-Aware Database Backup and Recovery
Implementation Planning Considerations
Summary
Why Fast Replication Should be Used
• Cost Savings
• Saves CPU and I/O
• “We would back up more often if it didn’t cost so much.”
• Speed
• Faster backup and recovery
• Faster access to cloned data
• “We need to achieve our recovery time objective.”
• Minimal Disruption to Data Access
• Provide fast and effective ways to back up and copy data with no downtime
• “Our backup for disaster recovery purposes includes other data so our applications must be down during the backup”
Why Fast Replication Hasn’t Been Used
• Typically only understood by storage administrators
• DBAs don’t get training
• Standard FlashCopy tools don’t address needs of DBA
• DFSMSdss
• ICKDSF
• Volume level FlashCopy leaves data sets uncataloged
• Concerns with Remote Replication
• Requires specific setup in some configurations
• Changing over time
• Starting with DB2 V8, each release of DB2 includes more FlashCopy support
• Introduction of ‘Storage Aware’ database tools
Fast Replication Overview
• Command issued from z/OS to storage processor (SP)
• Almost instant return to z/OS
• Data copy continues inside SP
• Offloading data copy to SP saves z/OS CPU and I/O resources
• z/OS access to source and target data is possible immediately following fast replication command
• Availability
• Source Update: If a source track is updated, the SP makes sure the original track has been sent to the target before the update takes place
• Target Read: If a track being read on the target volume has not yet been copied, the SP redirects the read to the corresponding source volume track
• Target Update: When updating a track on the target volume, the SP makes sure the track has been copied to the target before allowing the update
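The three access rules above can be illustrated with a small simulation. This is only a sketch of the behavior described on the slide, not how any storage processor is implemented; the class name `FastReplicationPair`, its methods, and the list-of-strings track model are all hypothetical.

```python
class FastReplicationPair:
    """Toy model of one source/target relationship inside a storage processor."""

    def __init__(self, source_tracks):
        self.source = list(source_tracks)
        self.target = [None] * len(self.source)
        # Track bit map: True means the track has not yet been copied to the target.
        self.pending = [True] * len(self.source)

    def _copy_track(self, track):
        # Copy one track from source to target if it is still pending.
        if self.pending[track]:
            self.target[track] = self.source[track]
            self.pending[track] = False

    def write_source(self, track, data):
        # Source update: send the original track to the target first.
        self._copy_track(track)
        self.source[track] = data

    def read_target(self, track):
        # Target read: redirect to the source if the track is not copied yet.
        if self.pending[track]:
            return self.source[track]
        return self.target[track]

    def write_target(self, track, data):
        # Target update: make sure the track is copied before the update.
        self._copy_track(track)
        self.target[track] = data

    def background_copy(self):
        # Background copy: move every remaining pending track.
        for track in range(len(self.source)):
            self._copy_track(track)


pair = FastReplicationPair(["a", "b", "c"])
pair.write_source(0, "A")       # original "a" is preserved on the target
pair.write_target(2, "Z")       # target diverges from the source on track 2
print(pair.read_target(0), pair.read_target(1), pair.read_target(2))
pair.background_copy()
print(pair.source, pair.target)
```

In this model the target always presents the point-in-time image (plus its own updates), regardless of how far the background copy has progressed.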
Fast Replication – Explained
• Fast Replication (FlashCopy) command issued
1) Source and Target volume relationship established
2) Track bit map created
3) Source and Target volumes available for updates – Copy is DONE!
4) In background copy, tracks are copied from Source to Target
5) Track bit map is updated
[Figure: Source and Target volumes linked by a track bit map, shown first at Establish/Thaw FC and then during Background Copy]
Importance of Consistency
• Required for Volume Level Copies
• Dependent writes must be maintained through copy
• If not maintained, copy cannot be used for recovery or cloning
• DB2 Backup Consistency Options
• DB2 Backup System
• DB2 Log Suspend
• Storage Based Consistency
• Provides a shorter unavailability time
• FlashCopy Consistency Group
• Enginuity Consistency Assist (ECA)
FlashCopy Consistency Example
Source and Target pairs: S1 and T1, S2 and T2, S3 and T3
FlashCopy Establish
1) S1 is Frozen, no more writes
2) S2 is Frozen, no more writes
3) S3 is Frozen, no more writes
T1 – T3 have Consistent FlashCopy
THAW after Establish Phase
1) Source updates proceed on S1, S2, S3
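Why freezing each source volume at establish time keeps the copy dependent-write consistent can be sketched with a small replay model. This is an illustrative simplification under stated assumptions: a single writer issues writes strictly in order (each write depends on all earlier ones), S1 is taken to hold log records and S2 data pages, and the function names `replay` and `is_dependent_write_consistent` are hypothetical.

```python
def replay(events, freeze):
    """Replay writes and establishes in order; return the target volume contents.

    events: ("write", volume, record) or ("establish", volume) tuples.
    freeze: if True, an established source volume rejects writes until thaw.
    """
    source, target = {}, {}
    frozen = set()
    stalled = False  # the writer is waiting on a frozen volume
    for event in events:
        if event[0] == "establish":
            volume = event[1]
            target[volume] = list(source.get(volume, []))
            if freeze:
                frozen.add(volume)
        else:
            _, volume, record = event
            if stalled or volume in frozen:
                # The write waits for the thaw, so no later dependent
                # write is issued either.
                stalled = True
                continue
            source.setdefault(volume, []).append(record)
    return target


def is_dependent_write_consistent(events, target):
    """True if the copied records form a prefix of the issued write order."""
    copied = {rec for records in target.values() for rec in records}
    seen_missing = False
    for event in events:
        if event[0] != "write":
            continue
        if event[2] not in copied:
            seen_missing = True
        elif seen_missing:
            return False  # a later write was copied without an earlier one
    return True


events = [
    ("write", "S1", "log-1"),
    ("write", "S2", "data-1"),
    ("establish", "S1"),
    ("write", "S1", "log-2"),
    ("write", "S2", "data-2"),
    ("establish", "S2"),
]
print(is_dependent_write_consistent(events, replay(events, freeze=False)))
print(is_dependent_write_consistent(events, replay(events, freeze=True)))
```

Without the freeze, T2 can contain `data-2` while T1 lacks the `log-2` record it depends on. With the freeze, the write of `log-2` waits for the thaw, so `data-2` is never issued before S2 is established.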
FlashCopy Consistency Example
Specifications
1) 13 TB of data
2) 461 volumes
3) DS8300
4) 2817-M80 z196
5) 4,075.28 trans/second
6) Backup Elapsed = 0.37 secs
IMS Recovery Expert for z/OS
Backup Summary Report
Utility Executed:......... Backup
Profile Name:............. ROCKET1.BKUP1
IMS Subsystem:............ IMSP
IMS Version:.............. 12.1
Backup Type:.............. Flash Copy
Backup Contains:.......... Database, Log Data (Mixed)
Partial Backup:........... No
Nbr of Volumes:........... 0461
Backup Date:.............. 02/01/2012
Backup Time:.............. 2012-02-01-17.03.20.671934
Consistency Method:....... Flash Consistency Group
Supports Database Restore: No
I/O Suspend Time:......... 2012-02-01-17.03.20.671932
I/O Resume Time:.......... 2012-02-01-17.03.21.042397
Backup Elapsed:........... 00.37 Seconds
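The reported elapsed time can be checked from the I/O suspend and resume timestamps in the report. The sketch below only redoes that arithmetic; the format string is an assumption about how the report's timestamps are laid out.

```python
from datetime import datetime

# Assumed layout of the report timestamps: YYYY-MM-DD-HH.MM.SS.ffffff
FMT = "%Y-%m-%d-%H.%M.%S.%f"

suspend = datetime.strptime("2012-02-01-17.03.20.671932", FMT)
resume = datetime.strptime("2012-02-01-17.03.21.042397", FMT)

# Time during which I/O to the 461 volumes was suspended.
elapsed = (resume - suspend).total_seconds()
print(f"{elapsed:.2f} seconds")
```

The difference is 0.370465 seconds, which rounds to the 0.37 seconds shown as Backup Elapsed.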
Database and Storage Administration Trends and Directions
• Large Database systems require high availability
• Fast and non-intrusive backup and cloning facilities are required
• Fast recovery capabilities are required to minimize downtime and promote high availability
• Most backup, recovery and cloning solutions do not leverage storage-based fast-replication facilities
• Storage-based fast-replication facilities are under-utilized
• Tend to be used by storage organizations
• Tend not to be used by database administrators (DBAs)
• Storage aware database products allow DBAs to use fast-replication in a safe and transparent manner
• Provides fast and non-intrusive backup and cloning operations
• Simplifies recovery operations and reduces recovery time
• Simplifies disaster recovery procedures
Storage-aware Data Management Database and Storage Integration
[Figure: Storage-aware database tools sit between mainframe database systems (application and database management domain) and storage (storage administration and business continuity domain); source databases are copied for database backup, clone, and DR]
• Organizational Integration
• New Backup Methods
• New Recovery Strategies
• Business Recovery Monitoring
• Cloning Automation
• Disaster Restart Solutions
System Level Backup Customer Operational Advantages
• Reduce backup, recovery, and cloning administration costs
• Reduce host CPU and I/O resource utilization
• Perform backups and create clone copies almost instantly
• Parallel processing while data is being copied
• Fast restore and parallel recovery reduces recovery time
• Perform clone conditioning on target volumes in parallel
• Simplify disaster recovery operations and procedures
• DBMS and storage-based fast-replication integration
• Leverage storage processors and fast-replication investments: IBM, EMC, HDS
• Expose fast-replication capabilities to the DBAs safely and transparently using “storage-aware” database utilities
• Provide a sophisticated infrastructure and metadata to manage the DBMS and storage processor coordination
System Level Backup Overview
• A System Level Backup is a backup of the entire DB2 subsystem at a point in time
• Full System Level Backups
• Data-only System Level Backups
• Partial System Level Backups
• Leverages storage-based fast replication to drive the volume backup
• Backup in seconds
• Offloading data copy process to the storage processor saves CPU and I/O resources
• Faster than data set copies
• Back up DB2 without affecting applications
• Backup windows reduced by replacing image copies
• Extends processing windows
• One backup instead of 100s or 1000s
• Data consistency ensures data is dependent-write consistent
• DB2 Log Suspend
• Storage-based consistency functions
[Figure: DB2 and DB2 RE use storage processor APIs to copy source DB2 volumes to target volumes]
Database Backup and Recovery for DB2 - System and Application Recovery Overview
• Recover DB2 systems or application objects from disk or tape automatically
• Intelligent Recovery Manager (DB2) invoked to optimize recovery plans
• Faster recovery
• Instantaneous system-restore process
• Coordinated and parallel restore and DBMS recovery operations minimize system downtime
• System backup can be used for database object (DB2), or application recovery
• Data sets flashed to restore data
• Parallel log apply reduces recovery time
• One system backup used for system, application, and disaster recovery
[Figure: Storage-aware backup and recovery for DB2 uses storage processor APIs to create system level backups (SLBs) of source database volumes, offload SLBs through tape processing, and restore from an SLB]
System Backup / System Restore Requirements
• System Backup Requirements
• Backup Validations
• Backup Reporting
• System Backup Health Check
• Create Image Copy from System Backup
• Multi Vendor support: IBM, EMC, Hitachi, STK storage
• Restore Requirements
• Recover objects from System Backup
• Recovery Expert System Restore
• Automated recovery of objects in Recover/Rebuild Pending after System Restore
• System Backup Aware Tooling
• Coordinated DB2 / IMS
• Disaster Recovery – Might include CICS / VSAM or other related data
• Application Recovery
System Level Backup Required Storage Layout Validations
• Make sure entire DB2 system gets included in each backup
• Determine whether “other” data is being backed up
• Increases storage requirement
• Might be desired
• Validate storage layout of DB2 is properly segregated
• Proper separation of DB2 System and Log Assets might be required
• Check that ICF catalogs are properly included in System Backup
• Validate storage configuration to ensure proper source to target mapping
• Determine objects in bad state at backup time
• HSM Migrated Object Datasets – (Automated recall)
• Recover Pending, Rebuild Pending, etc.
• Option to fail backup or continue
• Without these validations, restores can fail even when the backup ended with RC=0!
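A few of the layout validations above can be sketched as simple set checks. This is an illustration only, not how any backup product performs them; the function `validate_layout`, the volume serials, and the input shapes (a set of volumes holding DB2 data and a list of source-to-target pairs) are all hypothetical.

```python
def validate_layout(db2_volumes, backup_pairs):
    """Return a list of layout problems found before taking a system backup.

    db2_volumes: set of volume serials that hold DB2 data (hypothetical input).
    backup_pairs: list of (source_volume, target_volume) tuples.
    """
    problems = []
    sources = [src for src, _ in backup_pairs]
    targets = [tgt for _, tgt in backup_pairs]

    # The entire DB2 system must be included in each backup.
    for vol in sorted(set(db2_volumes) - set(sources)):
        problems.append(f"{vol}: holds DB2 data but is not in the backup")

    # "Other" data increases the storage requirement (it might be desired).
    for vol in sorted(set(sources) - set(db2_volumes)):
        problems.append(f"{vol}: backed up but holds no DB2 data")

    # Source-to-target mapping must be one to one.
    for vol in sorted({t for t in targets if targets.count(t) > 1}):
        problems.append(f"{vol}: target mapped to more than one source")
    for vol in sorted(set(sources) & set(targets)):
        problems.append(f"{vol}: used as both source and target")

    return problems


db2_volumes = {"DB2001", "DB2002", "DB2003"}
backup_pairs = [("DB2001", "BK0001"), ("DB2002", "BK0001"), ("WRK001", "BK0002")]
for problem in validate_layout(db2_volumes, backup_pairs):
    print(problem)
```

A backup taken with any of these problems outstanding could still end with RC=0, which is why the checks run before the backup rather than at restore time.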
DB2 System Backup – Reporting
Create Image Copies from System Level Backup
• Image copies still required by many downstream processes
• Image copies can be generated from a DB2 Recovery Expert generated system level backup (SLB)
• Image copies are registered DB2 image copies
• Image copies can be used by any existing process
• DSN1COPY – Restore data to another object or system
• Unload – To transport data to another DBMS
• Recovery
• All image copies are created at the same point in time
• No effect on the application for image copy creation
• Can be run during off peak CPU times
• Reduces I/O contention caused by performing traditional image copy processing during high transaction activity
Application Restore from System Backup
• DB2 V10 / HSM have improved, but still have limitations
• DB2 (and z/OS 1.11) now allow recovery from SLB even after an Online Reorganization has been executed.
• Datasets must be restored to original volume where they existed at backup time
• If no space is found on the original volume, Restore fails.
• Recovery Expert takes inventory of where all data sets reside and keeps that information with the backup
• As long as there is space somewhere that SMS rules allow, the data set can be restored.
• With Recovery Expert, objects can always be restored from SLB
System Restore Automations
Automations to improve System Restore Process
• Disconnect ICF catalogs
• Take volumes offline and back online to clear out allocations that can fail the restore.
• Executes commands to clear coupling facility structures for Data Sharing
• Build jobs to insert system point in time records in BSDSs
• One per data sharing member
• Makes sure DB2 is down and no tasks are accessing DB2 volumes
• After System Restore, reports objects in Recover/Rebuild pending
• Build JCL to recover those objects from image copies
• Speeds recovery time by automating this process
• With automations, process is less error prone!
Storage Aware Integrated DB2 Recovery Example
[Figure: DB2 RE V3.1 Intelligent Recovery Manager and the recovery processes it invokes. Labeled inputs: DB2 spaces, BSDS, DB2 catalog, DB2 log, DB2 RE repository, image copies (traditional IC, fastrep IC), DB2 system backup. Labeled recovery processes: Recover utility (data recover, log apply), restore from SLB (IBM FlashCopy, EMC SNAP, IBM DFSMSdss), fast-replication data set restore, SQL recovery (redo SQL, undo SQL), dropped object recovery (DDL + DCL), index rebuild (DB2 utility), check utility (index, data), post recovery image copy (traditional, fastrep IC)]
Coordinated Disaster Recovery Customer Successes Example
• Customer desired improved DR process of CICS / DB2 / VSAM application over their existing process
• Existing process required taking down CICS while DB2 was image copied and CICS region was copied
• New process:
– Recovery Expert replaced 1000s of image copies with single system level backup
– CICS / VSAM / DB2 was all included in System Backup
– System Backup delivered to DR site through replicated VTL
– Significant decrease in CICS downtime to run the backup
• From several hours to under five minutes
– Reduced recovery time at DR site over existing process
– Fit with current system DR process without significant changes
Implementation and Planning Considerations
• System level backup usage
• Determine how SLB(s) will be used
• SLB type
• Determine full, data-only, or partial SLB requirements
• Backup frequency and space utilization
• Determine backup frequency, performance, and space efficient fast-replication requirements
• Disaster restart considerations
• Determine offsite disaster restart resources and preferences (RTO, RPO) to define appropriate disaster recovery profiles
System Level Backup Usage and Data Set Layout Considerations
• SLB used for local system recovery
• Database data and recovery structure isolation required
• Database system isolation may be required
• Non-database data sets will get restored when DB2 system is restored
• User catalogs will get restored
• SLB used for application recovery
• Data and recovery structure isolation is not required
• SLB used for remote disaster restart operations
• Recovery structure isolation is not required
• Database system isolation may be required
• Non-database data sets will get restored when DB2 system is restored
• User catalogs will get restored
DB2 System Level Backup Usage Data Set Layout for Full Backup / System Recovery
[Figure: DB2 on z/OS system and database environment, DB2 source volume isolation required. Source volumes are split into DB2 recovery structures (active logs, archive logs, BSDS, ICF user catalogs) and DB2 system and application structures (ICF user catalogs, DB2 catalog, DB2 directory, DB2 databases). The DB2 system backup volumes hold the same two groups and are defined by SMS group, DB2 system backup volume pool, or target unit range.]
DB2 System Level Backup Usage Data Set Layout for Data Only / Application Recovery
[Figure: DB2 on z/OS system and database environment, DB2 source volume isolation not required. Source volumes hold active logs, archive logs, BSDS, ICF user catalogs, DB2 catalog, DB2 directory, and DB2 databases (DB2 data sets). The DB2 system backup volumes hold the DB2 system and application structures (ICF user catalogs, DB2 catalog, DB2 directory, DB2 databases and their data sets) and are defined by SMS group, DB2 system backup volume pool, or target unit range.]
Partial System Level Backup
• Partial system level backup (PSLB)
• Backup volumes representing a subset of the database system
• PSLBs used for database or application recovery only
• Data set fast replication used to restore data
• Log and data isolation not required
• Desired application database data should be grouped on volumes as a best practice
• PSLB cannot be used for system recovery
• System recovery requires all volumes in SLB
• PSLB usage
• Large databases or applications having unique backup requirements
• Creating image copies from a PSLB
• Reduce disk utilization
• Support more backup generations
DB2 Partial System Level Backup Data Set Layout for Application Recovery
[Figure: DB2 on z/OS system and database environment. DB2 source volumes hold the DB2 recovery structures (active logs, archive logs, BSDS, ICF user catalogs) and application structures (ICF user catalogs, DB2 catalog, DB2 directory, DB2 databases and their data sets). The DB2 backup volumes hold only selected DB2 databases and data sets and are defined by SMS group, DB2 system backup volume pool, or target unit range.]
Implementation Planning Backup Frequency, Space, and Resource Usage
• SLB type: full, data-only, or partial – shown in previous slides
• Determine optimal backup frequency
• Determine number of backups to keep online (on disk)
• Establish online backup duration requirements
• SLB or PSLB used for IC creation may be deleted after ICs complete
• Determine offline (tape) backup requirements
• Consider incremental fast-replication options to reduce background copy time and resources
• Consider using one set of volume targets to support multiple database systems – next slide
• Saves fast-replication target volume storage requirements
• Consider using space efficient fast-replication methods
One Set of Backup Volumes for Multiple DBMS Systems (DB2 Example)
[Figure: Storage-aware backup and recovery uses storage processor APIs to back up source DB2-1 volumes and source DB2-2 volumes to one shared set of system level backup volumes. Tape processing offloads the backups in turn: DB2-1 at T1, DB2-2 at T2, DB2-1 at T3, DB2-2 at T4.]
• Backup DB2-1
– SLB-1 created on disk
– Archive SLB-1
– Backup volumes are available after archive completes
• Backup DB2-2
– SLB-2 created on disk
– Archive SLB-2
– Backup volumes are available after archive completes
• Repeat for DB2-1
• Repeat for DB2-2
Space-efficient Fast-replication Operation Overview
• Space-efficient target volume is accessible when the copy is created
• The first time a track on the source volume is written to:
• Original data on the source volume is copied to a save volume (pool)
• Pointer on the space-efficient volume is changed to point to the save pool
• The host write is written onto the track of the source volume in cache
• The track on the source volume is then updated
• Unchanged data stays in place on the source volume
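The copy-on-first-write behavior described above can be sketched as follows. This is a toy model under stated assumptions, not a vendor implementation: the class `SpaceEfficientCopy`, its methods, and the dictionary used as the save pool are hypothetical.

```python
class SpaceEfficientCopy:
    """Toy model of a space-efficient point-in-time copy of one source volume."""

    def __init__(self, source_tracks):
        self.source = source_tracks   # live source volume (list of track contents)
        self.save_pool = {}           # track number -> original content

    def write_source(self, track, data):
        # First write to a track: preserve the original in the save pool,
        # so the space-efficient volume now points there.
        if track not in self.save_pool:
            self.save_pool[track] = self.source[track]
        # Then the source track is updated.
        self.source[track] = data

    def read_copy(self, track):
        # Changed tracks are read from the save pool;
        # unchanged tracks are still read in place from the source.
        if track in self.save_pool:
            return self.save_pool[track]
        return self.source[track]


volume = ["t0", "t1", "t2", "t3"]
copy = SpaceEfficientCopy(volume)
copy.write_source(1, "t1-new")
copy.write_source(1, "t1-newer")   # second write saves nothing more
print([copy.read_copy(i) for i in range(4)])
print(len(copy.save_pool), "of", len(volume), "tracks held in the save pool")
```

Only tracks that change after the copy is created consume save pool space, which is what makes frequent copies affordable.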
[Figure: A write to a track on the source volume causes the original track to be placed in the save pool; space-efficient volume access to that track is directed to the save pool]
Space Efficient Usage Economics Enable Frequent SLB or Clone Copies
Full-volume SLB or clone copies
Requires 12 TB of additional capacity
Based on a 30% change rate
Space-efficient SLB or clone copies
Requires ~900 GB of additional capacity
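[Figure: Full-volume copies: a 3 TB source with four 3 TB copies taken at 12:00 a.m., 6:00 a.m., 12:00 p.m., and 6:00 p.m. Space-efficient copies: a 3 TB source with a ~900 GB save area supporting eight copies taken at 12:00 a.m., 3:00 a.m., 6:00 a.m., 9:00 a.m., 12:00 p.m., 3:00 p.m., 6:00 p.m., and 9:00 p.m.]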
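The capacity figures on this slide can be reproduced with simple arithmetic. The sketch assumes that 1 TB is counted as 1000 GB, and that the 30% change rate is the share of source tracks changed across the whole retention window with each changed track saved once; neither assumption is stated on the slide.

```python
SOURCE_GB = 3000            # 3 TB source volume set (assuming 1 TB = 1000 GB)
FULL_COPIES = 4             # full-volume copies kept in the example
CHANGE_RATE_PERCENT = 30    # assumed share of source tracks that change

# Each full-volume copy needs as much capacity as the source.
full_volume_gb = FULL_COPIES * SOURCE_GB

# Space-efficient copies only need room for the changed tracks.
space_efficient_gb = SOURCE_GB * CHANGE_RATE_PERCENT // 100

print(f"Full-volume copies:     {full_volume_gb} GB")
print(f"Space-efficient copies: {space_efficient_gb} GB")
```

Under these assumptions the result is 12,000 GB (12 TB) for four full-volume copies against roughly 900 GB of save area for the space-efficient copies.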
Coordinated Disaster Recovery Using Virtual Tape and VTape Replication
[Figure: At the primary production site, storage-aware backup and recovery for DB2 / IMS uses storage processor APIs to create a system level backup (SLB) of the source database volumes and offloads it through tape processing. Vtape replication sends the SLB and archive log tapes to the secondary production site, which serves as the primary disaster restart site (remote tape-based disaster restart and optional recovery).]
DB2 and IMS SLBs with PPRC Remote Pair FlashCopy
• Storage Aware Backup/Recovery and “Remote Pair FlashCopy” Support
– FlashCopy to PPRC Primary volume while maintaining Full Duplex
– FlashCopy Metro-Mirror implementations only (Synchronous PPRC)
• Preserve Mirror support option specified in installation ParmLib (FCTOPPRCP)
– N - Do not allow the PPRC primary to become a FlashCopy target
– Y - The pair can go into a duplex pending state
– P - It is preferable that the pair does not go into a duplex pending state
– R - It is required that the pair not go into a duplex pending state
• Applies to FlashCopy type backups
System Level Backup Without Remote Mirror FlashCopy
[Figure: Storage-aware backup and recovery for DB2 / IMS uses storage processor APIs to create a system level backup of the source database volumes. Source and backup volumes are mirrored to PPRC target volumes over PPRC Metro Mirror (synchronous), with PPRC duplex pending on backup and on restore.]
• SLB causes backup volume data to be copied through PPRC link
• SLB can cause PPRC duplex pending state
• SLB restore can cause PPRC duplex pending state
System Level Backup With Remote Mirror FlashCopy
[Figure: Storage-aware backup and recovery for DB2 / IMS uses storage processor APIs to create a system level backup of the source database volumes. Source and backup volumes are mirrored to PPRC target volumes over PPRC Metro Mirror (synchronous), with a PPRC remote FlashCopy operation on backup and on restore.]
• FlashCopy data is not copied over PPRC links
• SLB drives remote pair FlashCopy operation
• Remote PPRC production volumes Flashed to remote PPRC SLB volumes
• System level restore drives remote pair FlashCopy operation
• Remote PPRC SLB volumes Flashed to remote PPRC production volumes
DB2 and IMS SLBs with XRC and PPRC without Remote Pair FlashCopy
• Assume DB volumes are primary volumes in a PPRC metro mirror or XRC relationship
• Backup target volumes must not be in a PPRC or XRC relationship
• Backup volumes cannot be used for DB system recovery without duplex pending state
• DB2 Application recovery allowed
• Application, database, and object recovery performed by copying data sets from the backup volumes to the source volumes
• DFSMSdss used to copy data sets
• Fast Replication Preferred option used to copy data
• DFSMSdss uses slow copy methods as data sets cannot be Flashed to source PPRC or XRC volumes.
System Backup Resource Utilization Test
• Environment
• DB2 V10
• IBM DB2 Recovery Expert V3.1
• 2.5 TB DB2 Environment, 80% Utilized
• 38,874 Combined TS/IX Datasets
• Tablespaces & Indexes Backed Up
• Weekly Backup Process
• Traditional 1 Full Image Copy, 6 Incremental Copies
• 20% Changed Pages Daily (Incremental Image Copies)
• DB2 V10 Flash Copy Image Copies (7 Daily Copies)
• IBM DB2 Recovery Expert - System Level Backups (7 Daily Copies)
System Backup Resource Utilization Test Results
[Chart: Elapsed Time in Minutes, EXCPs in Millions, and CPU in Minutes (axis 0 to 400) compared for three strategies: Traditional IC Strategy, Flash Copy Image Copy, and Recovery Expert SLB]
Session Summary
• Storage-aware database utilities provide storage integration to simplify database administration tasks
• System-level backup solutions leverage storage-based fast-replication facilities and investments
• Fast and non-intrusive backup operations with less administration
• Reduces host CPU, I/O and storage utilization
• Backups can be used for system, application, and disaster restart
• Parallel recovery reduces system and application recovery time
• Less skill required to implement advanced backup, recovery, disaster recovery, and cloning solutions
• Implementation planning is important to optimize the benefits
Bryan Smith / Tim Willging Rocket Software, Inc. [email protected] / [email protected]
Unleash the Savings: DB2 Backup & Recovery at Near Zero Cost
Evaluate my session online: www.idug.org/na2013/eval