ETE ZDLRA Business Case
Energy Transfer
Information Technology Infrastructure Services (ITIS)
“Striving to exceed customer expectations via standard solutions”
George F Mamvura Sr. Mgr. IT Infrastructure Services – Oracle/Unix
Key ETE Database Protection Goals
Business Goals
• Never lose critical business data
• No impact on business applications
I.T. Goals
• Ensure database level recoverability
• Centralized service to protect all databases
Current backup solutions fail to achieve these goals!
ETE Backup Technology Evolution
Industry focus on Optimization of Storage Capacity, not pure Database Availability
• 1980’s: Tape Libraries (still in use today)
• 1990’s: Network Attached Storage
• Late 2000’s: Deduplication / SAN Replication
• 2015: Oracle Backup Appliances (ZDLRA)
General Backup Solutions Are Not Designed for Database
…most Treat Databases as Just Files to Periodically Copy
Daily Backup Window
Large performance impact on production
Data Loss Exposure
Lose all data since last backup
Many Systems to Manage
Scale by deploying more backup appliances
Poor Database Recoverability
Many files are copied but protection state of database is unknown
ETE Data Protection: A Complex Task
• Significant challenges for data protection?
#1: Backing up and managing increasingly large data volumes
Backups are too slow
Backups need constant management
Complex Recovery
Production slow down
Coordination of multiple backups
ETE Benefits achieved with the Recovery Appliance
Results Achieved | Recovery Appliance | Cost Savings Over 5 Years
Storage Consumption for Oracle DB Backups | 5x less: 10TB vs about 49TB; estimated 103TB reduction over 5 years | $176k
Capital Savings | Cost avoidance: freed up 30TB of Hitachi Tier 1 storage | $150k
Decreased Backup Time | Reduced backup window by 5 hours | $350k
Better Utilization of IT Assets | Offloaded most backup processing, allowing more resources for data processes; the legacy solution's use of production system resources could affect daily processing | $150k
Team Productivity Savings | Theoretically, reduced personnel time needed to manage backup, restore and cloning | $575.7k
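The savings figures above can be cross-checked with a few lines of arithmetic. The sketch below simply adds up the five line items and checks the "5x less" storage claim against the 10TB and 49TB figures. The five-year total it prints is derived here and is not stated on the slide, and the function names are illustrative only.

```python
# Five-year savings line items from the slide, in thousands of USD.
savings_k = {
    "Storage consumption for Oracle DB backups": 176.0,
    "Capital savings (freed Tier 1 storage)": 150.0,
    "Decreased backup time": 350.0,
    "Better utilization of IT assets": 150.0,
    "Team productivity savings": 575.7,
}


def total_savings_k(items):
    """Sum the line items, rounded to one decimal place (thousands of USD)."""
    return round(sum(items.values()), 1)


def storage_ratio(legacy_tb, appliance_tb):
    """How many times less storage the appliance uses than the legacy solution."""
    return legacy_tb / appliance_tb


print(f"Derived five-year total: ${total_savings_k(savings_k)}k")
print(f"Storage ratio (49TB vs 10TB): {storage_ratio(49, 10):.1f}x")
```

The storage ratio works out to just under five, which matches the slide's rounded "5x less".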
ETE Benefits achieved with the Recovery Appliance – continued…
Results Achieved | Recovery Appliance
Extended PITR Recovery Window on Disk | Allowed ETE to have a longer recovery window at lower cost; the legacy solution's retention period was limited by high storage costs
Improved Development and Project Efficiency | Centralized and efficient project database builds/backups/restores replaced the previous cumbersome process, with substantial DBA time savings
Centralized Archival | Offsite data archive for compliance and DR is now accessed from a single repository and replicated, significantly reducing the negative effect on production systems
ETE Platform as a Service | Improved platform recoverability standards to a Diamond service level (see Data Loss Calculator)
ETE Benefits of a ZDLRA Solution- continued…
Backup Storage Analysis
[Chart: DB Backup Storage Growth. X axis: Year (1 to 6); Y axis: TB (0 to 140); series: Total Current (TB), ZDLRA (TB)]
ETE Benefits of a ZDLRA Solution- continued…
Five Year Cost Analysis
[Chart: Capital Cost. Y axis: $0 to $800,000; bars: Cost Status Quo, Cost ZDLRA]
ETE Benefits of a ZDLRA Solution- continued …
Decrease Backup Window Time
[Chart: DB Backup Window. Y axis: hours (0 to 9); bars: DB Backup Window (Tape) hrs, DB Backup Window (ZDLRA) hrs]
ETE Benefits of a ZDLRA Solution- continued …
Team Productivity Improvement Analysis
Productivity | Investment | Year 1 | Year 2 | Year 3 | Year 4 | Year 5 | Totals
Status Quo
Cloning | $0.00 | $16,380.00 | $16,380.00 | $16,380.00 | $16,380.00 | $16,380.00 | $81,900.00
Backup Window | $0.00 | $98,280.00 | $98,280.00 | $98,280.00 | $98,280.00 | $98,280.00 | $491,400.00
Restore Time | $0.00 | $45,000.00 | $45,000.00 | $45,000.00 | $45,000.00 | $45,000.00 | $225,000.00
With ZDLRA
Cloning | $0.00 | $4,680.00 | $4,680.00 | $4,680.00 | $4,680.00 | $4,680.00 | $23,400.00
Backup Window | $0.00 | $21,840.00 | $21,840.00 | $21,840.00 | $21,840.00 | $21,840.00 | $109,200.00
Restore Time | $0.00 | $18,000.00 | $18,000.00 | $18,000.00 | $18,000.00 | $18,000.00 | $90,000.00
Productivity Savings | $0.00 | $115,140.00 | $115,140.00 | $115,140.00 | $115,140.00 | $115,140.00 | $575,700.00
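The Productivity Savings row follows directly from the six cost rows above it. As a minimal sketch, the calculation below recomputes the annual and five-year savings from the yearly figures on the slide. The dictionary and function names are illustrative and not part of the original analysis.

```python
YEARS = 5

# Annual productivity cost per activity in USD, as listed on the slide.
status_quo = {"Cloning": 16_380, "Backup Window": 98_280, "Restore Time": 45_000}
with_zdlra = {"Cloning": 4_680, "Backup Window": 21_840, "Restore Time": 18_000}


def annual_savings(before, after):
    """Yearly savings: total status quo cost minus total cost with ZDLRA."""
    return sum(before.values()) - sum(after.values())


def five_year_savings(before, after, years=YEARS):
    """Savings over the whole period, assuming the same figure every year."""
    return annual_savings(before, after) * years


def savings_by_activity(before, after, years=YEARS):
    """Break the period savings down per activity."""
    return {name: (before[name] - after[name]) * years for name in before}


print(annual_savings(status_quo, with_zdlra))
print(five_year_savings(status_quo, with_zdlra))
print(savings_by_activity(status_quo, with_zdlra))
```

The per-activity breakdown shows that most of the saving comes from the shorter backup window.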
ETE Benefits of a ZDLRA Solution- continued…
Recovery Appliance Unique Benefits for Business and I.T.
Minimal Impact Backups
Production databases only send changes. All backup and tape processing offloaded
Eliminate Data Loss
Real-time redo transport provides instant protection of ongoing transactions
Scale Protection
Easily protect all databases in the data center using massively scalable service
Database Level Recoverability
End-to-end reliability, visibility, and control of databases - not disjoint files
ETE Benefits of a ZDLRA Solution- continued …
Minimal Impact Backups: Boost Production Server Performance
• Offload Backup Processing, Eliminate Expensive Backup Agents, Reduce Network Load
Prior Database Server architecture (performance degrades with backups):
• Disk / Tape / Dedupe Backup Agents
• Backup Operations: Merge, Compress, Validate, Delete, Full/Tape Backups
Database Server with Recovery Appliance (performance improves with backup offload):
• Delta Push
• Disk / Tape / Dedupe Backup Agents and Backup Operations (Merge, Compress, Validate, Delete, Full/Tape Backups) no longer run on the database server
ETE Benefits of a ZDLRA Solution- continued …
Recovery Appliance: End-to-End Data Protection Visibility
Policy-based Management: Application-oriented Data Protection
1. Application-oriented Protection Policy
2. End-to-end Oracle Block Validation
3. EM Cloud Control: Integrated Management
[Diagram labels: Well Defined Oracle Blocks, RMAN Backup Set, Recovery Appliance]
Accelerating Database Backup and Recovery with Zero Data Loss Recovery Appliance
Kevin Prendergast, Lead Specialist
Javier Ruiz, Technical Team Lead
Agenda
• Configuration
• Tape Hardware
• Network Setup
• Backup Configuration
• Restore and Clones
• Tape Backup Configuration
• Replication Configuration
• Troubleshooting
• Monitoring
ZDLRA Setup
[Diagram: Protected Databases back up to the Upstream appliance, which replicates over the WAN to the Downstream appliance]
Protected Databases to Upstream (RMAN):
• Nightly incremental backups
• Archivelog backups
• Real-time redo shipping
Upstream to Downstream over WAN (Replication):
• Replicate only Gold and Platinum
Upstream, OSB Tape Library 1:
• Nightly incremental
• Weekly full
• Archivelog
Upstream, OSB Tape Library 2:
• Archivelog
• ZDLRA file system
Downstream, OSB Tape Library 1:
• Nightly incremental
• Weekly full
• Archivelog
Tape Hardware
EMC Library NDMA
Oracle Secure Backup (OSB) 12.1
Direct fiber
Media manager setup configured in OEM recovery appliance
Network Configuration
New Cisco 2xxx Nexus switch infrastructure providing a 40Gb uplink to core.
Switches installed in each rack with 10Gb to each port.
Compute Node Network Trunking
• NIC LACP (Link Aggregation Control Protocol) bonding is configured to provide redundancy and performance.
• The bonding trunks the interfaces together to provide a single network connection.
• Two (2) x 10Gb for 20Gb bandwidth using Dynamic Link Aggregation (mode 4).
• IEEE 802.3ad Dynamic link aggregation: creates aggregation groups that share the same speed and duplex settings, and utilizes all slaves in the active aggregator according to the 802.3ad specification.
• Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the xmit_hash_policy option.
• The current configuration uses teamed active/passive 10Gb interfaces due to ETP network topology limitations and upgrade timelines.
• Future progression to trunked interfaces for 40Gb uplink to core network and switch configuration, when the Data Center is fully modernized.
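To make the "simple XOR" transmit hash policy more concrete, here is a rough sketch of how an outgoing frame could be mapped to one of the bonded interfaces. It is a simplified model of the layer 2 policy, XORing only the last byte of the source and destination MAC addresses and taking the result modulo the slave count; the real kernel hash may fold in further fields. The MAC addresses and the function names are made up for illustration.

```python
def mac_last_byte(mac: str) -> int:
    """Return the last octet of a colon-separated MAC address as an integer."""
    return int(mac.split(":")[-1], 16)


def select_slave(src_mac: str, dst_mac: str, slave_count: int) -> int:
    """Pick a slave index with a simplified layer 2 XOR hash.

    Simplified model: (source MAC XOR destination MAC) modulo slave count,
    using only the last octet of each address.
    """
    return (mac_last_byte(src_mac) ^ mac_last_byte(dst_mac)) % slave_count


# Hypothetical addresses: one compute node talking to two different peers
# over a bond of two 10Gb interfaces.
node = "00:10:e0:aa:bb:01"
peers = ["00:10:e0:cc:dd:10", "00:10:e0:cc:dd:11"]
for peer in peers:
    print(peer, "-> slave", select_slave(node, peer, slave_count=2))
```

With a hash of this kind, a given source and destination pair always lands on the same interface, so a single flow is limited to one 10Gb link and the 20Gb aggregate is only reached when traffic is spread across several peers.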
Issues with Legacy Solution
• Daily backup window
  • The number and size of DBs cause long RMAN backups
  • Batch processing and app jobs compete with backups for resources
• Data Loss Exposure
  • Archivelogs generated since the last full backup are not backed up and remain in ARCHDG
  • Manual process to restore files from tape and make sure the correct files are restored
• Space constraints with holding backups
• Scheduled maintenance requires special backups that take more time
• Tape backup with no understanding of RMAN backup pieces
• Clones take more time and manual work to transfer RMAN backups
Energy Transfer ZDLRA Benefits
• Significant performance improvement (reduction in time):
• Backups and restores of databases using the Recovery Appliance.
• Cloning of new databases from backups on the Recovery Appliance.
• Incremental forever architecture reduced backup window
• Offload backup processing from production servers:
• Backup compression, merge of incrementals, validation, deletion and backup to tape.
• Integrated and automated multi-tiered deployment (Replica and tape)
• RMAN automatically restores from tape if backup isn’t on the appliance
• Freed up about 22TB of local /backup SAN space
• Non-prod environments can access backup files in ZDLRA, producing faster refreshes
• Real-time redo shipping from memory, allowing up-to-the-second point-in-time recovery
• 10Gb Nexus network connectivity in Active/Passive mode (mode 1)
• Online backup and restore via RMAN, with fast access to data.
• Reduces backup overhead on large filesystem cache utilization.
Backup Module
• SBT library is bundled with ZDLRA database
• Protected databases can download module from OTN or install from ZDLRA $ORACLE_HOME/lib
• All protected databases will have the module installed
  • $ORACLE_HOME/lib
  • Central path for lib
• Oracle wallet created to store the credentials needed to authenticate the protected database with ZDLRA.
  • Recommend creating a strong 12+ character password
ZDLRA User Accounts and Catalog
• Virtual private catalog (VPC) users are created on the appliance
  • DBAs can only access metadata for protected databases they manage
• Authentication credentials stored in Oracle wallet on the protected database host.
• Recovery Appliance catalog is managed by the appliance
  • Eliminates the need to maintain and manage a separate RMAN catalog
• Existing catalog metadata can be imported using RMAN IMPORT
DB Backup and Restore
• Nightly incremental backups
• First backup is incremental level 0, then level 1 thereafter
• Catalog reports a virtual full (VB$) available to the point of the incremental
• Settings on protected databases
• Enable block change tracking for fast incremental
• When database is added to Recovery Appliance management, Enterprise Manager (EM) automatically updates RMAN backup setting to use Recovery Appliance SBT_TAPE library
• Recovery Window for disk and tape backups configured via Recovery Appliance Protection Policy
• Backups beyond recovery window automatically purged by the appliance
• Eliminates need to perform RMAN DELETE OBSOLETE operation on protected databases
• Backup, restore and duplicate operations must connect to catalog
• Restore any virtual full backup eliminating overhead and time needed to merge / apply incremental
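The idea of a virtual full can be illustrated with a toy model: start from the level 0 image and, for each nightly level 1, overlay only the blocks that changed. The sketch below is not how the appliance stores data internally (it materializes every image as a plain dictionary purely for clarity), and the block numbers, contents and function name are invented for the example.

```python
from typing import Dict, List

# Toy stand-in for a backup: block number -> block contents.
Blocks = Dict[int, str]


def virtual_fulls(level0: Blocks, incrementals: List[Blocks]) -> List[Blocks]:
    """Return one full image per backup: the level 0, then one per level 1."""
    images = [dict(level0)]
    for changed in incrementals:
        image = dict(images[-1])  # start from the previous full image
        image.update(changed)     # overlay only the blocks that changed
        images.append(image)
    return images


level0 = {1: "a0", 2: "b0", 3: "c0"}
nightly = [{2: "b1"}, {1: "a2", 3: "c2"}]  # two nights of changed blocks

for day, image in enumerate(virtual_fulls(level0, nightly)):
    print(f"virtual full after backup {day}: {image}")
```

Because each point in time already corresponds to a complete image, a restore in this model reads one image directly instead of restoring the level 0 and then applying each incremental in turn.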
Protection Policies
• Central mechanism
• Recovery window goals for disk backups
• Recovery window goals for tape backups
• Info about replication of backups
• Disk recovery window 4 days
• Maximum disk retention 15 days
• Media Manager 28 days
• Unprotected data window
• Policies: Bronze, Silver, Gold, Platinum
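One way to picture a protection policy is as a small record holding the retention settings plus a replication flag. The sketch below models the four tiers with the 4, 15 and 28 day values from this slide and marks only Gold and Platinum as replicated, as stated elsewhere in the deck. It assumes every tier shares the same three values, because the slide does not break them out per tier, and it reads "Media Manager 28 days" as tape retention. The class, field and function names are hypothetical and are not the appliance's own API.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ProtectionPolicy:
    name: str
    disk_recovery_window: timedelta      # recovery window goal on disk
    max_disk_retention: timedelta        # longest time backups stay on disk
    media_manager_retention: timedelta   # assumed to mean tape retention
    replicate: bool


# Only Gold and Platinum are replicated to the downstream appliance.
REPLICATED_TIERS = {"Gold", "Platinum"}


def build_policies():
    # ASSUMPTION: all tiers share the 4 / 15 / 28 day values from the slide.
    return {
        tier: ProtectionPolicy(
            name=tier,
            disk_recovery_window=timedelta(days=4),
            max_disk_retention=timedelta(days=15),
            media_manager_retention=timedelta(days=28),
            replicate=tier in REPLICATED_TIERS,
        )
        for tier in ("Bronze", "Silver", "Gold", "Platinum")
    }


def within_disk_recovery_window(policy, backup_age):
    """True if a backup of this age still falls inside the disk recovery window."""
    return backup_age <= policy.disk_recovery_window


def past_max_disk_retention(policy, backup_age):
    """True if a backup of this age is older than the maximum disk retention."""
    return backup_age > policy.max_disk_retention


policies = build_policies()
print([name for name, p in policies.items() if p.replicate])
```

Keeping the settings in one record per tier mirrors the "central mechanism" idea: a database inherits its retention and replication behaviour from the policy it is assigned to.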
Improved Backup Performance
• Before ZDLRA, full compressed backups were taking between 8 and 12 hours with 4 channels
• Backup window: 5pm to 5am
• ZDLRA nightly incremental backups complete in 2 to 3 minutes, depending on the number of changed blocks.
• Now all backups complete before 11pm
[Chart: RMAN Backup Times for DB1 to DB4. Y axis: 0 to 14; series: Before ZDLRA, ZDLRA Level 0, After ZDLRA Level 1]
Load on the server from RMAN Backups
• Offloading
• Compression
• Backup validation
• CPU usage
• Power usage
• Network usage
[Chart: CPU Usage for Node 1 to Node 3. Y axis: 0 to 60; series: Before ZDLRA, After ZDLRA]
[Chart: Power Usage for Node 1 to Node 3. Y axis: 0 to 60; series: Before ZDLRA, After ZDLRA]
Restores Are 3x Faster
• Before ZDLRA, restores could take up to 10 hours
• ZDLRA delivers a 75% improvement on restore times.
• This sped up our clones of non-prod databases.
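A speedup factor and a percentage reduction in restore time are two ways of stating the same thing, and it can help to convert between them. The sketch below does that conversion and applies it to the 10 hour figure above. It is plain arithmetic under the assumption that "improvement" means a reduction in elapsed time, and the function names are illustrative.

```python
def time_reduction_pct(speedup: float) -> float:
    """Percentage reduction in elapsed time for a given speedup factor."""
    return (1 - 1 / speedup) * 100


def speedup_from_reduction(pct: float) -> float:
    """Speedup factor implied by a percentage reduction in elapsed time."""
    return 1 / (1 - pct / 100)


def restore_hours(before_hours: float, speedup: float) -> float:
    """Elapsed restore time after applying a speedup factor."""
    return before_hours / speedup


print(f"3x faster        = {time_reduction_pct(3):.1f}% less time")
print(f"75% less time    = {speedup_from_reduction(75):.1f}x faster")
print(f"10 h restore, 3x = {restore_hours(10, 3):.2f} h")
print(f"10 h restore, 4x = {restore_hours(10, 4):.2f} h")
```

Read this way, a 3x speedup corresponds to roughly a two-thirds reduction in restore time, while a 75% reduction corresponds to 4x, so the two figures on the slide describe slightly different results.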
[Chart: RMAN Restore Times for DB1 to DB4. Y axis: 0 to 12; series: Before ZDLRA, With ZDLRA]
Redo Shipping
• Support for 11g and 12c
• Data Guard redo transport
• Incoming redo blocks are sent directly from the SGA on the protected database
• Writes logs to the redo staging location
• Converts the logs to compressed archived log backups and writes them to the delta store
• Protects against data loss using archived redo logs
• Run archivelog backup and delete
• Archivelog deletion policy applies on standby
• Requires restart of database
Parameter Changes on Protected DB
log_archive_config='dg_config=(zdlra,standby)'
redo_transport_user=RAVPC1
log_archive_dest_2='VALID_FOR=(ALL_LOGFILES, ALL_ROLES) ASYNC DB_UNIQUE_NAME=zdlra', 'SERVICE="<NODE>ingest-scan:1521/zdlra:dedicated"'
OSB Tape Backup Configuration
• OSB 12.1
• Upstream: SL500, SL150
• Downstream: SL500
• Configure copy-to-tape jobs in OEM
  • Adding a second library to upstream required custom setup
  • Custom OSB storage selector for archivelog backup pieces to second library
• ZDLRA file system tape backup
• Tape life cycle automated
Replication Configuration
• RA version ra_automation-12.1.1.1.7-23494283.x86_64
• Replicate Gold and Platinum protection policies
Troubleshooting
• Methodology: Doc ID 2066528.1
• Startup and Shutdown Process
• Customer Outage Classifications and Restoration Action Plans: Doc ID 2022047.1
• Make sure you are on the latest patch for RA: Doc ID 2028931.1
• Protected DBs: 10.2.0.4 and below not supported
• Replication: transfer issue fixed with ra_automation-12.1.1.1.7-23494283
• Incidents and events: watch for alerts showing a protected DB not meeting its recovery window
Monitoring Cloud Control
Monitoring and configuration via Cloud Control
• Summary
• Protected DBs
• Data sent and received
• Performance
• Media management
• Copy to Tape
• Storage locations
• Replication
• Incidents and events
• BI Reports
Monitoring Cloud Control
OSB Monitoring
• Monitor Devices
• Administer Devices
• Administer Library
• Setup file system backup jobs