TRANSCRIPT
Information Management
© 2005 IBM Corporation
IBM Software Group
Availability Overview in IDS Cheetah
Dave Desautels
Lenexa, KS
Senior Software Engineer - Informix
Agenda
Why Replicate?
Enterprise Replication – then and now
HDR – then and now
Additional new functionality in Cheetah
Replication
Why Replicate?
Workload Partitioning – different data 'owned' in different locations (warehouses)
High Availability – provide a hot backup to avoid downtime due to system failure
Capacity Relief – offload some work onto secondary systems (reporting/analysis)
Enterprise Replication
Pre-Cheetah: Enterprise Replication
Uses: workload partitioning, capacity relief
Flexible and scalable – replicate a subset of data; only RDBMS with this feature
Supports update anywhere – very low latency; synchronize local with global data
Integrated – compatible with all other IDS availability solutions; secure data communications
Enterprise Replication Topology
The entire group of servers is the replication domain
Every server has a role:
Root – fully connected to all other Root servers; aware of the entire topology
Non-Root – aware of the entire topology; directly connected only to its children and its parent Root
Leaf – directly connected only to Root or Non-Root nodes
Any node within the domain can replicate data with any other node in the domain
Data can be replicated through a node without that node having to house the replicated table
Any node can be an HDR pair
ER Strengths
Flexible – choose columns to replicate; choose where to replicate
Supports update anywhere – conflicting updates resolved by timestamp, stored procedure, or always-apply
Completely implemented in the server – no additional products to buy
Based on log snooping rather than transaction capture
Support for heterogeneous OSes, IDS versions, and hardware
ER Guidelines
Table must have a primary key that is unique throughout the replication domain
Although conflict resolution is supported, it is better to avoid conflicts
Root commit does not depend on any target commit
Need to ensure plenty of log space in case the network or a target goes down
Smart blob space needed for the queue; can lead to extra log records being written
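A hedged sketch of how a replicate following these guidelines might be defined with the cdr utility (the group names g_serv1/g_serv2, database stores, and table customer are hypothetical, and flag spellings can vary by IDS version):

```shell
# Sketch only -- names are made up; flags may differ by version.
# The customer table is assumed to have a domain-wide unique primary
# key, and timestamp conflict resolution is chosen even though
# conflicts are best avoided in the first place.
cdr define replicate --conflict=timestamp --scope=row repl_customer \
    "stores@g_serv1:informix.customer" "select * from customer" \
    "stores@g_serv2:informix.customer" "select * from customer"
cdr start replicate repl_customer
```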
New Enterprise Replication Features in Cheetah
Improved ER performance: apply parallelism – improved ability to update target tables in parallel; reduces response latency back to the source
Fire triggers during synchronization on ER servers – triggers can now be used during ER synchronization; default is off
New built-in checksum – no longer need to build a private checksum function; no setup needed to run 'cdr check'
'cdr check' checks whether two ER nodes are in sync, and the built-in checksum is properly installed and can be safely used by all ER nodes
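A hedged example of running the new check (the replicate and group names are hypothetical; exact flags may differ by version):

```shell
# Sketch only: verify that the copy of the hypothetical replicate
# repl_customer on g_serv2 is in sync with the master copy on g_serv1.
cdr check replicate --master=g_serv1 --repl=repl_customer g_serv2
```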
HDR Replication
Pre-Cheetah: HDR Replication
Uses: high availability – takeover from the primary; capacity relief – distribute workload
Secondary available for read-only queries
Simple to administer
Integrated – compatible with all other IDS availability solutions; any ER node can also be an HDR pair
Strengths of HDR
Easy setup – just back up the primary and restore on the secondary; no significant configuration required
Secondary can be used for dirty reads
Provides failover to secondary; automatic failover when DRAUTO is set
Stable code – has been part of the product since version 7
Integrates easily with ER
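The easy setup above can be sketched as follows (server names serv_a/serv_b are hypothetical; this is the general shape of the procedure, not a verbatim runbook):

```shell
# On the primary (hypothetical name serv_a): take a level-0 backup
# and declare the HDR pair.
ontape -s -L 0
onmode -d primary serv_b

# On the secondary (hypothetical name serv_b): physically restore the
# backup, then join the pair; the server stays in recovery mode and
# applies logs shipped from the primary.
ontape -p
onmode -d secondary serv_a
```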
Recent HDR Features Prior to Cheetah
DRIDXAUTO – specifies whether the primary will automatically resend an index if the secondary detects index corruption
ER/HDR – support for ER within the HDR environment
Support for logged extended/user-defined types – TimeSeries, smart blobs, logged UDTs
Support for HDR groups – supports connection failover
ontape to STDOUT – allows a secondary to be restored from the primary without doing a backup to media
Customer Problem #1
“We Need Additional Availability!”
Need additional nodes for reports
Scale out for workload distribution
One bunker site for disaster recovery is not enough
Where do we go from Here?
Next evolutionary step for HDR: a new type of secondary – RSS nodes
Can have 0 to N RSS nodes; can coexist with an HDR secondary
Uses: reporting, web applications, additional backup in case the primary fails
Similarities with an HDR secondary node: receives logs from the primary; has its own set of disks to manage; primary performance does not affect RSS; RSS performance does not affect the primary
Differences from an HDR secondary node: can only be promoted to HDR secondary, not primary; can only be updated asynchronously; only manual failover supported
Replication to multiple remote secondary nodes – Remote Standalone Secondaries!
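Setting up an RSS node follows the same backup-and-restore pattern as HDR; a hedged sketch (server names rss_1/serv_a are hypothetical, syntax per the Cheetah-era onmode options):

```shell
# On the primary: register the new RSS node (hypothetical name rss_1).
onmode -d add RSS rss_1

# On the RSS node: physically restore a backup of the primary, then
# connect back to the primary (hypothetical name serv_a) so logs start
# flowing asynchronously.
ontape -p
onmode -d RSS serv_a
```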
Usage of RSS: Additional Capacity
[Diagram: applications connecting to a primary server and several secondary servers]
Customer needs additional capacity for its web applications; adding RSS nodes may be the answer.
Usage of RSS – Availability with Poor Network Latency
Customer in Dallas wants to provide copies of the database in remote locations, but knows there is high latency between the sites.
RSS uses a fully duplexed communication protocol. This allows RSS to be used in places where network communication is slow or not always reliable.
[Diagram: a Primary in Dallas replicating to RSS nodes in Memphis and New Orleans]
Usage of RSS – Bunker Backup
Customer currently has their primary and secondary in the same location and is worried about losing them in a disaster. They would like an additional backup of their system available in a remote location for disaster recovery.
Using HDR to provide high availability is a proven choice. Additional disaster availability is provided by using RSS to replicate to a secure 'bunker'.
[Diagram: Primary and HDR Secondary at the main site; an RSS node at the remote bunker site]
Customer Problem #2
“We have too much data!”
I want the benefits of replication but I can’t afford to duplicate my data
I want to have my data available on my replicas with lowest possible latency
I want to add or remove capacity as demand changes
What is the Answer?
Next evolutionary step: SDS nodes share disks with the primary; can have 0 to N SDS nodes
Uses: adjust capacity online as demand changes; does not duplicate disk space
Features: doesn't require any specialized hardware; simple to set up; can coexist with ER; can coexist with HDR and RSS secondary nodes
Similarities with an HDR secondary node: dirty reads allowed on SDS nodes; the primary can fail over to any SDS node
Differences from an HDR secondary node: only manual failover of the primary supported
HDR with multiple Shared Disk Secondary nodes
[Diagram: a blade server running the Primary and SDS #1–#3 against a shared disk with a shared disk mirror]
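A hedged sketch of bringing up an SDS node (parameter names follow the Cheetah-era onconfig; server names, paths, and values are hypothetical):

```shell
# Sketch only -- names and paths are made up.
# 1. On the primary (hypothetical name serv_a), declare it the
#    shared-disk source:
onmode -d set SDS primary serv_a
# 2. On each SDS node, enable SDS in the onconfig file, e.g.:
#      SDS_ENABLE   1
#      SDS_PAGING   /sds/page1,/sds/page2
#      SDS_TEMPDBS  sdstmp,/sds/tempdbs,2,0,16000
#    then start the node; it attaches to the primary's shared disk.
oninit
```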
SDS Usage: Capacity as Needed
[Diagram: Blade Server A runs a Primary and SDS #1–#3 on shared disk serving web applications; Blade Server B runs a Primary and SDS #1–#3 on shared disk serving analytic applications]
Replication – The Complete Picture
[Diagram: Blade Server A <New Orleans, Building A> runs the Primary and SDS nodes on shared disk with a shared disk mirror; HDR traffic flows to the HDR Secondary on Blade Server B <Memphis>; RSS traffic flows to an RSS node with its own shared disk on Blade Server C <Denver>; Blade Server D <New Orleans, Building B> holds offline Primary, HDR Secondary, and SDS nodes; client apps connect at each site]
Problem – Secure HDR Communication
SECURITY is a concern in every industry
Data that travels between primary and secondary systems may be snooped
Replication over long distances increases the risk
Solution - HDR Encryption
Encrypts data traffic between HDR servers
Based on OpenSSL; uses standard ciphers
Enabled via the onconfig file; off by default
Both primary and secondary must use the same onconfig values
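A hedged sketch of the relevant onconfig entries, which must match on both servers (parameter names per the HDR encryption feature; the values shown are illustrative, not recommendations):

```text
ENCRYPT_HDR      1          # enable encryption of HDR traffic
ENCRYPT_CIPHERS  all        # which OpenSSL ciphers may be negotiated
ENCRYPT_MAC      medium     # message-authentication strength
ENCRYPT_MACFILE  builtin    # MAC key file(s) to use
ENCRYPT_SWITCH   60,60      # cipher/key renegotiation interval (minutes)
```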
Continuous Log Restore
Existing problem (setting up a standby system): difficult to maintain an IDS instance in recovery mode
With ontape, after a log restore completes, ontape waits:
bash-2.05# ontape -r
Please mount tape 1 on /tape/t2 and press Return to continue ...
...
Continue restore? (y/n) y
Do you want to back up the logs? (y/n) n
...
Restore a level 1 archive (y/n) n
Do you want to restore log tapes? (y/n) y
...
Please mount tape 1 on /tape/lt2 and press Return to continue ...
Do you want to restore another log tape? (y/n)
Wait at the prompt for the next tape to fill up, then press Y
Otherwise the system will complete recovery and come to quiescent mode
Preparing Remote Standby System
[Diagram: a physical backup of the Primary is physically restored on the Remote Standby; logical logs are backed up to a log backup device with onbar -b -l or ontape -a, then restored on the standby]
Continuous log restore option with ontape/onbar
With the continuous log restore option, the server will suspend the log restore:
bash# ontape -l -C
Roll forward should start with log number 8
Log restore suspended at log number 10
bash#
Log restore can be restarted again with the ontape command:
bash# ontape -l -C
Roll forward should start with log number 7
Log restore suspended at log number 7
bash#
Recovery mode is terminated by the ontape -l command
Using Continuous log restore option
Logical log backups made from an IDS instance are continuously restored on a second machine
Using continuous log restore does not affect the primary server
Can coexist with HDR/ER for disaster recovery (use HDR for HA and continuous log restore for disaster recovery)
Sending logs can be automated via ALARMPROGRAM in the onconfig file
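As a hedged sketch of that automation (the host and directory names are hypothetical, and the ALARMPROGRAM event class ID that signals a completed log backup is version-specific, so only the shipping step is shown):

```shell
# Hypothetical helper, called from an ALARMPROGRAM script after IDS
# reports that a logical-log backup completed.
STANDBY_HOST=${STANDBY_HOST:-standby.example.com}
LOG_STAGING_DIR=${LOG_STAGING_DIR:-/ids/logship}

ship_log() {
    # $1 = path of the freshly backed-up logical log
    logfile=$1
    # Stage the log locally so a failed transfer can be retried; in
    # production this would be followed by e.g.
    #   scp "$logfile" "$STANDBY_HOST:$LOG_STAGING_DIR/"
    cp "$logfile" "$LOG_STAGING_DIR/" || return 1
}
```

On the standby machine, a scheduled job could then apply each shipped log with ontape -l -C.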
Using Continuous Log Restore for Disaster Recovery
[Diagram: a source IDS instance (online) backs up logical logs to a log backup device with onbar -b -l or ontape -a; a second IDS instance (recovering) restores them with onbar -r -l -C or ontape -l -C. Each instance has its own physical log, logical logs, reserved pages, ROOTDBS, DBS1 (1 GB), DBS2 (1 GB), and DBS3 (2 GB)]
Combining HDR with Continuous Log Restore for HA and Disaster Recovery
[Diagram: a Primary and HDR Secondary pair provide high availability; logical logs are backed up to a log backup device with onbar -b -l or ontape -a and restored on a Remote Standby with onbar -r -l -C or ontape -l -C for disaster recovery]