© 2015 IBM Corporation, December 17, 2015
TRANSCRIPT
DB2 pureScale Overview and Technology Deep Dive
<<Speaker Name Here>>
<<Speaker Title Here>>
<<For questions about this presentation contact Kelly Schlamb ([email protected])>>
Critical IT Applications Need Reliability and Scalability

Clients need a highly scalable, flexible solution for the growth of their information, with the ability to easily grow existing applications.

Down-time is Not Acceptable
– Any outage means lost revenue and permanent customer loss
– Today's distributed systems need reliability

Local Databases are Becoming Global
– Successful global businesses must deal with exploding data and server needs
– Competitive IT organizations need to handle rapid change
Learning from the undisputed Gold Standard... System z

Introducing DB2 pureScale

Extreme capacity
– Buy only what you need, add capacity as your needs grow

Application transparency
– Avoid the risk and cost of application changes

Continuous availability
– Deliver uninterrupted access to your data with consistent performance
DB2 pureScale Architecture

• Multiple DB2 members for a scalable and available database environment
• Client applications connect to any DB2 member to execute transactions
  • Automatic workload balancing
• Shared storage for database data and transaction logs
• Cluster caching facilities (CFs) provide centralized global locking and page cache management for the highest levels of availability and scalability
  • Duplexed, for no single point of failure
• High speed, low latency interconnect for efficient and scalable communication between members and CFs
• DB2 Cluster Services provides integrated failure detection, recovery automation, and the clustered file system

[Diagram: clients connect over a cluster interconnect to a DB2 pureScale cluster (instance) of members, each running cluster services (CS), plus a primary CF and a secondary CF; all members access shared storage holding the database and per-member transaction logs]
Leveraging IBM’s System z Sysplex Experience and Know-How
DB2 vs. DPF (Shared Nothing) vs. pureScale (Shared Data)

[Diagram comparing three topologies:]
• Core DB2: a single DB2 instance with one transaction log and one database, presenting a single database view.
• DB2 with Database Partitioning Feature (DPF): ideal for warehousing and OLAP scale-out and massively parallel query processing. A single query (SQL 1) is split into sub-queries (SQL 1', SQL 1'', SQL 1''') that run in parallel against partitions (Part 1, Part 2, Part 3), each with its own log.
• DB2 pureScale data sharing: ideal for active/active OLTP/ERP scale-out. Independent transactions (Tran 1, Tran 2, Tran 3) run on any member, each member with its own log, all with shared data access and a single database view.
Comparing pureScale with Other DB2 HA Options

Integrated Clustering
• Active/passive
• Hot/cold, with failover typically in minutes
• Easy to set up
• DB2 ships with integrated TSA failover software
• No additional licensing required

HADR
• Active/passive or active/active (with Reads on Standby)
• Hot/warm or hot/hot (with RoS), with failover typically less than one minute
• Easy to set up
• DB2 ships with integrated TSA
• Minimal licensing (full licensing required if standby is active)
• Perform system and database updates without interruption

pureScale
• Active/active
• Hot/hot, with automatic and online failover
• Integrated solution includes CFs, clustering, and shared data access
• Included as part of DB2 "Advanced" editions
• Perform system and database updates in rolling online fashion
Machine Deployment Examples

Highly flexible topologies due to the logical nature of members and CFs
– A member and a CF can share the same machine
– For AIX, separate members and CFs can run in different LPARs
– Virtualized environments via VMware and KVM

Dedicated cores for CFs
– Optimizes response time

No pureScale licenses required for CF hosts
– You only need to license the CPUs for hosts on which members are running

pureScale included in
– DB2 Advanced Workgroup Server Edition
– DB2 Advanced Enterprise Server Edition
– DB2 Developer Edition
– DB2 Business Application Continuity Offering

[Diagram: example topologies on two machines (each hosting a member plus the primary or secondary CF), four machines (members and CFs in separate partitions), and ten machines (eight members plus dedicated primary and secondary CF hosts)]
pureScale Client Configuration

Workload Balancing (WLB)
– Application requests balanced across all members or subsets of members
– Takes server load of members into consideration
– Connection-level or transaction-level balancing

Client Affinity
– Direct different groups of clients or workloads to specific members in the cluster
– Consolidate separate workloads/applications on the same database infrastructure
– Define a list of members for failover purposes

Automatic Client Reroute (ACR)
– Client automatically connected to a healthy member in case of member failure
– May be seamless, in that no error messages are returned to the client
– Application may have to re-execute the transaction
Online Recovery from Failures

DB2 pureScale's design point is to maximize availability during failure recovery processing

When a database member fails, only in-flight data remains locked until member recovery completes
– In-flight = data being updated on the failed member at the time it failed

Target time to availability of rows associated with in-flight updates on a failed member is measured in seconds

[Chart: percentage of data available vs. time (~seconds) after a database member failure; only data with in-flight updates is locked during recovery, and availability returns to 100% within seconds]

"We pulled cards, we powered off systems, we uninstalled devices, we did everything we could do to make the cluster go out of service, and we couldn't make it happen."
-- Robert M. Collins Jr. (Kent), Database Engineer, BNSF Railway Inc.
Scale with Ease

Scale up or out... without changing your applications
– Efficient coherency protocols designed to scale without application changes
– Applications automatically and transparently workload balanced across members
– Up to 128 members

Without impacting availability
– Members can be added while the cluster remains online

Without administrative complexity
– No data redistribution required

[Diagram: a new member, with its own log, added online to a cluster of members and duplexed CFs]

"DB2 pureScale is the only solution we found that provided near linear scalability... It scales 100 percent, which means when I add servers and resources to the cluster, I get 100 percent of the benefit. Before, we had to 'oversize' our servers, and used only 50 - 60 percent of the available capacity so we could scale them when we needed."
-- Robert M. Collins Jr. (Kent), Database Engineer, BNSF Railway Inc.
DB2 pureScale Daily Licensing

[Chart: monthly workload CPU demand from January through December plotted against licensing need; with daily licensing, the licensing need tracks actual demand, and the gap between peak-based licensing and actual demand represents license cost savings]
Online System and Database Maintenance

Transparently perform maintenance to the cluster in an online rolling fashion
– DB2 pureScale fix packs
– System updates such as operating system fixes, firmware updates, etc.

No outage experienced by applications

DB2 fix pack install involves a single installFixPack command to be run on each member/CF
– Quiesces the member
  • Existing transactions allowed to finish
  • New transactions sent to other members
– Installs binaries
– Updates the instance
  • Member still behaves as if running on the previous fix pack level
– Unquiesces the member

Followed by a final cluster-wide installFixPack command to complete and commit the updates across the cluster
– Instance now running at the new fix pack level
Rolling Database Fix Pack Updates (cont.)

[Diagram: three members plus primary (CF P) and secondary (CF S) CFs, all starting at the GA code level. Steps 1-3: run installFixPack -online on each member in turn; steps 4-5: run it on the secondary and then the primary CF; step 6: run installFixPack -check_commit; step 7: run installFixPack -commit_level. Each host moves from GA to FP1, but the cluster is effectively running at GA until the commit, after which it runs at FP1]

Transactions are routed away from the member undergoing maintenance, so no application outages are experienced. Workload balancing brings work back after maintenance is finished.

The cluster is not running at the new level until the commit is performed.

Simplified installFixPack commands are shown here for example purposes.
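The rolling sequence above can be sketched as a simple orchestration loop. This is an illustrative model only: the host records, the router, and the update steps are hypothetical stand-ins for the real installFixPack and workload-balancing machinery.

```python
# Illustrative sketch of a rolling fix pack update across cluster hosts.
# Each host is drained (quiesced), updated, and restored in turn, and the
# new level only takes effect cluster-wide at the final commit step.

class Router:
    """Hypothetical stand-in for workload balancing / client routing."""
    def __init__(self):
        self.drained = set()
    def drain(self, host):
        self.drained.add(host["name"])      # route new work elsewhere
    def restore(self, host):
        self.drained.discard(host["name"])  # balancing brings work back

def rolling_update(hosts, router, new_level):
    """Update each host in turn while the rest keep serving work."""
    for host in hosts:
        router.drain(host)             # quiesce: existing txns finish
        host["binaries"] = new_level   # install updated binaries
        router.restore(host)           # unquiesce the host
    for host in hosts:                 # final cluster-wide commit
        host["effective"] = new_level

hosts = [{"name": n, "binaries": "GA", "effective": "GA"}
         for n in ("member0", "member1", "member2", "cf_s", "cf_p")]
router = Router()
rolling_update(hosts, router, "FP1")
assert all(h["binaries"] == "FP1" and h["effective"] == "FP1" for h in hosts)
assert not router.drained  # every host is back in rotation
```

The key property modeled here is that at no point are all hosts drained at once, so the cluster as a whole stays available throughout.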
Continuous Availability During Maintenance and System Growth

Storage server
• 1 IBM Storwize V7000
• 8 SSD drives (2 TB usable capacity)
  • 4 for data, 4 for logs

Database servers
• SUSE Linux Enterprise Server 11 SP1
• 6 IBM x3950 X5s (Intel Xeon X7560 @ 2.27 GHz, 4 sockets/32 cores/64 threads)
• Mellanox ConnectX-2 InfiniBand card
• 128 GB system memory

[Chart: total transactions per second over a 4-hour run; throughput remains steady while the secondary CF, the primary CF, and members 1, 2, and 3 are updated in turn, then increases when a 4th member is added and the new member is started with more application clients]
DB2 pureScale Supported Hardware and OS

x86 Intel-compatible servers, BladeCenter HS22/HS23, and Flex systems, OR POWER6, POWER7/7+, POWER8, and Flex systems

TCP/IP sockets interconnect, or high speed, low latency RDMA-based interconnect (InfiniBand, 10 GE (RoCE))

GPFS-compatible storage (ideally storage that supports SCSI-3 PR fast I/O fencing)
Cluster Interconnect Options

RDMA-capable interconnect for best performance and scalability
– Requires specialized network adapter cards
  • InfiniBand
  • 10 Gigabit Ethernet RoCE (RDMA over Converged Ethernet)

TCP/IP sockets interconnect for faster cluster setup and lower cost deployments using commodity network hardware
– 10 Gigabit Ethernet (10GE) strongly recommended for production installations
– Appropriate for smaller clusters with moderate data sharing workloads where availability is the primary motivator for pureScale

No compromise in availability; both options provide exactly the same levels of high availability

Choice of interconnect based on your performance and scalability requirements
Relative Performance Between RDMA and TCP/IP Sockets

[Charts: relative transactions per second for 1-4 members, comparing sockets (TCP/IP over Ethernet) with InfiniBand (RDMA), for a transactional workload with 70% reads/30% writes and one with 90% reads/10% writes]

• 1-4 members, 2 CFs
• Intel x86 servers
• 32 logical CPUs per server
• Single adapter per server
• IBM DS3000 storage
Virtualized Deployments of DB2 pureScale

Virtualized environment options include
– RDMA-capable interconnect
  • AIX LPARs, with dedicated RDMA network adapters per partition
  • KVM with RHEL, with dedicated 10 GE RoCE network adapters per partition
– TCP/IP sockets interconnect
  • AIX LPARs
  • VMware (ESXi, vSphere) with RHEL or SLES
  • KVM with RHEL

Virtualized environments provide a lower cost of entry and are perfect for
– Development
– QA and testing
– Production environments with moderate workloads
– Getting hands-on experience with pureScale
Virtualized Deployments: Supported Configurations

Operating System | Virtualization Technology | InfiniBand Supported? | 10GE RoCE Supported? | TCP/IP Sockets Supported?
AIX, SLES, RHEL | No virtualization (bare metal) | Yes * | Yes * | Yes
AIX | PowerVM (LPARs) | Yes * | Yes * | Yes
SLES | VMware | No | No | Yes
RHEL | VMware | No | No | Yes
RHEL | KVM | No | Yes * | Yes

* Dedicated interconnect adapter(s) per host/partition

VMware supported with
– Any x64 system that is supported by both the VM and DB2 pureScale
– Any Linux distribution that is supported by both the VM and DB2 pureScale

KVM supported with
– Any x64 system that is supported by both RHEL 6.2 and DB2 pureScale
– RHEL 6.2 and higher
DB2 pureScale Supported Storage

Full explanation in the "Shared Storage Considerations" section of the Information Center
– http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.qb.server.doc/doc/c0059360.html

pureScale supports all storage area network (SAN) and directly attached shared block storage, referenced as a logical unit number (LUN)

pureScale exploits two specific features in storage, when available, for best results with recovery
– Fast I/O fencing
– Tie-breaker support
  • Storage that supports both features is considered "category 1"
Fast I/O Fencing

SCSI-3 Persistent Reserve (PR) provides fast I/O fencing for fast recovery times
– Fencing in as little as 1-2 seconds, allowing for host failure detection and I/O fencing in as little as 3 seconds

Guarantees protection of shared data in the event that one or more errant hosts split from the network

Substantially more robust than technology used by others (self-initiated reboot-based algorithms or STONITH)

Allows re-integration of a split host into the cluster when the network heals, without requiring a reboot
Storage Categories

Storage controller and multipath I/O driver combinations are divided into three categories
– Category 1
  • Verified with pureScale to support fast I/O fencing and to act as a tie-breaker
– Category 2
  • Verified with pureScale to support tie-breaker but not fast I/O fencing
– Category 3
  • Has not been verified with pureScale to support tie-breaker, fast I/O fencing, or both
  • The storage controller or multipath I/O driver may not support these features, or it may support them but they have not yet been validated

IBM is working closely with many storage vendors to move them up to category 1
Current Category 1 Storage List

See the "Shared Storage Considerations" section of the Information Center for the latest information:
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.qb.server.doc/doc/c0059360.html

Includes fast I/O fencing and tie-breaker support
What is a DB2 Member?

A DB2 engine address space
– i.e., a db2sysc process and its threads

Members share data
– All members access the same shared database
– Also known as "data sharing"

Each member has its own
– Buffer pools
– Memory regions
– Log files

Members are logical. You can have
– 1 per machine or LPAR (recommended)
– >1 per machine or LPAR (not recommended)

[Diagram: two members (Member 0 and Member 1), each a db2sysc process containing DB2 agents and other threads, buffer pool(s), and the log buffer, dbheap, and other heaps; both access a shared database (a single database partition), and each has its own log]
What is a Cluster Caching Facility (CF)?

Software technology that assists in global buffer coherency management and global locking
– Shared lineage with System z Parallel Sysplex
– Software based

Services provided include
– Group Buffer Pool (GBP)
– Global Lock Management (GLM)
– Shared Communication Area (SCA)

Members duplex GBP, GLM, and SCA state to both a primary and a secondary CF
– Done synchronously
– Having a secondary is optional (but recommended)
– Set up automatically, by default

[Diagram: members, each with agents, buffer pools, and heaps, connected to a primary CF holding the GBP, GLM, and SCA (duplexed to a secondary CF), all sharing a database (a single database partition) with per-member logs]
CF Self-Tuning Memory

CF memory is optimally distributed between consumers based on workload
– Less administrative overhead for the DBA, with a reduction in memory monitoring and management

Can function at two levels
– Dynamic distribution of CF memory between multiple databases in an instance
– Dynamic distribution of a database's CF memory between its consumers
  • Group buffer pool (GBP)
  • Global lock manager (GLM)
  • Shared communication area (SCA)

Beneficial for multi-tenant environments where multiple databases are consolidated within the same DB2 pureScale cluster

[Diagram: CF memory divided between DB #1 and DB #2, each with its own GBP, GLM, and SCA allocations]
Achieving Efficient Scaling – Key Design Points

Deep RDMA exploitation over a low latency fabric
– Enables round-trip response times of ~10-15 microseconds

Silent invalidation
– Informs members of page updates
– Requires no CPU cycles on those members
– No interrupt or other message processing required
– Increasingly important as the cluster grows

Hot pages available without disk I/O from GBP memory
– RDMA and dedicated threads enable read page operations in ~10s of microseconds

[Diagram: the CF's GBP, GLM, and SCA serving the members' buffer and lock managers; a lock request ("Can I have this lock?" / "Yes, here you are."), a new page image pushed to the GBP, and a page read back from the GBP]
Scalability Demonstration

[Chart: throughput relative to one member as members are added:
– 1.98x @ 2 members
– 3.9x @ 4 members
– 7.6x @ 8 members
– 10.4x @ 12 members]

Configuration:
– OLTP 80/20 R/W workload, no affinity
– 12 8-core p550 members, 64 GB and 5 GHz each
– Duplexed CFs on 2 additional 8-core p550s, 64 GB and 5 GHz each
– 20 Gb/s IB HCAs, 7874-024 IB switch
– DS8300 storage, 576 15K disks, two 4 Gb FC switches
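As a quick check, the per-member scaling efficiency implied by the chart's speedup figures can be computed directly (the speedup numbers are taken from the chart above):

```python
# Scaling efficiency = measured speedup / ideal linear speedup.
speedups = {2: 1.98, 4: 3.9, 8: 7.6, 12: 10.4}  # from the chart above

efficiency = {n: s / n for n, s in speedups.items()}
for n, e in sorted(efficiency.items()):
    print(f"{n:>2} members: {e:.1%} of linear")
# prints 99.0% at 2 members, 97.5% at 4, 95.0% at 8, 86.7% at 12
```

This is what "near linear" means concretely here: efficiency stays above 95% through 8 members and is still roughly 87% at 12.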
Top SAP Certified Transaction Banking (TRBK) Benchmark Result with DB2 pureScale

Benchmark reflects the typical day-to-day operations of a retail bank

Day processing:
– 90 million accounts and 1.8 billion postings
– Over 56 million postings per hour

Night processing:
– Over 22 million accounts balanced per hour

Benchmark configuration
– Five x3690 X5 servers
– Total database size: 9 TB uncompressed, 3.5 TB compressed

http://www.sap.com/solutions/benchmark/trbk3_results.htm
DB2 pureScale and SAP TRBK: Near Linear Scalability From One To Four Members

[Chart: postings per hour (millions) as a function of the number of members (1 to 4), scaling at 2x, 3x, and 3.9x relative to one member]

Note: This data was run separately and is not officially certified as part of the benchmark result
Member Hardware Failure: Member Restart on Guest Host

Power cord tripped over accidentally

DB2 Cluster Services loses its heartbeat and declares the member down
– Informs other members and CF servers
– Fences the member from logs and data
– Initiates automated member restart on another ("guest") host
  • Using a reduced, pre-allocated memory model
– Member restart is like database crash recovery in a single-system database, but is much faster
  • Redo limited to in-flight transactions (due to the force-at-commit, FAC, protocol)
  • Benefits from the page cache in the CF

In the meantime, client connections are automatically re-routed to healthy members
– Based on least load (by default), or
– A pre-designated failover member

Other members remain fully available throughout – "online failover"
– The primary CF retains update locks held by the member at the time of failure
– Other members can continue to read and update data not locked for write access by the failed member

Member restart completes
– Retained locks are released and all data is fully available

Automatic. Ultra fast. Online.
Almost all data remains available. Affected connections are transparently re-routed to other members.

[Diagram: clients with a single database view; the failed member is fenced from the shared data and restarted on a guest host, replaying pages and log records; the primary CF (duplexed to the secondary) holds updated pages and global locks]
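The automatic client reroute behavior described above can be sketched as a client-side retry loop. This is an illustrative model only: the connection objects, member names, and failure exception are hypothetical, not the actual CLI/JDBC driver logic.

```python
# Illustrative sketch of automatic client reroute (ACR): on a member
# failure, reconnect to a healthy member and re-execute the transaction.

class MemberDown(Exception):
    pass

def run_transaction(members, txn, max_attempts=3):
    """Try the transaction until a healthy member commits it."""
    last_error = None
    for attempt in range(max_attempts):
        member = members[attempt % len(members)]
        if not member["healthy"]:
            continue
        try:
            return txn(member)            # may fail mid-flight
        except MemberDown as e:
            member["healthy"] = False     # reroute away from failed member
            last_error = e                # application re-executes the txn
    raise last_error or MemberDown("no healthy members")

members = [{"name": "member0", "healthy": True},
           {"name": "member1", "healthy": True}]

def txn(member):
    if member["name"] == "member0":
        raise MemberDown("member0 lost power")  # simulated failure
    return f"committed on {member['name']}"

result = run_transaction(members, txn)
assert result == "committed on member1"
assert members[0]["healthy"] is False
```

When the reroute is seamless, the driver replays the work itself and the application never sees the error; otherwise, as the slide notes, the application re-executes the transaction.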
Member Failback

Power restored and system rebooted

DB2 Cluster Services automatically detects system availability
– Informs other members and CFs
– Removes the fence
– Brings up the member on its home host

Client connections automatically re-routed back to the member

[Diagram: clients with a single database view; the recovered member rejoins the cluster alongside the other members, the primary and secondary CFs (holding updated pages and global locks), and the shared data with per-member logs]
Primary CF Hardware Failure

Power cord tripped over accidentally

DB2 Cluster Services loses its heartbeat and declares the primary CF down
– Informs members and the secondary CF
– CF service momentarily blocked
– All other database activity proceeds normally
  • E.g., accessing pages in the buffer pool, existing locks, sorting, aggregation, etc.

Members send missing data to the secondary CF
– E.g., read locks, page registrations

Secondary becomes primary
– CF service continues where it left off
– No errors are returned to DB2 members

Automatic. Ultra fast. Online.
All data remains available. Completely transparent to members and transactions.

[Diagram: clients with a single database view; on primary CF failure, the secondary CF (holding duplexed updated pages and global locks) takes over as primary while members continue against the shared data]
CF Failback

Power restored and system rebooted

DB2 Cluster Services automatically detects system availability
– Informs members and the primary CF

New system assumes the secondary role in 'catchup' state
– Members resume duplexing
– Members asynchronously send lock and other state information to the secondary
– Members asynchronously cast out pages from the primary to disk

[Diagram: clients with a single database view; the restored CF rejoins as secondary in catchup state while the primary (in peer state) continues serving updated pages and global locks]
Single Failure Scenarios

[Diagrams of a four-member, two-CF cluster under each single failure mode:]
– Member failure: other members remain online; recovery is automatic and transparent – connections to the failed member transparently move to another member
– Primary CF failure: other members remain online; recovery is automatic and transparent
– Secondary CF failure: other members remain online; recovery is automatic and transparent
Examples of Simultaneous Failures

[Diagrams of a four-member, two-CF cluster under simultaneous failures (e.g., a member plus a CF failing together): other members remain online, and recovery is automatic and transparent – connections to a failed member transparently move to another member]
DB2 pureScale is Easy to Deploy

Single installation for all components
Monitoring integrated into Optim tools
Single installation for fix packs and updates
Simple command to add and remove members
pureScale is DB2

A pureScale environment looks and feels very much like a "regular" DB2 environment
– Same code base shared by DB2, DPF, and pureScale
– In DB2 10.1 and 10.5, pureScale is just an installable feature of DB2

Immediate productivity from DBAs and application developers
– Single system view for utilities
  • Act and behave exactly as they do in non-pureScale environments
  • Backup, restore, rollforward, reorg, load, ...
– Applications don't need to know or care that there are multiple members
  • In general, SQL statements or commands can run on any member
– SQL, data access methods, and isolation levels are the same
– Backup/recovery processes are the same
– Security is managed in the same way
– Environment (even the CFs) still managed by database manager and database configuration parameters
Installation and Adding Capacity

Initial installation
– Complete pre-requisite work: OS installed, hosts on the network, access to shared disks enabled
– Copies the DB2 pureScale image to the Install Initiating Host
– Installs the code on the specified hosts using a response file
– Creates the instance, members, and primary and secondary CFs as directed
– Adds members, primary and secondary CFs, hosts, HCA cards, etc. to the domain resources
– Creates the cluster file system and sets up each member's access to it

Add a member online
1. Complete pre-requisite work (OS installed, on the network, access to shared disks)
2. Add the member:
   db2iupdt -add -m <MemHostName> -mnet <MemInterconnectName> InstName
3. DB2 does all tasks to add the member to the cluster
   – Copies the image and response file to the new host
   – Runs the install
   – Adds the new host to the resources for the instance
   – Sets up access to the cluster file system for the host

You can also: drop a member, add/drop CFs

[Diagram: the Install Initiating Host copies the pureScale image and response file (via scp) to hosts 0-3 (Members 0-3), host4 (primary CF), host5 (secondary CF), and a new host6, where Member 4 is added online]
Instance and Host Status

db2nodes.cfg:
0 host0 0 - MEMBER
1 host1 0 - MEMBER
2 host2 0 - MEMBER
3 host3 0 - MEMBER
4 host4 0 - CF
5 host5 0 - CF

> db2start
08/24/2008 00:52:59  0 0 SQL1063N DB2START processing was successful.
08/24/2008 00:53:00  1 0 SQL1063N DB2START processing was successful.
08/24/2008 00:53:01  2 0 SQL1063N DB2START processing was successful.
08/24/2008 00:53:01  3 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.

Instance status:
> db2instance -list
ID  TYPE    STATE    HOME_HOST  CURRENT_HOST  ALERT
0   MEMBER  STARTED  host0      host0         NO
1   MEMBER  STARTED  host1      host1         NO
2   MEMBER  STARTED  host2      host2         NO
3   MEMBER  STARTED  host3      host3         NO
4   CF      PRIMARY  host4      host4         NO
5   CF      PEER     host5      host5         NO

Host status:
HOST_NAME  STATE   INSTANCE_STOPPED  ALERT
host0      ACTIVE  NO                NO
host1      ACTIVE  NO                NO
host2      ACTIVE  NO                NO
host3      ACTIVE  NO                NO
host4      ACTIVE  NO                NO
host5      ACTIVE  NO                NO

[Diagram: clients with a single database view over members on host0-host3 and CFs on host4-host5, all sharing data]
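Output like the above lends itself to scripted health checks. The sketch below parses the sample db2instance -list output shown on this slide; it is an illustration against that sample text, not a supported API.

```python
# Parse sample `db2instance -list` output and check that every member is
# STARTED, exactly one CF is PRIMARY, and no alerts are raised.
sample = """\
ID  TYPE    STATE    HOME_HOST  CURRENT_HOST  ALERT
0   MEMBER  STARTED  host0      host0         NO
1   MEMBER  STARTED  host1      host1         NO
2   MEMBER  STARTED  host2      host2         NO
3   MEMBER  STARTED  host3      host3         NO
4   CF      PRIMARY  host4      host4         NO
5   CF      PEER     host5      host5         NO
"""

rows = [line.split() for line in sample.splitlines()[1:]]  # skip header
members = [r for r in rows if r[1] == "MEMBER"]
cfs = [r for r in rows if r[1] == "CF"]

assert all(r[2] == "STARTED" for r in members)   # all members up
assert sum(r[2] == "PRIMARY" for r in cfs) == 1  # exactly one primary CF
assert not any(r[5] == "YES" for r in rows)      # no alerts raised
print("cluster healthy:", len(members), "members,", len(cfs), "CFs")
```

A real monitoring script would capture the command's live output instead of a literal string, but the field positions are the same.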
Workload Balancing

Run-time load information used to automatically balance load across the cluster
– Shares its design with the System z Sysplex
– Load information of all members kept on each member
– Piggy-backed to clients regularly
– Used to route the next connection (or optionally the next transaction) to the least loaded member
– Routing occurs automatically (transparent to the application)

Failover
– Load of a failed member evenly distributed to surviving members automatically

Fallback
– Once the failed member is back online, fallback does the reverse
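The routing decision described above can be sketched as picking the least loaded member from the server-reported load list. This is an illustrative model: the load values and member names are hypothetical, not the driver's actual algorithm.

```python
# Illustrative sketch of workload balancing: the client routes each new
# connection (or transaction) to the member reporting the lowest load.

def route(member_loads):
    """Return the member with the lowest server-reported load."""
    return min(member_loads, key=member_loads.get)

# Load list piggy-backed to the client by the server (hypothetical values)
loads = {"member0": 0.72, "member1": 0.35, "member2": 0.58}
assert route(loads) == "member1"

# On failover, a failed member drops out of the list and its work
# spreads across the survivors; fallback later restores it.
del loads["member0"]
assert route(loads) == "member1"
```

Because the server refreshes the load list continuously, routing adapts as load shifts, with no application involvement.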
Workload Balancing Across Member Subsets

Workload balancing can be configured to take place across a subset of members, which enables
– Isolation of batch from transactional workloads within a single database
– Workloads for multiple databases in a single instance to be isolated from each other

[Diagrams: a five-member cluster running a mix of OLTP and batch on all members, versus the same cluster with members 0-3 dedicated to OLTP and member 4 to batch – an example of isolating a batch workload from a transactional workload]
Optional Affinity-Based Routing

Allows you to target different groups of clients or workloads to different members in the cluster
– Maintained after failover...
– ...and fallback

Example use cases
– Consolidate separate workloads/applications on the same database infrastructure
– Minimize total resource requirements for disjoint workloads

Easily configured through client configuration
– db2dsdriver.cfg file

[Diagram: app server groups A, B, C, and D each affinitized to a different member]
Managing and Monitoring pureScale Using DB2 Tooling

Task | pureScale Support | Product
Database Administration | Ability to perform common administration tasks across members and CFs; integrated navigation through shared data instances | Data Studio
Database Administration | Recover database objects safely, precisely, and quickly | DB2 Recovery Expert
Database Administration | Support for high speed unload utility | DB2 High Performance Unload
Database Administration | Merge incremental backups into a full backup | DB2 Merge Backup
System Monitoring | Integrated alerting and notification; seamless view of status and statistics across all members and CFs | Data Studio Web Console; Optim Performance Manager
Configuration Tracking and Client Management | Full support for tracking and reporting of configuration changes across clients and servers | Optim Configuration Manager
Application Development | Full support for developing Java, C, and .NET applications against a DB2 pureScale environment | Data Studio
Query Tuning | Full support for query, statistics, and tuning advice for applications on pureScale systems | Optim Query Workload Tuner
Optim Performance Manager

Provides monitoring metrics about the cluster caching facility (CF) on the Overview dashboard
– CF CPU and memory utilization
– Group Buffer Pool hit ratio
– CF lock timeouts, lock escalations, and transaction lock wait time

Shows enhanced system information on the System dashboard
– Host status, instance status, CF requests and time, more CPU values, ...

Member information for locking problems on the Locking dashboard

Provides DB2 pureScale system templates
– DB2 pureScale production with all details
– DB2 pureScale production with low overhead
Optim Performance Manager (cont.)

Cluster caching facility (CF) monitoring metrics include
– Group Buffer Pool hit ratio per connection, statement, buffer pool, or table space
– CF locking information, CF requests/time at the connection or statement level
– Page reclaim information
– CF configuration parameters in database and database manager reports

Health alerts can notify DBAs or others of CF or member failures
Overview Dashboard for DB2 pureScale System
[Screenshot]

System Dashboard for DB2 pureScale System
[Screenshot: CF details]

System Dashboard for DB2 pureScale System (cont.)
[Screenshot: member details]
Optim Data Administrator pureScale Support

[Screenshot walkthrough:]
– Launch the desired administration task assistant
– Select which member to quiesce before taking it offline
– Select quiesce options that define how and when the action should occur
– View, modify, or execute the commands to complete a task
Isolate Applications using OCM and Member Subsets

Available for pureScale with Optim Configuration Manager for DB2 for LUW V3.1

Penalty boxing
– Protect mission critical applications from the cascading effects of misbehaving applications

Proving grounds
– Test new applications with production data in a limited capacity environment

Define and activate rules that dictate which member subset to use
– Fine-grained rules based on user, client workstation IP address, data source name, and other properties
– Newly activated rules applied at transaction boundaries

Available for managed clients
– Requires the Optim Data Tools Runtime Client to be installed on clients
– JDBC, ODBC/CLI, .NET
OPM and OCM Penalty Box Example

[Diagrams: a three-member cluster with members 0-1 in regular operation and member 2 designated as a penalty box]

• OPM alerts the DBA that App C is using excessive CPU
• OPM also shows that App A and App B are affected
• The user defines and activates a rule in OCM to isolate App C to a restricted environment, without any outages
• Performance of App A and App B goes back to normal
Disaster Recovery Options for pureScale

Variety of disaster recovery options to meet your needs
– HADR
– Storage replication
– Q Replication
– InfoSphere Change Data Capture (CDC)
– Geographically Dispersed pureScale Cluster (GDPC)
– Manual log shipping
HADR in DB2 pureScale

Integrated disaster recovery solution
– Very simple to set up, configure, and manage

Support includes
– Asynchronous and super-asynchronous modes
– Time delayed apply
– Log spooling
– Both non-forced (role switch) and forced (failover) takeovers

Member topology must match between primary and standby clusters
– Different physical configuration allowed (fewer resources, sharing of LPARs, etc.)

[Diagram: a primary pureScale cluster replicating via HADR to a standby DR pureScale cluster]
HADR in DB2 pureScale: Highly Available By Design

If a member in the primary cluster fails or cannot connect to the standby, logs for that member are shipped by another member to the standby (referred to as assisted remote catchup)

If the replay member fails, another member automatically takes over and becomes the replay member

[Diagram: a three-member primary site sends each member's logs (Logs 1-3) over TCP/IP to the preferred replay member at a three-member standby site; in assisted remote catchup, a failed member's logs are sent by a healthy member, and another standby member becomes the replay member if the preferred member fails]
© 2015 IBM Corporation60
Storage Replication
Uses remote disk mirroring technology
– Maximum distance between sites is typically 100s of km for synchronous replication, 1000s of km for asynchronous
– For example: IBM Metro Mirror, EMC SRDF
Transactions run against primary site only; DR site is passive
– If primary site fails, database at DR site can be brought online
– DR site must be an identical pureScale cluster with matching topology
All data and logs must be mirrored to the DR site
– Synchronous replication guarantees no data loss
– Writes are synchronous and therefore ordered, but "consistency groups" are still needed
• If an update to one volume fails, the other volumes must not be updated either, which would leave data and logs out of sync
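A hedged sketch of the DR-site bring-up after the replicated volumes are made writable (the storage failover steps themselves are vendor-specific; MYDB is a placeholder):

```shell
# After the storage subsystem promotes the DR copies of data and logs:
db2start
# Crash recovery replays the mirrored logs and brings the database online:
db2 restart db MYDB
```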
© 2015 IBM Corporation61
Q Replication
High-throughput, low-latency logical data replication
– Distance between sites can be up to thousands of km
Asynchronous replication
Includes support for:
– Delayed apply
– Multiple targets
– Replicating a subset of data
– Data transformation
DB2 pureScale can be a source and/or target of replication
– If using pureScale as a source, the target does not have to be pureScale
– Member topology does not have to match if pureScale is both source and target
DR site can be active
– Bi-directional replication is supported for updates on both primary and DR sites
© 2015 IBM Corporation62
Geographically Dispersed pureScale Clusters (GDPC)
A “stretch” or geographically dispersed pureScale cluster (GDPC) spans two sites at distances of up to tens of kilometers
– Provides active/active DR for one or more shared databases across the cluster
– Enables a level of DR support suitable for many types of disasters (e.g. fire, data center power outage)
– Supported on AIX (InfiniBand, 10 GE RoCE, TCP/IP) and RHEL/SUSE Linux (10 GE RoCE, TCP/IP)
Both sites active and available for transactions during normal operation
On failures, client connections are automatically redirected to surviving members
– Applies to both individual member failures within a site and total site failure
[Diagram: three views of a GDPC with members M1/M3 at Site A, M2/M4 at Site B, and the primary and secondary CFs split across the sites, tens of km apart:
– Normal operation: workload fully balanced across both sites
– Hardware failure: workload rebalanced across surviving members
– Site failure: workload rebalanced onto the surviving site]
© 2015 IBM Corporation63
GDPC Suitability
[Chart: percentage of UPDATE, INSERT, or DELETE operations in the workload (y-axis, 0% to 40%) versus site-to-site distance (x-axis, 10 km to 70 km). Workloads with lower write percentages at shorter distances are good candidates for GDPC; heavier write activity or longer distances favor other replication technologies.]
© 2015 IBM Corporation64
Comparison of pureScale Disaster Recovery Options

                                   HADR    Storage Replication   GDPC     Q Repl.    Manual Log
                                           Sync       Async               / CDC      Shipping
Active/active DR                   No      No         No         Yes      Yes        No
Synchronous                        No      Yes        No         Yes      No         No
Requires matching pureScale
  topology at DR site              Yes     Yes        Yes        n/a      No         Yes
Delayed apply                      Yes     No         No         No       Yes        Yes
Multiple DR target sites           No      No         No         No       Yes        Yes
Maximum distance between sites     1000s   100s km    1000s km   10s km   1000s km   1000s km
                                   km
DB2 Business Application Continuity (BAC) Offering
Two-member, active/active DB2 pureScale configuration
– All application workloads are directed to one primary active member
– Utilities and admin tasks allowed on the secondary admin member
– Application workloads quickly fail over to the secondary member during planned or unplanned outages
Low-cost active/passive licensing model
– Primary member fully licensed
– Secondary admin member licensed as idle/warm standby
– Available for DB2 Workgroup Server Edition (WSE) and DB2 Enterprise Server Edition (ESE)
Administrative activities allowed on the secondary member include
– Backup and restore
– Reorg and runstats
– Monitoring
– Usage of Data Definition Language (DDL)
– Database Manager configuration
– Database configuration
– Log-based capture utilities for the purpose of data capture
– Security administration and setup
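As an illustrative sketch (database, path, and table names are placeholders), the kinds of utilities listed above would be run on the secondary admin member the same way as on any other member:

```shell
# Online backup taken from the admin member:
db2 backup db MYDB online to /db2/backups

# Statistics collection and housekeeping:
db2 runstats on table MYSCHEMA.MYTABLE with distribution and detailed indexes all
db2 reorg table MYSCHEMA.MYTABLE

# Configuration inspection:
db2 get db cfg for MYDB
```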
[Diagram: two-member pureScale cluster implemented using the BAC offering, with duplexed CFs; one member is up for application workloads, the other is up for administration]
© 2015 IBM Corporation66
DB2 10.5 Trial Software Available
90-day trial
Includes the pureScale feature
With TCP/IP sockets support, no new hardware investment required to try it out
VMware installs also possible

http://www-01.ibm.com/software/data/db2/linux-unix-windows/downloads.html
(Login with IBM ID)
© 2015 IBM Corporation67
DB2 pureScale Reference Material
DB2 pureScale Redbook
http://www.redbooks.ibm.com/abstracts/sg248018.html
DB2 pureScale Book
http://public.dhe.ibm.com/common/ssi/ecm/en/imm14079usen/IMM14079USEN.PDF
IBM DB2 pureScale Product Information
http://www-01.ibm.com/software/data/db2/linux-unix-windows/purescale/
DB2 Information Center
http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.licensing.doc/doc/c0057442.html
developerWorks Articles
http://www.ibm.com/search/csass/search/?q=purescale&sn=dw&dws=dw
IDUG Presentations
http://www.idug.org/p/se/in/q=purescale&sb=1
© 2015 IBM Corporation68
What Can DB2 pureScale Do For You?
Deliver higher levels of scalability and superior availability
Better concurrency during regular operations
Better concurrency during member failure
Less application design and rework due to transparent scalability
Improved SLA attainment
Lower overall costs for applications that require high transactional performance and ultra high availability
“In all other respects, from scalability to flexibility, through ease of use and high availability, to cost (at least at list prices), IBM appears to offer significant advantages.”
– Phillip Howard, Research Director
Backup Slides
© 2015 IBM Corporation
pureScale Roadmap Highlights (DB2 9.8 and DB2 10.1)

DB2 9.8
– 9.8 GA on Power (12/09)
– 9.8 FP1 (03/10): Power 7
– 9.8 FP2 (08/10): SLES Linux on System x; DR with QRep; PIT recovery
– 9.8 FP3 (12/10): RHEL Linux; multiple databases; 10 GE with RDMA (RoCE) for SLES; XML
– GDPC (04/11): stretch cluster for AIX
– 9.8 FP4 (07/11): multiple CF HCAs; multiple switches; set write suspend for snapshots; improved serviceability; monitoring enhancements; IBM BladeCenter
– 9.8 FP5 (06/12): code fixes

DB2 10.1
– 10.1 GA (06/12): range partitioning; workload management; table space recovery; RoCE for AIX/RHEL; enhanced CF monitoring; performance optimizations; pureScale installable feature of DB2
– 10.1 FP1 (09/12): KVM support
– 10.1 FP2 (12/12): GDPC for RHEL; multiple interconnect adapters for members; support for generic Intel x86 rack-mounted servers
– 10.1 FP3 (09/13): snapshot backup scripts
– 10.1 FP4 (05/14): code fixes
pureScale Roadmap Highlights (DB2 10.5)

– 10.5 GA (06/13): HADR for pureScale DR; online add member; performance optimizations; topology-changing backup/restore; pureScale/non-pureScale backup/restore; snapshot backup scripts; random key indexes; multi-tenancy enhancements with member subsets and per-member STMM
– 10.5 FP1 (08/13): online fix pack updates; Explicit Hierarchical Locking; support for generic Intel x86 rack-mounted servers
– 10.5 FP2 (10/13): code fixes
– 10.5 FP3 (02/14): code fixes
– DB2 Cancun Release (10.5 FP4) (08/14): commodity Ethernet interconnect (sockets); VMware/KVM with sockets; incremental backup/restore; online table reorg; more GDPC configurations; Power8 support; integrated flash copy backups; DB2 Spatial Extender; improved install and upgrade experience; …
– 10.5 FP5 (12/14): automatic CF memory; GDPC with TCP/IP; Business Application Continuity offering; code fixes