Breaking the DB2 Platform Barrier

Written by Jim Wankowski

Quest Software, Inc.

Breaking the DB2 Platform Barrier: An Examination of the Architectural Differences of DB2 UDB for z/OS vs. DB2 UDB for Linux/Unix/Windows

White Paper

WPD_BreakingDB2PlatfBarrier_101606_AG

© Copyright Quest® Software, Inc. 2006. All rights reserved.

This guide contains proprietary information, which is protected by copyright. The software described in this guide is furnished under a software license or nondisclosure agreement. This software may be used or copied only in accordance with the terms of the applicable agreement. No part of this guide may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording for any purpose other than the purchaser's personal use without the written permission of Quest Software, Inc.

WARRANTY

The information contained in this document is subject to change without notice. Quest Software makes no warranty of any kind with respect to this information. QUEST SOFTWARE SPECIFICALLY DISCLAIMS THE IMPLIED WARRANTY OF THE MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Quest Software shall not be liable for any direct, indirect, incidental, consequential, or other damage alleged in connection with the furnishing or use of this information.

TRADEMARKS

All trademarks and registered trademarks used in this guide are property of their respective owners.

World Headquarters 5 Polaris Way Aliso Viejo, CA 92656 www.quest.com e-mail: [email protected] U.S. and Canada: 949.754.8000

Please refer to our Web site for regional and international office information.

Updated—October 10, 2006


CONTENTS

INTRODUCTION
BASIC COMPONENTS
    INSTALLATION
    SYSTEM CATALOG
    ACCESSING DB2
TERMINOLOGY DIFFERENCES
    DEFINING THE SYSTEM CATALOG
STORAGE MANAGEMENT
    Z/OS
        Volume
    LUW
        Container
OBJECT COMPARISONS
    BUFFERPOOLS
        z/OS
        LUW
        Self-Tuning Memory
    DATABASES
    TABLESPACES
        z/OS
        Types of z/OS Tablespaces
    PARTITIONED TABLESPACE
    LUW
    TABLES
    INDEXES
    INDEX STRUCTURES
    Z/OS
    LUW
PARTITIONING
    DATABASE PARTITION GROUP
ADMINISTRATION
    OPTIMIZER
    DB2 HINTS
    OPTIMIZATION CLASS – GUIDELINES
    EXPLAIN PROCESSING
        z/OS
        LUW
    PARALLELISM
    TYPES OF PARALLELISM
        Performance Monitoring
    UTILITIES
    BACKUP AND RECOVERY
        Backups
        Recovery Info
    LOGGING
    CIRCULAR LOGGING IN LUW
    ARCHIVAL LOGGING
    RECOVERY
    RUNSTATS
    REORGANIZING DATA
    UNLOADING DATA
    LOADING DATA
CONCLUSION
ABOUT THE AUTHOR
ABOUT QUEST SOFTWARE, INC.
    CONTACTING QUEST SOFTWARE
    CONTACTING QUEST SUPPORT


INTRODUCTION

The popularity of DB2 Universal Database (UDB) on distributed platforms continues to grow, and that growth has resulted in a shortage of experienced non-mainframe DB2 database administrators (DBAs). Information technology (IT) departments today have to deal with tightening budgets and shrinking staffs. The luxury of being a single-platform DBA is becoming a thing of the past. Many DB2 mainframe DBAs find themselves supporting DB2 on these distributed platforms, which brings a steep learning curve. It is essential for the DB2 DBA of the new millennium to be well versed in running DB2 on multiple platforms.

This paper is geared toward any DB2 DBA responsible for having to support DB2 on multiple platforms, whether you’re a Z Series Operating System (z/OS) DBA with little or no knowledge of distributed platforms or a distributed DBA with little or no knowledge of z/OS. It will cover some of the basic terminology for the different platforms and how they differ, as well as the key architectural differences and administrative issues. These topics are based on DB2 UDB for z/OS V8 and DB2 UDB for Linux/Unix/Windows (LUW) V9.1.


BASIC COMPONENTS

Most of the object types and functionalities are very similar across the platforms. The object types are essentially identical from the database (DB) down, but differ significantly with regard to storage management, which is covered in more detail in the storage management section.

z/OS            LUW
-----------     -----------
Subsystem       Instance
VCAT/Volume     Container
Stogroup        N/A
Database        Database
Tablespace      Tablespace
Creator         Schema
Table           Table
Index           Index
View            View
Packages        Packages
Plans           N/A

Figure 1: Basic components of databases

Installation

When selecting the actual software to install for DB2, there is basically one choice for z/OS. For the LUW environments, there are several editions to choose from. The Enterprise Server Edition (ESE) is the most common. When installing DB2 in either environment you can choose between a standard environment or a partitioned environment. The concept of partitioning on the different platforms is essentially the same in that you are harnessing the processing power of multiple processors; however, the actual architecture is quite different. This will be discussed in more detail later in this paper.

z/OS:

• DB2 UDB for z/OS V8

LUW (Linux, UNIX, Windows):

• DB2 UDB for LUW V9.1
• Editions:
  o Data Warehouse Edition
  o Enterprise Server Edition
  o Workgroup Edition
  o Express Edition
  o Personal Edition
  o Universal Developers Edition
  o Personal Developers Edition

Figure 2: Editions of DB2


System Catalog

The system catalogs are very similar between the platforms; however, where they reside and how they are accessed are quite different. To extract information from the z/OS catalog, you typically run queries directly against the appropriate tables. In distributed environments, there is a series of views defined with the schema SYSCAT for retrieving catalog information and a series of updateable views with the schema SYSSTAT for updating optimizer-related statistics in the catalog. How the catalogs are defined and where they reside will be discussed in the upcoming architectural overview section.

z/OS:

• SYSIBM.xxxx
• Most optimizer related fields are updateable

LUW:

• SYSIBM.xxxx
• SYSCAT
  o Read-only views defined for catalog base tables
• SYSSTAT
  o Updateable set of views
  o Primarily used for access path manipulation

Figure 3: System Catalogs

Accessing DB2

IBM provides a core set of tools with the database. Accessing DB2 z/OS is done through a supplied application called DB2 Interactive (DB2I). This facility provides basic functionality for running queries, issuing commands, generating utilities and preparing programs. DB2 on distributed platforms comes with a graphical user interface (GUI) toolset called Control Center and Health Center. Control Center provides basic functionality for doing rudimentary tasks within DB2. Health Center allows you to set up monitoring parameters for autonomic computing. Both Control and Health Centers have the ability to connect to z/OS as a separate add-on.

z/OS:

• DB2I
  o DB2 tool set (3270 based)
  o SPUFI
  o DCLGEN
  o Bind/Rebind
  o Command Processor
  o Utilities
  o Defaults
• Control Center
• Health Monitor

LUW:

• Control Center
  o GUI tool set for administration
• Command center
• Command line processor
• Command window
• Script center
• Visual Explain
• Health Center
  o Facilitates autonomic computing

Figure 4: Accessing DB2



TERMINOLOGY DIFFERENCES

When dealing with a multi-platform DB2 environment, you will encounter common terminology with completely different meanings. It is important to note the platform when referring to these terms.

SMS

• z/OS: System Managed Storage (SMS) – Software for managing all the Direct Access Storage Devices (DASDs) in an S/390 environment. A storage group definition in an SMS environment is typically defined with a volume list containing an asterisk (*), which allows SMS to decide which volume to put the data set on.

• LUW: System Managed Space (SMS) – A tablespace (TS) allocation parameter. See the tablespace section for more details.

Extent

• z/OS: Physical extension of a data set based on a secondary allocation. This is not limited to DB2 data sets. When defining any type of data set in z/OS, a secondary value is specified. If the data set's primary allocation becomes full, an extension is taken with the amount of space specified on the secondary allocation.

• LUW: A block of data pages that gets allocated based on the EXTENTSIZE parameter of the tablespace definition. See the tablespace section for more details.

Figure 5: Common terms, different meanings

The installation of DB2 in the z/OS environment is known as a subsystem. When a subsystem is created, four system databases are created, bufferpools (BPs) are defined, and there is one configuration file called DSNZPARM. There are typically many databases defined within a subsystem. All applications running in this subsystem share the system resources such as catalog, databases, BPs etc. The installation of DB2 on the distributed platforms is called an instance. Notice that there aren’t any system type objects defined at this time. This is where the architecture really becomes different.

z/OS:

• Subsystem – Logical database environment
• Four databases created
  o DSNDB06
  o DSNDB01
  o DSNDB04
  o DSNDB07
• Memory Structures
• Database Configuration
  o DSNZPARM
• Many databases

LUW:

• Instance – Logical database server environment
• Also referred to as a NODE
• One to many databases
• Database Manager Configuration File

Figure 6: Different Terms, Similar Meaning


Defining the System Catalog

z/OS

As mentioned above for z/OS, the system catalog for a subsystem gets defined within the system database DSNDB06 when DB2 is installed. This catalog contains the metadata for all objects defined within the subsystem.

LUW

For distributed environments, a system catalog gets defined for every database created.

z/OS:

One common system catalog for all databases defined within a subsystem

• DSNDB06 – Catalog database

LUW:

Three system tablespaces are created by default for every database

1. SYSCATSPACE – Contains system catalog tables
2. TEMPSPACE1 – Holds temp tables used by UDB
3. USERSPACE1 – Contains user tables unless tablespace specified (like DSNDB04)

Figure 7: Overview of System Catalogs

This pictorial view of the architectures clearly depicts the significant differences between the different platforms. DB2 for z/OS is a “share everything” type of architecture vs. a “share nothing” architecture on the distributed platforms. Each database defined within an instance has its own system catalog, bufferpools, log files and configuration file. An instance is very similar to a subsystem on z/OS.

[Figure 8 diagram. z/OS: subsystems DB2PROD (databases PRODDB1 and PRODDB2) and DB2TEST (databases TESTDB1 and TESTDB2); each subsystem has a single shared log, catalog, set of BPs, and DSNZPARM. LUW: Instance_1 (databases PRODDB1 and PRODDB2) and Instance_2 (databases TESTDB1 and TESTDB2); each instance has a DBMCONFIG, and each database has its own log, catalog, BPs, and DBCONFIG.]

Figure 8: Pictorial overview of architectures



STORAGE MANAGEMENT

One of the biggest differences between the platforms is storage management. On z/OS, storage is defined in terms of physical devices known as volumes. On distributed platforms, storage is defined in terms of containers. A container is not itself a physical device but a representation of how the space is defined. DB2 on distributed platforms allows for three different ways of defining containers, depending on the type of tablespace defined.

z/OS

Volume

Physical storage device for DB2 z/OS. A volume can contain one or many tablespaces or indexspaces (ISs) as well as non-DB2 files.

• Terminology
  • DASD – Direct Access Storage Device
    • Logical disk drives
  • VolSer – Volume serial
    • This is a name identifying the disk pack, e.g. DB2001
  • Storage Group
    • DB2 object – A logical grouping of volumes – Can be used by more than one TS or IS – N/A on LUW

LUW

Container

A container is only applied to a single tablespace. A tablespace can have multiple containers if it is defined as Database Managed Space (DMS). The way a container can be defined depends on how the tablespace is defined:

• SMS managed
  • Directory name
    • D:\MYTS

• DMS managed
  • Raw device
    • E:
  • File name
    • D:\SODADB\SODA.DATA.DMS

Determining when to choose SMS or DMS will be discussed in further detail in the tablespace section.


OBJECT COMPARISONS

This section will discuss the differences in object architecture across the platforms.

Bufferpools

Bufferpools are areas of virtual storage where DB2 maintains data pages to satisfy queries without having to do physical I/O to the DB2 tables. The goal is to keep as many frequently accessed data pages in memory as possible. Bufferpools are by far the most critical memory area when it comes to performance for all platforms of DB2. A tremendous boost in performance can be obtained with properly tuned bufferpools.
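The effect of keeping frequently accessed pages in memory can be illustrated with a minimal Python sketch of a fixed-size page cache. The class name BufferPool, the least-recently-used eviction policy, and the page identifiers are illustrative assumptions; this is not a description of DB2's internal bufferpool algorithms on either platform.

```python
from collections import OrderedDict


class BufferPool:
    """Toy page cache: a hit avoids physical I/O, a miss costs one read."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = OrderedDict()  # page id -> page contents, oldest first
        self.hits = 0
        self.physical_reads = 0

    def get_page(self, page_id):
        if page_id in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_id)  # mark as most recently used
        else:
            self.physical_reads += 1  # simulated read from disk
            if len(self.pages) >= self.capacity:
                self.pages.popitem(last=False)  # evict least recently used
            self.pages[page_id] = f"data for page {page_id}"
        return self.pages[page_id]

    def hit_ratio(self):
        total = self.hits + self.physical_reads
        return self.hits / total if total else 0.0


# Hypothetical access pattern against a three-page pool.
pool = BufferPool(capacity_pages=3)
for page_id in [1, 2, 3, 1, 2, 4, 1]:
    pool.get_page(page_id)
print(pool.physical_reads, pool.hits)  # prints "4 3"
```

In this toy run, three of the seven page requests are satisfied from memory. A larger pool, or a workload that revisits the same pages more often, raises the hit ratio, which is the effect bufferpool tuning aims for.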

z/OS

In z/OS V8, you can allocate up to 50 4K bufferpools and up to 10 each of 8K, 16K, and 32K bufferpools. At a minimum, you must allocate one 4K bufferpool (BP0), one 8K (BP8K0), one 16K (BP16K0), and one 32K (BP32K). There are quite a few customizable thresholds within z/OS bufferpools that allow you to tailor a bufferpool to a specific application, such as random access vs. sequential access, frequent updates, etc. Some companies actually change their bufferpool settings throughout the day to adjust for process changes. Refer to the DB2 Administration Guide for details on how to tune these parameters.

LUW

An LUW database must have at least one bufferpool. A default bufferpool (IBMDEFAULTBP) is created automatically when a new database is created. A series of hidden bufferpools are also created in 4K, 8K, 16K, and 32K page sizes. These are there in case a normal bufferpool of the required page size is unavailable due to insufficient memory, or the normal bufferpool is not active for some reason. These hidden bufferpools do not appear in the system catalog or bufferpool system files. A new auto-tune feature in V9.1 will allow DB2 to automatically size your bufferpools based on transaction load. This takes a lot of the guesswork out of tuning your bufferpools.

Self-Tuning Memory

DB2 UDB for LUW V9.1 introduced the concept of self-tuning memory. The feature simplifies the task of managing package cache, sort heap, bufferpools, and locklist. By specifying the database_memory parameter as AUTOMATIC, DB2 will dynamically adjust these memory parameters based on database workload. This parameter can be left on or, once a typical workload has been set, the values can be frozen by turning off the automatic parameter.



z/OS:

• 80 bufferpools available
  o 50 4K
  o 10 each of 8K, 16K, 32K
  o Become active when space is assigned
• Defined at install time in DSNZPARMS
• Highly configurable
• Shared by all objects in subsystem

LUW:

• Defined within a database
• Can only be used by objects within database
• Defined via Data Definition Language (DDL)
• Memory size is only configuration
• Number limited by amount of memory available

Figure 9: BP Overview

Databases

The database is the highest level in the relational hierarchy and can be described as the “wrapper” which identifies a grouping of tablespaces, tables, indexes, views, etc. One of the biggest differences here, architecturally, is the fact that in z/OS, we have a single catalog containing all information regarding all databases in the subsystem, whereas in LUW databases, a new catalog is created for each database and contains only information about that specific database.

z/OS:

• Logical grouping of DB2 objects
  o Does not consume resources
• Many DBs in subsystem
• Metadata for all databases stored in one system catalog

LUW:

• Logical grouping of DB2 objects
• Typically one database/instance
• More like a z/OS subsystem
• Catalog for each database defined within database
  o SYSCATSPACE
  o TEMPSPACE1
  o USERSPACE1
• Bufferpools defined in database
• Database configuration file

Figure 10: Database Overview


Tablespaces

z/OS

Tablespaces in z/OS are generally segmented or partitioned. Four types of tablespaces can be defined:

• Simple

• Segmented

• Partitioned

• Large (DSSIZE)

Two types of allocation methods exist for z/OS:

• Virtual Catalog (VCAT)

• Stogroup

The VCAT method is very seldom used other than for defining system catalogs. It requires that the Virtual Storage Access Method (VSAM) data set be pre-defined via an IDCAMS job prior to the tablespace being created.

Stogroup-defined tablespaces allow DB2 to do all the VSAM allocation work for you. DB2 actually runs the IDCAMS job under the covers when the create statement is executed.

When a tablespace is created, a VSAM file is defined with the following format:

• VCAT.DSNDBC.DBNAME.TSNAME.I0001.A001

• VCAT.DSNDBD.DBNAME.TSNAME.I0001.A001

where:

• VCAT – Typically the subsystem name

• DBNAME – Database name

• TSNAME – Tablespace name

• A001 – Partition or dataset number (A001, A002, etc.)
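As a rough illustration of this naming format, the following Python sketch assembles the two data set names for a given tablespace. The function name and the sample values (DB2P, SALESDB, SALESTS) are hypothetical, and the fixed I0001 qualifier and the three-digit limit on the partition number are assumptions taken from the pattern shown above.

```python
def vsam_dataset_names(vcat, dbname, tsname, partition=1):
    """Return the DSNDBC and DSNDBD data set names for one partition."""
    # Assumption: the Annn suffix is a zero-padded three-digit number.
    if not 1 <= partition <= 999:
        raise ValueError("partition must fit the three-digit Annn suffix")
    suffix = f"I0001.A{partition:03d}"
    return [
        f"{vcat}.{component}.{dbname}.{tsname}.{suffix}"
        for component in ("DSNDBC", "DSNDBD")
    ]


# Hypothetical subsystem, database, and tablespace names.
for name in vsam_dataset_names("DB2P", "SALESDB", "SALESTS", partition=2):
    print(name)
```

A helper like this can be handy when matching the data sets on a volume back to the DB2 objects that own them.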

Types of z/OS Tablespaces

Simple Tablespaces

This was the only type of tablespace available in the early releases of DB2. Simple tablespaces are limited in their usefulness: if you have multiple tables in a tablespace, you have the potential for lock contention. Remember that a lock on pages of TableA also locks any rows from TableB and TableC stored on those pages. In the event of a TS scan, you must also scan the data pages for all three tables. Support of simple tablespaces will be dropped in V9, as they really aren't used much anymore. Segmented tablespaces offer superior performance.



Characteristics of a simple tablespace

• One to many tables

• Smallest unit of recovery is the tablespace

[Figure 11 diagram: simple tablespace SIMPTS1 holding TableA, TableB, and TableC; rows from all three tables (A, B, C) are intermixed on the same pages.]

Figure 11: Simple tablespace

Segmented Tablespace in z/OS

Segmented tablespaces allow for efficient access of data, particularly when multiple tables are defined in a tablespace. The data for each table is physically separated into segments. A segment is a block of 4 to 64 pages, similar to an extent in DB2 UDB for LUW. I/O performance in general is much better with a segmented tablespace than with a simple tablespace.

Characteristics of a segmented tablespace:

• Can contain multiple tables, but rows are not commingled.

• Space is divided into groups of pages called segments.

• Segsize = 4 to 64 pages each.

• Each segment contains rows for only one table, and each table can have different locking strategy.

• Segmented tablespaces read only the relevant pages during a TS scan.

• Automatically reclaim space after drop table.

• Much more efficient for mass deletes.

[Figure 12 diagram: a segmented tablespace laid out as a sequence of segments, each holding data for exactly one of TableA, TableB, or TableC.]

Figure 12: Segmented tablespace
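The difference in scan cost between simple and segmented tablespaces can be sketched in a few lines of Python. The page contents, the segment layout, and the function names below are made-up illustrations of the behavior described above, not actual DB2 structures.

```python
SEGSIZE = 4  # pages per segment; the text allows 4 to 64

# Simple tablespace: any page may hold rows from any table.
simple_pages = [
    ["A", "A", "B"], ["B", "B", "A"], ["A", "B", "C"],
    ["B", "C", "C"], ["C", "A", "B"], ["C", "C", "A"],
]

# Segmented tablespace: each segment is owned by exactly one table.
segments = [
    {"table": "A", "pages": SEGSIZE},
    {"table": "B", "pages": SEGSIZE},
    {"table": "A", "pages": SEGSIZE},
    {"table": "C", "pages": SEGSIZE},
]


def pages_read_simple(pages, table):
    # The table argument does not reduce the work: every page must be
    # read because any page may contain rows of the requested table.
    return len(pages)


def pages_read_segmented(segs, table):
    # Only the segments that belong to the requested table are read.
    return sum(s["pages"] for s in segs if s["table"] == table)


print(pages_read_simple(simple_pages, "C"), "of", len(simple_pages))
print(pages_read_segmented(segments, "C"), "of", SEGSIZE * len(segments))
```

In the simple layout a scan of TableC touches every page, while in the segmented layout it touches only the one segment that TableC owns.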


Partitioned Tablespace

Partitioned tablespaces are typically used for very large tables. Only one table can be defined into a partitioned tablespace. A partitioned tablespace can have up to 254 partitions. Each partition is a separate physical data set that can be placed on different volumes for optimal performance. Partitioning allows for easier maintenance such as loads, copies, or reorgs because each partition can be acted on independently.

Characteristics of a partitioned tablespace

• One table per tablespace.

• Each partition is a separate dataset.

• One to 254 partitions.

• Each partition can be on a separate volume.

• Data placement is controlled by partitioning index.

• Partition independence allows utilities to be run on individual partitions.

• Query Parallelism.

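[Figure 13 diagram: partitioned tablespace Partts1 with three partitions. Part1 holds sales data for Jan-Apr (storage group SG1, volume DB2VOL1), Part2 holds sales data for May-Aug (SG2, DB2VOL2), and Part3 holds sales data for Sep-Dec (SG3, DB2VOL3).]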

Figure 13: Partitioned tablespace



LUW

One type of Tablespace

Three Categories

• Regular

• Temporary

• Long

Two Allocation Methods

SMS

• Directory only

DMS

• File

• Device

SMS – System Managed Space allows for the operating system to allocate space for the table as needed. No space parameters are specified. This is the easiest method as far as storage management. It is good for small tables or tables which grow for short periods and shrink back down.

DMS – Database Managed Space requires the specification of space when the tablespace is created. This space is immediately taken and reserved for use by the tablespace.

The type of tablespace chosen depends on the characteristics of the data stored within the tablespace. While DMS tablespaces clearly provide more flexibility for storage capacity, SMS tablespaces are generally recommended for temporary tablespaces and catalog tablespaces.

In addition to understanding the types of tablespaces, it is important to understand how data is managed within the tablespace. Tablespaces can be allocated in 4k, 8k, 16k, and 32k pages. Row size, random vs. sequential access, and several other factors must be evaluated to determine the optimal page size for the tablespace.

Pages are grouped into allocation units called extents. Each time the tablespace needs additional storage, it is allocated in units of the extent size. During insert activity, DB2 writes to a container until the current extent is full; at that point, another extent is allocated and the write activity continues there.
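The relationship between pages, extents, and containers can be sketched as simple arithmetic. The Python function below is a hypothetical helper; the round-robin placement of successive extents across containers and the extent size of 32 pages are assumptions made for illustration.

```python
def locate_page(page_number, extent_size, num_containers):
    """Map a zero-based page number to (extent number, container index)."""
    extent = page_number // extent_size
    # Assumption: successive extents rotate across the containers.
    container = extent % num_containers
    return extent, container


EXTENT_SIZE = 32  # pages per extent (illustrative value)
for page in (0, 31, 32, 95, 96):
    print(page, locate_page(page, EXTENT_SIZE, num_containers=3))
```

Under these assumptions, pages 0 through 31 fill the first extent in the first container, page 32 starts a new extent in the second container, and page 96 wraps back around to the first container.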


Tables

The biggest difference in the table definitions between z/OS and LUW is the way index definitions are handled. In z/OS, there are no predefined indexspaces as there are in LUW.

z/OS:

• One to many tables defined in simple or segmented tablespaces
• Tables and indexes are independent of each other

LUW:

• One to many tables can be defined within a tablespace
• Indexspace directly tied to table definition and can exist in same tablespace

Figure 14: Table comparison

Indexes

In z/OS there are no predefined indexspaces as there are in LUW.

When the CREATE INDEX statement is executed the same rules apply as those when creating a tablespace. You can either specify it to be VCAT or storage-group defined. The underlying VSAM file is then created.

When creating a table in LUW, you must have a tablespace predefined for both the table and any indexes you might add to the table. The indexspace specification is part of the table definition. Therefore all indexes for the table use the same indexspace. When using SMS managed tablespaces, the indexspace has to be the same as the tablespace.

z/OS:

• Unique
• Non-unique
• Clustering*
• Partitioning

* - Non-partitioned TS only

LUW:

• Unique
• Non-unique
• Clustering
• Multi-Dimensional (V8)

Figure 15: Index comparison



Index Structures

z/OS

• Each index has its own VSAM dataset
• Indexspace created when CREATE INDEX executed
  o No CREATE INDEXSPACE DDL like tablespaces
  o Only one index per indexspace
  o VSAM dataset name can be a little cryptic for indexes

    VCAT.DSNDBC.DBNAME.IXNAME.I0001.A001
    VCAT.DSNDBD.DBNAME.IXNAME.I0001.A001

    Where:
    o VCAT = Typically the subsystem name
    o DBNAME = Database name
    o IXNAME = 8 character representation of IX name
    o A001 = Dataset number (A001, A002, etc.)

• 2 types of allocation methods
  o VCAT
  o STOGROUP

LUW

• Indexes are dependent on tables. Indexspace must be specified when table created.
  o All indexes for table use one tablespace
  o Indexspace is predefined before IXs are created
  o Indexes can be defined in same tablespace as table
    • Required for SMS


PARTITIONING

Partitioning is the process of breaking large volumes of table data into multiple parts based on a key range. This provides multiple benefits. First, it provides manageability in that the partitions are independent of one another when it comes to maintenance such as reorgs, runstats, copy, etc. Second is the performance benefit of being able to place the partitions on different I/O devices as well as access data from multiple partitions concurrently via I/O parallelism.

Prior to DB2 LUW 9.1, the concept of partitioning was completely different between the platforms. z/OS partitioning is based on a partitioning key with a data range. LUW partitioning is at the database level: a partitioning key is specified, but not a range. The partitioning is done via a hashing algorithm which automatically distributes the data across the different nodes. This is done with the Database Partitioning Feature (DPF) available with DB2 ESE. In 9.1, you are now able to partition tables much like z/OS, with a partitioning key range, and get the same administrative and performance benefits of partition independence. Partitions are created in individual tablespaces, which can then be independently reorged, backed up, etc.

z/OS:

• Single table
• One to 254 partitions
• Partitioning key range controls what partition data resides in
• Each partition can be on separate device

DPF:

• Database partitioned
  o Multiple tables in database
  o Partitions usually on separate machines
  o Hash or range partitioning
• Controlled via Database Partition Groups
• DB Part = Node
  o Data
  o Indexes
  o Config files
  o Logs
• Specify key but not data values

Figure 16: Partitioning overview
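Range partitioning, as used on z/OS, can be sketched as a lookup against an ordered list of limit keys. The Python example below reuses the sales-data split from Figure 13 (Jan-Apr, May-Aug, Sep-Dec); the LIMIT_KEYS list and the function name are illustrative assumptions, not DB2 syntax.

```python
from bisect import bisect_left

# Hypothetical limit keys: the highest month stored in each partition.
LIMIT_KEYS = [4, 8, 12]  # Part1: Jan-Apr, Part2: May-Aug, Part3: Sep-Dec


def range_partition(month):
    """Return the 1-based partition number for a month (1-12)."""
    if not 1 <= month <= LIMIT_KEYS[-1]:
        raise ValueError("month is outside every partition range")
    # The first limit key that is >= month identifies the partition.
    return bisect_left(LIMIT_KEYS, month) + 1


for month in (1, 4, 5, 12):
    print(month, "-> partition", range_partition(month))
```

Because the key value alone determines the partition, a utility or query that touches only May through August data needs to work with Part2 only.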

Database Partition Group

• Formerly nodegroup

• A set of one or more database partitions

• A tablespace exists within a nodegroup

• More than one table can be in a nodegroup

• Rows are distributed across partitions of nodegroup

• Partitioning Map controls data placement

• Hash function places rows on a given partition

• Data will be evenly distributed across nodes in nodegroup
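The interplay between the hash function and the partitioning map can be sketched as follows. This is a simplified Python model: zlib.crc32 stands in for DB2's internal hashing algorithm, and the map size of 4096 entries, the four partitions, and the customer_id keys are all assumptions chosen for illustration.

```python
import zlib

NUM_PARTITIONS = 4
MAP_SIZE = 4096  # assumed number of partitioning map entries

# Partitioning map: entry i names the database partition that owns bucket i.
partitioning_map = [i % NUM_PARTITIONS for i in range(MAP_SIZE)]


def hash_partition(key):
    """Return the database partition for a partitioning key value."""
    # crc32 is a stand-in hash, not the function DB2 actually uses.
    bucket = zlib.crc32(str(key).encode("utf-8")) % MAP_SIZE
    return partitioning_map[bucket]


# Count how many hypothetical customer_id values land on each partition.
counts = [0] * NUM_PARTITIONS
for customer_id in range(10000):
    counts[hash_partition(customer_id)] += 1
print(counts)
```

Only the key is specified; the row's destination follows from the hash and the map. Changing the entries in the map, rather than the hash function, is what would move buckets of rows from one partition to another in this model.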



ADMINISTRATION

The basic concepts of database administration are the same for any relational database management system (RDBMS). This section will discuss the similarities and differences in maintaining DB2 across multiple platforms.

Optimizer

The Optimizer is the component DB2 uses to determine the “roadmap” for retrieving the result set of a particular SQL statement.

z/OS:

• Fixed optimization
• HINTS allow for some flexibility
  o Mainly used to maintain old access path
  o Must be turned on at install time
  o Need to modify PLAN_TABLE
  o Manual process

LUW:

• Much more flexible than z/OS
  o Seven levels of optimization
  o Adjusted based on query complexity

Figure 17: Optimizer overview

DB2 Hints

Hints were introduced in z/OS with DB2 V6. Hints allow you to manually update the access plan information in the PLAN_TABLE to force DB2 to use a specific access path. This process is mainly used to maintain a specific access path when upgrading to a newer version of DB2.


Optimization Class – Guidelines

The LUW Optimizer gives the DBA much more flexibility in deciding how much resource should be utilized in optimizing a query. The more complex the query, the higher the optimization level that should be used. Remember that the higher the optimization level, the more resources are consumed for optimization. This could be a significant factor when dealing with dynamic SQL.

LEVEL RECOMMENDATION

0 Minimal amount of optimization. Only recommended for very simple SQL accessing well indexed tables. Only nested loop joins and IX scans enabled.

1 Similar to 0 except Merge Scan and TS scan enabled.

2 Recommended for very complex queries which are infrequently executed in a decision support or OLAP environment.

3 Closest to z/OS optimizer. Recommended for queries with four or more joins.

5 DEFAULT – Most cost effective method for mix of simple and complex queries. Optimization will be automatically reduced for complex dynamic SQL if optimizer determines that the resources are not necessary.

7 Same as 5 except optimization is not reduced for complex dynamic SQL.

9 Used to determine whether more comprehensive optimization can generate better access plan for very complex, long running queries using large tables.

Figure 18: Optimizer “Rules of Thumb”
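On LUW, the class can be changed for a session through the CURRENT QUERY OPTIMIZATION special register. The Python sketch below only builds the statement text and rejects class numbers that Figure 18 does not list. The helper name `set_optimization_sql` and the short descriptions are illustrative assumptions, and no database connection is made.

```python
# Optimization classes listed in Figure 18 (descriptions abbreviated).
OPTIMIZATION_CLASSES = {
    0: "minimal; very simple SQL on well-indexed tables",
    1: "like 0, plus merge scan and tablespace scan",
    2: "complex, infrequently executed decision support queries",
    3: "closest to the z/OS optimizer",
    5: "default; mix of simple and complex queries",
    7: "like 5, but never reduced for complex dynamic SQL",
    9: "most comprehensive; very complex, long-running queries",
}

def set_optimization_sql(level):
    """Return the statement that sets the optimization class for the session."""
    if level not in OPTIMIZATION_CLASSES:
        valid = ", ".join(str(c) for c in sorted(OPTIMIZATION_CLASSES))
        raise ValueError(f"{level} is not an optimization class; choose one of {valid}")
    return f"SET CURRENT QUERY OPTIMIZATION = {level}"

statement = set_optimization_sql(5)
```

Note that there are seven classes but they are not numbered 0 through 6; values such as 4, 6, and 8 are not valid.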


Explain Processing

z/OS

PLAN_TABLE

DSN_FUNCTION

DSN_STATEMENT

Figure 19: z/OS Explain tables

• The EXPLAIN statement was extended to insert information into two new tables for V6

• DSN_FUNCTION table is useful for finding out information about function resolution

• DSN_STATEMENT table is useful for finding out the estimated cost of SQL statements

Unlike the plan table, neither the function table nor the statement table has to exist to use EXPLAIN.

LUW

EXPLAIN_INSTANCE

EXPLAIN_STATEMENT

EXPLAIN_OPERATOR

EXPLAIN_STREAM

EXPLAIN_ARGUMENT

EXPLAIN_PREDICATE

EXPLAIN_OBJECT

Figure 20: Distributed Explain tables

• EXPLAIN_ARGUMENT: Represents the unique characteristics for each individual operator.

• EXPLAIN_INSTANCE: Main control table for all Explain information. Each row of data in the Explain tables is explicitly linked to one row in this table. Basic information about the source of the SQL statements being explained and environment information is kept in this table.

• EXPLAIN_OBJECT: Contains data objects required by the access plan to satisfy the SQL statement.

• EXPLAIN_OPERATOR: Contains all the operators needed to satisfy the SQL statement.

• EXPLAIN_PREDICATE: Identifies which predicates are applied by a specific operator.

• EXPLAIN_STATEMENT: Contains the text of the SQL statement in two forms: the original version entered by the user and a rewritten version generated by the compilation process.

• EXPLAIN_STREAM: This table represents the input and output data streams between individual operators and data objects.
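To make the relationship between operators, streams, and objects concrete, here is a small Python sketch that rebuilds an access plan tree from simplified rows. The tuples are a stand-in for EXPLAIN_OPERATOR and EXPLAIN_STREAM rows, not their real column layout, and the plan, the object names, and the `plan_lines` helper are invented for illustration.

```python
def plan_lines(operators, streams, node, depth=0):
    """Render the plan as indented lines, following streams from consumer to producer."""
    label = operators[node] if isinstance(node, int) else f"object {node}"
    lines = ["  " * depth + label]
    for source, target in streams:
        if target == node:
            lines.extend(plan_lines(operators, streams, source, depth + 1))
    return lines

# Operator id -> operator type (simplified EXPLAIN_OPERATOR rows).
operators = {1: "RETURN", 2: "NLJOIN", 3: "IXSCAN", 4: "TBSCAN"}

# (source, target) pairs (simplified EXPLAIN_STREAM rows).
# Integers are operators; strings are data objects from EXPLAIN_OBJECT.
streams = [(2, 1), (3, 2), (4, 2), ("ORDERS_IX", 3), ("CUSTOMER", 4)]

plan = plan_lines(operators, streams, node=1)
print("\n".join(plan))
```

Each stream row links one producer to one consumer, which is why the whole plan can be reconstructed by starting at the final operator and walking the streams backward.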

Parallelism

Parallelism is another concept that differs radically across platforms. On distributed DB2, parallelism across partitions is restricted to EEE environments (ESE with the Data Partitioning Feature).

[Figure 21, z/OS SYSPLEX panel: members DB2A (CPU1) and DB2B (CPU2), each with its own Log, BSDS, and Workfile DB, connected through the Coupling Facility to the DB2 Catalog and DASD.]

[Figure 21, ESE Data Partitioning Feature (MPP) panel: DB Part 0 through DB Part 3 on CPU1 through CPU4, each with its own Log and Data, connected by the Fast Communication Manager.]

Figure 21: Parallelism examples

There are two types of configurations when using DPF:

• Massively parallel processor (MPP) (pictured in Figure 21)
  o Multiple machines with single processors grouped together in a cluster
  o “Shared nothing” configuration
• Symmetric multi-processor (SMP)
  o Multiple processors on a single machine

Types of Parallelism

z/OS

I/O: DB2 concurrently prefetches data from multiple partitions.

CPU: DB2 starts multiple tasks in parallel to process a query.

SYSPLEX: Same as CPU, except that tasks are spread across the machines in the sysplex (see Figure 21).

LUW

• I/O
  o Multi-container TS
• Query
  o Intra-partition (SMP): parallelism within a single partition
  o Inter-partition (EEE/MPP): parallelism across multiple partitions
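Inter-partition query parallelism follows a scatter and gather pattern: each partition works on its own rows and a coordinator merges the partial results. The Python sketch below imitates this with threads standing in for database partitions. The data, the `scan_partition` and `parallel_aggregate` names, and the use of a thread pool are all assumptions made for illustration; they do not describe how DB2 schedules work internally.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_partition(rows, threshold):
    """Work done locally on one partition: filter, then pre-aggregate."""
    matching = [value for value in rows if value >= threshold]
    return len(matching), sum(matching)

def parallel_aggregate(partitions, threshold):
    """Coordinator: send the same request to every partition, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(
            pool.map(lambda rows: scan_partition(rows, threshold), partitions.values())
        )
    count = sum(c for c, _ in partials)
    total = sum(s for _, s in partials)
    return count, total

# Four partitions, each holding its own slice of the table.
partitions = {0: [10, 20], 1: [5, 30], 2: [40], 3: [1, 2, 3]}
count, total = parallel_aggregate(partitions, threshold=10)
```

Because no partition needs another partition's rows to compute its partial result, the work scales with the number of partitions, which is the point of the "shared nothing" configuration.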

Performance Monitoring

z/OS

z/OS traces offer much greater detail but can also cause much more overhead. Trace records can be written to either SMF or GTF.

OS/390

Instrumentation Facility Component (IFC)
• Statistics
  o Global statistical data
• Accounting
  o Detail info for specific application
• Audit
  o Table access audits
  o Requires AUDIT keyword on table definition
• Performance
  o Most detailed $$$
  o Only use for short periods
• Monitor
  o Makes trace data available for monitoring applications

UNIX/NT

The amount of memory used for database monitoring is configurable in the DBM configuration file using the monheapsz parameter.

Control Center, CLP, or third-party monitor used to view trace output.

• Snapshot Monitor
  o Show status of database for an instant in time
• Event Monitor
  o Historical status over time
  o Databases, Tablespaces, Connections, Tables, Statements, Transactions, Deadlocks

Figure 22: Monitoring facilities

Utilities

As you can see, the utilities available are very comparable across the platforms; however, the utilities for z/OS are still considerably more robust.

Z/OS

• COPY
• DSNTIAUL/Fast Unload
• LOAD
• RECOVER
• REORG (TS, IX)
• RUNSTATS
  o “Real-Time Statistics” (V7)
• QUIESCE
• MERGECOPY
• CHECK DATA

LUW

• BACKUP
• EXPORT
• LOAD/IMPORT
• RESTORE
• REORG (Table)
  o REORGCHK
• RUNSTATS
• QUIESCE
• Set Integrity

Figure 23: Utility comparison


Backup and Recovery

Backups

These are the components that comprise a complete backup for the different environments.

Z/OS

• Tablespace
• Index
• Components
  o Full Copy
  o Incremental Copy
  o Copy to Copy (V7)
  o Active/Archive Logs
  o BSDS
  o SYSLGRNX

LUW

• Database
• Tablespace
• Components
  o Backup Image
  o Incremental Copy (7.2)
  o Backup History File
  o Active Logs*
  o Archive Logs*

*Depends on how logging is defined for DB

Figure 24: Components of backups

Recovery Info

Z/OS

Bootstrap Dataset (BSDS)
• Inventory of all active and archive log data sets
• Range of log records in each log file
• Restart information
  o Size/Thresholds of Bufferpools and Hiperpools

LUW

Recovery History File
• Updated:
  o Backup of full DB or TS
  o Restore of full DB or TS
  o Load of a table
  o Quiesce TS
• Contains:
  o Part of DB which was copied
  o When DB was copied
  o Location of the copy
  o Time of last restore

Figure 25: BSDS vs. Recovery History File

Logging

z/OS

Logs are defined at the subsystem level and are global for all objects. All update activity is written to the current active log. When the active log is full, it is automatically archived to tape and the next active log in sequence is used. Dual logging runs two of these processes in parallel, providing a redundant copy of the log files in case of media failure.

LUW

There is no concept of auto-archiving. When the primary log files are filled, a secondary log file is allocated. This continues until no more secondary logs are available.

Z/OS

• Logs apply to entire subsystem
  o Active
  o Archive
• Active logs are automatically archived when full
• Dual-Logging

LUW

• Defined at database
• Circular
  o No roll-forward recovery
• Archival
  o Fully recoverable
  o Similar to OS/390
  o Three log file types: Active, Online Archived, Offline Archived
• On Demand Archiving
  o Close and archive an active log at any time
• Dual-Logging

Figure 26: Logging overview


Circular Logging in LUW

Supports both crash- and version-type recoveries.

Primary log files are allocated when the database is created.

Secondary log files are allocated as needed.

• Automatically de-allocated when no longer needed

• Good for periodic large units of work

• Non-recoverable databases

• Log files are reused

• Uses active logs only

• Secondary used for overflow

• Roll-forward recovery not possible

• Default method for new DBs

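[Diagram: primary log files 1, 2, 3 through "n" arranged in a ring and reused in turn; secondary log files 1 through "n" allocated for overflow.]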

Figure 27: Circular logging process
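The reuse and overflow behavior of circular logging can be modeled with a short Python sketch. This is a deliberately simplified toy: the class name `CircularLog`, the file names P1 and S1, and the rule of always picking the lowest free file are assumptions for illustration, and real log files are released according to transaction state rather than by an explicit call.

```python
class CircularLog:
    """Toy model: primary files are reused, secondary files absorb overflow."""

    def __init__(self, primary, max_secondary):
        self.primary = primary              # number of preallocated primary files
        self.max_secondary = max_secondary  # cap on overflow files
        self.in_use = []                    # files still needed for crash recovery, oldest first

    def _first_free(self, prefix, limit):
        for i in range(1, limit + 1):
            name = f"{prefix}{i}"
            if name not in self.in_use:
                return name
        return None

    def next_file(self):
        """Pick the file to write next: a free primary first, otherwise a secondary."""
        name = self._first_free("P", self.primary) or self._first_free("S", self.max_secondary)
        if name is None:
            raise RuntimeError("log full: no primary or secondary file available")
        self.in_use.append(name)
        return name

    def release_oldest(self):
        """The oldest file is no longer needed, so it can be reused or de-allocated."""
        return self.in_use.pop(0)


log = CircularLog(primary=3, max_secondary=2)
written = [log.next_file() for _ in range(5)]  # a large unit of work spills into secondaries
freed = log.release_oldest()                   # the work is committed; the oldest file is free
reused = log.next_file()                       # the freed primary file is written again
```

Because a released file is overwritten, the log never holds a complete history of changes, which is why roll-forward recovery is not possible with this method.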

Archival Logging

Log files are not reused, which makes roll-forward recovery possible.

[Diagram: log files 12 through 16, labeled as follows.]

• Offline Archival: files moved from the active log subdirectory, usually to offline media.
• Active: contains information for non-committed or non-externalized transactions.
• Online Archival: contains information for committed and externalized transactions; stored in the active log subdirectory.

Figure 28: Archival logging detail

Active (15, 16) — Contains information related to units of work that have not yet been committed or rolled back. They also contain information for transactions that have committed, but whose changes have not been written to disk.

Online archive (14) — Contains information related to completed transactions that no longer require crash recovery protection. These are called online because they reside in the same subdirectory as the active logs.

Offline archive (12, 13) — Log files that have been removed from the active log subdirectory. The files must be moved manually. There is no auto-archiving in UDB.
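The three states can be expressed as a simple classification rule. The Python sketch below reproduces the example above (files 12 through 16) under two assumptions: that the lowest-numbered file still needed for crash recovery is known, and that the contents of the active log subdirectory can be listed. The function and parameter names are hypothetical.

```python
def classify_log_files(file_numbers, first_active, in_log_directory):
    """Label each log file as active, online archive, or offline archive."""
    states = {}
    for number in file_numbers:
        if number >= first_active:
            states[number] = "active"            # still needed for crash recovery
        elif number in in_log_directory:
            states[number] = "online archive"    # complete, but not yet moved
        else:
            states[number] = "offline archive"   # moved out of the log subdirectory
    return states

states = classify_log_files(
    range(12, 17),
    first_active=15,
    in_log_directory={14, 15, 16},
)
```

A file changes state only in one direction: from active to online archive when its transactions are committed and externalized, and from online to offline archive when it is moved.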

Recovery

There are three basic recovery options available for either z/OS or LUW:

Z/OS

• Crash
  o DB2 restart
• Roll-Forward
  o IC plus log apply
  o LOGONLY
• Point in Time
  o IC only (TOCOPY)
  o To RBA

LUW

• Crash
  o Uses logs to recover from power interrupts or application ABENDS
• Version
  o Image copy (TOCOPY)
• Roll-Forward
  o Image copy plus log apply

Figure 29: Types of recovery
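The difference between restoring a copy only and restoring a copy plus log apply can be shown with a small Python sketch. The backup and log structures here are invented for illustration: `position` stands in for a log position (an RBA on z/OS), each log record simply sets one key to a new value, and the `recover` function is hypothetical.

```python
def recover(backup, log_records, stop_at=None):
    """Restore the image copy, then apply later log records up to an optional stop point."""
    data = dict(backup["data"])
    for position, key, value in sorted(log_records):
        if position <= backup["position"]:
            continue      # change is already reflected in the copy
        if stop_at is not None and position > stop_at:
            break         # point-in-time recovery stops here
        data[key] = value
    return data

backup = {"position": 100, "data": {"A": 1, "B": 2}}
log_records = [(90, "A", 0), (110, "A", 5), (120, "C", 7), (130, "B", 9)]

copy_only = recover(backup, [])                             # image copy only (TOCOPY)
roll_forward = recover(backup, log_records)                 # copy plus full log apply
point_in_time = recover(backup, log_records, stop_at=120)   # copy plus log apply to a position
```

With no log records available, as under circular logging, only the first form is possible.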


Runstats

The concept of statistics collection for proper optimization is essentially the same across the platforms.

z/OS

The Real-Time Statistics facility in z/OS is a stored procedure (DSNACCOR) that gives near real-time feedback on space utilization. This feature requires the real-time statistics database (DSNRTSDB) to be set up.

LUW

In DB2 LUW Version 9.1, automatic statistics collection has been added. The DB2 server collects statistical information about your data in a background process when required. Only optimizer-related stats are collected in order to minimize performance overhead.

Reorganizing Data

The biggest difference between reorganization in z/OS vs. LUW is that in z/OS you reorg either a tablespace or an index. In LUW, you reorganize the table. DB2 for LUW v9.1 added the ability to automatically reorg tables and indexes based on predefined thresholds.

Z/OS

• Tablespace
  o Log Yes/No
  o Unload Pause
  o Shrlevel
• Index
• Online
  o SHRLEVEL CHANGE

LUW

• Table
• Index (v8)
• REORGCHK
  o Determines when reorg is required
• Online
• Automated reorg (V9.1)

Figure 30: Reorg parameters
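A threshold check of the kind REORGCHK performs can be sketched as follows. This Python example is an assumption-laden illustration: it tests a single ratio (overflow rows as a percentage of table cardinality), the 5 percent default is chosen for the example rather than taken from this paper, and `needs_reorg` is a hypothetical name, not a DB2 interface.

```python
def needs_reorg(cardinality, overflow_rows, max_overflow_pct=5.0):
    """Flag a table for reorganization when too many rows have overflowed."""
    if cardinality == 0:
        return False  # an empty table has nothing to reorganize
    overflow_pct = 100.0 * overflow_rows / cardinality
    return overflow_pct >= max_overflow_pct

# Statistics as they might be read from the catalog after RUNSTATS (made-up values).
tables = {"ORDERS": (1000, 80), "CUSTOMER": (1000, 10), "EMPTY_T": (0, 0)}
flagged = [name for name, (card, overflow) in tables.items() if needs_reorg(card, overflow)]
```

The check is only as good as the statistics behind it, so current statistics should be collected before the result is trusted.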

Unloading Data

IBM introduced its high-performance UNLOAD utility on z/OS in Version 7. This utility allows you to unload data from either a table or an image copy.

Z/OS

• DSNTIAUL
  o IBM sample program
• REORG UNLOAD PAUSE
• UNLOAD Utility
  o Table
  o Image Copy

LUW

• EXPORT
  o Accessed via Control Center or CLP
  o Rename columns
  o Multiple output formats

Figure 31: Unload options

Loading Data

Z/OS

• Load Utility
  o Resume/Replace
  o Log YES/NO
  o Runstats/Copy
  o Sophisticated SQL processing
• ONLINE
  o SHRLEVEL CHANGE

LUW

• Load
  o Insert/Replace
  o RUNSTATS
  o Significantly faster than import
  o Good for large amounts of data
  o Online
• Import
  o Can dynamically create table
  o Insert process: Update, Replace
  o Good for small amounts of data

Figure 32: Data load options


CONCLUSION

DBAs are more commonly being asked to manage relational databases regardless of the vendor or the operating system on which the database resides. Having a solid foundation in relational database principles is absolutely necessary, but not enough in a heterogeneous database environment. It is also necessary to be able to work with the nuances and varying processes required by the individual database type. There is no substitute for experience and knowledge, but having a tool that standardizes and simplifies these processes will maximize the efficiency of a DBA staff and greatly help to reduce problems that can result in application downtime.

ABOUT THE AUTHOR

Jim Wankowski is currently the DB2 Technology Specialist at Quest Software. Jim has more than 20 years of development and DBA experience with DB2. Jim participated in the original beta program for DB2 in 1984. Jim is well known in the DB2 community. He has written articles for DB2 Magazine, z/Journal, Database Trends & Applications, and regularly presents at IDUG conferences, regional DB2 user groups, and vendor seminars worldwide.


ABOUT QUEST SOFTWARE, INC.

Quest Software, Inc. delivers innovative products that help organizations get more performance and productivity from their applications, databases and Windows infrastructure. Through a deep expertise in IT operations and a continued focus on what works best, Quest helps more than 18,000 customers worldwide meet higher expectations for enterprise IT. Quest Software can be found in offices around the globe and at www.quest.com.

Contacting Quest Software

Phone: 949.754.8000 (United States and Canada)

Email: [email protected]

Mail: Quest Software, Inc. World Headquarters 5 Polaris Way Aliso Viejo, CA 92656 USA

Web site: www.quest.com

Please refer to our Web site for regional and international office information.

Contacting Quest Support

Quest Support is available to customers who have a trial version of a Quest product or who have purchased a commercial version and have a valid maintenance contract. Quest Support provides around the clock coverage with SupportLink, our web self-service. Visit SupportLink at http://support.quest.com

From SupportLink, you can do the following:

• Quickly find thousands of solutions (Knowledgebase articles/documents).

• Download patches and upgrades.

• Seek help from a Support engineer.

• Log and update your case, and check its status.

View the Global Support Guide for a detailed explanation of support programs, online services, contact information, and policy and procedures. The guide is available at: http://support.quest.com/pdfs/Global Support Guide.pdf

