
    Best Practices for EMC Symmetrix 8000 with IBM DB2 Universal Database

    Karen Sullivan
    IBM Canada Ltd
    IBM Toronto Lab

    John Macdonald
    EMC Corporation

    June 2003

    Engineering White Paper


    EMC Corporation and IBM Corporation

    Best Practices for EMC Symmetrix with IBM DB2 Universal Database 1

    Copyright 2003 EMC Corporation and IBM Corporation. All rights reserved.

    EMC and IBM believe the information in this publication is accurate as of its publication date. The

    information is subject to change without notice.

    THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." NEITHER EMC

    CORPORATION NOR IBM CORPORATION MAKE ANY REPRESENTATIONS OR WARRANTIES

    OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND BOTH

    SPECIFICALLY DISCLAIM IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR

    A PARTICULAR PURPOSE.

    The furnishing of this document does not imply giving license to any IBM or EMC patents.

    References in this document to IBM products, Programs, or Services do not imply that IBM intends to

    make these available in all countries in which IBM operates.

    Use, copying, and distribution of any EMC or IBM software described in this publication requires an

    applicable software license.

    IBM, AIX, DB2, DB2 Universal Database, and RS/6000 are trademarks or registered trademarks of

    International Business Machines Corporation in the United States, other countries, or both.


    Table of Contents

    Introduction
    Symmetrix Concepts and Definitions
      Hypervolumes
        Hypervolume Size
      Metavolumes
        Metavolume Size
        Meta Head and Meta Tail
      Types of Metavolumes
        Concatenated Metavolume
        Striped Metavolume
        Mirrored Metavolumes
      Channel Directors
    DB2 UDB Concepts and Definitions
      Instances
      Databases
      Database Partitions
        Nodegroups
      Buffer Pools
      Tables
      Table Spaces
        System-Managed versus Database-Managed Table Spaces
        Containers
        Pages
        Extents
        Prefetch Size
      Prefetching
      Page Cleaners
    Configuring a Symmetrix System
      Creating Metavolumes
        Metavolume Size
        Hypervolumes versus Physical Disks
        Striped versus Concatenated Metavolumes
        Stripe Size
      Channel Directors
    Configuring the Operating System
      Multipathing with PowerPath
      Operating System Logical Volume Striping
    Configuring DB2 UDB
      Table Space Container Configurations
        Shared Nothing
        Shared Everything
        JBOD
      Table Space Configuration
        Extent Size
        Prefetch Size
        Overhead and Transfer Rate
      Other Tuning Parameters
        I/O Servers
        DB2_PARALLEL_IO
        Multipage File Allocation
    Understanding Existing Systems


    Introduction

    For every complex problem, there is a solution that is simple, neat, and wrong.
    H. L. Mencken

    H. L. Mencken was a journalist whose observation, that simple solutions to complicated problems are not always the right ones, reflects the fear people sometimes feel when taking on new, complicated problems. Avoiding the solution that is neat and wrong is never simple; rather, it takes patience, planning, and experience. This paper gives a general overview of IBM DB2 Universal Database (DB2 UDB) with database partitioning and the EMC Symmetrix 8000 series (Symmetrix). It also supplies practical recommendations for implementing DB2 UDB for data warehouse applications running on EMC Symmetrix 8000 series storage servers. The information presented in this paper was compiled using DB2 UDB V7.2. However, unless otherwise noted, the concepts and methodologies remain the same for DB2 UDB V8.1.

    The paper does not provide complete descriptions of DB2 UDB or the Symmetrix 8000 series products; refer to www.emc.com or www.ibm.com/db2 for additional product information.


    Symmetrix Concepts and Definitions

    The following is a brief summary of Symmetrix terminology and a discussion of limits required to understand the contents of this white paper. Note that the configuration and capacity limits are microcode-level dependent and subject to change. For a complete description, review your EMC Symmetrix Product Guide. The limits discussed in this paper are all based on microcode level 5068.

    The physical disks within a Symmetrix system can be subdivided into various-sized logical volumes that

    can be logically joined together again. These logically linked volumes are then presented to a server as an

    addressable device. Within a Symmetrix system there are two types of logical volumes: hypervolumes and

    metavolumes. The maximum number of logical volumes is a microcode-dependent value currently set to

    8000 volumes.

    Hypervolumes

    A hypervolume, also referred to as a hyper, is a range of contiguous space on a single physical disk that is defined to be an individually addressable Symmetrix logical volume. Each physical disk can be divided into a maximum of 128 hypervolumes. People familiar with the process of creating hypervolumes will often refer to the process as slicing up the physical disks or creating splits. For clarity, the terms slices and splits will not be used to describe hypervolumes. While hypervolumes can be presented to the server as a directly addressable device, they are also the foundation for creating metavolumes. The major attribute that defines a hypervolume is its size.

    Figure 1. A 36 GB Physical Disk Divided into Four Hypervolumes of 9 GB Each

    Hypervolume Size

    The amount of physical disk space associated with one hypervolume is called the hypervolume size.

    Hypervolume size is microcode-dependent and currently limited to 15 GB. In Figure 1, a 36 GB physical disk is subdivided into four hypervolumes, each with a hypervolume size of 9 GB.


    Metavolumes

    Once a physical disk has been divided into hypervolumes, a group of hypervolumes of the same size can

    then be logically joined across various physical disks to create a metavolume. This newly created logical

    volume can then be presented to a server as an addressable device. The hypervolumes that make up the

    newly created metavolume can no longer be presented to the server as separate devices.

    Figure 2 provides an example of how four metavolumes can logically reside on four 36 GB physical disks. Note that, for simplicity's sake, only one metavolume is labeled. In the diagram, the four physical disks are subdivided into four 9 GB hypervolumes. Each hypervolume is coloured red, yellow, blue, or green. The metavolumes are made up of four like-coloured 9 GB hypervolumes. Therefore, there are four metavolumes of 36 GB each in the diagram.

    Figure 2. Example Layout of Four Metavolumes across Four Physical Disks

    Metavolume Size

    A metavolume consists of 2 to 255 hypervolumes. Each time a metavolume is created, the number of hypervolumes it contains must be determined. This value is referred to as the number of hypers per meta for one metavolume. The metavolume size is simply the product of the hypervolume size and the number of hypers per meta.

    metavolume size = ( hypers per meta ) * ( hypervolume size )

    Formula 1. Metavolume Size

    Given that the maximum hypervolume size is 15 GB and the maximum number of hypers per meta is 255, the largest metavolume that can be created is 3825 GB, or about 3.74 TB (based on Formula 1).
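The arithmetic behind Formula 1 can be checked with a few lines of Python. This is only an illustrative sketch: the function and constant names are hypothetical, and the limits are the ones quoted in this paper for microcode level 5068.

```python
# Limits quoted in this paper for microcode level 5068 (names are illustrative).
MAX_HYPER_SIZE_GB = 15
MIN_HYPERS_PER_META = 2
MAX_HYPERS_PER_META = 255


def metavolume_size_gb(hypers_per_meta, hyper_size_gb):
    """Formula 1: metavolume size = hypers per meta * hypervolume size."""
    if not MIN_HYPERS_PER_META <= hypers_per_meta <= MAX_HYPERS_PER_META:
        raise ValueError("hypers per meta must be between 2 and 255")
    if not 0 < hyper_size_gb <= MAX_HYPER_SIZE_GB:
        raise ValueError("hypervolume size must be greater than 0 and at most 15 GB")
    return hypers_per_meta * hyper_size_gb


figure2_meta_gb = metavolume_size_gb(4, 9)      # the 36 GB metavolume of Figure 2
largest_meta_gb = metavolume_size_gb(255, 15)   # largest possible metavolume
largest_meta_tb = largest_meta_gb / 1024        # converted using 1 TB = 1024 GB

print(figure2_meta_gb, largest_meta_gb, round(largest_meta_tb, 2))
```

The range checks simply encode the 2 to 255 hypers-per-meta rule and the 15 GB hypervolume ceiling, so an out-of-range layout fails early instead of producing a size that cannot be configured.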


    Meta Head and Meta Tail

    As the hypervolumes are assigned to a metavolume, they are given a sequence number. The first hypervolume in the sequence is known as the meta head, and the last hypervolume in the sequence is considered the meta tail. All remaining hypervolumes are considered members of the metavolume. Figure 3 shows which hypervolumes are considered the meta head, meta members, and the meta tail.

    Figure 3. Relative Positions of a Meta Head, Member, and Tail for an Example Metavolume

    Data is always placed on the metavolume across the hypervolumes from meta head to meta tail. Therefore,

    when data is first written to a metavolume, the first write always takes place on the meta head.

    Types of Metavolumes

    There are two different types of metavolumes: concatenated metavolumes and striped metavolumes. For

    both types, the metavolume size is defined in the same way. For example, if you have the same number of

    hypers per meta (e.g., four) and the same hypervolume size (e.g., 9 GB), then both a concatenated and a

    striped metavolume will produce a device of the same size (e.g., 36 GB). The difference between

    concatenated and striped metavolumes is in the method in which the logical data is placed on the

    underlying hypervolumes. DB2 UDB database data allocation will be discussed in more detail later in the

    paper.

    Concatenated Metavolume

    A concatenated metavolume writes data to a hypervolume until the hypervolume size is reached before

    placing data onto the next hypervolume. Therefore, when first allocating data to the metavolume, the meta

    head would receive all the data until the hypervolume size is reached. Only after the meta head is full will

    data be placed onto the next hypervolume.

    Figure 4. Logical Data Placement on a Concatenated Metavolume


    Striped Metavolume

    In the case of a striped metavolume, data will be placed on the underlying hypervolumes in multiples of

    Symmetrix cylinders. When an application writes data to the metavolume, the first write will take place on

    the meta head, and the subsequent write will reside on the next member of the metavolume. Allocation will

    continue in this fashion until the meta tail is reached. The process is then repeated starting at the meta head

    once again.

    Figure 5. Logical Data Placement on a Striped Metavolume

    Stripe Size

    The amount of data written to a single hypervolume is known as the stripe size. The size is based on units

    of disk cylinders with the default and minimum value being two cylinders. Since a cylinder is 480 KB of

    data, the minimum stripe size is 960 KB.

    Figure 6. Close Up of a Stripe on a Single Hypervolume

    Stripe Width

    The stripe width is the stripe size times the number of hypers per meta. So, if we have the default stripe

    size of 960 KB and four hypers per meta, the stripe width would be 3840 KB.

    stripe width = ( stripe size ) * ( hypers per meta )

    Formula 2. Stripe Width
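To make the difference between the two placement methods concrete, the sketch below computes the stripe width from Formula 2 and maps a logical offset to the hypervolume that would hold it. It is a simplified model, not Symmetrix microcode behaviour: it assumes stripes are dealt strictly round-robin from meta head to meta tail, and all function names are hypothetical.

```python
CYLINDER_KB = 480                   # cylinder size quoted in this paper
STRIPE_SIZE_KB = 2 * CYLINDER_KB    # default and minimum stripe size: 960 KB


def stripe_width_kb(stripe_size_kb, hypers_per_meta):
    """Formula 2: stripe width = stripe size * hypers per meta."""
    return stripe_size_kb * hypers_per_meta


def concatenated_member(offset_kb, hyper_size_kb):
    """Index of the hypervolume (0 = meta head) holding a logical offset
    when each hypervolume is filled completely before the next is used."""
    return offset_kb // hyper_size_kb


def striped_member(offset_kb, stripe_size_kb, hypers_per_meta):
    """Index of the hypervolume (0 = meta head) holding a logical offset
    when stripes are placed round-robin from meta head to meta tail."""
    return (offset_kb // stripe_size_kb) % hypers_per_meta


HYPERS_PER_META = 4
HYPER_SIZE_KB = 9 * 1024 * 1024     # a 9 GB hypervolume expressed in KB

width_kb = stripe_width_kb(STRIPE_SIZE_KB, HYPERS_PER_META)

# Hypervolume index for the first five stripe-sized chunks of logical data.
offsets = [i * STRIPE_SIZE_KB for i in range(5)]
striped = [striped_member(o, STRIPE_SIZE_KB, HYPERS_PER_META) for o in offsets]
concatenated = [concatenated_member(o, HYPER_SIZE_KB) for o in offsets]

print(width_kb, striped, concatenated)
```

In this model the striped layout touches every hypervolume within the first stripe width of data, while the concatenated layout keeps all of it on the meta head until 9 GB has been written.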


    Mirrored Metavolumes

    Any metavolume, whether striped or concatenated, can also be mirrored. This means that another complete copy of the metavolume is created and stored on different physical disks using like hypervolumes (the same hypervolume size and the same number of hypers per meta, but different physical disks). Even though mirrored metavolumes require twice as much physical disk space, the metavolume size does not change: the device presented to the server is the same size as a nonmirrored metavolume. When a read request takes place against a mirrored metavolume, either hypervolume where the data resides may service the request. When a write request takes place, however, the write must occur on both hypervolumes. To safeguard redundancy, a Symmetrix system ensures that mirrored copies are not created on the same physical disk.

    Figure 7. Logical Data Placement on a Mirrored Striped Metavolume

    Channel Directors

    Host adapters on the Symmetrix system are known as channel directors. This is where the server physically attaches to the storage server via cables. Each card contains a number of fiber, SCSI, or serial

    ESCON ports.


    DB2 UDB Concepts and Definitions

    The following is a brief summary of some DB2 UDB V7.2 concepts and terminology required to

    understand the contents of this white paper. Although some of the terminology has changed for DB2 UDB

    V8.1, the overall concepts remain the same. For a more in-depth discussion of these and other DB2 UDB

    terms, refer to the DB2 UDB manuals available online.

    For DB2 UDB V7.2:

    http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v7pubs.d2w/en_main

    For DB2 UDB V8.1:

    http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8infocenter.d2w/report?target=mainFrame&fn=c0008880.htm

    Instances

    An instance, in DB2 UDB, is a logical database manager environment where you can create and/or catalog

    databases and set various instance-wide configuration parameters. A database manager instance can also

    be defined as being similar to an image of the actual database manager environment. Furthermore, you can have several instances of the database manager product on the same database server. You can use these

    instances to separate the development environment from the production environment, tune the database

    manager to a particular environment, and protect sensitive information from a particular group of people.

    For a partitioned database environment, all database partitions will reside within a single instance and will

    share at the instance level a common set of configuration parameters.

    Databases

    A database is created within an instance. It presents logical data as a collection of database objects (e.g.,

    tables and indexes). Each database includes a set of system catalog tables that describe the logical and

    physical structure of the data, configuration files containing the parameter values allocated for the database,

    and recovery log(s).

    DB2 UDB allows multiple databases to be defined within a single database instance. Configuration

    parameters can also be set at the database level to tune various characteristics, such as memory usage and

    logging.

    Database Partitions

    DB2 UDB allows the user to divide a single database into multiple logical database partitions. Each of

    these database partitions can look and behave as an independent database. Therefore, multiple database

    partitions can reside on the same server, and/or database partitions can reside on many servers. They are all

    part of the same database that is joined through the catalog database partition where the database is actually

    created. This database partition stores the overall database configuration information. Each database partition also has access to its own set of database-level configuration parameters.

    Another term for a database partition is a node. A unique node number identifies each node.

    Nodegroups

    A nodegroup is a set of one or more database partitions. For nonpartitioned database implementations, there is only one nonconfigurable nodegroup, which is always made up of a single database partition.

    Figure 8 shows how five database partitions can be divided into three different nodegroups. As you can see, a database partition can reside within multiple nodegroups. In this example, nodegroup 1 is made up of database partitions 1, 2, 3, and 4. Nodegroup 2 contains only a single database partition, database partition 1, and finally nodegroup 3 comprises database partitions 4 and 5.

    Figure 8. One DB2 UDB Database Comprising Five Database Partitions Grouped into Three Nodegroups
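The overlapping membership described for Figure 8 can be written down as plain sets. The snippet is only an illustration of the grouping in the example; the variable and function names are hypothetical and it does not reflect how DB2 UDB stores nodegroup definitions.

```python
from typing import List

# Nodegroup number -> database partitions it contains, as described for Figure 8.
nodegroups = {
    1: {1, 2, 3, 4},
    2: {1},
    3: {4, 5},
}


def nodegroups_of(partition: int) -> List[int]:
    """Return the nodegroups that contain the given database partition."""
    return sorted(ng for ng, members in nodegroups.items() if partition in members)


all_partitions = set().union(*nodegroups.values())

# Partitions that belong to more than one nodegroup.
shared = {
    p: nodegroups_of(p)
    for p in sorted(all_partitions)
    if len(nodegroups_of(p)) > 1
}

print(sorted(all_partitions), shared)
```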

    Buffer Pools

    A buffer pool is the main memory allocated in the host processor to cache table and index data pages as they are being read from disk, or being modified. The purpose of the buffer pool is to improve system

    performance. Data can be accessed much faster from memory than from disk; therefore, the fewer times

    the database manager needs to read from or write to disk (I/O) the better the performance. Buffer pools are

    created by database partitions and each partition can have multiple buffer pools.

    Tables

    The primary database object is the table. A table is a named data object consisting of a specific number of columns and a variable number of rows. Tables are uniquely identified units of storage maintained within a DB2 table space. They consist of a series of logically linked blocks of storage that have been given the same name. They also have a unique structure for storing information that permits that information to be related to information in other tables.

    When creating a table, you can choose to have certain objects, such as indexes, stored separately from the rest of the table data. To do this, the table must be defined in a DMS (database-managed space) table space.

    Table Spaces

    A database is logically organized into table spaces. A table space is a place to store tables. The table space

    is where the database is defined to use the disk storage subsystem. One method to spread a table space over

    one or more physical storage devices is to simply specify multiple containers.


    There are three main types of user table spaces: regular, temporary, and long. In addition to these user-

    defined table spaces, DB2 also defines separate system and catalog table spaces. For partitioned database

    environments, the catalog table space resides on the catalog database partition.

    System-Managed versus Database-Managed Table Spaces

    For partitioned databases, the table spaces can reside in nodegroups. During the create table space command, the containers themselves are assigned to a specific database partition in the nodegroup, thus maintaining the "shared nothing" character of DB2 UDB. Table spaces can be either system-managed space (SMS) or database-managed space (DMS). For an SMS table space, each container is a directory in the file system, and the operating system's file manager controls the storage space. For a DMS table space, each container is either a fixed-size pre-allocated file or a physical volume, and the database manager controls the storage space itself.

    Containers

    A container is an allocation of physical storage. It is a way to define the device that will be made available

    for storing database objects. Containers may be assigned to file systems by specifying a directory. Such

    containers are identified as PATH containers and are used with SMS table spaces. Containers may also

    reference files that reside within a directory. These are identified as FILE containers, and a specific size

    must be identified. FILE containers are only used with DMS file table spaces. Containers may also

    reference raw character devices. These containers are used by DMS raw table spaces and are identified as

    DEVICE containers. Note that the device must already exist on the system before the container can be

    used. In all cases, containers must be unique and can belong to only one table space.

    Pages

    Data is transferred to and from devices in discrete blocks called pages, which are buffered in memory. DB2

    UDB supports various page sizes including 4 KB, 8 KB, 16 KB and 32 KB. When an application accesses

    data randomly, the page size determines the amount of data transferred. In other words, it will correspond

    to the data transfer request size to the disk array. Page size determines the maximum length of a row, and

    is associated with the maximum size of a table space. These limits are shown in Table 1. In all cases DB2

    UDB limits the number of data rows on a single page to 255 rows.

    Table 1. Page Size Limits

    Page Size   Max Table Space Size   Max Row Length
    4 KB        64 GB                  4005 B
    8 KB        128 GB                 8101 B
    16 KB       256 GB                 16293 B
    32 KB       512 GB                 32677 B
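The limits in Table 1 lend themselves to a small lookup. The sketch below encodes the table and picks the smallest page size that can hold a given row length; the helper names are hypothetical. It also shows, as an observation derived from the table rather than a statement made in this paper, that every row of Table 1 corresponds to the same maximum number of pages per table space.

```python
# Page size (KB) -> (max table space size in GB, max row length in bytes), from Table 1.
PAGE_LIMITS = {
    4: (64, 4005),
    8: (128, 8101),
    16: (256, 16293),
    32: (512, 32677),
}
MAX_ROWS_PER_PAGE = 255   # per-page row limit stated in the text


def max_pages(page_kb):
    """Maximum number of pages in a table space for the given page size."""
    max_table_space_gb, _ = PAGE_LIMITS[page_kb]
    return max_table_space_gb * 1024 * 1024 // page_kb


def smallest_page_for_row(row_bytes):
    """Smallest page size (KB) whose maximum row length can hold the row."""
    for page_kb in sorted(PAGE_LIMITS):
        if row_bytes <= PAGE_LIMITS[page_kb][1]:
            return page_kb
    raise ValueError("row is longer than the largest supported row length")


pages_per_table_space = {kb: max_pages(kb) for kb in PAGE_LIMITS}
page_for_5000_byte_row = smallest_page_for_row(5000)

print(pages_per_table_space, page_for_5000_byte_row)
```

Because the page count ceiling is the same for every page size in the table, choosing a larger page size is what raises the maximum table space size.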

    Extents

    An extent is the unit at which space is allocated within a container of a table space for a single table space

    object. This allocation consists of multiple pages. The size of the extent is specified when the table space is

    created. Note that when data is written to a table space with multiple containers, the data is striped across all containers in extent-sized blocks.

    Prefetch Size

    The number of pages that the database manager will prefetch can be defined for each table space using the

    PREFETCHSIZE clause with either the CREATE TABLESPACE or ALTER TABLESPACE statements.

    The value specified is maintained in the PREFETCHSIZE column of the SYSCAT.TABLESPACES

    system catalog table.


    Prefetching

    Prefetching is a technique for anticipating data needs and reading ahead from storage in large blocks. By

    transferring data in larger blocks, fewer system resources are expended and less total time is required.

    Sequential prefetches read consecutive pages into the buffer pool before they are needed by DB2. List

    prefetches are more complex. In this case, the DB2 optimizer optimizes the retrieval of randomly located

    data.

    The amount of data being prefetched is part of what determines the amount of parallel I/O activity. Ordinarily the database administrator should define a prefetch value large enough to allow parallel use of all of the available containers, and therefore all of the array's physical disks.

    Consider the following example:

    • A table space is defined with a page size of 16 KB using raw DMS.
    • The table space is defined across four containers, and each container resides on a separate logical disk, and each logical disk resides on a separate RAID array.
    • The extent size is defined as 16 pages (or 256 KB).
    • The prefetch value is specified as 64 pages (number of containers x extent size).

    Suppose a user issued a query that results in a table space scan, which then results in DB2 performing a prefetch operation. The following would happen:

    • DB2 UDB would recognize that this prefetch request for 64 pages (a megabyte) evenly spans four containers, and would issue four parallel I/O requests, one against each of those containers. The request size to each container would be 16 pages, or 256 KB.
    • The AIX Logical Volume Manager would divide the 256 KB request to each AIX logical volume into smaller units (128 KB is the largest), and pass them on to the array as back-to-back requests against each logical disk.
    • An array receives a request for 128 KB; if the data is not in cache, four arrays would operate in parallel to retrieve the data.
    • After receiving several of these requests, the array would recognize that these DB2 UDB prefetch requests are arriving as sequential accesses, causing the array sequential prefetch to take effect.
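The numbers in this example follow from a few multiplications, sketched below. The function and key names are hypothetical, and the 128 KB figure is simply the largest LVM request unit quoted in the example, passed in as a parameter rather than treated as a fixed property of AIX.

```python
def prefetch_plan(page_kb, containers, extent_pages, lvm_max_request_kb):
    """Break a prefetch request down the way the example above describes."""
    prefetch_pages = containers * extent_pages       # one extent per container
    prefetch_kb = prefetch_pages * page_kb
    per_container_kb = extent_pages * page_kb        # size of each parallel request
    # Ceiling division: number of LVM-sized pieces per container request.
    lvm_requests = -(-per_container_kb // lvm_max_request_kb)
    return {
        "prefetch_pages": prefetch_pages,
        "prefetch_kb": prefetch_kb,
        "per_container_kb": per_container_kb,
        "lvm_requests_per_container": lvm_requests,
        "total_array_requests": containers * lvm_requests,
    }


plan = prefetch_plan(page_kb=16, containers=4, extent_pages=16, lvm_max_request_kb=128)
print(plan)
```

Changing `containers` or `extent_pages` shows how the prefetch value has to grow with the layout if every container is to receive a full extent in each prefetch.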

    Page Cleaners

    Page cleaners write dirty pages from the buffer pool to disk, reducing the chance that agents looking for

    victim buffer pool slots in memory will have to incur the cost of writing dirty pages to disk. For example, if you have updated a large amount of data in a table, many data pages in the buffer pool may be updated but not written into disk storage (these pages are called dirty pages). Since agents cannot place fetched data pages into the dirty pages in the buffer pool, these dirty pages must be flushed to disk storage before their

    buffer pool memory can be used for other data pages.


    Configuring a Symmetrix System

    Several factors affect database performance and should be considered when configuring a Symmetrix system. Some examples are:

    • The size of device required by the database
    • The number of physical disk spindles that will service DB2 UDB to provide a device of the size required by the database
    • The effect (if any) on the positioning schema of the maximum number of hypervolumes allowed within a Symmetrix system

    Creating Metavolumes

    When creating metavolumes for a database server, several factors must be considered in order to ensure

    reasonable performance.

    Metavolume Size

    A metavolume's size depends on two factors: hypervolume size and the number of hypers per meta. The

    value for either of these factors can affect the overall metavolume performance.

    Hypervolume Size

    If you use a hyper size that is too large, you may not reach the six to ten desired spindles per CPU typically

    recommended by DB2 UDB for your server. For example, if the hypervolume size is 15 GB and the

    sought-after metavolume size is 30 GB, then only two physical disks (one per hypervolume) are required.

    Even if DB2 UDB uses multiple containers created out of these devices, only two physical disks will be

    servicing the requests. However, if a hypervolume size of 5 GB is used, and all six hypervolumes are

    placed on different physical disks, then there will be six physical disks servicing the requests.
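
    The relationship between hypervolume size and the number of spindles behind a metavolume can be sketched as follows. This is a minimal illustration that assumes, as the example does, that every hypervolume of a metavolume sits on its own physical disk; the function name is hypothetical.

```python
import math


def disks_servicing_meta(meta_size_gb, hyper_size_gb):
    """Hypervolumes (and, under the one-hyper-per-disk assumption,
    physical disks) needed to build a metavolume of the given size."""
    return math.ceil(meta_size_gb / hyper_size_gb)


# The 30 GB metavolume from the text, built from 15 GB and 5 GB hypers.
for hyper_gb in (15, 5):
    disks = disks_servicing_meta(30, hyper_gb)
    print(f"{hyper_gb} GB hypers -> {disks} physical disks")
```

    Smaller hypers raise the spindle count for the same metavolume size, which is the trade-off against the logical volume limit discussed next.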

    However, if your hypervolume size is too small, it is possible to reach the maximum number of logical

    volumes allowed within a Symmetrix system. The equation in Formula 3 describes how to calculate the

    maximum number of physical disks that can be partitioned before reaching this limit. Formula 3 assumes

    each metavolume will be created using the same hypervolume size and number of hypers per meta.

    maximum # of disks = (maximum # of volumes - (maximum # of volumes / hypers per meta)) / floor(physical disk size / hypervolume size)

    Formula 3. Maximum Number of Disks

    The rule of thumb for hypervolume size is 9 GB. This value is also easily divisible into common

    Symmetrix disk sizes.

    Hypers per Meta

    Although this does not explicitly affect performance, too many hypers per meta may increase the chances

    of wasting disk space. This can only occur when physical disks are dedicated to the database server. In

    this case, it is possible to meet the database disk space requirements without fully allocating the underlying

    physical disks.

    If you use too few hypers per meta, you may not exploit the full performance potential of your Symmetrix

    system, since the underlying disks may not be able to fully parallelize your transactions. The suggested

    starting point is to create a metavolume using four hypers per meta.

    Finally, four hypers per meta combined with a hyper size of 9 GB will produce a 36 GB metavolume. A 36

    GB device is typically large enough without becoming unmanageable.
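
    Formula 3 can be evaluated with a short sketch. It assumes the formula reads as (maximum # of volumes minus maximum # of volumes divided by hypers per meta), divided by floor(physical disk size / hypervolume size). The volume limit of 4096 and the 73 GB disk size are hypothetical inputs chosen only for illustration; check the actual limit for your Symmetrix model.

```python
import math


def max_disks(max_volumes, hypers_per_meta, disk_size_gb, hyper_size_gb):
    """Evaluate Formula 3 under the reading stated in the lead-in."""
    usable_volumes = max_volumes - max_volumes / hypers_per_meta
    hypers_per_disk = math.floor(disk_size_gb / hyper_size_gb)
    return usable_volumes / hypers_per_disk


# Hypothetical inputs: 4096-volume limit, 4 hypers per meta,
# 73 GB physical disks, 9 GB hypervolumes (the rule-of-thumb size).
print(max_disks(4096, 4, 73, 9))
```

    With these inputs, eight 9 GB hypers fit on each 73 GB disk, so the limit is reached after 384 physical disks have been fully partitioned.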


    Hypervolumes versus Physical Disks

    The number of hypervolumes servicing a database system is not necessarily equivalent to the number of

    physical disks servicing that same system. Although each hypervolume within a metavolume is typically

    created upon a separate physical disk, those same physical disks can contain other hypervolumes. These

    hypervolumes can belong to other metavolumes servicing the same database system as well.

    Striped versus Concatenated Metavolumes

    A metavolume can be created in one of two ways: concatenated or striped. To achieve the best

    performance, it is recommended that striped metavolumes be used. In the

    concatenated case, some disk spindles can be left idle from lack of data or from data always being found on

    the same disk. Therefore, your system will not benefit from using all drive heads. Using striped

    metavolumes will increase the average number of drive heads servicing a request since it will be more

    likely that data being retrieved will be found on different underlying hypervolumes.

    Figure 9. Logical Data Placement on a Striped Metavolume

    Stripe Size

    Another consideration when defining metavolumes is stripe size. The minimum, and currently the default,

    stripe size is 960 KB. This minimum value is based on the size of two disk cylinders. It is possible to set

    this value higher, but it is recommended that the stripe size be left at the default for most systems. This

    allows for the highest likelihood of requested data being spread across more than one underlying

    hypervolume, thus minimizing the chance for a bottleneck to occur on only one resource.
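
    To make the effect of stripe size concrete, the sketch below maps a request on a striped metavolume to the member hypervolumes it touches. It assumes a simple round-robin layout of 960 KB stripes across four hypers; the actual Symmetrix placement is not described here, so treat the layout and the function names as illustrative assumptions.

```python
STRIPE_SIZE_KB = 960   # default stripe size from the text
HYPERS_PER_META = 4    # suggested starting point from the text


def hyper_for_offset(offset_kb, stripe_kb=STRIPE_SIZE_KB, hypers=HYPERS_PER_META):
    """0-based index of the member hyper holding this metavolume offset,
    assuming round-robin striping."""
    return (offset_kb // stripe_kb) % hypers


def hypers_touched(offset_kb, length_kb, stripe_kb=STRIPE_SIZE_KB,
                   hypers=HYPERS_PER_META):
    """Sorted list of member hypers that a request of length_kb touches."""
    first_stripe = offset_kb // stripe_kb
    last_stripe = (offset_kb + length_kb - 1) // stripe_kb
    return sorted({s % hypers for s in range(first_stripe, last_stripe + 1)})


print(hypers_touched(0, 128))     # a 128 KB request inside one stripe
print(hypers_touched(900, 128))   # a 128 KB request crossing a stripe boundary
print(hypers_touched(0, 3840))    # a full stripe width touches every hyper
```

    Under this assumed layout, a larger stripe size would make it less likely that a given request crosses onto a second hypervolume.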

    Channel Directors

    Another physical performance consideration occurs when connecting a Symmetrix system to the database

    server. The I/O cables should be spread across as many channel directors as possible. Each channel

    director has a tangible throughput limit. Therefore, spreading the cables across all available channel

    directors will decrease the likelihood of the channel directors becoming a bottleneck. Figures 10a and 10b

    demonstrate the difference between the recommended and the not recommended method for attaching the

    cables.

    Multiple Fibre Channel (FC) connections per physical server provide both performance and redundancy to

    the overall configuration. With current generation 2 Gb/s FC ports, DB2 UDB can be configured with two

    to four FC ports per physical server per attached Symmetrix system. (It is possible to configure more, but

    they would not normally provide additional value.)


    [Diagram labels in both figures: Server, I/O Cables, Channel Director, Symmetrix]

    Figure 10a. Two I/O Cables Connect to Two Channel Directors (Recommended)

    Figure 10b. Two I/O Cables Connect to One Channel Director (Not Recommended)


    Configuring the Operating System

    Multipathing with PowerPath

    EMC produces a software product, PowerPath, which enables multipathing for Symmetrix arrays and other

    storage systems. Multipathing can increase the overall throughput between storage and server by increasing

    the number of I/O channels available to the server to address a specific device. For more detailed

    instructions on multipathing with PowerPath, refer to the PowerPath product guide.

    There are several load balancing policy settings for PowerPath. Changing the policy can have a large impact

    on DB2 UDB performance. However, the default policy, Symmetrix Optimization, is generally best. DB2 UDB

    users on AIX must use PowerPath version 2.1.0 or higher in order to avoid a known performance defect.

    Operating System Logical Volume Striping

    For decision support systems, logical volumes at the operating system level should not be striped. The

    striping at the DB2 UDB container level and on the Symmetrix system is enough to exploit parallelism

    without compromising overall sequential detection. Adding additional layers of striping may cause data to

    be placed in a random order on the underlying physical disks, which could affect when sequential detection

    occurs. This is less of an issue on systems where the workload is generally random I/O.


    Configuring DB2 UDB

    Table Space Container Configurations

    There are many different ways the devices presented to a server by a Symmetrix system can be allocated for

    use by DB2 UDB. However, most schemas can be categorized into two major philosophies: shared nothing

    and shared everything. This discussion assumes each metavolume corresponds to a single file system on the

    database server.

    Shared Nothing

    The basic concept behind shared nothing is that resources are isolated for use by specific applications. For DB2

    UDB, this usually means isolating physical disks for use by a particular database partition or table space.

    Therefore, all metavolumes residing on a set of physical disks are used by a single database partition or table

    space. This must be done carefully as more than one metavolume can reside on a physical disk. When

    successful, each database partition or table space will have its own dedicated set of physical disks.

    Figure 11 shows an example of isolating physical disks at the database partition level. In this example, there

    are 16 physical disks. Each set of four physical disks has been divided into four metavolumes, as in Figure

    2. Therefore, a total of 16 separate metavolumes can be addressed by a server.

    For this example, we want to create two SMS table spaces for an imaginary database that has two database

    partitions. As Figure 11 shows, the file systems that are mounted on the top eight metavolumes will be

    assigned to database partition 1, while the file systems on the bottom eight metavolumes will be assigned to

    database partition 2. Thus, the underlying disks are isolated to be used exclusively by a specific database

    partition. This particular layout corresponds to the CREATE TABLESPACE statement presented in Figure

    12.

    Although not highlighted in the example, shared nothing is typically easier to configure and manage since

    the creation of numerous additional devices is not usually required. However, in some cases, performance

    under this configuration may not be optimal. When a system has a limited number of physical disks, sharing

    all the physical disks between all the DB2 UDB database partitions can improve performance.


    [Figure 11 diagram: 16 physical disks in four sets; each set of four disks holds four metavolumes. File systems /node1/meta_volume1 through /node1/meta_volume8 are mounted on Metavolumes 1 through 8, and /node2/meta_volume9 through /node2/meta_volume16 on Metavolumes 9 through 16.]

    Figure 11. 16 Physical Disks Arranged in a Shared Nothing Configuration for DB2 UDB

    CREATE TABLESPACE My_Tablespace
      PAGESIZE 16K
      MANAGED BY SYSTEM
      USING ('/node1/meta_volume1/My_Tablespace',
             '/node1/meta_volume2/My_Tablespace',
             '/node1/meta_volume3/My_Tablespace',
             '/node1/meta_volume4/My_Tablespace',
             '/node1/meta_volume5/My_Tablespace',
             '/node1/meta_volume6/My_Tablespace',
             '/node1/meta_volume7/My_Tablespace',
             '/node1/meta_volume8/My_Tablespace') ON NODE (1)
      USING ('/node2/meta_volume9/My_Tablespace',
             '/node2/meta_volume10/My_Tablespace',
             '/node2/meta_volume11/My_Tablespace',
             '/node2/meta_volume12/My_Tablespace',
             '/node2/meta_volume13/My_Tablespace',
             '/node2/meta_volume14/My_Tablespace',
             '/node2/meta_volume15/My_Tablespace',
             '/node2/meta_volume16/My_Tablespace') ON NODE (2)
      EXTENTSIZE 16
      PREFETCHSIZE 128;

    Figure 12. Example CREATE TABLESPACE Statement that Corresponds with Figure 11

    Shared Everything

    With shared everything, resources are not isolated for use. All applications should have access to all the

    resources. For DB2 UDB, this typically means all database partitions and table spaces will reside on all

    physical disks. Therefore, each physical disk will need to be addressable by each database partition. This

    can only be accomplished by creating at least one hypervolume per database partition on every physical

    disk. In addition, if you are planning on using DMS raw table space containers, a separate metavolume must

    be created for each table space in each database partition on every physical disk. You should notice how this

    design can quickly increase the number of devices that must be managed by your system administrator.

    Therefore, the chance of a possible performance gain should be weighed against the extra administrative

    costs.

    Figures 13 and 14 provide an example of creating a shared everything table space on a database with two

    database partitions. Note that the Symmetrix disk configuration for this example has the exact same layout

    as in the previous example for shared nothing (Figure 11). As before, 16 physical disks have been divided

    into 16 separate metavolumes. The difference is in how the metavolumes are addressed by the database

    server(s). Look closely at the ordering of the file system names used as containers for the two database

    partitions in Figure 14. Notice how each database partition in the create table space statement has access to

    each physical disk.
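
    The difference between the two container orderings can be sketched programmatically. The helper below is hypothetical; it assumes the layout used in both examples (16 metavolumes, four per set of physical disks, two database partitions) and the /nodeN/meta_volumeM naming shown in the CREATE TABLESPACE statements.

```python
def assign_metavolumes(layout, partitions=2, metavolumes=16, per_disk_set=4):
    """Return {partition: [metavolume numbers]} for either layout."""
    assignment = {p: [] for p in range(1, partitions + 1)}
    for meta in range(1, metavolumes + 1):
        if layout == "shared_nothing":
            # Consecutive blocks: whole disk sets belong to one partition.
            partition = (meta - 1) * partitions // metavolumes + 1
        elif layout == "shared_everything":
            # Every disk set is split among all partitions.
            slot = (meta - 1) % per_disk_set
            partition = slot * partitions // per_disk_set + 1
        else:
            raise ValueError(f"unknown layout: {layout}")
        assignment[partition].append(meta)
    return assignment


def container_paths(assignment, tablespace="My_Tablespace"):
    """Build the container path list for each partition."""
    return {
        p: [f"/node{p}/meta_volume{m}/{tablespace}" for m in metas]
        for p, metas in assignment.items()
    }


for layout in ("shared_nothing", "shared_everything"):
    print(layout, assign_metavolumes(layout))
```

    In the shared nothing case each partition owns two complete disk sets; in the shared everything case each partition has containers on all four disk sets.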


    [Figure 13 diagram: the same 16 physical disks and 16 metavolumes as in Figure 11. Within each set of four metavolumes, the first two (Metavolumes 1, 2, 5, 6, 9, 10, 13, and 14) carry file systems mounted under /node1, and the last two (Metavolumes 3, 4, 7, 8, 11, 12, 15, and 16) carry file systems mounted under /node2.]

    Figure 13. 16 Physical Disks Arranged in a Shared Everything Configuration for DB2 UDB

    CREATE TABLESPACE My_Tablespace
      PAGESIZE 16K
      MANAGED BY SYSTEM
      USING ('/node1/meta_volume1/My_Tablespace',
             '/node1/meta_volume2/My_Tablespace',
             '/node1/meta_volume5/My_Tablespace',
             '/node1/meta_volume6/My_Tablespace',
             '/node1/meta_volume9/My_Tablespace',
             '/node1/meta_volume10/My_Tablespace',
             '/node1/meta_volume13/My_Tablespace',
             '/node1/meta_volume14/My_Tablespace') ON NODE (1)
      USING ('/node2/meta_volume3/My_Tablespace',
             '/node2/meta_volume4/My_Tablespace',
             '/node2/meta_volume7/My_Tablespace',
             '/node2/meta_volume8/My_Tablespace',
             '/node2/meta_volume11/My_Tablespace',
             '/node2/meta_volume12/My_Tablespace',
             '/node2/meta_volume15/My_Tablespace',
             '/node2/meta_volume16/My_Tablespace') ON NODE (2)
      EXTENTSIZE 16
      PREFETCHSIZE 128;

    Figure 14. Example CREATE TABLESPACE Statement that Corresponds with Figure 13

    JBOD

    The final way to lay out Symmetrix physical disks is called JBOD (just a bunch of disks). In essence, it is

    another example of shared nothing, where physical disks are isolated for use. It is only possible to create a

    JBOD schema without the use of metavolumes. In this case, each hypervolume is presented to the server as

    a separate addressable device. These devices are then used as containers by various DB2 UDB table spaces.

    Basically, this is the same as the previous shared nothing schema, without the metavolume layer. When

    databases are less than 1 TB in size, removing the metavolume layer does provide an additional performance

    gain. However, if your database is larger, or has a chance of growing past 1 TB, do not use a JBOD

    schema.

    Table Space Configuration

    Extent Size

    Extent size is usually configured to be the same as the stripe width of the devices on which the table space

    resides. However, a typical stripe width for the Symmetrix system is 3840 KB (960 KB stripe size * 4

    hypers per meta), which is significantly larger than other like systems. Setting the extent size to the stripe

    width can actually impede performance; instead, the extent size should be configured around 256 KB.
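
    Because EXTENTSIZE in the example statements is given in pages, the 256 KB guideline has to be converted using the table space page size. The sketch below shows that conversion alongside the stripe width calculation; the function names are hypothetical.

```python
def stripe_width_kb(stripe_size_kb, hypers_per_meta):
    """Stripe width of a striped metavolume."""
    return stripe_size_kb * hypers_per_meta


def extent_size_pages(target_extent_kb, page_size_kb):
    """Extent size in pages for a target extent size in KB."""
    return target_extent_kb // page_size_kb


print(stripe_width_kb(960, 4))        # typical Symmetrix stripe width in KB
print(extent_size_pages(256, 16))     # pages per extent with 16 KB pages
print(extent_size_pages(256, 4))      # pages per extent with 4 KB pages
```

    With 16 KB pages the 256 KB target works out to the EXTENTSIZE 16 used in the example CREATE TABLESPACE statements.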

    Prefetch Size

    Prefetch size specifies how much data should be read into the buffer pool on a prefetch data request.

    Prefetching data can help queries avoid unnecessary page faults. Therefore, the value of the most efficient

    prefetch size for a table space is closely linked to its workload, and must be tuned on a per-system basis.

    However, a good starting point for a Symmetrix-based system is to multiply the number of containers in the

    table space by its extent size in KB, and then double it. This is twice the usual rule of thumb for prefetch

    size and is linked to the ability of the Symmetrix mirrored metavolumes to fulfill a read request from two

    separate physical disks.

    prefetchsize (KB) = extentsize (KB) * # of containers * 2

    Formula 4. Prefetch Size
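
    Formula 4 can be applied as in the sketch below. The input values (a 256 KB extent, four containers, 16 KB pages) are illustrative assumptions, and the function names are hypothetical; the result is only a starting point to be tuned against the actual workload.

```python
def prefetch_size_kb(extent_size_kb, containers):
    """Formula 4: starting-point prefetch size in KB."""
    return extent_size_kb * containers * 2


def kb_to_pages(size_kb, page_size_kb):
    """Convert a size in KB to a page count for the given page size."""
    return size_kb // page_size_kb


size_kb = prefetch_size_kb(extent_size_kb=256, containers=4)
print(size_kb, "KB =", kb_to_pages(size_kb, 16), "pages of 16 KB")
```

    For these inputs the starting point is 2048 KB, or 128 pages of 16 KB.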

    Note that prefetchsize is tunable after table space creation. This is not true for extent size and page size.

    These values are set at table space creation time and cannot be altered without re-defining the table space

    and re-loading its data.

    Overhead and Transfer Rate

    Two other parameters that relate to I/O performance can be configured for a table space: overhead and transfer

    rate. These parameters are used when making optimization decisions, and help determine the relative cost of

    random versus sequential accesses.

    Overhead provides an estimate (in milliseconds) of the time required by the container before any data is read

    into memory. This overhead activity includes the container's I/O-controller overhead, as well as the disk

    latency time, which includes the disk seek time.

    Transfer rate provides an estimate (in milliseconds) of the time required to read one page of data into

    memory.


    Table 2. Suggested Overhead and Transfer Rate Values

                                         Transfer Rate (ms) by Page Size
    Disk Capacity      Overhead (ms)     4 KB     8 KB     16 KB    32 KB
    36 GB 10K RPM          8.7           0.1      0.1      0.3      0.6
    50 GB 7200 RPM        11.6           0.1      0.1      0.3      0.6
    73 GB 10K RPM          8.6           0.1      0.1      0.2      0.5
    181 GB 7200 RPM       11.7           0.1      0.1      0.2      0.5
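
    For drives not listed in Table 2, the two values can be estimated from drive specifications. The sketch below uses the estimation formulas given in the DB2 UDB administration documentation (overhead as average seek time plus half the rotational latency; transfer rate derived from the drive's rated throughput). The seek time and throughput figures passed in are hypothetical inputs, not values taken from this paper.

```python
def overhead_ms(avg_seek_ms, rpm):
    """Estimated OVERHEAD: average seek time plus half a rotation."""
    rotational_latency_ms = (1 / rpm) * 60 * 1000
    return avg_seek_ms + 0.5 * rotational_latency_ms


def transfer_rate_ms(spec_rate_mb_per_s, page_size_bytes):
    """Estimated TRANSFERRATE: time to read one page at the rated throughput."""
    return (1 / spec_rate_mb_per_s) * 1000 / 1024000 * page_size_bytes


# Hypothetical 10K RPM drive: 5.7 ms average seek, 40 MB/s rated throughput.
print(round(overhead_ms(5.7, 10000), 1))
print(round(transfer_rate_ms(40, 4096), 1))
```

    These estimates describe a bare drive; the values in Table 2 are the ones suggested for Symmetrix devices and should take precedence where they apply.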

    Other Tuning Parameters

    I/O Servers

    The number of I/O servers configured for a database can also have a significant impact on performance. I/O

    servers are used on behalf of the database agents to perform I/O prefetches and asynchronous I/O for utilities

    such as backup and restore. This value, like prefetch size, depends on overall system workload. However, a

    good starting point for configuring I/O servers is to count the number of containers in the table space with

    the most containers, and multiply that number by two.
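
    The starting point for the number of I/O servers can be expressed as a one-line rule. The container counts below are hypothetical; in practice they would come from listing the containers of each table space.

```python
def suggested_io_servers(containers_per_tablespace):
    """Twice the container count of the table space with the most containers."""
    return 2 * max(containers_per_tablespace.values())


# Hypothetical container counts per table space.
counts = {"TABLESPACE1": 5, "TABLESPACE2": 8}
print(suggested_io_servers(counts))
```

    The result would then be applied through the database's I/O server configuration setting and adjusted according to observed workload.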

    DB2_PARALLEL_IO

    It is recommended that DB2_PARALLEL_IO be set to ON for all table spaces using containers created on

    RAID devices; Symmetrix striped metavolumes fall into this category. DB2_PARALLEL_IO allows

    for multiple reads and writes to occur on a single container, thus increasing throughput.

    Multipage File Allocation

    In an SMS table space, a file is extended one page at a time as the object grows. If you need improved insert

    performance, you can consider enabling multipage file allocation. This allows the system to allocate or

    extend the file by more than one page at a time. You must run db2empfa to enable multipage file

    allocation. In a partitioned database environment, run this utility on each database partition. Once multipage

    file allocation is enabled, it cannot be disabled.


    Understanding Existing Systems

    If your database system already exists, it is still possible to understand how the database system relates to

    the underlying disks and vice versa. This information can be vital when monitoring a system for

    performance. In addition, it can also help in isolating bottlenecks. Figure 15 gives a general view of the

    relationship between a DB2 UDB database and the Symmetrix system used for its storage, for an imaginary

    system. The following procedure walks through an example set of commands that can be used to

    determine the nature of the layout for a system. The example output corresponds to Figure 15 and the steps

    follow the diagram from right to left.

    Each step below lists the procedure step, the command (examples are for AIX), and example output (the

    output corresponds with Figure 15).

    1. Determine the number of database partitions.

       Command:

           db2 connect to (e.g., my_db)
           db2 list nodes
           db2 connect reset

       Example output:

           NODE NUMBER
           -----------
                     0
                     1

             2 record(s) selected.

    2. Determine which table spaces reside within a particular database partition and their corresponding table space ID value.

       Command:

           export DB2NODE= (e.g., 1)
           db2 connect to (e.g., my_db)
           db2 list tablespaces

       Note: Only table spaces with IDs 3 and 4 are shown in the diagram.

       Example output:

           Tablespaces for Current Database

            Tablespace ID                        = 0
            Name                                 = SYSCATSPACE
            Type                                 = System managed space
            Contents                             = Any data
            State                                = 0x0000
              Detailed explanation:
                Normal

            Tablespace ID                        = 1
            Name                                 = TEMPSPACE1
            Type                                 = System managed space
            Contents                             = System Temporary data
            State                                = 0x0000
              Detailed explanation:
                Normal

            Tablespace ID                        = 2
            Name                                 = USERSPACE1
            Type                                 = System managed space
            Contents                             = Any data
            State                                = 0x0000
              Detailed explanation:
                Normal

            Tablespace ID                        = 3
            Name                                 = TABLESPACE1
            Type                                 = System managed space
            Contents                             = Any data
            State                                = 0x0000
              Detailed explanation:
                Normal

            Tablespace ID                        = 4
            Name                                 = TABLESPACE2
            Type                                 = System managed space
            Contents                             = Any data
            State                                = 0x0000
              Detailed explanation:
                Normal

    3. Determine which containers belong to a table space specified using its table space ID.

       Command:

           db2 list tablespace containers for (e.g., 3)
           db2 connect reset
           export DB2NODE=

       Example output:

           Tablespace Containers for Tablespace 3

            Container ID                         = 0
            Name                                 = /my_fs0/tbspace
            Type                                 = Path

            Container ID                         = 1
            Name                                 = /my_fs1/tbspace
            Type                                 = Path

            Container ID                         = 2
            Name                                 = /my_fs2/tbspace
            Type                                 = Path

            Container ID                         = 3
            Name                                 = /my_fs3/tbspace
            Type                                 = Path

            Container ID                         = 4
            Name                                 = /my_fs4/tbspace
            Type                                 = Path


    4. Determine the logical volume in which the container name resides.

       Command:

           df (e.g., /my_fs2/tbspace)

       Example output:

           Filesystem      512-blocks      Free %Used    Iused %Iused Mounted on
           /dev/lv_myfs2     75497472  45462376   40%    17134     1% /my_fs2

    5. Determine the PowerPath physical volumes that make up the logical volume.

       Command:

           lslv -l (e.g., lv_myfs2)

       Example output:

           lv_myfs2:/my_fs2
           PV                COPIES        IN BAND       DISTRIBUTION
           hdiskpower0       288:000:000   20%           058:058:058:058:056
           hdiskpower1       288:000:000   20%           058:058:058:058:056
           hdiskpower2       288:000:000   20%           058:058:058:058:056
           hdiskpower3       288:000:000   20%           058:058:058:058:056

    6. Determine the Symmetrix volume ID for the meta head.

       Command:

           powermt display dev= (e.g., hdiskpower0)

       Example output:

           Pseudo name=hdiskpower0
           Symmetrix frame ID=000276901285; volume ID=0174
           state=alive; policy=SymmOpt; priority=0; queued-IOs=0
           ======================================================================
           --------- Host Devices -------- - Symm -  --- Path ----  -- Stats ---
           ###  HW-path   device           director  mode    state  q-IOs errors
           ======================================================================
             2  fscsi1    hdisk11          FA  3aA   active  open       0      0
             3  fscsi2    hdisk18          FA  4aA   active  open       0      0
             4  fscsi3    hdisk25          FA 13aA   active  open       0      0
             0  fscsi4    hdisk32          FA 14aA   active  open       0      0
             1  fscsi0    hdisk44          FA 13bA   active  open       0      0

    7. Determine the Symmetrix physical disks that make up the metavolume on which the hdiskpower resides.

       Command:

           symdev -DA ALL list | head -9
           symdev -DA ALL list | grep ""
           (e.g., hdiskpower0)

       Example output:

           Symmetrix ID: 000276901285

           Device Name               Directors          Device
           ------------------------- ------------------ -------------------------
                                                                              Cap
           Sym  Physical             SA :P DA :IT       Hyper Config          (MB)
           ------------------------- ------------------ -------------------------
           0177 /dev/rhdiskpower0    ???:? 01A:D5           1 2-Way Mir (m)     -
           0176 /dev/rhdiskpower0    ???:? 02A:D5           1 2-Way Mir (m)     -
           0175 /dev/rhdiskpower0    ???:? 15A:D5           1 2-Way Mir (m)     -
           0174 /dev/rhdiskpower0    13B:0 16A:D5           1 2-Way Mir (M) 37125
           0174 /dev/rhdiskpower0    13B:0 01B:C3           1 2-Way Mir (M) 37125
           0175 /dev/rhdiskpower0    ???:? 02B:C3           1 2-Way Mir (m)     -
           0176 /dev/rhdiskpower0    ???:? 15B:C3           1 2-Way Mir (m)     -
           0177 /dev/rhdiskpower0    ???:? 16B:C3           1 2-Way Mir (m)     -

    8. Determine which other physical volumes reside on a particular Symmetrix physical disk.

       Command:

           symdev -DA ALL list | head -9
           symdev -DA ALL list | grep " " | \
               sort -k 5
           (e.g., 01A:D5)

       Example output:

           Symmetrix ID: 000276901285

           Device Name               Directors          Device
           ------------------------- ------------------ -------------------------
                                                                              Cap
           Sym  Physical             SA :P DA :IT       Hyper Config          (MB)
           ------------------------- ------------------ -------------------------
           0177 /dev/rhdiskpower0    ???:? 01A:D5           1 2-Way Mir (m)     -
           0168 /dev/rhdiskpower22   13B:0 01A:D5           2 2-Way Mir (M) 24750
           01A7 /dev/rhdiskpower43   13B:0 01A:D5           3 2-Way Mir (m)  6188
           0198 /dev/rhdiskpower34   13B:0 01A:D5           4 2-Way Mir (M) 20625
           0117 /dev/rhdiskpower1    ???:? 01A:D5           5 2-Way Mir (m)     -
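
    When this procedure is repeated for many file systems, the intermediate output can be post-processed with a script. The sketch below extracts the PowerPath physical volume names from the step 5 example output. It works on the sample text only; how the output is captured on a live system, and the function name, are left as assumptions.

```python
# Sample text modeled on the step 5 example output.
LSLV_OUTPUT = """\
lv_myfs2:/my_fs2
PV                COPIES        IN BAND       DISTRIBUTION
hdiskpower0       288:000:000   20%           058:058:058:058:056
hdiskpower1       288:000:000   20%           058:058:058:058:056
hdiskpower2       288:000:000   20%           058:058:058:058:056
hdiskpower3       288:000:000   20%           058:058:058:058:056
"""


def physical_volumes(lslv_output):
    """Return the PV names listed after the two header lines."""
    pvs = []
    for line in lslv_output.splitlines()[2:]:
        fields = line.split()
        if fields:
            pvs.append(fields[0])
    return pvs


print(physical_volumes(LSLV_OUTPUT))
```

    Each name returned can then be used as the device argument in step 6.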


    Figure 15. Overall View of an Example Relationship between DB2 UDB and Symmetrix