

13 October 2008, 11:15 - 12:15

    Platform: DB2 UDB for Linux, UNIX, Windows

Sigen Chen, Lockheed Martin

    Session: D01

Bottlenecks Elimination in Real World DB2 Applications

    ABSTRACT

Database application performance for a given system (hardware and software) may be determined by application behavior, APIs, database design and layout, data size, and system configurations. This presentation covers some of these aspects, based on performance-improvement practice from real-world database applications. The focus is on understanding the application behavior; creating the right indexes; writing optimal queries and exploring the query features wisely; using appropriate APIs for a given requirement, not only at the programming language level, but also at the level of statement attributes such as cursor type, data type for binding, fetch orientation, and array options; practicing proactive maintenance to ensure optimal data layout and statistics; and tuning the key configuration parameters based on application behavior and system monitoring data. Troubleshooting examples and sample code segments are used to illustrate the practice. Performance-issue debugging and analysis are also included.

In short:

Presenting some experience from managing real-world DB2 databases

Sharing some performance data from database application benchmarking

Exercising some DB2 coding (API) options, out of curiosity, from a database application performance point of view


    Summary

Diagnosing the real database applications

Using DB2 native tools and system tools.

Creating the correct index

Adding the right indexes

Removing the unnecessary indexes.

Choosing the right API for a given job, i.e.,

Embedded, CLI, ADO/IBM Provider, ADO/MS Bridge, JDBC T2, Perl, and shell script.

Using the proper data type in SQLBindCol(), using array fetch/insert, the right cursor types, and the proper fetching/inserting APIs.

Tuning several key cfg parameters such as parallelism, avg_appls etc.; refining the options of maintenance tools.

1. Discussing how to identify the bottlenecks by analyzing the debugging data using system tools (vmstat, top, prstat, sar, pmap etc.), DB2 native tools (snapshot, event monitor, access plan, db2pd etc.), and profiling tools.

2. Showing how to collect and analyze the query access plan, and using the right indexes to reduce the cost of bottleneck queries.

3. Analyzing several commonly used DB2-supported APIs (Embedded SQL, CLI, JDBC, ADO, Perl, CLP) and their performance differences through our test data; comparing several fetch/insert orientations of CLI and statement attributes, and testing the performance.

4. Writing the most efficient queries, and using the query options, such as the blocking features, wisely. After all, the DBMS does exactly what the application (queries) asks it to do.

5. Understanding the application nature (OLTP, DSS, or mixed), and tuning the DBM and DB configuration parameters accordingly; maintaining the database proactively to ensure optimal database performance.


    Performance Factors

$, hardware infrastructure (CPU/memory/IO, network), BP, reasonable data layout

Application behavior

APIs (language, interface)

Database application design and data layout

Data size (response time vs. size)

System configurations

System maintenance (proactive vs. responsive)

What could affect a given database application system's performance?

- $ and HW infrastructure (CPU/memory/disk, network) are out of the scope of this presentation;

- It is also assumed that you have a reasonable BP hit ratio and data layout (TS, LOGs, striping).

For a given system (platform, HW, SW), there are things a DBA can do to improve performance:

- Understand the business objectives and the application's behavior: OLTP, DSS (DW), or mixed? Tune the system accordingly.

- Number of active applications. Is parallelism necessary?

- How are the applications implemented? C, Java, etc.

- What APIs are employed? One may not have control over all the languages and APIs used by applications, but a DBA does have control over maintenance programs and batch jobs.

- Disk layout and data distribution? Is HA involved? Is DPF involved?

- As data size grows, performance can be affected significantly (even exponentially); keep scalability in mind. Performance improvement can be a dynamic DBA task.

- Proactive maintenance: reorg, statistics, binding etc.

- Troubleshooting examples and some sample code segments are used to exemplify the proactive practice. Performance-issue debugging and analysis is also included.


    Performance Improvement Approaches

Understanding the application behavior

Writing optimal queries, exploring the query features wisely

Creating the necessary indexes

Using appropriate APIs for a given requirement

Programming language level

Statement attributes such as cursor type, data type for binding, fetch orientation, array options

Proactive maintenance to ensure optimal data layout and updated statistics

Tuning the key configuration parameters based on application behavior and system monitoring data.

A DBMS does just what the applications request it to do. Therefore, understanding the application behavior is the most important thing for maximizing the performance of a given system. (Occasionally a DBMS does not do what's expected; then it becomes a PMR issue.)

- Indexes can help most queries, but not always.

- Developers ought to optimize the queries, not just barely make them work.

- API

- Program level: choose the right language for your job: C, Java, Perl, or shell scripts.

- Coding level: data type, cursor type, fetch orientation, array option, blocking etc.

- Maintenance, as most DBAs do (backup, necessary reorg, update statistics, rebind, data integrity check).

- Does the database need a reorg? Data growth, insertion mode, online or offline?

- Do I have enough security on the LOGs (primary, mirror, archive)? How should the logs be distributed?

- What RUNSTATS option is best suited to my system?

- Configuration parameter settings (DBM CFG, DB CFG, and registry) based on benchmarking or stress tests


    Examples Summary

Approach: DB2 native tools + fundamental OS tools

Creating the correct index is the key (2~43x on multiple applications)

Choosing the right API for a given job is essential:

Embedded (1.00)

CLI (1.03)

ADO/IBM Provider (1.31)

ADO/MS Bridge (1.47)

JDBC T2 (1.56)

Shell script (4.80)

Using the proper data type (i.e., in SQLBindCol), the right cursor types, and the proper fetching/inserting APIs

Tuning based on application behavior (e.g., parallelism, avg_appls etc.) to resolve memory shortage, locking, response time; runstats options (e.g., had a 37x performance impact)

Brief summary of the data/examples showing the impact.

When troubleshooting an issue, where to start?

- Approach: the basic native tools are always a good place to start (CPU, memory, IO), then examine the snapshot data, event monitoring, and queries.

Some prefer to buy monitoring tools; make sure you understand how the data is collected and interpreted.

- If you find long-running (bottleneck) queries, analyze the access plan and focus on the most costly plan steps.

- Coding APIs: a business decision and a matter of developer skill set. The numbers in parentheses are relative response times in comparison; the smaller the better.

- Use the proper data type, an appropriate cursor type, and the right fetch orientation. Numbers in parentheses are relative execution times.

- Tuning is based on the application's behavior. Configuration parameters should be based on benchmarking tests.

- Ensure the DB has updated the necessary statistics and has an optimized access plan.


    Understand the Nature of Applications

OLTP or DSS or mixed

Possible limitations vs. tolerance

Example: parallelism (DFT_DEGREE, INTRA_PARALLEL, DFT_QUERYOPT, AVG_APPLS)

OLTP applications expect fast, near-instant responses; DSS applications may have complex queries or larger result sets. The expectations and tolerance may be different.

Configuration may need to take the application expectation into account.

                 OLTP    DSS
  Opt level      low     high
  AVG_APPLS      1       varies: depends on the number of complex-query
                         applications and the bufferpool size
  Parallelism    no      yes

------

DFT_DEGREE: 1 [ANY, -1, 1 - 32767] (CURRENT DEGREE)

MAX_QUERYDEGREE: -1 (ANY) [ANY, 1 - 32767]; the number of parallel operations within a database partition when the statement is executed

INTRA_PARALLEL: NO (0) [SYSTEM (-1), NO (0), YES (1)]; may require more FCM buffers

DFT_QUERYOPT: 5 [0 - 9]

AVG_APPLS: 1 or N; for using the bufferpool efficiently
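As a hedged illustration of applying these from the CLP for an OLTP-leaning system (the database name mydb is hypothetical; the values are the OLTP column above, not a recommendation):

  db2 update dbm cfg using INTRA_PARALLEL NO
  db2 update db cfg for mydb using DFT_DEGREE 1 DFT_QUERYOPT 5 AVG_APPLS 1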


    Example 1. AVG_APPLS

SQL10013N, could not load the library

Overall application performance improved 3~54%

The bottleneck query execution time (seconds) and CPU usage (%):

[Table: bottleneck query execution time (sec.) and CPU usage (%, 4-way Sun) for avg_appls=1 vs. avg_appls=5; the individual cell values did not survive extraction cleanly]

SQL10013N: The specified library "" could not be loaded.

In an OLTP application system, response time is essential. What would be your tolerable response time when you hit a button (or link)? Sub-second?

One wants to tune the system to run as quickly as possible, which means allowing an application to use all the available resources (the bufferpool in this case) and be done with it.

When an OLTP query takes several seconds or more, the user might just navigate away from the site. In some cases, that means potentially losing business.


    Example 2. Intra_parallel

Turning Intra_Parallel OFF freed up about 1.5 GB of real memory and 2 GB of swap memory in a 32-bit Sun/Solaris system, and saved the system from crashing

Disabling intra-parallelism improved some application performance by 2~5%

    Conclusion: choose the features wisely

Problem: the system crashed because swap memory was exhausted.

Parallelism is a great feature. However, would it help you?

How did I know it was intra_parallel=YES that caused the crash?

The error message suggested that no FCM request blocks were available (SQL6043C), and the number of FCM request blocks (FCM_NUM_RQB) could not be increased.

A 2 GB memory saving means a great deal on a 4-way (Sun V880) box.

An analogy for this: for a simple job that requires climbing a ladder, one person can do the job just fine. Two people would be crowded, and might cause a crash!


Writing Optimal Queries/Programs,

Exploring the Query Features Wisely

Too many to mention

A simple query example:

SELECT C1, Cx FROM T1 WHERE C1 IN (x, y) OPTIMIZE FOR 1000 ROWS

What is the expected result set?

Is the blocking necessary?

Local or n-tier system?

SELECT C1, Cx FROM T1 WHERE C1 IN (x, y) OPTIMIZE FOR 1000 ROWS

Even a simple query like the above requires careful coding: is the blocking really needed? What is the expected result set? Local database or remote? Too often we have seen such a clause show up in OLTP application queries, causing performance problems for users.


Example 3. Using result-set blocking vs. non-blocking under various APIs (Win2k-390 system, 100,000 rows)

             BLOCKING (OPTIMIZE FOR n ROWS)    NON-BLOCKING
  API        Stdev/ave    R.T.*                Stdev/ave    R.T.*
  ADO        0.04         1                    0.36         4.94
  JDBC T2    0.00         1                    0.46         1.93
  CLI        0.02         1                    0.74         6.49
  Embedded   0.03         1                    0.64         5.59

* R.T. = relative time against the same API used with blocking

Row blocking is a technique that reduces database manager overhead by retrieving a block of rows in a single operation. These rows are stored in a cache, and each FETCH request in the application gets the next row from the cache. When all the rows in a block have been processed, another block of rows is retrieved by the database manager.

Our test data, fetching 100,000 rows from a 10-column table (rs = 239 bytes, 84 rows per block) in a Win2k-zOS system, indicated that without blocking the results fluctuate more (the stdev-to-average ratio is higher) and are about 2-6 times slower than with blocking.

The cache is allocated when an application issues an OPEN CURSOR request and is deallocated when the cursor is closed. The size of the cache is determined by a configuration parameter that is used to allocate memory for the I/O block. Which database manager parameter is used depends on whether the client is local or remote:

For local applications, aslheapsz (default 15 x 4K) is used to allocate the cache for row blocking.

For remote applications, rqrioblk (default 32K) on the client workstation is used to allocate the cache for row blocking. The cache is allocated on the database client.

-- just in case someone wants to know how to determine the size:

Rows per block = aslheapsz * 4096 / rs   (local)
Rows per block = rqrioblk / rs           (remote)
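As a worked example with the defaults quoted above and our 239-byte row: a local application would get roughly 15 * 4096 / 239 ≈ 257 rows per block, and a remote client with rqrioblk = 32 KB roughly 32768 / 239 ≈ 137 rows per block. (Our test system reported 84 rows per block, which implies a smaller configured I/O block than these defaults.)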

(The BLOCKING precompile/bind option takes UNAMBIG, ALL, or NO.)

-- what if the query only returns a handful of records?

Blocking could then make the query response time longer, because the server tries to fill the block with the first N rows until it cannot get as many rows as specified.


    Example 4.1. Reuse the Statement via Parameter Markers

int main () {
  SQLHANDLE henv, hdbc, hstmt;
  SQLCHAR *sqlstmt = (SQLCHAR *) "INSERT INTO T1 (C2, C5) VALUES (?, ?)";
  SQLINTEGER *col2, lvalue;
  SQLCHAR *col5;
  int rc = 0, pass = 0;

  /* allocate henv, hdbc, connect to database */
  /* allocate statement handle */
  rc = SQLAllocHandle (SQL_HANDLE_STMT, hdbc, &hstmt);
  /* prepare the statement once */
  rc = SQLPrepare (hstmt, sqlstmt, SQL_NTS);
  /* assign values to the input variables */
  col2 = (SQLINTEGER *) malloc (sizeof (SQLINTEGER));  *col2 = 1;
  col5 = (SQLCHAR *) malloc (sizeof (char) * 100);
  strcpy ((char *) col5, "my 100 characters string, but could be shorter");
  /* bind the values to the parameter markers */
  rc = SQLBindParameter (hstmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_INTEGER,
                         0, 0, col2, sizeof (SQLINTEGER), &lvalue);
  rc = SQLBindParameter (hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR,
                         100, 0, col5, 100, NULL);
  /* execute the statement; assume that 100,000 rows are to be inserted into the table */
  while (pass++ < 100000) {
    /* set *col2 and col5 to the next row's values here */
    rc = SQLExecute (hstmt);
  }
  /* commit, free the handles, disconnect */
}


    Example 4.2. Reuse the Statement via Parameter Markers

int main () {
  /* allocate henv, hdbc, connect to database, allocate statement handle */
  /* prepare the statement once */
  rc = SQLPrepare (hstmt, sqlstmt, SQL_NTS);
  /* assign values to the input variables, bind the values to the parameter
     markers, then execute the statement; assume that 100,000 rows are to be
     inserted into the table */
  while (pass++ < 100000) {
    rc = SQLExecute (hstmt);
  }
  /* commit, free the handles, disconnect */
}
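For contrast, a minimal sketch of what the statement reuse above avoids (table T1 and the 100,000-row count are carried over from the example; everything else is illustrative). Building a literal SQL string per row forces a fresh prepare on every iteration:

  /* anti-pattern: no parameter markers, so every row pays a full prepare */
  while (pass++ < 100000) {
    char stmt[256];
    sprintf (stmt, "INSERT INTO T1 (C2, C5) VALUES (%d, 'row %d')", pass, pass);
    rc = SQLExecDirect (hstmt, (SQLCHAR *) stmt, SQL_NTS);  /* prepare + execute each time */
  }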


Using Appropriate APIs for a Given Requirement

Scenario: an ongoing batch job to set the document status for a list of docIDs passed in. Time is essential.

A shell script is meant to be interactive (input / invoke CLP / SQL / commit).

A programming language, such as C, may allow streamlining the logic, reusing the statements, more cursor manipulation options etc.

C : Perl : ksh(opt) : ksh(prim) = 1 : 3.76 : 302 : 1066

What is presented here is a simple update statement that needs to be executed frequently with a list of record IDs as input:

UPDATE table1 SET c1 = 'U' WHERE c2 IN (?)

What was needed was a streamlined program that processes the documents quickly and efficiently.

Efficiency is the key. The numbers were collected against a local database. No network traffic was involved; the difference is caused purely by the API difference.


Example 5. Several APIs' Performance Comparison on a Local Solaris System

[Chart: relative time by API: C = 1, Perl = 3.76, Ksh (opt) = 302, Ksh (prim) = 1066]

C : Perl : ksh(opt) : ksh(prim) = 1 : 3.76 : 302 : 1066

(50,000 records for testing, updating)

C (CLI): well written; prepare the statement once and reuse it.

Perl: prepare the statement once and reuse it; one more layer of interface.

Ksh (opt): auto-commit off, quiet, unnecessary print steps removed etc. (see the sketch below).

Ksh (prim): interactive, stdout I/O, redundancy, auto-commit on. This is how many people are likely to program: quick-and-dirty code that barely works.
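A minimal sketch of the two shell variants, under stated assumptions: the database name mydb, table table1, and input file docids.txt are hypothetical, and the CLP flags +c and +o turn off auto-commit and output respectively.

  #!/bin/ksh
  # "prim": connect, then one verbose CLP call per record, auto-commit on
  db2 connect to mydb
  while read id; do
    db2 "UPDATE table1 SET c1 = 'U' WHERE c2 = ${id}"
  done < docids.txt
  db2 connect reset

  # "opt": connect once, auto-commit off, quiet, one commit at the end
  db2 connect to mydb > /dev/null
  while read id; do
    db2 +c +o "UPDATE table1 SET c1 = 'U' WHERE c2 = ${id}"
  done < docids.txt
  db2 commit > /dev/null
  db2 connect reset > /dev/null

Even optimized this way, each db2 invocation is still a separate front-end process, which is why the C/CLI version with a reused prepared statement stays about 300x faster.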


Example 6. APIs' Performance in a Two-Tier DB2 Connect System

[Chart: relative time by API: Embedded 1, CLI 1.03, ADO/IBM Provider 1.31, ADO/MS Bridge 1.47, JDBC T2 Driver 1.56]

Notice that the numbers in this slide were collected in a 2-tier system using a composite workload (all kinds of SQL).

The comparison covers CLI, JDBC (driver type 2), ADO (using both the IBM OLE DB Provider for DB2 Server and the Microsoft OLE DB Bridge for ODBC Drivers), and static Embedded SQL in a Windows 2000 - zOS two-tier system. The DB2 Connect server was on the Windows 2000 application client.

If the time for Embedded SQL is normalized to 1.00, the performance sequence for fetching data using the various APIs (fastest to slowest) is Embedded SQL (1.00), CLI (1.03), ADO/IBM Provider (1.31), ADO/Microsoft Bridge (1.47), and JDBC (1.56). DB2 CLI is comparable to Embedded SQL! The IBM Provider outperformed the Microsoft Bridge. JDBC is just as expected.

The magnitude of the difference among APIs in the 2-tier system is smaller than in the local system. This could be because more factors come into play in a multi-tier system, such as the mainframe server being generally slower, and the data transfer between server and client.


Example 7. Performance of Three Fetch APIs with Different Data Types in Binding

[Chart: relative time (PDT = proper data type): SQLFetchScroll - PDT 0.89, SQLFetchScroll - SQL_C_CHAR 1, SQLBindCol - PDT 1, SQLBindCol - SQL_C_CHAR 1.38, SQLGetData - PDT 3.32, SQLGetData - SQL_C_CHAR 3.55]

For fetching data (10 cols x 200,000 rows in our test case), if the time for the typical SQLFetch/SQLBindCol() is normalized to 1.00, the performance sequence from fastest to slowest is:

                          Proper data type    SQL_C_CHAR in binding
  SQLFetchScroll          0.89                1
  SQLFetch/SQLBindCol     1                   1.38
  SQLGetData              3.32                3.55

Using the proper data type in binding is always better than using SQL_C_CHAR. Therefore, use the proper data type in binding, and use array fetch whenever possible.

Typically, an application may choose to allocate the maximum memory the column value could occupy and bind it via SQLBindCol(), based on information about a column in the result set (obtained via a call to SQLDescribeCol(), for example, or prior knowledge). However, in the case of character and binary data, the column can be arbitrarily long. If the length of the column value exceeds the length of the buffer the application can allocate or afford to allocate, a feature of SQLGetData() lets the application use repeated calls to obtain the value of a single column in more manageable pieces. This API may suit Java or GUI types of applications; the tradeoff is slower performance.
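A minimal sketch of the two binding styles for an INTEGER result column C1 (the statement handle and result set are assumed from context; error handling omitted):

  /* proper data type: bind C1 as a native long; no per-row text conversion */
  SQLINTEGER c1;
  SQLLEN c1_ind;
  rc = SQLBindCol (hstmt, 1, SQL_C_LONG, &c1, 0, &c1_ind);

  /* alternatively, SQL_C_CHAR: the same column fetched as a string;
     every row pays a numeric-to-character conversion */
  SQLCHAR c1_str[12];
  SQLLEN c1_str_ind;
  rc = SQLBindCol (hstmt, 1, SQL_C_CHAR, c1_str, sizeof (c1_str), &c1_str_ind);

  while ((rc = SQLFetch (hstmt)) != SQL_NO_DATA) {
    /* process the bound row */
  }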


Example 8. SQLFetch Orientation

[Chart: relative time by SQL_CURSOR type: FORWARD_ONLY 1, STATIC 1.2, KEYSET-DRIVEN 2.9]

Cursor Type and SQLFetchScroll()

In the above examples, the fetch was sequential, i.e., retrieving rows starting with the first row and ending with the last row. In that case, we know SQLFetchScroll() gives the best performance. What if an application needs to allow the user to scroll through a set of data both forwards and backwards? DB2 CLI has three types of scrollable cursors:

(1) Forward-only (default) cursor: can only scroll forward.

(2) Static read-only cursor: is static; once it is created, no rows will be added or removed, and no value in any row will change.

(3) Keyset-driven cursor: has the ability to detect changes to the underlying data, and the ability to use the cursor to make changes to the underlying data. A keyset-driven cursor will reflect the changed values in existing rows, and deleted rows, but it will not reflect added rows, because the set of rows is determined once, when the cursor is opened. It does not re-issue the SELECT statement to see whether new rows have been added that should be included.

To be able to scroll through the cursor back and forth, the cursor has to be defined as SQL_CURSOR_STATIC or SQL_CURSOR_KEYSET_DRIVEN. The position of the rowset within the result set can be specified as SQL_FETCH_NEXT, SQL_FETCH_FIRST, SQL_FETCH_LAST, SQL_FETCH_RELATIVE, SQL_FETCH_ABSOLUTE, SQL_FETCH_PRIOR, or SQL_FETCH_BOOKMARK in the SQLFetchScroll() call.

Performance impact

From the performance point of view, a static cursor involves the least overhead; if the application does not need the additional features of a keyset-driven cursor, then a static cursor should be used. If the application needs to detect changes to the underlying data, or needs to add, update, or delete data from the result set, then the keyset-driven cursor may be used. Also, if one needs to scroll the cursor back and forth, the cursor type needs to be set to SQL_CURSOR_STATIC; the default scrollable cursor type is SQL_CURSOR_FORWARD_ONLY. Comparing the performance of fetching data using STATIC and KEYSET-DRIVEN cursors with that using FORWARD_ONLY, we see that the static and keyset-driven cursors are 1.2 and 2.9 times slower, respectively, than the forward-only cursor. I.e., the features come with a cost.

An example of using the various cursor types in an array fetch with a specified fetch orientation follows (see next slide).


    Sample Code of Using Static Cursor

/* the cursor type has to be specified via SQLSetStmtAttr() before SQLPrepare() */
rc = SQLSetStmtAttr (hstmt,
                     SQL_ATTR_CURSOR_TYPE,
                     (SQLPOINTER) SQL_CURSOR_STATIC,
                     0);
rc = SQLPrepare (hstmt, sqlstmt, SQL_NTS);
/* ... */
/* the fetch orientation may be specified in SQLFetchScroll() */
rc = SQLFetchScroll (hstmt, SQL_FETCH_FIRST, 0);
/* ... */

To be able to scroll through the cursor back and forth, the cursor has to be defined as

SQL_CURSOR_STATIC or
SQL_CURSOR_KEYSET_DRIVEN.

The position of the rowset within the result set can be specified as

SQL_FETCH_NEXT,
SQL_FETCH_FIRST,
SQL_FETCH_LAST,
SQL_FETCH_RELATIVE,
SQL_FETCH_ABSOLUTE,
SQL_FETCH_PRIOR, or
SQL_FETCH_BOOKMARK

in the SQLFetchScroll() call.

An example of using a STATIC or KEYSET_DRIVEN cursor would be similar to that illustrated in the sample code, except for defining the cursor type and specifying the fetch orientation.
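A minimal sketch of backward scrolling with a keyset-driven cursor (hstmt and sqlstmt as in the sample above; error handling omitted):

  /* keyset-driven cursor, set before SQLPrepare() */
  rc = SQLSetStmtAttr (hstmt, SQL_ATTR_CURSOR_TYPE,
                       (SQLPOINTER) SQL_CURSOR_KEYSET_DRIVEN, 0);
  rc = SQLPrepare (hstmt, sqlstmt, SQL_NTS);
  rc = SQLExecute (hstmt);

  /* position on the last row, then walk backwards */
  rc = SQLFetchScroll (hstmt, SQL_FETCH_LAST, 0);
  while (rc != SQL_NO_DATA) {
    /* process the bound row(s) */
    rc = SQLFetchScroll (hstmt, SQL_FETCH_PRIOR, 0);
  }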


Example 9. Insert APIs' Performance

[Chart: relative time by insert API: SQLBindParameter 1, SQLExtendedBind 0.85, Not Logged Initially 0.81, Array_Insert(100) 0.42, CHAINING 0.42, CLI USE_LOAD 0.36]

For inserting data, if the time for inserting 100,000 rows one at a time using SQLBindParameter() is normalized to 1.00, the performance sequence from fastest to slowest is:

CLI USE_LOAD (0.36): the CLI API invokes LOAD; for large data volumes.

CHAINING (0.42): referred to as CLI array input chaining. All SQLExecute() requests associated with a prepared statement will not be sent to the server until either the SQL_ATTR_CHAINING_END statement attribute is set, or the available buffer space is consumed by rows that have been chained.

Array insert (0.42, size 100): inserting multiple rows per execute.

Row insert with NOT LOGGED INITIALLY activated (0.81): reduces the logging.

SQLExtendedBind (0.85): binds an array of columns; some restrictions apply.

SQLBindParameter (1.00): typical.

Had one only used single-row inserts via SQLBindParameter(), one would have missed a lot of the great options that CLI has to offer. When the array size is greater than 10, changing the size does not have a significant impact.

Reducing logging with the NOT LOGGED INITIALLY parameter.

SQLExtendedBind() can be used to replace multiple calls to SQLBindCol() or SQLBindParameter(); however, important differences should be noted.


    Typical Row Insert

rc = SQLBindParameter (hstmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_INTEGER,
                       0, 0, col1, sizeof (SQLINTEGER), &lvalue);
rc = SQLBindParameter (hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR,
                       100, 0, col2, 100, NULL);
/* execute the statement; assume that n (100,000) rows are to be inserted */
while (pass++ < n) {
  rc = SQLExecute (hstmt);
}


    Array Insert

/* just make up some values for columns Col1 and Col2 */
SQLINTEGER col1[100] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, /* ..., */ 100};
SQLCHAR col2[100][100] = {"A1", "B2", "C3", "D4", "E5",
                          "F6", "G7", "H8", "I9", "J10", /* ..., */ "Z100"};
/* set the array size, 100 for our sample code */
rc = SQLSetStmtAttr (hstmt, SQL_ATTR_PARAMSET_SIZE, (SQLPOINTER) 100, 0);
/* bind the values to the parameter markers, the same as before except that
   this time col1 and col2 are arrays */
rc = SQLBindParameter (hstmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_INTEGER,
                       0, 0, col1, 0, NULL);
rc = SQLBindParameter (hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR,
                       100, 0, col2, 100, NULL);
/* each SQLExecute() now inserts 100 rows; 1,000 passes x 100 rows = 100,000 rows */
while (pass++ < 1000) {
  rc = SQLExecute (hstmt);
}


    Chaining

/* ... */
rc = SQLSetStmtAttr (statement.hstmt,
                     SQL_ATTR_CHAINING_BEGIN,
                     (SQLPOINTER) TRUE,
                     0);
while (pass++ < 100000) {
  rc = SQLExecute (statement.hstmt);  /* requests are buffered, not yet sent */
}
/* send the chained requests to the server */
rc = SQLSetStmtAttr (statement.hstmt,
                     SQL_ATTR_CHAINING_END,
                     (SQLPOINTER) TRUE,
                     0);


    Use Load API

/* allocate henv, hdbc, connect to database, allocate statement handle,
   prepare the statement, assign values to the input variables, bind the
   values to the parameter markers */

/* begin to use load */
rc = SQLSetStmtAttr (hstmt, SQL_ATTR_USE_LOAD_API,
                     (SQLPOINTER) SQL_USE_LOAD_INSERT, 0);
/* execute the statement; assume we'd like to insert 100,000 rows into the table */
while (pass++ < 100000) {
  rc = SQLExecute (hstmt);
}
/* switch back to normal insert mode to end the load */
rc = SQLSetStmtAttr (hstmt, SQL_ATTR_USE_LOAD_API,
                     (SQLPOINTER) SQL_USE_LOAD_OFF, 0);


Create Necessary Indexes

Bottleneck queries first

Including stored procedures, triggers

Only those that are needed: indexes can help, but could also hurt

How do we know indexes are needed?

0. Identify the bottleneck queries: snapshot and event monitor data.

1. db2advis is a good tool to start with (see the sketch below).

2. Analyze the access plan, find the bottlenecks, and try to come up with an index to reduce the cost.

3. Test the index(es) created; ensure they improve the bottleneck queries without hurting other queries too much.
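A hedged illustration of step 1 (the database name mydb and workload file bottleneck.sql are hypothetical):

  db2advis -d mydb -i bottleneck.sql -t 5

db2advis recommends candidate indexes for the statements in the workload file (here with a 5-minute advise limit); the recommendations still need to be validated as in step 3.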


    Example 10. SQLs In The Procedures

Trigger on ICMUT01005001:
CREATE TRIGGER CML.TG03_ICMUT01005001 AFTER UPDATE OF ATTR0000001024 ON CML.ICMUT01005001 REFERENCING NEW AS NEW FOR EACH ROW MODE DB2SQL WHEN (UPPER(NEW.attr0000001024) NOT IN ('IC','CN') OR NEW.attr0000001024 IS NULL) BEGIN ATOMIC CALL CML.ICHG_QUE_PROC (NEW.ATTR0000001021, NEW.ATTR0000001024, NEW.ATTR0000001025); END

SP on the ICHG_QUE table:
CREATE PROCEDURE CML.ICHG_QUE_PROC (IN ATTR1021 CHARACTER(26), IN ATTR1024 CHARACTER(2), IN ATTR1025 TIMESTAMP) SPECIFIC CML.ICHG_QUE_PROC LANGUAGE SQL MODIFIES SQL DATA BEGIN DECLARE V_CNT INTEGER DEFAULT 0; SELECT COUNT(*) INTO V_CNT FROM CML.ICHG_QUE WHERE CML.ICHG_QUE.ATTR0000001021 = ATTR1021 WITH UR; IF V_CNT < 1 THEN INSERT INTO CML.ICHG_QUE (ATTR0000001021, ATTR0000001024, ATTR0000001025) VALUES (ATTR1021, ATTR1024, ATTR1025); END IF; END

No index on ATTR0000001021, which is the docID.

In some cases a bottleneck SQL statement may not be that obvious. For example, when you have triggers or stored procedure calls, you may need to examine the SQL inside them.

In the example above, a trigger is defined to call a procedure when a certain condition is met. The procedure contains an SQL statement counting something. Unfortunately, the column in the COUNT(*) statement's WHERE clause has no index defined, so a table scan was inevitable whenever there was a modification of the table attribute.

How many systems can afford a table scan?
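The fix was an index on the docID column; this is a sketch of the DDL, inferred from the plan on the next slide (which shows INDEX: CML.QUE1021 on QUE (attr1021)):

  CREATE INDEX CML.QUE1021 ON CML.ICHG_QUE (ATTR0000001021);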


An Index That Reduced The Cost

A table scan happened before the index addition:

    TBSCAN
    (   3)
    9539.24 (cost)
    2318 (IO)
      |
    474808
    TABLE: CML.QUE

An index scan after the index is added on QUE (attr1021):

    IXSCAN
    (   3)
    50.04 (cost)
    2 (IO)
      |
    477516
    INDEX: CML.QUE1021

Tests on the laboratory server and the production system indicated that this index addition increased performance by 230% using C/CLI, with a few thousand records in the table.

What if there were more than a few thousand records in the table?


Example 11. Where Should The Indexes Be?

Stmt: UPDATE CML.DocTab SET docType = 'X' WHERE docID = ? AND docType IN ('Y','Z')

docID is unique and docType is not; where should the index be?

Whichever has the higher cardinality.


An Index That May Hurt the Performance

What if an index is defined on docType?

[Access plan excerpts: with the additional index on docType, the update plan includes a FETCH over an IXSCAN of INDEX: CML.IndeX2 (cost 75.0417, 3 IO) against TABLE: CML.DocTab (1.28141e+06 rows), plus extra TBSCAN operations over a 2-row temp built from TABFNC: SYSIBM.GENROW. Before adding that index, the plan is a single IXSCAN of INDEX: CML.Index2 over TABLE: CML.DocTab.]

During examination of the query access plan, it was noticed that dropping an unnecessary index eliminated three extra operations on temp tables for the update SQL statement, and further improved the performance by nearly 40 times (60 minutes of work updating 50k rows completed in 1.5 minutes).

Why? docType has low cardinality.

Stmt: UPDATE CML.DocTab SET docType = 'DR' WHERE docID = ? AND docType IN ('CN','IC')

Choose an index on the column(s) with higher cardinality (i.e., docID).
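A hedged sketch of the resulting DDL (the index names are hypothetical; Index2 stands in for the low-cardinality docType index from the plan above):

  -- keep (or add) the unique index on the high-cardinality column
  CREATE UNIQUE INDEX CML.IX_DOCID ON CML.DocTab (docID);
  -- drop the low-cardinality index that triggered the GENROW temp-table operations
  DROP INDEX CML.Index2;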


Example 12. APIs +/- Index Effect

Right indexing (adding what is needed, removing the unnecessary), plus proper APIs, made a 466x performance gain.

In the figures above, the index effect made the API effects appear small; however, you are still looking at double/triple/quadruple differences among the APIs.


Time Saved (Indexes + APIs)

[Chart: hours needed per year for the proactive optimization process: existing code 584; optimized, 1st year including 40 hrs of coding effort, 1+40 = 41; optimized, subsequent years, 1]

Considering the ongoing maintenance, each site may process as many as 2~3 million records per year. It would take the original ksh script 584 hours, or the third party's legacy program 1,368 hours, to complete the job. The optimized approach can complete the job in 1.3 hours.

Taking the first year's 40 hours of effort spent optimizing the methods into account, the first year's hours for marking documents were reduced from 584 (ksh script) to 41; this represents a net first-year savings of 543 hours per site. Subsequent years' net savings would be 583 hours per site. There are 7 (N) such sites in our program.

The points are:

Use the appropriate API for the right job. For example, C/CLI is much faster than a ksh script for batch processing of many records.

Create indexes wisely, i.e., add a necessary index or drop an unnecessary one.

Some legacy code has had patches upon patches; would it be worth rewriting the core pieces of the code?


    Proactive Maintenance

    Reorg (online vs offline)

    Append_mode (online insertion)

    Runstats (various options)

    Monitor switches - do they need to be on?

When you have taken care of the indexes, bufferpools, cfg parameters, logs, sort, APIs etc., what else would you do?

How about a stress test to push the system to a level where potential bottlenecks may become apparent?

How about proactive maintenance?

Does your database need a reorg (reorgchk)? Do I have the time and resources to reorg?

How often do I need to update statistics?

Is there a need to leave the monitor switches on?


Example 13. APPEND_MODE

              import    UPDATE    SELECT    INSERT/select    DELETE
  ON vs OFF   -29.87    0.75      0.06      -75.47           0.04
  (diff %)

Online page reorganization has its pros and cons.

Turning append mode ON helps insert performance; however, a nightly or weekly reorg is needed.

When APPEND_MODE is set to ON, new rows are always appended to the end of the table. No searching or maintenance of FSCRs (Free Space Control Records) takes place. This option is enabled using the ALTER TABLE ... APPEND ON statement, and can improve performance for tables that only grow, like journals.

A performance test is needed to verify this, because it does cause a slight performance degradation on SELECT statements.
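A minimal sketch of the pairing described above (the journal table name is hypothetical):

  -- enable append mode for an insert-mostly journal table
  ALTER TABLE CML.JOURNAL APPEND ON;
  -- pair it with a periodic reorg, since FSCR-based space reuse is off
  REORG TABLE CML.JOURNAL;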


Example 14.1 Runstats Options Effect

[Chart: relative time: DEFAULT 38, Detailed 1]

Detailed runstats option:

NUM_FREQVALUES from 10 to 100, NUM_QUANTILES from 20 to 200

Warning: performance tests are needed to validate that the option change will help your applications.

This is a case of improving a data validation utility (mostly SELECT queries).

RUNSTATS ON TABLE schema.OBJECTS ON ALL COLUMNS WITH DISTRIBUTION ON KEY COLUMNS DEFAULT NUM_FREQVALUES 100 NUM_QUANTILES 200 AND DETAILED INDEXES ALL ALLOW WRITE ACCESS;

NUM_FREQVALUES

Defines the maximum number of frequency values to collect. It can be specified for an individual column in the ON COLUMNS clause. If the value is not specified for an individual column, the frequency limit value will be picked up from that specified in the DEFAULT clause. If it is not specified there either, the maximum number of frequency values to be collected will be what is set in the NUM_FREQVALUES database configuration parameter.

Current value: number of frequent values retained (NUM_FREQVALUES) = 10

The "most frequent value" statistics help the optimizer understand the distribution of data values within a column. A higher value results in more information being available to the SQL optimizer, but requires additional catalog space. When 0 is specified, no frequent-value statistics are retained, even if you request that distribution statistics be collected.

NUM_QUANTILES

Defines the maximum number of distribution quantile values to collect. It can be specified for an individual column in the ON COLUMNS clause. If the value is not specified for an individual column, the quantile limit value will be picked up from that specified in the DEFAULT clause. If it is not specified there either, the maximum number of quantile values to be collected will be what is set in the NUM_QUANTILES database configuration parameter.

Current value: number of quantiles retained (NUM_QUANTILES) = 20

The "quantile" statistics help the optimizer understand the distribution of data values within a column. A higher value results in more information being available to the SQL optimizer, but requires additional catalog space. When 0 or 1 is specified, no quantile statistics are retained, even if you request that distribution statistics be collected.

Increasing the value of these two parameters increases the amount of statistics heap (STAT_HEAP_SZ) used when collecting statistics. The default statistics heap size (STAT_HEAP_SZ) is 4384 (4 KB pages). You may have to increase this configuration parameter.
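A hedged one-liner for that last point (the database name mydb and the new value are hypothetical; size it from your own monitoring):

  db2 update db cfg for mydb using STAT_HEAP_SZ 8192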


    Example 14.2 RUNSTATS CMD

RUNSTATS ON TABLE RMADMIN.RMOBJECTS ON ALL COLUMNS
  WITH DISTRIBUTION ON KEY COLUMNS DEFAULT
  NUM_FREQVALUES 100
  NUM_QUANTILES 200
  AND DETAILED INDEXES ALL ALLOW WRITE ACCESS;

Default values: num_freqvalues = 10, num_quantiles = 20


How To Identify A Bottleneck?

Collect and analyze the debug data using basic system tools (vmstat, top, prstat, sar, pmap, iostat etc.), DB2 native tools (snapshot, event monitor, access plan, db2pd, db2advis etc.), and profiling tools if needed (see the sketch below).

Query access plan: use the right indexes to reduce the cost of bottleneck queries.

Explore the API features based on your needs: the DB2-supported APIs (Embedded SQL, CLI, JDBC, ADO, Perl, CLP) and their performance differences; fetch/insert orientations; statement attributes.

Use the query options wisely, such as the blocking features, and parameter markers to reuse a statement that is called repeatedly. The DBMS does exactly what the application (queries) asks it to do.

Understand the application nature (OLTP, DSS, or mixed), and tune the DBM and DB configuration parameters accordingly.

Maintain the database proactively to ensure optimal database performance.

Could the bottleneck identification and elimination be automated? Is there anyone interested in writing a program that can automatically identify performance bottlenecks and eliminate them? Stay tuned.
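A hedged starting point for the data-collection step (the database name mydb is hypothetical; the statement switch must be on before the snapshot contains statement data):

  db2 update monitor switches using statement on
  (run the workload for a while, then:)
  db2 get snapshot for dynamic sql on mydb
  db2pd -db mydb -dynamic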


    Sigen Chen

    Lockheed Martin

    Baltimore, Maryland USA

    [email protected]

    Session D01

    Bottlenecks Elimination in Real World DB2 Applications