TRANSCRIPT
-
8/4/2019 EU08D01
13 October 2008, 11:15-12:15
Platform: DB2 UDB for Linux, UNIX, Windows
Sigen Chen, Lockheed Martin
Session: D01
Bottlenecks Elimination in Real World DB2 Applications
ABSTRACT
Database application performance on a given system (hardware and software)
is determined by application behavior, APIs, database design and layout,
data size, and system configuration. This presentation covers these aspects
based on performance-improvement practice from real-world database
applications. The focus is on understanding application behavior;
creating the right indexes; writing optimal queries and exploring the query features
wisely; using appropriate APIs for a given requirement, not only at the
programming-language level, but also at the level of statement attributes such as
cursor type, data type for binding, fetch orientation, and array options; practicing
proactive maintenance to ensure optimal data layout and statistics; and tuning the
key configuration parameters based on application behavior and system
monitoring data. Troubleshooting examples and sample code segments are
used to illustrate the practice. Performance-issue debugging and analysis is
also included.
In short:
Presenting some experience from managing real-world DB2 databases
Sharing some performance data from database application benchmarking
Exercising some DB2 coding (API) options, out of curiosity, from a database
application performance point of view
-
Summary
Diagnosing real database applications
Using DB2 native tools and system tools.
Creating the correct indexes
Adding the right indexes
Removing unnecessary indexes.
Choosing the right API for a given job, i.e.,
Embedded SQL, CLI, ADO/IBM Provider, ADO/MS Bridge, JDBC Type 2, Perl, and shell script.
Using the proper data type in SQLBindCol(), array fetch/insert, the right cursor types, and proper fetching/inserting APIs.
Tuning several key configuration parameters such as parallelism and avg_appls; refining the options of the maintenance tools.
1. Discussing how to identify bottlenecks by analyzing the debugging data using system tools (vmstat, top, prstat, sar, pmap, etc.), DB2 native tools (snapshot, event monitor, access plan, db2pd, etc.), and profiling tools.
2. Showing how to collect and analyze the query access plan, and how to use the right indexes to reduce the cost of bottleneck queries.
3. Analyzing several commonly used DB2-supported APIs (Embedded SQL, CLI, JDBC, ADO, Perl, CLP) and their performance differences through our test data; comparing several fetch/insert orientations of CLI and statement attributes, and testing the performance.
4. Writing the most efficient queries, and using the query options wisely, such as the blocking features. After all, the DBMS does exactly what the application (queries) asks it to do.
5. Understanding the application's nature (OLTP, DSS, or mixed), and tuning the DBM and DB configuration parameters accordingly; maintaining the database proactively to ensure optimal database performance.
-
Performance Factors
$, hardware infrastructure (CPU/memory/I/O, network), bufferpools, reasonable data layout
Application behavior
APIs (language, interface)
Database application design and data layout
Data size (response time vs. size)
System configuration
System maintenance (proactive vs. reactive)
What could affect a given database application system's performance?
- $ and hardware infrastructure (CPU/memory/disk, network) are out of the scope of this presentation.
- It is also assumed that you have a reasonable bufferpool hit ratio and data layout (tablespaces, logs, striping).
For a given system (platform, hardware, software), there is plenty a DBA can do to improve performance:
- Understand the business objectives and the application's behavior: OLTP, DSS (DW), or mixed? Tune the system accordingly.
- Number of active applications: is parallelism necessary?
- How are the applications implemented? C, Java, etc.
- What APIs are employed? One may not have control over all the languages and APIs used by the applications, but a DBA does have control over maintenance programs and batch jobs.
- Disk layout and data distribution? Is HA involved? Is DPF involved?
- As data size grows, performance can be affected significantly (even exponentially); keep scalability in mind. Performance improvement is an ongoing DBA task.
- Proactive maintenance: reorg, statistics, binding, etc.
- Troubleshooting examples and sample code segments are used to illustrate the proactive practice. Performance-issue debugging and analysis is also included.
-
Performance Improvement Approaches
Understanding the application behavior
Writing optimal queries, exploring the query features wisely
Creating the necessary indexes
Using appropriate APIs for a given requirement
Programming-language level
Statement attributes such as cursor type, data type for binding, fetch orientation, array options
Proactive maintenance to ensure optimal data layout and updated statistics
Tuning the key configuration parameters based on application behavior and system monitoring data.
A DBMS is supposed to do just what the applications request it to do. Therefore, understanding the application behavior is the most important step toward maximizing performance on a given system. (Occasionally a DBMS does not do what is expected; that then becomes a PMR issue.)
- Indexes can help most queries, but not always.
- Developers ought to optimize their queries, not just barely make them work.
- APIs
- Program level: choose the right language for your job (C, Java, Perl, or shell scripts)
- Coding level: data type, cursor type, fetch orientation, array options, blocking, etc.
- Maintenance, as most DBAs do (backup, necessary reorg, update statistics, rebind, data integrity checks).
- Does the database need a reorg? Data growth, insertion mode, online or offline?
- Do I have enough security on the logs (primary, mirror, archive)? How should the logs be distributed?
- Which RUNSTATS options are best suited to my system?
- Configuration parameter settings (DBM CFG, DB CFG, and registry) based on benchmarking or stress tests
-
Examples Summary
Approach: DB2 native tools + fundamental OS tools
Creating the correct index is the key (2~43x on multiple applications)
Choosing the right API for a given job is essential:
Embedded (1.00)
CLI (1.03)
ADO/IBM Provider (1.31)
ADO/MS Bridge (1.47)
JDBC T2 (1.56)
Shell script (4.80)
Using the proper data type (i.e., in SQLBindCol), the right cursor types, and proper fetching/inserting APIs
Tuning based on application behavior (e.g., parallelism, avg_appls, etc.) to resolve memory shortages, locking, and response time; runstats options (e.g., a 37x performance impact)
Brief summary of the data/examples showing the impact.
When troubleshooting an issue, where do you start?
- Approach: the basic native tools are always a good place to start (CPU, memory, I/O); then examine the snapshot data, event monitor data, and queries.
Some prefer to buy monitoring tools; make sure you understand how the data is collected and interpreted.
- If you find long-running queries (bottleneck queries), analyze the access plan and focus on the most costly plan steps.
- Coding APIs: a business decision and a matter of developer skill set. The numbers in parentheses are relative response times in comparison; the smaller the better.
- Use the proper data type, the appropriate cursor type, and the right fetch orientation. Numbers in parentheses are relative execution times.
- Tuning is based on application behavior. Configuration parameters should be based on benchmarking tests.
- Ensure the database has updated statistics and optimized access plans.
-
Understand the Nature of Applications
OLTP or DSS or mixed
Possible limitations vs. tolerance
Example: parallelism (DFT_DEGREE, INTRA_PARALLEL, DFT_QUERYOPT, AVG_APPLS)
OLTP applications expect fast, near-instant responses.
DSS applications may have complex queries or larger result sets. The expectations and tolerance may be different.
Configuration may need to take the application's expectations into account.
                  OLTP       DSS
Opt level         low        high
AVG_APPLS         1          varies: depends on the number of complex-query
                             applications and the bufferpool size
Parallelism       no         yes
------
DFT_DEGREE        1 [ANY, -1, 1 - 32 767] (CURRENT DEGREE)
MAX_QUERYDEGREE   -1 (ANY) [ANY, 1 - 32 767]
                  Number of parallel operations within a database partition when the statement is executed
INTRA_PARALLEL    NO (0) [SYSTEM (-1), NO (0), YES (1)]; may require more FCM buffers
DFT_QUERYOPT      5 [0 - 9]
AVG_APPLS         1 or N; use the bufferpool efficiently
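As a concrete illustration, the parameters above live in the DBM and DB configurations. A hypothetical CLP sketch follows; the database name SAMPLE and the values shown are placeholders, not recommendations, so benchmark before changing them:

```shell
# Instance-level (DBM CFG): intra-partition parallelism and its ceiling
db2 update dbm cfg using INTRA_PARALLEL NO MAX_QUERYDEGREE ANY
# Database-level (DB CFG): default degree, optimization class, average active apps
db2 update db cfg for SAMPLE using DFT_DEGREE 1 DFT_QUERYOPT 5 AVG_APPLS 1
# INTRA_PARALLEL changes take effect after an instance restart
db2stop && db2start
```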
-
Example 1. AVG_APPLS
SQL10013N, could not load the library. Overall application performance improved 3~54%.
The bottleneck query execution time (seconds) and CPU usage (%, 4-way Sun):

              Time (sec.)   CPU usage (%)
avg_appls=1   160.00        6
avg_appls=5   50            105
SQL10013N: The specified library "" could not be loaded.
In an OLTP application system, response time is essential. What would be your tolerable response time when hitting a button (or link)? Sub-second?
One wants to tune the system to run as quickly as possible, which means allowing an application to use all the available resources (the bufferpool in this case) and be done with it.
When an OLTP query takes several seconds or more, the user might just navigate away from the site. In some cases, that means potentially losing business.
-
Example 2. Intra_parallel
Turning INTRA_PARALLEL off freed up about 1.5 GB of real memory and 2 GB of swap memory in a 32-bit Sun/Solaris system, saving the system from crashing.
Disabling intra-parallelism improved some application performance by 2~5%.
Conclusion: choose the features wisely.
Problem: the system crashed because swap memory was exhausted.
Parallelism is a great feature. However, would it help you?
How did I know that INTRA_PARALLEL=YES caused the crash?
The error message suggested that no FCM request blocks were available (SQL6043C), and the number of FCM request blocks (FCM_NUM_RQB) could not be increased.
A 2 GB memory saving means a great deal on a 4-way (Sun V880) box.
An analogy: for a simple job that requires climbing a ladder, one person can do the job just fine. Two people would be crowded, and might cause a crash!
-
Writing optimal queries/programs,
exploring the query features wisely
Too many to mention
A simple query example:
Select C1, Cx from T1 where C1 in (x, y) optimize for 1000 rows
What is the expected result set?
Is the blocking necessary?
Local or n-tier system?
Select C1, Cx from T1 where C1 in (x, y) optimize for 1,000 rows
Even a simple query like the above requires careful coding: is the blocking really needed? What is the expected result set? Local database or remote? Too often we have seen such a clause show up in OLTP application queries, where it caused performance problems for users.
-
Example 3. Using result-set blocking vs. non-blocking under various APIs (Win2k-390 system, 100,000 rows)
API        Blocking (optimize for N rows)   Non-blocking
           R.T.*   Stdev/avg                R.T.*   Stdev/avg
ADO        1       0.04                     4.94    0.36
JDBC T2    1       0.00                     1.93    0.46
CLI        1       0.02                     6.49    0.74
Embedded   1       0.03                     5.59    0.64
*R.T. = relative time against the same API used
Row blocking is a technique that reduces database manager overhead by retrieving a block of rows in a single operation. These rows are stored in a cache, and each FETCH request in the application gets the next row from the cache. When all the rows in a block have been processed, another block of rows is retrieved by the database manager.
Our test data, fetching 100,000 rows from a 10-column table (row size = 239 bytes, 84 rows per block) in a Win2k-zOS system, indicated that without blocking, the results fluctuate (stdev vs. average is higher) and are about 2-6 times slower than with blocking.
The cache is allocated when an application issues an OPEN CURSOR request and is deallocated when the cursor is closed. The size of the cache is determined by a configuration parameter that is used to allocate memory for the I/O block. The database manager parameter used depends on whether the client is local or remote:
For local applications, aslheapsz (default 15 x 4K) is used to allocate the cache for row blocking.
For remote applications, rqrioblk (default 32K) on the client workstation is used to allocate the cache for row blocking. The cache is allocated on the database client.
-- just in case someone wants to know how to determine the size:
Rows per block = aslheapsz * 4096 / row size
Rows per block = rqrioblk / row size
(BLOCKING bind option values: UNAMBIG, ALL, NO)
-- What if the query only returns a handful of records?
Blocking could make the query response time longer, because it would try to fill the first N rows until it could not get as many rows as specified.
-
Example 4.1. Reuse the Statement via Parameter Markers
int main() {
    SQLHANDLE henv, hdbc, hstmt;
    SQLCHAR *sqlstmt = (SQLCHAR *) "INSERT INTO T1 (C2, C5) VALUES (?, ?)";
    SQLINTEGER *col2, lvalue;
    SQLCHAR *col5;
    int rc = 0, pass = 0;
    /* allocate henv, hdbc, connect to database */
    /* allocate statement handle */
    rc = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt);
    /* prepare the statement */
    rc = SQLPrepare(hstmt, sqlstmt, SQL_NTS);
    /* assign values to the input variables */
    col2 = (SQLINTEGER *) malloc(sizeof(SQLINTEGER));
    *col2 = 1;
    col5 = (SQLCHAR *) malloc(sizeof(char) * 100);
    strcpy((char *) col5, "my 100 characters string, but could be shorter");
    /* bind the values to the parameter markers */
    rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_INTEGER,
                          0, 0, col2, sizeof(SQLINTEGER), &lvalue);
    rc = SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR,
                          100, 0, col5, 100, NULL);
    /* execute the statement; assume 100,000 rows are to be inserted into the table
       (the loop body is reconstructed; the original slide text was truncated here) */
    while (pass++ < 100000) {
        rc = SQLExecute(hstmt);  /* prepared once, executed many times */
    }
-
Example 4.2. Reuse the Statement via Parameter Markers
int main() {
    /* allocate henv, hdbc, connect to database, allocate the statement handle */
    /* prepare the statement */
    rc = SQLPrepare(hstmt, sqlstmt, SQL_NTS);
    /* assign values to the input variables, bind the values to the parameter markers,
       and execute the statement; assume 100,000 rows are to be inserted into the table
       (the loop body is reconstructed; the original slide text was truncated here) */
    while (pass++ < 100000) {
        rc = SQLExecute(hstmt);
    }
-
Using Appropriate APIs for a Given Requirement
Scenario: an ongoing batch job to set document status for a passed-in list of docIDs. Time is essential.
A shell script is meant to be interactive (input/invoke CLP/SQL/commit).
A programming language, such as C, allows streamlining the logic, reusing the statements, more cursor manipulation options, etc.
C : Perl : ksh(opt) : ksh(prim) = 1 : 3.76 : 302 : 1066
What is presented here is a simple update statement that needs to be executed frequently with a list of record IDs as input:
Update table1 set c1='U' where c2 in (?)
What was needed was a streamlined program that processes the documents quickly and efficiently.
Efficiency is the key. The numbers were collected against a local database. No network traffic was involved; the difference is caused purely by the API difference.
-
Example 5. Several APIs Performance
Comparison in a Local Solaris System
Relative time (50,000 records updated):
C : Perl : ksh(opt) : ksh(prim) = 1 : 3.76 : 302 : 1066
C (CLI): well written; prepare the statement once, reuse it.
Perl: prepare the statement once, reuse it; one more layer of interface.
Ksh (optimized): auto-commit off, quiet, the unnecessary print steps removed, etc.
Ksh (primitive): interactive, stdout I/O, redundancy, auto-commit on. This is how such scripts are often written: quick-and-dirty code that barely works.
-
Example 6. APIs Performance in
two-tier DB2 Connect System
Relative time: Embedded = 1, CLI = 1.03, ADO/IBM Provider = 1.31, ADO/MS Bridge = 1.47, JDBC T2 Driver = 1.56
Note that the numbers in this slide were collected in a two-tier system using a composite workload (all kinds of SQL).
This is the comparison data for CLI, JDBC (driver type 2), ADO (using both the IBM OLE DB Provider for DB2 Server and the Microsoft OLE DB Bridge for ODBC drivers), and static embedded SQL in a Windows 2000 - zOS two-tier system. The DB2 Connect server was on the Windows 2000 application client.
If the time using embedded SQL is normalized to 1.00, the performance sequence for fetching data using the various APIs (fastest to slowest) is embedded SQL (1.00), CLI (1.03), ADO/IBM Provider (1.31), ADO/Microsoft Bridge (1.47), and JDBC (1.56). DB2 CLI is comparable to embedded SQL! The IBM Provider outperformed the Microsoft Bridge. JDBC is just as expected.
The magnitude of the differences among the APIs in the two-tier system is smaller than in the local system. That could be because more factors come into play in a multi-tier system, such as the mainframe server generally being slower, and the data transfer between server and client.
-
Example 7. Performance of three fetch APIs with different data types in binding
Relative time (PDT = proper data type):
SQLFetchScroll   PDT 0.89   SQL_C_CHAR 1
SQLBindCol       PDT 1      SQL_C_CHAR 1.38
SQLGetData       PDT 3.32   SQL_C_CHAR 3.55
For fetching data (10 columns x 200,000 rows in our test case), if the time using the typical SQLBindCol() is normalized to 1.00, the performance sequence from fastest to slowest is:
                      Proper data type in binding   SQL_C_CHAR in binding
SQLFetchScroll        0.89                          1
SQLFetch/SQLBindCol   1                             1.38
SQLGetData            3.32                          3.55
Using the proper data type in binding is always better than using SQL_C_CHAR. Therefore, use the proper data type in binding, and use array fetch whenever possible.
Typically, an application may choose to allocate the maximum memory the column value could occupy and bind it via SQLBindCol(), based on information about a column in the result set (obtained via a call to SQLDescribeCol(), for example, or prior knowledge). However, in the case of character and binary data, the column can be arbitrarily long. If the length of the column value exceeds the length of the buffer the application can allocate, or afford to allocate, a feature of SQLGetData() lets the application use repeated calls to obtain the value of a single column, in sequence, in more manageable pieces. This API may suit Java or GUI types of applications; the trade-off is slower performance.
-
Example 8. SQLFetch Orientation
Relative time by SQL_CURSOR type: FORWARD_ONLY = 1, STATIC = 1.2, KEYSET-DRIVEN = 2.9
Cursor Type and SQLFetchScroll()
In the above examples the fetch was sequential, i.e., retrieving rows starting with the first row and ending with the last row. In that case, we know SQLFetchScroll() gives the best performance. What if an application needs to allow the user to scroll through a set of data both forwards and backwards? DB2 CLI has three types of scrollable cursors:
(1) Forward-only (default) cursor: can only scroll forward.
(2) Static read-only cursor: is static; once it is created, no rows will be added or removed, and no value in any row will change.
(3) Keyset-driven cursor: has the ability to detect changes to the underlying data, and the ability to use the cursor to make changes to the underlying data. A keyset-driven cursor will reflect changed values in existing rows, and deleted rows, but it will not reflect added rows, because the set of rows is determined once, when the cursor is opened. It does not re-issue the select statement to see whether new rows have been added that should be included.
To be able to scroll through the cursor back and forth, the cursor has to be defined as SQL_CURSOR_STATIC or SQL_CURSOR_KEYSET_DRIVEN. The position of the rowset within the result set can be specified as SQL_FETCH_NEXT, SQL_FETCH_FIRST, SQL_FETCH_LAST, SQL_FETCH_RELATIVE, SQL_FETCH_ABSOLUTE, SQL_FETCH_PRIOR, or SQL_FETCH_BOOKMARK in the SQLFetchScroll() call.
Performance impact
From the performance point of view, a static cursor involves the least overhead; if the application does not need the additional features of a keyset-driven cursor, then a static cursor should be used. If the application needs to detect changes to the underlying data, or needs to add, update, or delete data from the result set, then the keyset-driven cursor may be used. Also, if one needs to scroll the cursor back and forth, the cursor type needs to be set to SQL_CURSOR_STATIC; the default scrollable cursor type is SQL_CURSOR_FORWARD_ONLY. Comparing the performance of fetching data with STATIC and KEYSET-DRIVEN cursors against FORWARD_ONLY, we see relative times of 1.2x and 2.9x for the static and keyset-driven cursors, respectively. I.e., the features come with a cost.
An example of using the various cursor types in an array fetch with a specified fetch orientation follows (see next slide).
-
Sample Code of Using Static Cursor
/* the cursor type has to be specified via SQLSetStmtAttr() before SQLPrepare() */
rc = SQLSetStmtAttr(hstmt,
                    SQL_ATTR_CURSOR_TYPE,
                    (SQLPOINTER) SQL_CURSOR_STATIC,
                    0);
rc = SQLPrepare(hstmt, sqlstmt, SQL_NTS);
/* ... */
/* the fetch orientation may be specified in SQLFetchScroll() */
rc = SQLFetchScroll(hstmt, SQL_FETCH_FIRST, 0);
/* ... */
To be able to scroll through the cursor back and forth, the cursor has to be defined as
SQL_CURSOR_STATIC or
SQL_CURSOR_KEYSET_DRIVEN.
The position of the rowset within the result set can be specified as
SQL_FETCH_NEXT,
SQL_FETCH_FIRST,
SQL_FETCH_LAST,
SQL_FETCH_RELATIVE,
SQL_FETCH_ABSOLUTE,
SQL_FETCH_PRIOR, or
SQL_FETCH_BOOKMARK
in the SQLFetchScroll() call.
An example using a STATIC or KEYSET_DRIVEN cursor would be similar to that illustrated in the sample code, except for defining the cursor type and specifying the fetch orientation.
-
Example 9. Insert APIs Performance
Relative time by insert API:
SQLBindParameter       1
SQLExtendedBind        0.85
Not Logged Initially   0.81
Array_Insert(100)      0.42
CHAINING               0.42
CLI USE_LOAD           0.36
For inserting data, if the time for inserting 100,000 rows one at a time using SQLBindParameter() is normalized to 1.00, the performance sequence from fastest to slowest is:
CLI USE_LOAD (0.36): a CLI API that invokes LOAD; for large data volumes.
CHAINING (0.42): referred to as CLI array input chaining. All SQLExecute() requests associated with a prepared statement are not sent to the server until either the SQL_ATTR_CHAINING_END statement attribute is set, or the available buffer space is consumed by the rows that have been chained.
Array insert (0.42, size 100): inserting multiple rows at a time.
Row insert with NOT LOGGED INITIALLY activated (0.81): reduces the logging.
SQLExtendedBind (0.85): binds an array of columns; some restrictions apply.
SQLBindParameter (1.00): typical.
Had one only used single-row insert via SQLBindParameter(), one would have missed a lot of the great options that CLI has to offer. When the array size is > 10, changing the size does not have a significant impact.
Reducing logging with the NOT LOGGED INITIALLY parameter.
SQLExtendedBind(): this function can be used to replace multiple calls to SQLBindCol() or SQLBindParameter(); however, important differences should be noted.
-
Typical Row Insert
rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_INTEGER,
                      0, 0, col1, sizeof(SQLINTEGER), &lvalue);
rc = SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR,
                      100, 0, col2, 100, NULL);
/* execute the statement; assume n (100,000) rows are to be inserted
   (the loop body is reconstructed; the original slide text was truncated here) */
while (pass++ < n) {
    rc = SQLExecute(hstmt);
}
-
Array Insert
/* just make up some values for columns Col1 and Col2 */
SQLINTEGER col1[100] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, /* ... */ 100};
SQLCHAR col2[100][100] = {"A1", "B2", "C3", "D4", "E5",
                          "F6", "G7", "H8", "I9", "J10", /* ... */ "z100"};
/* set the array size, 100 for our sample code */
rc = SQLSetStmtAttr(hstmt, SQL_ATTR_PARAMSET_SIZE, (SQLPOINTER) 100, 0);
/* bind the values to the parameter markers, the same as before except that
   this time col1 and col2 are arrays */
rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_INTEGER,
                      0, 0, col1, 0, NULL);
rc = SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR,
                      100, 0, col2, 100, NULL);
/* each SQLExecute() now inserts 100 rows
   (the loop body is reconstructed; the original slide text was truncated here) */
while (pass++ < 1000) {
    rc = SQLExecute(hstmt);
}
-
Chaining
/* ... */
rc = SQLSetStmtAttr(statement.hstmt,
                    SQL_ATTR_CHAINING_BEGIN,
                    (SQLPOINTER) TRUE,
                    0);
/* the loop body and chaining end are reconstructed; the original slide text
   was truncated here */
while (pass++ < 100000) {
    rc = SQLExecute(statement.hstmt);   /* executes are buffered, not yet sent */
}
rc = SQLSetStmtAttr(statement.hstmt,
                    SQL_ATTR_CHAINING_END,
                    (SQLPOINTER) TRUE,
                    0);                 /* flushes the chained executes to the server */
-
Use Load API
/* allocate henv, hdbc, connect to database, allocate the statement handle,
   prepare the statement, assign values to the input variables, bind the
   values to the parameter markers */
/* begin to use load */
rc = SQLSetStmtAttr(hstmt, SQL_ATTR_USE_LOAD_API,
                    (SQLPOINTER) SQL_USE_LOAD_INSERT, 0);
/* execute the statement; assume we'd like to insert 100,000 rows into the table
   (the loop body is reconstructed; the original slide text was truncated here) */
while (pass++ < 100000) {
    rc = SQLExecute(hstmt);
}
-
Create Necessary Indexes
Bottleneck queries first
Including stored procedures and triggers
Only those needed: indexes can help, but they can also hurt
How do we know indexes are needed?
0. Identify the bottleneck queries: snapshot and event monitor data.
1. db2advis is a good tool to start with.
2. Analyze the access plan, find the bottlenecks, and try to come up with an index that reduces the cost.
3. Test the index(es) created; ensure they improve the bottleneck queries without hurting other queries too much.
-
Example 10. SQLs In The Procedures
Trigger on ICMUT01005001:
CREATE TRIGGER CML.TG03_ICMUT01005001 AFTER UPDATE OF ATTR0000001024 ON CML.ICMUT01005001 REFERENCING NEW AS NEW FOR EACH ROW MODE DB2SQL WHEN (UPPER(NEW.attr0000001024) NOT IN ('IC','CN') OR NEW.attr0000001024 IS NULL) BEGIN ATOMIC CALL CML.ICHG_QUE_PROC (NEW.ATTR0000001021, NEW.ATTR0000001024, NEW.ATTR0000001025); END

Stored procedure on the ICHG_QUE table:
CREATE PROCEDURE CML.ICHG_QUE_PROC (IN ATTR1021 CHARACTER(26), IN ATTR1024 CHARACTER(2), IN ATTR1025 TIMESTAMP) SPECIFIC CML.ICHG_QUE_PROC LANGUAGE SQL MODIFIES SQL DATA BEGIN DECLARE V_CNT INTEGER DEFAULT 0; SELECT COUNT(*) INTO V_CNT FROM CML.ICHG_QUE WHERE CML.ICHG_QUE.ATTR0000001021 = ATTR1021 WITH UR; IF V_CNT < 1 THEN INSERT INTO CML.ICHG_QUE (ATTR0000001021, ATTR0000001024, ATTR0000001025) VALUES (ATTR1021, ATTR1024, ATTR1025); END IF; END
There is no index on ATTR0000001021, which is the docID.
In some cases a bottleneck SQL statement may not be that obvious. For example, when you have triggers or stored procedure calls, you may need to examine what SQL is inside them.
In the example above, a trigger is defined to call a procedure when a certain condition is met. The procedure contains a SQL statement counting something. Unfortunately, the column in the COUNT(*) statement's WHERE clause has no index defined on it, so a table scan was inevitable whenever there was a modification of the table attribute.
How many systems can afford a table scan?
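A sketch of the fix; the index name follows the CML.QUE1021 index that appears in the access plan on the next slide, and the DDL is illustrative:

```shell
# Index the docID column used in the procedure's WHERE clause
db2 "CREATE INDEX CML.QUE1021 ON CML.ICHG_QUE (ATTR0000001021)"
# Refresh statistics so the optimizer can pick up the new index
db2 "RUNSTATS ON TABLE CML.ICHG_QUE AND INDEXES ALL"
```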
-
An Index That Reduced The Cost
Table scan happened before the index addition:

      TBSCAN
      (   3)
      9539.24 (cost)
      2318 (I/O)
        |
      474808
      TABLE: CML.QUE

Index scan after the index is added on QUE (ATTR1021):

      IXSCAN
      (   3)
      50.04 (cost)
      2 (I/O)
        |
      477516
      INDEX: CML.QUE1021
Tests on the laboratory server and the production system indicated that this index addition increased performance by 230% using C/CLI, with a few thousand records in the table.
What if there are more than a few thousand records in the table?
-
Example 11. Where Should The Indexes Be?
Stmt: update CML.DocTab
set docType = 'X'
where docID = ? and docType in ('Y', 'Z')
docID is unique and docType is not; where should the index be?
Whichever column has the higher cardinality.
-
An index that may hurt the performance: what if an index is defined on docType?

With the additional index on docType (note the extra temp-table operations):

             /--------+--------\
            1                0.0202847
          FETCH              TBSCAN
          (   5)             (   7)
          100.048            0.0457899
          4                  0
         /---+---\             |
        1     1.28141e+06      2
      IXSCAN  TABLE: CML     TEMP
      (   6)  DocTab         (   8)
      75.0417                0.0159013
      3                      0
        |                      |
      1.28141e+06              2
      INDEX: CML.IndeX2      TBSCAN
                             (   9)
                             6.67186e-05
                             0
                               |
                               2
                             TABFNC: SYSIBM.GENROW

Before adding the docType index:

         /---+---\
        1     1.28141e+06
      IXSCAN  TABLE: CML
      (   5)  DocTab
      75.0417
      3
        |
      1.28141e+06
      INDEX: CML.Index2
During examination of the query access plan, it was noticed that dropping an unnecessary index eliminated three extra operations on temp tables for the update SQL statement, and further improved performance by nearly 40x (60 minutes of work updating 50k rows completed in 1.5 minutes).
Why? docType has low cardinality.
Stmt: update CML.DocTab set docType = 'DR' where docID = ? and docType in ('CN', 'IC')
Choose an index on the column(s) with the higher cardinality (i.e., docID).
-
Example 12. APIs +/- Index Effect
The right indexing (adding what is needed, removing what is unnecessary), plus the proper APIs, yielded a 466x performance gain.
In the figures above, the index effect makes the API effects appear small; however, you are still looking at double/triple/quadruple differences among the APIs.
-
Time Saved (Indexes + APIs)
Hours needed per year (proactive optimization process):
Existing code: 584
Optimized, first year (including 40 hrs of coding effort): 1 + 40 = 41
Optimized, subsequent years: 1
Considering the ongoing maintenance, each site may process as many as 2~3 million records per year. It would take the original ksh script 584 hours, or the third party's legacy program 1368 hours, to complete the job. The optimized approach can complete the job in 1.3 hours.
Taking the first year's 40 hours of effort spent optimizing the methods into account, the first year's hours for marking documents were reduced from 584 hours (ksh script) to 41 hours; this represents a net first-year saving of 543 hours at each site. The subsequent years' net saving would be 583 hours at each site. There are 7 (N) such sites on our program.
The points are:
Use the appropriate API for the right job. For example, C/CLI is much faster than a ksh script for batch processing of many records.
Create indexes wisely, i.e., add a necessary index or drop an unnecessary one.
Some legacy code has had patches upon patches upon patches; would it be worth rewriting the core pieces of the code?
-
Proactive Maintenance
Reorg (online vs. offline)
Append mode (online insertion)
Runstats (various options)
Monitor switches: do they need to be on?
When you have taken care of the indexes, bufferpools, configuration parameters, logs, sorts, APIs, etc., what else would you do?
How about a stress test to push the system to a level where potential bottlenecks may become apparent? How about proactive maintenance?
Does your database need a reorg (reorgchk)? Do I have the time and resources to reorg?
How often do I need to update statistics?
Is there a need to leave the monitor switches on?
-
Example 13. APPEND_MODE
ON vs. OFF (diff %): INSERT/select -75.47, DELETE 0.04, SELECT 0.06, UPDATE 0.75, import -29.87 (negative = faster with APPEND_MODE ON)
Online page reorganization has its pros and cons.
Turning append mode ON helps insert performance; however, a nightly or weekly reorg is then needed.
When APPEND_MODE is set to ON, new rows are always appended to the end of the table. No searching or maintenance of FSCRs (Free Space Control Records) takes place. This option is enabled using the ALTER TABLE ... APPEND ON statement, and can improve performance for tables that only grow, like journals.
A performance test is needed to verify, because there is a slight performance degradation on SELECT statements.
-
Example 14.1 Runstats Options Effect
Relative time: DEFAULT runstats = 38 vs. detailed options = 1
Detailed runstats options:
NUM_FREQVALUES from 10 to 100, NUM_QUANTILES from 20 to 200
Warning: performance tests are needed to validate that the option change helps your applications.
This is a case of improving a data validation utility (mostly SELECT queries):
RUNSTATS ON TABLE schema.OBJECTS ON ALL COLUMNS WITH DISTRIBUTION ON KEY COLUMNS DEFAULT NUM_FREQVALUES 100 NUM_QUANTILES 200 AND DETAILED INDEXES ALL ALLOW WRITE ACCESS;
NUM_FREQVALUES
Defines the maximum number of frequency values to collect. It can be specified for an individual column in the ON COLUMNS clause. If the value is not specified for an individual column, the frequency limit value will be picked up from that specified in the DEFAULT clause. If it is not specified there either, the maximum number of frequency values to be collected will be what is set in the NUM_FREQVALUES database configuration parameter.
Current value: Number of frequent values retained (NUM_FREQVALUES) = 10
The "most frequent value" statistics help the optimizer understand the distribution of data values within a column. A higher value results in more information being available to the SQL optimizer, but requires additional catalog space. When 0 is specified, no frequent-value statistics are retained, even if you request that distribution statistics be collected.
NUM_QUANTILES
Defines the maximum number of distribution quantile values to collect. It can be specified for an individual column in the ON COLUMNS clause. If the value is not specified for an individual column, the quantile limit value will be picked up from that specified in the DEFAULT clause. If it is not specified there either, the maximum number of quantile values to be collected will be what is set in the NUM_QUANTILES database configuration parameter.
Current value: Number of quantiles retained (NUM_QUANTILES) = 20
The "quantile" statistics help the optimizer understand the distribution of data values within a column. A higher value results in more information being available to the SQL optimizer, but requires additional catalog space. When 0 or 1 is specified, no quantile statistics are retained, even if you request that distribution statistics be collected.
Increasing the value of these two parameters increases the amount of statistics heap (STAT_HEAP_SZ) used when collecting statistics. The default statistics heap size (STAT_HEAP_SZ) is 4384 4 KB pages. You may have to increase this configuration parameter.
-
Example 14.2 RUNSTATS CMD
RUNSTATS ON TABLE RMADMIN.RMOBJECTS ON ALL COLUMNS
WITH DISTRIBUTION ON KEY COLUMNS DEFAULT
NUM_FREQVALUES 100
NUM_QUANTILES 200
AND DETAILED INDEXES ALL ALLOW WRITE ACCESS;
Default values: NUM_FREQVALUES = 10, NUM_QUANTILES = 20
-
How To Identify A Bottleneck?
Collect and analyze the debug data using basic system tools (vmstat, top, prstat, sar, pmap, iostat, etc.); DB2 native tools (snapshot, event monitor, access plan, db2pd, db2advis, etc.); and profiling tools if needed.
Query access plan: use the right indexes to reduce the cost of the bottleneck queries.
Explore the API features based on your needs: the DB2-supported APIs (Embedded SQL, CLI, JDBC, ADO, Perl, CLP) and their performance differences; fetch/insert orientations; statement attributes.
Use the query options wisely, such as the blocking features, and parameter markers to reuse a statement when calling it repeatedly. The DBMS does exactly what the application (queries) asks it to do.
Understand the application's nature (OLTP, DSS, or mixed), and tune the DBM and DB configuration parameters accordingly.
Maintain the database proactively to ensure optimal database performance.
Could bottleneck identification and elimination be automated?
Is anyone interested in writing a program that can automatically identify performance bottlenecks and eliminate them? Stay tuned.
-
Sigen Chen
Lockheed Martin
Baltimore, Maryland USA
Session D01
Bottlenecks Elimination in Real World DB2 Applications