TRANSCRIPT
-
8/4/2019 EU08D01
13 October 2008, 11:15-12:15
Platform: DB2 UDB for Linux, UNIX, Windows
Sigen Chen, Lockheed Martin
Session: D01
Bottlenecks Elimination in Real World DB2 Applications
ABSTRACT
Database application performance on a given system (hardware and software)
is determined by application behavior, APIs, database design and layout,
data size, and system configuration. This presentation covers these aspects
based on performance-improvement practice from real-world database
applications. The focus is on understanding application behavior;
creating the right indexes; writing optimal queries and exploring the query features
wisely; using appropriate APIs for a given requirement, not only at the
programming-language level, but also at the level of statement attributes such as
cursor type, data type for binding, fetch orientation, and array options; practicing
proactive maintenance to ensure optimal data layout and statistics; and tuning the
key configuration parameters based on application behavior and system
monitoring data. Troubleshooting examples and sample code segments are
used to illustrate the practice. Performance-issue debugging and analysis is
also included.
In short:
Presenting some experience from managing real-world DB2 databases
Sharing some performance data from database application benchmarking
Exercising some DB2 coding (API) options, out of curiosity, from a database
application performance point of view
-
Summary
Diagnosing real database applications
Using DB2 native tools and system tools.
Creating the correct indexes
Adding the right indexes
Removing unnecessary indexes.
Choosing the right API for a given job, i.e.,
Embedded SQL, CLI, ADO/IBM Provider, ADO/MS Bridge, JDBC Type 2, Perl, and shell script.
Using the proper data type in SQLBindCol(), array fetch/insert, the right cursor types, and proper fetching/inserting APIs.
Tuning several key configuration parameters such as parallelism and avg_appls; refining the options of the maintenance tools.
1. Discussing how to identify bottlenecks by analyzing the debugging data using system tools (vmstat, top, prstat, sar, pmap, etc.), DB2 native tools (snapshot, event monitor, access plan, db2pd, etc.), and profiling tools.
2. Showing how to collect and analyze the query access plan, and how to use the right indexes to reduce the cost of bottleneck queries.
3. Analyzing several commonly used DB2-supported APIs (Embedded SQL, CLI, JDBC, ADO, Perl, CLP) and their performance differences through our test data; comparing several fetch/insert orientations of CLI and statement attributes, and testing the performance.
4. Writing the most efficient queries, and using the query options wisely, such as the blocking features. After all, the DBMS does exactly what the application (queries) asks it to do.
5. Understanding the application's nature (OLTP, DSS, or mixed), and tuning the DBM and DB configuration parameters accordingly; maintaining the database proactively to ensure optimal database performance.
-
Performance Factors
$, hardware infrastructure (CPU/memory/I/O, network), bufferpools, reasonable data layout
Application behavior
APIs (language, interface)
Database application design and data layout
Data size (response time vs. size)
System configuration
System maintenance (proactive vs. reactive)
What could affect a given database application system's performance?
- $ and hardware infrastructure (CPU/memory/disk, network) are out of the scope of this presentation.
- It is also assumed that you have a reasonable bufferpool hit ratio and data layout (tablespaces, logs, striping).
For a given system (platform, hardware, software), there is plenty a DBA can do to improve performance:
- Understand the business objectives and the application's behavior: OLTP, DSS (DW), or mixed? Tune the system accordingly.
- Number of active applications: is parallelism necessary?
- How are the applications implemented? C, Java, etc.
- What APIs are employed? One may not have control over all the languages and APIs used by the applications, but a DBA does have control over maintenance programs and batch jobs.
- Disk layout and data distribution? Is HA involved? Is DPF involved?
- As data size grows, performance can be affected significantly (even exponentially); keep scalability in mind. Performance improvement is an ongoing DBA task.
- Proactive maintenance: reorg, statistics, binding, etc.
- Troubleshooting examples and sample code segments are used to illustrate the proactive practice. Performance-issue debugging and analysis is also included.
-
Performance Improvement Approaches
Understanding the application behavior
Writing optimal queries, exploring the query features wisely
Creating the necessary indexes
Using appropriate APIs for a given requirement
Programming-language level
Statement attributes such as cursor type, data type for binding, fetch orientation, array options
Proactive maintenance to ensure optimal data layout and updated statistics
Tuning the key configuration parameters based on application behavior and system monitoring data.
A DBMS is supposed to do just what the applications request it to do. Therefore, understanding the application behavior is the most important step toward maximizing performance on a given system. (Occasionally a DBMS does not do what is expected; that then becomes a PMR issue.)
- Indexes can help most queries, but not always.
- Developers ought to optimize their queries, not just barely make them work.
- APIs
- Program level: choose the right language for your job (C, Java, Perl, or shell scripts)
- Coding level: data type, cursor type, fetch orientation, array options, blocking, etc.
- Maintenance, as most DBAs do (backup, necessary reorg, update statistics, rebind, data integrity checks).
- Does the database need a reorg? Data growth, insertion mode, online or offline?
- Do I have enough security on the logs (primary, mirror, archive)? How should the logs be distributed?
- Which RUNSTATS options are best suited to my system?
- Configuration parameter settings (DBM CFG, DB CFG, and registry) based on benchmarking or stress tests
-
Examples Summary
Approach: DB2 native tools + fundamental OS tools
Creating the correct index is the key (2~43x on multiple applications)
Choosing the right API for a given job is essential:
Embedded (1.00)
CLI (1.03)
ADO/IBM Provider (1.31)
ADO/MS Bridge (1.47)
JDBC T2 (1.56)
Shell script (4.80)
Using the proper data type (i.e., in SQLBindCol), the right cursor types, and proper fetching/inserting APIs
Tuning based on application behavior (e.g., parallelism, avg_appls, etc.) to resolve memory shortages, locking, and response time; runstats options (e.g., a 37x performance impact)
Brief summary of the data/examples showing the impact.
When troubleshooting an issue, where do you start?
- Approach: the basic native tools are always a good place to start (CPU, memory, I/O); then examine the snapshot data, event monitor data, and queries.
Some prefer to buy monitoring tools; make sure you understand how the data is collected and interpreted.
- If you find long-running queries (bottleneck queries), analyze the access plan and focus on the most costly plan steps.
- Coding APIs: a business decision and a matter of developer skill set. The numbers in parentheses are relative response times in comparison; the smaller the better.
- Use the proper data type, the appropriate cursor type, and the right fetch orientation. Numbers in parentheses are relative execution times.
- Tuning is based on application behavior. Configuration parameters should be based on benchmarking tests.
- Ensure the database has updated statistics and optimized access plans.
-
Understand the Nature of Applications
OLTP or DSS or mixed
Possible limitations vs. tolerance
Example: parallelism (DFT_DEGREE, INTRA_PARALLEL, DFT_QUERYOPT, AVG_APPLS)
OLTP applications expect fast, near-instant responses.
DSS applications may have complex queries or larger result sets. The expectations and tolerance may be different.
Configuration may need to take the application's expectations into account.
                  OLTP       DSS
Opt level         low        high
AVG_APPLS         1          varies: depends on the number of complex-query
                             applications and the bufferpool size
Parallelism       no         yes
------
DFT_DEGREE        1 [ANY, -1, 1 - 32 767] (CURRENT DEGREE)
MAX_QUERYDEGREE   -1 (ANY) [ANY, 1 - 32 767]
                  Number of parallel operations within a database partition when the statement is executed
INTRA_PARALLEL    NO (0) [SYSTEM (-1), NO (0), YES (1)]; may require more FCM buffers
DFT_QUERYOPT      5 [0 - 9]
AVG_APPLS         1 or N; use the bufferpool efficiently
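As a concrete illustration, the parameters above live in the DBM and DB configurations. A hypothetical CLP sketch follows; the database name SAMPLE and the values shown are placeholders, not recommendations, so benchmark before changing them:

```shell
# Instance-level (DBM CFG): intra-partition parallelism and its ceiling
db2 update dbm cfg using INTRA_PARALLEL NO MAX_QUERYDEGREE ANY
# Database-level (DB CFG): default degree, optimization class, average active apps
db2 update db cfg for SAMPLE using DFT_DEGREE 1 DFT_QUERYOPT 5 AVG_APPLS 1
# INTRA_PARALLEL changes take effect after an instance restart
db2stop && db2start
```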
-
Example 1. AVG_APPLS
SQL10013N, could not load the library. Overall application performance improved 3~54%.
The bottleneck query execution time (seconds) and CPU usage (%, 4-way Sun):

              Time (sec.)   CPU usage (%)
avg_appls=1   160.00        6
avg_appls=5   50            105
SQL10013N: The specified library "" could not be loaded.
In an OLTP application system, response time is essential. What would be your tolerable response time when hitting a button (or link)? Sub-second?
One wants to tune the system to run as quickly as possible, which means allowing an application to use all the available resources (the bufferpool in this case) and be done with it.
When an OLTP query takes several seconds or more, the user might just navigate away from the site. In some cases, that means potentially losing business.
-
Example 2. Intra_parallel
Turning INTRA_PARALLEL off freed up about 1.5 GB of real memory and 2 GB of swap memory in a 32-bit Sun/Solaris system, saving the system from crashing.
Disabling intra-parallelism improved some application performance by 2~5%.
Conclusion: choose the features wisely.
Problem: the system crashed because swap memory was exhausted.
Parallelism is a great feature. However, would it help you?
How did I know that INTRA_PARALLEL=YES caused the crash?
The error message suggested that no FCM request blocks were available (SQL6043C), and the number of FCM request blocks (FCM_NUM_RQB) could not be increased.
A 2 GB memory saving means a great deal on a 4-way (Sun V880) box.
An analogy: for a simple job that requires climbing a ladder, one person can do the job just fine. Two people would be crowded, and might cause a crash!
-
Writing optimal queries/programs,
exploring the query features wisely
Too many to mention
A simple query example:
Select C1, Cx from T1 where C1 in (x, y) optimize for 1000 rows
What is the expected result set?
Is the blocking necessary?
Local or n-tier system?
Select C1, Cx from T1 where C1 in (x, y) optimize for 1,000 rows
Even a simple query like the above requires careful coding: is the blocking really needed? What is the expected result set? Local database or remote? Too often we have seen such a clause show up in OLTP application queries, where it caused performance problems for users.
-
Example 3. Using result-set blocking vs. non-blocking under various APIs (Win2k-390 system, 100,000 rows)
API        Blocking (optimize for N rows)   Non-blocking
           R.T.*   Stdev/avg                R.T.*   Stdev/avg
ADO        1       0.04                     4.94    0.36
JDBC T2    1       0.00                     1.93    0.46
CLI        1       0.02                     6.49    0.74
Embedded   1       0.03                     5.59    0.64
*R.T. = relative time against the same API used
Row blocking is a technique that reduces database manager overhead by retrieving a block of rows in a single operation. These rows are stored in a cache, and each FETCH request in the application gets the next row from the cache. When all the rows in a block have been processed, another block of rows is retrieved by the database manager.
Our test data, fetching 100,000 rows from a 10-column table (row size = 239 bytes, 84 rows per block) in a Win2k-zOS system, indicated that without blocking, the results fluctuate (stdev vs. average is higher) and are about 2-6 times slower than with blocking.
The cache is allocated when an application issues an OPEN CURSOR request and is deallocated when the cursor is closed. The size of the cache is determined by a configuration parameter that is used to allocate memory for the I/O block. The database manager parameter used depends on whether the client is local or remote:
For local applications, aslheapsz (default 15 x 4K) is used to allocate the cache for row blocking.
For remote applications, rqrioblk (default 32K) on the client workstation is used to allocate the cache for row blocking. The cache is allocated on the database client.
-- just in case someone wants to know how to determine the size:
Rows per block = aslheapsz * 4096 / row size
Rows per block = rqrioblk / row size
(BLOCKING bind option values: UNAMBIG, ALL, NO)
-- What if the query only returns a handful of records?
Blocking could make the query response time longer, because it would try to fill the first N rows until it could not get as many rows as specified.
-
Example 4.1. Reuse the Statement via Parameter Markers
int main() {
    SQLHANDLE henv, hdbc, hstmt;
    SQLCHAR *sqlstmt = (SQLCHAR *) "INSERT INTO T1 (C2, C5) VALUES (?, ?)";
    SQLINTEGER *col2, lvalue;
    SQLCHAR *col5;
    int rc = 0, pass = 0;
    /* allocate henv, hdbc, connect to database */
    /* allocate statement handle */
    rc = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt);
    /* prepare the statement */
    rc = SQLPrepare(hstmt, sqlstmt, SQL_NTS);
    /* assign values to the input variables */
    col2 = (SQLINTEGER *) malloc(sizeof(SQLINTEGER));
    *col2 = 1;
    col5 = (SQLCHAR *) malloc(sizeof(char) * 100);
    strcpy((char *) col5, "my 100 characters string, but could be shorter");
    /* bind the values to the parameter markers */
    rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_INTEGER,
                          0, 0, col2, sizeof(SQLINTEGER), &lvalue);
    rc = SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR,
                          100, 0, col5, 100, NULL);
    /* execute the statement; assume 100,000 rows are to be inserted into the table
       (the loop body is reconstructed; the original slide text was truncated here) */
    while (pass++ < 100000) {
        rc = SQLExecute(hstmt);  /* prepared once, executed many times */
    }
-
Example 4.2. Reuse the Statement via Parameter Markers
int main() {
    /* allocate henv, hdbc, connect to database, allocate the statement handle */
    /* prepare the statement */
    rc = SQLPrepare(hstmt, sqlstmt, SQL_NTS);
    /* assign values to the input variables, bind the values to the parameter markers,
       and execute the statement; assume 100,000 rows are to be inserted into the table
       (the loop body is reconstructed; the original slide text was truncated here) */
    while (pass++ < 100000) {
        rc = SQLExecute(hstmt);
    }
-
Using Appropriate APIs for a Given Requirement
Scenario: an ongoing batch job to set document status for a passed-in list of docIDs. Time is essential.
A shell script is meant to be interactive (input/invoke CLP/SQL/commit).
A programming language, such as C, allows streamlining the logic, reusing the statements, more cursor manipulation options, etc.
C : Perl : ksh(opt) : ksh(prim) = 1 : 3.76 : 302 : 1066
What is presented here is a simple update statement that needs to be executed frequently with a list of record IDs as input:
Update table1 set c1='U' where c2 in (?)
What was needed was a streamlined program that processes the documents quickly and efficiently.
Efficiency is the key. The numbers were collected against a local database. No network traffic was involved; the difference is caused purely by the API difference.
-
Example 5. Several APIs Performance
Comparison in a Local Solaris System
Relative time (50,000 records updated):
C : Perl : ksh(opt) : ksh(prim) = 1 : 3.76 : 302 : 1066
C (CLI): well written; prepare the statement once, reuse it.
Perl: prepare the statement once, reuse it; one more layer of interface.
Ksh (optimized): auto-commit off, quiet, the unnecessary print steps removed, etc.
Ksh (primitive): interactive, stdout I/O, redundancy, auto-commit on. This is how such scripts are often written: quick-and-dirty code that barely works.
-
Example 6. APIs Performance in
two-tier DB2 Connect System
Relative time: Embedded = 1, CLI = 1.03, ADO/IBM Provider = 1.31, ADO/MS Bridge = 1.47, JDBC T2 Driver = 1.56
Note that the numbers in this slide were collected in a two-tier system using a composite workload (all kinds of SQL).
This is the comparison data for CLI, JDBC (driver type 2), ADO (using both the IBM OLE DB Provider for DB2 Server and the Microsoft OLE DB Bridge for ODBC drivers), and static embedded SQL in a Windows 2000 - zOS two-tier system. The DB2 Connect server was on the Windows 2000 application client.
If the time using embedded SQL is normalized to 1.00, the performance sequence for fetching data using the various APIs (fastest to slowest) is embedded SQL (1.00), CLI (1.03), ADO/IBM Provider (1.31), ADO/Microsoft Bridge (1.47), and JDBC (1.56). DB2 CLI is comparable to embedded SQL! The IBM Provider outperformed the Microsoft Bridge. JDBC is just as expected.
The magnitude of the differences among the APIs in the two-tier system is smaller than in the local system. That could be because more factors come into play in a multi-tier system, such as the mainframe server generally being slower, and the data transfer between server and client.
-
Example 7. Performance of three fetch APIs with different data types in binding
Relative time (PDT = proper data type):
SQLFetchScroll   PDT 0.89   SQL_C_CHAR 1
SQLBindCol       PDT 1      SQL_C_CHAR 1.38
SQLGetData       PDT 3.32   SQL_C_CHAR 3.55
For fetching data (10 columns x 200,000 rows in our test case), if the time using the typical SQLBindCol() is normalized to 1.00, the performance sequence from fastest to slowest is:
                      Proper data type in binding   SQL_C_CHAR in binding
SQLFetchScroll        0.89                          1
SQLFetch/SQLBindCol   1                             1.38
SQLGetData            3.32                          3.55
Using the proper data type in binding is always better than using SQL_C_CHAR. Therefore, use the proper data type in binding, and use array fetch whenever possible.
Typically, an application may choose to allocate the maximum memory the column value could occupy and bind it via SQLBindCol(), based on information about a column in the result set (obtained via a call to SQLDescribeCol(), for example, or prior knowledge). However, in the case of character and binary data, the column can be arbitrarily long. If the length of the column value exceeds the length of the buffer the application can allocate, or afford to allocate, a feature of SQLGetData() lets the application use repeated calls to obtain the value of a single column, in sequence, in more manageable pieces. This API may suit Java or GUI types of applications; the trade-off is slower performance.
-
Example 8. SQLFetch Orientation
Relative time by SQL_CURSOR type: FORWARD_ONLY = 1, STATIC = 1.2, KEYSET-DRIVEN = 2.9
Cursor Type and SQLFetchScroll()
In the above examples the fetch was sequential, i.e., retrieving rows starting with the first row and ending with the last row. In that case, we know SQLFetchScroll() gives the best performance. What if an application needs to allow the user to scroll through a set of data both forwards and backwards? DB2 CLI has three types of scrollable cursors:
(1) Forward-only (default) cursor: can only scroll forward.
(2) Static read-only cursor: is static; once it is created, no rows will be added or removed, and no value in any row will change.
(3) Keyset-driven cursor: has the ability to detect changes to the underlying data, and the ability to use the cursor to make changes to the underlying data. A keyset-driven cursor will reflect changed values in existing rows, and deleted rows, but it will not reflect added rows, because the set of rows is determined once, when the cursor is opened. It does not re-issue the select statement to see whether new rows have been added that should be included.
To be able to scroll through the cursor back and forth, the cursor has to be defined as SQL_CURSOR_STATIC or SQL_CURSOR_KEYSET_DRIVEN. The position of the rowset within the result set can be specified as SQL_FETCH_NEXT, SQL_FETCH_FIRST, SQL_FETCH_LAST, SQL_FETCH_RELATIVE, SQL_FETCH_ABSOLUTE, SQL_FETCH_PRIOR, or SQL_FETCH_BOOKMARK in the SQLFetchScroll() call.
Performance impact
From the performance point of view, a static cursor involves the least overhead; if the application does not need the additional features of a keyset-driven cursor, then a static cursor should be used. If the application needs to detect changes to the underlying data, or needs to add, update, or delete data from the result set, then the keyset-driven cursor may be used. Also, if one needs to scroll the cursor back and forth, the cursor type needs to be set to SQL_CURSOR_STATIC; the default scrollable cursor type is SQL_CURSOR_FORWARD_ONLY. Comparing the performance of fetching data with STATIC and KEYSET-DRIVEN cursors against FORWARD_ONLY, we see relative times of 1.2x and 2.9x for the static and keyset-driven cursors, respectively. I.e., the features come with a cost.
An example of using the various cursor types in an array fetch with a specified fetch orientation follows (see next slide).
-
Sample Code of Using Static Cursor
/* the cursor type has to be specified via SQLSetStmtAttr() before SQLPrepare() */
rc = SQLSetStmtAttr(hstmt,
                    SQL_ATTR_CURSOR_TYPE,
                    (SQLPOINTER) SQL_CURSOR_STATIC,
                    0);
rc = SQLPrepare(hstmt, sqlstmt, SQL_NTS);
/* ... */
/* the fetch orientation may be specified in SQLFetchScroll() */
rc = SQLFetchScroll(hstmt, SQL_FETCH_FIRST, 0);
/* ... */
To be able to scroll through the cursor back and forth, the cursor has to be defined as
SQL_CURSOR_STATIC or
SQL_CURSOR_KEYSET_DRIVEN.
The position of the rowset within the result set can be specified as
SQL_FETCH_NEXT,
SQL_FETCH_FIRST,
SQL_FETCH_LAST,
SQL_FETCH_RELATIVE,
SQL_FETCH_ABSOLUTE,
SQL_FETCH_PRIOR, or
SQL_FETCH_BOOKMARK
in the SQLFetchScroll() call.
An example using a STATIC or KEYSET_DRIVEN cursor would be similar to that illustrated in the sample code, except for defining the cursor type and specifying the fetch orientation.
-
Example 9. Insert APIs Performance
Relative time by insert API:
SQLBindParameter       1
SQLExtendedBind        0.85
Not Logged Initially   0.81
Array_Insert(100)      0.42
CHAINING               0.42
CLI USE_LOAD           0.36
For inserting data, if the time for inserting 100,000 rows one at a time using SQLBindParameter() is normalized to 1.00, the performance sequence from fastest to slowest is:
CLI USE_LOAD (0.36): a CLI API that invokes LOAD; for large data volumes.
CHAINING (0.42): referred to as CLI array input chaining. All SQLExecute() requests associated with a prepared statement are not sent to the server until either the SQL_ATTR_CHAINING_END statement attribute is set, or the available buffer space is consumed by the rows that have been chained.
Array insert (0.42, size 100): inserting multiple rows at a time.
Row insert with NOT LOGGED INITIALLY activated (0.81): reduces the logging.
SQLExtendedBind (0.85): binds an array of columns; some restrictions apply.
SQLBindParameter (1.00): typical.
Had one only used single-row insert via SQLBindParameter(), one would have missed a lot of the great options that CLI has to offer. When the array size is > 10, changing the size does not have a significant impact.
Reducing logging with the NOT LOGGED INITIALLY parameter.
SQLExtendedBind(): this function can be used to replace multiple calls to SQLBindCol() or SQLBindParameter(); however, important differences should be noted.
-
Typical Row Insert
rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_INTEGER,
                      0, 0, col1, sizeof(SQLINTEGER), &lvalue);
rc = SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR,
                      100, 0, col2, 100, NULL);
/* execute the statement; assume n (100,000) rows are to be inserted
   (the loop body is reconstructed; the original slide text was truncated here) */
while (pass++ < n) {
    rc = SQLExecute(hstmt);
}
-
Array Insert
/* just make up some values for columns Col1 and Col2 */
SQLINTEGER col1[100] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, /* ... */ 100};
SQLCHAR col2[100][100] = {"A1", "B2", "C3", "D4", "E5",
                          "F6", "G7", "H8", "I9", "J10", /* ... */ "z100"};
/* set the array size, 100 for our sample code */
rc = SQLSetStmtAttr(hstmt, SQL_ATTR_PARAMSET_SIZE, (SQLPOINTER) 100, 0);
/* bind the values to the parameter markers, the same as before except that
   this time col1 and col2 are arrays */
rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_INTEGER,
                      0, 0, col1, 0, NULL);
rc = SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR,
                      100, 0, col2, 100, NULL);
/* each SQLExecute() now inserts 100 rows
   (the loop body is reconstructed; the original slide text was truncated here) */
while (pass++ < 1000) {
    rc = SQLExecute(hstmt);
}
-
Chaining
/* ... */
rc = SQLSetStmtAttr(statement.hstmt,
                    SQL_ATTR_CHAINING_BEGIN,
                    (SQLPOINTER) TRUE,
                    0);
/* the loop body and chaining end are reconstructed; the original slide text
   was truncated here */
while (pass++ < 100000) {
    rc = SQLExecute(statement.hstmt);   /* executes are buffered, not yet sent */
}
rc = SQLSetStmtAttr(statement.hstmt,
                    SQL_ATTR_CHAINING_END,
                    (SQLPOINTER) TRUE,
                    0);                 /* flushes the chained executes to the server */
-
Use Load API
/* allocate henv, hdbc, connect to database, allocate the statement handle,
   prepare the statement, assign values to the input variables, bind the
   values to the parameter markers */
/* begin to use load */
rc = SQLSetStmtAttr(hstmt, SQL_ATTR_USE_LOAD_API,
                    (SQLPOINTER) SQL_USE_LOAD_INSERT, 0);
/* execute the statement; assume we'd like to insert 100,000 rows into the table
   (the loop body is reconstructed; the original slide text was truncated here) */
while (pass++ < 100000) {
    rc = SQLExecute(hstmt);
}
-
Create Necessary Indexes
Bottleneck queries first
Including stored procedures and triggers
Only those needed: indexes can help, but they can also hurt
How do we know indexes are needed?
0. Identify the bottleneck queries: snapshot and event monitor data.
1. db2advis is a good tool to start with.
2. Analyze the access plan, find the bottlenecks, and try to come up with an index that reduces the cost.
3. Test the index(es) created; ensure they improve the bottleneck queries without hurting other queries too much.
-
Example 10. SQLs In The Procedures
Trigger on ICMUT01005001:
CREATE TRIGGER CML.TG03_ICMUT01005001 AFTER UPDATE OF ATTR0000001024 ON CML.ICMUT01005001 REFERENCING NEW AS NEW FOR EACH ROW MODE DB2SQL WHEN (UPPER(NEW.attr0000001024) NOT IN ('IC','CN') OR NEW.attr0000001024 IS NULL) BEGIN ATOMIC CALL CML.ICHG_QUE_PROC (NEW.ATTR0000001021, NEW.ATTR0000001024, NEW.ATTR0000001025); END

Stored procedure on the ICHG_QUE table:
CREATE PROCEDURE CML.ICHG_QUE_PROC (IN ATTR1021 CHARACTER(26), IN ATTR1024 CHARACTER(2), IN ATTR1025 TIMESTAMP) SPECIFIC CML.ICHG_QUE_PROC LANGUAGE SQL MODIFIES SQL DATA BEGIN DECLARE V_CNT INTEGER DEFAULT 0; SELECT COUNT(*) INTO V_CNT FROM CML.ICHG_QUE WHERE CML.ICHG_QUE.ATTR0000001021 = ATTR1021 WITH UR; IF V_CNT < 1 THEN INSERT INTO CML.ICHG_QUE (ATTR0000001021, ATTR0000001024, ATTR0000001025) VALUES (ATTR1021, ATTR1024, ATTR1025); END IF; END
There is no index on ATTR0000001021, which is the docID.
In some cases a bottleneck SQL statement may not be that obvious. For example, when you have triggers or stored procedure calls, you may need to examine what SQL is inside them.
In the example above, a trigger is defined to call a procedure when a certain condition is met. The procedure contains a SQL statement counting something. Unfortunately, the column in the COUNT(*) statement's WHERE clause has no index defined on it, so a table scan was inevitable whenever there was a modification of the table attribute.
How many systems can afford a table scan?
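A sketch of the fix; the index name follows the CML.QUE1021 index that appears in the access plan on the next slide, and the DDL is illustrative:

```shell
# Index the docID column used in the procedure's WHERE clause
db2 "CREATE INDEX CML.QUE1021 ON CML.ICHG_QUE (ATTR0000001021)"
# Refresh statistics so the optimizer can pick up the new index
db2 "RUNSTATS ON TABLE CML.ICHG_QUE AND INDEXES ALL"
```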
-
An Index That Reduced The Cost
Table scan happened before the index addition:

      TBSCAN
      (   3)
      9539.24 (cost)
      2318 (I/O)
        |
      474808
      TABLE: CML.QUE

Index scan after the index is added on QUE (ATTR1021):

      IXSCAN
      (   3)
      50.04 (cost)
      2 (I/O)
        |
      477516
      INDEX: CML.QUE1021
Tests on the laboratory server and the production system indicated that this index addition increased performance by 230% using C/CLI, with a few thousand records in the table.
What if there are more than a few thousand records in the table?
-
Example 11. Where Should The Indexes Be?
Stmt: update CML.DocTab
set docType = 'X'
where docID = ? and docType in ('Y', 'Z')
docID is unique and docType is not; where should the index be?
Whichever column has the higher cardinality.
-
An index that may hurt the performance: what if an index is defined on docType?

With the additional index on docType (note the extra temp-table operations):

             /--------+--------\
            1                0.0202847
          FETCH              TBSCAN
          (   5)             (   7)
          100.048            0.0457899
          4                  0
         /---+---\             |
        1     1.28141e+06      2
      IXSCAN  TABLE: CML     TEMP
      (   6)  DocTab         (   8)
      75.0417                0.0159013
      3                      0
        |                      |
      1.28141e+06              2
      INDEX: CML.IndeX2      TBSCAN
                             (   9)
                             6.67186e-05
                             0
                               |
                               2
                             TABFNC: SYSIBM.GENROW

Before adding the docType index:

         /---+---\
        1     1.28141e+06
      IXSCAN  TABLE: CML
      (   5)  DocTab
      75.0417
      3
        |
      1.28141e+06
      INDEX: CML.Index2
During examination of the query access plan, it was noticed that dropping an unnecessary index eliminated three extra operations on temp tables for the update SQL statement, and further improved performance by nearly 40x (60 minutes of work updating 50k rows completed in 1.5 minutes).
Why? docType has low cardinality.
Stmt: update CML.DocTab set docType = 'DR' where docID = ? and docType in ('CN', 'IC')
Choose an index on the column(s) with the higher cardinality (i.e., docID).
-
Example 12. APIs +/- Index Effect
The right indexing (adding what is needed, removing what is unnecessary), plus the proper APIs, yielded a 466x performance gain.
In the figures above, the index effect makes the API effects appear small; however, you are still looking at double/triple/quadruple differences among the APIs.
-
Time Saved (Indexes + APIs)
Hours needed per year (proactive optimization process):
Existing code: 584
Optimized, first year (including 40 hrs of coding effort): 1 + 40 = 41
Optimized, subsequent years: 1
Considering the ongoing maintenance, each site may process as many as 2~3 million records per year. It would take the original ksh script 584 hours, or the third party's legacy program 1368 hours, to complete the job. The optimized approach can complete the job in 1.3 hours.
Taking the first year's 40 hours of effort spent optimizing the methods into account, the first year's hours for marking documents were reduced from 584 hours (ksh script) to 41 hours; this represents a net first-year saving of 543 hours at each site. The subsequent years' net saving would be 583 hours at each site. There are 7 (N) such sites on our program.
The points are:
Use the appropriate API for the right job. For example, C/CLI is much faster than a ksh script for batch processing of many records.
Create indexes wisely, i.e., add a necessary index or drop an unnecessary one.
Some legacy code has had patches upon patches upon patches; would it be worth rewriting the core pieces of the code?
-
Proactive Maintenance
Reorg (online vs. offline)
Append mode (online insertion)
Runstats (various options)
Monitor switches: do they need to be on?
When you have taken care of the indexes, bufferpools, configuration parameters, logs, sorts, APIs, etc., what else would you do?
How about a stress test to push the system to a level where potential bottlenecks may become apparent? How about proactive maintenance?
Does your database need a reorg (reorgchk)? Do I have the time and resources to reorg?
How often do I need to update statistics?
Is there a need to leave the monitor switches on?
-
Example 13. APPEND_MODE
ON vs. OFF (diff %): INSERT/select -75.47, DELETE 0.04, SELECT 0.06, UPDATE 0.75, import -29.87 (negative = faster with APPEND_MODE ON)
Online page reorganization has its pros and cons.
Turning append mode ON helps insert performance; however, a nightly or weekly reorg is then needed.
When APPEND_MODE is set to ON, new rows are always appended to the end of the table. No searching or maintenance of FSCRs (Free Space Control Records) takes place. This option is enabled using the ALTER TABLE ... APPEND ON statement, and can improve performance for tables that only grow, like journals.
A performance test is needed to verify, because there is a slight performance degradation on SELECT statements.
-
Example 14.1 Runstats Options Effect
Relative time: DEFAULT runstats = 38 vs. detailed options = 1
Detailed runstats options:
NUM_FREQVALUES from 10 to 100, NUM_QUANTILES from 20 to 200
Warning: performance tests are needed to validate that the option change helps your applications.
This is a case of improving a data validation utility (mostly SELECT queries):
RUNSTATS ON TABLE schema.OBJECTS ON ALL COLUMNS WITH DISTRIBUTION ON KEY COLUMNS DEFAULT NUM_FREQVALUES 100 NUM_QUANTILES 200 AND DETAILED INDEXES ALL ALLOW WRITE ACCESS;
NUM_FREQVALUES
Defines the maximum number of frequency values to collect. It can be specified for an individual column in the ON COLUMNS clause. If the value is not specified for an individual column, the frequency limit value will be picked up from that specified in the DEFAULT clause. If it is not specified there either, the maximum number of frequency values to be collected will be what is set in the NUM_FREQVALUES database configuration parameter.
Current value: Number of frequent values retained (NUM_FREQVALUES) = 10
The "most frequent value" statistics help the optimizer understand the distribution of data values within a column. A higher value results in more information being available to the SQL optimizer, but requires additional catalog space. When 0 is specified, no frequent-value statistics are retained, even if you request that distribution statistics be collected.
NUM_QUANTILES
Defines the maximum number of distribution quantile values to collect. It can be specified for an individual column in the ON COLUMNS clause. If the value is not specified for an individual column, the quantile limit value will be picked up from that specified in the DEFAULT clause. If it is not specified there either, the maximum number of quantile values to be collected will be what is set in the NUM_QUANTILES database configuration parameter.
Current value: Number of quantiles retained (NUM_QUANTILES) = 20
The "quantile" statistics help the optimizer understand the distribution of data values within a column. A higher value results in more information being available to the SQL optimizer, but requires additional catalog space. When 0 or 1 is specified, no quantile statistics are retained, even if you request that distribution statistics be collected.
Increasing the value of these two parameters increases the amount of statistics heap (STAT_HEAP_SZ) used when collecting statistics. The default statistics heap size (STAT_HEAP_SZ) is 4384 4 KB pages. You may have to increase this configuration parameter.
-
Example 14.2 RUNSTATS CMD
RUNSTATS ON TABLE RMADMIN.RMOBJECTS ON ALL COLUMNS
WITH DISTRIBUTION ON KEY COLUMNS DEFAULT
NUM_FREQVALUES 100
NUM_QUANTILES 200
AND DETAILED INDEXES ALL ALLOW WRITE ACCESS;
Default values: NUM_FREQVALUES = 10, NUM_QUANTILES = 20
-
How To Identify A Bottleneck?
Collect and analyze the debug data using basic system tools (vmstat, top, prstat, sar, pmap, iostat, etc.); DB2 native tools (snapshot, event monitor, access plan, db2pd, db2advis, etc.); and profiling tools if needed.
Query access plan: use the right indexes to reduce the cost of the bottleneck queries.
Explore the API features based on your needs: the DB2-supported APIs (Embedded SQL, CLI, JDBC, ADO, Perl, CLP) and their performance differences; fetch/insert orientations; statement attributes.
Use the query options wisely, such as the blocking features, and parameter markers to reuse a statement when calling it repeatedly. The DBMS does exactly what the application (queries) asks it to do.
Understand the application's nature (OLTP, DSS, or mixed), and tune the DBM and DB configuration parameters accordingly.
Maintain the database proactively to ensure optimal database performance.
Could bottleneck identification and elimination be automated?
Is anyone interested in writing a program that can automatically identify performance bottlenecks and eliminate them? Stay tuned.
-
Sigen Chen
Lockheed Martin
Baltimore, Maryland USA
Session D01
Bottlenecks Elimination in Real World DB2 Applications