8 years of performance solutions - lessons learned
1
8 Years of Performance Solutions: Lessons Learned
Scott Hayes
President & CEO, DBI
IBM DB2 GOLD Consultant
Session: D02
Monday, May 8, 2006 • 1:00 p.m. – 2:10 p.m.
Platform: DB2 UDB for Linux, UNIX, and Windows
2
8 Years of UDB Performance Solutions: Life Lessons from the field
• Planning, Goals, Service Level Agreements
• A Performance Achievement Methodology
• Critical Performance Metrics and Analysis
• Physical Design Solutions for GREAT Results
• Life Lessons from the field
3
Lessons Learned: Vision Required
BEGIN WITH THE END IN MIND
What are your goals? If you don’t know where you want to be, you won’t ever get there.
4
Travel is pointless without a destination
• What are your performance tuning goals?
  • Avoid Hardware Upgrades / reduce CPU?
  • Improve Transaction Response Time?
  • Add more users?
  • Reduce Phone Rage?
• Service Level Agreement Requirements?
  • 95% of transactions complete < 5 seconds?
  • Application is “Available” 99.92%?
    • What does Available mean?
SEEK FIRST TO UNDERSTAND, THEN TO BE UNDERSTOOD
You need to establish defined goals; otherwise you will never know whether you were successful.
Understand your business’s needs, and then your business will be more receptive to the changes you would like to make.
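An SLA such as “95% of transactions complete < 5 seconds” is only useful as a goal if it can be checked against measurements. The following is a minimal sketch of that arithmetic, assuming per-transaction elapsed times (in seconds) have already been collected somewhere; the function name, defaults, and sample data are hypothetical and not part of the presentation.

```python
def sla_met(elapsed_seconds, limit_seconds=5.0, required_pct=95.0):
    """Return (met, pct): pct is the share of transactions faster than limit_seconds."""
    if not elapsed_seconds:
        raise ValueError("no transactions measured")
    fast = sum(1 for t in elapsed_seconds if t < limit_seconds)
    pct = fast * 100.0 / len(elapsed_seconds)
    return pct >= required_pct, pct


# Hypothetical sample: 19 fast transactions and 1 slow one.
sample = [0.4] * 19 + [7.2]
met, pct = sla_met(sample)
print(f"SLA met: {met} ({pct:.1f}% under 5 seconds)")
```

Measuring over a full peak period, rather than a short slice, keeps the result from reflecting a momentary condition.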
5
Lessons Learned: You Need a Game Plan
FAILURE TO PLAN IS PLANNING TO FAIL
Wisdom worth remembering.
6
A Performance Methodology
• Measure Overall Database Performance (Key Metrics)
  • Random Read %
  • Table Rows Read/TX
  • Sufficient Agents
  • BP & Cache Hit Ratios
• Identify Costly SQL
  • High CPU Costs
  • High Sort Costs
  • High Avg Elapsed Times
  • Inefficient Index Usage
• Implement Physical Design Changes
  • Engage IBM Index/Design Advisor
• Modify DBM, DB, and BP Configurations as appropriate
  • Update DB, DBM CFG
  • Alter Bufferpool BPNAME
[Diagram labels: Performance Monitoring; SQL Analysis & Tuning; Administration & Space Management]
GAME PLAN: PLAY TO WIN!
Tuning is an iterative process. Change one thing at a time. Assess the health of the database. If the database health is poor, then identify the tables that are getting in trouble with high I/O rates, then find the statements that are the driving force behind those high I/O rates. Modify the physical design (add, change, drop indexes, use MDC, MQT/AST tables, etc) to reduce the table I/O burden. Re-measure, repeat.
For more details, a free white paper on DB2 UDB LUW tuning is available at http://www.database-brothers.com/papers.php
7
Database Health Check
• Random Read % (or Sync Read % (SRP)) = 100 - ( (Data + Index Async Reads) * 100 / (Data + Index Total Reads) )
• db2 “Get Snapshot for Database”
[Rating bands, best to worst: > 90% | 80 - 90% | 50 - 80% | < 50%]
These percentages are valid for OLTP databases. Details and guidelines for data warehouse databases are in a free DB2 UDB Tuning White Paper (http://www.database-brothers.com/papers.php). Generally, DW RR% should be 10-30%.
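The Random Read % formula is simple arithmetic on database snapshot counters. A minimal sketch follows, assuming “Total Reads” means buffer pool physical reads (data plus index) and that the four counters have already been copied out of the snapshot output; the function names and sample numbers are hypothetical.

```python
def sync_read_pct(data_physical, index_physical, async_data, async_index):
    """Random (synchronous) read %: 100 minus the asynchronous share of physical reads."""
    total_reads = data_physical + index_physical
    if total_reads == 0:
        return None  # nothing was read from disk, so the ratio is undefined
    async_reads = async_data + async_index
    return 100.0 - (async_reads * 100.0 / total_reads)


def oltp_band(srp):
    """Map a Sync Read % to the OLTP rating bands listed on the slide."""
    if srp > 90:
        return "> 90%"
    if srp >= 80:
        return "80 - 90%"
    if srp >= 50:
        return "50 - 80%"
    return "< 50%"


# Hypothetical counters: 10,000 physical reads, 500 of them asynchronous.
srp = sync_read_pct(8000, 2000, 400, 100)
print(srp, oltp_band(srp))
```

How the exact band edges (80 and 90) are classified is a choice made here; the slide does not say which band a boundary value belongs to.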
8
Table I/O Health Check
• Compute Average Rows Read per Transaction (Commits + Rollbacks) for database and each table.
• Get Snapshot for Database + Tables
[Rating bands, best to worst: RR/TX < 10 | RR/TX > 10 & < 50 | RR/TX > 50 | RR/TX > 100]
Rank tables by most heavily read to least… troublemakers are at the top!!!
This guideline formula works so well precisely because not every transaction accesses every table. We expect RR/TX to be in the single digits.
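The ranking step can be sketched in a few lines. This assumes the per-table Rows Read values and the database-level commit and rollback counts have already been extracted from the snapshots; the table names and numbers are made up for illustration.

```python
def rank_tables_by_rr_per_tx(table_rows_read, commits, rollbacks):
    """Return [(table, rows_read_per_tx)] sorted from most heavily read to least."""
    transactions = commits + rollbacks
    if transactions == 0:
        raise ValueError("no transactions in the measurement interval")
    ranked = [(name, rows / transactions) for name, rows in table_rows_read.items()]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical snapshot values.
rows_read = {"ORDERS": 5_000_000, "CUSTOMER": 40_000, "CODES": 900_000}
for table, rr_tx in rank_tables_by_rr_per_tx(rows_read, commits=9_900, rollbacks=100):
    flag = "" if rr_tx < 10 else "  <-- investigate"
    print(f"{table:10s} {rr_tx:8.1f}{flag}")
```

The tables at the top of the output are the ones whose SQL deserves attention first.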
9
Lessons Learned: Laser Focus
A WELL STATED PROBLEM IS A HALF SOLVED PROBLEM
You cannot solve a problem until you KNOW what problem you are trying to solve. Be careful and be wary – you may find several symptoms easily but have to dig and dig to find root cause problems.
10
Workload Analysis > Well Stated Problem
• 80+% of Tuning Benefit comes from complete and accurate understanding of the SQL workload and its costs
• What is the most costly, most harmful, SQL during peak periods? Recent periods? Over time?
  • Highest CPU Consumption
  • Highest Sort Time Consumption
  • Highest average Elapsed times
  • Highest Read I/O (rows read)
• Grouping & Cost Aggregation of similarly structured SQL statements is imperative to “True Cost” determination
A comprehensive statement workload cost analysis will often provide you with the required “well stated problem”.
11
SQL Equalization & Cost Aggregation
100’s of SQL statements per second…
• How DB2 Sees the SQL Workload (SQL Snapshot shows 19 different statements! WRONG ANSWER!):
Select c1, c2, c4 from tbl where c5 = '0210'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '0220'   cpu=.1
Select c1, c2, c4 from tbl where c5 > '0500'   cpu=.3
Select c1, c2, c4 from tbl where c5 > '8800'   cpu=.3
Select c1, c2, c4 from tbl where c5 = '0300'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '0400'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '0450'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '0490'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '0670'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '0680'   cpu=.1
Select c1, c2, c4 from tbl where c8 = 'Bob'    cpu=.2
Select c1, c2, c4 from tbl where c5 = '0110'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '0120'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '0190'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '0390'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '0790'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '2380'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '4560'   cpu=.1
Select c1, c2, c4 from tbl where c5 = '0360'   cpu=.1
• How the DBA needs to see the SQL Workload (Relative Costs):
SQL Statement                                  Count   TotCPU   CPU%
Select c1, c2, c4 from tbl where c5 = '?'      16      1.6      66.6
Select c1, c2, c4 from tbl where c5 > '?'      2       .6       25.0
Select c1, c2, c4 from tbl where c8 = '?'      1       .2       8.33
Totals:                                        19      2.4      100.00
SQL Equalization and Cost Aggregation is discussed in painful detail in US Patent # 6,772,411
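The idea on the slide, replacing literal values with a marker and then summing cost per statement shape, can be illustrated with a short sketch. This is not the patented method and not a DB2 facility; it is a simplified regular-expression approach that assumes the statement text and CPU seconds have already been captured, and it does not understand comments, quoted identifiers, or host-variable syntax. All function names are hypothetical.

```python
import re
from collections import defaultdict

_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")   # 'abc', including doubled '' quotes
_NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")  # standalone numbers, not the 1 in c1


def equalize(sql):
    """Replace literals with ? and normalize whitespace and case."""
    text = _STRING_LITERAL.sub("'?'", sql)
    text = _NUMBER_LITERAL.sub("?", text)
    return " ".join(text.split()).upper()


def aggregate(statements):
    """statements: iterable of (sql_text, cpu_seconds).
    Returns [(equalized_sql, count, total_cpu, cpu_pct)], costliest first."""
    totals = defaultdict(lambda: [0, 0.0])
    for sql, cpu in statements:
        entry = totals[equalize(sql)]
        entry[0] += 1
        entry[1] += cpu
    grand_total = sum(entry[1] for entry in totals.values())
    rows = [
        (sql, count, cpu, cpu * 100.0 / grand_total if grand_total else 0.0)
        for sql, (count, cpu) in totals.items()
    ]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# The 19 statements from the slide, rebuilt from their literal values.
eq_values = ["0210", "0220", "0300", "0400", "0450", "0490", "0670", "0680",
             "0110", "0120", "0190", "0390", "0790", "2380", "4560", "0360"]
workload = [(f"Select c1, c2, c4 from tbl where c5 = '{v}'", 0.1) for v in eq_values]
workload += [(f"Select c1, c2, c4 from tbl where c5 > '{v}'", 0.3) for v in ("0500", "8800")]
workload += [("Select c1, c2, c4 from tbl where c8 = 'Bob'", 0.2)]

for sql, count, cpu, pct in aggregate(workload):
    print(f"{count:3d}  {cpu:4.1f}  {pct:5.1f}%  {sql}")
```

Run against the slide's workload, the 19 snapshot entries collapse into three statement shapes, with the `c5 =` shape carrying about two thirds of the CPU cost.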
12
Solving Problems > Effective Solutions
• Given a costly SQL statement, 3 possible solutions:
1) Physical Design Change (95%)
  • Add an Index
  • Add/modify Cluster Index
    • Most potent weapon against poor application performance
  • Drop Ineffective/Costly Indexes
    • Low Cardinality, Skewed Distributions, Redundant Indexes
  • Generated Columns with new supporting Index
  • MQT/AST tables
2) Tweak Catalog Statistics to “fool” optimizer (2%)
  • A temporary and difficult to maintain “solution”
3) Re-write/modify SQL (3%)
  • The DB2 Optimizer Re-writes SQL. Isn’t re-writing re-written SQL redundant?
Given the odds, which horse would you bet on?
13
Lessons Learned : Small Stuff
SWEAT THE SMALL STUFF
Small drops of rain delivered over a period of 40 days and 40 nights had dramatic impact on earth – so says the Bible.
14
– Small Tables
• A simple SELECT executed with high frequency against a table with only 32 rows consumed 34% of ALL CPU time on an SMP 4-way
• Myth: Small tables don’t need indexes
• Realities:
  • Explains don’t identify costly SQL against small tables
  • Explains don’t consider frequency of execution
  • Only Dynamic SQL Equalization finds high cost SQL
  • Even ONE row tables can benefit from indexes
Doodle here.
15
Recipe for Success
When life hands you lemons, make lemonade…
Use big lemons and bottled spring water – then you’ll have gourmet lemonade.
16
Lessons Learned: SORT Costs
SORT IS A FOUR LETTER WORD
Doodle here.
17
– SORT is a Four Letter Word
• “SELECT * from TB where C1 = ? Order By C3” put an internet company out of business by consuming 68% of CPU on RS6K 8-way!
• Myth: Small Sorts are inconsequential
• Realities:
  • Small Sorts consume CPU!
  • Explains don’t consider frequency of execution
  • Only Dynamic SQL Equalization finds most costly Sorts
  • Just *5* Clustering Indexes can often restore 50% of a machine’s capacity!!!
Restoring 50% of a machine’s capacity means twice as many users on the same hardware, or transaction rates going twice as fast, or CPU utilization cut so you can run other work on the machine.
18
Impact of SORTHEAP / Small Sorts
Avoid Sort Overflows
• 256 - “Small” Sort overflows 1MB SORTHEAP, 2 sorts, 1 overflow
• 512 - “Small” Sort completes within 2MB SORTHEAP, 2 sorts, 0 overflows
• Avoiding the overflow cut sort time by 20% (1/5th), reduced elapsed time by 10%, and sliced CPU burn by 15%
• 128 - No Order by clause (1 index rid sort)
[Chart: CPU Sec, Sort Sec, and Elapsed Sec (y-axis 0 to 1.4) for the cases 128, 256, 512, and 128/1]
For OLTP, < 3% sorts overflow
Sorts are evil, even small sorts. Use Clustering Indexes or MDC to eliminate and reduce sort costs. Get rid of “order by” or “distinct” if you don’t need them.
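The “< 3% sorts overflow” guideline for OLTP is a ratio of two snapshot counters. A minimal sketch, assuming the total sorts and sort overflows values have already been read from a database snapshot; the function names are hypothetical.

```python
def sort_overflow_pct(total_sorts, sort_overflows):
    """Percentage of sorts that overflowed SORTHEAP."""
    if total_sorts == 0:
        return 0.0  # no sorts at all means no overflows either
    return sort_overflows * 100.0 / total_sorts


def within_oltp_guideline(total_sorts, sort_overflows, limit_pct=3.0):
    """True when fewer than limit_pct percent of sorts overflowed."""
    return sort_overflow_pct(total_sorts, sort_overflows) < limit_pct


# The two test cases from the slide: 2 sorts with 1 overflow, then 2 sorts with none.
print(sort_overflow_pct(2, 1), within_oltp_guideline(2, 1))
print(sort_overflow_pct(2, 0), within_oltp_guideline(2, 0))
```

As with the other metrics, the ratio is more meaningful over a long interval than over a handful of sorts.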
19
Lessons Learned: Efficient Indexes
Index AND-ing works, BUT Composite Indexes work better!
It’s all about precision.
20
– Composite Indexes are Best
• “SELECT * from TB where C1 = ? And C2 = ? And C3 >= ?” caused an SLA to be missed and service contracts nearly lost
• Myth: Use 3 single column indexes on C1, C2, and C3
• Realities:
  • Index AND-ing can be CPU and I/O expensive
  • A single composite index on columns C1, C2, & C3 is dramatically faster and more efficient
  • The INDEX/Design Advisor favors composite indexes, but identifying the costly SQL is the trick >> SQL EQ!
Single column indexes are expensive to maintain and provide insufficient filtration cardinality on their own – which leads to multiple index access – which requires processing multiple indexes – and that can be expensive!
21
Lessons Learned: Be Prepared
The Boy Scout Motto applies to asynchronous I/O but not DB2 Indexes!
I am an Eagle Scout. I know it is important to be prepared. That’s why I pack extra socks for every trip.
22
Index Design
• Indexes with Cardinality = 1 are a performance death sentence. Do not create indexes “just in case”…
• Indexes with Skewed distributions can be expensive to maintain on Insert, Update, Delete
• Redundant Indexes are expensive to maintain, consume disk, and provide no value to DB2 – Drop them!
  • IX on C1, C2 <<- Redundant Index
  • IX on C1, C2, C4
• For multi-column indexes, place the column that is most frequently known (= predicate) first.
• Use Clustering Indexes to reduce Sort & CPU costs
For more helpful information on DB2 performance physical design, visit DBI’s site at http://www.database-brothers.com/db2tips.php
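The redundant-index example (an index on C1, C2 alongside an index on C1, C2, C4) amounts to a leading-prefix test on the key columns. Below is a minimal sketch of that test, assuming the index definitions have already been read into a dictionary; the index names are hypothetical. It deliberately ignores uniqueness, column ordering (ASC/DESC), and include columns, so a unique index that enforces a constraint could be reported here even though it should not be dropped.

```python
def redundant_indexes(indexes):
    """indexes: {index_name: (col1, col2, ...)} in key order.
    Returns [(redundant_index, covering_index)] where the first is a strict
    leading prefix of the second."""
    found = []
    for name, cols in indexes.items():
        cols = tuple(cols)
        for other, other_cols in indexes.items():
            other_cols = tuple(other_cols)
            if other == name or len(cols) >= len(other_cols):
                continue
            if other_cols[:len(cols)] == cols:
                found.append((name, other))
                break  # one covering index is enough to flag it
    return found


# Hypothetical index definitions for one table.
table_indexes = {
    "IX1": ("C1", "C2"),
    "IX2": ("C1", "C2", "C4"),
    "IX3": ("C2",),
}
print(redundant_indexes(table_indexes))
```

IX3 is not flagged because C2 is not the leading column of any other index.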
23
Lessons Learned: Performance Drag
Parachutes are for Sky Divers and Drag Racing
Who else uses parachutes?
24
Closing Files = Putting the brakes on DB2
• Default DB CFG MAXFILOP = 64
  • TOO SMALL FOR MOST DATABASES!
• Closing Files BURNS CPU and SLOWS SQL!
• Ensure “Database Files Closed = 0” via Database Snapshots
• Increase MAXFILOP until Files Closed = 0
Closing Files continues to be a DB2 UDB performance problem.
25
Lessons Learned: Poisons
ONE BAD APPLE CAN SPOIL THE WHOLE BUNCH
I witnessed this several times as a kid growing up.
26
Where’s the rotten fruit in your database?
• Which tablespace has the slowest overall disk read response time (ORMS)?
• Which tablespace has the slowest overall disk write response time (OWMS)?
• Which tablespace has the highest physical reads per transaction (PRTX)? Logical Reads (LRTX)?
• Is PREFETCH I/O effective and efficient (APPR)?
• How do these tablespace metrics compare to the overall average for the database?
• Do any tables have (Overflows x 100 / Rows Read) > 3%? Overflow Rows require double the logical I/O!
Get the rotten fruit out of your database before the whole thing gets spoiled!
27
Attendee Notes - Metrics
• db2 “get snapshot for tablespaces on DBNAME”
• ORMS = Total buffer pool read time (ms) / (Buffer pool data physical reads + Buffer pool index physical reads)
• OWMS = Total buffer pool write time (ms) / (Buffer pool data writes + Buffer pool index writes)
• PRTX = (Buffer pool data physical reads + Buffer pool index physical reads) / (# Commits + Rollbacks) [from DB Snapshot]
• LRTX = (Buffer pool data logical reads + Buffer pool index logical reads) / (# Commits + Rollbacks)
• APPR = (Asynchronous pool data page reads + Asynchronous pool index page reads) / Asynchronous read requests
• TB RR/TX = Rows Read / (# Commits + Rollbacks)
Bonus notes – adding notes to an attendee notes slide – now that’s a lot of notes! Have I mentioned yet that you can get a free white paper with even more formulas and techniques at http://www.database-brothers.com/papers.php ?
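The tablespace formulas above translate directly into code. This sketch assumes the snapshot counters for one tablespace have already been parsed into a small record, and that the commit and rollback counts come from the database snapshot; the class, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TablespaceCounters:
    read_time_ms: float          # Total buffer pool read time (ms)
    write_time_ms: float         # Total buffer pool write time (ms)
    data_physical_reads: int
    index_physical_reads: int
    data_writes: int
    index_writes: int
    data_logical_reads: int
    index_logical_reads: int
    async_data_reads: int        # Asynchronous pool data page reads
    async_index_reads: int       # Asynchronous pool index page reads
    async_read_requests: int


def _ratio(numerator, denominator):
    """Divide, returning None when the denominator is zero."""
    return numerator / denominator if denominator else None


def tablespace_metrics(ts, commits, rollbacks):
    """Compute ORMS, OWMS, PRTX, LRTX, and APPR for one tablespace."""
    transactions = commits + rollbacks
    physical_reads = ts.data_physical_reads + ts.index_physical_reads
    return {
        "ORMS": _ratio(ts.read_time_ms, physical_reads),
        "OWMS": _ratio(ts.write_time_ms, ts.data_writes + ts.index_writes),
        "PRTX": _ratio(physical_reads, transactions),
        "LRTX": _ratio(ts.data_logical_reads + ts.index_logical_reads, transactions),
        "APPR": _ratio(ts.async_data_reads + ts.async_index_reads, ts.async_read_requests),
    }


# Hypothetical counters for a single tablespace.
sample = TablespaceCounters(
    read_time_ms=50_000, write_time_ms=12_000,
    data_physical_reads=4_000, index_physical_reads=1_000,
    data_writes=1_500, index_writes=500,
    data_logical_reads=90_000, index_logical_reads=10_000,
    async_data_reads=3_200, async_index_reads=0, async_read_requests=200,
)
print(tablespace_metrics(sample, commits=990, rollbacks=10))
```

Computing the same dictionary for every tablespace and comparing each against the database-wide average is one way to spot the outliers.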
28
A Quote of Note
“You can’t build a reputation on what you are going to do”
- Henry Ford
People will judge and admire you based on your actual achievements. If you can apply some of what you learn from this presentation, you should be able to rise to “hero” status very quickly.
29
Lessons Learned: NEVER
• Use the word ALWAYS
• Listen to people who tell you what you CANNOT do
• Rely on a single snapshot, or small time slice, for making significant tuning decisions
  • What’s happening “right now” is often meaningless
  • Stop looking at the trees and try to see the forest
• Use the word NEVER
Practice Forest Management – It is important to see the big picture.
30
Passion, Commitment, and Discipline
Anyone care to dance?
31
You’re “Done” Passionately Tuning When…
• OLTP:
  • Rows Read/TX/TB < 10
  • DB BP Sync Reads > 90%
  • BP, Pkg Cache, & Catlg Cache hit ratios > 95%
  • There are no bad apples
    • No Slow TS (ORMS, OWMS) > 10ms
    • No SQL > 10% CPU
    • No SQL > 50% SLA time
    • No SQL w/ Rows Read/Rows Fetched > 100 (IXEFF)
  • No Files Closed
  • No Lock or Token Waits
  • Phone Rage Ends
• Data Warehouse:
  • Prefetch is Effective (APPR > 10 for each TS)
  • No Slow TS (ORMS, OWMS)
  • TEMPSPACE defined where data isn’t – has 3-6 containers
  • DB BP Sync Reads > 25%
  • Catlg Cache Hit > 95%
  • No Files Closed
  • SQL having Frequency>1 uses
    • MQTs / ASTs
    • MDC tables
    • Effective Indexes
  • Phone Rage Ends
Do you have a database that meets these criteria? Congratulations! Time to celebrate! But, remember, there’s always tomorrow and DB2 can change its mind when data volumes and number of users grow!
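The measurable OLTP criteria can be turned into a simple pass/fail checklist. This is a sketch only: it assumes each metric has already been computed elsewhere, the dictionary keys are invented for illustration, and the subjective items (lock or token waits, Phone Rage) are left out.

```python
def oltp_done_checks(m):
    """Evaluate the measurable OLTP criteria; m is a dict of precomputed metrics."""
    return {
        "Rows Read/TX/TB < 10": m["worst_table_rr_per_tx"] < 10,
        "DB BP Sync Reads > 90%": m["sync_read_pct"] > 90,
        "BP, Pkg Cache, & Catlg Cache hit ratios > 95%":
            min(m["bp_hit_pct"], m["pkg_cache_hit_pct"], m["cat_cache_hit_pct"]) > 95,
        "No Slow TS (ORMS, OWMS) > 10ms": max(m["worst_orms"], m["worst_owms"]) <= 10,
        "No SQL > 10% CPU": m["worst_sql_cpu_pct"] <= 10,
        "No SQL > 50% SLA time": m["worst_sql_pct_of_sla_time"] <= 50,
        "No SQL w/ Rows Read/Rows Fetched > 100": m["worst_rows_read_per_fetched"] <= 100,
        "No Files Closed": m["files_closed"] == 0,
    }


def failing_checks(m):
    """Return the names of the criteria that are not yet met."""
    return [name for name, ok in oltp_done_checks(m).items() if not ok]


# Hypothetical metrics for a database that still closes files.
metrics = {
    "worst_table_rr_per_tx": 6.5, "sync_read_pct": 94.0,
    "bp_hit_pct": 98.0, "pkg_cache_hit_pct": 97.0, "cat_cache_hit_pct": 99.0,
    "worst_orms": 7.0, "worst_owms": 4.0,
    "worst_sql_cpu_pct": 8.0, "worst_sql_pct_of_sla_time": 20.0,
    "worst_rows_read_per_fetched": 12.0, "files_closed": 3,
}
print(failing_checks(metrics))
```

Re-running the checklist after each change fits the iterative, one-change-at-a-time approach described earlier.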
32
Scott Hayes
Database-Brothers Inc. (DBI)
www.Database-Brothers.com
Session D02: 8 Years of Performance Solutions – Lessons Learned
The End. Please complete your session evaluation cards. Thank you!